mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2026-04-29 12:28:27 +02:00
3f5dfa472ea6771c821ee0bb10dee7de41ef6021
1414238 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
3f5dfa472e |
tools build: Fix rust feature detection
Features in FEATURE_TESTS_BASIC will be set as being available if
test-all.c builds, so since the rust test isn't included in test-all.c,
we can't have 'rust' in there, remove it from FEATURE_TESTS_BASIC and
use feature-check so that it tries to build test-rust.bin, doing the
actual feature detection.
On a system lacking a rust compiler:
Makefile.config:1158: Rust is not found. Test workloads with rust are disabled.
Auto-detecting system features:
... libdw: [ on ]
... glibc: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libpython: [ on ]
... libcapstone: [ on ]
... llvm-perf: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... libopenssl: [ on ]
... rust: [ OFF ]
$ cat /tmp/build/perf-tools-next/feature/test-rust.make.output
/bin/sh: line 1: rustc: command not found
$ file /tmp/build/perf-tools-next/feature/test-rust.bin
/tmp/build/perf-tools-next/feature/test-rust.bin: cannot open `/tmp/build/perf-tools-next/feature/test-rust.bin' (No such file or directory)
$
$ perf -vv | grep RUST
rust: [ OFF ] # HAVE_RUST_SUPPORT
$
And after installing it:
... rust: [ on ]
$ cat /tmp/build/perf-tools-next/feature/test-rust.make.output
$ file /tmp/build/perf-tools-next/feature/test-rust.bin
/tmp/build/perf-tools-next/feature/test-rust.bin: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9c416edf673ee3705b97bae893a99a6fcf1ee258, for GNU/Linux 3.2.0, with debug_info, not stripped
$
$ perf -vv | grep RUST
rust: [ on ] # HAVE_RUST_SUPPORT
$
Fixes:
|
||
|
|
335047109d |
perf tests: Test annotate with data type profiling and C
Exercise the annotate command with data type profiling feature with C. For that extend the existing data type profiling shell test to profile the datasym workload, then annotate the result expecting to see some data structures from the C code. Committer testing: root@number:~# perf test 'perf data type profiling tests' 83: perf data type profiling tests : Ok root@number:~# perf test -vv 'perf data type profiling tests' 83: perf data type profiling tests: --- start --- test child forked, pid 125028 Basic Rust perf annotate test Basic annotate test [Success] Pipe Rust perf annotate test Pipe annotate test [Success] Basic C perf annotate test Basic annotate test [Success] Pipe C perf annotate test Pipe annotate test [Success] ---- end(0) ---- 83: perf data type profiling tests : Ok root@number:~# Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
f60a5c2296 |
perf tests: Test annotate with data type profiling and rust
Exercise the annotate command with data type profiling feature on the rust runtime. For that add a new shell test, which will profile the code_with_type workload, then annotate the result expecting to see some data structures from the rust code. Committer testing: root@number:~# perf test 'perf data type profiling tests' 83: perf data type profiling tests : Ok root@number:~# perf test -v 'perf data type profiling tests' 83: perf data type profiling tests : Ok root@number:~# perf test -vv 'perf data type profiling tests' 83: perf data type profiling tests: --- start --- test child forked, pid 111044 Basic perf annotate test Basic annotate test [Success] Pipe perf annotate test Pipe annotate test [Success] ---- end(0) ---- 83: perf data type profiling tests : Ok root@number:~# Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
2e05bb52a1 |
perf test workload: Add code_with_type test workload
The purpose of the workload is to gather samples of rust runtime. To
achieve that it has a dummy rust library linked with it.
Per recommendations for such scenarios [1], the rust library is
statically linked.
An example:
$ perf record perf test -w code_with_type
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.160 MB perf.data (4074 samples) ]
$ perf report --stdio --dso perf -s srcfile,srcline
45.16% ub_checks.rs ub_checks.rs:72
6.72% code_with_type.rs code_with_type.rs:15
6.64% range.rs range.rs:767
4.26% code_with_type.rs code_with_type.rs:21
4.23% range.rs range.rs:0
3.99% code_with_type.rs code_with_type.rs:16
[...]
[1]: https://doc.rust-lang.org/reference/linkage.html#mixed-rust-and-foreign-codebases
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
6a32fa5ccd |
tools build: Add a feature test for rust compiler
Add a feature test to identify if the rust compiler is available, so that perf could build rust based worloads based on that. Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
ff9aeb6bd1 |
perf test parse-metric: Ensure aggregate counts appear to have run
Commit |
||
|
|
c60ee958d6 |
perf test record.sh: Fix shellcheck warning
Add quotes to avoid the following warning:
```
In tests/shell/record.sh line 264:
[ $(uname -m) = "s390x" ] && {
^---------^ SC2046 (warning): Quote this to prevent word splitting.
For more information:
https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
```
Fixes:
|
||
|
|
36a1b0061a |
perf build: Reduce pmu-events related copying and mkdirs
When building to an output directory the previous code would remove files and then copy the source files over. Each source file copy would have a rule to make its directory. All JSON for every architecture was considered a source file. This led to unnecessary copying as a file would be deleted and then the same file copied again, unnecessary directory making, and copying of files not used in the build. A side-effect would be a lot of build messages. This change makes it so that all computed output files are created and then compared to all files in the OUTPUT directory. By filtering out the files that would be copied, unnecessary files can be determined and then deleted - note, this is a phony target which would remake the pmu-events.c if always depended upon, and so the dependency is conditional on there being files to remove. This has some overhead as the $(OUTPUT)/pmu-events is "find" over rather than just "rm -fr", but the savings from unnecessary copying, etc. should make up for this new make overhead. The copy target just does copying but has a dependency on the directory it needs being built, avoiding repetitive mkdirs. The source files for copying only consider the JEVENTS_ARCH unless the JEVENTS_ARCH is all. The metric JSON is only generated if appropriate, rather than always being generated and jevents.py deciding whether or not to use the files. The mypy and pylint targets are fixed as variable names had changed but the rules not updated. The line count of a build with "make -C tools/perf O=/tmp/perf clean all" prior to this change was 2181 lines, after this change it is 1596 lines. This is a reduction of 585 lines or about 27%. The generated pmu-events.c for JEVENTS_ARCH "x86" and "all" were validated as being identical after this change. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
4479884d1f |
perf lock contention: fix segfault in lock contention -b/--use-bpf
When run on a kernel without BTF info, perf crashes:
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find valid kernel BTF
Program received signal SIGSEGV, Segmentation fault.
0x00005555556915b7 in btf.type_cnt ()
(gdb) bt
#0 0x00005555556915b7 in btf.type_cnt ()
#1 0x0000555555691fbc in btf_find_by_name_kind ()
#2 0x00005555556920d0 in btf.find_by_name_kind ()
#3 0x00005555558a1b7c in init_numa_data (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:125
#4 0x00005555558a264b in lock_contention_prepare (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:313
#5 0x0000555555620702 in __cmd_contention (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2084
#6 0x0000555555622c8d in cmd_lock (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2755
#7 0x0000555555651451 in run_builtin (p=0x555556104f00 <commands+576>, argc=3, argv=0x7fffffffea10)
at perf.c:349
#8 0x00005555556516ed in handle_internal_command (argc=3, argv=0x7fffffffea10) at perf.c:401
#9 0x000055555565184e in run_argv (argcp=0x7fffffffe7fc, argv=0x7fffffffe7f0) at perf.c:445
#10 0x0000555555651b9f in main (argc=3, argv=0x7fffffffea10) at perf.c:553
Check if btf loading failed, and don't do anything with it in
init_numa_data(). This leads to the following error message, instead of
just a crash:
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find valid kernel BTF
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -ESRCH
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -ESRCH
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed
Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
920c5570a6 |
perf sort: Replace static cacheline size with sysconf cacheline size
Testing: - Built perf - Executed perf mem record and report Committer notes: This addresses a TODO and improves the situation where record and report/c2c are performed on the same machine or in machines with the same cacheline size, but the proper way is to store the cacheline size in the perf.data header at 'record' time and then use it at post processing time. Signed-off-by: Ricky Ringler <ricky.ringler@proton.me> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20260129004223.26799-1-ricky.ringler@proton.me Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
c73a56ed3c |
perf test: Fix test case Leader sampling on s390
The subtest 'Leader sampling' some time fails on s390.
- for z/VM guest: Disable the test for z/VM guest. There is no
CPU Measurement facility to run the test successfully.
- for LPAR: Use correct event names.
A detailed analysis follows here:
Now to the debugging and investigation:
1. With command
perf record -e '{cycles,cycles}:S' -- ....
the first cycles event starts sampling.
On s390 this sets up sampling with a frequency of 4000 Hz.
This translates to hardware sample rate of 1377000 instructions per
micro-second to meet a frequency of 4000 HZ.
2. With first event cycles now sampling into a hardware buffer, an
interrupt is triggered each time a sampling buffer gets full.
The interrupt handler is then invoked and debug output shows the
processing of samples. The size of one hardware sample is 32 bytes.
With an interrupt triggered when the hardware buffer page of 4KB
gets full, the interrupt handler processes 128 samples.
(This is taken from s390 specific fast debug data gathering)
2025-11-07 14:35:51.977248 000003ffe013cbfa \
perf_event_count_update event->count 0x0 count 0x1502e8
2025-11-07 14:35:51.977248 000003ffe013cbfa \
perf_event_count_update event->count 0x1502e8 count 0x1502e8
2025-11-07 14:35:51.977248 000003ffe013cbfa \
perf_event_count_update event->count 0x2a05d0 count 0x1502e8
2025-11-07 14:35:51.977252 000003ffe013cbfa \
perf_event_count_update event->count 0x3f08b8 count 0x1502e8
2025-11-07 14:35:51.977252 000003ffe013cbfa \
perf_event_count_update event->count 0x540ba0 count 0x1502e8
2025-11-07 14:35:51.977253 000003ffe013cbfa \
perf_event_count_update event->count 0x690e88 count 0x1502e8
2025-11-07 14:35:51.977254 000003ffe013cbfa \
perf_event_count_update event->count 0x7e1170 count 0x1502e8
2025-11-07 14:35:51.977254 000003ffe013cbfa \
perf_event_count_update event->count 0x931458 count 0x1502e8
2025-11-07 14:35:51.977254 000003ffe013cbfa \
perf_event_count_update event->count 0xa81740 count 0x1502e8
3. The value is constantly increasing by the number of instructions
executed to generate a sample entry. This is the first line of the
pairs of lines. count 0x1502e8 --> 1377000
# perf script | grep 1377000 | wc -l
214
# perf script | wc -l
428
#
That is 428 lines in total, and half of the lines contain value
1377000.
4. The second event cycles is opened against the counting PMU, which
is an independent PMU and is not interrupt driven. Once enabled it
runs in the background and keeps running, incrementing silently
about 400+ counters. The counter values are read via assembly
instructions.
This second counter PMU's read call back function is called when the
interrupt handler of the sampling facility processes each sample. The
function call sequence is:
perf_event_overflow()
+--> __perf_event_overflow()
+--> __perf_event_output()
+--> perf_output_sample()
+--> perf_output_read()
+--> perf_output_read_group()
for_each_sibling_event(sub, leader) {
values[n++] = perf_event_count(sub, self);
printk("%s sub %p values %#lx\n", __func__, sub, values[n-1]);
}
The last function perf_event_count() is invoked on the second event
cylces *on* the counting PMU. An added printk statement shows the
following lines in the dmesg output:
# dmesg|grep perf_output_read_group |head -10
[ 332.368620] perf_output_read_group sub 00000000d80b7c1f values 0x3a80917 (1)
[ 332.368624] perf_output_read_group sub 00000000d80b7c1f values 0x3a86c7f (2)
[ 332.368627] perf_output_read_group sub 00000000d80b7c1f values 0x3a89c15 (3)
[ 332.368629] perf_output_read_group sub 00000000d80b7c1f values 0x3a8c895 (4)
[ 332.368631] perf_output_read_group sub 00000000d80b7c1f values 0x3a8f569 (5)
[ 332.368633] perf_output_read_group sub 00000000d80b7c1f values 0x3a9204b
[ 332.368635] perf_output_read_group sub 00000000d80b7c1f values 0x3a94790
[ 332.368637] perf_output_read_group sub 00000000d80b7c1f values 0x3a9704b
[ 332.368638] perf_output_read_group sub 00000000d80b7c1f values 0x3a99888
#
This correlates with the output of
# perf report -D | grep 'id 00000000000000'|head -10
..... id 0000000000000006, value 00000000001502e8, lost 0
..... id 000000000000000e, value 0000000003a80917, lost 0 --> line (1) above
..... id 0000000000000006, value 00000000002a05d0, lost 0
..... id 000000000000000e, value 0000000003a86c7f, lost 0 --> line (2) above
..... id 0000000000000006, value 00000000003f08b8, lost 0
..... id 000000000000000e, value 0000000003a89c15, lost 0 --> line (3) above
..... id 0000000000000006, value 0000000000540ba0, lost 0
..... id 000000000000000e, value 0000000003a8c895, lost 0 --> line (4) above
..... id 0000000000000006, value 0000000000690e88, lost 0
..... id 000000000000000e, value 0000000003a8f569, lost 0 --> line (5) above
Summary:
- Above command starts the CPU sampling facility, with runs interrupt
driven when a 4KB page is full. An interrupt processes the 128 samples
and calls eventually perf_output_read_group() for each sample to save it
in the event's ring buffer.
- At that time the CPU counting facility is invoked to read the value of
the event cycles. This value is saved as the second value in the
sample_read structure.
- The first and odd lines in the perf script output displays the period
value between 2 samples being created by hardware. It is the number
of instructions executes before the hardware writes a sample.
- The second and even lines in the perf script output displays the number
of CPU cycles needed to process each sample and save it in the event's
ring buffer.
These 2 different values can never be identical on s390.
Since event leader sampling is not possible on s390 the perf tool will
return EOPNOTSUPP soon. Perpare the test case for that.
Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Jan Polensky <japo@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
64ea7a4620 |
perf annotate: Fix register usage in data type profiling
On data type profiling, it tried to match register name with a partial string. For example, it allowed to match with "%rbp)" or "%rdi,8)". But with recent change in the area, it doesn't match anymore and break the data type profiling. Let's pass the correct register name by removing the unwanted part. Add arch__dwarf_regnum() to handle it in a single place. Closes: 7d3n23li6drroxrdlpxn7ixehdeszkjdftah3zyngjl2qs22ef@yelcjv53v42o Reported-by: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zecheng Li <zli94@ncsu.edu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
bb5a920b90 |
perf stat: Ensure metrics are displayed even with failed events
Currently, `perf stat` skips or hides metrics when the underlying
hardware events cannot be counted (e.g., due to insufficient permissions
or unsupported events).
In `--metric-only` mode, this often results in missing columns or blank
spaces, making the output difficult to parse.
Modify the logic to ensure metrics are consistently displayed by
propagating NAN (Not a Number) through the expression evaluator.
Specifically:
1. Update `prepare_metric()` in stat-shadow.c to treat uncounted events
(where `run == 0`) as NAN. This leverages the existing math in expr.y
to propagate NAN through metric expressions.
2. Remove the early return in the display logic's `printout()` function
that was previously skipping metrics in `--metric-only` mode for
failed events.
l
3. Simplify `perf_stat__skip_metric_event()` to no longer depend on
event runtime.
Tested:
1. `perf all metrics test` did not crash while paranoid is 2.
2. Multiple combinations with `CPUs_utilized` while paranoid is 2.
$ ./perf stat -M CPUs_utilized -a -- sleep 1
Performance counter stats for 'system wide':
<not supported> msec cpu-clock:u # nan CPUs CPUs_utilized
1,006,356,120 duration_time
1.004375550 seconds time elapsed
$ ./perf stat -M CPUs_utilized -a -j -- sleep 1
{"counter-value" : "<not supported>", "unit" : "msec", "event" : "cpu-clock:u", "event-runtime" : 0, "pcnt-running" : 100.00, "metric-value" : "nan", "metric-unit" : "CPUs CPUs_utilized"}
{"counter-value" : "1006642462.000000", "unit" : "", "event" : "duration_time", "event-runtime" : 1, "pcnt-running" : 100.00}
$ ./perf stat -M CPUs_utilized -a --metric-only -- sleep 1
Performance counter stats for 'system wide':
CPUs CPUs_utilized
nan
1.004424652 seconds time elapsed
$ ./perf stat -M CPUs_utilized -a --metric-only -j -- sleep 1
{"CPUs CPUs_utilized" : "none"}
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
446c595dc0 |
perf test addr2line_inlines: Ensure inline information shows on LBR leaves
Expand the addr2line inline function testing to also run for an LBR callchain, skipping if LBR support isn't present. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
04f81f45b4 |
perf callchain lbr: Make the leaf IP that of the sample
The current IP of a leaf function when reported from a perf record with
"--call-graph lbr" is the "to" field of the LBR branch stack record.
The sample for the event being recorded may be further into the function
and there may be inlining information associated with it.
Rather than use the branch stack "to" field in this case switch to the
callchain appending the sample->ip and thereby allowing the inline
information to show.
Before this change:
```
$ perf record --call-graph lbr perf test -w inlineloop
...
$ perf script --fields +srcline
...
perf-inlineloop 467586 4649.344493: 950905 cpu_core/cycles/P:
55dfda2829c0 parent+0x0 (perf)
inlineloop.c:31
55dfda282a96 inlineloop+0x86 (perf)
inlineloop.c:47
55dfda236420 run_workload+0x59 (perf)
builtin-test.c:715
55dfda236b03 cmd_test+0x413 (perf)
builtin-test.c:825
...
```
After this change:
```
$ perf record --call-graph lbr perf test -w inlineloop
...
$ perf script --fields +srcline
...
perf-inlineloop 529703 11878.680815: 950905 cpu_core/cycles/P:
555ce86be9e6 leaf+0x26
inlineloop.c:20 (inlined)
555ce86be9e6 middle+0x26
inlineloop.c:27 (inlined)
555ce86be9e6 parent+0x26 (perf)
inlineloop.c:32
555ce86bea96 inlineloop+0x86 (perf)
inlineloop.c:47
555ce8672420 run_workload+0x59 (perf)
builtin-test.c:715
555ce8672b03 cmd_test+0x413 (perf)
builtin-test.c:825
...
```
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
a724a8fce5 |
perf kvm stat: Fix build error
Since commit |
||
|
|
e5e66adfe4 |
perf regs: Remove __weak attributive arch_sdt_arg_parse_op() function
In line with the previous patch, the __weak arch_sdt_arg_parse_op() function is removed. Architectural-specific implementations in the arch/ directory are now converted into sub-functions within the util/perf-regs-arch/ directory. The perf_sdt_arg_parse_op() function will call these sub-functions based on the EM_HOST. This change enables cross-architecture calls to arch_sdt_arg_parse_op(). No functional changes are intended. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> [ Fixed up somme fuzz with powerpc and x86 Build files wrt removing perf_regs.o ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
16dccbb842 |
perf regs: Remove __weak attributive arch__xxx_reg_mask() functions
Currently, some architecture-specific perf-regs functions, such as arch__intr_reg_mask() and arch__user_reg_mask(), are defined with the __weak attribute. This approach ensures that only functions matching the architecture of the build/run host are compiled and executed, reducing build time and binary size. However, this __weak attribute restricts these functions to be called only on the same architecture, preventing cross-architecture functionality. For example, a perf.data file captured on x86 cannot be parsed on an ARM platform. To address this limitation, this patch removes the __weak attribute from these perf-regs functions. The architecture-specific code is moved from the arch/ directory to the util/perf-regs-arch/ directory. The appropriate architectural functions are then called based on the EM_HOST. No functional changes are intended. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> [ Fixed up somme fuzz with s390 and riscv Build files wrt removing perf_regs.o ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
e716e69cf6 |
perf arch: Update arch headers to use relative UAPI paths
The architectural specific headers perf_regs.h currently rely on the host architecture's 'asm/perf_regs.h'. This can lead to compilation inconsistencies or failures when including and building perf for a target architecture that differs from the host's architecture. Explicitly point to the UAPI headers within the tools source tree using relative paths. This ensures that perf is always built against the intended architecture. No functional changes are intended. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
c2e28ae294 |
perf regs: Fix abort for "-I" or "--user-regs" options
Fix an issue where the `perf` tool aborts unexpectedly when running the
following command:
```
perf record -e cycles -I -- true
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on interrupt, use '-I?' to list register names
```
The usage of the `-I` or `--user-regs` options without specifying any
registers should default to sampling all general-purpose registers.
However, this currently causes an abnormal termination.
The issue was introduced by commit
|
||
|
|
cee275edcd |
perf metricgroup: Don't early exit if no CPUID table exists
The failure to find a table of metrics with a CPUID shouldn't early
exit as the metric code will now also consider the default table.
When searching for a metric or metric group,
pmu_metrics_table__for_each_metric() considers all tables and so the
caller doesn't need to switch the table to do this.
Fixes:
|
||
|
|
f637bb2eed |
perf tests: build-test coverage for NO_JEVENTS=1
Leo reported 'perf stat' being broken and this highlighted that the 'make NO_JEVENTS=1' variant is missing from 'make -C tools/perf build-test', add it. Closes: https://lore.kernel.org/linux-perf-users/20260205175250.GC3529712@e132581.arm.com/ Reported-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
1d9622c3c1 |
perf tests: Additional 'perf stat' tests
Recently 'perf stat' regressed in per CPU mode [1]. Let's expand test coverage to catch the same breakage again as well as to test the repeat, pid, detailed and no aggregation options. [1] https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
a108a6a4b9 |
perf record: Make logs more readable for event open failures
Since commit
|
||
|
|
84cb36da81 |
perf thread: Don't require machine to compute the e_machine
The machine can be calculated from a thread via its maps. Don't require the machine argument to simplify callers and also to delay computing the machine until a little later. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
c4f4392264 |
perf header: Add e_machine/e_flags to the header
Add 64-bits of feature data to record the ELF machine and flags. This allows readers to initialize based on the data. For example, `perf kvm stat` wants to initialize based on the kind of data to be read, but at initialization time there are no threads to base this data upon and using the host means cross platform support won't work. The values in the perf_env also act as a cache for these within the session. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
07ad6f31b6 |
perf session: Add e_flags to the e_machine helper
Allow e_flags as well as e_machine to be computed using the e_machine helper. This isn't currently used, the argument is always NULL, but it will be used for a new header feature. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
43af548436 |
perf kvm: Wire up e_machine
Pass the e_machine to the kvm functions so that they aren't just wired to EM_HOST. In the case of a session move some setup until the session is created. As the session isn't fully running the default EM_HOST is returned as no e_machine can be found in a running machine. This is, however, some marginal progress to cross platform support. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
ceea279f93 |
perf kvm stat: Remove use of the arch directory
`perf kvm stat` supports record and report options. By using the arch directory a report for a different machine type cannot be supported. Move the kvm-stat code out of the arch directory and into util/kvm-stat-arch following the pattern of perf-regs and dwarf-regs. Avoid duplicate symbols by renaming functions to have the architecture name within them. For global variables, wrap them in an architecture specific function. Selecting the architecture to use with `perf kvm stat` is selected by EM_HOST, ie no different than before the change. Later the ELF machine can be determined from the session or a header feature (ie EM_HOST at the time of the record). The build and #define HAVE_KVM_STAT_SUPPORT is now redundant so remove across Makefiles and in the build. Opportunistically constify architectural structs and arrays. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
8c5b40678c |
libperf build: Always place libperf includes first
When building tools/perf the CFLAGS can contain a directory for the
installed headers.
As the headers may be being installed while building libperf.a this can
cause headers to be partially installed and found in the include path
while building an object file for libperf.a.
The installed header may reference other installed headers that are
missing given the partial nature of the install and then the build fails
with a missing header file.
Avoid this by ensuring the libperf source headers are always first in
the CFLAGS.
Fixes:
|
||
|
|
d2ac7e4418 |
perf test kvm: Add stat live testing
Ensure the `perf kvm stat live -p ..` has some basic functionality. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
b5c9bcde61 |
perf capstone: Support for dlopen-ing libcapstone.so
If perf is built with LIBCAPSTONE_DLOPEN=1, support dlopen-ing libcapstone.so and then calling the necessary functions by looking them up using dlsym. The types come from capstone.h which means the libcapstone feature check needs to pass, and NO_CAPSTONE=1 hasn't been defined. This will cause the definition of HAVE_LIBCAPSTONE_SUPPORT. Earlier versions of this code tried to declare the necessary capstone.h constants and structs, but they weren't stable and caused breakages across libcapstone releases. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
169343cc8f |
perf build: Remove NO_LIBCAP that controls nothing
Using libcap was removed in commit |
||
|
|
e205952db7 |
perf jevents: Validate that all names given an Event
Validate they exist in a JSON file from one directory found from one directory above the model's JSON directory. This avoids broken fallback encodings being created. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
82e53e7ae0 |
perf jevents: Add cycles breakdown metric for arm64/AMD/Intel
Breakdown cycles to user, kernel and guest. Add a common_metrics.py file for such metrics. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
e74f72a7e2 |
perf jevents: Add mesh bandwidth saturation metric for Intel
Memory bandwidth saturation from CBOX/CHA events present in broadwellde, broadwellx, cascadelakex, haswellx, icelakex, skylakex and snowridgex. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
5dc81578ad |
perf jevents: Add upi_bw metric for Intel
Break down UPI read and write bandwidth using uncore_upi counters. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
6ec3058e70 |
perf jevents: Add local/remote miss latency metrics for Intel
Derive from CBOX/CHA occupancy and inserts the average latency as is provided in Intel's uncore performance monitoring reference. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
1fee2701a7 |
perf jevents: Add C-State metrics from the PCU PMU for Intel
Use occupancy events fixed in: https://lore.kernel.org/lkml/20240226201517.3540187-1-irogers@google.com/ Metrics are at the socket level referring to cores, not hyperthreads. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
2166b44be9 |
perf jevents: Add dir breakdown metrics for Intel
Breakdown directory hit, misses and requests. The implementation uses the M2M and CHA PMUs present in server models broadwellde, broadwellx cascadelakex, emeraldrapids, icelakex, sapphirerapids and skylakex. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
cde9c1a5d9 |
perf jevents: Add local/remote "mem" breakdown metrics for Intel
Breakdown local and remote memory bandwidth, read and writes. The implementation uses the HA and CHA PMUs present in server models broadwellde, broadwellx cascadelakex, emeraldrapids, haswellx, icelakex, ivytown, sapphirerapids and skylakex. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
130f4245af |
perf jevents: Add mem_bw metric for Intel
Break down memory bandwidth using uncore counters. For many models this matches the memory_bandwidth_* metrics, but these metrics aren't made available on all models. Add support for free running counters. Query the event JSON when determining which what events/counters are available. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
426b844289 |
perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
Number of outstanding load misses per cycle. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
d666f0172a |
perf jevents: Add FPU metrics for Intel
Metrics break down of floating point operations. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
2f3d6ea05d |
perf jevents: Add context switch metrics for Intel
Metrics break down context switches for different kinds of instruction. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
59341f4e17 |
perf jevents: Add ILP metrics for Intel
Use the counter mask (cmask) to see how many cycles an instruction takes to retire. Present as a set of ILP metrics. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
d80edef231 |
perf jevents: Add load store breakdown metrics ldst for Intel
Give breakdown of number of instructions. Use the counter mask (cmask) to show the number of cycles taken to retire the instructions. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
7413633e25 |
perf jevents: Add L2 metrics for Intel
Give a breakdown of various L2 counters as metrics, including totals, reads, hardware prefetcher, RFO, code and evictions. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
cd1c6a4874 |
perf jevents: Add ports metric group giving utilization on Intel
The ports metric group contains a metric for each port giving its utilization as a ratio of cycles. The metrics are created by looking for UOPS_DISPATCHED.PORT events. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
397fdb3a24 |
perf jevents: Add software prefetch (swpf) metric group for Intel
Add metrics that breakdown software prefetch instruction use. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |