Add quotes to avoid the following warning:
```
In tests/shell/record.sh line 264:
[ $(uname -m) = "s390x" ] && {
^---------^ SC2046 (warning): Quote this to prevent word splitting.
For more information:
https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
```
Fixes: c73a56ed3c ("perf test: Fix test case Leader sampling on s390")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The subtest 'Leader sampling' some time fails on s390.
- for z/VM guest: Disable the test for z/VM guest. There is no
CPU Measurement facility to run the test successfully.
- for LPAR: Use correct event names.
A detailed analysis follows here:
Now to the debugging and investigation:
1. With command
perf record -e '{cycles,cycles}:S' -- ....
the first cycles event starts sampling.
On s390 this sets up sampling with a frequency of 4000 Hz.
This translates to hardware sample rate of 1377000 instructions per
micro-second to meet a frequency of 4000 HZ.
2. With first event cycles now sampling into a hardware buffer, an
interrupt is triggered each time a sampling buffer gets full.
The interrupt handler is then invoked and debug output shows the
processing of samples. The size of one hardware sample is 32 bytes.
With an interrupt triggered when the hardware buffer page of 4KB
gets full, the interrupt handler processes 128 samples.
(This is taken from s390 specific fast debug data gathering)
2025-11-07 14:35:51.977248 000003ffe013cbfa \
perf_event_count_update event->count 0x0 count 0x1502e8
2025-11-07 14:35:51.977248 000003ffe013cbfa \
perf_event_count_update event->count 0x1502e8 count 0x1502e8
2025-11-07 14:35:51.977248 000003ffe013cbfa \
perf_event_count_update event->count 0x2a05d0 count 0x1502e8
2025-11-07 14:35:51.977252 000003ffe013cbfa \
perf_event_count_update event->count 0x3f08b8 count 0x1502e8
2025-11-07 14:35:51.977252 000003ffe013cbfa \
perf_event_count_update event->count 0x540ba0 count 0x1502e8
2025-11-07 14:35:51.977253 000003ffe013cbfa \
perf_event_count_update event->count 0x690e88 count 0x1502e8
2025-11-07 14:35:51.977254 000003ffe013cbfa \
perf_event_count_update event->count 0x7e1170 count 0x1502e8
2025-11-07 14:35:51.977254 000003ffe013cbfa \
perf_event_count_update event->count 0x931458 count 0x1502e8
2025-11-07 14:35:51.977254 000003ffe013cbfa \
perf_event_count_update event->count 0xa81740 count 0x1502e8
3. The value is constantly increasing by the number of instructions
executed to generate a sample entry. This is the first line of the
pairs of lines. count 0x1502e8 --> 1377000
# perf script | grep 1377000 | wc -l
214
# perf script | wc -l
428
#
That is 428 lines in total, and half of the lines contain value
1377000.
4. The second event cycles is opened against the counting PMU, which
is an independent PMU and is not interrupt driven. Once enabled it
runs in the background and keeps running, incrementing silently
about 400+ counters. The counter values are read via assembly
instructions.
This second counter PMU's read call back function is called when the
interrupt handler of the sampling facility processes each sample. The
function call sequence is:
perf_event_overflow()
+--> __perf_event_overflow()
+--> __perf_event_output()
+--> perf_output_sample()
+--> perf_output_read()
+--> perf_output_read_group()
for_each_sibling_event(sub, leader) {
values[n++] = perf_event_count(sub, self);
printk("%s sub %p values %#lx\n", __func__, sub, values[n-1]);
}
The last function perf_event_count() is invoked on the second event
cylces *on* the counting PMU. An added printk statement shows the
following lines in the dmesg output:
# dmesg|grep perf_output_read_group |head -10
[ 332.368620] perf_output_read_group sub 00000000d80b7c1f values 0x3a80917 (1)
[ 332.368624] perf_output_read_group sub 00000000d80b7c1f values 0x3a86c7f (2)
[ 332.368627] perf_output_read_group sub 00000000d80b7c1f values 0x3a89c15 (3)
[ 332.368629] perf_output_read_group sub 00000000d80b7c1f values 0x3a8c895 (4)
[ 332.368631] perf_output_read_group sub 00000000d80b7c1f values 0x3a8f569 (5)
[ 332.368633] perf_output_read_group sub 00000000d80b7c1f values 0x3a9204b
[ 332.368635] perf_output_read_group sub 00000000d80b7c1f values 0x3a94790
[ 332.368637] perf_output_read_group sub 00000000d80b7c1f values 0x3a9704b
[ 332.368638] perf_output_read_group sub 00000000d80b7c1f values 0x3a99888
#
This correlates with the output of
# perf report -D | grep 'id 00000000000000'|head -10
..... id 0000000000000006, value 00000000001502e8, lost 0
..... id 000000000000000e, value 0000000003a80917, lost 0 --> line (1) above
..... id 0000000000000006, value 00000000002a05d0, lost 0
..... id 000000000000000e, value 0000000003a86c7f, lost 0 --> line (2) above
..... id 0000000000000006, value 00000000003f08b8, lost 0
..... id 000000000000000e, value 0000000003a89c15, lost 0 --> line (3) above
..... id 0000000000000006, value 0000000000540ba0, lost 0
..... id 000000000000000e, value 0000000003a8c895, lost 0 --> line (4) above
..... id 0000000000000006, value 0000000000690e88, lost 0
..... id 000000000000000e, value 0000000003a8f569, lost 0 --> line (5) above
Summary:
- Above command starts the CPU sampling facility, with runs interrupt
driven when a 4KB page is full. An interrupt processes the 128 samples
and calls eventually perf_output_read_group() for each sample to save it
in the event's ring buffer.
- At that time the CPU counting facility is invoked to read the value of
the event cycles. This value is saved as the second value in the
sample_read structure.
- The first and odd lines in the perf script output displays the period
value between 2 samples being created by hardware. It is the number
of instructions executes before the hardware writes a sample.
- The second and even lines in the perf script output displays the number
of CPU cycles needed to process each sample and save it in the event's
ring buffer.
These 2 different values can never be identical on s390.
Since event leader sampling is not possible on s390 the perf tool will
return EOPNOTSUPP soon. Perpare the test case for that.
Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Jan Polensky <japo@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On my system, perf list is very slow to print the whole events. I think
there's a performance issue in SDT and uprobes event listing. I noticed
this issue while running perf test on x86 but it takes long to check
some CoreSight event which should be skipped quickly.
Anyway, some test uses perf list to check whether the required event is
available before running the test. The perf list command can take an
argument to specify event class or (glob) pattern. But glob pattern is
only to suppress output for unmatched ones after checking all events.
In this case, specifying event class is better to reduce the number of
events it checks and to avoid buggy subsystems entirely.
No functional changes intended.
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20241016065654.269994-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Add it to the record.sh shell test to verify if it tracks cgroup
information correctly. It records with --all-cgroups option can check
if it has PERF_RECORD_CGROUP and the names are not "unknown".
$ sudo ./perf test -vv 95
95: perf record tests:
--- start ---
test child forked, pid 2871922
169c90-169cd0 g test_loop
perf does have symbol 'test_loop'
Basic --per-thread mode test
Basic --per-thread mode test [Success]
Register capture test
Register capture test [Success]
Basic --system-wide mode test
Basic --system-wide mode test [Success]
Basic target workload test
Basic target workload test [Success]
Branch counter test
branch counter feature not supported on all core PMUs (/sys/bus/event_source/devices/cpu) [Skipped]
Cgroup sampling test
Cgroup sampling test [Success]
---- end(0) ----
95: perf record tests : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240818212948.2873156-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Subtest for system-wide record with '--threads=cpu' option fails due
to a limit of open file descriptors on systems with 128 or more CPUs
as the default limit is set to 1024.
The number of open file descriptors should be slightly above
nmb_events*nmb_cpus + nmb_cpus(for perf.data.n) + 4*nmb_cpus(for pipes),
which equals 8*nmb_cpus. Therefore, temporarily raise the limit to
16*nmb_cpus for the test.
Committer notes:
Instead of disabling ShellCheck warnings all the uses of 'uname -n',
i.e. those:
In tests/shell/record.sh line 35:
default_fd_limit=$(ulimit -Sn)
^-^ SC3045 (warning): In POSIX sh, ulimit -S is undefined.
We can just switch from using '/bin/sh' to '/bin/bash' for this test, as
bash _has_ 'ulimit -n', so ShellCheck will not emit that warning.
There are dozens of 'perf test' shell tests that do just that,
'/bin/bash' is a reasonable expectation for those tests.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Radostin Stoyanov <rstoyano@redhat.com>
Link: https://lore.kernel.org/linux-perf-users/20240429085721.10122-1-vmolnaro@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record test depends on finding symbol test_loop in perf, and fails if
perf has been stripped and no debug object is available. In that case, skip
the test instead.
Example:
Note, building with perl support adds option -Wl,-E which causes the
linker to add all (global) symbols to the dynamic symbol table. So the
test_loop symbol, being global, does not get stripped unless NO_LIBPERL=1
Before:
$ make NO_LIBPERL=1 -C tools/perf >/dev/null 2>&1
$ strip tools/perf/perf
$ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
$ tools/perf/perf test -v 'record tests'
91: perf record tests :
--- start ---
test child forked, pid 118750
Basic --per-thread mode test
Per-thread record [Failed missing output]
Register capture test
Register capture test [Success]
Basic --system-wide mode test
System-wide record [Skipped not supported]
Basic target workload test
Workload record [Failed missing output]
test child finished with -1
---- end ----
perf record tests: FAILED!
After:
$ tools/perf/perf test -v 'record tests'
91: perf record tests :
--- start ---
test child forked, pid 120025
perf does not have symbol 'test_loop'
perf is missing symbols - skipping test
test child finished with -2
---- end ----
perf record tests: Skip
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf test -F 81 ("perf record tests") -v fails on s390x on the
linux-next branch.
The test case is x86 specific can not be executed on s390x. The test
case depends on x86 register names such as:
... | egrep -q 'available registers: AX BX CX DX ....'
Skip this test case on s390x.
Output before:
# perf test -F 81
81: perf record tests : FAILED!
#
Output after:
# perf test -F 81
81: perf record tests : Skip
#
Fixes: 24f378e660 ("perf test: Add basic perf record tests")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220428122821.3652015-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>