250 Commits

Author SHA1 Message Date
tbkka ab861d5890 Pass architecture into Benchmark_Driver to fix build-script -B (#33100)
* Pass architecture into Benchmark_Driver to fix `build-script -B`

* "Benchmark_Driver compare" does not need the architecture
2020-07-25 11:15:49 -07:00
tbkka 3181dd1e4c Fix a bunch of python lint errors (#32951)
* Fix a bunch of python lint errors

* adjust indentation
2020-07-17 14:30:21 -07:00
Erik Eckstein 2387732ab5 benchmarks: support new executable file names in perf_test_driver
rdar://problem/65508278
2020-07-16 15:43:37 +02:00
Erik Eckstein a46cda8c51 benchmarks: fix run_smoke_bench to support new benchmark executable naming scheme
Find the right benchmark executable with a glob pattern.
Also, add an option "-arch" to select between executables for different architectures.
2020-07-07 11:01:49 +02:00
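The glob-based lookup described in commit a46cda8c51 could be sketched roughly as follows. This is a minimal illustration, not the code from `run_smoke_bench`: the function name `find_benchmark_executable` and the assumed file name layout `Benchmark_<opt>-<arch>-...` are hypothetical.

```python
import glob
import os


def find_benchmark_executable(tools_dir, opt_level, arch=None):
    """Return the first executable matching the (assumed) naming scheme.

    Assumption: executables are named "Benchmark_<opt>-<arch>-<rest>".
    Returns None when nothing matches.
    """
    prefix = "Benchmark_" + opt_level + "-"
    if arch:
        # Narrow the pattern to one architecture, like the "-arch" option.
        prefix += arch
    matches = sorted(glob.glob(os.path.join(tools_dir, prefix + "*")))
    return matches[0] if matches else None
```

Including the `-` after the optimization level in the pattern keeps `Benchmark_O` from also matching `Benchmark_Onone` or `Benchmark_Osize`.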
Meghana Gupta 911ac8e45e Fix code size reporting when input directory is missing a trailing '/'
run_smoke_bench script fails to report code size changes if you have a
trailing '/' in <old_build_dir> but not <new_build_dir>.

This change appends a separator if it is missing
2020-05-14 13:49:35 -07:00
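The fix in commit 911ac8e45e amounts to normalizing both build directories before comparing them. A minimal sketch, assuming a hypothetical helper name `with_trailing_separator` (the actual script may do this inline):

```python
import os


def with_trailing_separator(path):
    """Append the platform path separator if the path does not end with one."""
    return path if path.endswith(os.sep) else path + os.sep
```

Applying the helper to both `<old_build_dir>` and `<new_build_dir>` makes the two paths comparable regardless of how the user typed them.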
Sergej Jaskiewicz cce9e81f0b Support Python 3 in the benchmark suite 2020-02-28 01:45:35 +03:00
Ross Bayer b1961745e0 [Python: black] Reformatted the benchmark Python sources using utils/python_format.py. 2020-02-08 15:32:44 -08:00
Michael Gottesman 2840a7609d When gathering counters, check for instability and FAIL otherwise.
The way we already gather numbers for this test is that we run two runs of
`Benchmark_O $TEST` with num-samples=2, iters={2,3}. Under the assumption that
the only difference in counter numbers can be caused by that extra iteration,
subtracting the group of counts for 2,3 gives us the number of counts in that
iteration.

In certain cases, I have found that a small subset of the benchmarks are
producing weird output and I haven't had the time to look into why. That being
said, I do know what these weird results look like, so in this commit we do some
extra validation work to see if we need to fail a test due to instability.

The specific validation is that:

1. We perform another run with num-samples=2, iter=5 and subtract the iter=3
counts from that. Under the assumption that overall work should increase
linearly with iteration size in our benchmarks, we check if the counts are
actually 2x.

2. We check whether either `result[iter=3] - result[iter=2]` or `result[iter=5] -
result[iter=3]` is negative. None of the counters we gather should ever decrease
with iteration count.
2020-01-15 14:41:21 -08:00
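The two validation rules from commit 2840a7609d can be illustrated with a short sketch. The function name `unstable_counters` and the dictionary inputs are hypothetical, and the exact-equality test for the 2x rule is an assumption; the real script may compare differently.

```python
def unstable_counters(counts_iter2, counts_iter3, counts_iter5):
    """Return the names of counters that fail the stability checks.

    Each argument maps a counter name to the count observed in a run
    with the given number of iterations (2, 3 and 5).
    """
    unstable = []
    for name in sorted(counts_iter2):
        one_iter = counts_iter3[name] - counts_iter2[name]   # 1 extra iteration
        two_iters = counts_iter5[name] - counts_iter3[name]  # 2 extra iterations
        if one_iter < 0 or two_iters < 0:
            # Counters should never decrease with iteration count.
            unstable.append(name)
        elif two_iters != 2 * one_iter:
            # Work should grow linearly (assumption: exact 2x is required).
            unstable.append(name)
    return unstable
```

For example, counts of 10, 15 and 25 for iters 2, 3 and 5 pass both checks, because one extra iteration adds 5 and two extra iterations add 10.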
Michael Gottesman 461f17e5b7 Change -csv flag to be --emit-csv. 2020-01-15 14:41:21 -08:00
Michael Gottesman 35aa0405d1 Pattern match test names, not numbers to capture test names from Benchmark_O --list
This makes the output of the test more readable.
2020-01-15 14:41:21 -08:00
Michael Gottesman 676411f0b0 Have dtrace aggregate rr opts and start tracking {retain,release}_n.
Otherwise, one can get results that seem to imply more rr traffic when, in
reality, one was not tracking {retain,release}_n calls that, as a result of
better optimization, became simple retain, release calls.
2020-01-15 14:39:55 -08:00
Michael Gottesman 6fff30c122 [benchmark-dtrace] Enabling multiprocessing option to speed up gathering data. 2020-01-08 16:06:56 -08:00
Michael Gottesman c7c2e6e17b [benchmark-dtrace] Fix the amount of samples taken alongside the number of iters.
Otherwise, the output is not stable.
2020-01-08 16:06:56 -08:00
Michael Gottesman d48cdd9cad [benchmark-dtrace] Set SWIFT_DETERMINISTIC_HASHING=1 before calling subjobs.
This prevents a bunch of instability in the retain, release numbers. I am still
getting some of it, but this helps a lot.
2020-01-08 16:06:56 -08:00
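Setting `SWIFT_DETERMINISTIC_HASHING=1` for subjobs, as in commit d48cdd9cad, can be done by passing a modified environment to the child process. A minimal sketch, where the helper name `run_with_deterministic_hashing` is hypothetical:

```python
import os
import subprocess


def run_with_deterministic_hashing(cmd):
    """Run cmd with SWIFT_DETERMINISTIC_HASHING=1 and return its stdout."""
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["SWIFT_DETERMINISTIC_HASHING"] = "1"
    return subprocess.check_output(cmd, env=env)
```

Copying the environment keeps the setting local to the subjob rather than changing the driver's own process state.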
Alex Hoppen 932525d762 [gardening] Fix several python-lint warnings 2019-10-29 10:40:20 -07:00
Alex Hoppen 776e2c0030 Revert "Migrate building SwiftSyntax to swift_build_support" 2019-10-29 09:55:32 -07:00
Alex Hoppen 46501b881f [gardening] Fix several python-lint warnings 2019-10-25 15:58:07 -07:00
Erik Eckstein 81a5c0f479 run_smoke_bench: make num_retries configurable 2019-10-14 11:37:42 +02:00
Pavol Vaskovic cc0e16ca34 [benchmark] LogParser: measurement metadata 2019-07-23 19:44:41 +02:00
Pavol Vaskovic 007d398f4a [Gardening] ReportFormatter: tying up loose ends 2019-05-24 00:18:44 +02:00
Pavol Vaskovic b3f7996ea7 [benchmark] ReportFormatter: better inline headers
Improve inline headers in `single_table` mode to also print labels for the numeric columns.

Sections in the `single_table` are visually distinguished by a separator row preceding the inline headers.

Separated header label styles for git and markdown modes with UPPERCASE and **Bold** formatting respectively.

Inlined section template definitions.
2019-05-23 23:24:51 +02:00
Pavol Vaskovic 73b31006ee [benchmark] Fix help printing for run_smoke_bench 2019-05-23 21:40:44 +02:00
Pavol Vaskovic 9750581bf5 [benchmark] ReportFormatter: right-align num cols 2019-05-23 19:32:34 +02:00
Pavol Vaskovic af7ef03aaf [benchmark] ReportFormatter: refactor header logic
Confine the logic for printing headers to the header function.
2019-05-23 17:28:21 +02:00
Pavol Vaskovic a998e18e18 [benchmark] ReportFormatter: faster templating
It is slightly faster to simply concatenate strings that don’t require special formatting.
2019-05-23 12:29:19 +02:00
Pavol Vaskovic 49d25bfc51 [benchmark] ReportFormatter: de-tuple
Remove unnecessary list-to-tuple conversions.
2019-05-23 12:20:19 +02:00
Pavol Vaskovic 081e1c94a5 [benchmark] Add unit test for single table report 2019-05-22 14:54:00 +02:00
Michael Gottesman c86c1763c6 [benchmarks] Add support to the build-script swiftpm benchmarks for building the benchmarks in -Osize. 2019-04-11 10:10:38 -07:00
Michael Gottesman 53ff97428a [benchmarks] Change the build_script_helper to use subdirectories for each build and install final binaries in a toplevel ./bin build directory.
This will let me:

1. Add -Osize support easily.
2. Put all of the binaries in the same directory so that Benchmark_Driver can
   work with them via the -tools argument.
2019-04-10 22:18:50 -07:00
Michael Gottesman 115f7a43e0 Move build_script_helper from ./benchmarks/utils => ./benchmarks/scripts. 2019-04-10 22:18:50 -07:00
Pavol Vaskovic 691007b029 [benchmark] LogParser: Accept -?! in bench. names
Extend parser to support benchmark names that include `-?!` in names, to fully support the new Naming Convention from PR #20334.
2019-02-19 23:31:58 +01:00
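To illustrate what accepting `-?!` in benchmark names means for a parser (commit 691007b029), here is a small regular-expression sketch. The pattern and the helper `parse_test_line` are illustrative assumptions, not the actual `LogParser` implementation.

```python
import re

# Assumed line shape: "<number><sep><name><sep>...", where <sep> is a comma
# or whitespace. The name may contain word characters, dots and "-?!".
TEST_LINE = re.compile(r"^\s*(\d+)[, \t]+([\w.\-?!]+)[, \t]+")


def parse_test_line(line):
    """Return (test_number, test_name), or None if the line does not match."""
    match = TEST_LINE.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

Header rows such as `#,TEST,SAMPLES` do not start with a digit, so this sketch skips them.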
Pavol Vaskovic 84e7d4dfb8 [benchmark] Adjust Driver’s console output format
…to handle longer benchmark names, assuming maximum length of 40 characters.
2019-02-19 23:28:51 +01:00
Pavol Vaskovic 3f179f39e0 Increase # of independent samples for changes.
Multimodal benchmarks with significant delta between the modes can report false performance changes when we gather too few independent samples. This increases the minimum number of independent samples from 5 to 10.
Fix for https://bugs.swift.org/browse/SR-9907
2019-02-12 11:42:51 +01:00
Pavol Vaskovic 85ba83191e [benchmark] Remove unused function get_results
Remove the `get_results` function, which is no longer used after the refactoring that rebased the benchmark measurements on `BenchmarkDriver` class in #21684.
2019-02-04 10:11:57 +01:00
Patrick Balestra 3c6b3ab4cc [benchmark] Fix linter errors in create_benchmark.py 2019-01-21 22:40:23 +01:00
Patrick Balestra 0b3fa54249 [benchmark] Split template into separate line and fix linter errors 2019-01-21 22:40:23 +01:00
Patrick Balestra 1ca47e7870 [benchmark] Add script to automate creation of new single-source benchmarks
Adds a `create_benchmark` script that automates the following three tasks:
1. Add a new Swift file (YourTestNameHere.swift), built according to the template below, to the {{single-source}} directory.
2. Add the filename of the new Swift file to CMakeLists.txt
3. Edit main.swift. Import and register your new Swift module.

The process of adding new benchmarks is now automated and a lot less error-prone.
2019-01-20 22:22:08 +01:00
Gwynne Raskind faf8a5edb6 Fix indentation for python_lint 2019-01-17 02:00:40 -06:00
Gwynne Raskind 09b4159cb2 Global replace of "assertEquals" with "assertEqual" in compliance with deprecation of assertEquals name in Python 2.7 2019-01-16 04:06:38 -06:00
Pavol Vaskovic 2c271493d5 [benchmark] Limit of Accuracy in Setup Overhead
Clarified limit of accuracy in setup overhead detection.
2019-01-09 18:01:06 +01:00
Pavol Vaskovic 2096151ee9 [benchmark] BenchmarkDoctor: Lower runtime limit
Warn about runtimes under 20 μs and flag 0 μs runtimes as errors.
2019-01-08 19:16:40 +01:00
Pavol Vaskovic 8a8a3ad6df [benchmark] Limit setup overhead detection (>20)
For really small runtimes (< 20 μs), this method of setup overhead detection doesn’t work. Even a 1 μs change in a 20 μs runtime is 5%. Just return no overhead.
2019-01-08 19:15:29 +01:00
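The cutoff described in commit 8a8a3ad6df can be sketched as an early return in front of the overhead estimate. The estimate itself is an assumption made for illustration: if setup cost S is paid once per sample, the per-iteration runtime is W + S with one iteration and W + S/2 with two, so S is twice the difference. The function name and parameters are hypothetical.

```python
def setup_overhead(runtime_i1, runtime_i2, min_runtime=20):
    """Estimate setup overhead from per-iteration runtimes in microseconds.

    runtime_i1 / runtime_i2: runtimes measured with 1 and 2 iterations.
    Returns (setup, ratio); below min_runtime no overhead is reported,
    because a 1 us change in a 20 us runtime is already 5%.
    """
    if min(runtime_i1, runtime_i2) < min_runtime:
        return 0, 0.0
    setup = max(0, 2 * (runtime_i1 - runtime_i2))
    return setup, setup / float(runtime_i1)
```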
Pavol Vaskovic d854f0f898 [benchmark] test_performance with BenchmarkDriver
Refactored `test_performance` function to use existing BenchmarkDriver and TestComparator.

This replaces hand-rolled parser and comparison logic with library functions which already have full unit test coverage.
2019-01-08 00:22:00 +01:00
Pavol Vaskovic cd4886aa2b [Gardening] Move imports and DriverArgs to top 2019-01-07 20:59:47 +01:00
Pavol Vaskovic 4a716445df [benchmark] BenchmarkDriver run in batch mode
Finished support for running all active tests in one batch. Returns a dictionary of PerformanceTestResults.

Known test names are passed to the harness in a compressed form as test numbers.
2019-01-07 20:59:39 +01:00
Pavol Vaskovic df3389259b [benchmark] BenchmarkDriver: store test_numbers 2019-01-07 20:57:47 +01:00
Pavol Vaskovic 1f58ad6662 [Gardening] Better names: _tests_by_name_or_number 2019-01-07 20:57:47 +01:00
Pavol Vaskovic 3023ab5545 [benchmark] BenchmarkDriver sample_time support
Added support for Benchmark_X’s `--sample-time` parameter.
2019-01-07 20:57:42 +01:00
Pavol Vaskovic b831f93dd4 [benchmark] BenchmarkDoctor: Optional markdown arg
Don't require the presence of `markdown` argument for initialization.
(It doesn't exist when BenchmarkDoctor is used from `run_smoke_bench`.)
2018-12-21 21:26:24 +01:00
Pavol Vaskovic 46f94d7709 [benchmark] BenchmarkDriver check --markdown
Added `--markdown` flag for the `check` command to output the `BenchmarkDoctor`’s report in the Markdown format (as used by swift-ci on GitHub).
2018-12-21 01:22:38 +01:00