Commit Graph

1638 Commits

Author SHA1 Message Date
Pavol Vaskovic
06061976da [benchmark] BenchmarkDoctor checks setup overhead
Detect setup overhead in benchmark and report if it exceeds 5%.
2018-08-17 08:50:04 +02:00
Pavol Vaskovic
7725c0096e [benchmark] Measure and analyze benchmark runtimes
`BenchmarkDoctor` measures benchmark execution (using `BenchmarkDriver`) and verifies that their runtime stays under 2500 microseconds.
2018-08-17 08:40:39 +02:00
Pavol Vaskovic
ab16999e20 [benchmark] Created BenchmarkDoctor (naming)
`BenchmarkDoctor` analyzes performance tests and reports their conformance to the set of desired criteria. The first two rules verify the naming convention.

`BenchmarkDoctor` is invoked from `Benchmark_Driver` with the `check` argument.
2018-08-17 08:40:39 +02:00
Pavol Vaskovic
076415f969 [benchmark] Strangler run_benchmarks
Replaced the guts of the `run_benchmarks` function with the implementation from `BenchmarkDriver`. There was only a single client, which called it with `verbose=True`, so this parameter could be safely removed.

Function `instrument_test` is replaced by running `Benchmark_O` with the `--memory` option, which implements the MAX_RSS measurement while also excluding the overhead from the benchmarking infrastructure. The incorrect computation of standard deviation was simply dropped for measurements of more than one independent sample. Bogus aggregated `Totals` statistics were removed, now reporting only the total number of executed benchmarks.
2018-08-17 08:40:39 +02:00
Pavol Vaskovic
a84db83062 [benchmark] BenchmarkDriver can run tests
The `run` method on `BenchmarkDriver` invokes the test harness with the specified number of iterations and samples. It supports measuring memory use, and in the verbose mode it also collects individual samples and monitors the system load by counting the number of voluntary and involuntary context switches.

Output is parsed using `LogParser` from `compare_perf_tests.py`. This makes that file a required dependency for the driver, therefore it is also copied to the bin directory during the build.
2018-08-17 08:39:50 +02:00
Pavol Vaskovic
e80165f316 [benchmark] Exclude only outliers from the top
Option to exclude the outliers only from top of the range, leaving in the outliers on the min side.
2018-08-17 08:39:50 +02:00
Pavol Vaskovic
27cc77c590 [benchmark] Exclude outliers from samples
Introduce an algorithm for excluding outliers after collecting all samples, using the Interquartile Range rule.

The `exclude_outliers` method uses 1st and 3rd Quartile to compute Interquartile Range, then uses inner fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR to remove samples outside this fence.

Based on experiments with collecting hundreds and thousands of samples (`num_samples`) per test with a low iteration count (`num_iters`) and ~1s runtime, this rule is very effective at improving the quality of the sample population. It removes short environmental fluctuations that were previously averaged into the overall result (by the adaptively determined `num_iters` to run for ~1s), which inflated the reported result with these measurement errors. This technique can be used for some benchmarks to get more stable results faster than before.

This outlier filtering is employed when parsing `--verbose` test results.
2018-08-17 08:39:50 +02:00
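The Interquartile Range rule described in the two commits above can be sketched roughly as follows. This is a minimal illustration, not the code from `compare_perf_tests.py`: the standalone function signature, the `top_only` parameter, the `quartile` helper and its linear interpolation are all assumptions made for the example.

```python
def quartile(sorted_samples, fraction):
    """Quantile of an already sorted list, using linear interpolation.

    The interpolation method is an assumption; other quartile
    definitions give slightly different fences.
    """
    position = (len(sorted_samples) - 1) * fraction
    lower = int(position)
    upper = min(lower + 1, len(sorted_samples) - 1)
    weight = position - lower
    return sorted_samples[lower] * (1 - weight) + sorted_samples[upper] * weight


def exclude_outliers(samples, top_only=False):
    """Return the sorted samples that lie within the inner fences.

    `top_only` is a hypothetical flag: when set, only samples above
    Q3 + 1.5*IQR are dropped and the low side is left untouched.
    """
    ordered = sorted(samples)
    if len(ordered) < 4:
        return ordered  # too few samples for meaningful quartiles
    q1 = quartile(ordered, 0.25)
    q3 = quartile(ordered, 0.75)
    iqr = q3 - q1
    low = ordered[0] if top_only else q1 - 1.5 * iqr
    high = q3 + 1.5 * iqr
    return [s for s in ordered if low <= s <= high]
```

With `top_only=True` a suspiciously fast sample survives, which matches the intent of leaving in the outliers on the min side.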
Pavol Vaskovic
91077e3289 [benchmark] Introduced PerformanceTestSamples
* Moved the functionality to compute median, standard deviation and related statistics from `PerformanceTestResult` into `PerformanceTestSamples`.
* Fixed wrong unit in comments
2018-08-17 08:39:50 +02:00
Pavol Vaskovic
bea35cb7c1 [benchmark] LogParser measure environment
Measure more of the environment during the test.

In addition to measuring maximum resident set size, also extract the number of voluntary and involuntary context switches from the verbose mode.
2018-08-17 00:32:04 +02:00
Pavol Vaskovic
c60e223a3b [benchmark] LogParser: tab & space delimited logs
Added support for tab delimited and formatted log output (space aligned columns as output to console by Benchmark_Driver).
2018-08-17 00:32:04 +02:00
Pavol Vaskovic
d0cdaee798 [benchmark] LogParser support for --verbose mode
LogParser doesn’t use `csv.reader` anymore.
Parsing is handled by a Finite State Machine. Each line is matched against a set of (mutually exclusive) regular expressions that represent known states. When a match is found, the corresponding parsing action is taken.
2018-08-17 00:32:04 +02:00
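The regex-driven parsing approach from the commit above might look something like the sketch below. The class name, the two line formats and the result layout are invented for illustration and do not reproduce the real `LogParser` or the actual verbose log format.

```python
import re


class LineStateParser(object):
    """Hypothetical parser: each regex stands for one known kind of line."""

    def __init__(self):
        self.results = {}    # test name -> list of sample values
        self.current = None  # test whose samples are being collected
        # (pattern, action) pairs; the patterns are meant to be
        # mutually exclusive, so at most one matches any given line.
        self.rules = [
            (re.compile(r"^Running (\S+) for (\d+) samples"), self._start_test),
            (re.compile(r"^\s+Sample (\d+),(\d+)"), self._add_sample),
        ]

    def _start_test(self, match):
        self.current = match.group(1)
        self.results[self.current] = []

    def _add_sample(self, match):
        if self.current is not None:
            self.results[self.current].append(int(match.group(2)))

    def parse(self, lines):
        for line in lines:
            for pattern, action in self.rules:
                match = pattern.match(line)
                if match:
                    action(match)
                    break  # unmatched lines are silently ignored
        return self.results
```

Adding support for a new kind of line then amounts to adding one more pattern and action pair, which is presumably part of the appeal over a fixed `csv.reader` format.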
Pavol Vaskovic
9852e9a32a [benchmark] Extracted LogParser class 2018-08-17 00:32:04 +02:00
Pavol Vaskovic
d079607488 [benchmark] Documentation improvements 2018-08-17 00:32:04 +02:00
Pavol Vaskovic
ce39b12929 [benchmark] Strangler: BenchmarkDriver get_tests
See https://www.martinfowler.com/bliki/StranglerApplication.html for more info on the used pattern for refactoring legacy applications.

Introduced class `BenchmarkDriver` as the beginning of a strangler application that will gradually replace old functions. Used it instead of the `get_tests()` function in Benchmark_Driver.

The interaction with Benchmark_O is simulated through mocking. `SubprocessMock` class records the invocations of command line processes and responds with canned replies in the format of Benchmark_O output.

Removed 3 redundant lit tests that are now covered by the unit test `test_gets_list_of_all_benchmarks_when_benchmarks_args_exist`. This saves 3 seconds from test execution. Keeping only a single integration test that verifies that the plumbing is connected correctly.
2018-08-17 00:32:04 +02:00
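The idea of a mock that records command line invocations and answers with canned replies can be illustrated with a small sketch. `CannedSubprocess`, `list_tests`, the `fake_benchmark` binary name and the flag passed to it are all hypothetical; the real `SubprocessMock` may be structured quite differently.

```python
class CannedSubprocess(object):
    """Hypothetical stand-in for a function like subprocess.check_output."""

    def __init__(self):
        self.calls = []    # every command line that was requested
        self.replies = {}  # command tuple -> canned output text

    def expect(self, command, reply):
        """Register the canned reply for one exact command line."""
        self.replies[tuple(command)] = reply

    def check_output(self, command, **kwargs):
        """Record the invocation and return its canned reply."""
        self.calls.append(tuple(command))
        return self.replies[tuple(command)]


def list_tests(run, binary):
    """Toy client code: ask the binary for its tests, one name per line."""
    output = run([binary, "--list"])
    return output.splitlines()


mock = CannedSubprocess()
mock.expect(["fake_benchmark", "--list"], "TestA\nTestB\n")
tests = list_tests(mock.check_output, "fake_benchmark")
```

Because the client receives the runner as a parameter, a unit test can check both the parsed result and the exact command lines that were issued, without launching any process.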
Pavol Vaskovic
69d5d5e732 [benchmark] Adding tests for BenchmarkDriver
The imports are a bit sketchy because it doesn’t have a `.py` extension and they had to be hacked manually. :-/

Extracted `parse_args` from `main` and added test coverage for argument parsing.
2018-08-17 00:32:04 +02:00
Pavol Vaskovic
0b990a82a5 [benchmark] Extracted test_utils.py
Moving the `captured_output` function to its own file.

Adding homegrown unit testing helper classes `Stub` and `Mock`.

The issue is that the unittest.mock was added in Python 3.3 and we need to run on Python 2.7. `Stub` and `Mock` were organically developed as minimal implementations to support the common testing patterns used on the original branch, but since I’m rewriting the commit history to provide an easily digestible narrative, it makes sense to introduce them here in one step as a custom unit testing library.
2018-08-16 20:08:34 +02:00
Pavol Vaskovic
343f284227 [benchmark] Removed legacy submit command
Also removed unused imports.
2018-08-16 20:08:34 +02:00
Pavol Vaskovic
179b12103f [benchmark] Refactor formatting responsibilities
Moved result formatting methods from `PerformanceTestResult` and `ResultComparison` to `ReportFormatter`, in order to free PTR to take more computational responsibilities in the future.
2018-08-16 17:44:59 +02:00
Pavol Vaskovic
f29fef6b67 [benchmark] Print delim in verbose config 2018-08-16 08:11:13 +02:00
swift-ci
63665cd229 Merge pull request #18667 from overlazy/HeapSortBenchmark 2018-08-15 21:09:01 -07:00
Erik Eckstein
edc7a0f96c benchmarks: add an option to the compare_perf_tests script to output improvements and regressions in a single table.
Instead of separate tables. Only affects git and markdown output.
2018-08-14 13:38:14 -07:00
Erik Eckstein
b4a61d7155 benchmarks: add an option to the bench_code_size script to separately report code size for the swift libraries 2018-08-14 13:38:14 -07:00
Kirill Chibisov
248e843d51 Fixed typo in identifier 2018-08-14 06:18:53 +03:00
Kirill Chibisov
b7ff8cf9d3 Fixed some typos 2018-08-13 20:10:30 +03:00
Kirill Chibisov
38db0d7ce2 Added benchmark for heapSort path of stdlib sort
This benchmark makes sorting benchmarks more complete. Now we
can measure all paths of stdlib sorting function.
2018-08-13 16:11:01 +03:00
Erik Eckstein
c1ff9bcedc benchmarks: add a script to report code size of benchmark files 2018-08-07 13:16:24 -07:00
Erik Eckstein
f82da4113b benchmarks: minor comment fix 2018-08-07 13:15:23 -07:00
Erik Eckstein
89d07e5998 benchmarks: add a script to do a very fast check of which benchmarks changed.
Runs ~ 1 min to compare two benchmark builds.
2018-08-03 10:17:53 -07:00
Erik Eckstein
729989473f benchmarks: fix the iteration count for some benchmarks
So that a single iteration is within ~2ms (and also not too short).
2018-07-31 10:59:33 -07:00
Erik Eckstein
868c5a1fb7 benchmarks: Marking some benchmarks as unstable
The noise for those benchmarks cannot be removed even by using a high sample count.
2018-07-31 10:59:33 -07:00
Erik Eckstein
30cc37e5f4 benchmarks: fix the Chars benchmark by adding blackHoles and testing all possible Char comparisons 2018-07-31 10:59:33 -07:00
Erik Eckstein
2839c1707d benchmarks: fix the Calculator benchmark by covering more String compare test cases 2018-07-31 10:59:33 -07:00
Erik Eckstein
1a161c28ae benchmarks: compile with the new -align-module-to-page-size option, if supported by the compiler
To stabilize the benchmark results.
2018-07-27 17:15:14 -07:00
eeckstein
b042215abd Merge pull request #18211 from palimondo/fluctuation-of-the-pupil
[benchmark] Alphabetic sorting of tests and warning about incorrect use of memory option
2018-07-25 14:45:23 -07:00
Erik Eckstein
53f2660e62 benchmarks: Convert the PartialApplyDynamicType into a lit test
This benchmark was added to test if the compiler crashes.
For some reason it was added as a benchmark and not as a lit test.
It has no value as a benchmark anyway because the compiler optimizes away pretty much everything.
2018-07-25 11:32:23 -07:00
Pavol Vaskovic
1a382ab775 [benchmark] Warn about incorrect --memory use 2018-07-25 08:02:28 +02:00
Pavol Vaskovic
a61a756b4d [benchmark] Fix: alphabetic sorting of tests 2018-07-25 07:45:07 +02:00
Erik Eckstein
bd43d54b99 benchmarks: Move the setup and teardown functions out of the sample loop.
This is important to minimize the runtime when many samples are taken.
2018-07-24 20:20:23 -07:00
Erik Eckstein
50db6f1ed3 benchmarks: fix the setup functions of the CharacterProperties benchmark.
Force all globals to be initialized in the setup functions
2018-07-24 20:20:23 -07:00
Erik Eckstein
f1afba1ad1 benchmarks: adapted some iteration counts and/or workload sizes for benchmarks.
Those are benchmarks which took way too long or short to execute a single iteration or benchmarks which changed in time anyway because of previous fixes.

I renamed those benchmarks so that they are now treated as "new" benchmarks.
2018-07-24 20:20:23 -07:00
Erik Eckstein
f6c24a05cc benchmarks: extract setup code into the setUpFunction in some benchmarks where setup time is significant 2018-07-24 20:20:23 -07:00
Erik Eckstein
8876c24104 benchmarks: add some blackHole calls to prevent the optimizer removing important parts of a benchmark 2018-07-24 20:18:17 -07:00
Erik Eckstein
1a7bee55c6 benchmarks: fix the iteration count of some benchmarks.
Some benchmarks wrongly executed the loop N+1 times ("0...N" instead of "0..<N")
2018-07-24 20:18:17 -07:00
Ben Cohen
345879429b [stdlib] Take several underscored stdlib functions private (#18134)
* Make _sanityCheck internal

* Make _debugPrecondition internal

* Make Optional._unsafelyUnwrappedUnchecked internal.

* Make _precondition internal

* Switch Foundation _sanityChecks to assertions

* Update file check tests

* Remove one more _debugPrecondition

* Update Optimization-with-check tests
2018-07-24 18:26:19 -07:00
eeckstein
46a83909c6 Merge pull request #18124 from palimondo/fluctuation-of-the-pupil
[benchmark] Measure memory with rusage and a TON of gardening
2018-07-23 12:40:38 -07:00
Pavol Vaskovic
362f925e37 [benchmark][Gardening] Docs and error handling
* Improved documentation.
* Corrected `fflush` usage in `parse` error handling.
* Removed unused `passThroughArgs`.
2018-07-23 18:01:23 +02:00
Pavol Vaskovic
4ed3dcfcc5 [benchmark][Gardening] --sample-time renaming
Sample time is a better name for what was previously called `iter-scale`.
2018-07-23 17:15:54 +02:00
Pavol Vaskovic
df5ccf3e26 [benchmark][Gardening] Better naming and comments
* Restored property doc comments on `TestConfig`
* Better name for func `usage` is `getResourceUtilization`
2018-07-23 11:17:16 +02:00
Pavol Vaskovic
19613733a4 [benchmark] Log the MAX_RSS only w/ --memory flag
Printing of the MAX_RSS is now hidden behind the optional `--memory` flag.
2018-07-22 06:25:23 +02:00
Ben Cohen
4694310e51 [stdlib] Some minor cleanup (#18130)
* Remove case destructuring to _

* Remove some Iterator.Element

* Which idiot wrote this? Oh.

* Switch NibbleSort to just use default impls... shouldn't change perf
2018-07-21 17:29:57 -07:00