Commit Graph

172 Commits

Author SHA1 Message Date
Pavol Vaskovic
7bbf2e1d0a [benchmark] Gardening: Extract constant oneSecond 2018-09-14 23:40:43 +02:00
Pavol Vaskovic
1ba2ee41d9 [benchmark] Gardening: Code format class Timer 2018-09-14 23:40:43 +02:00
Pavol Vaskovic
496a277419 [benchmark] Gardening: Indentation of .listTests 2018-09-14 23:40:43 +02:00
Ben Langmuir
423e145b0c Revert "[benchmark] Report Quantiles from Benchmark_O and a TON of Gardening" 2018-09-14 13:24:01 -07:00
Pavol Vaskovic
a56c55c8e4 [benchmark] Round quantile idx to nearest or even
Explicitly use round-half-to-even rounding algorithm to match the behavior of numpy's quantile(interpolation='nearest') and quantile estimate type R-3, SAS-2. See:
https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample
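
A minimal sketch of the rounding rule in Python, whose built-in `round` also rounds halves to the nearest even integer. The function name `quantile_nearest` and the fractional index formula `(n - 1) * q` are assumptions made for illustration; they are not taken from this commit.

```python
def quantile_nearest(sorted_samples, q):
    """Pick the sample nearest to quantile q (0.0 ... 1.0).

    Assumption: the fractional index is (n - 1) * q. Ties are rounded
    half-to-even by Python's built-in round().
    """
    if not sorted_samples:
        raise ValueError("no samples")
    index = round((len(sorted_samples) - 1) * q)
    return sorted_samples[index]


samples = [10, 20, 30, 40, 50, 60]
# The fractional index is 2.5; half-to-even rounding picks index 2.
median = quantile_nearest(samples, 0.5)
```

With four samples the fractional index for the median is 1.5, which rounds up to the even index 2, so ties do not drift consistently in one direction.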
2018-09-10 10:45:00 +02:00
Pavol Vaskovic
313dfda5a4 [benchmark] Option: delta encoded quantiles format
Added `--delta` argument to print the quantiles in a delta-encoded format that omits 0s.

This results in machine and human readable output that highlights modes and is easily digestible, giving you the feel for the underlying probability distribution of the samples in the reported results:

````
$ ./Benchmark_O --num-iters=1 --num-samples=20 --quantile=20 --delta 170 171 184 185 198 199 418 419 432 433 619 620
#,TEST,SAMPLES,MIN(μs),𝚫V1,𝚫V2,𝚫V3,𝚫V4,𝚫V5,𝚫V6,𝚫V7,𝚫V8,𝚫V9,𝚫VA,𝚫VB,𝚫VC,𝚫VD,𝚫VE,𝚫VF,𝚫VG,𝚫VH,𝚫VI,𝚫VJ,𝚫MAX
170,DropFirstArray,20,171,,,,,,,,,,,,,,,,,,,2,29
171,DropFirstArrayLazy,20,168,,,,,,,,,,,,,,,,,,,,8
184,DropLastArray,20,55,,,,,,,,,,,,,,,,,,,,26
185,DropLastArrayLazy,20,65,,,,,,,,,,,,,,,,,,,1,90
198,DropWhileArray,20,214,1,,,,,,,,,,,,,,,,,1,27,2
199,DropWhileArrayLazy,20,464,,,,1,,,,,,,,1,1,1,4,9,1,9,113,2903
418,PrefixArray,20,132,,,,,,,,,,,,,,,,,1,1,32,394
419,PrefixArrayLazy,20,168,,,,,,,,,,,,1,,2,9,1,15,8,88,3338
432,PrefixWhileArray,20,252,1,,,,1,,,,,,,,,,,1,,,,30
433,PrefixWhileArrayLazy,20,168,,,,,,,,,,,,,1,,6,6,14,43,28,10200
619,SuffixArray,20,68,,,,,,,,,,,,,1,,,,22,1,1,4
620,SuffixArrayLazy,20,65,,,,,,,,,,,,,,,,,,1,9,340
````
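
The encoding shown above can be sketched in a few lines of Python. This is an illustration of the idea only; the helper names `delta_encode` and `delta_decode` are hypothetical and the sample values are made up.

```python
def delta_encode(values):
    """Encode sorted integers as the first value plus successive differences.

    Zero differences become empty strings, mirroring the empty CSV
    fields in the output above.
    """
    if not values:
        return []
    fields = [str(values[0])]
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        fields.append(str(delta) if delta != 0 else "")
    return fields


def delta_decode(fields):
    """Inverse of delta_encode: rebuild absolute values from the deltas."""
    values = []
    total = 0
    for field in fields:
        total += int(field) if field else 0
        values.append(total)
    return values


encoded = delta_encode([214, 215, 215, 215, 216, 243, 245])
row = ",".join(encoded)          # "214,1,,,1,27,2"
decoded = delta_decode(row.split(","))
```

Runs of empty fields mark a mode, a stretch of identical measurements, while the non-empty fields show where the distribution spreads out.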
2018-09-03 16:00:05 +02:00
Pavol Vaskovic
13e7c3faaf [benchmark] Gardening maxRSS as Int? 2018-09-02 11:47:43 +02:00
Pavol Vaskovic
1f465b9bf7 [benchmark] Report quantiles from samples
The default benchmark result reports statistics of a normal distribution: mean and standard deviation. Unfortunately the samples from our benchmarks are *not normally distributed*. To get a better picture of the underlying probability distribution, this adds support for reporting quantiles.

See https://en.wikipedia.org/wiki/Quantile

This gives a better subsample of the measurements in the summary, without the need to resort to the full verbose mode, which might be unnecessarily slow.
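
To illustrate the point about skewed samples, here is a small Python sketch using the standard `statistics` module. The sample values are hypothetical, chosen to mimic a stable benchmark with two outliers.

```python
import statistics

# Hypothetical samples in microseconds: mostly stable, with two outliers
# of the kind caused by preemption.
samples = sorted([100] * 17 + [101, 450, 3000])

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)
# Three cut points splitting the data into four equal groups.
q1, median, q3 = statistics.quantiles(samples, n=4)
```

Here the quartiles all sit at the typical value, while the mean and standard deviation are dominated by the two outliers.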
2018-08-31 23:16:34 +02:00
Pavol Vaskovic
6079d4ff03 [benchmark] Rename SampleRunner -> TestRunner
It is now running all the benchmarks, so it’s a TestRunner.
2018-08-31 18:05:26 +02:00
Pavol Vaskovic
ae7d82b607 [benchmark] Gardening: Even nicer microseconds 2018-08-31 18:05:20 +02:00
Pavol Vaskovic
df3b38589e [benchmark] Gardening: Fixed method indentation 2018-08-31 17:25:16 +02:00
Pavol Vaskovic
cdcb631469 [benchmark] Refactor run runBenchmarks logVerbose
Extracted nested func logVerbose as instance method on SampleRunner.

Internalized the free functions `runBech` and `runBenchmarks` into SampleRunner as methods `run` and `runBenchmarks`.
2018-08-31 17:17:48 +02:00
Pavol Vaskovic
265f537d10 [benchmark] Extract yield & add resetMeasurements 2018-08-31 17:17:48 +02:00
Pavol Vaskovic
be39c02001 [benchmark] Refactor numIters computation
The spaghetti if-else code was untangled into a nested function that computes `iterationsPerSampleTime` and a single constant `numIters` expression that takes care of the overflow capping as well as the choice between the fixed and the computed `numIters` value.

The `numIters` is now computed and logged only once per benchmark measurement instead of on every sample.

The sampling loop is now just a single line. Hurrah!

Modified test to verify that the `LogParser` maintains `num-iters` derived from the `Measuring with scale` message across samples.
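
The shape of the refactored computation might look like the following Python sketch. It is not the Swift code from this commit: the cap value, the handling of a zero calibration time, and all names other than `numIters` and `iterationsPerSampleTime` are assumptions made for illustration.

```python
INT_MAX = 2**63 - 1                 # stand-in for Swift's Int.max on 64-bit
NUM_ITERS_CAP = INT_MAX // 10_000   # hypothetical cap, chosen for illustration


def compute_num_iters(one_iter_us, sample_time_s=1.0, fixed_num_iters=None):
    """Return how many iterations to run per sample.

    one_iter_us: calibration time of a single iteration, in microseconds.
    """
    def iterations_per_sample_time():
        if one_iter_us <= 0:
            # Assumption: a run too fast to time gets the capped maximum.
            return NUM_ITERS_CAP
        return max(int(sample_time_s * 1_000_000 / one_iter_us), 1)

    # Single expression: a fixed value wins, otherwise the computed one
    # is used; the cap applies either way.
    return min(
        fixed_num_iters if fixed_num_iters is not None
        else iterations_per_sample_time(),
        NUM_ITERS_CAP,
    )
```

For example, `compute_num_iters(1000)` yields 1000 iterations for a 1 ms calibration run and a 1 second sample time. Because the value is computed once per benchmark, the sampling loop only needs to reuse it.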
2018-08-31 17:17:48 +02:00
Pavol Vaskovic
46ee2a4bd8 [benchmark] Refactor sampling loop with addSample
Extracted sample saving to inner func `addSample`.
Used it to save the `oneIter` sample from `numIters` calibration when it comes out as 1 and continue the for loop to next sample.

This simplified the following code, which can now always measure the sample with `numIters` and save it.
2018-08-31 07:32:23 +02:00
Pavol Vaskovic
3c55e30382 [benchmark] Gardening: Documentation of numIters
Clarified the need for capping `numIters` according to the discussion at https://github.com/apple/swift/pull/17268#issuecomment-404831035

The sampling loop is a hairy piece of code, because it’s trying to reuse the calibration measurement as a regular sample, in case the computed `numIters` turns out to be 1. But it conflicts with the case when `fixedNumIters` is 1, necessitating a separate measurement in the else branch… That was a quick fix back then, but it’s hard to make it clean. More thinking is required…
2018-08-31 07:32:22 +02:00
Pavol Vaskovic
6e27af7142 [benchmark] Gardening: Sensibly rename variables
To make sense of this spaghetti code, let’s first use reasonable variable names:
* scale -> numIters
* elapsed_time -> time
2018-08-31 07:32:21 +02:00
Pavol Vaskovic
46bef893d0 [benchmark] Gardening: DRYer verbose log 2018-08-31 07:32:20 +02:00
Pavol Vaskovic
9c4876eabd [benchmark] Refactor to currency type Int
Removed unnecessary use of UInt64, where appropriate, following the advice from Swift Language Guide:

> Use the `Int` type for all general-purpose integer constants and variables in your code, even if they’re known to be nonnegative. Using the default integer type in everyday situations means that integer constants and variables are immediately interoperable in your code and will match the inferred type for integer literal values.
https://docs.swift.org/swift-book/LanguageGuide/TheBasics.html#ID324
2018-08-31 07:32:19 +02:00
Pavol Vaskovic
f017d98050 [benchmark] Refactor to report samples in μs
Moved the adjustment of `lastSampleTime` to account for the `scale` (`numIters`) and conversion to microseconds into SampleRunner’s `measure` method.
2018-08-31 07:32:18 +02:00
Pavol Vaskovic
a03aede90d [benchmark] Gardening: scale was always Int
Since the `scale` (or `numIters`) is passed to the `test.runFunction` as `Int`, the whole type-casting dance here was just silly!
2018-08-31 07:32:17 +02:00
Pavol Vaskovic
bf4a343124 [benchmark] Gardening: numSamples UInt vs Int
Type check command line argument to be non-negative, but store value in currency type `Int`.
2018-08-31 07:32:16 +02:00
Pavol Vaskovic
77dff0a1d7 [benchmark] Gardening: afterRunSleep is UInt32 2018-08-31 07:32:14 +02:00
Pavol Vaskovic
aa4b84934a [benchmark] Move stats computation to BenchResults 2018-08-31 07:32:12 +02:00
Pavol Vaskovic
0db20feda2 [benchmark] Fix index computation for quantiles
It turns out that both the old code in `DriverUtils` that computed the median and the newer quartile computation in `PerformanceTestSamples` had an off-by-1 error.

It truly is the 3rd of the 2 hard things in computer science!
2018-08-31 07:32:10 +02:00
Pavol Vaskovic
5cd9f53840 [benchmark] Refactor min max median computation
We can spare 2 array passes (for min and max), if we just sort first.
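
A minimal Python sketch of the idea: once the samples are sorted, the minimum and maximum are the first and last elements, so no extra passes are needed. The function name and the choice of the lower middle element as the median are assumptions, not details from this commit.

```python
def min_max_median(samples):
    """Return (min, max, median) using a single sort.

    Assumption: for an even count this takes the lower middle element
    rather than averaging the two middle ones.
    """
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    return ordered[0], ordered[-1], ordered[(len(ordered) - 1) // 2]


stats = min_max_median([68, 65, 90, 66, 70])
```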
2018-08-31 07:32:09 +02:00
Pavol Vaskovic
963995fa2b [benchmark] Refactor mean and stdev computation 2018-08-31 07:32:08 +02:00
Pavol Vaskovic
974994c13a [benchmark] Gardening: Timer Parasite Control
Improve conformance to Swift Naming Guidelines
https://swift.org/documentation/api-design-guidelines/#parameter-names

Removed the gross underscores and ticks from parameter names. Ticks are ectoparasites feeding on blood. We are just measuring time, and there is no [mysterious ticking noise](http://bit.ly/TickNoise) here either…
2018-08-31 07:32:06 +02:00
Pavol Vaskovic
c1a694de30 [benchmark] Gardening: Extract constant oneSecond 2018-08-31 07:32:04 +02:00
Pavol Vaskovic
b373132548 [benchmark] Gardening: Code format class Timer 2018-08-31 07:32:03 +02:00
Pavol Vaskovic
e6cff27bab [benchmark] Gardening: Indentation of .listTests 2018-08-31 07:32:02 +02:00
swift-ci
33d95a3d22 Merge pull request #18984 from graydon/follow-my-simple-instruction 2018-08-30 16:20:29 -07:00
Pavol Vaskovic
b482a049c3 [benchmark] Gentleman yields. Don’t be a CPU hog!
It pays off to be a nice process and yield the processor at regular intervals, to prevent measured samples from being corrupted by preemptive multitasking.

When the scheduled time slice (10ms on Mac OS) is probably about to expire during the next measurement, we voluntarily yield and resume the measurement within a fresh scheduler quantum.

This cooperative approach to multitasking improves the sample quality and robustness of the measurement process.
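
The cooperative approach could be sketched as follows in Python. This is an illustration under stated assumptions, not the Swift implementation: the class `CooperativeSampler` is hypothetical, and using the previous sample's duration as the estimate for the next one is a guess at how "probably about to expire" might be decided.

```python
import os
import time

TIME_SLICE_NS = 10_000_000  # 10 ms, the quantum cited in the commit message


def yield_cpu():
    """Give up the rest of the current time slice."""
    if hasattr(os, "sched_yield"):
        os.sched_yield()
    else:
        time.sleep(0)  # fallback where sched_yield is unavailable


class CooperativeSampler:
    def __init__(self, time_slice_ns=TIME_SLICE_NS):
        self.time_slice_ns = time_slice_ns
        self.slice_start = time.perf_counter_ns()
        self.last_sample_ns = 0  # estimate for the next sample's duration
        self.yields = 0

    def measure(self, fn):
        """Time fn() in nanoseconds, yielding first if the slice is nearly used up."""
        spent = time.perf_counter_ns() - self.slice_start
        if spent + self.last_sample_ns > self.time_slice_ns:
            yield_cpu()
            self.yields += 1
            self.slice_start = time.perf_counter_ns()
        start = time.perf_counter_ns()
        fn()
        self.last_sample_ns = time.perf_counter_ns() - start
        return self.last_sample_ns
```

The sampler cannot observe the scheduler's real quantum boundaries, so it approximates them by restarting its own clock after each voluntary yield.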
2018-08-28 05:37:43 +02:00
Pavol Vaskovic
7969be1c35 [benchmark] Explicitly enable static dispatch
Marking `Timer` and `SampleRunner` classes as `final` to make sure their methods use static dispatch.
2018-08-28 05:37:43 +02:00
Pavol Vaskovic
c98006c10b [benchmark] Gardening: Fixed indentation 2018-08-28 05:37:43 +02:00
Graydon Hoare
94725dff2b [benchmark] Add preliminary helper for measuring instructions executed. 2018-08-25 01:29:37 -07:00
Pavol Vaskovic
7ae5d7754c [benchmark] Report totals as a sentence
Clean up after removing bogus aggregate statistics from the last line of the log. It makes more sense to report the total number of executed benchmarks as a sentence than to try to fit it into the format of the preceding table.

Added a test assertion that `run_benchmarks` returns a CSV-formatted log, as it is used to write the log into a file in `log_results`.
2018-08-23 18:01:46 +02:00
Pavol Vaskovic
f29fef6b67 [benchmark] Print delim in verbose config 2018-08-16 08:11:13 +02:00
eeckstein
b042215abd Merge pull request #18211 from palimondo/fluctuation-of-the-pupil
[benchmark] Alphabetic sorting of tests and warning about incorrect use of memory option
2018-07-25 14:45:23 -07:00
Pavol Vaskovic
1a382ab775 [benchmark] Warn about incorrect --memory use 2018-07-25 08:02:28 +02:00
Pavol Vaskovic
a61a756b4d [benchmark] Fix: alphabetic sorting of tests 2018-07-25 07:45:07 +02:00
Erik Eckstein
bd43d54b99 benchmarks: Move the setup and teardown functions out of the sample loop.
This is important to minimize the runtime when many samples are taken.
2018-07-24 20:20:23 -07:00
Pavol Vaskovic
362f925e37 [benchmark][Gardening] Docs and error handling
* Improved documentation.
* Corrected `fflush` usage in `parse` error handling.
* Removed unused `passThroughArgs`.
2018-07-23 18:01:23 +02:00
Pavol Vaskovic
4ed3dcfcc5 [benchmark][Gardening] --sample-time renaming
Sample time is a better name for what was previously called `iter-scale`.
2018-07-23 17:15:54 +02:00
Pavol Vaskovic
df5ccf3e26 [benchmark][Gardening] Better naming and comments
* Restored property doc comments on `TestConfig`
* Better name for func `usage` is `getResourceUtilization`
2018-07-23 11:17:16 +02:00
Pavol Vaskovic
19613733a4 [benchmark] Log the MAX_RSS only w/ --memory flag
Printing of the MAX_RSS is now hidden behind the optional `--memory` flag.
2018-07-22 06:25:23 +02:00
Pavol Vaskovic
f89d41ad3b [benchmark] Print detailed argument help
The `--help` option now prints a standard usage description with documentation for all arguments:

````
 $ Benchmark_O --help
usage: Benchmark_O [--argument=VALUE] [TEST [TEST ...]]

positional arguments:
 TEST           name or number of the benchmark to measure

optional arguments:
 --help         show this help message and exit
 --num-samples  number of samples to take per benchmark; default: 1
 --num-iters    number of iterations averaged in the sample;
                default: auto-scaled to measure for 1 second
 --iter-scale   number of seconds used for num-iters calculation
                default: 1
 --verbose      increase output verbosity
 --delim        value delimiter used for log output; default: ,
 --tags         run tests matching all the specified categories
 --skip-tags    don't run tests matching any of the specified
                categories; default: unstable,skip
 --sleep        number of seconds to sleep after benchmarking
 --list         don't run the tests, just log the list of test
                numbers, names and tags (respects specified filters)
````
2018-07-21 22:58:44 +02:00
Pavol Vaskovic
50c79c5972 [benchmark][Gardening] Local parser error handling
In case of invalid command line arguments, there is no reasonable recovery, so the `ArgumentParser` can exit the program itself. It is therefore no longer necessary to propagate `ArgumentError`s outside of the parser.
2018-07-21 13:30:32 +02:00
Pavol Vaskovic
f674dd5cf0 [benchmark][Gardening] Handle --help inside parser
Moved the printing of help message inside the `ArgumentParser`, which has all the necessary info.

Added test that checks the `--help` option.
2018-07-21 13:16:08 +02:00
Pavol Vaskovic
e5cbfccd22 [benchmark][Gardening] Declarative ArgumentParser
The `ArgumentParser` now has a configuration phase which specifies the supported arguments and their handling. The configured parser is then executed using the `parse` method which returns the parsed result.
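
The same two-phase pattern, configure and then parse, is what Python's standard `argparse` module offers, so it serves here as an analogy. This is not the Swift `ArgumentParser` from this commit, and only a few of the benchmark's arguments are declared.

```python
import argparse

# Configuration phase: declare the supported arguments.
parser = argparse.ArgumentParser(prog="Benchmark_O")
parser.add_argument("tests", metavar="TEST", nargs="*",
                    help="name or number of the benchmark to measure")
parser.add_argument("--num-samples", type=int, default=1,
                    help="number of samples to take per benchmark")
parser.add_argument("--verbose", action="store_true",
                    help="increase output verbosity")

# Execution phase: parse an argument list and get the result back.
config = parser.parse_args(["--num-samples=20", "--verbose", "DropFirstArray"])
```

Keeping the declarations in one place lets the parser own type checking, defaults, and the help text.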
2018-07-21 01:32:40 +02:00