Improve inline headers in `single_table` mode to also print labels for the numeric columns.
Sections in the `single_table` mode are visually distinguished by a separator row preceding the inline headers.
Separated the header label styles for git and markdown modes, using UPPERCASE and **Bold** formatting respectively.
Inlined section template definitions.
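For illustration, the two header styles might render roughly like this (the column labels below are hypothetical, not the actual template definitions):

```python
# Hypothetical header rows, for illustration only:
GIT_STYLE = 'REGRESSION             OLD      NEW    DELTA    RATIO'
MARKDOWN_STYLE = '| **Regression** | **OLD** | **NEW** | **DELTA** | **RATIO** |'
```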
Now, run_smoke_bench runs the benchmarks, compares performance and code size, and reports the results on stdout and as a markdown file.
No need to run bench_code_size.py and compare_perf_tests.py separately.
This has two benefits:
- It's much easier to run it locally
- It's now more transparent what's happening in '@swiftci benchmark', because all the logic now lives in run_smoke_bench rather than in a script on the CI bot that is not visible from the repository.
I also removed the branch arguments from ReportFormatter in compare_perf_tests.py; they were not used anyway.
For a smooth rollout in CI, I created a new script rather than changing the existing one. Once everything is set up in CI, I'll delete the old run_smoke_test.py and bench_code_size.py.
Use the box-plot-inspired technique for filtering out outlier measurements: values higher than the top inner fence (TIF = Q3 + 1.5 * IQR) are excluded from the sample.
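A minimal sketch of this rule, assuming a simplified nearest-index quartile estimate (the function name is mine, not the script's):

```python
def discard_high_outliers(samples):
    """Drop values above the top inner fence, TIF = Q3 + 1.5 * IQR."""
    s = sorted(samples)
    n = len(s)
    q1, q3 = s[(n - 1) // 4], s[3 * (n - 1) // 4]
    tif = q3 + 1.5 * (q3 - q1)
    return [x for x in s if x <= tif]
```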
When num_samples is less than quantile + 1, some of the measurements are repeated in the report summary. Parsed samples should strive to be a true reflection of the measured distribution, so we’ll correct this by discarding the repeated artifacts from quantile estimation.
This avoids introducing a bias from this oversampling into the empirical distribution obtained from merging independent samples.
See also:
https://en.wikipedia.org/wiki/Oversampling_and_undersampling_in_data_analysis
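To picture the correction, here is an illustrative sketch under the nearest-rank quantile definition (not the parser's actual code): map each reported quantile back to the index of the sample it was estimated from, and keep only one value per index.

```python
import math

def deduplicate_quantiles(quantile_values, num_samples):
    """Keep one value per underlying sample when a report contains
    more quantile estimates than there were measurements."""
    q = len(quantile_values) - 1  # number of inter-quantile intervals
    samples, seen = [], set()
    for i, value in enumerate(quantile_values):
        # nearest-rank index of the sample behind this estimate
        j = 0 if i == 0 else int(math.ceil(num_samples * i / float(q))) - 1
        if j not in seen:
            seen.add(j)
            samples.append(value)
    return samples
```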
It turns out that both the old median computation in `DriverUtils` and the newer quartile computation in `PerformanceTestSamples` had an off-by-one error.
It truly is the 3rd of the 2 hard things in computer science!
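For the record, a worked example of the zero-based indexing that bites here (not the fixed code itself):

```python
def median(samples):
    s = sorted(samples)
    n, mid = len(s), len(s) // 2
    # For even n the median averages indices mid - 1 and mid;
    # the off-by-one version averages mid and mid + 1 instead.
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2.0

assert median([1, 2, 3]) == 2
assert median([1, 2, 3, 4]) == 2.5  # the buggy version returns 3.5
```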
Since the result comparisons are now also used to compare code sizes in addition to runtimes, it makes sense to rename the column label from the old “speedup” to the more neutral “ratio”.
The test number column in the space-justified column format emitted by the Benchmark_Driver to stdout (while logging to a file) is right-aligned, so the parser must handle leading whitespace.
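For example (the pattern and benchmark name below are simplified stand-ins), the match has to tolerate the padding in front of the test number:

```python
import re

# Simplified stand-in for the real pattern: allow leading spaces
# before the right-aligned test number.
RESULT = re.compile(r'^ *(\d+) +(\S+)')

m = RESULT.match('  42 SomeBenchmark 1000 ...')
assert m and m.group(1) == '42'
```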
Replaced the guts of the `run_benchmarks` function with the implementation from `BenchmarkDriver`. There was only a single client that called it with `verbose=True`, so this parameter could be safely removed.
The `instrument_test` function is replaced by running `Benchmark_O` with the `--memory` option, which implements the MAX_RSS measurement while also excluding the overhead of the benchmarking infrastructure. The incorrect computation of standard deviation across measurements of more than one independent sample was simply dropped. The bogus aggregated `Totals` statistics were removed; the report now contains only the total number of executed benchmarks.
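Conceptually (the binary path and benchmark name are placeholders), the measurement now amounts to something like:

```python
import subprocess

# Placeholder invocation: the benchmark binary reports MAX_RSS itself
# when run with `--memory`, keeping harness overhead out of the number.
output = subprocess.check_output(
    ['./Benchmark_O', 'SomeBenchmark', '--memory'],
    universal_newlines=True)
```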
Introduce an algorithm for excluding outliers after collecting all samples, using the interquartile range rule.
The `exclude_outliers` method uses the 1st and 3rd quartiles to compute the interquartile range (IQR), then uses the inner fences at Q1 - 1.5 * IQR and Q3 + 1.5 * IQR to remove samples that fall outside them.
Based on experiments collecting hundreds to thousands of samples (`num_samples`) per test with a low iteration count (`num_iters`) and ~1 s runtime, this rule is very effective at improving the quality of the sample population. It removes the short environmental fluctuations that were previously averaged into the overall result (by the adaptively determined `num_iters` targeting ~1 s runs), inflating the reported result with these measurement errors. This technique can be used for some benchmarks to get more stable results faster than before.
This outlier filtering is employed when parsing `--verbose` test results.
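In essence (the signature here is assumed, not the actual method):

```python
def exclude_outliers(samples, q1, q3):
    """Remove samples outside the inner fences
    [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]."""
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [s for s in samples if lo <= s <= hi]
```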
* Moved the functionality to compute median, standard deviation and related statistics from `PerformanceTestResult` into `PerformanceTestSamples`.
* Fixed wrong unit in comments
Measure more of the environment during tests
In addition to measuring maximum resident set size, also extract the number of voluntary and involuntary context switches from the verbose mode.
LogParser doesn’t use `csv.reader` anymore.
Parsing is handled by a finite state machine. Each line is matched against a set of mutually exclusive regular expressions that represent the known states. When a match is found, the corresponding parsing action is taken.
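The shape of the idea, in a heavily simplified sketch (states, patterns, and fields here are illustrative; the real LogParser has more of each):

```python
import re

class LineParser(object):
    """Toy version of a regex-driven line parser."""

    def __init__(self):
        self.results, self.max_rss = [], None
        # Mutually exclusive patterns, one per known line shape.
        self.state_actions = [
            (re.compile(r'^ *(\d+) +(\S+) +(\d+)'), self.parse_result),
            (re.compile(r'^ *MAX_RSS +(\d+)'), self.parse_max_rss),
        ]

    def parse_result(self, match):
        num, name, runtime = match.groups()
        self.results.append((int(num), name, int(runtime)))

    def parse_max_rss(self, match):
        self.max_rss = int(match.group(1))

    def parse(self, lines):
        for line in lines:
            for pattern, action in self.state_actions:
                match = pattern.match(line)
                if match:
                    action(match)
                    break  # at most one state matches a given line
        return self.results
```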
Moved the result formatting methods from `PerformanceTestResult` and `ResultComparison` to `ReportFormatter`, in order to free `PerformanceTestResult` to take on more computational responsibilities in the future.
Coverage at 99% according to coverage.py
* `compare_perf_tests.py` now always writes the same format to stdout as to the `--output` file
* Added integration test for the main() function
* Added tests for console output (and suppressed it leaking during testing)
* Fixed file name in test’s file header