`BenchmarkDoctor` analyzes performance tests and reports their conformance to a set of desired criteria. The first two rules verify the naming convention.
`BenchmarkDoctor` is invoked from `Benchmark_Driver` with the `check` argument.
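A hedged sketch of what such a naming rule might look like; the camel-case pattern and the length limit here are assumptions for illustration, not the verbatim `BenchmarkDoctor` rules:
````
import re

def check_naming(name):
    """Report violations of the assumed benchmark naming convention."""
    problems = []
    if not re.match(r'^[A-Z][a-zA-Z0-9]*$', name):  # UpperCamelCase, no spaces
        problems.append('name is not UpperCamelCase')
    if len(name) > 40:  # assumed length limit
        problems.append('name is longer than 40 characters')
    return problems

assert check_naming('AngryPhonebook') == []
assert check_naming('angry phonebook') == ['name is not UpperCamelCase']
````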
Replaced the guts of the `run_benchmarks` function with the implementation from `BenchmarkDriver`. There was only a single client, which called it with `verbose=True`, so this parameter could be safely removed.
The `instrument_test` function is replaced by running `Benchmark_O` with the `--memory` option, which implements the MAX_RSS measurement while also excluding the overhead of the benchmarking infrastructure. The incorrect computation of standard deviation across more than one independent sample was simply dropped. Bogus aggregated `Totals` statistics were removed; the report now contains only the total number of executed benchmarks.
The `run` method on `BenchmarkDriver` invokes the test harness with the specified number of iterations and samples. It supports measuring memory use, and in verbose mode it also collects individual samples and monitors the system load by counting the number of voluntary and involuntary context switches.
Output is parsed using `LogParser` from `compare_perf_tests.py`. This makes that file a required dependency of the driver, so it is also copied to the bin directory during the build.
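A condensed sketch of this run-and-parse flow; the flag names and the `results_from_string` helper are assumptions for illustration, not a verbatim excerpt:
````
import subprocess
from compare_perf_tests import LogParser

def run(test, num_samples=3, num_iters=1, verbose=False, measure_memory=False):
    """Invoke the test harness and parse its output into results."""
    cmd = ['Benchmark_O', test,
           '--num-samples={0}'.format(num_samples),
           '--num-iters={0}'.format(num_iters)]
    if verbose:
        cmd.append('--verbose')
    if measure_memory:
        cmd.append('--memory')
    output = subprocess.check_output(cmd, universal_newlines=True)
    return LogParser.results_from_string(output)
````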
See https://www.martinfowler.com/bliki/StranglerApplication.html for more info on this pattern for refactoring legacy applications.
Introduced the `BenchmarkDriver` class as the beginning of a strangler application that will gradually replace the old functions. Used it instead of the `get_tests()` function in `Benchmark_Driver`.
The interaction with `Benchmark_O` is simulated through mocking. The `SubprocessMock` class records invocations of command line processes and responds with canned replies in the format of `Benchmark_O` output.
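A hedged sketch of the idea behind `SubprocessMock`; the method names here are illustrative, not the actual test-support API:
````
class SubprocessMock(object):
    """Record subprocess invocations and serve canned replies."""
    def __init__(self):
        self.calls = []
        self.responses = {}

    def expect(self, args, response):
        self.responses[tuple(args)] = response

    def check_output(self, args):
        self.calls.append(args)  # record the invocation for later assertions
        return self.responses.get(tuple(args), '')

mock = SubprocessMock()
mock.expect(['Benchmark_O', '--list'],
            '#,Test,[Tags]\n2,AngryPhonebook,[String, api, validation]\n')
````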
Removed 3 redundant lit tests that are now covered by the unit test `test_gets_list_of_all_benchmarks_when_benchmarks_args_exist`. This saves 3 seconds of test execution time. Keeping only a single integration test that verifies the plumbing is connected correctly.
The imports are a bit sketchy because `Benchmark_Driver` doesn't have a `.py` extension, so they had to be hacked manually. :-/
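For reference, one way to import an extension-less script in modern Python; the path is a placeholder, and this is a sketch rather than the exact hack used:
````
from importlib.machinery import SourceFileLoader
from importlib.util import module_from_spec, spec_from_loader

loader = SourceFileLoader('Benchmark_Driver', '/path/to/Benchmark_Driver')
spec = spec_from_loader('Benchmark_Driver', loader)
driver = module_from_spec(spec)
loader.exec_module(driver)  # the script's functions are now module attributes
````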
Extracted `parse_args` from `main` and added test coverage for argument parsing.
Reintroduced a feature lost during the `BenchmarkInfo` modernization: all registered benchmarks are ordered alphabetically and assigned an index. This number can be used as a shortcut to invoke a test instead of its full name. (Adding and removing tests from the suite will naturally reassign the indices, but they are stable for a given build.)
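A tiny sketch of the numbering scheme, using names from the examples below (the selection is illustrative):
````
names = ['AngryPhonebook', 'Ackermann']  # illustrative subset of the suite
numbered = dict(enumerate(sorted(names), start=1))
# {1: 'Ackermann', 2: 'AngryPhonebook'}
````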
The `--list` parameter now prints the test *number*, *name* and *tags*, separated by commas.
The `--list` output format is modified from:
````
Enabled Tests,Tags
AngryPhonebook,[String, api, validation]
...
````
to this:
````
\#,Test,[Tags]
2,AngryPhonebook,[String, api, validation]
...
````
(There isn’t a backslash before the #, git was eating the whole line without it.)
Note: Test number 1 is Ackermann, which is marked as “skip”, so it’s not listed with the default `skip-tags` value.
Fixes the issue where running tests via `Benchmark_Driver` always reported each test as number 1. Because each test is run independently, every invocation was "first". Restoring the test numbers returns this to the original behavior: the number reported in the first column when executing a test is its ordinal number in the Swift Benchmark Suite.
Fixed a failure in `get_tests`, which depended on the removed `Benchmark_O --run-all` option for listing all tests (not just the pre-commit set).
Fix: Restored the ability to run tests by ordinal number from `Benchmark_Driver` after the support for this was removed from `Benchmark_O`.
Added tests that verify the output format of `Benchmark_O --list` and the support for the `--skip-tags=` option, which effectively replaced the old `--run-all` option. Other tools, like `Benchmark_Driver`, depend on it.
Added integration tests for the dependency between `Benchmark_Driver` and `Benchmark_O`.
Running the pre-commit test set isn't tested explicitly here. It would take too long, and it is run fairly frequently by the CI bots, so if that breaks, we'll know soon enough.
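A hedged sketch of the kind of check involved; the exact assertions are assumptions based on the `--list` format shown earlier:
````
import subprocess

def test_list_format_with_empty_skip_tags():
    # An empty `--skip-tags=` disables skipping, so even benchmarks
    # tagged "skip" (like Ackermann, test number 1) should be listed.
    out = subprocess.check_output(
        ['Benchmark_O', '--list', '--skip-tags='], universal_newlines=True)
    lines = out.splitlines()
    assert lines[0] == '#,Test,[Tags]'
    assert lines[1].startswith('1,Ackermann')
````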
Disable the random hash seed while benchmarking. By its nature, it makes the number of hash collisions fluctuate between runs, adding unnecessary noise to benchmark results.
I expect we'll be able to re-enable random seeding here once we have made hash collisions cheaper -- they are currently always resolved by calling the Key's Equatable implementation, which can be expensive.
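A minimal sketch of how a driver can pin the seed for a benchmark subprocess, assuming the Swift runtime's `SWIFT_DETERMINISTIC_HASHING` environment variable is the switch; the invocation details are illustrative:
````
import os
import subprocess

env = os.environ.copy()
env['SWIFT_DETERMINISTIC_HASHING'] = '1'  # fixed hash seed across runs
subprocess.check_output(['Benchmark_O', 'AngryPhonebook'], env=env)
````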
Support specifying a baseline branch to compare the current results
against. Previously, the master branch was hardcoded.
Fixes: rdar://problem/32751587
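A minimal sketch of the option wiring, assuming an argparse-based CLI; the option name is illustrative, not the actual flag:
````
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--branch', default='master',
                    help='baseline branch to compare results against')
args = parser.parse_args(['--branch', 'stable'])  # e.g. a non-default baseline
````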
Add support for running benchmarks in `Benchmark_Driver` by referring to them by their ordinal number, as is supported by `Benchmark_O` (and the `Onone`, `Ounchecked` variants).
Updated documentation to reflect this.
SR-4780 Can not run performance tests that are not in precommit suite
Modified the driver to honor command line arguments when listing enabled tests. Fixed the interaction between filters (positional arguments) and the `--run-all` option.
`Benchmark_Driver` lists available benchmarks with the `--run-all` option when benchmarks or filters are specified.
Make sure all Python code in the repo specifies which exceptions to
catch when using `except:`.
Going forward, regressions can be caught using `flake8-blind-except`.
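An illustrative example of the anti-pattern and its fix (hypothetical code, not from the repo):
````
raw_value = 'not-a-number'

# A bare `except:` swallows everything, including SystemExit and
# KeyboardInterrupt; `flake8-blind-except` flags it.
try:
    result = int(raw_value)
except:  # noqa -- the anti-pattern being removed
    result = None

# Naming the expected exception keeps unrelated failures visible.
try:
    result = int(raw_value)
except ValueError:
    result = None
````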
Fixed the following flake8 violations:
* E101: indentation contains mixed spaces and tabs
* E111: indentation is not a multiple of four
* E128: continuation line under-indented for visual indent
* E302: expected 2 blank lines, found 1
* W191: indentation contains tabs
`Benchmark_Driver` used string comparison to calculate the minimum and maximum
of durations in benchmark results, sometimes leading to wildly inaccurate reports.
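An illustrative example of why string comparison corrupts min/max (the values are made up):
````
durations = ['1104', '998', '423']       # durations as strings
min(durations), max(durations)           # ('1104', '998') -- lexicographic, wrong
min(map(int, durations)), max(map(int, durations))  # (423, 1104) -- correct
````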
The repo contains roughly 80 Python scripts. "snake_case" naming is used for
local variables in all of those scripts. This is the form recommended by the
PEP 8 naming conventions (Python Software Foundation) and is typically
associated with idiomatic Python code.
However, prior to this commit, nine of the 80 scripts contained at least one
instance of "camelCase" naming.
This commit improves consistency in the Python code base by making sure that
these nine remaining files follow the variable naming convention used for
Python code in the project.
References:
* PEP 8: https://www.python.org/dev/peps/pep-0008/
* pep8-naming: https://pypi.python.org/pypi/pep8-naming