Introduce algorithm for excluding of outliers after collecting all samples using the Interquartile Range rule.
The `exclude_outliers` method uses 1st and 3rd Quartile to compute Interquartile Range, then uses inner fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR to remove samples outside this fence.
Based on experiments with collecting hundreads and thousands of samples (`num_samples`) per test with low iteration count (`num_iters`) with ~1s runtime, this rule is very effective in providing much better quality of sample population, effectively removing short environmental fluctuations that were previously averaged into the overall result (by the adaptively determined `num_iters` to run for ~1s), enlarging the reported result with these measurement errors. This technique can be used for some benchmarks, to get more stable results faster than before.
This outlier filering is employed when parsing `--verbose` test results.
* Moved the functionality to compute median, standard deviation and related statistics from `PerformanceTestResult` into `PerformanceTestSamples`.
* Fixed wrong unit in comments
Measure more of environment during test
In addition to measuring maximum resident set size, also extract number of voluntary and involuntary context switches from the verbose mode.
LogParser doesn’t use `csv.reader` anymore.
Parsing is handled by a Finite State Machine. Each line is matched against a set of (mutually exclusive) regular expressions that represent known states. When a match is found, corresponding parsing action is taken.
Moving the `captured_output` function to own file.
Adding homegrown unit testing helper classes `Stub` and `Mock`.
The issue is that the unittest.mock was added in Python 3.3 and we need to run on Python 2.7. `Stub` and `Mock` were organically developed as minimal implementations to support the common testing patterns used on the original branch, but since I’m rewriting the commit history to provide an easily digestible narrative, it makes sense to introduce them here in one step as a custom unit testing library.
Moved result formatting methods from `PerformanceTestResult` and `ResultComparison` to `ReportFormatter`, in order to free PTR to take more computational responsibilities in the future.
Coverage at 99% according to coverage.py
* `compare_perf_tests.py` now always outputs the same format to stdout as is written to `--output` file
* Added integration test for the main() function
* Added tests for console output (and suppressed it leaking during testing)
* Fixed file name in test’s file header
compare_perf_test.py is now covered with unit tests and public methods are documented in the implementation.
Minor refactoring to better conform to Python conventions:
* classes declared in new style
* proper private method prefix of single underscore
* replacing map with list comprehension where it was clearer
Unit test are executed as part of validation-test.
.gitignore was modified to ignore .coverage and htmlcov artifacts generated by the coverage.py package