The way we currently gather numbers for this test is to run `Benchmark_O $TEST`
twice, with num-samples=2 and iters={2,3}. Under the assumption that the extra
iteration is the only thing that can change the counter numbers, subtracting the
iters=2 counts from the iters=3 counts gives us the number of counts
attributable to that single iteration.
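As a rough sketch of that differencing step (the function name and the
dictionary shape are illustrative, not the harness's actual API):

```swift
// Hypothetical sketch: subtract the iters=2 counter group from the
// iters=3 group to isolate the counts of one iteration.
func perIterationCounts(iters2: [String: Int],
                        iters3: [String: Int]) -> [String: Int] {
  var delta: [String: Int] = [:]
  for (counter, count3) in iters3 {
    delta[counter] = count3 - (iters2[counter] ?? 0)
  }
  return delta
}
```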
In certain cases, I have found that a small subset of the benchmarks produces
weird output, and I haven't had the time to look into why. That being said, I
do know what these weird results look like, so in this commit we do some extra
validation to decide whether we need to fail a test due to instability.
The specific validation (sketched below) is:
1. We perform another run with num-samples=2, iters=5 and subtract the iters=3
counts from that. Under the assumption that overall work should increase
linearly with iteration count in our benchmarks, we check whether this delta is
actually 2x the previous one.
2. We check whether either `result[iter=3] - result[iter=2]` or `result[iter=5] -
result[iter=3]` is negative. None of the counters we gather should ever decrease
with iteration count.
Without these checks, one can get results that seem to imply more rr
(retain/release) traffic when in reality one was simply not tracking
{retain,release}_n calls that, as a result of better optimization, had become
plain retain and release calls.
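A minimal sketch of the two checks, assuming the per-iteration deltas have
already been computed as above (names are illustrative):

```swift
// d32 = result[iter=3] - result[iter=2] (one extra iteration)
// d53 = result[iter=5] - result[iter=3] (two extra iterations)
func isStable(d32: Int, d53: Int) -> Bool {
  // Check 2: counters must never decrease with iteration count.
  if d32 < 0 || d53 < 0 { return false }
  // Check 1: two extra iterations should cost exactly twice one extra
  // iteration if work scales linearly.
  return d53 == 2 * d32
}
```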
This patch adds a benchmark to the Swift benchmark suite based on the
ChaCha20 encryption algorithm.
As Swift evolves, it is important that it tackles more and more features
and possible use cases. One of these great use cases is low-level,
CPU-intensive code, and cryptographic algorithms are a really important
test-bed.
This benchmark therefore provides a real-world test case for Swift's
optimiser. My ideal outcome here is that Swift should be able to perform
as well on this benchmark as a naive equivalent C implementation.
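For reference, the hot inner operation of ChaCha20 is the quarter-round from
RFC 7539; a hedged Swift sketch of it (not the benchmark's actual code) looks
like:

```swift
// Rotate left on UInt32, using masking shifts to match the unsigned
// wrap-around semantics a C implementation would rely on.
@inline(__always)
func rotl(_ x: UInt32, _ n: UInt32) -> UInt32 {
  (x &<< n) | (x &>> (32 - n))
}

// The ChaCha20 quarter-round: pure 32-bit adds, xors, and rotates.
func quarterRound(_ a: inout UInt32, _ b: inout UInt32,
                  _ c: inout UInt32, _ d: inout UInt32) {
  a = a &+ b; d ^= a; d = rotl(d, 16)
  c = c &+ d; b ^= c; b = rotl(b, 12)
  a = a &+ b; d ^= a; d = rotl(d, 8)
  c = c &+ d; b ^= c; b = rotl(b, 7)
}
```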
Global variables are being used as if they were "free black
holes". They might be even more expensive than actually calling
blackHole. The right thing to do is to use local variables and pass
them into a black hole at the end of the computation.
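A sketch of the two patterns (the benchmark body is illustrative; the suite's
TestsUtils provides the real blackHole):

```swift
// Minimal stand-in for the suite's blackHole from TestsUtils.
@inline(never) func blackHole<T>(_ x: T) {}

var globalSum = 0  // anti-pattern: global access on every loop step

func runBad(n: Int) {
  for i in 0..<n { globalSum += i }  // global used as a "free black hole"
}

func runGood(n: Int) {
  var sum = 0                  // local accumulator can live in a register
  for i in 0..<n { sum += i }
  blackHole(sum)               // defeat dead-code elimination once, at the end
}
```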
Specifically, I add some benchmarks for weak, unowned, unowned(unsafe), and
Unmanaged. The reason for the split between unowned(unsafe) and Unmanaged is
that one tests the raw compiler feature while the other validates stdlib
performance.
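For context, a hedged sketch of the four reference flavors these benchmarks
cover (the class and field names are illustrative):

```swift
final class Node { var value = 0 }

struct Refs {
  weak var weakRef: Node?              // zeroing, runtime-checked
  unowned var unownedRef: Node         // non-zeroing, checked on access
  unowned(unsafe) var unsafeRef: Node  // raw compiler feature: no checks
  var unmanagedRef: Unmanaged<Node>    // stdlib type: manual retain/release
}

let node = Node()
let refs = Refs(weakRef: node, unownedRef: node, unsafeRef: node,
                unmanagedRef: .passUnretained(node))
refs.unmanagedRef.takeUnretainedValue().value += 1
```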
This benchmark was "accidentally" accessing a global variable in its
main iteration loop, which can be very expensive in some situations and
has nothing to do with what the benchmark is trying to measure.
FindStringNaive is a simple benchmark that implements a naive String-search
algorithm which currently shows a lot of ARC traffic, hopefully to be reduced
in the future.
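A hedged sketch of a naive String search like the one this benchmark
implements (not the benchmark's exact code):

```swift
func findStringNaive(_ haystack: String, _ needle: String) -> String.Index? {
  guard !needle.isEmpty else { return haystack.startIndex }
  var start = haystack.startIndex
  while start != haystack.endIndex {
    var h = start
    var n = needle.startIndex
    // Compare character by character; each Character access can incur
    // ARC traffic on the underlying string storage.
    while n != needle.endIndex, h != haystack.endIndex,
          haystack[h] == needle[n] {
      h = haystack.index(after: h)
      n = needle.index(after: n)
    }
    if n == needle.endIndex { return start }
    start = haystack.index(after: start)
  }
  return nil
}
```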
This currently copies the array each time it swaps two elements, which
makes sorting the array 1500x slower than it should be. The benchmark
now runs in 15ms but should take around 10us when fully optimized.
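A hedged sketch of the accidental-copy pattern versus in-place swapping
(names are illustrative):

```swift
final class Sorter {
  var data: [Int] = [5, 3, 1, 4, 2]

  func swapCopying(_ i: Int, _ j: Int) {
    var copy = data     // `data` still holds the buffer, so the next
    copy.swapAt(i, j)   // write triggers a full O(n) CoW copy
    data = copy
  }

  func swapInPlace(_ i: Int, _ j: Int) {
    // Avoids the explicit copy; whether this truly mutates in place
    // depends on the uniqueness check discussed below.
    data.swapAt(i, j)
  }
}
```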
This algorithm is an interesting optimization problem involving array
optimization, uniqueness, bounds checks, and exclusivity. But the
general first-order problem is how to modify a CoW data structure
that's stored in a class property. As it stands, the property getter
retains the stored value around the modify access that checks
uniqueness, so the buffer never appears uniquely referenced.
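A minimal sketch of that first-order problem, under the assumption that the
getter's retain defeats the uniqueness check (names are illustrative):

```swift
final class Holder {
  var elements = Array(repeating: 0, count: 1_000)
}

func bump(_ h: Holder, _ i: Int) {
  // Conceptually an in-place modify access, but if the getter retains
  // `elements` around the access, the buffer is not uniquely
  // referenced and the CoW uniqueness check forces a full copy.
  h.elements[i] += 1
}
```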