This updates the performance inliner to iterate on inlining in cases
where devirtualization or specialization after the first pass of
inlining exposes new opportunities for inlining. Similarly, in some cases
inlining exposes new opportunities for devirtualization, e.g. when we
inline an initializer and can now see an alloc_ref that allows us to
devirtualize some class_methods.
The implementation currently has some inefficiencies which increase the
Swift compilation time for the stdlib by around 3% (this is Swift time
only, excluding LLVM time, so overall time does not grow by this much).
Unfortunately the (unchanged) current implementation of the core
inlining trades off improved estimates of code growth for increased
compile time, and that plays a part in why compile time increases as
much as it does. Despite this, I have some ideas on how to win some of
that time back in future patches.
Performance differences are mixed, and this will likely require some
further inliner tuning to reduce or remove some of the losses seen here
at -O. I will open radars for the losses.
Wins:
DeltaBlue 10.2%
EditDistance 13.8%
SwiftStructuresInsertionSort 32.6%
SwiftStructuresStack 34.9%
Losses:
PopFrontArrayGeneric -12.7%
PrimeNum -19.0%
RC4 -30.7%
Sim2DArray -14.6%
There were a handful of wins and losses at -Onone and -Ounchecked as
well. I'll review the perf testing output and open radars accordingly.
The new test case shows an example of the power of the closer
integration here. We are able to completely devirtualize and inline a
series of class_method applies (10 deep in this case, but in theory
substantially deeper) in a single pass of the inliner, whereas before we
could only do a single level per pass of inlining & devirtualization.
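The difference between one level per pass and a full chain in a single pass can be sketched with a toy model. This is not the real SIL optimizer: the tuple-based instructions, the `C0` .. `C10` classes, and every function name below are hypothetical, and the sketch assumes an acyclic call graph (it has no cost model and would not terminate on recursion).

```python
# Toy model only: a function body is a list of tuples.
#   ("alloc", var, cls)          ~ alloc_ref: the dynamic class of var is known
#   ("class_method", var, name)  ~ virtual call through var
#   ("apply", callee)            ~ direct call to a known function
#   ("work", tag)                ~ anything else

def make_chain(depth):
    """Build C0.run .. C{depth}.run; each allocates the next class and calls run on it."""
    funcs = {}
    for i in range(depth):
        var = f"obj{i + 1}"
        funcs[f"C{i}.run"] = [("alloc", var, f"C{i + 1}"), ("class_method", var, "run")]
    funcs[f"C{depth}.run"] = [("work", "leaf")]
    return funcs

def devirtualize(body):
    """Turn class_method calls into direct applies when the receiver's alloc is visible."""
    known, out, changed = {}, [], False
    for inst in body:
        if inst[0] == "alloc":
            known[inst[1]] = inst[2]
        if inst[0] == "class_method" and inst[1] in known:
            out.append(("apply", f"{known[inst[1]]}.{inst[2]}"))
            changed = True
        else:
            out.append(inst)
    return out, changed

def inline(body, funcs):
    """Replace each direct apply with a copy of the callee's original body."""
    out, changed = [], False
    for inst in body:
        if inst[0] == "apply":
            out.extend(funcs[inst[1]])
            changed = True
        else:
            out.append(inst)
    return out, changed

def one_level_pass(body, funcs):
    """Old behavior in this model: devirtualize once, inline once, stop."""
    body, _ = devirtualize(body)
    body, _ = inline(body, funcs)
    return body

def iterating_pass(body, funcs):
    """New behavior in this model: repeat until neither step finds anything to do."""
    while True:
        body, devirtualized = devirtualize(body)
        body, inlined = inline(body, funcs)
        if not (devirtualized or inlined):
            return body

def virtual_calls(body):
    return sum(1 for inst in body if inst[0] == "class_method")

funcs = make_chain(10)
entry = [("alloc", "obj0", "C0"), ("class_method", "obj0", "run")]
```

In this model, `one_level_pass(entry, funcs)` still leaves one `class_method` behind, while `iterating_pass(entry, funcs)` removes all of them and ends with the leaf's work inlined into the entry body.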
Swift SVN r27561
During inlining we'll now attempt to first devirtualize and specialize
within the function that we're going to inline into. If we're successful
devirtualizing or specializing, we'll attempt to inline into the newly
exposed callees first, before inlining into the function we began with.
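The ordering described above amounts to a post-order walk over the callees that devirtualization or specialization exposes. A minimal sketch, assuming a hypothetical `exposed_callees` map in place of the real analysis (the function names in the example are made up as well):

```python
def process(name, exposed_callees, done, order):
    """Finish the callees exposed inside `name` before inlining into `name` itself."""
    if name in done:
        return
    done.add(name)  # mark early so a recursive cycle cannot loop forever
    for callee in exposed_callees.get(name, []):
        process(callee, exposed_callees, done, order)
    order.append(name)  # stands in for "now inline into `name`"

# Hypothetical input: devirtualizing inside `main` exposes `A.foo` and `B.bar`,
# and devirtualizing inside `A.foo` exposes `C.baz`.
exposed = {"main": ["A.foo", "B.bar"], "A.foo": ["C.baz"]}
order = []
process("main", exposed, set(), order)
```

Here `order` ends up as `["C.baz", "A.foo", "B.bar", "main"]`, so the function we began with is handled last.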
This does not remove any existing passes of devirtualization or
specialization yet, partially because we don't completely handle all
cases that they handle at this point (e.g. specializing partial
applies).
We do end up specializing deeper into the call graph with this approach
than we did prior to this commit.
I will have some follow-on changes that integrate things further,
allowing us to devirtualize in more cases after inlining into a given
function.
I will also add some directed tests in a future commit.
I tested the stdlib build and this made no difference in build
times. Perhaps after removing other existing phases we'll recapture some
build time.
I'm not seeing reproducible performance differences with this change,
which is not a big surprise at this point. This sets us up for being
able to improve the compilation pipeline in a future release.
Swift SVN r27327
This leaves nothing but the helper for specializing an ApplySite in
Generics.h/Generics.cpp, and I expect to rename these files accordingly
at some point.
Swift SVN r26827
Another refactoring step towards splitting the generic specializer into
a pass vs. the cloner vs. a utility that can specialize a given
ApplySite.
Swift SVN r26817
More refactoring of the generic specializer, on the path to making this
new function the primary utility that can be used from other passes.
Swift SVN r26791
There was only one remaining user since it was removed from any function
interfaces, and that should really just use SmallVector directly.
Swift SVN r26790
As with r26754, this is another step towards simplifying the generic
specializer interface. Since we now properly mangle and can therefore
test if we already have a specialization of this function, we no longer
need to do the bucketing to avoid duplicated work.
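The idea of using the mangled name as the "do we already have this?" check can be sketched as a lookup before cloning. The `mangle` scheme, the `Module` class, and its fields are all hypothetical stand-ins here, not the compiler's actual mangling or data structures.

```python
def mangle(func_name, type_args):
    # Hypothetical mangling: it only needs to be deterministic and unique
    # per (function, substitutions) pair for this sketch to work.
    return f"{func_name}<{','.join(type_args)}>"

class Module:
    def __init__(self):
        self.functions = {}    # mangled name -> specialized function record
        self.clones_made = 0   # counts how often we actually had to clone

    def specialize(self, func_name, type_args):
        key = mangle(func_name, type_args)
        existing = self.functions.get(key)
        if existing is not None:
            return existing    # reuse the earlier specialization, no bucketing needed
        self.clones_made += 1
        spec = {"name": key, "origin": func_name, "substitutions": tuple(type_args)}
        self.functions[key] = spec
        return spec
```

Requesting the same function with the same substitutions twice returns the same record and clones only once; a different substitution list produces a second clone.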
The stdlib build is as fast or faster, and the only diffs I see appear
to be either function ordering, UUIDs, or the bit of non-determinism
I've seen in block ordering.
Swift SVN r26765
threaded into IRGen; tests to follow when that's done.
I made a preliminary effort to make the inliner do the
right thing with try_apply, but otherwise tried to avoid
touching the optimizer any more than was required by the
removal of ApplyInstBase.
Swift SVN r26747
Use existing machinery of the generic specializer to produce generic specializations of closures referenced by partial_apply instructions. Thanks to the newly introduced ApplyInstBase class, the required changes in the generic specializer are very minimal.
rdar://19290942
Swift SVN r26582
SIL cloning is not always followed by sil-combine, which could do the clean-up. Therefore, take care of removing the dead code after "unreachable" instructions at the end of the cloning process.
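As a rough illustration of the clean-up step, the sketch below drops everything that follows an "unreachable" within a block. Blocks are modeled as plain lists of instruction strings, which is an assumption made for this example only; the real cloner works on SIL instructions, and the helper name is hypothetical.

```python
def remove_dead_code_after_unreachable(blocks):
    """Return a copy of `blocks` where each block stops at its first "unreachable"."""
    cleaned = {}
    for label, insts in blocks.items():
        if "unreachable" in insts:
            # Keep the unreachable itself, drop whatever was cloned after it.
            insts = insts[: insts.index("unreachable") + 1]
        cleaned[label] = list(insts)
    return cleaned

# Hypothetical cloned function: bb0 has leftover instructions after "unreachable".
cloned = {
    "bb0": ["apply @fatalError", "unreachable", "strong_release %0", "return %1"],
    "bb1": ["return %2"],
}
cleaned = remove_dead_code_after_unreachable(cloned)
```

After the call, `cleaned["bb0"]` holds only the apply and the unreachable, and `bb1` is untouched.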
Swift SVN r26029