This updates the performance inliner to iterate on inlining in cases
where devirtualization or specialization after the first pass of
inlining exposes new opportunities for inlining. Similarly, in some cases
inlining exposes new opportunities for devirtualization, e.g. when we
inline an initializer and can now see an alloc_ref that allows us to
devirtualize some class_methods.
The implementation currently has some inefficiencies which increase the
Swift compilation time for the stdlib by around 3% (this is Swift time
only, excluding LLVM time, so overall time does not grow by this much).
Unfortunately the (unchanged) current implementation of the core
inlining trades off improved estimates of code growth for increased
compile time, and that plays a part in why compile time increases as
much as it does. That said, I have some ideas on how to win some of
that time back in future patches.
Performance differences are mixed, and this will likely require some
further inliner tuning to reduce or remove some of the losses seen here
at -O. I will open radars for the losses.
Wins:
DeltaBlue 10.2%
EditDistance 13.8%
SwiftStructuresInsertionSort 32.6%
SwiftStructuresStack 34.9%
Losses:
PopFrontArrayGeneric -12.7%
PrimeNum -19.0%
RC4 -30.7%
Sim2DArray -14.6%
There were a handful of wins and losses at -Onone and -Ounchecked as
well. I'll review the perf testing output and open radars accordingly.
The new test case shows an example of the power of the closer
integration here. We are able to completely devirtualize and inline a
series of class_method applies (10 deep in this case, but in theory
substantially deeper) in a single pass of the inliner, whereas before we
could only do a single level per pass of inlining & devirtualization.
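
The single-pass behavior described above can be sketched with a toy
worklist model. This is only an illustration, not the inliner's actual
code: Call, Module, inlineIteratively, makeChain and the maxInlined
guard are all hypothetical names, and "devirtualization" is reduced to
a flag that says whether the receiver's concrete type is known.

```cpp
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model: a call is direct, or dynamic (class_method-like)
// until it has been devirtualized.
struct Call {
  std::string callee;
  bool isDynamic;
};

// Maps a function name to the calls that appear in its body.
using Module = std::map<std::string, std::vector<Call>>;

// Inlines into `root` using a worklist. Calls cloned from an inlined body
// are pushed back onto the worklist, so a chain is handled in one pass.
// Returns the number of calls inlined. `maxInlined` is an assumed guard
// against unbounded growth (e.g. recursion).
int inlineIteratively(const Module &module, const std::string &root,
                      bool receiverTypeKnown, int maxInlined = 100) {
  auto rootIt = module.find(root);
  if (rootIt == module.end())
    return 0;
  std::deque<Call> worklist(rootIt->second.begin(), rootIt->second.end());
  int inlined = 0;
  while (!worklist.empty() && inlined < maxInlined) {
    Call call = worklist.front();
    worklist.pop_front();
    if (call.isDynamic) {
      if (!receiverTypeKnown)
        continue;             // cannot devirtualize, so cannot inline
      call.isDynamic = false; // devirtualized to a direct call
    }
    auto it = module.find(call.callee);
    if (it == module.end())
      continue;               // no body available
    for (const Call &cloned : it->second)
      worklist.push_back(cloned); // newly exposed candidates
    ++inlined;
  }
  return inlined;
}

// Builds root -> m0 -> m1 -> ... with every call dynamic.
Module makeChain(int depth) {
  Module m;
  m["root"].push_back(Call{"m0", true});
  for (int i = 0; i < depth; ++i) {
    std::vector<Call> body;
    if (i + 1 < depth)
      body.push_back(Call{"m" + std::to_string(i + 1), true});
    m["m" + std::to_string(i)] = body;
  }
  return m;
}
```

With makeChain(10) and a known receiver type, every level of the chain
is devirtualized and inlined within the one worklist loop, rather than
one level per pass.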
Swift SVN r27561
This callback is called on each newly generated instruction that results
from cloning the body of the callee. The intent is to use this to
collect a subset of newly generated instructions,
e.g. ApplyInst/TryApplyInst.
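
A minimal sketch of the callback idea, assuming a toy instruction
representation. Kind, Inst, cloneBody and collectNewApplies are
hypothetical names used for illustration and are not the SILCloner
interface.

```cpp
#include <functional>
#include <string>
#include <vector>

enum class Kind { Apply, TryApply, IntegerLiteral, Return };

struct Inst {
  Kind kind;
  std::string name;
};

// Clones `calleeBody` into `callerBody`, invoking `onNewInst` (if set) on
// each newly created instruction.
void cloneBody(const std::vector<Inst> &calleeBody,
               std::vector<Inst> &callerBody,
               const std::function<void(const Inst &)> &onNewInst) {
  for (const Inst &inst : calleeBody) {
    callerBody.push_back(inst);
    if (onNewInst)
      onNewInst(callerBody.back());
  }
}

// Uses the callback to collect only the apply-like instructions that the
// cloning produced, e.g. as candidates for further inlining.
std::vector<std::string> collectNewApplies(const std::vector<Inst> &calleeBody,
                                           std::vector<Inst> &callerBody) {
  std::vector<std::string> applies;
  cloneBody(calleeBody, callerBody, [&applies](const Inst &inst) {
    if (inst.kind == Kind::Apply || inst.kind == Kind::TryApply)
      applies.push_back(inst.name);
  });
  return applies;
}
```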
Swift SVN r26843
threaded into IRGen; tests to follow when that's done.
I made a preliminary effort to make the inliner do the
right thing with try_apply, but otherwise tried to avoid
touching the optimizer any more than was required by the
removal of ApplyInstBase.
Swift SVN r26747
without a valid SILDebugScope. An assertion in IRGenSIL prevents future
optimizations from regressing in this regard.
Introducing SILBuilderWithScope and SILBuilderwithPostprocess to ease the
transition.
This patch is large, but mostly mechanical.
<rdar://problem/18494573> Swift: Debugger is not stopping at the set breakpoint
Swift SVN r22978
Reinstate r22712. Now that updating SSA form is more robust, we should not
run into trouble anymore.
Original commit message:
"Use the inliner's heuristic to decide which instructions are 'free' (constants,
etc).
We were not jump-threading a block because it had four instructions in it - two of
them integer_literals."
rdar://18594600
Swift SVN r22761
Use the inliner's heuristic to decide which instructions are 'free' (constants,
etc).
We were not jump-threading a block because it had four instructions in it - two of
them integer_literals.
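
As a rough sketch of the idea: count only non-free instructions when
deciding whether a block is small enough to duplicate. The instruction
kinds, the set of kinds treated as free, and the threshold of 3 are
assumptions for illustration; the real heuristic lives in the inliner
and may differ.

```cpp
#include <vector>

enum class InstKind { IntegerLiteral, FloatLiteral, StringLiteral,
                      Apply, Load, StructExtract, CondBranch };

// Assumption: literals are treated as free. The real heuristic may count
// other instruction kinds as free as well.
bool isFree(InstKind kind) {
  switch (kind) {
  case InstKind::IntegerLiteral:
  case InstKind::FloatLiteral:
  case InstKind::StringLiteral:
    return true;
  default:
    return false;
  }
}

// Cost of duplicating a block: the number of non-free instructions.
int blockCost(const std::vector<InstKind> &block) {
  int cost = 0;
  for (InstKind kind : block)
    if (!isFree(kind))
      ++cost;
  return cost;
}

// `threshold` is a hypothetical limit, not the value used by the pass.
bool shouldJumpThread(const std::vector<InstKind> &block, int threshold = 3) {
  return blockCost(block) <= threshold;
}
```

Under these assumptions, a four-instruction block containing two
integer_literals has a cost of 2 and is jump-threaded, whereas a raw
instruction count of 4 would have exceeded the limit.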
rdar://18594600
Swift SVN r22712
Fixes part of <rdar://problem/16196801>.
Inline generic functions, but only when:
- There are no unbound archetypes being substituted (due to various
assumptions in TypeSubstCloner about having all concrete types).
- No substitution is an existential (due to
<rdar://problem/17431105>, <rdar://problem/17544901>, and
<rdar://problem/17714025>).
This gets things limping along, but we really need to fix the above
limitations so that mandatory inlining never fails.
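
The two conditions above amount to a simple predicate over the
substitution list. A minimal sketch, assuming each substituted type can
be classified into one of three hypothetical kinds (TypeKind and
canInlineGenericApply are illustrative names, not compiler API):

```cpp
#include <vector>

// Hypothetical classification of a substituted (replacement) type.
enum class TypeKind { Concrete, UnboundArchetype, Existential };

// Returns true only if every substitution is a concrete, non-existential
// type, mirroring the two restrictions listed above.
bool canInlineGenericApply(const std::vector<TypeKind> &substitutions) {
  for (TypeKind kind : substitutions) {
    if (kind == TypeKind::UnboundArchetype)
      return false; // the cloner assumes all types are concrete
    if (kind == TypeKind::Existential)
      return false; // blocked on the radars cited above
  }
  return true;
}
```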
This doesn't enable inlining generics in the performance inliner. There
is no reason it shouldn't work as well, but there is no compelling
reason to do so now and it could have unintended effects on performance.
Some highlights from PreCommitBench -
O0:
            old (ms)  new (ms)  delta (ms)   speedup
ForLoops     1127.00    294.00      833.00    283.3%
LinkedList    828.00    165.00      663.00    401.8%
R17315246     982.00    288.00      694.00    241.0%
SmallPT      3018.00   1388.00     1630.00    117.4%
StringWalk   1276.00     89.00     1187.00   1333.7%
-- most others improve ~10% --
O3:
            old (ms)  new (ms)  delta (ms)   speedup
Ackermann    4138.00   3724.00      414.00     11.1%
Life           59.00     64.00        5.00     -7.8%
Phonebook    2103.00   1815.00      288.00     15.9%
R17315246     430.00    582.00      152.00    -26.1%
StringWalk   1173.00   1097.00       76.00      6.9%
Ofast:
            old (ms)  new (ms)  delta (ms)   speedup
Ackermann    3505.00   3715.00      210.00     -5.7%
Life           49.00     41.00        8.00     19.5%
Memset        684.00    554.00      130.00     23.5%
Phonebook    2166.00   1769.00      397.00     22.4%
StringWalk    829.00    790.00       39.00      4.9%
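
For reading the tables: the speedup column appears to be computed as
(old / new - 1) * 100, and the delta column as the absolute difference
between old and new. This is inferred from the numbers above, not stated
by the benchmark tool. A small sketch (speedupPercent is a hypothetical
helper):

```cpp
#include <cmath>

// Speedup in percent, rounded to one decimal place. Positive means the new
// time is faster; negative means a regression. Assumes newMs > 0.
double speedupPercent(double oldMs, double newMs) {
  double raw = (oldMs / newMs - 1.0) * 100.0;
  return std::round(raw * 10.0) / 10.0;
}
```

For example, ForLoops at O0 gives 1127 / 294 - 1 = 2.833, i.e. 283.3%,
and Life at O3 gives 59 / 64 - 1 = -0.078, i.e. -7.8%.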
I've opened the following to track remaining issues that need to be
fixed before we can inline all transparent function applications:
<rdar://problem/17431105>
<rdar://problem/17544901>
<rdar://problem/17714025>
<rdar://problem/17768777>
<rdar://problem/17768931>
<rdar://problem/17769717>
Swift SVN r20378
info for them and generally clean up the inline scope handling a bit.
Fix the debug scope handling for all clients of SILCloner, especially
the SIL-level specializers and inliners.
This also adds a ton of additional assertions that will ensure that
future optimization passes won't mess with the debug info in a way that
could confuse the LLVM backend.
Swift SVN r18984
hierarchy. I still need to figure out a reliable way to write testcases
for this. For now it's ensured via an assertion in SILCloner::postprocess.
Swift SVN r18917
Instead, fall back on the version in SILCloner, which is identical
except the SILCloner version properly remaps types when cloning generic
code.
Swift SVN r18905
Mandatory-inlined functions (aka transparent functions) are still treated as if they
had the location and scope of the call site. <rdar://problem/14845844>
Support inline scopes once we have an optimizing SIL-based inliner
Patch by Adrian Prantl.
Swift SVN r18835
Teach the mandatory inliner to drop debug_value[_addr] instructions
when inlining. Otherwise, we get debug_values for all of the arguments
splattered all over the place. We want transparent functions to be
effectively __attribute__((nodebug)) in C parlance.
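
A minimal sketch of the filtering step, assuming instructions are
identified by an opcode string. Inst and cloneForMandatoryInlining are
hypothetical names for illustration, not the inliner's real interface.

```cpp
#include <string>
#include <vector>

struct Inst {
  std::string opcode;
};

// Clones a callee body for mandatory inlining, skipping debug_value and
// debug_value_addr so the inlined arguments produce no debug variables.
std::vector<Inst> cloneForMandatoryInlining(const std::vector<Inst> &calleeBody) {
  std::vector<Inst> cloned;
  for (const Inst &inst : calleeBody) {
    if (inst.opcode == "debug_value" || inst.opcode == "debug_value_addr")
      continue; // dropped: transparent functions behave like nodebug
    cloned.push_back(inst);
  }
  return cloned;
}
```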
Swift SVN r12404
Make ApplyInst and PartialApplyInst directly take substitutions for generic functions instead of trying to stage out substitutions separately. The legacy reasons for doing this are gone.
Swift SVN r8747
After implementing this I realized that a lot of the logic currently in MandatoryInlining.cpp should be moved into the SILInliner so it can be reused in an optimizing inliner. I plan on doing that refactoring immediately but decided to go ahead and commit this since it's a working incremental step.
Swift SVN r7771