Teach dominator based simplifications to also thread dominated edges.
The code now handles cond_br and switch_enum terminators for both value based
simplifications (where the use is dominated) and jump threading edges (the edge
is dominated).
Update simplify_cfg.sil test cases for split edges.
This also handles the test case from rdar://20390647.
Swift SVN r27843
This reverts commit r27739 reapplying r27722.
The test (validation/stdlib/Hashing.swift) that failed is expected to sometimes
fails.
Original message
"SimplifyCFG: Fix a bug in the jump threading code
When jump threading a block we used to propagate phi values directly into the
threaded block - instead of leaving in a copy. Because of this the SSA updater
would propagate the value feeding the copy from the next iteration.
Now when jump threading the destination block into the edge we leave in the phi
(copy) such that the SSA updater picks up value of the right iteration.
rdar://20617338"
Swift SVN r27741
When jump threading a block we used to propagate phi values directly into the
threaded block - instead of leaving in a copy. Because of this the SSA updater
would propagate the value feeding the copy from the next iteration.
Now when jump threading the destination block into the edge we leave in the phi
(copy) such that the SSA updater picks up value of the right iteration.
rdar://20617338
Swift SVN r27722
This updates the performance inliner to iterate on inlining in cases
where devirtualization or specialization after the first pass of
inlining expose new opportunities for inlining. Similarly, in some cases
inlining exposes new opportunities for devirtualization, e.g. when we
inline an initializer and can now see an alloc_ref that allows us to
devirtualize some class_methods.
The implementation currently has some inefficiencies which increase the
swift compilation time for the stdlib by around 3% (this is swift-time
only, no LLVM time, so overall time does not grow by this much).
Unfortunately the (unchanged) current implementation of the core
inlining trades off improved estimates of code growth for increased
compile time, and that plays a part in why compile time increases as
much as it does. Despite this, I have some ideas on how to win some of
that time back in future patches.
Performance differences are mixed, and this will likely require some
further inliner tuning to reduce or remove some of the losses seen here
at -O. I will open radars for the losses.
Wins:
DeltaBlue 10.2%
EditDistance 13.8%
SwiftStructuresInsertionSort 32.6%
SwiftStructuresStack 34.9%
Losses:
PopFrontArrayGeneric -12.7%
PrimeNum -19.0%
RC4 -30.7%
Sim2DArray -14.6%
There were a handful of wins and losses at Onone and Ounchecked as
well. I'll review the perf testing output and open radars accordingly.
The new test case shows an example of the power of the closer
integration here. We are able to completely devirtualize and inline a
series of class_method applies (10 deep in this case, but in theory
substantially deeper) in a single pass of the inliner, whereas before we
could only do a single level per pass of inlining & devirtualization.
Swift SVN r27561
The GlobalPropertyOpt pass performs a static analysis over the whole module.
If it can prove that an array property call (_getArrayPropertyIsNativeNoTypeCheck) always yiels true,
then it replaces the call with a literal-true.
The pass runs on the high-level SIL using the array semantics calls.
Currently it only handles the isNativeNoTypeCheck array property, but in future it might handle additinal properties
(therefore I chose this general name for it).
It gives +24% on DeltaBlue.
Swift SVN r27361
During inlining we'll now attempt to first devirtualize and specialize
within the function that we're going to inline into. If we're successful
devirtualizing and inlining, and we'll attempt to inline into the newly
exposed callees first, before inlining into the function we began with.
This does not remove any existing passes of devirtualization or
specialization yet, partially because we don't completely handle all
cases that they handle at this point (e.g. specializing partial
applies).
We do end up specializing deeper into the call graph with this approach
than we did prior to this commit.
I will have some follow-on changes that integrate things further,
allowing us to devirtualize in more cases after inlining into a given
function.
I will also add some directed tests in a future commit.
I tested the stdlib build and this made no difference in build
times. Perhaps after removing other existing phases we'll recapture some
build time.
I'm not seeing reproducible performance differences with this change,
which is not a big surprise at this point. This sets us up for being
able to improve the compilation pipeline in a future release.
Swift SVN r27327
... with disabled test 1_stdlib/Bit.swift for ios.
Most likely the problem of 1_stdlib/Bit.swift (only on armv7) is just uncovered by this change.
Unfortunately I have no possibility to debug the problem on a device. Therefore I filed rdar://problem/20521110
Swift SVN r27274
This avoids that an unoptimized imported function is linked instead the optimized version from the stdlib.
rdar://problem/20485253
It gives considerable performance improvmenets for some benchmarks with -Onone. E.g.
PopFrontUnsafePointer: +281%
ArrayOfPOD: +92%
StrComplexWalk: +91%
ArrayOfGenericPOD: +61%
Several others are within the range of +10% to +30%.
For the implementation I added runSILPassesForOnone() in Passes.cpp.
Here we can add other optimizations for -Onone in the future.
Swift SVN r27206
If a conformance to _BridgedToObjectiveC is statically known, generate a more efficient code by using the newly introduced library functions for bridging casts.
This covers the casts resulting from SIL optimizations.
Tests are included. I tried to cover most typical casts from ObjC types into Swift types and vice versa and to check that we always generate something more efficient than a checked_cast or unconditional_checked_cast. But probably even more tests should be written or generated by means of gyb files to make sure that nothing important is missing.
The plan is to make the bridged casts SIL optimization a guaranteed optimization. Once it is done, there is no need to lower the bridged casts in a special way inside Sema, because they all can be handled by the optimizer in a uniform way. This would apply to bridging of Error types too.
With this change, no run-time conformance checks are performed at run-time if conformances are statically known at compile-time.
As a result, the performance of rdar://19081345 is improved by about 15%. In the past, conformance checks in this test took 50% of its execution time, then after some improvements 15% and now it is 0%, as it should be.
Swift SVN r27102
This callback is called on each newly generated instruction that results
from cloning the body of the callee. The intent is to use this to
collect a subset of newly generated instructions,
e.g. ApplyInst/TryApplyInst.
Swift SVN r26843
This leaves nothing but the helper for specializing an ApplySite in
Generics.h/Generics.cpp, and I expect to rename these files accordingly
at some point.
Swift SVN r26827
If someone has conflicting changes, please feel free to revert this,
commit your changes, and then do the same unindenting of the resulting
file as a separate step.
Swift SVN r26819
Another refactoring step towards splitting the generic specializer into
a pass vs. the cloner vs. a utility that can specialize a given
ApplySite.
Swift SVN r26817
More refactoring of generic specializer, on the path to making the this
new function the primary utility that can be used from other passes.
Swift SVN r26791
There was only one remaining user since it was removed from any function
interfaces, and that should really just use SmallVector directly.
Swift SVN r26790
As with r26754, this is another step towards simplifying the generic
specializer interface. Since we now properly mangle and can therefore
test if we already have a specialization of this function, we no longer
need to do the bucketing to avoid duplicated work.
The stdlib build is as fast or faster, and the only diffs I see appear
to be either function ordering, UUIDs, or the bit of non-determinism
I've seen in block ordering.
Swift SVN r26765
Avoid running a pass a second time if the pass has its own optimize-until-no-more-changes loop.
This pass property can be configured.
Currently I assume that all our function passes don't need to run a second time (without other changes in between).
It further reduces the number of pass runs by about 6%. But this improvement is cosmetic. There is no significant compile time reduction.
There are no performance changes.
Swift SVN r26760
It avoids that a pass runs a second time if didn't make any changes in the previous run and no other pass changed the function in between.
rdar://problem/20336764
In this change I also removed the CompleteFunctions analysis which is now obsolete.
Some measurements with swiftbench (-wmo, single threaded):
It reduces the number of pass runs by about 28%. Because only passes are skipped that don't do anything, the effect on compile time is not so dramatic.
The time spent in runSILOptimizationPasses is reduced by ~9% which gives a total compile time reduction of about 3%.
Swift SVN r26757
To set the PassKind automatically, I needed to refactor some code of the pass manager and the pass definitions.
The main changes are:
1) SILPassManager now has an add-function for each pass: PM.add(createP()) -> PM.addP()
2) I removed the ARGS argument in Passes.def, which we didn't use anyway.
Swift SVN r26756
threaded into IRGen; tests to follow when that's done.
I made a preliminary effort to make the inliner do the
right thing with try_apply, but otherwise tried to avoid
touching the optimizer any more than was required by the
removal of ApplyInstBase.
Swift SVN r26747
Be more careful when replacing targets of terminators. This finally implements a long-due FIXME in CheckedCastBrJumpThreading.
rdar://20345557.
Swift SVN r26725
I completely missed that one of the CondFailOpt optimization was already implemented in SimplifyCFG.
I move the other optimization also into SimplifyCFG because both share some code.
Swift SVN r26626
Use existing machinery of the generic specializer to produce generic specializations of closures referenced by partial_apply instructions. Thanks to the newly introduced ApplyInstBase class, the required changes in the generic specializer are very minimal.
rdar://19290942
Swift SVN r26582
Currently the analysis invalidation API is marked as 'protected', which
means that only the pass itself can invalidate analysis. However, we have
utility functions that should invalidate analysis selectively. Exposing
the invalidation API will allow us to pass the pass to the utility and allow
it to invalidate analysis selectively.
Swift SVN r26459
The old invalidation lattice was incorrect because changes to control flow could cause changes to the
call graph, so we've decided to change the way passes invalidate analysis. In the new scheme, the lattice
is replaced with a list of traits that passes preserve or invalidate. The current traits are Calls and Branches.
Now, passes report which traits they preserve, which is the opposite of the previous implementation where
passes needed to report what they invalidate.
Node: I tried to limit the changes in this commit to mechanical changes to ease the review. I will cleanup some
of the code in a following commit.
Swift SVN r26449