I measured zero performance regressions on the test suite. I added an early
CSE pass because of a specific example in one of the tests where the inliner
brought in an integer literal that needed to be CSE-ed before inst-combine
could fold a comparison into a constant, and simplify-cfg could get rid of
that constant.
Representing our literal values as instructions (e.g. integer_literal) forces
us to run CSE constantly. We also need to re-run InstCombine very frequently
because InstSimplifier is not allowed to create new instructions!
Swift SVN r26736
Attempt to devirtualize any apply that we come across in the performance
inliner prior to attempting to inline.
This is the first step of getting the inliner/specializer/devirtualizer
working together so that we can converge on high quality code with less
work.
This change is not meant to improve performance directly, but rather to
be a step towards converging on high quality code with fewer passes.
Because it alters what gets inlined when, it did have a (mostly)
positive effect on performance.
These are some of the larger deltas I see, where the percentage is
percentage speed-up, and negative percentages indicate a slow-down.
-O
--
BenchLangCallingCFunction 16.7%
CaptureProp 17.1%
Sim2DArray 22.0%
-Ounchecked
-----------
BenchLangCallingCFunction -11.2%
QuickSort 39.4%
SwiftStructuresBubbleSort -26.7%
Swift SVN r26728
Be more careful when replacing targets of terminators. This finally implements a long-overdue FIXME in CheckedCastBrJumpThreading.
rdar://20345557.
Swift SVN r26725
Such terminators are replaced with a simple branch instruction.
This optimization was done for cond_br but not for other terminators, like switch_enum.
Swift SVN r26716
Given a strong_pin for which we have not yet seen a strong_unpin, a safe
guaranteed call sequence is of the following form:
    retain(x)
    call f(@guaranteed x)
    release(x)
where f is an array semantic call that we know does not touch globals and thus
is known not to change ref counts.
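The pattern above can be sketched as a toy matcher over a flat instruction list. This is only an illustration: `Kind`, `Inst`, and `isSafeGuaranteedCallSequence` are hypothetical names, not the SIL API, and the sketch assumes that whether a call is a side-effect-free array semantic call has already been decided elsewhere.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction kinds (not the real SIL classes).
enum class Kind { Retain, Release, GuaranteedArrayCall, Other };

struct Inst {
  Kind kind;
  int value; // id of the reference the instruction operates on
};

// Returns true if insts[i], insts[i+1], insts[i+2] form
//   retain(x); call f(@guaranteed x); release(x)
// on the same value x.
bool isSafeGuaranteedCallSequence(const std::vector<Inst> &insts,
                                  std::size_t i) {
  if (insts.size() < 3 || i > insts.size() - 3)
    return false;
  const Inst &retain = insts[i];
  const Inst &call = insts[i + 1];
  const Inst &release = insts[i + 2];
  return retain.kind == Kind::Retain &&
         call.kind == Kind::GuaranteedArrayCall &&
         release.kind == Kind::Release &&
         retain.value == call.value && call.value == release.value;
}
```

Any other instruction between the retain and the release, or a mismatch in the operand, makes the toy matcher reject the sequence.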
rdar://20305817
Swift SVN r26662
After many attempts I found that the change in this commit does not regress
the performance of the test suite. The passes that I added in this commit come
to replace the late 'SSAPasses' pipe. I am still not completely sure why there
is such a strong dependency between SimplifyCFG and InstCombine.
Swift SVN r26658
This commit splits DominanceAnalysis into two analyses (Dom and PDom) that
can be cached and invalidated independently of one another using the common
FunctionAnalysisBase interface.
Swift SVN r26643
Before the change the RCIdentityAnalysis kept a single map that contained
the module's RC information. When function passes needed to invalidate the
analysis they had to clear the RC information for the entire module. The
problem was mitigated by the fact that we process one function at a time, and
we start processing a new function less frequently.
I adopted the DominanceAnalysis structure. We should probably implement
this functionality as CRTP.
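A minimal sketch of what a CRTP-based per-function cache could look like. All names here (`PerFunctionCache`, `ToyRCAnalysis`, `RCInfo`, `Function`) are hypothetical stand-ins and do not reflect the actual FunctionAnalysisBase interface; the point is only that invalidating one function leaves the cached results for other functions intact.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a function in the module.
struct Function {
  std::string name;
};

// CRTP base: Derived must provide
//   std::unique_ptr<Info> compute(const Function &).
template <typename Derived, typename Info>
class PerFunctionCache {
  std::map<const Function *, std::unique_ptr<Info>> cache;

public:
  // Lazily compute and cache the per-function information.
  Info &get(const Function &f) {
    std::unique_ptr<Info> &slot = cache[&f];
    if (!slot)
      slot = static_cast<Derived *>(this)->compute(f);
    return *slot;
  }

  // Drop the cached result for one function only.
  void invalidate(const Function &f) { cache.erase(&f); }

  std::size_t cachedCount() const { return cache.size(); }
};

// Placeholder payload; a real analysis would store RC identity data.
struct RCInfo {
  std::size_t nameLength;
};

class ToyRCAnalysis : public PerFunctionCache<ToyRCAnalysis, RCInfo> {
public:
  int computeCalls = 0;

  std::unique_ptr<RCInfo> compute(const Function &f) {
    ++computeCalls;
    return std::make_unique<RCInfo>(RCInfo{f.name.size()});
  }
};
```

With a single module-wide map, the equivalent of `invalidate` would have to clear every entry; here only the entry for the changed function is recomputed on the next `get`.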
Swift SVN r26636
I completely missed that one of the CondFailOpt optimizations was already implemented in SimplifyCFG.
I moved the other optimization into SimplifyCFG as well because both share some code.
Swift SVN r26626
The linker pulls in new functions (which can change the call graph) but it
does not change the control flow of the processed function.
Swift SVN r26614
The current approach does not improve the compile time. Mark has a better plan
for making the inliner/devirtualizer/specializer work together.
Swift SVN r26613
We are able to jump-thread all kinds of terminators these days. Only jump thread
switch-enums if there is really a chance of simplifying. Don't jump thread
blocks with function calls in them.
Swift SVN r26599
Small tweak to function signature optimization so that it filters out
applies that might have multiple targets, as well as applies of partial
applies, which will take some updates to handle.
Resolves rdar://problem/19632244 and rdar://problem/20049782.
I've opened rdar://problem/20306331 to consider handling
apply-of-partial-apply at some point in the future.
Swift SVN r26585
Use existing machinery of the generic specializer to produce generic specializations of closures referenced by partial_apply instructions. Thanks to the newly introduced ApplyInstBase class, the required changes in the generic specializer are very minimal.
rdar://19290942
Swift SVN r26582
The overflow check for X * 1 does not guard the check for X * -1
because X could be MIN_INT, and MIN_INT*-1 overflows. In the new code
we check that the new multiplier is smaller (not smaller or equal).
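The MIN_INT case can be checked exhaustively on a small integer width. The sketch below assumes that "smaller" refers to the absolute value of the multiplier; `isStrictlyGuardedBy` and the other names are hypothetical and are not taken from the pass. It uses 8-bit integers so that every combination can be enumerated.

```cpp
#include <cstdint>
#include <cstdlib>

// True if x * m fits in a signed 8-bit integer. The operands are 8-bit
// values, so the product is computed safely in a plain int.
bool mulFitsInt8(int x, int m) {
  int wide = x * m;
  return wide >= INT8_MIN && wide <= INT8_MAX;
}

// Assumed rule: a passing overflow check on x * guard covers x * candidate
// only if |candidate| is strictly smaller than |guard|.
bool isStrictlyGuardedBy(int guard, int candidate) {
  return std::abs(candidate) < std::abs(guard);
}

// Brute force over all 8-bit x and all 8-bit multiplier pairs: whenever the
// rule says "guarded", a passing check on x * guard must imply that
// x * candidate also fits.
bool strictRuleIsSoundForInt8() {
  for (int g = INT8_MIN; g <= INT8_MAX; ++g)
    for (int c = INT8_MIN; c <= INT8_MAX; ++c) {
      if (!isStrictlyGuardedBy(g, c))
        continue;
      for (int x = INT8_MIN; x <= INT8_MAX; ++x)
        if (mulFitsInt8(x, g) && !mulFitsInt8(x, c))
          return false;
    }
  return true;
}
```

For 8-bit values, x = -128 passes the check for x * 1 but overflows for x * -1, which is why equal magnitudes are excluded under this reading.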
Swift SVN r26553
Previously we were removing overflow checks if a previous (or future) check
used a smaller multiplier. However, this logic is incorrect for signed integers
because it does not consider underflows (example, x * 10 vs. x * -100000).
The code now checks that the multiplier that's dominated by another check is
at least as close to zero as the guarding multiplier (its absolute value is
smaller than or equal to that of the guarding multiplier).
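The x * 10 vs. x * -100000 example can be made concrete for 32-bit integers. This is a sketch with hypothetical function names, contrasting a comparison by signed value with the comparison by magnitude described here; it is not the code from the pass.

```cpp
#include <cstdint>
#include <cstdlib>

// True if x * m fits in a signed 32-bit integer. Both operands are assumed
// to be 32-bit values, so the 64-bit intermediate product cannot overflow.
bool mulFitsInt32(std::int64_t x, std::int64_t m) {
  std::int64_t wide = x * m;
  return wide >= INT32_MIN && wide <= INT32_MAX;
}

// Previous logic (hypothetical name): compare multipliers by signed value.
bool guardedBySignedValue(std::int64_t guard, std::int64_t candidate) {
  return candidate <= guard;
}

// Logic as described in this commit (hypothetical name): compare magnitudes.
bool guardedByMagnitude(std::int64_t guard, std::int64_t candidate) {
  return std::llabs(candidate) <= std::llabs(guard);
}
```

With x = 100000, x * 10 fits in 32 bits while x * -100000 does not, yet -100000 is "smaller" than 10 by signed value. Comparing magnitudes rejects that pair.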
Swift SVN r26550
Use the knowledge of previous (or future) overflow checks to remove multiplication by smaller values.
For example:
    x * 100
    x * 10 // can remove this overflow check.
Swift SVN r26543
Now that we generate autoclosure functions with the transparent bit set,
we only need to check that bit to determine whether we can delete
inlined functions.
Swift SVN r26535
We no longer need or use it since we can always refer to the same bit on
the applied function when deciding whether to inline during mandatory
inlining.
Resolves rdar://problem/19478366.
Swift SVN r26534
Previously, we were being very conservative and were not trying to look through
any RCId uses. Now we understand how to look through RCIdentical instructions to
pair a pin and unpin. We also understand how to use the new getRCUses API on
RCIdentityAnalysis to get all uses of a value, looking through RCIdentical
instructions.
I also added some code to COWArrayOpts to teach it how to look through enum insts (which I needed).
Additionally I got stuck and added support for automatic indentation in Debug
statements. This is better than having to indent by hand every time.
There were no significant perf changes since this code was not being emitted by
the frontend. But without this, +0 self causes COW to break.
rdar://20267677
Swift SVN r26529
We already did this for element accesses; this just extends it to handle array values.
This prevents a regression that is caused by my needing to insert an enum
so I can properly RAW a strong_pin operand onto the strong pin's users in a
subsequent patch to improve RemovePin for +0 self.
rdar://20267677
Swift SVN r26528
This should clear the way for removing isTransparent on apply entirely.
Previously we marked any apply of an autoclosure transparent, but now
that the mandatory inliner inlines anything marked transparent, we don't
need that.
Resolves rdar://problem/20286251.
Swift SVN r26525
Mandatory inlining normally only looks at isTransparent() on the apply
instruction. This change makes it also inline in cases where the apply
is not marked isTransparent(), but the applied function is. This can
arise in cases where previous transparent inlining exposes new
opportunities, e.g. when a transparent function is passed as a parameter
to another transparent function.
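The change in the inlining decision can be sketched as follows. `Function`, `Apply`, and both predicates are hypothetical simplifications, not the mandatory inliner's actual data structures; the sketch assumes the callee of an apply may be unknown (null) until earlier inlining resolves it.

```cpp
// Hypothetical, simplified model of a function and an apply site.
struct Function {
  bool transparent;
};

struct Apply {
  bool transparent;       // bit on the apply instruction itself
  const Function *callee; // null if the callee is not (yet) known
};

// Before: only the bit on the apply is consulted.
bool shouldMandatoryInlineBefore(const Apply &apply) {
  return apply.transparent;
}

// After: also inline when the applied function is marked transparent.
bool shouldMandatoryInlineAfter(const Apply &apply) {
  return apply.transparent ||
         (apply.callee != nullptr && apply.callee->transparent);
}
```

An apply whose callee only becomes known after a first round of inlining, such as a transparent function passed as a parameter, is the case the second predicate picks up.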
Resolves rdar://problem/19419019.
Swift SVN r26516