115 Commits

Author SHA1 Message Date
Erik Eckstein
ebbecf5e5a Do not inline "availability.osversion" functions in mid-level inliner.
Because GlobalOpt must see these functions (rdar://problem/20708979)



Swift SVN r27964
2015-04-30 11:07:42 +00:00
Mark Lacey
1859b476d4 Further integration of inlining, devirtualization, and specialization.
This updates the performance inliner to iterate on inlining in cases
where devirtualization or specialization after the first pass of
inlining expose new opportunities for inlining. Similarly, in some cases
inlining exposes new opportunities for devirtualization, e.g. when we
inline an initializer and can now see an alloc_ref that allows us to
devirtualize some class_methods.

The implementation currently has some inefficiencies which increase the
swift compilation time for the stdlib by around 3% (this is swift-time
only, no LLVM time, so overall time does not grow by this much).

Unfortunately the (unchanged) current implementation of the core
inlining trades off improved estimates of code growth for increased
compile time, and that plays a part in why compile time increases as
much as it does. Despite this, I have some ideas on how to win some of
that time back in future patches.

Performance differences are mixed, and this will likely require some
further inliner tuning to reduce or remove some of the losses seen here
at -O. I will open radars for the losses.

Wins:
DeltaBlue                        10.2%
EditDistance                     13.8%
SwiftStructuresInsertionSort     32.6%
SwiftStructuresStack             34.9%

Losses:
PopFrontArrayGeneric            -12.7%
PrimeNum                        -19.0%
RC4                             -30.7%
Sim2DArray                      -14.6%

There were a handful of wins and losses at Onone and Ounchecked as
well. I'll review the perf testing output and open radars accordingly.

The new test case shows an example of the power of the closer
integration here. We are able to completely devirtualize and inline a
series of class_method applies (10 deep in this case, but in theory
substantially deeper) in a single pass of the inliner, whereas before we
could only do a single level per pass of inlining & devirtualization.

Swift SVN r27561
2015-04-22 04:48:13 +00:00
Nadav Rotem
32211041d2 Rename @semantics -> @_semantics.
Swift SVN r27533
2015-04-21 17:10:06 +00:00
Nadav Rotem
680c565af5 Refactor the code that checks if a function is marked with noopt semantics. NFC.
Swift SVN r27485
2015-04-20 17:27:31 +00:00
Nadav Rotem
926042ff81 Add @semantics("optimize.never") to disable optimizations of a specific function.
This commit adds a flag to disable optimizations on specific functions. The
primary motivation of this patch is to allow the optimizer developers to reduce
test cases by disabling optimizations of parts of the code without having to
recompile the compiler or inspect SIL. The annotations "inline(never)"
and "optimize.never" can go a long way.

The second motivation for this patch is to allow our internal adopters to work
around compiler bugs.

rdar://19745484

Usage:

@semantics("optimize.never")
public func miscompile() { ... }

Swift SVN r27475
2015-04-20 05:06:55 +00:00
Mark Lacey
3e1e90c31b Add a period at the end of a comment.
Swift SVN r27348
2015-04-16 05:19:53 +00:00
Mark Lacey
f6ec796780 Integrate generic specialization into the inliner.
During inlining we'll now attempt to first devirtualize and specialize
within the function that we're going to inline into. If we're successful
devirtualizing and inlining, we'll attempt to inline into the newly
exposed callees first, before inlining into the function we began with.

This does not remove any existing passes of devirtualization or
specialization yet, partially because we don't completely handle all
cases that they handle at this point (e.g. specializing partial
applies).

We do end up specializing deeper into the call graph with this approach
than we did prior to this commit.

I will have some follow-on changes that integrate things further,
allowing us to devirtualize in more cases after inlining into a given
function.

I will also add some directed tests in a future commit.

I tested the stdlib build and this made no difference in build
times. Perhaps after removing other existing phases we'll recapture some
build time.

I'm not seeing reproducible performance differences with this change,
which is not a big surprise at this point. This sets us up for being
able to improve the compilation pipeline in a future release.

Swift SVN r27327
2015-04-15 21:08:51 +00:00
Mark Lacey
8296a43356 It is not useful to link in the main inliner loop.
We never have anything but definitions in the call graph, so we can just
assert that.

Swift SVN r27311
2015-04-15 04:31:56 +00:00
Mark Lacey
02724cbb8f Minor renaming.
Replace CallSites with Applies.

Swift SVN r27308
2015-04-15 03:08:08 +00:00
Mark Lacey
ed66cfd544 Use a callback in the linker to notify clients of newly deserialized functions.
Previous attempts to update the callgraph explicitly after calls to
linkFunction() weren't completely effective because we can deserialize
deeply and introduce multiple new function bodies in the process.

This gets us a bit closer, but only adds new call graph nodes. It does
not currently add edges for everything that gets deserialized (and this
is not fatal, so it is a step forward).
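
The callback idea described above can be sketched roughly as follows. This is a toy model, not the compiler's linker: ToyLinker, onDeserialized and linkAndCollectNodes are hypothetical names invented for illustration, and only the name linkFunction() comes from the commit message. The point it shows is that one link request can deserialize several bodies, and the client hears about each of them.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model; none of these types are the compiler's real API.
struct ToyLinker {
  // "Serialized module": function name -> names of the functions it references.
  std::map<std::string, std::vector<std::string>> serialized;
  // Functions whose bodies have already been deserialized.
  std::set<std::string> loaded;
  // Invoked once for every function body that gets deserialized.
  std::function<void(const std::string &)> onDeserialized;

  // Deserialize `name` and, transitively, everything it references.
  void linkFunction(const std::string &name) {
    auto it = serialized.find(name);
    if (it == serialized.end() || !loaded.insert(name).second)
      return; // No body available, or the body was already loaded.
    if (onDeserialized)
      onDeserialized(name);
    for (const std::string &callee : it->second)
      linkFunction(callee);
  }
};

// Link `root` and return the call graph nodes a client would create
// from the notifications. As in the commit, only nodes are added, no edges.
std::set<std::string> linkAndCollectNodes(ToyLinker &linker,
                                          const std::string &root) {
  std::set<std::string> nodes;
  linker.onDeserialized = [&nodes](const std::string &f) { nodes.insert(f); };
  linker.linkFunction(root);
  linker.onDeserialized = nullptr;
  return nodes;
}
```

With an explicit update after linkFunction("a") a client would only learn about "a"; with the callback it also sees every function pulled in underneath it.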

Swift SVN r27120
2015-04-08 06:46:15 +00:00
Mark Lacey
0c4249293a Update the call graph with newly linked-in functions.
We claim to maintain the call graph in these passes, so we really need
to add nodes for new functions we pull in.

Also, link in functions when building the call graph, and only allow
functions with bodies to be added to the call graph.

This makes the call graph more consistent.

At some point we need to revisit our linking story because we've got
code spread out over several phases now where it might make sense to do
a single up-front linking pass that potentially pulls in
never-referenced functions (e.g. pull in all foo() that could be reached
in a given class hierarchy up front, even if in reality only C.foo() is
ever called).

Swift SVN r27096
2015-04-07 22:29:52 +00:00
Mark Lacey
cd6d1488e8 Use SmallVector consistently across the call graph.
Swift SVN r27073
2015-04-07 07:27:06 +00:00
Mark Lacey
fd6ccb9b52 Bail from cost computation as soon as we know we'll return false.
Swift SVN r27034
2015-04-06 06:27:15 +00:00
Mark Lacey
d0233db99d Unindent and improve interface to helper function.
Swift SVN r27033
2015-04-06 06:27:14 +00:00
Mark Lacey
6c86c523a2 Move call graph maintenance logic out of main inlining logic.
Swift SVN r27032
2015-04-06 06:27:13 +00:00
Mark Lacey
fc4c83c984 Unindent a level.
Swift SVN r27031
2015-04-06 06:27:12 +00:00
Mark Lacey
18f7a5be7a Replace a condition with an assert.
This makes it easier to continue to simplify this code, and it should be
reasonable to maintain this invariant in the future.

Swift SVN r27030
2015-04-06 06:27:11 +00:00
Mark Lacey
fea3321f59 Update the generic specializer to maintain the call graph.
Swift SVN r27024
2015-04-05 19:27:40 +00:00
Mark Lacey
bad8678f89 Add CallGraphEditor helper class and use it in the inliner.
Swift SVN r27008
2015-04-05 04:30:08 +00:00
Mark Lacey
217966eefd Maintain the call graph during inlining.
Swift SVN r26998
2015-04-05 02:27:58 +00:00
Mark Lacey
730ef41385 Make devirtualizer clients remove old applies.
This makes it feasible for clients to maintain the call graph.

Swift SVN r26997
2015-04-05 02:27:57 +00:00
Mark Lacey
066274e672 Remove invalidation of loop analysis.
We already invalidate all the analyses for each function we inline into,
so this shouldn't be necessary.

We should also be able to remove the invalidation of dominators for the
same reason, but I am getting one test failure when I do that so it
needs further investigation.

Swift SVN r26939
2015-04-03 14:50:08 +00:00
Mark Lacey
283b08c511 Properly invalidate analyses when we devirtualize during inlining.
Swift SVN r26934
2015-04-03 06:17:02 +00:00
Mark Lacey
23b6bd84f6 Do not build the call graph just to maintain it.
Before this commit, passes that were attempting to maintain the call
graph would actually build it if it wasn't already valid, just for the
sake of maintaining it.

Now we only maintain it if we already had a valid call graph built.
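
A minimal sketch of the "maintain only if already valid" rule, under the assumption that the analysis can hand out its cached graph without building it. ToyCallGraphAnalysis, getIfValid and recordRetargetedCall are hypothetical names chosen for this example, not the compiler's interface.

```cpp
#include <map>
#include <memory>
#include <set>
#include <string>

// Hypothetical toy call graph: caller name -> set of callee names.
struct ToyCallGraph {
  std::map<std::string, std::set<std::string>> edges;
};

class ToyCallGraphAnalysis {
  std::unique_ptr<ToyCallGraph> graph; // null means "not built or invalidated"

public:
  // Used by passes that actually need the graph: builds it on demand.
  ToyCallGraph &getOrBuild() {
    if (!graph)
      graph.reset(new ToyCallGraph());
    return *graph;
  }
  // Used by passes that merely maintain the graph: never builds it.
  ToyCallGraph *getIfValid() { return graph.get(); }
  void invalidate() { graph.reset(); }
};

// Called by a pass after it redirected a call in `caller` from `oldCallee`
// to `newCallee`. Returns true if there was a valid graph to update.
bool recordRetargetedCall(ToyCallGraphAnalysis &analysis,
                          const std::string &caller,
                          const std::string &oldCallee,
                          const std::string &newCallee) {
  ToyCallGraph *cg = analysis.getIfValid();
  if (!cg)
    return false; // Nothing to maintain, and we do not build it just for this.
  std::set<std::string> &callees = cg->edges[caller];
  callees.erase(oldCallee);
  callees.insert(newCallee);
  return true;
}
```

The design choice is that maintenance is free when no valid graph exists, and the cost of construction is paid only by a pass that really consumes the call graph.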

Swift SVN r26873
2015-04-02 17:16:01 +00:00
Mark Lacey
6ec9593a15 Split the worthiness logic away from the mechanics of inlining.
Swift SVN r26870
2015-04-02 07:40:02 +00:00
Mark Lacey
3e4d43be11 Coding style tweaks.
Swift SVN r26868
2015-04-02 07:39:58 +00:00
Mark Lacey
07a5ebed66 Rename devirtualizeApply to tryDevirtualizeApply.
This is more consistent with the other naming in the devirtualization
code for things that might fail.

Swift SVN r26788
2015-04-01 02:10:52 +00:00
Mark Lacey
7d5256f03a Devirtualize in the performance inliner.
Attempt to devirtualize any apply that we come across in the performance
inliner prior to attempting to inline.

This is the first step of getting the inliner/specializer/devirtualizer
working together so that we can converge on high quality code with less
work.

This is not meant to directly improve performance, but rather to be a
step towards converging to high quality code with fewer passes. However,
because it alters what gets inlined when, it did have a (mostly)
positive effect on performance.

These are some of the larger deltas I see, where the percentage is
percentage speed-up, and negative percentages indicate a slow-down.

-O:
---
BenchLangCallingCFunction         16.7%
CaptureProp                       17.1%
Sim2DArray                        22.0%

-Ounchecked
-----------
BenchLangCallingCFunction        -11.2%
QuickSort                         39.4%
SwiftStructuresBubbleSort        -26.7%

Swift SVN r26728
2015-03-30 21:24:46 +00:00
Nadav Rotem
240ff14db1 Split DominanceAnalysis into Dom and PDom using FunctionAnalysisBase.
This commit splits DominanceAnalysis into two analyses (Dom and PDom) that
can be cached and invalidated independently of one another using the common
FunctionAnalysisBase interface.

Swift SVN r26643
2015-03-27 20:54:28 +00:00
Nadav Rotem
908e75e934 Teach the perf inliner to invalidate only functions that were modified.
Swift SVN r26461
2015-03-23 23:57:43 +00:00
Nadav Rotem
d78b376d07 [passes] Replace the old invalidation lattice with a new invalidation scheme.
The old invalidation lattice was incorrect because changes to control flow could cause changes to the
call graph, so we've decided to change the way passes invalidate analysis.  In the new scheme, the lattice
is replaced with a list of traits that passes preserve or invalidate. The current traits are Calls and Branches.
Now, passes report which traits they preserve, which is the opposite of the previous implementation where
passes needed to report what they invalidate.
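
The "report what you preserve" scheme can be sketched as a small bitmask model. Only the trait names Calls and Branches come from the commit message; the enum, the ToyAnalysis struct, and the choice of which analysis relies on which trait are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Traits a pass may declare it left untouched (names from the commit message;
// the encoding as a bitmask is an assumption of this sketch).
enum Trait : std::uint8_t {
  TraitNone = 0,
  TraitCalls = 1 << 0,
  TraitBranches = 1 << 1,
};

// Hypothetical cached analysis and the traits its result relies on.
struct ToyAnalysis {
  std::string name;
  std::uint8_t reliesOn;
  bool valid;
};

// After a pass runs, drop every cached analysis that relies on at least one
// trait the pass did not report as preserved.
void invalidateAfterPass(std::vector<ToyAnalysis> &analyses,
                         std::uint8_t preserved) {
  for (ToyAnalysis &a : analyses)
    if ((a.reliesOn & ~preserved) != 0)
      a.valid = false;
}
```

A pass that reports nothing preserves nothing, so forgetting to report is the conservative failure mode: analyses are recomputed rather than used stale.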

Note: I tried to limit the changes in this commit to mechanical changes to ease the review. I will clean up some
of the code in a following commit.

Swift SVN r26449
2015-03-23 21:18:58 +00:00
Mark Lacey
05950693e3 Do not inline based on @transparent in the performance inliner.
We used to do this because the mandatory inliner couldn't deal with
generics, and we were marking some things in the stdlib as @transparent
for performance reasons.

In comparing performance before/after this change, I saw noise at -Onone
and -O, and a couple differences at -Ounchecked that could be real (but
are on benchmarks that tend to be very noisy so it's hard to tell for
certain).

This change is important because I am going to commit another change
that marks protocol witness thunks as @transparent in the lead-up to
making the mandatory inliner devirtualize. I don't want that change to
generate a bunch of performance diffs and/or size diffs, which might
happen if we were to force inline *all* of those protocol witness
thunks (as opposed to the ones that will eventually be inlined by the
mandatory inliner because we're able to devirtualize the calls).

Swift SVN r26386
2015-03-21 03:16:37 +00:00
Mark Lacey
d57fdb9426 Move the responsibility for deleting the apply outside of inlineFunction().
Make the clients remove the apply, which paves the way for the clients
to potentially update the call graph when inlining is successful.

Swift SVN r26075
2015-03-13 01:18:05 +00:00
Erik Eckstein
3d23935d39 Inliner: inline small functions even into cold blocks, because it can reduce code size.
Gives the following code size improvements (positive % means size reduction):
PerfTests_O              7.9%
PerfTests_Ounchecked     1.0%
PerfTests_Onone          0.4%
libswiftCore.dylib      -0.1%

Performance is approximately the same. There are only a few changes above 10%, and this seems to be noise.



Swift SVN r25485
2015-02-23 17:15:52 +00:00
Erik Eckstein
b8ef26a3bd inliner: remove the obsolete CannotBeInlined cost value
Swift SVN r25427
2015-02-20 15:48:08 +00:00
Dmitri Hrybenko
61286f0260 Fix warnings produced by a newer version of Clang
Swift SVN r25257
2015-02-12 23:50:47 +00:00
Erik Eckstein
4aac127226 Don't inline into thunks, except very small functions.
rdar://problem/19701613

Code size reductions (negative means less code size):
bin/PerfTests_O:  -3.7%
bin/PerfTests_Ounchecked:  -1.9%
bin/PerfTests_Onone:  +0.2%
stdlib/core/macosx/Swift.o:  -2.2%

The -2.2% in Swift.o consists of about +5% in specializations and -11% in protocol witnesses in the dylib.
(-> still room for improvement regarding specializations)

Note that completely disabling inlining into thunks (even small functions) would increase the code size.

There is little change in performance, a few + and - within 10%.
Beyond this there is (+ means faster):
Phonebook@O: +26%
ImageProc@Ounchecked: +14%
StringWalk@Ounchecked: -16%




Swift SVN r25001
2015-02-05 17:39:39 +00:00
Erik Eckstein
4491f2a00d Improvement of the inlining heuristic.
Main changes:
*) Instruction costs are not counted for blocks which are dead after inlining
*) Terminator instructions which get constant after inlining increase the threshold
*) Calls inside loops increase the threshold
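
The three changes above can be sketched as a cost/threshold comparison. This is an illustrative model only: the struct, the function names and all numeric parameters are hypothetical, and "calls inside loops" is read here as the call site sitting inside a loop in the caller, which is an assumption.

```cpp
#include <vector>

// Toy model of one basic block of the callee; all fields are assumptions.
struct ToyBlock {
  int instructionCost;            // summed cost of the block's instructions
  bool deadAfterInlining;         // unreachable once the arguments are known
  bool terminatorBecomesConstant; // its branch condition folds after inlining
};

// Hypothetical tuning parameters, not the compiler's real values.
const int BaseThreshold = 100;
const int ConstantTerminatorBonus = 20;
const int CallInLoopBonus = 50;

// Instruction costs are not counted for blocks that are dead after inlining.
int costAfterInlining(const std::vector<ToyBlock> &blocks) {
  int cost = 0;
  for (const ToyBlock &b : blocks)
    if (!b.deadAfterInlining)
      cost += b.instructionCost;
  return cost;
}

// Constant terminators and a call site inside a loop raise the threshold.
int inlineThreshold(const std::vector<ToyBlock> &blocks, bool callSiteInLoop) {
  int threshold = BaseThreshold;
  for (const ToyBlock &b : blocks)
    if (!b.deadAfterInlining && b.terminatorBecomesConstant)
      threshold += ConstantTerminatorBonus;
  if (callSiteInLoop)
    threshold += CallInLoopBonus;
  return threshold;
}

bool isWorthInlining(const std::vector<ToyBlock> &blocks, bool callSiteInLoop) {
  return costAfterInlining(blocks) <= inlineThreshold(blocks, callSiteInLoop);
}
```

In this model the same callee can be rejected at a straight-line call site and accepted at a call site inside a loop, which is the kind of context sensitivity the commit describes.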

In theory this should be a step towards making the performance not so dependent on the inlining heuristic.
But I must admit that I still did some fine tuning of all the parameters to get the best results.

Improvements in the benchmarks:
-O:
Chars: +11%
CommonMarkRender: +11%
DollarReduce: +22%
ForLoops: +22%
Forest: +10%
HeapSort: +36%
ImageProc: +14%
StrCat: +14%
StrComplexWalk: +70%
StrToInt: +11%
StringWalk: +99%

-Ounchecked:
Ary: +40%
Ary2: +30%
EditDistance: +22%
Forest: +18%
HeapSort: +50%
Histogram: +11%
StrCat: +12%
StrComplexWalk: +63%
StrSplitter: +11%
StrToInt: +17%
StringWalk: +75%

Regressions (I will file radars for them):
-Ounchecked:
PolymorphicCalls: -21%
QuickSort: -22%
Rectangles: -12%

Code size of the PerfTests_O decreased by 8% 
Code size of the PerfTests_Ounchecked increased by 1%



Swift SVN r24801
2015-01-28 19:01:00 +00:00
Michael Gottesman
897325b096 Codebase Gardening. NFC.
1. Eliminate unused variable warnings.
2. Change field names to match capitalization of the rest of the field names in the file.
3. Change method names to match rest of the file.
4. Change get,set method for a field to match the field type.

Swift SVN r24501
2015-01-19 00:34:07 +00:00
Erik Eckstein
7ce0850e6c Improve ConstantTracker in PerformanceInliner.
This lets the inliner better check if a closure is passed to the callee.
It fixes a problem that copy forwarding generates a pattern which could not be analyzed by the ConstantTracker:
  <rdar://problem/19426897> [Inliner] Fail to fully optimize RangeAssignment (especially with copy forwarding)

RangeAssignment is now ~6x faster with -O. 
Some other improvements: PrimeNum@O: +50%, ImageProc@Ounchecked: +15%, QuickSort@Ounchecked: +20%
There is one degradation in -O and -Ounchecked, which I still have to check: CommonMarkRender: -15%




Swift SVN r24414
2015-01-14 18:18:05 +00:00
Erik Eckstein
406698f5c1 Add an internal inliner option for testing the inline heuristics.
When setting the new option -sil-inline-test-threshold=<n> the inliner uses a simplified model
for instruction costs. This helps to test the inline heuristic.



Swift SVN r24178
2015-01-05 09:51:49 +00:00
Erik Eckstein
5cffa73bbd Refactoring: extract some code of the inliner to the projection classes.
Michael, thanks for your comments!

This also adds support for tuples and enums in Projection::getOperandForAggregate().



Swift SVN r24112
2014-12-23 16:26:08 +00:00
Erik Eckstein
e54e97b1fa Improve the inlining heuristic for cases where a closure is passed to a function.
This change adds a general method to see if inlining would enable constant propagation or
inlining of a closure.
For this it does not matter if a constant/function_ref/etc. is passed directly or within
a struct.
This first version only handles closures which are passed to an apply in the callee.

It fixes the performance problem of RangeAssignment (rdar://problem/19252374) and shows
minor improvements in some other benchmarks, e.g. CommonMarkRender.

The impact on code size is negligible (< 1%).



Swift SVN r24109
2014-12-23 09:42:52 +00:00
Andrew Trick
e8bae49cce Have SILPerformanceInlinerPass print its level: Early, Middle, Late.
Swift SVN r23901
2014-12-12 23:57:02 +00:00
Arnold Schwaighofer
9ff70c8bda Inliner: inline @inline(__always) function in cold blocks
Swift SVN r23687
2014-12-04 19:24:38 +00:00
Mark Lacey
20d2b07107 Use the new call graph in the performance inliner.
Resolves rdar://problem/18185768.

Swift SVN r22597
2014-10-08 07:22:56 +00:00
Erik Eckstein
9a6699ac43 Enable inlining of functions with the global_init attribute.
Inlining of such functions should only be done after GlobalOpt otherwise
GlobalOpt will not hoist them out of loops.



Swift SVN r22283
2014-09-25 14:30:38 +00:00
Erik Eckstein
c16c510167 Set SILLinkage according to visibility.
Now the SILLinkage for functions and global variables follows the Swift visibility (private, internal or public).

In addition, whether a function or global variable is considered fragile is kept in a separate flag at the SIL level.
Previously the linkage was used for this (e.g. no inlining of less visible functions into more visible functions). But it had no effect,
because everything was public anyway.

For now this isFragile-flag is set for public transparent functions and for everything if a module is compiled with -sil-serialize-all,
i.e. for the stdlib.

For details see <rdar://problem/18201785> Set SILLinkage correctly and better handling of fragile functions.

The benefits of this change are:
*) Enables elimination of unused private and internal functions
*) It should be possible now to use private in the stdlib
*) The symbol linkage is as one would expect (previously almost all symbols were public).

More details:

Specializations from fragile functions (e.g. from the stdlib) now get linkonce_odr,default
linkage instead of linkonce_odr,hidden, i.e. they have public visibility.
The reason is: if such a function is called from another fragile function (in the same module),
then it has to be visible from a third module, in case the fragile caller is inlined but not
the specialized function.

I had to update lots of test files, because many CHECK-LABEL lines include the linkage, which has changed.

The -sil-serialize-all option is now handled at SILGen and not at the Serializer.
This means that test files in sil format which are compiled with -sil-serialize-all
must have the [fragile] attribute set for all functions and globals.

The -disable-access-control option doesn't help anymore if the accessed module is not compiled
with -sil-serialize-all, because the linker will complain about unresolved symbols.

A final note: I tried to consider all the implications of this change, but it's not a low-risk change.
If you have any comments, please let me know.



Swift SVN r22215
2014-09-23 12:33:18 +00:00
Mark Lacey
b147b51200 Small clean-up in performance inliner.
Swift SVN r21560
2014-08-29 02:58:09 +00:00
Erik Eckstein
99cc7603be Add an @inline(__always) function attribute.
This will let the performance inliner inline a function even if the costs are too high.
This attribute is only a hint to the inliner.
If the inliner has other good reasons not to inline a function,
it will ignore this attribute, for example if the function is recursive
(which is currently not supported by the inliner).

Note that setting the inline threshold to 0 disables performance inlining entirely, and in
this case @inline(__always) also has no effect.
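
The rules stated in this message can be sketched as one decision function. ToyCallee and decideInline are hypothetical names, and the order of the checks is an assumption that is merely consistent with the description above; it is not taken from the inliner's source.

```cpp
// Toy description of a callee; all fields are illustrative assumptions.
struct ToyCallee {
  int cost;          // estimated inlining cost
  bool alwaysInline; // models the @inline(__always) attribute
  bool isRecursive;  // recursion is not supported by the inliner
};

// Decide whether to inline `callee` given the configured inline threshold.
bool decideInline(const ToyCallee &callee, int inlineThreshold) {
  if (inlineThreshold == 0)
    return false; // Performance inlining is off; the attribute has no effect.
  if (callee.isRecursive)
    return false; // A "good reason not to inline": the hint is ignored.
  if (callee.alwaysInline)
    return true; // The hint overrides the cost check.
  return callee.cost <= inlineThreshold;
}
```

The attribute therefore only bypasses the cost comparison; it does not bypass the checks that make inlining impossible or globally disabled.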



Swift SVN r21452
2014-08-26 00:56:34 +00:00