49 Commits

Author SHA1 Message Date
Mark Lacey
5394f41a3b Add a check to avoid infinite looping in the pass manager.
It's possible to construct programs where the optimization pass manager
will just continually execute, never making progress.

Add a check to the pass manager that only allows us to optimize a
limited number of functions with the function passes before moving on.

Unfortunately even the tiny test case that I have for this takes minutes
before we bail out with the limit I've set (which is not *that* much
bigger than the maximum that I saw in our build). I don't think it would
be prudent to add that test to the test suite, and I haven't managed to
come up with something that finishes in a more reasonable amount of time.

rdar://problem/21260480
2016-03-03 06:47:36 -08:00
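A minimal sketch of the kind of check described in 5394f41a3b, assuming a plain worklist of function IDs. The names `FunctionId`, `OptimizeFn`, and `runWithLimit` are hypothetical and are not the actual SILPassManager interface; the real limit value is not stated in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a function being optimized.
using FunctionId = int;

// A pass pipeline that may enqueue more work while optimizing a function.
using OptimizeFn = std::function<void(FunctionId, std::vector<FunctionId> &)>;

// Pops functions off the worklist and optimizes them, but gives up once
// `maxFunctions` functions have been processed. Returns how many functions
// were actually optimized.
std::size_t runWithLimit(std::vector<FunctionId> worklist,
                         std::size_t maxFunctions,
                         const OptimizeFn &optimize) {
  assert(maxFunctions > 0 && "limit must allow at least one function");
  std::size_t numOptimized = 0;
  while (!worklist.empty()) {
    if (numOptimized >= maxFunctions)
      break; // Bail out rather than loop forever.
    FunctionId f = worklist.back();
    worklist.pop_back();
    optimize(f, worklist);
    ++numOptimized;
  }
  return numOptimized;
}
```

With a pipeline that always re-enqueues work, the loop stops at the limit instead of running indefinitely; a well-behaved pipeline finishes before the limit is reached.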
Mark Lacey
27b63abeda Move RLE after inlining.
In theory we should be able to eliminate more loads if we run this after
the mem2reg that is after inlining. We aren't really relying heavily on
having promoted values like this prior to inlining.

Again, I see no significant performance delta, but this seems like the
best place to put this pass if we're only running it once per run of the
SSA passes.
2016-03-01 15:30:37 -08:00
Mark Lacey
014e312a2e Run redundant load elimination earlier in the pipeline.
Doing this earlier means that optimizations that are looking at SIL
values (rather than memory) have more opportunities earlier.

Minimal impact at the moment, but this may allow for removing some later
passes that are repeated.
2016-03-01 14:06:49 -08:00
Dmitri Gribenko
a9f8d97d3e Replace 'unsigned int' with 'unsigned'
'unsigned' is more idiomatic in LLVM style.
2016-02-27 16:20:27 -08:00
Mark Lacey
fa4e499e0e Fix comments.
2016-02-26 22:40:11 -08:00
Mark Lacey
f288c6c645 Remove two runs of the passes in AddSSAPasses.
Re-apply b00dcbe with a small test update, and a small change in pass
ordering.

I measure around a 10% reduction in compile times of release no-assert
builds of the stdlib and StdlibUnitTest.

For release + debug-swift builds, I see 20% reduction in stdlib compile
time.

My latest measurements show a few regressions at -O:
  Calculator
  NSError
  SetIsSubsetOf
  Sim2DArray

There is a small (0.1%) reduction in the libswiftCore.dylib size.

Being able to remove these is a consequence of the reordering that
happened in e50daa6.
2016-02-26 21:03:58 -08:00
Mark Lacey
b6de7239e6 Revert "Remove two runs of the passes in AddSSAPasses."
This reverts commit b00dcbebbf due to a
test failure.
2016-02-24 22:12:29 -08:00
Mark Lacey
b00dcbebbf Remove two runs of the passes in AddSSAPasses.
I measure around a 10% reduction in compile times of release no-assert
builds of the stdlib and StdlibUnitTest.

For release + debug-swift builds, I see 20% reduction in stdlib compile
time.

I saw no reproducible regressions in the benchmarks, and a few
improvements.

There is a small (0.1%) reduction in the libswiftCore.dylib size.

Being able to remove these is a consequence of the reordering that
happened in e50daa6.
2016-02-24 21:54:27 -08:00
Mark Lacey
e50daa6e3b Shuffle around some of the optimization passes.
The end goal here is to end up with a good pass ordering that will allow
us to only run one set of these passes, rather than running them
twice. This is a start in that direction.

No real impact measured on compile times as of this change. On
benchmarks I see a mix of regressions and improvements.

-O improvements:
  Calculator           -17.6%     1.21x
  Chars                -54.4%     2.19x
  PolymorphicCalls     -14.7%     1.17x
  SetIsSubsetOf        -14.1%     1.16x
  Sim2DArray           -14.1%     1.16x
  StrToInt             -30.4%     1.44x

-O regressions:
  CaptureProp          +32.9%     0.75x
  DictionarySwap       +36.0%     0.74x
  XorLoop              +39.8%     0.72x

-Ounchecked improvements:
  Chars                -58.0%     2.38x

-Ounchecked regressions:
  CaptureProp          +33.3%     0.75x

-Onone improvements:
  StrToInt             -14.9%     1.18x
  StringWalk           -47.6%     1.91x
  StringWithCString    -17.2%     1.21x
  (many more smaller improvements)

-Onone regressions:
  Calculator           +21.5%     0.82x
  OpenClose            +10.1%     0.91x
2016-02-24 14:18:08 -08:00
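The two numeric columns in the benchmark lists of e50daa6e3b appear to be related: the ratio matches old time divided by new time when the percentage is read as the change in running time. This is an inference from the figures, not something the message states. A small sketch under that assumption, with hypothetical function names:

```cpp
#include <cassert>
#include <cmath>

// Converts a percent change in running time (negative means faster) into an
// old/new speedup ratio. Assumes percent = (new - old) / old * 100.
double speedupFromPercentChange(double percentChange) {
  assert(percentChange > -100.0 && "running time cannot drop by 100% or more");
  return 1.0 / (1.0 + percentChange / 100.0);
}

// Rounds to two decimal places, matching the precision used in the lists.
double roundedSpeedup(double percentChange) {
  return std::round(speedupFromPercentChange(percentChange) * 100.0) / 100.0;
}
```

Under this reading, Calculator at -17.6% gives 1 / 0.824, or about 1.21x, and CaptureProp at +32.9% gives 1 / 1.329, or about 0.75x, which agrees with the listed values.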
Mark Lacey
594a0d8c08 Use AddSSAPasses to add low-level passes.
This replaces a very similar list of passes, added in a similar order,
by simply reusing the ordering from AddSSAPasses. Beyond the particular
inliner pass (which is maintained with this change), nothing in the
previous ordering was really specific to low-level code.

I measure a 1% increase in compile time of the stdlib, no perf
regressions (at -O), and a few decent improvements:
 19 CaptureProp                           5233             4129     -1104    -21.1%     1.27x
 30 ErrorHandling                         3053             2678      -375    -12.3%     1.14x
 65 Sim2DArray                             610              518       -92    -15.1%     1.18x

I expect to be able to get back the 1% compile-time hit (and probably
more) with future changes.
2016-02-20 14:38:21 -08:00
Mark Lacey
945065f37d Change where in the pass manager we validate that analyses are unlocked.
Verify just prior to running passes, and after running each pass, that
no analyses are locked from being invalidated.
2016-02-19 13:32:40 -08:00
Xin Tong
79c1f38724 Remove 1/5 iterations of redundant load elim.
I do not see a performance regression, but I do see a compilation time
improvement.
2016-02-09 22:20:10 -08:00
Xin Tong
4837889e63 Reapply Add a dead function elimination pass before we run SIL highlevel optimizations
I see an improvement in the time to compile the stdlib at -O.

=== Before adding the pass ===
real time: 1m3.472s

=== After adding the pass ===
real time: 1m1.793s
2016-02-05 22:19:02 -08:00
Slava Pestov
f2157c93d1 Revert "Add a dead function elimination pass before we run SIL highlevel optimizations."
This reverts commit 909c3b28c4 because it
broke SILOptimizer/sil_witness_tables_external_witnesstable.swift.
2016-02-05 20:57:11 -08:00
Xin Tong
909c3b28c4 Add a dead function elimination pass before we run SIL highlevel optimizations.
I see slight compilation time improvements.
2016-02-05 20:22:35 -08:00
Mark Lacey
82fd057eaf Remove devirtualization and specialization from the inliner.
Now that we process functions in bottom-up order in the pass manager and
have a mechanism to restart the pass pipeline on the current
function (or on a newly created callee function), we can split these
passes back out from the inliner and end up with the same benefits we
had from initially integrating them. We get the further benefit of fully
optimizing newly created callee functions before continuing with the
function that resulted in the creation of those callee
functions (e.g. as a result of a specialization pass running).
2016-02-04 08:52:01 -08:00
Mark Lacey
378e94b901 Formatting changes on recently added lines.
2016-02-01 21:50:01 -08:00
Mark Lacey
beb0f7dc2f Update pass manager execution strategy for function passes.
Allow function passes to:

1. Add new functions, to be optimized before continuing with the current
   function.
2. Restart the pipeline on the current function after the current pass
   completes.

This makes it possible to fully optimize callees that are the result of
specialization prior to generating interprocedural information or making
inlining choices about these callees.

It also allows us to solve a phase-ordering issue we have with generic
specialization, devirtualization, and inlining, by rescheduling the
current function after changes happen in one of these passes as opposed
to running all of these as part of the inlining pass as happens today.

Currently this is NFC since we have no passes that use this
functionality.
2016-02-01 16:47:26 -08:00
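The two capabilities listed in beb0f7dc2f can be sketched with a stack-like worklist in which each entry remembers the next pass to run. This is an illustration only: `PassResult`, `Pass`, `WorklistEntry`, and this `runFunctionPasses` are hypothetical names, and functions are represented by plain strings rather than SILFunction objects.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result of running one pass on one function.
struct PassResult {
  std::vector<std::string> newFunctions; // Functions to optimize first.
  bool restartPipeline = false;          // Rerun all passes on this function.
};

using Pass = std::function<PassResult(const std::string &)>;

// Each worklist entry remembers which pass to run next on its function.
struct WorklistEntry {
  std::string function;
  std::size_t nextPass;
};

// Returns the order in which functions finished the whole pipeline.
std::vector<std::string> runFunctionPasses(std::vector<std::string> bottomUp,
                                           const std::vector<Pass> &passes) {
  assert(!passes.empty() && "expected at least one pass");
  std::vector<WorklistEntry> worklist;
  // Push in reverse so the first function in bottom-up order is on top.
  for (auto it = bottomUp.rbegin(); it != bottomUp.rend(); ++it)
    worklist.push_back({*it, 0});

  std::vector<std::string> finished;
  while (!worklist.empty()) {
    // Use an index, since pushing new entries can invalidate references.
    std::size_t current = worklist.size() - 1;
    std::size_t passIndex = worklist[current].nextPass;
    if (passIndex == passes.size()) {
      finished.push_back(worklist[current].function);
      worklist.pop_back();
      continue;
    }
    PassResult result = passes[passIndex](worklist[current].function);
    // Restart from the first pass if requested, otherwise move on.
    worklist[current].nextPass = result.restartPipeline ? 0 : passIndex + 1;
    // New functions go on top, so they are fully optimized before the
    // current function continues.
    for (const std::string &f : result.newFunctions)
      worklist.push_back({f, 0});
  }
  return finished;
}
```

Note that a pass that requests a restart every time would keep this loop running forever, so a real implementation needs some bound on the amount of rework.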
Xin Tong
f73626eb28 Remove 4/5 runs of dead store elimination.
I did not measure a real performance difference on my local machine.

rdar://24392141

This is going to cut the compilation time spent in dead store elimination by 5x.

The remaining run of dead store elimination, placed just before the last
run of arc-sequence-opt, lets us catch some opportunities that passes like
Mem2Reg cannot eliminate, and it allows more freedom for code motion.

Stdlib -O after removing 4/5 dead stores.
=========================================

Running Time	Self (ms)		Symbol Name
22082.0ms   37.1%	0.0	 	    swift::runSILOptimizationPasses(swift::SILModule&)
21905.0ms   36.8%	0.0	 	     swift::SILPassManager::runOneIteration()
17616.0ms   29.6%	35.0	 	      swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform*>)
16667.0ms   28.0%	55.0	 	       swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
3063.0ms    5.1%	7.0	 	        (anonymous namespace)::SimplifyCFGPass::run()
2936.0ms    4.9%	20.0	 	        (anonymous namespace)::ARCSequenceOpts::run()
2343.0ms    3.9%	3.0	 	        (anonymous namespace)::SILCombine::run()
1900.0ms    3.1%	110.0	 	        (anonymous namespace)::SILCSE::run()
1642.0ms    2.7%	43.0	 	        (anonymous namespace)::RedundantLoadElimination::run()
1113.0ms    1.8%	6.0	 	        (anonymous namespace)::GenericSpecializer::run()
788.0ms    1.3%	120.0	 	        (anonymous namespace)::DCE::run()
495.0ms    0.8%	3.0	 	        (anonymous namespace)::SILCodeMotion::run()
304.0ms    0.5%	1.0	 	        (anonymous namespace)::StackPromotion::run()
292.0ms    0.4%	1.0	 	        (anonymous namespace)::ConstantPropagation::run()
269.0ms    0.4%	5.0	 	        (anonymous namespace)::ABCOpt::run()
236.0ms    0.3%	35.0	 	        (anonymous namespace)::SILSROA::run()
192.0ms    0.3%	2.0	 	        (anonymous namespace)::SILMem2Reg::run()
146.0ms    0.2%	65.0	 	        (anonymous namespace)::SILLowerAggregate::run()
132.0ms    0.2%	5.0	 	        (anonymous namespace)::LICM::run()
132.0ms    0.2%	7.0	 	        (anonymous namespace)::DeadStoreElimination::run()
96.0ms    0.1%	65.0	 	        (anonymous namespace)::Devirtualizer::run()
67.0ms    0.1%	59.0	 	        (anonymous namespace)::DeadObjectElimination::run()
62.0ms    0.1%	44.0	 	        (anonymous namespace)::RemovePinInsts::run()

StdlibUnitTest -O after removing 4/5 dead stores.
=================================================

Running Time	Self (ms)		Symbol Name
6958.0ms   26.9%	0.0	 	    swift::runSILOptimizationPasses(swift::SILModule&)
6923.0ms   26.8%	0.0	 	     swift::SILPassManager::runOneIteration()
5638.0ms   21.8%	5.0	 	      swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform*>)
5363.0ms   20.7%	8.0	 	       swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
1535.0ms    5.9%	1.0	 	        (anonymous namespace)::ARCSequenceOpts::run()
789.0ms    3.0%	2.0	 	        (anonymous namespace)::SimplifyCFGPass::run()
704.0ms    2.7%	0.0	 	        (anonymous namespace)::SILCombine::run()
615.0ms    2.3%	36.0	 	        (anonymous namespace)::SILCSE::run()
506.0ms    1.9%	14.0	 	        (anonymous namespace)::RedundantLoadElimination::run()
224.0ms    0.8%	44.0	 	        (anonymous namespace)::DCE::run()
150.0ms    0.5%	1.0	 	        (anonymous namespace)::SILCodeMotion::run()
113.0ms    0.4%	1.0	 	        (anonymous namespace)::StackPromotion::run()
98.0ms    0.3%	4.0	 	        (anonymous namespace)::DeadStoreElimination::run()
80.0ms    0.3%	3.0	 	        (anonymous namespace)::ABCOpt::run()
74.0ms    0.2%	5.0	 	        (anonymous namespace)::LICM::run()
2016-02-01 12:52:36 -08:00
Nadav Rotem
52ea0c6c48 Revert "Remove one invocation of the ARC optimizer."
This reverts commit 0515889cf0.

I made a mistake and did not catch this regression when I measured the change on
my local machine. The regression was detected by our automatic performance
tests. Thank you @slavapestov for identifying the commit.
2016-01-28 21:03:10 -08:00
Nadav Rotem
0515889cf0 Remove one invocation of the ARC optimizer.
Remove one of the invocations of the ARC optimizer. I did not measure any
regressions on the performance test suite (using -O), but I did see a
reduction in compile time on rdar://24350646.
2016-01-26 15:51:25 -08:00
Mark Lacey
5948ac38a6 Fix coding style: capitalize member variable
2016-01-18 22:38:29 -08:00
Mark Lacey
c37697d38e Add the stand-alone generic specializer pass back to the pipeline.
On the whole it looks like this currently benefits performance.

As with the devirtualization pass, once the updated inliner is
committed, the position of this pass in the pipeline will change.
2016-01-08 08:21:00 -08:00
Mark Lacey
57abe19198 Add the stand-alone devirtualizer pass back to the pipeline.
It looks like this has minimal performance impact either way. Once the
changes to make the inliner a function pass are committed, the position
of this in the pipeline will change.
2016-01-08 00:40:03 -08:00
Michael Gottesman
385c4a54dc [passmanager] When visiting functions in runFunctionPasses, make sure to check continueTransforming.
While debugging some code I noticed that we were not checking
continueTransforming everywhere that we needed to. This commit adds the missing
check.
2016-01-07 19:22:47 -08:00
Mark Lacey
176ba99c84 Don't run the stand-alone devirtualization and specialization passes.
They aren't needed at the moment, and running the specialization pass
early might have resulted in some performance regressions.

We can add these back in (and in the appropriate place in the pipeline)
when the changes to unbundle this functionality from the inliner go in.
2016-01-07 10:36:28 -08:00
practicalswift
1339b5403b Consistent use of header comment format.
Correct format:
//===--- Name of file - Description ----------------------------*- Lang -*-===//
2016-01-04 13:26:31 +01:00
Mark Lacey
149e1e4059 Fix 80-column violations.
2016-01-03 13:15:56 -08:00
Zach Panzarino
e3a4147ac9 Update copyright date
2015-12-31 23:28:40 +00:00
practicalswift
22e10737e2 Fix typos
2015-12-26 01:19:40 +01:00
Mark Lacey
70938b1aee Add a stand-alone devirtualizer pass.
Add back a stand-alone devirtualizer pass, running prior to generic
specialization. As with the stand-alone generic specializer pass, this
may add functions to the pass manager's work list.

This is another step in unbundling these passes from the performance
inliner.
2015-12-21 23:42:37 -08:00
Mark Lacey
faba6e56b7 Add a stand-alone generic specializer pass.
Begin unbundling devirtualization, specialization, and inlining by
recreating the stand-alone generic specializer pass.

I've added a use of the pass to the pipeline, but this is almost
certainly not going to be the final location of where it runs. It's
primarily there to ensure this code gets exercised.

Since this is running prior to inlining, it changes the order that some
functions are specialized in, which means differences in the order of
output of one of the tests (one which similarly changed when
devirtualization, specialization, and inlining were bundled together).
2015-12-18 14:08:56 -08:00
Mark Lacey
dbde7cc4c1 Update the pass manager to allow for function creation in function passes.
Add interfaces and update the pass execution logic to allow function
passes to create new functions, or ask for functions to be optimized
prior to continuing.

Doing so results in the pass pipeline halting execution on the current
function, and continuing with newly added functions, returning to the
previous function after the newly added functions are fully optimized.
2015-12-18 14:08:56 -08:00
Mark Lacey
90b45c4dd7 Extract method to run all function passes over a given function.
More small refactoring in the pass manager.
2015-12-17 14:57:30 -08:00
Mark Lacey
d770376981 Replace tabs with spaces.
Also run clang-format over the changed area.
2015-12-17 12:25:03 -08:00
Mark Lacey
fbb7abc7c6 Fix 80-column violations.
2015-12-17 12:25:03 -08:00
Mark Lacey
bed0da6472 Typo: consequtive -> consecutive
2015-12-16 22:43:54 -08:00
Mark Lacey
3ed75f4fb0 Move the pass manager's function worklist into PassManager.
Make it a std::vector that reserves enough space based on the number of
functions in the initial bottom-up ordering.

This is the first step in making it possible for function passes to
notify the pass manager of new functions to process.
2015-12-16 21:30:33 -08:00
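A rough sketch of a worklist owned by the pass manager as described in 3ed75f4fb0, assuming functions are identified by name. `PassManagerSketch`, `addFunctionToWorklist`, `hasWork`, and `popNext` are hypothetical names; the message only says the worklist is a std::vector with space reserved from the initial bottom-up ordering.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pass manager fragment that owns its function worklist.
class PassManagerSketch {
  std::vector<std::string> FunctionWorklist;

public:
  // `bottomUpOrder` lists callees before their callers.
  explicit PassManagerSketch(const std::vector<std::string> &bottomUpOrder) {
    // The initial size is known, so reserve the space up front.
    FunctionWorklist.reserve(bottomUpOrder.size());
    // Store in reverse so popping from the back yields bottom-up order.
    FunctionWorklist.assign(bottomUpOrder.rbegin(), bottomUpOrder.rend());
  }

  // Lets a pass hand a newly created function to the manager.
  void addFunctionToWorklist(const std::string &f) {
    FunctionWorklist.push_back(f);
  }

  bool hasWork() const { return !FunctionWorklist.empty(); }

  // Removes and returns the next function to process.
  std::string popNext() {
    assert(hasWork() && "worklist is empty");
    std::string f = FunctionWorklist.back();
    FunctionWorklist.pop_back();
    return f;
  }
};
```

Keeping the worklist as a member, rather than a local in the run loop, is what gives passes a place to register new functions while the loop is in progress.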
Mark Lacey
226a825807 Simplify the pass manager execution logic.
Make it a bit more clear that we're alternating between collecting (and
then running) function passes, and running module passes. Removes some
duplication that was present.

Reapplies 9d4d3c8 with fixes for bisecting pass execution.
2015-12-15 15:17:53 -08:00
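The alternation described in 226a825807 can be illustrated roughly as follows: consecutive function passes are collected into a batch, and the batch is run whenever a module pass is reached. `PassInfo` and this `runOneIteration` are hypothetical, and the sketch only records the batches instead of running real passes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pass descriptor: only the kind and a name matter here.
struct PassInfo {
  bool isFunctionPass;
  std::string name;
};

// Returns a log of the batches in the order they would execute.
std::vector<std::string> runOneIteration(const std::vector<PassInfo> &pipeline) {
  std::vector<std::string> log;
  std::vector<std::string> pendingFunctionPasses;

  // "Runs" the collected function passes as one batch, then clears them.
  auto flushFunctionPasses = [&]() {
    if (pendingFunctionPasses.empty())
      return;
    std::string batch = "function:";
    for (const std::string &name : pendingFunctionPasses)
      batch += " " + name;
    log.push_back(batch);
    pendingFunctionPasses.clear();
  };

  for (const PassInfo &pass : pipeline) {
    assert(!pass.name.empty() && "every pass needs a name");
    if (pass.isFunctionPass) {
      // Keep collecting until a module pass interrupts the run.
      pendingFunctionPasses.push_back(pass.name);
      continue;
    }
    flushFunctionPasses();
    log.push_back("module: " + pass.name);
  }
  flushFunctionPasses(); // Run any trailing function passes.
  return log;
}
```

Having a single flush step for the pending function passes is one way to avoid duplicating the "run what has been collected" logic before each module pass and at the end of the pipeline.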
Mark Lacey
59544560d1 Revert "Simplify the pass manager execution logic."
This reverts commit 9d4d3c8055.

I forgot to finish up changes required to make -Xllvm
-sil-opt-pass-count continue working the way it did, so I'll back that
out until I have those changes as well.
2015-12-15 13:23:59 -08:00
Mark Lacey
9d4d3c8055 Simplify the pass manager execution logic.
Make it a bit more clear that we're alternating between collecting (and
then running) function passes, and running module passes. Removes some
duplication that was present.
2015-12-15 13:08:08 -08:00
Mark Lacey
a8fbc4722f Minor pass manager refactoring.
Extract the code related to running a module pass into a separate
function.
2015-12-15 10:25:38 -08:00
Arnold Schwaighofer
edf9ca06fc Unroll loops with known short trip count
This enables array value propagation in array literal loops like:

for e in [2,3,4] {
  r += e
}

Allowing us to completely get rid of the array.

rdar://19958821
SR-203
2015-12-14 12:03:42 -08:00
Arnold Schwaighofer
6662e7432a Reapply Add a pass to propagate constant array values to array subscript calls
This reverts commit 82ff59c0b9.

Original commit message:

This allows us to compile the function:

func valueArray() -> Int {
  var a = [1,2,3]
  var r = a[0] + a[1] + a[2]
  return r
}

Down to just a return of the value 6. And should eventually allow us to remove
the overhead of vararg calls.

rdar://19958821
2015-12-14 12:03:41 -08:00
Mark Lacey
6c4bc75d3f Use a work list when running function passes.
Rather than iterating over an array of functions, build a work list and
pop functions off of it.

This is a small step towards allowing function passes to create new
functions to be processed.
2015-12-13 20:42:07 -08:00
Michael Gottesman
d94fa0a515 Merge pull request #501 from practicalswift/fix-typos-5
Fix typos (5 of 30)
2015-12-13 18:55:20 -06:00
practicalswift
fdeb03033c Fix typo: classsic → classic
2015-12-14 00:11:23 +01:00
practicalswift
39f3e49e27 Fix typo: analyis → analysis
2015-12-13 23:56:40 +01:00
Andrew Trick
739b0e9c56 Reorganize SILOptimizer directories for better discoverability.
(libraries now)

It has been generally agreed that we need to do this reorg, and now
seems like the perfect time. Some major pass reorganization is in the
works.

This does not have to be the final word on the matter. The consensus
among those working on the code is that it's much better than what we
had and a better starting point for future bike shedding.

Note that the previous organization was designed to allow separate
analysis and optimization libraries. It turns out this is an
artificial distinction and not an important goal.
2015-12-11 15:14:23 -08:00