142 Commits

Author SHA1 Message Date
Arnold Schwaighofer
1d97e5eb38 Fix the description of sil-view-cfg.
Swift SVN r21242
2014-08-15 22:54:11 +00:00
Arnold Schwaighofer
97b6216f47 Add option to view the CFG after optimizations
View the sil cfg of a function at the end of the compilation pipeline with
  swiftc -O -Xllvm -sil-view-cfg -Xllvm -view-cfg-only-for-function=foobar

Swift SVN r21238
2014-08-15 22:45:53 +00:00
Arnold Schwaighofer
04ca5e128e ABCOpts: Add bounds check hoisting of induction identity accesses
We can now hoist the check in:

 for (i = start; i != end; ++i)
   a[i] = ...

or

 for i in start ..< end
   a[i] = ...

We will also hoist invariant checks as in

 k =
 for i in start ..< end
   a[k] = ...

We will also hoist the overflow check for "++i" out of the loop.

The only thing blocking vectorization of memset loops is the fact that we are
overflow checking the type size multiplication of array accesses. "a[i]" is
translated to "a + sizeof(T) * i" and this multiplication is still overflow
checked.

We can remove bounds checks in PrimeNum, XorLoop, Hash, MemSet, NBody, Walsh.

benchmark              ,  baserun0  ,  optrun0  ,  delta    ,        speedup
Memset                 ,  2371.00   ,  1180.00  ,  1179.00  ,        99.9%
XorLoop                ,  1403.00   ,  1255.00  ,  149.00   ,        11.9%

rdar://14757945

Swift SVN r21182
2014-08-13 20:00:08 +00:00
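
The hoisting described in r21182 above can be illustrated with a small C++ sketch. This is not the SIL pass itself: `CheckCounter`, `boundsCheck`, `fillChecked`, and `fillHoisted` are hypothetical names, and the sketch assumes `start <= end`. It only shows why one range check ahead of the loop can stand in for a check on every iteration when the index is the induction variable itself.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Counts executed bounds checks so the two loop forms can be compared.
struct CheckCounter {
  std::size_t checks = 0;
};

// Hypothetical stand-in for the trap a failed bounds check would cause.
void boundsCheck(bool inRange, CheckCounter &counter) {
  ++counter.checks;
  if (!inRange)
    throw std::out_of_range("array index out of range");
}

// Before hoisting: a[i] is checked on every iteration.
// Assumption: start <= end.
void fillChecked(std::vector<int> &a, std::size_t start, std::size_t end,
                 int value, CheckCounter &counter) {
  for (std::size_t i = start; i != end; ++i) {
    boundsCheck(i < a.size(), counter);
    a[i] = value;
  }
}

// After hoisting: i takes exactly the values start, ..., end - 1, so a
// single check that end <= size covers every access in the loop.
// Assumption: start <= end.
void fillHoisted(std::vector<int> &a, std::size_t start, std::size_t end,
                 int value, CheckCounter &counter) {
  if (start == end)
    return; // no accesses, so nothing to check
  boundsCheck(end <= a.size(), counter);
  for (std::size_t i = start; i != end; ++i)
    a[i] = value;
}
```

One difference worth noting: when the range is out of bounds, the hoisted form fails before any element is written, while the per-iteration form fails after writing the in-range prefix.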
Arnold Schwaighofer
24d4d9d904 SimplifyCFG: Implement general jumpthreading using the SSAUpdater
simplifyInstruction consciously does not create constants (AFAIK) so we need to
run instruction combine after simplify-cfg to enable more cfg simplification
exposed by jumpthreading. We can revisit this decision in a follow-up commit if
necessary (I believe it to be useful for simplifyInstruction to be able to
create constants and for simplifycfg to use simplifyInstruction on the branch
condition).

O:
benchmark      ,  baserun0  ,  optrun0  ,  delta,  speedup
Fibonacci      ,  1473.00   ,  1317.00  ,  75.00   ,        5.8%
Histogram      ,  407.00    ,  390.00   ,  23.00   ,        6.1%
InsertionSort  ,  1273.00   ,  1200.00  ,  79.00   ,        6.7%
Life           ,  74.00     ,  69.00    ,  4.00    ,        5.8%
NestedLoop     ,  937.00    ,  883.00   ,  56.00   ,        6.4%
R17315246      ,  8.00      ,  801.00   ,  793.00  ,        -99.0%
SelectionSort  ,  1150.00   ,  921.00   ,  226.00  ,        24.5%

Ounchecked:
Histogram      ,  394.00    ,  342.00   ,  51.00   ,        15.0%
InsertionSort  ,  1122.00   ,  1024.00  ,  85.00   ,        8.3%
Life           ,  57.00     ,  44.00    ,  9.00    ,        20.9%
SelectionSort  ,  1312.00   ,  1060.00  ,  246.00  ,        23.3%

The R17315246 regression is somewhat bad. We depend on a loop form such that
LLVM transforms the loop into an inner loop that just iterates from x to y and
is unrolled (the good version). The slow version has a loop with cond_fail
control flow and is not unrolled. The loop does nothing more than count up.
I have not been able to narrow this down further.

rdar://16821595

Swift SVN r21124
2014-08-09 00:05:30 +00:00
Nadav Rotem
6358d67a73 Cleanup the optimization driver. NFC.
Swift SVN r21058
2014-08-06 01:02:09 +00:00
Nadav Rotem
7faa5883df Add a basic Class Hierarchy Analysis. At this point it only lists classes that are inherited from in this module.
Swift SVN r20710
2014-07-29 23:01:01 +00:00
Michael Gottesman
e9a7f91667 Revert "Add the frontend option -disable-sil-perf-optzns."
Revert "For debugging purposes allow passes to stop any more passes from running by calling PassManager::stopRunning()."

This reverts commit r20604.
This reverts commit r20606.

This was some debugging code that snuck in.

Swift SVN r20615
2014-07-28 06:21:30 +00:00
Michael Gottesman
7136c53208 For debugging purposes allow passes to stop any more passes from running by calling PassManager::stopRunning().
The intended use case is the user puts in a counter and wants the pass manager
to ensure that no further passes run.

Swift SVN r20606
2014-07-27 18:37:12 +00:00
Michael Gottesman
112269fb33 Add the frontend option -disable-sil-perf-optzns.
Swift SVN r20604
2014-07-27 18:37:11 +00:00
Michael Gottesman
ec533fafa7 [load-store-opts] Enable global load store opts.
rdar://17680758



Swift SVN r20349
2014-07-22 23:43:44 +00:00
Arnold Schwaighofer
f6b0682988 Array bounds check optimization pass
Implements redundant bounds check elimination for basic blocks and along the
dominator tree of loops.

No induction variable based hoisting yet.

O3:
NBody          ,  473.00     ,  122.00    ,  294.2%
QuickSort      ,  477.00     ,  310.00    ,  53.9%
RC4            ,  1022.00    ,  736.00    ,  38.6%
Walsh          ,  1781.00    ,  1142.00   ,  55.5%

No effect on Ofast.

Disabled for now.

Swift SVN r20199
2014-07-19 01:18:50 +00:00
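
As a rough illustration of the basic-block half of r20199 above, the following C++ sketch drops a bounds check when the same array and index were already checked earlier in the block and the array's size has not changed since. The `Inst` and `Kind` types and `removeRedundantChecks` are hypothetical, and the dominator-tree part of the pass is not modeled.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class Kind { BoundsCheck, SizeChange, Other };

// Hypothetical instruction record. `array` and `index` are value names;
// `index` is only meaningful for BoundsCheck.
struct Inst {
  Kind kind;
  std::string array;
  std::string index;
};

// Removes bounds checks that repeat an earlier check of the same
// (array, index) pair within one basic block.
std::vector<Inst> removeRedundantChecks(const std::vector<Inst> &block) {
  std::set<std::pair<std::string, std::string>> checked;
  std::vector<Inst> out;
  for (const Inst &inst : block) {
    if (inst.kind == Kind::BoundsCheck) {
      // emplace reports whether the pair was newly inserted.
      if (!checked.emplace(inst.array, inst.index).second)
        continue; // already checked: redundant
    } else if (inst.kind == Kind::SizeChange) {
      // The array may have shrunk, so forget every check on it.
      for (auto it = checked.begin(); it != checked.end();) {
        if (it->first == inst.array)
          it = checked.erase(it);
        else
          ++it;
      }
    }
    out.push_back(inst);
  }
  return out;
}
```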
Arnold Schwaighofer
2eeec98426 SIL Passes: Kill dead code and stale comment
Swift SVN r20198
2014-07-19 01:18:50 +00:00
Andrew Trick
250bb973bb Add an array optimization pass that hoists make_mutable calls.
This gives us a 10x speedup on -Ofast memset, which was the original
goal. We're now within 2x of C, but the C code produces movdqu instead
of movups for twice the throughput:
<rdar://problem/17722727> Memset at -Ofast is 2x slower than C

-O3 results:
| benchmark     | baserun0 |  optrun0 |   delta | speedup |
| Memset        | 39885.00 | 33978.00 | 5907.00 |   17.4% |
| NBody         |   459.00 |   440.00 |   19.00 |    4.3% |
| QuickSort     |   456.00 |   439.00 |   17.00 |    3.9% |
| StringWalk    |   625.00 |   647.00 |   22.00 |   -3.4% |
| SmallPT       |   557.00 |   575.00 |   18.00 |   -3.1% |
| Phonebook     |  1804.00 |  1862.00 |   58.00 |   -3.1% |

Memset, NBody, and QuickSort are the ones we expected to improve.

We don't get much gain on O3 because retains/release and bounds checks
are still there.
<rdar://problem/17719220> QuickSort -O3 has retains/releases in the inner loop.

Given the Ofast results, I think the small degradations at O3 are noise.

-Ofast results:
| benchmark     | baserun0 |  optrun0 |   delta | speedup |
| Memset        |  5453.00 |   452.00 | 5001.00 | 1106.4% |
| NBody         |   772.00 |   437.00 |  335.00 |   76.7% |
| Walsh         |  1530.00 |  1096.00 |  434.00 |   39.6% |
| QuickSort     |   682.00 |   524.00 |  158.00 |   30.2% |
| Phonebook     |  1453.00 |  1561.00 |  108.00 |   -6.9% |
| Hash          |   993.00 |   958.00 |   35.00 |    3.7% |
| StringWalk    |   458.00 |   446.00 |   12.00 |    2.7% |
| StringBuilder |  1603.00 |  1568.00 |   35.00 |    2.2% |

Swift SVN r20145
2014-07-18 06:52:17 +00:00
Arnold Schwaighofer
39e77dc7c6 Enable loop rotation and LICM
We want this to get tested.

O3:
Ackermann 8.0%
GlobalClass 30.8%
R17315246 -50.2%
Phonebook 7.4%

Ofast:
RC4 -9.7%

The R17315246 regression is because LLVM seems to be unable to 'unswitch' the
loop that makes up this benchmark after rotation. The only explanation I have at
the moment is that after rotation the first exit is a cond_fail.

I looked at RC4's profile and did not see anything suspicious. I was chasing a
10% regression yesterday in Phonebook (today I seem to see about a 7%
improvement). So I am not sure how 'stable' our benchmarks are with respect to
cache effects (we are calling into runtimes and whatnot).

Swift SVN r20098
2014-07-17 16:12:45 +00:00
Arnold Schwaighofer
12cb97d284 PassManager: Reset state and remove all currently owned transformations
In the current setup analysis information is not reused by new pass managers.
There is no point in having different pass managers. Instead, we can just remove
transformations, reset the internal state of the pass manager, and add new
transformation passes. Analysis information can be reused.

Reuse one pass manager in the pass pipeline so that we don't have to
unnecessarily recompute analysis information.

Swift SVN r19917
2014-07-14 03:42:39 +00:00
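
The idea in r19917 above can be sketched in C++ as follows. `MiniPassManager` and `DominanceAnalysis` are hypothetical stand-ins rather than the real SIL classes. The point shown is only that clearing the list of transformations while keeping the analysis object lets a second round of passes reuse the cached result.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive analysis that caches its result.
struct DominanceAnalysis {
  int timesComputed = 0;
  bool valid = false;

  void ensureComputed() {
    if (!valid) {
      ++timesComputed; // the expensive work would happen here
      valid = true;
    }
  }
};

// Hypothetical pass manager that owns transformations and one analysis.
class MiniPassManager {
public:
  using Transform = std::function<void(DominanceAnalysis &)>;

  void add(Transform t) { transforms.push_back(std::move(t)); }

  void run() {
    for (Transform &t : transforms)
      t(analysis);
  }

  // Drops the transformations but keeps the cached analysis state.
  void resetAndRemoveTransformations() { transforms.clear(); }

  const DominanceAnalysis &getAnalysis() const { return analysis; }

private:
  std::vector<Transform> transforms;
  DominanceAnalysis analysis;
};
```

With two separate managers, each would own its own `DominanceAnalysis` and compute it once per manager. With one reused manager, it is computed once in total unless a pass invalidates it.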
Michael Gottesman
7d5751594d Add in a post order analysis that lazily recomputes post orders for functions when they are invalidated.
This ensures that if we have a bunch of passes in a row which modify the CFG, we
do not continually rebuild the post order, while still letting multiple passes
which do not touch the CFG share the same post order and reverse post order
rather than recomputing them.

rdar://17654239

Swift SVN r19913
2014-07-14 01:32:24 +00:00
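
A minimal C++ sketch of the lazy recomputation described in r19913 above, assuming a CFG given as successor lists with block 0 as the entry. `Cfg` and `LazyPostOrder` are hypothetical names. Invalidation only sets a flag, so several CFG-modifying passes in a row trigger a single rebuild at the next query.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: succs[b] lists the successors of block b; block 0 is
// the entry block.
struct Cfg {
  std::vector<std::vector<int>> succs;
};

class LazyPostOrder {
public:
  explicit LazyPostOrder(const Cfg &graph) : graph(graph) {}

  // Called by passes that change the CFG. Cheap: no recomputation here.
  void invalidate() { valid = false; }

  // Recomputes the post order only if it was invalidated since last use.
  const std::vector<int> &get() {
    if (!valid) {
      order.clear();
      visited.assign(graph.succs.size(), false);
      if (!graph.succs.empty())
        visit(0);
      valid = true;
      ++recomputations;
    }
    return order;
  }

  int recomputations = 0; // exposed for illustration only

private:
  // Depth-first walk that records a block after all of its successors.
  void visit(int block) {
    visited[block] = true;
    for (int succ : graph.succs[block])
      if (!visited[succ])
        visit(succ);
    order.push_back(block);
  }

  const Cfg &graph;
  std::vector<int> order;
  std::vector<bool> visited;
  bool valid = false;
};
```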
Mark Lacey
021983017a Add a SIL SCC visitor and an induction variable analysis.
The induction variable analysis derives from the SCC visitor CRTP-style
and uses it to drive analysis to find the IVs of a function.

The current definition of induction variable is very weak, but enough to
use for very basic bounds-check elimination.

This is not quite ready for real use. There is an assert that I've
commented out that is firing but should not be, and that will require
some more investigation.

Swift SVN r19845
2014-07-11 02:48:03 +00:00
Arnold Schwaighofer
51eb9269ab Add a SIL LICM pass
The main purpose of this pass is to hoist invariant loads out of loops. This
will enable llvm to vectorize loops with array accesses in Ofast once we hoist
the makeUnique functions.

Disabled for now.

rdar://17142604

Swift SVN r19713
2014-07-08 23:51:48 +00:00
Arnold Schwaighofer
5ad13207e5 Add a loop rotation pass
This is to support loop invariant code motion and bound check
hoisting.

Disabled for now.

Swift SVN r19635
2014-07-07 21:05:23 +00:00
Nadav Rotem
55c136662b Fix a typo in the code that initializes the inliner. The high-level inliner does not inline @semantics functions. We did not notice this change because there is no code in the standard library that is annotated with @semantics.
Swift SVN r19591
2014-07-07 05:52:52 +00:00
Nadav Rotem
eacd26b203 Split the optimization pipe into two parts: high-level and low-level optimizations.
In the high-level we don't inline functions with special semantics to allow high-level optimizations.

In this change we are moving from three SSA iterations to two high-level and two low-level iterations of the SSA optimization pipeline.
This change reduces the SmallPT benchmark execution time by 50% and changes the overall testsuite score by 9%.



Swift SVN r19581
2014-07-05 21:10:03 +00:00
Mark Lacey
795d22b8a2 Add another pass of CFG simplification after inlining.
Inlining exposes more opportunities for CFG simplifications, and this
could be beneficial before ARC opts.

Because we create inline "caches" fairly late we also need this in order
to clean up redundant checked_cast_br instructions that are exposed as a
result of inlining since we only run the SSA passes once after the
inline cache pass.

The change to actually optimize the checked_cast_br is forthcoming.

Swift SVN r19557
2014-07-04 06:38:22 +00:00
Mark Lacey
9d80cf2d8e Remove extra whitespace after paren, and fix comment.
Swift SVN r19556
2014-07-04 06:38:22 +00:00
Nadav Rotem
1ec9248f1f Refactor the code that sets the SSA pass manager. NFC.
Swift SVN r19442
2014-07-02 00:12:10 +00:00
Nadav Rotem
cabc53a4a2 Update the optimization pipeline to be mostly non-iterative.
There are no major regressions on the pre-commit performance workloads.



Swift SVN r19166
2014-06-25 17:16:28 +00:00
Nadav Rotem
50b2d0e73d Rename early binding -> inline caches.
Swift SVN r19063
2014-06-21 05:06:37 +00:00
Michael Gottesman
a0f7d9c3fd Add an option to run the inst-count pass after performing optimizations.
Together with -print-stats, this makes it possible to quickly find out
the final count of various forms of instructions. My intention is to use
this to count retains and releases.

Swift SVN r18946
2014-06-17 02:30:34 +00:00
Nadav Rotem
ccc061a69f Add the EarlyBinding pass as a late lowering pass (not a part of the iterating pass manager).
Swift SVN r18863
2014-06-13 07:21:18 +00:00
Nadav Rotem
cd1bae4bab Refactor the code that adds analysis passes. NFC.
Swift SVN r18862
2014-06-13 07:18:05 +00:00
Michael Gottesman
fd33095b9a [g-arc-opts] Enable the Global ARC Optimizer by default.
Keep in mind that there is still more work to be done in the optimizer related
to loops, partial merging, etc. But this is the most basic multiple basic block
optimizer that has the features we want.

Swift SVN r18707
2014-06-05 04:59:40 +00:00
Michael Gottesman
3ebbaaa091 [sil-enum-simplification] Add new pass enum simplification that propagates enum case information down the CFG. Currently it only simplifies ref count operations in the same basic block, but it could be pushed further.
Swift SVN r18698
2014-06-04 04:43:05 +00:00
Manman Ren
b3e72be9d9 Remove unused deserialized SILFunctions.
The deserializer holds a reference to the deserialized SILFunction, which
prevents Dead Function Elimination from erasing them. 

We have a tradeoff on how often we should clean up the unused deserialized
SILFunctions. If we clean up at every optimization iteration, we may
end up deserializing the same SILFunction multiple times. For now, we clean
up only after we are done with the optimization iteration.

rdar://17046033


Swift SVN r18697
2014-06-04 00:30:34 +00:00
Mark Lacey
8cbefaeef4 During DCE, eliminate branches to dead regions of code.
Enhances DCE to make unreachable those regions of code that have no
effect.

This allows loops like:
  for i in 0..n {
    // do nothing
  }
to be eliminated by first running DCE to make the loop unreachable, and
then CFG simplification to actually delete the blocks that make up the
loop (assuming we're talking -Ofast and cond_fails have been removed).

What's especially nice is that this can make unreachable several levels
of dead code, including deleting the code that produces the values used
to conditionally branch to other dead code, all in a single pass rather
than needing to iterate between DCE and CFG simplification to achieve
the same effect. For example, this:

  func f(b: Bool, c: Bool, d: Bool) {
    if (b && c) {
      // nothing useful here
      if (c && d) {
        // nothing useful here
        if (b && d) {
          // nothing useful here
        }
      }
    }
  }

is effectively reduced to:
  func f(b: Bool, c: Bool, d: Bool) {
    goto end      // pretend for a second we have goto
    if (b && c) {
      // nothing useful here
      if (c && d) {
        // nothing useful here
        if (b && d) {
          // nothing useful here
        }
      }
    }
    end:
  }
after a single pass, after which unreachable code elimination reduces
this to:
  func f(b: Bool, c: Bool, d: Bool) {
  }

Swift SVN r18664
2014-05-30 00:00:11 +00:00
Mark Lacey
8156008cd8 Add a dead code elimination optimization pass.
In a loop like this:
  var j = 2
  for var i = 0; i < 100; ++i {
    j += 3
  }
it will completely eliminate j.

It does not yet support rewriting conditional branches as unconditional
branches in the cases where only empty blocks are control dependent on
an edge. Once this support is added, it will also completely eliminate
the loop itself.

Swift SVN r18615
2014-05-24 07:02:18 +00:00
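
To illustrate how the loop example in r18615 above can lose `j` entirely, here is a hedged C++ sketch of mark-and-sweep dead code elimination over SSA-style instructions. The `Inst` record and `eliminateDeadCode` are hypothetical, and this is not necessarily how the actual pass is implemented. Instructions with side effects are the roots, liveness flows backward through operands, and anything never reached is removed. A value that only feeds its own update through a cycle is never reached, so the whole cycle is dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical SSA-style instruction: `dest` is empty for instructions
// that produce no value, such as a conditional branch.
struct Inst {
  std::string dest;
  std::vector<std::string> operands;
  bool hasSideEffects;
};

std::vector<Inst> eliminateDeadCode(const std::vector<Inst> &insts) {
  // Map each value name to the instruction that defines it.
  std::unordered_map<std::string, std::size_t> definedBy;
  for (std::size_t i = 0; i < insts.size(); ++i)
    if (!insts[i].dest.empty())
      definedBy[insts[i].dest] = i;

  // Mark: start from side-effecting instructions.
  std::vector<bool> live(insts.size(), false);
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < insts.size(); ++i) {
    if (insts[i].hasSideEffects) {
      live[i] = true;
      worklist.push_back(i);
    }
  }
  // Propagate liveness backward through operands.
  while (!worklist.empty()) {
    std::size_t current = worklist.back();
    worklist.pop_back();
    for (const std::string &operand : insts[current].operands) {
      auto def = definedBy.find(operand);
      if (def != definedBy.end() && !live[def->second]) {
        live[def->second] = true;
        worklist.push_back(def->second);
      }
    }
  }

  // Sweep: keep only the live instructions, in their original order.
  std::vector<Inst> out;
  for (std::size_t i = 0; i < insts.size(); ++i)
    if (live[i])
      out.push_back(insts[i]);
  return out;
}
```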
Mark Lacey
a9368d137b Update file contents for rename of DeadCodeElimination.
Now that the file is called DiagnoseUnreachable.cpp, update the contents
to match.

Swift SVN r18614
2014-05-24 07:02:17 +00:00
Nadav Rotem
6af5232769 Re-enable the global initializer hoisting pass.
rdar://16989584



Swift SVN r18531
2014-05-21 23:25:12 +00:00
Nadav Rotem
ff5601aaf1 Disable the GlobalOpt pass that hoists initializers because it regresses the Phonebook benchmark.
Swift SVN r17998
2014-05-13 07:53:48 +00:00
Andrew Trick
2e18d8ab49 Reenable the GlobalOpt pass.
Despite my comment in r17554, the pass was still disabled because of a
potential regression. After enough inlining, _cocoaStringSubscript
addressor is hoisted outside the string comparison, which is of course
a loop.

The regression is hard to measure, ~0.5%, so it's been decided that
we're going to live with it, rather than doing something nasty like
recognizing certain variable names. The post-WWDC fix:
<rdar://problem/16836228> Add an @cold attribute to identify unlikely
code paths.

Swift SVN r17609
2014-05-07 06:42:44 +00:00
Michael Gottesman
1f93ec5480 [devirtualization] Remove deep devirtualization code that we are not using for WWDC.
If we decide in the future to do this we can always revert this commit.

Swift SVN r17293
2014-05-03 00:33:46 +00:00
Nadav Rotem
57662545fd Don't hoist global initializers from a cold path into the entry block of the function.
And especially don't do that for String subscript. The plan now is to just hoist global initializers out of loops.

And we are faster than ObjC on the StringSort benchmark.



Swift SVN r17240
2014-05-02 06:46:24 +00:00
Andrew Trick
ab129dfb39 Add -global-opt pass.
Currently, this pass simply hoists calls to addressor functions up to
the function entry point. This solves most of the performance problem.

Fixes <rdar://problem/16500879> Need to hoist @swift_once outside of loops.

Swift SVN r16684
2014-04-23 01:09:48 +00:00
Michael Gottesman
cffc3d372d [constant-propagation] Refactor constant propagation slightly to disable diagnostics so we can use it in the performance passes to help with branch simplification.
This commit also enables constant propagation in the performance
pipeline.

Since we are close to WWDC, this commit purposefully touches the pass
minimally (despite my hands wanting to refactor it so badly), just enough
so that we get the desired result with minimal in-tree turmoil.

rdar://16604715

Swift SVN r16388
2014-04-16 01:49:16 +00:00
Michael Gottesman
0727628c8c [deserialization] Add in the linker pass.
Swift SVN r15671
2014-03-31 08:40:36 +00:00
Chris Lattner
8869767260 Implement the rest of rdar://16242700
Fix a phase ordering problem: SILGen of a noreturn function doesn't drop an unreachable after the function,
and doing so is problematic for various reasons (all expressions would have to handle their insertion point
vaporizing, and would have to emit unreachable code diagnostics).  Instead, run a simple pass that folds
noreturn calls and diagnoses unreachable code, and do it before DI.  This prevents DI from seeing false
paths, and rejecting what seems like invalid code.



Swift SVN r14711
2014-03-06 01:29:32 +00:00
Michael Gottesman
c25d6f8390 [mandatory-inlining] Use getOptions() instead of passing around options.
Swift SVN r14493
2014-02-28 01:51:53 +00:00
Michael Gottesman
29e1a53bbb [deserialization] Deserialize transparent functions lazily iff they will be used in mandatory inlining.
Swift SVN r14490
2014-02-28 01:05:01 +00:00
Nadav Rotem
d52cbc89dd Rename AllocRefElim -> DeadObjectElim. NFC.
Swift SVN r14179
2014-02-20 23:14:59 +00:00
Michael Gottesman
8cff098f1e Split SILCodeMotion into two passes, LoadStoreOpts and SILCodeMotion.
LoadStoreOpts removes duplicate loads, forwards stores to loads, and eliminates
dead stores.

Swift SVN r13789
2014-02-11 23:36:51 +00:00
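
The three effects listed for LoadStoreOpts in r13789 above can be sketched for a single basic block in C++. The `MemInst` and `BlockResult` types and `optimizeBlock` are hypothetical, and the sketch assumes that differently named addresses never alias and that nothing else in the block touches memory. The real pass has to reason about aliasing, which is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Op { Load, Store };

// Hypothetical memory instruction. For a load, `value` names the result;
// for a store, `value` names the value being written.
struct MemInst {
  Op op;
  std::string addr;
  std::string value;
};

struct BlockResult {
  std::vector<MemInst> insts;                      // remaining instructions
  std::map<std::string, std::string> replacements; // removed load -> value
};

BlockResult optimizeBlock(const std::vector<MemInst> &block) {
  BlockResult result;
  std::map<std::string, std::string> available;   // addr -> known contents
  std::map<std::string, std::size_t> unreadStore; // addr -> index in kept
  std::vector<MemInst> kept;
  std::vector<bool> dead;

  for (const MemInst &inst : block) {
    if (inst.op == Op::Load) {
      auto known = available.find(inst.addr);
      if (known != available.end()) {
        // Duplicate load or store-to-load forwarding: reuse the value.
        result.replacements[inst.value] = known->second;
        continue;
      }
      available[inst.addr] = inst.value;
      kept.push_back(inst);
      dead.push_back(false);
    } else {
      MemInst store = inst;
      // Use the forwarded value if the stored value was a removed load.
      auto repl = result.replacements.find(store.value);
      if (repl != result.replacements.end())
        store.value = repl->second;
      // An earlier store to this address that memory never observed
      // is overwritten here, so it is dead.
      auto prev = unreadStore.find(store.addr);
      if (prev != unreadStore.end())
        dead[prev->second] = true;
      unreadStore[store.addr] = kept.size();
      available[store.addr] = store.value;
      kept.push_back(store);
      dead.push_back(false);
    }
  }

  for (std::size_t i = 0; i < kept.size(); ++i)
    if (!dead[i])
      result.insts.push_back(kept[i]);
  return result;
}
```

The last store to each address is always kept, since code after the block may still read it.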
Andrew Trick
04b2b5256b First implementation of <rdar://15922760> Deep devirtualization -
specialize on polymorphic arguments.

This can be enabled with: -sil-devirt-threshold 500.

It currently improves RC4 (when enabled) by 20%, but will be much more
important after Michael's load elimination with alias analysis lands.

This implementation is suitable for experimentation. Superficial code
reviews are also welcome. Although be warned that the design is overly
complex and I plan to rewrite it. I initially abandoned the idea of
incrementally specializing one function at a time, thinking that we
need to analyze full chains. However, I since realized after talking
to Nadav that the incremental approach can be made to work. A lot of
book-keeping will go away with that change.

TODO:

- Resolve protocol argument types. Currently we assume they can be
  reinitialized at applies, but I don't think they can unless they are
  @inouts.  This is an issue with the existing local devirtualizer
  that prevents it working across calls.

- Properly mangle the specialized methods. Find existing
  specializations by demangling rather than maintaining a map.

- Rewrite the logic for specializing chains for simplicity.

- Enable by default.

Swift SVN r13642
2014-02-07 19:10:27 +00:00
Andrew Trick
47b936fbae Let passes get their options (current configuration) from the
PassManager.

I think this is much cleaner and more flexible. The various pass
builders have no business marshalling these things around, and they
shouldn't be bound to the pass C'tor. In the future we will be able to
override and dynamically modify pass configuration this way.

Swift SVN r13626
2014-02-07 05:01:00 +00:00