Commit Graph

80 Commits

Author SHA1 Message Date
Andrew Trick
be1881aa1f Remove redundant Transform.getName() definitions.
At some point, pass definitions were heavily macro-ized. Pass
descriptive names were added in two places. This is not only redundant
but a source of confusion. You could waste a lot of time grepping for
the wrong string. I removed all the getName() overrides which, at
around 90 passes, was a fairly significant amount of code bloat.

Any pass that we want to be able to invoke by name from a tool
(sil-opt) or pipeline plan *should* have unique type name, enum value,
commend-line string, and name string. I removed a comment about the
various inliner passes that contradicted that.

Side note: We should be consistent with the policy that a pass is
identified by its type. We have a couple passes, LICM and CSE, which
currently violate that convention.
2017-04-09 15:20:28 -07:00
Erik Eckstein
ee034d0fcb DeadStoreElimination: fix wrong elimination of dead store before is_unique instruction
fixes SR-4524
2017-04-07 09:31:44 -07:00
practicalswift
6d1ae2a39c [gardening] 2016 → 2017 2017-01-06 16:41:22 +01:00
Michael Gottesman
38ec08f45f [gardening] Standardize SILBasicBlock successor/predecessor methods that deal with blocks rather than the full successor data structure to have the suffix 'Block'.
This was already done for getSuccessorBlocks() to distinguish getting successor
blocks from getting the full list of SILSuccessors via getSuccessors(). This
commit just makes all of the successor/predecessor code follow that naming
convention.

Some examples:

getSingleSuccessor() => getSingleSuccessorBlock().
isSuccessor() => isSuccessorBlock().
getPreds() => getPredecessorBlocks().

Really, IMO, we should consider renaming SILSuccessor to a more verbose name so
that it is clear that it is more of an internal detail of SILBasicBlock's
implementation rather than something that one should consider as apart of one's
mental model of the IR when one really wants to be thinking about predecessor
and successor blocks. But that is not what this commit is trying to change, it
is just trying to eliminate a bit of technical debt by making the naming
conventions here consistent.
2016-11-27 12:32:51 -08:00
Michael Gottesman
96837babda Merge pull request #5920 from gottesmm/vacation_gardening
Vacation gardening
2016-11-25 09:17:21 -06:00
Michael Gottesman
bf6920650c [gardening] Drop BB from all argument related code in SILBasicBlock.
Before this commit all code relating to handling arguments in SILBasicBlock had
somewhere in the name BB. This is redundant given that the class's name is
already SILBasicBlock. This commit drops those names.

Some examples:

getBBArg() => getArgument()
BBArgList => ArgumentList
bbarg_begin() => args_begin()
2016-11-25 01:14:36 -06:00
practicalswift
797b80765f [gardening] Use the correct base URL (https://swift.org) in references to the Swift website
Remove all references to the old non-TLS enabled base URL (http://swift.org)
2016-11-20 17:36:03 +01:00
Michael Gottesman
bffa7addaf [semantic-arc] Eliminate default {Load,Store}OwnershipQualification argument to SILBuilder::create{Load,Store}(...)
Today, loads and stores are treated as having @unowned(unsafe) ownership
semantics. This leaves the user to specify ownership changes on the loaded or
stored value independently of the load/store by inserting ARC operations. With
the change to Semantic SIL, this will no longer be true. Instead loads, stores
have ownership semantics that one must reason about such as copy, take, and
trivial.

This change moves us closer to that world by eliminating the default
OwnershipQualification argument from create{Load,Store}. This means that the
compiler developer cannot ignore reasoning about the ownership semantics of the
memory operation that they are creating.

Operationally, this is a NFC change since I have just gone through the compiler
and updated all places where we create loads, stores to pass in the former
default argument ({Load,Store}OwnershipQualifier::Unqualified), to
SILBuilder::create{Load,Store}(...). For now, one can just do that in situations
where one needs to create loads/stores, but over time, I am going to tighten the
semantics up via the verifier.

rdar://28685236
2016-10-30 13:07:06 -07:00
Xin Tong
65ba367beb Wire up new epilogue release matcher with DSE and RLE
rdar://26446587
2016-09-13 10:58:28 -07:00
Xin Tong
8b7020e808 A few minor refacting in DSE 2016-08-10 20:25:38 -07:00
practicalswift
8d03ea1347 [gardening] Fix some recently introduced typos. 2016-06-19 21:28:36 +02:00
Xin Tong
500acb98e0 Rename LSBase.h to LoadStoreOptUtils.h 2016-06-08 10:57:27 -07:00
Xin Tong
46436d5caa Fix a memory leak in DSE 2016-05-17 15:39:59 -07:00
practicalswift
c262b42ae0 [gardening] Fix recently introduced whitespace typos. (#2443) 2016-05-06 23:49:39 -07:00
Slava Pestov
0280ed7ec5 SIL Optimizer: Remove some dead code, NFC 2016-05-05 12:25:05 -07:00
Xin Tong
a6082ae8f1 Update comments and remove dead functions in DSE 2016-05-03 21:23:25 -07:00
Xin Tong
15df3016e0 Make smaller stores generated for stores that are partly dead deterministic
We do not have problem right now because the we only do partial dead store when
we end up generating only 1 single smaller store.
2016-05-03 17:53:11 -07:00
Xin Tong
c4eda1abbb Fix a bug in dead store elimination. Make sure we consider SILargument when we
invalidate base.

Also small change on how BlockState is allocated.
2016-05-03 16:55:39 -07:00
practicalswift
798877ae77 [gardening] "if (foo)[SPACE][SPACE]{" → "if (foo)[SPACE]{" 2016-04-03 22:57:05 +02:00
Xin Tong
524ed34583 Make sure epilogue releases do not kill redundant loads
I did not measure a performance improvements with this.
2016-03-23 23:59:54 -07:00
Xin Tong
64e2710102 Move LSBase.x to SILOptimizer/Utils/. NFC. 2016-03-07 22:07:13 -05:00
Xin Tong
ba0249c924 Rename SILValueProjection.x to LSBase.x. NFC 2016-03-07 21:26:56 -05:00
Xin Tong
55377e727a Move createExtract to ProjectionPath::createExtract. NFC. 2016-03-07 21:26:56 -05:00
Xin Tong
bfc258f628 Simplify LSValue::reduce for redundant load elimination
LSValue::reduce reduces a set of LSValues (mapped to a set of LSLocations) to
a single LSValue.

It can then be used as the forwarding value for the location.

Previously, we expand into intermediate nodes and leaf nodes and then go bottom
up, trying to create a single LSValue out of the given LSValues.

Instead, we now use a recursion to go top down. This simplifies the code. And this
is fine as we do not expect to run into type tree that are too deep.

Existing test cases ensure correctness.
2016-03-07 21:26:56 -05:00
Kevin Yu
8f193c856e [gardening] Fix typos "cant" -> "can't", "dont" -> "don't" 2016-03-06 00:25:14 +00:00
Xin Tong
ddb9bba50e Improve the compilation time of redundant load elimination
For forwarding on allocstacks, we can invalidate the forwable bit when we
hit the deallocate stack.

This helps compilation time as we do not need to propagate these bits down
to subsequent basic blocks.
2016-02-29 10:11:40 -08:00
Xin Tong
19c528e59d Make more passes respect no.optimize 2016-02-26 16:03:17 -08:00
Xin Tong
a48584ccbc Create a fast path for not-final release instruction.
For a release on a guaranteed function paramater, we know right away
that its not the final release and therefore does not call deinit.

Therefore we know it does not read or write memory other than the reference
count.

This reduces the compilation time of dead store and redundant load elim. As
we need to go over alias analysis to make sure tracked locations do not alias
with it.
2016-02-20 22:00:36 -08:00
Xin Tong
95f3280461 Remove a double negative. NFC 2016-02-19 20:45:23 -08:00
Xin Tong
84a6ff1d98 And lastly rename NewProjection to Projection. This is a NFC. rdar://24520269 2016-02-09 22:20:10 -08:00
Xin Tong
dd8244f1a7 Set the kill bit for the store at the end of the basic block where the stored location is de-allocated.
rdar://24354423
2016-01-26 20:05:46 -08:00
Erik Eckstein
9f83c43a02 SIL: remove unused functions from SILValue 2016-01-26 09:37:08 -08:00
Xin Tong
f5bd3eab49 Optimize compilation time for RLE and DSE with respective
to the new projection path. We do not need to trace from the accessed field
to the base object when we've done it before in enumerateLSLOcations

Stdlib -O

=== Before ===

Running Time        Self (ms)           Symbol Name
25137.0ms   37.3%   0.0         swift::runSILOptimizationPasses(swift::SILModule&)
24939.0ms   37.0%   0.0         swift::SILPassManager::runOneIteration()
20226.0ms   30.0%   29.0        swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform*>)
19241.0ms   28.5%   83.0        swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
3214.0ms    4.7%    10.0        (anonymous namespace)::SimplifyCFGPass::run()
3005.0ms    4.4%    14.0        (anonymous namespace)::ARCSequenceOpts::run()
2438.0ms    3.6%    7.0         (anonymous namespace)::SILCombine::run()
2217.0ms    3.2%    54.0        (anonymous namespace)::RedundantLoadElimination::run()
2212.0ms    3.2%    131.0       (anonymous namespace)::SILCSE::run()
1195.0ms    1.7%    11.0        (anonymous namespace)::GenericSpecializer::run()
1168.0ms    1.7%    39.0        (anonymous namespace)::DeadStoreElimination::run()
853.0ms    1.2%     150.0               (anonymous namespace)::DCE::run()
499.0ms    0.7%     7.0                 (anonymous namespace)::SILCodeMotion::run()

=== After ===

Running Time    Self (ms)               Symbol Name
22955.0ms   38.2%       0.0       swift::runSILOptimizationPasses(swift::SILModule&)
22777.0ms   37.9%       0.0       swift::SILPassManager::runOneIteration()
18447.0ms   30.7%       30.0      swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform*>)
17510.0ms   29.1%       67.0      swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
2944.0ms    4.9%        5.0       (anonymous namespace)::SimplifyCFGPass::run()
2884.0ms    4.8%        12.0      (anonymous namespace)::ARCSequenceOpts::run()
2277.0ms    3.7%        1.0       (anonymous namespace)::SILCombine::run()
1951.0ms    3.2%        117.0     (anonymous namespace)::SILCSE::run()
1803.0ms    3.0%        54.0      (anonymous namespace)::RedundantLoadElimination::run()
1096.0ms    1.8%        10.0      (anonymous namespace)::GenericSpecializer::run()
911.0ms    1.5% 53.0              (anonymous namespace)::DeadStoreElimination::run()
795.0ms    1.3% 135.0             (anonymous namespace)::DCE::run()
453.0ms    0.7% 9.0               (anonymous namespace)::SILCodeMotion::run()
2016-01-25 20:10:04 -08:00
Xin Tong
546471ac4d Port dead store elimination and redundant load elimination to use the new projection.
This patch also implements some of the missing functions used by RLE and DSE in new projection
that exist in the old projection.

New projection provides better memory usage, eventually we will phase out the old projection code.

New projection is now copyable, i.e. we have a proper constructor for it.  This helps make the code
more readable.

We do see a bit increase in compilation time in compiling stdlib -O, this is a result of the way
we now get types of a projection path, but I expect this to go down (away) with further improvement
on how memory locations are constructed and cached with later patches.

=== With the OLD Projection. ===

Total amount of memory allocated.
--------------------------------
Bytes Used	Count		   Symbol Name
13032.01 MB      50.6%	2158819    swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
2879.70 MB      11.1%	3076018    (anonymous namespace)::ARCSequenceOpts::run()
2663.68 MB      10.3%	1375465	   (anonymous namespace)::RedundantLoadElimination::run()
1534.35 MB       5.9%	5067928	   (anonymous namespace)::SimplifyCFGPass::run()
1278.09 MB       4.9%	576714	   (anonymous namespace)::SILCombine::run()
1052.68 MB       4.0%	935809	   (anonymous namespace)::DeadStoreElimination::run()
 771.75 MB       2.9%	1677391	   (anonymous namespace)::SILCSE::run()
 715.07 MB       2.7%	4198193	   (anonymous namespace)::GenericSpecializer::run()
 434.87 MB       1.6%	652701	   (anonymous namespace)::SILSROA::run()
 402.99 MB       1.5%	658563	   (anonymous namespace)::SILCodeMotion::run()
 341.13 MB       1.3%	962459	   (anonymous namespace)::DCE::run()
 279.48 MB       1.0%	415031	   (anonymous namespace)::StackPromotion::run()

Compilation time breakdown.
--------------------------
Running Time	Self (ms)	    Symbol Name
25716.0ms   35.8%	0.0	    swift::runSILOptimizationPasses(swift::SILModule&)
25513.0ms   35.5%	0.0	    swift::SILPassManager::runOneIteration()
20666.0ms   28.8%	24.0	    swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform*>)
19664.0ms   27.4%	77.0	    swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
3272.0ms    4.5%	12.0	    (anonymous namespace)::SimplifyCFGPass::run()
3266.0ms    4.5%	7.0	    (anonymous namespace)::ARCSequenceOpts::run()
2608.0ms    3.6%	5.0	    (anonymous namespace)::SILCombine::run()
2089.0ms    2.9%	104.0	    (anonymous namespace)::SILCSE::run()
1929.0ms    2.7%	47.0	    (anonymous namespace)::RedundantLoadElimination::run()
1280.0ms    1.7%	14.0	    (anonymous namespace)::GenericSpecializer::run()
1010.0ms    1.4%	45.0	    (anonymous namespace)::DeadStoreElimination::run()
966.0ms    1.3%	191.0	 	    (anonymous namespace)::DCE::run()
496.0ms    0.6%	6.0	 	    (anonymous namespace)::SILCodeMotion::run()

=== With the NEW Projection. ===

Total amount of memory allocated.
--------------------------------
Bytes Used	Count		    Symbol Name
11876.64 MB      48.4%	22112349    swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
2887.22 MB      11.8%	3079485	    (anonymous namespace)::ARCSequenceOpts::run()
1820.89 MB       7.4%	1877674	    (anonymous namespace)::RedundantLoadElimination::run()
1533.16 MB       6.2%	5073310	    (anonymous namespace)::SimplifyCFGPass::run()
1282.86 MB       5.2%	577024	    (anonymous namespace)::SILCombine::run()
 772.21 MB       3.1%	1679154	    (anonymous namespace)::SILCSE::run()
 721.69 MB       2.9%	936958	    (anonymous namespace)::DeadStoreElimination::run()
 715.08 MB       2.9%	4196263	    (anonymous namespace)::GenericSpecializer::run()

Compilation time breakdown.
--------------------------
Running Time	Self (ms)	    Symbol Name
25137.0ms   37.3%	0.0	    swift::runSILOptimizationPasses(swift::SILModule&)
24939.0ms   37.0%	0.0	    swift::SILPassManager::runOneIteration()
20226.0ms   30.0%	29.0	    swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform*>)
19241.0ms   28.5%	83.0	    swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform*>, swift::SILFunction*)
3214.0ms    4.7%	10.0	    (anonymous namespace)::SimplifyCFGPass::run()
3005.0ms    4.4%	14.0	    (anonymous namespace)::ARCSequenceOpts::run()
2438.0ms    3.6%	7.0	    (anonymous namespace)::SILCombine::run()
2217.0ms    3.2%	54.0	    (anonymous namespace)::RedundantLoadElimination::run()
2212.0ms    3.2%	131.0	    (anonymous namespace)::SILCSE::run()
1195.0ms    1.7%	11.0	    (anonymous namespace)::GenericSpecializer::run()
1168.0ms    1.7%	39.0	    (anonymous namespace)::DeadStoreElimination::run()
853.0ms    1.2%	150.0	 	    (anonymous namespace)::DCE::run()
499.0ms    0.7%	7.0	 	    (anonymous namespace)::SILCodeMotion::run()
2016-01-25 20:08:29 -08:00
Erik Eckstein
74d44b74e7 SIL: remove SILValue::getDef and add a cast operator to ValueBase * as a repelacement. NFC. 2016-01-25 15:00:49 -08:00
practicalswift
71e00fefa1 [gardening] Fix typos: "word word" (two spaces) → "word word" (one space) 2016-01-24 21:27:16 +01:00
Xin Tong
894aa621cd Make Fix_lifetime instruction inert from a store prospective 2016-01-20 16:52:53 -08:00
Michael Gottesman
e25cd8860c Remove all uses of ilist_node::getNextNode() and ilist_node::getPrevNode() in favor of just using iterators.
This change is needed for the next update to ToT LLVM. It can be put
into place now without breaking anything so I am committing it now.

The churn upstream on ilist_node is neccessary to remove undefined
behavior. Rather than updating the different ilist_node patches for the
hacky change required to not use iterators, just use iterators and keep
everything as ilist_nodes. Upstream they want to eventually do this, so
it makes sense for us to just do it now.

Please do not introduce new invocations of
ilist_node::get{Next,Prev}Node() into the tree.
2016-01-19 14:44:58 -06:00
practicalswift
7b2bc2a0f4 Fix recently introduced typos. 2016-01-04 21:22:33 +01:00
Xin Tong
ba40b3f1a7 Take a more displined approach in DSE as to how to a function is optimized.
Now we have 3 cases.

1. OptimizeNone (for functions with too many basicblocks and too many locations). Simply return.
2. Pessimisitc single iteration data flow (for functions with many basic blocks and many locations).
3. Optimistic multiple iteration data flow (for functions with some basic blocks and some locations
   and require iterative data flow).

With this change stdlib and stdlibunittest has some changes in dead store(DS)
eliminated.

stdlib: 202 -> 203 DS.
stdlibunittest: 42 - 39 DS.

Compilation time improvement: with this change on a RELEASE+ASSERT compiler for stdlibunittest.

Running Time        Self (ms)               Symbol Name
5525.0ms    5.3%    25.0                     (anonymous namespace)::ARCSequenceOpts::run()
3500.0ms    3.4%    25.0                     (anonymous namespace)::RedundantLoadElimination::run()
3050.0ms    2.9%    25.0                     (anonymous namespace)::SILCombine::run()
2700.0ms    2.6%    0.0                      (anonymous namespace)::SimplifyCFGPass::run()
2100.0ms    2.0%    75.0                     (anonymous namespace)::SILCSE::run()
1450.0ms    1.4%    0.0                      (anonymous namespace)::DeadStoreElimination::run()
750.0ms    0.7%     75.0                     (anonymous namespace)::DCE::run()

Compilation time improvement: with this change on a DEBUG compiler for stdlibunittest.

Running Time        Self (ms)               Symbol Name
42300.0ms    4.9%   50.0                      (anonymous namespace)::ARCSequenceOpts::run()
35875.0ms    4.1%   0.0                       (anonymous namespace)::RedundantLoadElimination::run()
30475.0ms    3.5%   0.0                       (anonymous namespace)::SILCombine::run()
19675.0ms    2.3%   0.0                       (anonymous namespace)::SILCSE::run()
18150.0ms    2.1%   25.0                      (anonymous namespace)::SimplifyCFGPass::run()
12475.0ms    1.4%   0.0                       (anonymous namespace)::DeadStoreElimination::run()
5775.0ms    0.6%    0.0                       (anonymous namespace)::DCE::run()

I do not see a compilation time change in stdlib.

Existing tests ensure correctness.
2016-01-04 10:04:54 -08:00
practicalswift
1339b5403b Consistent use of header comment format.
Correct format:
//===--- Name of file - Description ----------------------------*- Lang -*-===//
2016-01-04 13:26:31 +01:00
practicalswift
50baf2e53b Use consistent formatting in top of file headers. 2016-01-04 02:17:48 +01:00
practicalswift
6c32688275 Fix recently introduced typos. 2016-01-03 21:16:23 +01:00
Xin Tong
4ea79fec2b There are simply too many locations and too many basic blocks in some
functions for dead store elimination to handle.  In the worst case, The number of
memory behavior or alias queries we need to do is roughly linear to
the # BBs x(times) # of locations.

Put in some heuristic to trade off accuracy for compilation time.

NOTE: we are not disabling DSE for these offending functions. instead we are running
a one iteration pessimistic data flow as supposed to the multiple iteration optimistic
data flow we've done previously.

With this change. I see compilation time on StdlibUnitTest drops significantly.
50%+ drop in time spent in DSE in StdlibUnit with a release compiler.

I will update more Instruments data post-commit once i get close to my desktop.

I see a slight drop in # of dead stores (DS) elimination in stdlib and stdlibUnit test.

stdlib: 203 DS -> 202 DS. (RLE is affected slightly as well. 6313 -> 6295 RL).

stdlibunittest :  43 DS -> 42. (RLE is not affected).

We are passing all existing dead store tests.
2016-01-03 10:48:02 -08:00
Xin Tong
013d08d439 Add a bailout location # threshold in DSE.
In StdlibUnitTest, there is this function that has too many (2450) LSLocations
and the data flow in DSE takes too long to converge.

StdlibUnittest.TestSuite.(addForwardRangeReplaceableCollectionTests <A, B where A: Swift.RangeReplaceableCollectionType, B: Swift.RangeReplaceableCollectionType, A.SubSequence: Swift.CollectionType, B.Generator.Element: Swift.Equatable, A.SubSequence == A.SubSequence.SubSequence, A.Generator.Element == A.SubSequence.Generator.Element> (Swift.String, makeCollection : ([A.Generator.Element]) -> A, wrapValue : (StdlibUnittest.OpaqueValue<Swift.Int>) -> A.Generator.Element, extractValue : (A.Generator.Element) -> StdlibUnittest.OpaqueValue<Swift.Int>, makeCollectionOfEquatable : ([B.Generator.Element]) -> B, wrapValueIntoEquatable : (StdlibUnittest.MinimalEquatableValue) -> B.Generator.Element, extractValueFromEquatable : (B.Generator.Element) -> StdlibUnittest.MinimalEquatableValue, checksAdded : StdlibUnittest.Box<Swift.Set<Swift.String>>, resiliencyChecks : StdlibUnittest.CollectionMisuseResiliencyChecks, outOfBoundsIndexOffset : Swift.Int) -> ()).(closure #18)

This function alone takes ~20% of the total amount of time spent in DSE in StdlibUnitTest.

And DSE does not eliminate any dead store in the function either. I added this threshold
to abort on functions that have too many LSLocations.

I see no difference in # of dead store eliminated in the Stdlib.
2016-01-02 17:37:53 -08:00
Xin Tong
310f48eab0 Improve dead store elimination compilation time.
If we know a function is not a one iteration function which means its
its BBWriteSetIn and BBWriteSetOut have been computed and converged,
and a basic block does not even have StoreInsts, there is no point
in processing every instruction in the last iteration of the data flow
again as no store will be eliminated.

We can simply skip the basic block and rely on the converged BBWriteSetIn
to process its predecessors.

Compilation time improvement: 1.7% to 1.5% of overall compilation time.
on stdlib -O. This represents a 4.0% of all SILOptimzations(37.2%)

Existing tests ensure correctness.
2016-01-01 10:16:42 -08:00
Zach Panzarino
e3a4147ac9 Update copyright date 2015-12-31 23:28:40 +00:00
practicalswift
6dac32e416 Fix typos in code merged today 2015-12-29 23:19:48 +01:00
Xin Tong
be4bf816bd Update some comments and rename runIterativeDF -> runIterativeDSE,
mergeSuccessorStates -> mergeSuccessorLiveIns in DSE. Also fix some includes.
2015-12-29 10:13:02 -08:00
Xin Tong
7c43b47bd0 revert some WIP code 2015-12-28 14:41:16 -08:00