Commit Graph

871 Commits

Author SHA1 Message Date
Roman Levenstein
9d4fc913d9 Track dependencies of SIL instructions on opened archetypes which they use
Till now there was no way in SIL to explicitly express a dependency of an instruction on any opened archetypes used by it. This was a cause of many errors and correctness issues. In many cases the code was moved around without taking into account these dependencies, which resulted in breaking the invariant that any uses of an opened archetype should be dominated by the definition of this archetype.

This patch does the following:
- Map opened archetypes to the instructions defining them, i.e. to open_existential instructions.
- Introduce a helper class SILOpenedArchetypesTracker for creating and maintaining such mappings.
- Introduce a helper class SILOpenedArchetypesState for providing a read-only API for looking up available opened archetypes.
- Each SIL instruction which uses an opened archetype as a type gets an additional opened archetype operand representing a dependency of the instruction on this archetype. These opened archetypes operands are an in-memory representation. They are not serialized. Instead, they are re-constructed when reading binary or textual SIL files.
- SILVerifier was extended to conduct more thorough checks related to the usage of opened archetypes.
2016-06-24 10:36:52 -07:00
Mark Lacey
f47cd8afd0 Fix leak in BasicCalleeAnalysis.
Each time we delete the pass manager, we delete the analyses it
holds. In this analysis we were holding onto memory that wasn't getting
released when the analysis was deleted.

Fixes rdar://problem/26245872.
2016-06-23 07:49:06 -07:00
Xin Tong
e06e1be1ad Merge pull request #2950 from trentxintong/SCM
Remove last bit of retain release code motion in SILCodeMotion.
2016-06-08 13:41:20 -07:00
practicalswift
46cbb98049 [gardening] Fix incorrect file name in header. 2016-06-08 22:00:05 +02:00
swift-ci
41177fa12c Merge pull request #2952 from trentxintong/MISC 2016-06-08 12:08:32 -07:00
Xin Tong
500acb98e0 Rename LSBase.h to LoadStoreOptUtils.h 2016-06-08 10:57:27 -07:00
Xin Tong
be793d26eb Remove last bit of retain release code motion in SILCodeMotion. All these code are
replaced by retain release code motion. This code has been disabled for sometime now.

This should bring the retain release code motion into a close. The retain release
code motion pipeline looks like this. There could be some minor cleanups after this though.

1. We perform a global data flow for retain release code motion in RRCM (RetainReleaseCodeMotion)
2. We perform a local form of retain release code motion in SILCodeMotion. This is more
for cases which can not be handled in RRCM. e.g. sinking into a switch is more efficiently
done in a local form, the retain is not needed on the None block. Release on SILArgument needs
to be split to incoming values, this can not be done in RRCM and other cases.
3. We do not perform code motion in ASO, only elimination which are very important.

Some modifications to test cases, they look different, but functionally the same.
RRCM has this canonicalization effect, i.e. it uses the rc root, instead of
the SSA value the retain/release is currently using. As a result some test cases need
to be modified.

I also removed some test cases that do not make sense anymore and lot of duplicate test
cases between earlycodemotion.sil and latecodemotion.sil. These tests cases only have retains
and should be used to test early code motion.
2016-06-08 10:49:33 -07:00
practicalswift
8df3859ce7 [gardening] Fix recently introduced typos. 2016-06-05 11:11:44 +02:00
practicalswift
57bccc8b06 [gardening] Fix inconsistent formatting. 2016-06-04 00:37:15 +02:00
Xin Tong
f2954f9235 Merge pull request #2725 from trentxintong/SFSO
Share a few createIncrement/createDecrement functions
2016-05-26 13:58:02 -07:00
Xin Tong
c758848c92 Share a few createIncrement/createDecrement functions 2016-05-26 12:55:25 -07:00
Xin Tong
a752b443d9 Merge pull request #2712 from trentxintong/SFSO
Change FSO explosion heurisitics.
2016-05-26 12:22:55 -07:00
Xin Tong
d8e11f59e1 Change FSO explosion heurisitics. We explode when we can not find all the
epilogue releases for an argument.

I did not measure performance difference with this change.

rdar://25451364
2016-05-26 10:17:14 -07:00
Xin Tong
35471ab345 Merge pull request #2310 from trentxintong/SFSO
Simplify function signature optimzation and fix a memory leak in function signature.
2016-05-25 15:11:12 -07:00
Xin Tong
db9ee7c614 Fix a memory leak in FSO
Make sure the destructor of the SmallVector in ProjectionTreeNode gets
called when the BumpPtrAllocator is destroy'ed.
2016-05-25 15:08:18 -07:00
Xin Tong
fb3eb0b646 Simplify function signature optimzation.
Several functionalities have been added to FSO over time and the logic has become
muddled.

We were always looking at a static image of the SIL and try to reason about what kind of
function signature related optimizations we can do.

This can easily lead to muddled logic. e.g. we need to consider 2 different function
signature optimizations together instead of independently.

Split 1 single function to do all sorts of different analyses in FSO into several
small transformations, each of which does a specific job. After every analysis, we produce
a new function and eventually we collapse all intermediate thunks to in a single thunk.

With this change, it will be easier to implement function signature optimization as now
we can do them independently now.

Small modifications to the test cases.
2016-05-25 11:12:27 -07:00
Xin Tong
b5b905e3cc Bring back rc identity cache.
This takes off 0.7% of the 27.7% of the time we spent in SILoptimizer when building
Stdlib.
2016-05-25 09:25:27 -07:00
swiftix
4eaed25a5e Merge pull request #2491 from swiftix/devirtualizer-fixes
Small fixes for the sil-devirtualizer
2016-05-11 21:35:46 -07:00
Mark Lacey
921dededad Use a bump pointer allocator in the callee set creation.
Shaves about 19% of the time from the construction of these sets. The
SmallVector size was chosen to minimize the number of dynamic
allocations we end up doing while building the stdlib. This should be a
reasonable size for most projects, too. It's a bit wasteful in space,
but the total amount of allocated space here is pretty small to begin
with.
2016-05-11 17:07:27 -07:00
Roman Levenstein
73b6a38edc [sil-devirtualizer] Do not perform a speculative devirtualization for no-opt callees. 2016-05-11 16:28:50 -07:00
Arnold Schwaighofer
4df87a6554 Refactor unsafeGuaranteed code into utility functions.
NFC.
2016-05-08 08:10:43 -07:00
Erik Eckstein
d6e86b7c4b Add a new SIL pass to move conditions closer to switch_enum to enable jump threading.
For details see the comment in ConditionForwarding.cpp.
This optimization pass helps to optimize loops iterating over closed ranges, e.g. for i in 0...n { }
2016-05-05 10:34:08 -07:00
Xin Tong
57e2bdb123 Revert "Simplify function signature optimization" 2016-04-25 16:33:17 -07:00
Xin Tong
633ca2e92b Simplify function signature optimzation.
Several functionalities have been added to FSO over time and the logic has become
muddled.

We were always looking at a static image of the SIL and try to reason about what kind of
function signature related optimizations we can do.

This can easily lead to muddled logic. e.g. we need to consider 2 different function
signature optimizations together instead of independently.

Split 1 single function to do all sorts of different analyses in FSO into several
small transformations, each of which does a specific job. After every analysis, we produce
a new function and eventually we collapse all intermediate thunks to in a single thunk.

With this change, it will be easier to implement function signature optimization as now
we can do them independently now.

Minimal modifications to the test cases.
2016-04-25 15:28:51 -07:00
practicalswift
9a078b54ef [gardening] Fix recently introduced typo: "a executable" → "an executable"
[gardening] Fix recently introduced typo: "a offset" → "an offset"
[gardening] Fix recently introduced typo: "accessiblity" → "accessibility"
[gardening] Fix recently introduced typo: "cant" → "can't"
[gardening] Fix recently introduced typo: "inteference" → "interference"
[gardening] Fix recently introduced typo: "unsatified" → "unsatisfied"
[gardening] Remove accidental space.
2016-04-24 22:11:59 +02:00
Xin Tong
b27697eff1 Merge pull request #2246 from trentxintong/CodeMotion
Rename mayUseValue to mayHaveSymmetricInteference
2016-04-19 19:40:33 -07:00
Xin Tong
49f1c66d7b Rename mayUseValue to mayHaveSymmetricInteference 2016-04-19 15:23:45 -07:00
Xin Tong
3824a9479e Merge pull request #2064 from trentxintong/CodeMotion
implement retain release code motion.
2016-04-18 18:16:50 -07:00
swift-ci
48e0aac9a6 Merge pull request #2234 from trentxintong/IP 2016-04-18 16:41:10 -07:00
Xin Tong
51b1c0bc68 Implement retain, release code motion.
Iterative data flow retain sinking and release hoisting.

This allows us to sink retains and hoist releases across harmless loops. which is
an improvement on the SILCodeMotion retain sinking and release hoisting.

It also separates the duty of moving retain and release with the duty of eliminating them
in ASO.

This should eventually replace RR code motion in SILcodemotion and insertion point
in ARCsequence opts (ASO).

This is the performance difference i get with retain sinking and release hoisting.
After disabling retain release code motion in ASO and SILCodeMotion. we can start to take
those code out once this lands.

I see that we go from 24.5% of time spent in SILOptimizations w.r.t. the whole stdlib compilation
to 25.1%.

Improvement is better (i.e. retain sinking and hoisting releases result in performance gain).

<details open>
  <summary>Regression (7)</summary>

TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
SetIsSubsetOf                                           | 441     | 510     | +15.7%    | **0.86x**
SetIntersect                                            | 1041    | 1197    | +15.0%    | **0.87x**
BenchLangCallingCFunction                               | 184     | 211     | +14.7%    | **0.87x**
Sim2DArray                                              | 326     | 372     | +14.1%    | **0.88x**
SetIsSubsetOf_OfObjects                                 | 498     | 567     | +13.9%    | **0.88x**
GeekbenchGEMM                                           | 945     | 1022    | +8.2%     | **0.92x**
COWTree                                                 | 3839    | 4181    | +8.9%     | **0.92x(?)**

</details>

<details >
  <summary>Improvement (31)</summary>

TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
ObjectiveCBridgeFromNSDictionaryAnyObjectToString       | 174526  | 165392  | -5.2%     | **1.06x**
RGBHistogram                                            | 3128    | 2957    | -5.5%     | **1.06x**
ObjectiveCBridgeToNSDictionary                          | 16510   | 15494   | -6.2%     | **1.07x**
LuhnAlgoLazy                                            | 2294    | 2120    | -7.6%     | **1.08x**
DictionarySwapOfObjects                                 | 6477    | 5994    | -7.5%     | **1.08x**
StringRemoveDupes                                       | 1610    | 1485    | -7.8%     | **1.08x**
ObjectiveCBridgeFromNSSetAnyObjectToString              | 159358  | 147824  | -7.2%     | **1.08x**
ObjectiveCBridgeToNSSet                                 | 16191   | 14924   | -7.8%     | **1.08x**
DictionaryHashableClass                                 | 1839    | 1704    | -7.3%     | **1.08x**
DictionaryLiteral                                       | 2906    | 2678    | -7.8%     | **1.09x(?)**
StringUtilsUnderscoreCase                               | 10031   | 9187    | -8.4%     | **1.09x**
LuhnAlgoEager                                           | 2320    | 2113    | -8.9%     | **1.10x**
ObjectiveCBridgeFromNSSetAnyObjectToStringForced        | 99553   | 90348   | -9.2%     | **1.10x**
RIPEMD                                                  | 3327    | 3009    | -9.6%     | **1.11x**
Combos                                                  | 595     | 538     | -9.6%     | **1.11x**
Roman                                                   | 10      | 9       | -10.0%    | **1.11x**
StringUtilsCamelCase                                    | 10783   | 9646    | -10.5%    | **1.12x**
SetIntersect_OfObjects                                  | 2511    | 2182    | -13.1%    | **1.15x**
SwiftStructuresTrie                                     | 28331   | 24339   | -14.1%    | **1.16x**
Dictionary2OfObjects                                    | 3748    | 3115    | -16.9%    | **1.20x**
DictionaryOfObjects                                     | 2473    | 2050    | -17.1%    | **1.21x**
Dictionary                                              | 894     | 737     | -17.6%    | **1.21x**
Dictionary2                                             | 2268    | 1859    | -18.0%    | **1.22x**
StringIteration                                         | 8027    | 6344    | -21.0%    | **1.27x**
Phonebook                                               | 8207    | 6436    | -21.6%    | **1.28x**
BenchLangArray                                          | 119     | 91      | -23.5%    | **1.31x**
LinkedList                                              | 8267    | 6297    | -23.8%    | **1.31x**
StrToInt                                                | 5585    | 4180    | -25.2%    | **1.34x**
Dictionary3OfObjects                                    | 1122    | 831     | -25.9%    | **1.35x**
Dictionary3                                             | 731     | 515     | -29.6%    | **1.42x**
SuperChars                                              | 513353  | 258735  | -49.6%    | **1.98x**
2016-04-18 15:39:17 -07:00
Xin Tong
bfc9683b49 Use a SmallPtrSet instead of a DenseSet. More memory efficient 2016-04-18 14:54:39 -07:00
practicalswift
0c89048988 [gardening] Fix recently introduced typo: "transistive" → "transitive" 2016-04-14 22:26:44 +02:00
Xin Tong
31b6c65039 Fix a logic error in eraseUseOfValue.
I failed to create a test case. And we hav existing tests in FSO that will
exercise this.

rdar://25559780
2016-04-13 20:53:28 -07:00
Erik Eckstein
3e52d24853 add a debug dump function for ValueLifetimeAnalysis 2016-04-13 13:22:30 -07:00
Xin Tong
1a4f567685 More conservative about when we can move a release across an instruction
We now consider effect of deinit in addition to the released value.

rdar://25362826

This is the only 10%+ regression i measured on my machine. no performance improvement.

Sim2DArray                                              | 326     | 366     | +12.3%    | **0.89x**
2016-04-12 20:39:30 -07:00
Erik Eckstein
6cdfc2e469 EscapeAnalysis: Make the CGNode class public. It's used by clients. NFC. 2016-04-08 10:20:47 -07:00
Slava Pestov
5aa99fa346 SILOptimizer: Create non-[fragile] specializations of [fragile] functions where possible
Change the optimizer to only make specializations [fragile] if both the
original callee is [fragile] *and* the caller is [fragile].

Otherwise, the specialized callee might be [fragile] even if it is never
called from a [fragile] function, which inhibits the optimizer from
devirtualizing calls inside the specialization.

This opens up some missed optimization opportunities in the performance
inliner and devirtualization, which currently reject fragile->non-fragile
references:

TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
DictionaryRemoveOfObjects                               | 38391   | 35859   | -6.6%     | **1.07x**
Hanoi                                                   | 5853    | 5288    | -9.7%     | **1.11x**
Phonebook                                               | 18287   | 14988   | -18.0%    | **1.22x**
SetExclusiveOr_OfObjects                                | 20001   | 15906   | -20.5%    | **1.26x**
SetUnion_OfObjects                                      | 16490   | 12370   | -25.0%    | **1.33x**

Right now, passes other than performance inlining and devirtualization
of class methods are not checking invariants on [fragile] functions
at all, which was incorrect; as part of the work on building the
standard library with -enable-resilience, I added these checks, which
regressed performance with resilience disabled. This patch makes up for
these regressions.

Furthermore, once SIL type lowering is aware of resilience, this will
allow the stack promotion pass to make further optimizations after
specializing [fragile] callees.
2016-04-08 02:10:31 -07:00
practicalswift
abfecfde17 [gardening] if ([space]…[space]) → if (…), for(…) → for (…), while(…) → while (…), [[space]x, y[space]] → [x, y] 2016-04-04 16:22:11 +02:00
Mark Lacey
99d4485713 Fix double delete in generic specialization.
We ended up adding the same instruction twice to a SmallVector of
instructions to be deleted. To avoid this, we'll track these
to-be-deleted instructions in a SmallSetVector instead.

We were also failing to add an instruction that we can delete to the set
of instructions to be deleted, so I fixed that as well.

I've added a test case, but it's currently disabled because fixing this
turned up another issue in the same code which I still need to take a
look at.

Fixes rdar://problem/25369617.
2016-03-30 13:10:00 -07:00
Xin Tong
f95d9b3c92 Change FSO heuristic.
FSO functions that have high potential but does not have caller inside
current module.

The thunk can then be inlined into the module calling the function and
the function would get the benefit of FSO.

The heuristic for selecting such function is
1. Have no indirect caller. This would introduce a thunk.
2. Have potential to give better performance. i.e. function argument can
be O2G.

Regression
TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
BenchLangCallingCFunction                               | 184     | 211     | +14.7%    | **0.87x**
Calculator                                              | 55      | 59      | +7.3%     | **0.93x**
DeadArray                                               | 687     | 741     | +7.9%     | **0.93x**
MonteCarloPi                                            | 39275   | 41669   | +6.1%     | **0.94x**

Improvement
TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
LuhnAlgoLazy                                            | 2478    | 2327    | -6.1%     | **1.06x**
OpenClose                                               | 54      | 51      | -5.6%     | **1.06x**
SortLettersInPlace                                      | 1016    | 946     | -6.9%     | **1.07x**
ObjectiveCBridgeFromNSDictionaryAnyObjectToStringForced | 149993  | 139755  | -6.8%     | **1.07x**
Phonebook                                               | 9666    | 8992    | -7.0%     | **1.07x**
ObjectiveCBridgeFromNSDictionaryAnyObjectToString       | 222713  | 206538  | -7.3%     | **1.08x**
LuhnAlgoEager                                           | 2393    | 2226    | -7.0%     | **1.08x**
Dictionary                                              | 1307    | 1196    | -8.5%     | **1.09x**
JSONHelperDeserialize                                   | 3808    | 3492    | -8.3%     | **1.09x**
StdlibSort                                              | 7310    | 4084    | -44.1%    | **1.79x**

I see 0.15% increase in code size for Benchmark_O.

Thanks @gottesmm for suggesting this opportunity.

rdar://25345056
2016-03-29 23:04:36 -07:00
Arnold Schwaighofer
255779082e Add a peephole optimization for the builtin "unsafeGuaranteed"
We can remove the retain/release pair preceeding the builtins based on the
knowledge that the lifetime of the reference is guaranteed by someone hanging on
to the reference elsewhere.
2016-03-27 06:47:16 -07:00
Slava Pestov
a9ad760b78 SIL: Clean up duplicated "can be referenced from a fragile function" checks 2016-03-25 22:46:50 -07:00
Chris Lattner
e4c7bca43a Merge pull request #1861 from practicalswift/weekly-cleanup-01
[gardening] Weekly gardening: typos, duplicate includes, header formatting, etc.
2016-03-24 22:39:36 -07:00
Xin Tong
f557a3253d Merge pull request #1857 from trentxintong/FSO
Rename FunctionSignatureOptCloner to FunctionSignatureOpts
2016-03-24 15:57:34 -07:00
practicalswift
d00a5ef814 [gardening] Weekly gardening: typos, duplicate includes, header formatting, etc. 2016-03-24 22:41:10 +01:00
Xin Tong
5907b8a3e2 Rename FunctionSignatureOptCloner to FunctionSignatureOpts
Eventually, we decided to do this

1. Have the function signature opts (used to be called the cloner to create
the optimized function.
2. Mark the thunk as always_inline
3. Rely on the inliner to inline the thunk to get the benefit of calling optimized
function directly.
2016-03-24 12:50:12 -07:00
Xin Tong
e0ba695d17 Merge pull request #1852 from trentxintong/FSO
Remove function signature rewriter and make function signature analysis a Util
2016-03-24 12:42:05 -07:00
Xin Tong
9a3761000c Move function signature analysis to a Util
We really only need this signature analysis in the cloner pass now.
2016-03-24 11:17:47 -07:00
Xin Tong
3f075dfe47 Remove function signature rewriter.
We decided to use the inliner to rewrite the caller's callsites.

And eventually I will turn FunctionSignatureAnalysis into a Utility.
As its data should only be used and kept in the cloner pass.
2016-03-24 10:50:47 -07:00
Xin Tong
c44006aa9d Merge pull request #1824 from trentxintong/RLE
Make sure non-epilogue releases do not kill redundant loads
2016-03-24 10:48:21 -07:00