Commit Graph

1020 Commits

Author SHA1 Message Date
Xin Tong
7d26a47200 Fix a non-deterministic retain/release insertion in RRCodeMotion 2016-04-19 22:14:38 -07:00
Xin Tong
c2e8c81227 Disable retain release code motion. There is some nondeterminism in
how we insert the new retain release instructions
2016-04-19 21:42:59 -07:00
Xin Tong
49f1c66d7b Rename mayUseValue to mayHaveSymmetricInteference 2016-04-19 15:23:45 -07:00
practicalswift
092007bf12 [gardening] Fix recently introduced typos. 2016-04-19 21:48:05 +02:00
Xin Tong
51b1c0bc68 Implement retain, release code motion.
Iterative data flow retain sinking and release hoisting.

This allows us to sink retains and hoist releases across harmless loops. which is
an improvement on the SILCodeMotion retain sinking and release hoisting.

It also separates the duty of moving retain and release with the duty of eliminating them
in ASO.

This should eventually replace RR code motion in SILcodemotion and insertion point
in ARCsequence opts (ASO).

This is the performance difference i get with retain sinking and release hoisting.
After disabling retain release code motion in ASO and SILCodeMotion. we can start to take
those code out once this lands.

I see that we go from 24.5% of time spent in SILOptimizations w.r.t. the whole stdlib compilation
to 25.1%.

Improvement is better (i.e. retain sinking and hoisting releases result in performance gain).

<details open>
  <summary>Regression (7)</summary>

TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
SetIsSubsetOf                                           | 441     | 510     | +15.7%    | **0.86x**
SetIntersect                                            | 1041    | 1197    | +15.0%    | **0.87x**
BenchLangCallingCFunction                               | 184     | 211     | +14.7%    | **0.87x**
Sim2DArray                                              | 326     | 372     | +14.1%    | **0.88x**
SetIsSubsetOf_OfObjects                                 | 498     | 567     | +13.9%    | **0.88x**
GeekbenchGEMM                                           | 945     | 1022    | +8.2%     | **0.92x**
COWTree                                                 | 3839    | 4181    | +8.9%     | **0.92x(?)**

</details>

<details >
  <summary>Improvement (31)</summary>

TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
ObjectiveCBridgeFromNSDictionaryAnyObjectToString       | 174526  | 165392  | -5.2%     | **1.06x**
RGBHistogram                                            | 3128    | 2957    | -5.5%     | **1.06x**
ObjectiveCBridgeToNSDictionary                          | 16510   | 15494   | -6.2%     | **1.07x**
LuhnAlgoLazy                                            | 2294    | 2120    | -7.6%     | **1.08x**
DictionarySwapOfObjects                                 | 6477    | 5994    | -7.5%     | **1.08x**
StringRemoveDupes                                       | 1610    | 1485    | -7.8%     | **1.08x**
ObjectiveCBridgeFromNSSetAnyObjectToString              | 159358  | 147824  | -7.2%     | **1.08x**
ObjectiveCBridgeToNSSet                                 | 16191   | 14924   | -7.8%     | **1.08x**
DictionaryHashableClass                                 | 1839    | 1704    | -7.3%     | **1.08x**
DictionaryLiteral                                       | 2906    | 2678    | -7.8%     | **1.09x(?)**
StringUtilsUnderscoreCase                               | 10031   | 9187    | -8.4%     | **1.09x**
LuhnAlgoEager                                           | 2320    | 2113    | -8.9%     | **1.10x**
ObjectiveCBridgeFromNSSetAnyObjectToStringForced        | 99553   | 90348   | -9.2%     | **1.10x**
RIPEMD                                                  | 3327    | 3009    | -9.6%     | **1.11x**
Combos                                                  | 595     | 538     | -9.6%     | **1.11x**
Roman                                                   | 10      | 9       | -10.0%    | **1.11x**
StringUtilsCamelCase                                    | 10783   | 9646    | -10.5%    | **1.12x**
SetIntersect_OfObjects                                  | 2511    | 2182    | -13.1%    | **1.15x**
SwiftStructuresTrie                                     | 28331   | 24339   | -14.1%    | **1.16x**
Dictionary2OfObjects                                    | 3748    | 3115    | -16.9%    | **1.20x**
DictionaryOfObjects                                     | 2473    | 2050    | -17.1%    | **1.21x**
Dictionary                                              | 894     | 737     | -17.6%    | **1.21x**
Dictionary2                                             | 2268    | 1859    | -18.0%    | **1.22x**
StringIteration                                         | 8027    | 6344    | -21.0%    | **1.27x**
Phonebook                                               | 8207    | 6436    | -21.6%    | **1.28x**
BenchLangArray                                          | 119     | 91      | -23.5%    | **1.31x**
LinkedList                                              | 8267    | 6297    | -23.8%    | **1.31x**
StrToInt                                                | 5585    | 4180    | -25.2%    | **1.34x**
Dictionary3OfObjects                                    | 1122    | 831     | -25.9%    | **1.35x**
Dictionary3                                             | 731     | 515     | -29.6%    | **1.42x**
SuperChars                                              | 513353  | 258735  | -49.6%    | **1.98x**
2016-04-18 15:39:17 -07:00
Xin Tong
d84de12943 Revert "Change FSO explosion heuristic"
This reverts commit fa09c6b71d.

Broke Linux build. And also PR "please benchmark" does not seem to catch it.
2016-04-14 11:05:00 -07:00
Xin Tong
fa09c6b71d Change FSO explosion heuristic
If we can not find the epilogue releases for all the fields with
reference sematics, but we found for some fields. Explode the argument.

I do not see a performance improvement with this change

rdar://25451364
2016-04-13 19:40:53 -07:00
Erik Eckstein
b8feb278dc DeadObjectElemination: fix two problems with handling of the tuple_extract following an array init call
1) handle cases where the tuple_extract is not in the same basic block as the init call. This is not a correctness issue, but might miss some opportunities.
2) bail if there are multiple tuple_extract. This is a correctness issue, but a theoretical one. I don't think that in reality we will ever get multiple tuple_extracts out of SILGen.
2016-04-13 13:28:29 -07:00
Arnold Schwaighofer
d97e7d93c7 UnsafeGuaranteedPeephole: Also use RCIdentity for matching retains 2016-04-13 09:11:27 -07:00
Arnold Schwaighofer
7d86d6f664 UnsafeGuaranteedPeephole: Use RCIdentityFunctionInfo when matching release 2016-04-13 08:46:31 -07:00
Ted Kremenek
35a99b2b5a Merge pull request #2129 from jdhealy/gardening-deindent-comment-RedundantOverflowCheckRemovalPass
[gardening] De-indent comment in `RedundantOverflowCheckRemovalPass`.
2016-04-10 15:07:27 -07:00
J.D. Healy
c809ca1469 [gardening] De-indent comment in RedundantOverflowCheckRemovalPass.
[ci skip]
2016-04-10 17:47:08 -04:00
Arnold Schwaighofer
75100a9598 UnsafeGuaranteedPeephole: Also skip sideeffect free instructions 2016-04-10 14:28:02 -07:00
practicalswift
872070900d [gardening] Consistent formatting of STATISTIC(…, "…"); 2016-04-09 23:51:23 +02:00
practicalswift
0e91354da3 [gardening] Fix recently introduced typo: "domiator" → "dominator" 2016-04-09 12:19:21 +02:00
Mark Lacey
de884107c8 We need to be very careful in generic specialization of recurisive functions.
We were waiting to delete old apply / try_apply instructions until after
fully specializing all the apply / try_apply in the function. This is
problematic when we have a recursive call and specializing the function
that we're currently processing, since we end up cloning the function
with the old apply / try_apply present.

Rather than doing this, clean up the old apply / try_apply immediately
after processing each one.

Resolves SR-1114 / rdar://problem/25455308.
2016-04-08 23:30:01 -07:00
Arnold Schwaighofer
be7ddc69d9 UnsafeGuaranteedPeephole: Skip debug instructions and ignore retain/release uses 2016-04-08 17:54:39 -07:00
Erik Eckstein
1eab8aa955 Re-instate "StackPromotion: Ignore unreachable blocks in post-dominator tree."
With a bug fix which should ensure that it doesn't violate the stack nesting.

Original commit: 3d050f7b43
2016-04-08 10:20:47 -07:00
Slava Pestov
8f038cead4 SIL: Stricter asserts for non-fragile references from fragile functions
Building off of the previous patches, add stricter assertions to
inlining passes and SIL serialization.
2016-04-08 02:16:30 -07:00
Slava Pestov
5aa99fa346 SILOptimizer: Create non-[fragile] specializations of [fragile] functions where possible
Change the optimizer to only make specializations [fragile] if both the
original callee is [fragile] *and* the caller is [fragile].

Otherwise, the specialized callee might be [fragile] even if it is never
called from a [fragile] function, which inhibits the optimizer from
devirtualizing calls inside the specialization.

This opens up some missed optimization opportunities in the performance
inliner and devirtualization, which currently reject fragile->non-fragile
references:

TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
DictionaryRemoveOfObjects                               | 38391   | 35859   | -6.6%     | **1.07x**
Hanoi                                                   | 5853    | 5288    | -9.7%     | **1.11x**
Phonebook                                               | 18287   | 14988   | -18.0%    | **1.22x**
SetExclusiveOr_OfObjects                                | 20001   | 15906   | -20.5%    | **1.26x**
SetUnion_OfObjects                                      | 16490   | 12370   | -25.0%    | **1.33x**

Right now, passes other than performance inlining and devirtualization
of class methods are not checking invariants on [fragile] functions
at all, which was incorrect; as part of the work on building the
standard library with -enable-resilience, I added these checks, which
regressed performance with resilience disabled. This patch makes up for
these regressions.

Furthermore, once SIL type lowering is aware of resilience, this will
allow the stack promotion pass to make further optimizations after
specializing [fragile] callees.
2016-04-08 02:10:31 -07:00
swiftix
baf8e7d7cb Merge pull request #2067 from swiftix/SR-249
Add [nonatomic] attribute to all SIL reference counting instructions.

Support this attribute at SIL level,  IRGen and LLVM-based ARC passes.
2016-04-06 23:56:43 -07:00
eeckstein
2de4dbb765 Merge pull request #2084 from eeckstein/remove_from_parent
Remove removeFromParent
2016-04-06 16:43:11 -07:00
Jordan Rose
52b961de61 Revert "Fix post-dominator tree in stack promotion"
This broke the test suite under optimizations with a SIL verifier error: "stack dealloc does
not match most recent stack alloc".

This reverts commit 7a2ca23bc2, reversing
changes made to 4c55e8d7a7.
2016-04-06 16:02:20 -07:00
Erik Eckstein
0f1a89d5dc SpeculativeDevirtualizer: erase a dead block and not just remove it from the function.
And add a few asserts to make the code clearer.
2016-04-06 14:55:47 -07:00
Erik Eckstein
3d050f7b43 StackPromotion: Ignore unreachable blocks in post-dominator tree.
Unreachable blocks prevented stack promotion in some cases.
Now we use our own post-dominator tree which ignores unreachable blocks instead of the standard post-dominator tree provided by the PostDominanceAnalysis.
Unreachable blocks (better: unreachable sub-graphs) are of no interrest because we don't have to insert the dealloc instructions in unreachable blocks anyway.
2016-04-06 11:11:13 -07:00
Roman Levenstein
2e77b3990b Add [nonatomic] attribute to all SIL reference counting instructions. 2016-04-06 01:52:43 -07:00
Andrew Trick
6ffafe4dd3 NFC: Remove a dead call in the DeadObjectElimination pass.
Cleanup after a pasto. Thanks Roman.
2016-04-05 16:28:27 -07:00
practicalswift
798877ae77 [gardening] "if (foo)[SPACE][SPACE]{" → "if (foo)[SPACE]{" 2016-04-03 22:57:05 +02:00
Chris Lattner
4c57791516 Merge pull request #2025 from wxxsw/pull
[gardening] Put white spaces in between if/while clauses and braces where it is missing.
2016-04-02 10:46:18 -07:00
Ge Sen
5ad36b2962 [gardening] Put white spaces in between if/while clauses and braces where it is missing.
For instance:

'if (foo){' => 'if (foo) {'
2016-04-02 14:43:45 +08:00
Michael Gottesman
7361e35bb9 Revert "Putting white spaces in between if/while clauses and braces." 2016-04-01 22:00:25 -07:00
Ge Sen
7dd61bdfa9 [gardening] Put white spaces in between if/while clauses and braces where it is missing.
For instance:

'if (foo){' => 'if (foo) {'
2016-04-02 08:22:23 +08:00
practicalswift
fa1d5d231a [gardening] Fix recently introduced typo: "althouth" → "although" 2016-04-01 23:14:16 +02:00
eeckstein
045ee83705 Merge pull request #1992 from eeckstein/onfastpath
introduce the onFastPath builtin
2016-03-31 17:41:26 -07:00
Arnold Schwaighofer
308ce091b7 Actually return the result from the _withUnsafeGuaranteed closure call 2016-03-31 16:43:46 -07:00
Erik Eckstein
a47a62d644 A new onFastPath built-in.
It is a hint to the optimizer that the code, where this builtin is called, is on the fast path.
Specifically, the inliner takes it into account and increases the assumed benefit for code where the builtin is located.

Compared to the fastPath/slowPath builtins, this builtin can be placed into plain linear code and doesn't need to be used in conditions.
Compared to the @inline(__always) attribute, this builtin has also an effect on the caller function. Let's assume
	foo() calls bar() contains onFastPath
and both foo and bar are small functions. Then if bar gets inlined into foo, the builtin also increases the chances that foo gets inlined.
This would not be the case if @inline(__always) is used just for bar.
2016-03-31 12:53:44 -07:00
Erik Eckstein
fd3f343dab SIL: add a utility function to check if a terminator exits a function. NFC 2016-03-31 09:29:15 -07:00
practicalswift
109cf92d17 [gardening] Fix recently introduced typo: "extry" → "entry" 2016-03-31 13:42:47 +02:00
Mark Lacey
84473f242a Do not specialize dead apply/partial_apply.
Do not specialize an apply/partial_apply that we've already added to the
set of dead instructions. Doing so can result in creating a new
instruction which we will leave around, and which will have a type
mismatch in its parameter list.

Fixes rdar://problem/25447450.
2016-03-30 21:16:00 -07:00
Mark Lacey
99d4485713 Fix double delete in generic specialization.
We ended up adding the same instruction twice to a SmallVector of
instructions to be deleted. To avoid this, we'll track these
to-be-deleted instructions in a SmallSetVector instead.

We were also failing to add an instruction that we can delete to the set
of instructions to be deleted, so I fixed that as well.

I've added a test case, but it's currently disabled because fixing this
turned up another issue in the same code which I still need to take a
look at.

Fixes rdar://problem/25369617.
2016-03-30 13:10:00 -07:00
Chris Lattner
5b7e030810 Merge pull request #1952 from practicalswift/gardening-20160330
[gardening] Remove unused code, fix typos and improve file header formatting
2016-03-30 10:24:05 -07:00
Xin Tong
4183d972da Merge pull request #1946 from trentxintong/FS
Change FSO heuristic.
2016-03-30 10:11:27 -07:00
practicalswift
74987b2164 [gardening] Remove unused code. 2016-03-30 18:20:49 +02:00
Xin Tong
f95d9b3c92 Change FSO heuristic.
FSO functions that have high potential but does not have caller inside
current module.

The thunk can then be inlined into the module calling the function and
the function would get the benefit of FSO.

The heuristic for selecting such function is
1. Have no indirect caller. This would introduce a thunk.
2. Have potential to give better performance. i.e. function argument can
be O2G.

Regression
TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
BenchLangCallingCFunction                               | 184     | 211     | +14.7%    | **0.87x**
Calculator                                              | 55      | 59      | +7.3%     | **0.93x**
DeadArray                                               | 687     | 741     | +7.9%     | **0.93x**
MonteCarloPi                                            | 39275   | 41669   | +6.1%     | **0.94x**

Improvement
TEST                                                    | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
---                                                     | ---     | ---     | ---       | ---
LuhnAlgoLazy                                            | 2478    | 2327    | -6.1%     | **1.06x**
OpenClose                                               | 54      | 51      | -5.6%     | **1.06x**
SortLettersInPlace                                      | 1016    | 946     | -6.9%     | **1.07x**
ObjectiveCBridgeFromNSDictionaryAnyObjectToStringForced | 149993  | 139755  | -6.8%     | **1.07x**
Phonebook                                               | 9666    | 8992    | -7.0%     | **1.07x**
ObjectiveCBridgeFromNSDictionaryAnyObjectToString       | 222713  | 206538  | -7.3%     | **1.08x**
LuhnAlgoEager                                           | 2393    | 2226    | -7.0%     | **1.08x**
Dictionary                                              | 1307    | 1196    | -8.5%     | **1.09x**
JSONHelperDeserialize                                   | 3808    | 3492    | -8.3%     | **1.09x**
StdlibSort                                              | 7310    | 4084    | -44.1%    | **1.79x**

I see 0.15% increase in code size for Benchmark_O.

Thanks @gottesmm for suggesting this opportunity.

rdar://25345056
2016-03-29 23:04:36 -07:00
Erik Eckstein
d581cbdd9d Re-instate "PerformanceInliner: Improve the inlining heuristic to reduce code size."
...with a fix in the shortest-path-analysis

This reinstates commit 4bd1216702
2016-03-29 16:33:47 -07:00
Erik Eckstein
be8faad792 Revert my recent inliner commits.
There are some failures in the tests.
2016-03-28 11:11:25 -07:00
practicalswift
5004c21c6a Merge pull request #1877 from practicalswift/cleanups-20160325
[gardening] Daily cleanup: typos, header formatting, pep8 fixes.
2016-03-28 09:46:55 +02:00
practicalswift
11a8b6c2ba [gardening] Daily cleanup: typos, header formatting. 2016-03-28 09:29:38 +02:00
Erik Eckstein
4bd1216702 PerformanceInliner: Improve the inlining heuristic to reduce code size.
It now detects more opportunities for inlining, like some patters with RC instructions or loads/stores from/to stack locations in the caller.
On the other hand a new shortest path analysis limits inlining to those cases where it really gives a benefit.

As the inlining decision now depends on many parameters, the test-threshhold option is removed because it doe not make much sense anymore.
Instead the inliner test files are modified to model the "real" instruction costs.
2016-03-27 10:59:29 -07:00
Arnold Schwaighofer
255779082e Add a peephole optimization for the builtin "unsafeGuaranteed"
We can remove the retain/release pair preceeding the builtins based on the
knowledge that the lifetime of the reference is guaranteed by someone hanging on
to the reference elsewhere.
2016-03-27 06:47:16 -07:00