Commit Graph

72 Commits

Author SHA1 Message Date
Andrew Trick
4ca3c232b7 Fix LICM to avoid hoisting never-executed traps
It is legal for the optimizer to consider code after a loop always
reachable, but when a loop has no exits, or when the loop's exits are
dominated by a conditional statement, we should not consider
conditional statements within the loop as dominating all possible
execution paths through the loop. At least not when there is at least
one path through the loop that contains a "synchronization point",
such as a function that may contain a memory barrier, perform I/O, or
exit the program.

Sadly, we still don't model synchronization points in the optimizer,
so we need to conservatively assume all loops have a synchronization
point and avoid hoisting conditional traps that may never be executed.

Fixes rdar://66791257 (Print statement provokes "Can't unsafeBitCast
between types of different sizes" when optimizations enabled)

Originated in 2014.
2020-08-12 13:55:46 -07:00
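As a source-level analogy for commit 4ca3c232b7 above (not the SIL the pass actually operates on), the sketch below contrasts a loop containing a conditionally executed trap with a version in which that trap has been hoisted into the preheader. All names (`trap`, `sumOriginal`, `sumWronglyHoisted`, `traps`) are hypothetical, and the trap is modeled as a C++ exception purely so that the difference is observable.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a trap; modeled as an exception so a caller can
// observe whether it fired.
[[noreturn]] inline void trap() { throw std::runtime_error("trap"); }

// Original loop: the loop-invariant check sits behind `guard`, so it is never
// executed when `guard` is false.
inline int sumOriginal(bool guard, bool invariantFails, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    if (guard) {
      if (invariantFails)
        trap();
    }
    sum += i;
  }
  return sum;
}

// Unsafe transformation: the check is loop-invariant, but hoisting it to the
// preheader makes it run even on executions that never reach it.
inline int sumWronglyHoisted(bool guard, bool invariantFails, int n) {
  (void)guard; // the guard no longer protects the check
  if (invariantFails)
    trap();
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += i;
  return sum;
}

// Returns true if calling `f` traps.
template <typename F> bool traps(F f) {
  try {
    f();
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```

With `guard == false` and `invariantFails == true`, the original loop completes normally while the hoisted version traps, which is the kind of behavior change the commit avoids by being conservative.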
Michael Gottesman
d064241599 [ssa-updater] Modernize style before adding support for guaranteed parameters.
Specifically:

1. I made methods, variables camelCase.
2. I expanded out variable names (e.g.: bb -> block, predBB -> predBlocks, U -> wrappedUse).
3. I changed typedef -> using.
4. I changed a few C-style for loops into for-each loops using llvm::enumerate.

NOTE: I left the parts needed for syncing to LLVM in the old style since LLVM
needs these to exist for CRTP to work correctly for the SILSSAUpdater.
2020-08-06 15:41:00 -07:00
Andrew Trick
5826e75b00 Generalize the MemAccessUtils API.
For use outside access enforcement passes.

Add isUniquelyIdentifiedAfterEnforcement.

Rename functions for clarity and generality.

Rename isUniquelyIdentifiedOrClass to isFormalAccessBase.

Rename findAccessedStorage to identifyFormalAccess.

Rename findAccessedStorageNonNested to findAccessedStorage.

Part of generalizing the utility for use outside the access
enforcement passes.
2020-07-17 10:13:20 -07:00
Erik Eckstein
ba4da8e0d3 LICM: enable more stores to be moved out of a loop
Even if a store is not dominating the loop exits, it makes sense to move it out of the loop if the pre-header also has a store to the same memory location.
When this is done, dead-store-elimination can then most likely remove the store in the pre-header.
2020-05-18 15:31:34 +02:00
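The effect described in commit ba4da8e0d3 above can be sketched at source level as follows. This is only an analogy under stated assumptions: `storeInLoop` and `storeSunk` are hypothetical names, and nothing else is assumed to read or write `*slot` while the loop runs.

```cpp
#include <cassert>
#include <vector>

// Before: the store inside the loop is conditional, so it does not dominate
// the loop exit. The preheader, however, already stores to the same location.
inline void storeInLoop(int *slot, const std::vector<int> &values) {
  *slot = 0; // preheader store
  for (int v : values) {
    if (v > 0)
      *slot = v; // conditional store inside the loop
  }
}

// After: the running value lives in a local and a single store remains at the
// loop exit. A preheader store to *slot would now be dead, because the exit
// store always overwrites it, so it is omitted here.
inline void storeSunk(int *slot, const std::vector<int> &values) {
  int current = 0; // the value the preheader store would have written
  for (int v : values) {
    if (v > 0)
      current = v;
  }
  *slot = current; // single store at the loop exit
}
```

Both functions leave the same value in `*slot` for any input, including an empty vector.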
Andrew Trick
1c12de3241 Fix LICM combined load/store hoisting/sinking optimization.
This loop optimization hoists and sinks a group of loads and stores to
the same address.

Consider this SIL...

PRELOOP:
  %stackAddr = alloc_stack $Index
  %outerAddr1 = struct_element_addr %stackAddr : $*Index, #Index.value
  %innerAddr1 = struct_element_addr %outerAddr1 : $*Int, #Int._value

  %outerAddr2 = struct_element_addr %stackAddr : $*Index, #Index.value
  %innerAddr2 = struct_element_addr %outerAddr2 : $*Int, #Int._value

LOOP:
  %_ = load %innerAddr2 : $*Builtin.Int64
  store %_ to %outerAddr2 : $*Int
  %_ = load %innerAddr1 : $*Builtin.Int64

There are two bugs:

1) LICM miscompiles code during combined load/store hoisting and sinking.

When the loop contains an aliasing load from a different projection
value, the optimization sinks the store but never replaces the
load. At runtime, the load reads a stale value.

FIX: isOnlyLoadedAndStored needs to check for other load instructions
before hoisting/sinking a seemingly unrelated set of
loads/stores. Checking side effect instructions is insufficient. The
same bug could happen with stores, which also do not produce side
effects.

Fixes <rdar://61246061> LICM miscompile:
Combined load/store hoisting/sinking with aliases

2) The LICM algorithm is not robust with respect to address projection
   because it identifies a projected address by its SILValue. This
   should never be done! It is trivial to represent a projection path
   using an IndexTrieNode (there is also an abstraction called
   "ProjectionPath", but it should _never_ actually be stored by an
   analysis because of the time and space complexity of doing so).

The second bug is not necessary to fix for correctness, so will be
fixed in a follow-up commit.
2020-04-03 08:25:20 -07:00
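A simplified source-level analogue of the first bug in commit 1c12de3241 above: two pointers stand in for the two projections of the same storage. The names `loopOriginal` and `loopMiscompiled` are hypothetical, and the `+ 1` is added only so that the stale read becomes visible.

```cpp
#include <cassert>

// Original loop: load through one projection, store, then load through a
// second projection that aliases the stored location.
inline long loopOriginal(long *viaProj1, long *viaProj2, int n) {
  long last = 0;
  for (int i = 0; i < n; ++i) {
    long v = *viaProj2;
    *viaProj2 = v + 1;
    last = *viaProj1; // sees the value just stored when the pointers alias
  }
  return last;
}

// Buggy result: the load was hoisted and the store was sunk, but the aliasing
// load through viaProj1 was left in the loop and now reads a stale value.
inline long loopMiscompiled(long *viaProj1, long *viaProj2, int n) {
  long v = *viaProj2; // hoisted load
  long last = 0;
  for (int i = 0; i < n; ++i) {
    v = v + 1;
    last = *viaProj1; // stale: memory is not updated until after the loop
  }
  *viaProj2 = v; // sunk store
  return last;
}
```

When both pointers refer to the same object, the final memory contents agree but the values observed inside the loop do not, which matches the commit's description of a load reading a stale value.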
Andrew Trick
73ee38c162 Add tracing to LICM for reloaded store/restored load optimization. 2020-04-03 08:25:20 -07:00
Erik Eckstein
3ad7d548c2 LICM: hoist calls to global_init functions
Global initializers are executed only once.
Therefore it's possible to hoist such an initializer call to a loop pre-header, provided there are no conflicting side-effects in the loop before the call.
Also, the call must post-dominate the loop pre-header. Otherwise it would be executed speculatively.
2020-03-23 16:08:56 +01:00
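The idea in commit 3ad7d548c2 above can be illustrated with a C++ function-local static, which is likewise initialized only once. This is a rough analogy, not how Swift globals are implemented; `initCount`, `lazyGlobal`, `sumInLoop`, and `sumHoisted` are hypothetical names.

```cpp
#include <cassert>

// Counts how often the initializer actually runs.
inline int &initCount() {
  static int count = 0;
  return count;
}

// Hypothetical stand-in for a lazily initialized global: the initializer runs
// only on the first call.
inline int &lazyGlobal() {
  static int value = (++initCount(), 42);
  return value;
}

// Before: the accessor, including its "already initialized?" check, is called
// on every iteration.
inline int sumInLoop(int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += lazyGlobal();
  return sum;
}

// After: the call is hoisted to the preheader. The early return plays the role
// of the loop guard, so the call is not executed speculatively when the loop
// body would never run.
inline int sumHoisted(int n) {
  if (n <= 0)
    return 0;
  int &global = lazyGlobal(); // preheader
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += global;
  return sum;
}
```

Without the guard, calling `sumHoisted(0)` would trigger the initializer even though the original loop never would, which is the speculative execution the commit message warns about.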
swift_jenkins
47af5bcec0 Merge remote-tracking branch 'origin/master' into master-next 2019-12-18 17:39:43 -08:00
Ravi Kandhadai
935686460c [SIL Optimization] Create a new utility InstructionDeleter to delete instructions
and eliminate dead code. This is meant to be a replacement for the utility:
recursivelyDeleteTriviallyDeadInstructions. The new utility performs more aggressive
dead-code elimination for ownership SIL.

This patch also migrates most non-force-delete uses of
recursivelyDeleteTriviallyDeadInstructions to the new utility,
and migrates one force-delete use of recursivelyDeleteTriviallyDeadInstructions
(in IRGenPrepare) to the new utility.
2019-12-18 13:17:17 -08:00
swift-ci
64f712b1b4 Merge remote-tracking branch 'origin/master' into master-next 2019-11-01 11:09:36 -07:00
Erik Eckstein
c29cdd972b LICM: add an optimization to move multiple loads and stores from/to the same memory location out of a loop.
This is a combination of load hoisting and store sinking, e.g.

  preheader:
    br header_block
  header_block:
    %x = load %not_aliased_addr
    // use %x and define %y
    store %y to %not_aliased_addr
    ...
  exit_block:

is transformed to:

  preheader:
    %x = load %not_aliased_addr
    br header_block
  header_block:
    // use %x and define %y
    ...
  exit_block:
    store %y to %not_aliased_addr
2019-10-31 19:07:17 +01:00
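A source-level counterpart of the SIL transformation in commit c29cdd972b above, assuming `addr` is not aliased by anything else touched in the loop. The function names are hypothetical.

```cpp
#include <cassert>

// Before: every iteration loads from and stores to the same location.
inline void accumulateInLoop(int *addr, int n) {
  for (int i = 0; i < n; ++i) {
    int x = *addr; // load in the loop
    int y = x + i; // use %x and define %y
    *addr = y;     // store in the loop
  }
}

// After: the load is hoisted to the preheader and the store is sunk to the
// exit block; the loop body works on a local value only.
inline void accumulateHoisted(int *addr, int n) {
  int x = *addr; // hoisted load
  for (int i = 0; i < n; ++i)
    x = x + i;
  *addr = x; // sunk store
}
```

If the loop runs zero times, the sunk store simply writes back the value that was loaded, so memory is unchanged in both versions.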
Erik Eckstein
6c6b6849e0 LICM: rename MayWrites -> SideEffectInsts
Because the set includes all side-effect instructions, also may-reads.
NFC
2019-10-31 12:52:25 +01:00
swift-ci
90fcb675dc Merge remote-tracking branch 'origin/master' into master-next 2019-10-30 09:50:07 -07:00
eeckstein
7df5feb697 Revert "LICM: add an optimization to move multiple loads and stores from/to the same memory location out of a loop." 2019-10-30 17:26:49 +01:00
swift-ci
e981f7fa0d Merge remote-tracking branch 'origin/master' into master-next 2019-10-30 04:29:59 -07:00
Erik Eckstein
584581e9b9 LICM: add an optimization to move multiple loads and stores from/to the same memory location out of a loop.
This is a combination of load hoisting and store sinking, e.g.

  preheader:
    br header_block
  header_block:
    %x = load %not_aliased_addr
    // use %x and define %y
    store %y to %not_aliased_addr
    ...
  exit_block:

is transformed to:

  preheader:
    %x = load %not_aliased_addr
    br header_block
  header_block:
    // use %x and define %y
    ...
  exit_block:
    store %y to %not_aliased_addr
2019-10-29 16:49:48 +01:00
Erik Eckstein
4e8cfdeabb LICM: rename MayWrites -> SideEffectInsts
Because the set includes all side-effect instructions, also may-reads.
NFC
2019-10-29 10:21:16 +01:00
swift-ci
ded4197d59 Merge remote-tracking branch 'origin/master' into master-next 2019-10-02 13:29:41 -07:00
Andrew Trick
bddc69c8a6 Organize SILOptimizer/Utils headers. Remove Local.h.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.

New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h

Removed:
- Local.h
- Two conflicting CFG.h files

This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.

Move the control flow utilities out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.

Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.

Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
2019-10-02 11:34:54 -07:00
Jonas Devlieghere
b4d268e9e1 Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances in the swift repo.
2019-08-15 11:32:39 -07:00
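For reference, the replacement made in commit b4d268e9e1 above amounts to the following spelling change. `PassInfo` and `makePassInfo` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload type for the example.
struct PassInfo {
  std::string name;
  int id;
  PassInfo(std::string n, int i) : name(std::move(n)), id(i) {}
};

inline std::unique_ptr<PassInfo> makePassInfo() {
  // Previously spelled llvm::make_unique<PassInfo>(...). Since C++14 the
  // standard library provides the same facility in <memory>.
  return std::make_unique<PassInfo>("licm", 1);
}
```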
Michael Gottesman
b6b61ddd6a [ownership] Skip functions with ownership in LICM. 2019-08-05 17:36:46 -07:00
Andrew Trick
8cc013ed3f Fix LICM debug output typo. 2019-05-14 10:45:53 -07:00
Joe Shajrawi
d80f2d2e6e [exclusivity] teach LICM how to handle static markers 2019-03-14 12:39:21 -07:00
Adrian Prantl
ff63eaea6f Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers in our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

      for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
2018-12-04 15:45:04 -08:00
Joe Shajrawi
45b61d11d7 [LICM] Only add dynamic begin_access checks to the list of access scopes to be analyzed
This should provide slight compile-time improvement
2018-10-31 17:31:29 -07:00
Erik Eckstein
87cf7eff03 SIL optimizer: fix a compiler non-determinism in LICM.
SR-8844
rdar://problem/44762620
2018-09-25 12:58:42 -07:00
Joe Shajrawi
c6a4e2cdd2 [Exclusivity] Handle mayRelease instructions conservatively in AccessEnforcementOpts and LICM 2018-09-04 13:23:22 -07:00
Joe Shajrawi
95344f6591 [LICM/Exclusivity] Hoist (some) conflicting begin_accesses out of loops
Consider the attached test cases:

We have a begin_access [dynamic] to a global inside of a loop

There’s a nested conflict on said access due to an apply() instruction between the begin and end accesses.

LICM is currently very conservative: If there are any function calls inside of the loop that conflict with begin and end access, we do not hoist out of the loop.

However, if all conflicting applies are “sandwiched” between the begin and end access, there’s no reason we can’t hoist out of the loop.

See radar rdar://problem/43660965 - this improves some internal benchmarks by over 3X
2018-08-28 11:51:16 -07:00
Joe Shajrawi
8e935f6046 Fix a warning in LICM in release builds 2018-08-27 10:11:23 -07:00
Joe Shajrawi
5e2f3d8448 [LICM]: support hoisting ref_element_addr even if they are not guaranteed to be executed
In some instances, some instructions, like ref_element_addr, can be hoisted outside of loops even if they are not guaranteed to be executed.

We currently don’t support that / bail. We only try to do so / do further analysis only for begin_access because they are extremely heavy.

However, we need to support hoisting of ref_element_addr in that case, if it does not have a loop dependent operand, in order to be able to hoist begin_access instructions in some benchmarks.

Initial local testing shows that this PR, when we enable exclusivity, improves the performance of a certain internal benchmark by over 40%

See rdar://problem/43623829
2018-08-23 14:13:37 -07:00
Joe Shajrawi
7281a76deb [AccessEnforcementOpts] Add mergeAccesses optimization 2018-08-09 16:15:25 -07:00
Bob Wilson
8e330ee344 NFC: Fix indentation around the newly renamed LLVM_DEBUG macro.
Jordan used a sed command to rename DEBUG to LLVM_DEBUG. That caused some
lines to wrap and messed up indentation for multi-line arguments.
2018-07-21 00:56:18 -07:00
Jordan Rose
cefb0b62ba Replace old DEBUG macro with new LLVM_DEBUG
...using a sed command provided by Vedant:

$ find . -name \*.cpp -print -exec sed -i "" -E "s/ DEBUG\(/ LLVM_DEBUG(/g" {} \;
2018-07-20 14:37:26 -07:00
swift-ci
9b2d9606f6 Merge pull request #17614 from shajrawi/licm_asan 2018-06-28 17:02:08 -07:00
Joe Shajrawi
3d411d22e0 [LICM] Fix an ASAN use-after-free bug in rewrite 2018-06-28 15:46:05 -07:00
Joe Shajrawi
437b0d8e13 [LICM] Array Semantics: only hoist kGetCount and kGetCapacity
We can’t hoist everything that is hoist-able

The canHoist method does not do all the required analysis

Some of the work is done at COW Array Opt

TODO: Refactor COW Array Opt + canHoist - radar 41601468
2018-06-28 14:42:20 -07:00
Joe Shajrawi
c3ddaf92cb [LICM] Further refactoring: Simplify hoisting of begin access instructions - get rid of HoistPairSet and hoistAndSinkInstructionPair 2018-06-27 20:54:24 -07:00
Joe Shajrawi
f56b5d8730 [LICM] Add support for Hoisting <Instruction, Instruction Set> Pairs
Support having the target of each hoist instruction as multiple sinks.
2018-06-27 16:25:35 -07:00
Joe Shajrawi
bc59eaad70 [LICM] Refactoring + Improvements + Exclusivity Support
Major refactoring + tuning of LICM. Includes:
Support for hoisting more array semantic calls
Remove restrictions for sinking instructions
Add support for hoisting and sinking instruction pairs (begin and end accesses)

Testing with Exclusivity enabled on a couple of benchmarks shows:
ReversedArray 7x improvement
StringWalk 2.6x improvement
2018-06-26 13:26:37 -07:00
Joe Shajrawi
19a6bb5bdb [LICM] Code Hygiene - rip out sinkCondFail
Removing this optimization from SIL: It is not worth the extra code complexity and compilation time.

More in-depth explanation for the reasoning behind my decision:
1) What is being done there is obviously not LICM (more below) - even if it is useful it should be its own separate optimization
2) The regression that caused us to add this code is no longer there in most cases - 10% in only one specific corner-case
3) Even if the regression was still there, this is an extremely specific code pattern that we are pattern-matching against. Said pattern would be hard to find in any real code.

There is a small code snippet in rdar://17451529 that caused us to add this optimization. Looking at it now, the only difference at the SIL level is in loop 1:
  %295 = tuple_extract %294 : $(Builtin.Int64, Builtin.Int1), 0
  %296 = tuple_extract %294 : $(Builtin.Int64, Builtin.Int1), 1
  cond_fail %296 : $Builtin.Int1
  %298 = struct $Int (%295 : $Builtin.Int64)
  store %298 to %6 : $*Int
  %300 = builtin "cmp_eq_Int64"(%292 : $Builtin.Int64, %16 : $Builtin.Int64) : $Builtin.Int1
  cond_br %300, bb1, bb12

The cond_fail instruction in said loop is moved below the store instruction / above the builtin.

Looking at the resulting IR and how LLVM optimizes it, it is almost the same.

If we look at the assembly code being executed then, before removing this optimization, we have:
LBB0_11:
	testq	%rcx, %rcx
	je	LBB0_2
	decq	%rcx
	incq	%rax
	movq	%rax, _$S4main4sum1Sivp(%rip)
	jno	LBB0_11

After removing it we have:
LBB0_11:
	incq	%rax
	testq	%rcx, %rcx
	je	LBB0_2
	decq	%rcx
	movq	%rax, %rdx
	incq	%rdx
	jno	LBB0_11

There is no extra load/movq which was mentioned in the radar.
2018-06-14 11:10:11 -07:00
Andrew Trick
cdcb7c7a2c [NFC] SideEffectAnalysis refactoring and cleanup.
Make this a generic analysis so that it can be used to analyze any
kind of function effect.

FunctionSideEffect becomes a trivial specialization of the analysis.

The immediate need for this is to introduce a new
AccessedStorageAnalysis, although I foresee it as a generally very
useful utility. This way, new kinds of function effects can be
computed without adding any complexity or compile time to
FunctionSideEffects. We have the flexibility of computing different
kinds of function effects at different points in the pipeline.

In the case of AccessedStorageAnalysis, it will compute both
FunctionSideEffects and FunctionAccessedStorage in the same pass by
implementing a simple wrapper on top of FunctionEffects.

This cleanup reflects my feeling that nested classes make the code
extremely unreadable unless they are very small and either private or
only used directly via their parent class. It's easier to see how these
classes compose with a flat type system.

In addition to enabling new kinds of function effects analyses, I
think this makes the implementation of side effect analysis easier to
understand by separating concerns.
2018-04-16 17:05:04 -07:00
Erik Eckstein
db69b8d433 SideEffectAnalysis: don't assume the worst side-effects for a release instruction
Instead let the client decide what to do with this.
Sometimes the client knows what side effect a release instruction really has.
2018-01-19 11:32:35 -08:00
eeckstein
b126b62256 Revert "Optimization changes to completely fold OptionSet literals" 2018-01-18 22:05:07 -08:00
Erik Eckstein
9907ffc09d SideEffectAnalysis: don't assume the worst side-effects for a release instruction
Instead let the client decide what to do with this.
Sometimes the client knows what side effect a release instruction really has.
2018-01-18 18:27:17 -08:00
John McCall
ab3f77baf2 Make SILInstruction no longer a subclass of ValueBase and
introduce a common superclass, SILNode.

This is in preparation for allowing instructions to have multiple
results.  It is also a somewhat more elegant representation for
instructions that have zero results.  Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction.  Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.

A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.

Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
2017-09-25 02:06:26 -04:00
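The caution about pointer equality in commit ab3f77baf2 above can be illustrated with a toy hierarchy. This sketch assumes plain non-virtual inheritance and uses hypothetical stand-in types (`Node`, `Value`, `Instruction`, `SingleValueInst`); it does not reproduce the real class definitions, for which the comment on SILNode referenced in the commit is the authority.

```cpp
#include <cassert>

// Toy stand-ins for the classes named in the commit message.
struct Node {
  virtual ~Node() = default;
};
struct Value : Node {};
struct Instruction : Node {};
// Inherits from both, and therefore contains two distinct Node subobjects.
struct SingleValueInst : Value, Instruction {};

// True if the Node pointer reached through the Value base differs from the one
// reached through the Instruction base.
inline bool nodeSubobjectsDiffer(SingleValueInst &inst) {
  Node *asValue = static_cast<Value *>(&inst);
  Node *asInstruction = static_cast<Instruction *>(&inst);
  return asValue != asInstruction;
}

// Compares by most-derived object instead of by raw Node address.
inline bool sameObject(const Node *a, const Node *b) {
  return dynamic_cast<const void *>(a) == dynamic_cast<const void *>(b);
}
```

In this toy model, comparing raw `Node *` values can report two different nodes for the same instruction, so equality has to be decided on a canonical pointer for the whole object.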
practicalswift
492f5cd35a [gardening] Remove redundant repetition of type names (DRY): RepeatedTypeName foo = dyn_cast<RepeatedTypeName>(bar)
Replace `NameOfType foo = dyn_cast<NameOfType>(bar)` with DRY version `auto foo = dyn_cast<NameOfType>(bar)`.

The DRY auto version is by far the dominant form already used in the repo, so this PR merely brings the exceptional cases (redundant repetition form) in line with the dominant form (auto form).

See the [C++ Core Guidelines](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es11-use-auto-to-avoid-redundant-repetition-of-type-names) for a general discussion on why to use `auto` to avoid redundant repetition of type names.
2017-05-05 09:45:53 +02:00
practicalswift
5e255e07d7 [gardening] Remove redundant logic 2017-04-11 23:04:55 +02:00
Andrew Trick
be1881aa1f Remove redundant Transform.getName() definitions.
At some point, pass definitions were heavily macro-ized. Pass
descriptive names were added in two places. This is not only redundant
but a source of confusion. You could waste a lot of time grepping for
the wrong string. I removed all the getName() overrides which, at
around 90 passes, was a fairly significant amount of code bloat.

Any pass that we want to be able to invoke by name from a tool
(sil-opt) or pipeline plan *should* have a unique type name, enum value,
command-line string, and name string. I removed a comment about the
various inliner passes that contradicted that.

Side note: We should be consistent with the policy that a pass is
identified by its type. We have a couple passes, LICM and CSE, which
currently violate that convention.
2017-04-09 15:20:28 -07:00
practicalswift
6d1ae2a39c [gardening] 2016 → 2017 2017-01-06 16:41:22 +01:00
practicalswift
38be6125e5 [gardening] C++ gardening: Terminate namespaces, fix argument names, ...
Changes:
* Terminate all namespaces with the correct closing comment.
* Make sure argument names in comments match the corresponding parameter name.
* Remove redundant get() calls on smart pointers.
* Prefer using "override" or "final" instead of "virtual". Remove "virtual" where appropriate.
2016-12-17 00:32:42 +01:00