Commit Graph

239 Commits

Author SHA1 Message Date
Erik Eckstein
0456d95cb0 SIL: Use StackList in BasicBlockWorklist and BasicBlockSetVector
plus: I moved both data structures into a separate header file.
2021-04-11 14:07:26 +02:00
Andrew Trick
e9d0b08706 Add utilities to support OSSA update after running SSAUpdater.
This directly adds support to BasicBlockCloner for updating OSSA.

It also adds a much more general-purpose GuaranteedPhiBorrowFixup
utility which can be used for more complicated SSA updates, in which
multiple phis need to be created. More generally, it handles adding
nested borrow scopes for guaranteed phis even when that phi is used by
other guaranteed phis.
2021-03-18 00:14:13 -07:00
Michael Gottesman
19dd90674d Merge pull request #36167 from gottesmm/pr-761936430320990a32c8dfb2f85d27f171f186ba
[simplify-cfg] Only check if we can remove releases by performing simple jump threading if our block argument is not a trivial type.
2021-03-13 13:00:03 -08:00
Slava Pestov
7ccc41a7b7 SIL: Preliminary support for 'apply [noasync]' calls
Refactor SILGen's ApplyOptions into an OptionSet, add a
DoesNotAwait flag to go with DoesNotThrow, and sink it
all down into SILInstruction.h.

Then, replace the isNonThrowing() flag in ApplyInst and
BeginApplyInst with getApplyOptions(), and plumb it
through to TryApplyInst as well.

Set the flag when SILGen emits a sync call to a reasync
function.

When set, this disables the SIL verifier check against
calling async functions from sync functions.

Finally, this allows us to add end-to-end tests for
rdar://problem/71098795.
2021-03-04 22:41:46 -05:00
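
The option-set idea in commit 7ccc41a7b7 above can be illustrated with a small sketch. This is not the compiler's code: `ApplyFlag`, `ApplyOptionSet`, and `isCallAllowed` are hypothetical stand-ins, and only the flag names DoesNotThrow and DoesNotAwait come from the commit message. It shows one plausible way a per-call flag could disable a "sync function calls async function" check.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flags; only the two names are taken from the commit message.
enum class ApplyFlag : std::uint8_t {
  DoesNotThrow = 1 << 0,
  DoesNotAwait = 1 << 1,
};

// A minimal bit-set wrapper, standing in for a real OptionSet type.
class ApplyOptionSet {
  std::uint8_t bits = 0;

public:
  ApplyOptionSet() = default;
  ApplyOptionSet(ApplyFlag flag) : bits(static_cast<std::uint8_t>(flag)) {}

  ApplyOptionSet &operator|=(ApplyFlag flag) {
    bits |= static_cast<std::uint8_t>(flag);
    return *this;
  }

  bool contains(ApplyFlag flag) const {
    return (bits & static_cast<std::uint8_t>(flag)) != 0;
  }
};

// Assumed shape of the verifier rule: a sync caller may call an async callee
// only when the call site carries DoesNotAwait.
bool isCallAllowed(bool callerIsAsync, bool calleeIsAsync,
                   ApplyOptionSet options) {
  if (!calleeIsAsync || callerIsAsync)
    return true;
  return options.contains(ApplyFlag::DoesNotAwait);
}
```

Keeping both flags in one set means a new per-call property only needs a new enumerator rather than another boolean accessor on every apply-like instruction.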
Meghana Gupta
1f89d9ff89 Verify critical edges when -sil-verify-all is enabled 2021-03-03 23:45:56 -08:00
Michael Gottesman
34e4e42642 [simplify-cfg] Only check if we can remove releases by performing simple jump threading if our block argument is not a trivial type.
Eliminates unnecessary work and reduces compile time.
2021-02-25 12:21:49 -08:00
Andrew Trick
ba9f52071b OSSA: simplify-cfg support for trivial block arguments.
Enable most simplify-cfg optimizations as long as the block arguments
have trivial types. Enable most simplify-cfg unit test cases.

This massively reduces the size of the CFG during OSSA passes.

Test cases that aren't supported in OSSA yet have been moved to a
separate test file for disabled OSSA tests.

Full simplify-cfg support is currently blocked on OSSA utilities which
I haven't checked in yet.
2021-02-23 22:47:59 -08:00
Michael Gottesman
142c3bd1fc [simplify-cfg] Enable some simple opts during ownerships on br, cond_br that do not involve objects directly.
Just to reduce the size of the CFG.
2021-02-12 23:20:17 -08:00
Michael Gottesman
6c255734ba [simplify-cfg] Enable remove unreachable blocks to shrink the CFG a bit. 2021-02-12 23:20:17 -08:00
Erik Eckstein
214b7a9929 Use the new BasicBlockWorklist utility in various places in the compiler.
It's a refactoring which simplifies the code.
NFC.
2021-02-12 11:15:55 +01:00
Erik Eckstein
fe10f98cf0 SIL: rename the SILBitfield.h header file to BasicBlockBits.h
NFC
2021-02-12 11:15:55 +01:00
Andrew Trick
67863c55b6 Revert "SimplifyCFG: fix an infinite jump-threading loop."
This reverts commit fe928d57ac.

This causes an ASAN failure. Reverting until it can be debugged.
2021-01-28 23:46:23 -08:00
Andrew Trick
0b2a6267f6 Revert "Comment SimplifyCFG JumpThreadingCost."
This reverts commit 7a1065cfc2.
2021-01-28 23:46:12 -08:00
Andrew Trick
7a1065cfc2 Comment SimplifyCFG JumpThreadingCost.
Explain how this actually works since it isn't directly obvious.
2021-01-27 16:34:02 -08:00
Andrew Trick
e5e2cf1f62 Merge pull request #35608 from atrick/guard-simplify-loops
Add a safeguard to SimplifyCFG tryJumpThreading to avoid infinite loop peeling
2021-01-27 16:19:27 -08:00
Erik Eckstein
fe928d57ac SimplifyCFG: fix an infinite jump-threading loop.
The JumpThreadingCost map in SimplifyCFG is used to prevent infinite jump-threading loops.
There was a missing update of the cost for blocks which are cloned:
Infinite re-cloning of the original block was prevented, but re-cloning of the cloned block was not.

A test case was already added in 8948f7565a.

rdar://73357726, [SR-14068]
2021-01-27 14:54:25 +01:00
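
The cost-map mechanism described in commit fe928d57ac above might look roughly like the following. This is a simplified sketch, not the SimplifyCFG implementation: blocks are plain integers, and `JumpThreadBudget`, `tryClone`, and the limit value are all hypothetical. The point it illustrates is that the clone must inherit the accumulated cost, otherwise each new clone starts with a fresh budget and can be cloned again indefinitely.

```cpp
#include <cassert>
#include <unordered_map>

using BlockId = int;

// Hypothetical per-function budget for jump threading.
struct JumpThreadBudget {
  std::unordered_map<BlockId, int> cost; // accumulated cloning cost per block
  int limit;
  BlockId nextId;

  JumpThreadBudget(int limit, BlockId firstFreeId)
      : limit(limit), nextId(firstFreeId) {}

  // Returns the id of the clone, or -1 if the budget for `block` is used up.
  BlockId tryClone(BlockId block, int cloneCost) {
    assert(cloneCost > 0 && "a free clone would never exhaust the budget");
    int total = cost[block] + cloneCost; // missing entries start at 0
    if (total > limit)
      return -1;
    cost[block] = total;
    BlockId clone = nextId++;
    // The step the commit message says was missing: the clone carries the
    // cost forward, so re-cloning the clone also runs out of budget.
    cost[clone] = total;
    return clone;
  }
};
```

With a limit of 10 and a cost of 4 per clone, a chain of clones stops after two steps; without the inherited cost it would never stop.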
Erik Eckstein
f48191966c SILOptimizer: use BasicBlockSet instead of SmallPtrSet in various transformations.
It reduces compile time.
2021-01-27 10:31:17 +01:00
Andrew Trick
8b2098445e Add a safeguard to SimplifyCFG tryJumpThreading to avoid infinite loop peeling
rdar://73644659 (Add a safeguard to SimplifyCFG tryJumpThreading to avoid infinite loop peeling)

A case of infinite loop peeling was exposed recently:

([SR-14068]: Compiling with optimisation runs indefinitely for grpc-swift)

It was trivially fixed here:

---
commit 8948f7565a (HEAD -> fix-simplifycfg-tramp, public/fix-simplifycfg-tramp)
Author: Andrew Trick <atrick@apple.com>
Date:   Tue Jan 26 17:02:37 2021

Fix a SimplifyCFG typo that leads to unbounded optimization
---

However, that fix isn't a strong guarantee against this behavior. The
obvious complete fix is that jump-threading should not affect loop
structure. But changing that requires a performance investigation. In
the meantime this change introduces a simple mechanism that guarantees
that a loop header is not repeatedly cloned.

This safeguard is worthwhile because jump-threading across loop
boundaries is kicking in more frequently now that critical edges are
being split within SimplifyCFG.

Note that it is both necessary and desirable to split critical edges
between transformations so that SIL remains in a valid state. That
allows other code in SimplifyCFG to call arbitrary SIL utilities,
allows verifying SimplifyCFG by running verification between
transformations, and simplifies the patterns that SimplifyCFG itself
needs to consider.
2021-01-26 19:42:19 -08:00
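
A guard of the kind commit 8b2098445e above describes ("a loop header is not repeatedly cloned") could be as small as a set of already-cloned headers. The sketch below is an assumption about the general shape only; `LoopHeaderGuard` and `mayClone` are hypothetical names, and blocks are represented as integers.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical guard: remembers which loop headers were already cloned.
struct LoopHeaderGuard {
  std::unordered_set<int> clonedHeaders;

  // Returns true if jump threading may clone `block`. A loop header is
  // allowed through once; every later request for the same header is refused.
  bool mayClone(int block, bool isLoopHeader) {
    if (!isLoopHeader)
      return true;
    return clonedHeaders.insert(block).second;
  }
};
```

Because the set only grows and the number of blocks that were loop headers when the guard was created is finite, this bounds the amount of loop peeling regardless of what the other transformations do.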
Andrew Trick
8948f7565a Fix a SimplifyCFG typo that leads to unbounded optimization
Fixes rdar://73357726 ([SR-14068]: Compiling with optimisation runs
indefinitely for grpc-swift)

The root cause of this problem is that SimplifyCFG::tryJumpThreading
jump threads into loops, effectively peeling loops. This is not the
right way to implement loop peeling. That belongs in a loop
optimization pass. There is simply no sane way to control jump
threading if it is allowed across loop boundaries, both from the
standpoint of requiring optimizations to terminate and from the
standpoint of reducing senseless code bloat.

SimplifyCFG does have a mechanism to avoid jump-threading into loops in
most cases. That mechanism would actually prevent the infinite loop
peeling in this particular case if it were implemented correctly. But
the original implementation circa 2014 appears to have a typo.

This commit fixes that obvious bug. I do not think it's sufficient
to ensure we never see the bad behavior. I will file separate bugs for
the broader issue.

This bad behavior was exposed incidentally by splitting critical
edges. Without edge splitting, SimplifyCFG::simplifyBlocks only
performs "jump threading" once, creating a critical edge to the loop
header. Because simplifyBlocks works under the assumption that there
are no critical edges, it never attempts to perform jump threading
again. In other words, the presence of the critical edge "breaks" the
optimization, preventing it from continuing as intended.

With edge splitting, the simplifyBlocks worklist performs "jump
threading" followed by "jump to trampoline" removal, which creates a
new loop-back edge to the original loop header. This is fine. However,
simplifyBlocks iteratively attempts all optimizations up to a fixed
point and it does not stop at loop headers! So, splitting the critical
edge causes simplifyBlocks to work as intended, which leads to
infinite loop peeling. The end result is an infinite sequence of
nested loops. Each peeled iteration is actually within the parent
loop.
2021-01-26 17:05:37 -08:00
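
Several of the messages above turn on critical edges. By the usual definition, an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. The following sketch finds such edges in a toy CFG given as successor lists; it is illustrative only, uses integers for blocks, and `findCriticalEdges` is a hypothetical helper rather than a SIL utility.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>; // (source block, destination block)

// succs[b] lists the successors of block b.
std::vector<Edge> findCriticalEdges(const std::vector<std::vector<int>> &succs) {
  // Count incoming edges for every block.
  std::vector<int> predCount(succs.size(), 0);
  for (const auto &targets : succs)
    for (int target : targets)
      ++predCount[target];

  std::vector<Edge> critical;
  for (int block = 0; block < static_cast<int>(succs.size()); ++block) {
    if (succs[block].size() < 2)
      continue; // a single-successor source never has a critical edge
    for (int target : succs[block])
      if (predCount[target] > 1)
        critical.push_back({block, target});
  }
  return critical;
}
```

Splitting a critical edge means inserting a new block on it that contains only an unconditional branch, which is why the trampoline removal and jump threading discussed in these commits interact with edge splitting.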
Michael Gottesman
aa38be6d98 [inst-simplify] Hide simplifyInstruction in favor of using simplifyAndReplaceAllSimplifiedUsesAndErase.
Currently all of these places in the code base perform simplifyInstruction and
then a replaceAllSimplifiedUsesAndErase(...). This is a bad pattern since:

1. simplifyInstruction assumes its result will be passed to
   replaceAllSimplifiedUsesAndErase. So by leaving these as separate things, we
   allow for users to pointlessly make this mistake.

2. I am going to implement in a subsequent commit a utility that lifetime
   extends interior pointer bases when replacing an address with an interior
   pointer derived address. To do this efficiently, I want to reuse state I
   compute during simplifyInstruction during the actual RAUW meaning that if the
   two operations are split, that is difficult without extending the API. So by
   removing this, I can make the transform and eliminate mistakes at the same
   time.
2021-01-17 20:08:24 -08:00
Michael Gottesman
0de00d1ce4 [sil-inst-opt] Improve performance of InstModCallbacks by eliminating indirect call along default callback path.
Specifically before this PR, if a caller did not customize a specific callback
of InstModCallbacks, we would store a static default std::function into
InstModCallbacks. This means that we always would have an indirect jump. That is
unfortunate since this code is often called in loops.

In this PR, I eliminate this problem by:

1. I made all of the actual callback std::function in InstModCallback private
   and gave them a "Func" postfix (e.g., deleteInst -> deleteInstFunc).

2. I created public methods with the old callback names to actually call the
   callbacks. This ensured that as long as we are not escaping callbacks from
   InstModCallback, this PR would not result in the need for any source changes
   since we are changing a call of a std::function field to a call to a method.

3. I changed all of the places that were escaping inst mod's callbacks to take
   an InstModCallback. We shouldn't be doing that anyway.

4. I changed the default value of each callback in InstModCallbacks to be a
   nullptr and changed the public helper methods to check if a callback is
   null. If the callback is not null, it is called, otherwise the getter falls
   back to an inline default implementation of the operation.

Altogether, this means that the cost of a plain InstModCallback is reduced, and
one pays the price of an indirect call only as one customizes it further, which
scales better.

P.S. as a little extra thing, I added a madeChange field onto the
InstModCallback. Now that we have the helpers calling the callbacks, I can
easily insert instrumentation like this, allowing for users to pass in
InstModCallback and see if anything was RAUWed without needing to specify a
callback.
2021-01-04 12:51:55 -08:00
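
The pattern in commit 0de00d1ce4 above (private nullable callbacks, public methods with an inline default, and a change flag) can be sketched as follows. This is a reduced illustration under assumptions: `Inst`, `ModCallbacks`, and `hadChange` are hypothetical, and only a single callback is modeled; the names deleteInst and deleteInstFunc follow the commit message.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Toy instruction; "erased" stands in for removal from its parent block.
struct Inst {
  bool erased = false;
};

class ModCallbacks {
  // Null by default, so the common path needs no indirect call.
  std::function<void(Inst &)> deleteInstFunc;
  bool madeChange = false;

public:
  ModCallbacks() = default;
  explicit ModCallbacks(std::function<void(Inst &)> deleteFn)
      : deleteInstFunc(std::move(deleteFn)) {}

  // Public method keeps the old callback name, so call sites that wrote
  // callbacks.deleteInst(inst) do not need to change.
  void deleteInst(Inst &inst) {
    madeChange = true;
    if (deleteInstFunc)
      return deleteInstFunc(inst); // customized: pay for the indirect call
    inst.erased = true;            // default: inline implementation
  }

  bool hadChange() const { return madeChange; }
};
```

Routing every call through a method is also what makes instrumentation such as the change flag cheap to add, since there is a single place that sees every invocation.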
Michael Gottesman
c026e95cce [ownership] Extract out SILOwnershipKind from ValueOwnershipKind into its own type and rename Invalid -> Any.
This makes it easier to understand conceptually why a ValueOwnershipKind with
Any ownership is invalid and also allowed me to explicitly document the lattice
that relates ownership constraints/value ownership kinds.
2020-11-10 14:29:11 -08:00
Michael Gottesman
f36e8561f1 [ownership] Use a new ADT SwitchEnumBranch instead of SwitchEnumInstBase for generic operations on SwitchEnum{,Addr}Inst.
I have a need to have SwitchEnum{,Addr}Inst have different base classes
(TermInst, OwnershipForwardingTermInst). To do this I need to add a template to
SwitchEnumInstBase so I can switch that BaseTy. Sadly since we are using
SwitchEnumInstBase as an ADT type as well as an actual base type for
Instructions, this is impossible to do without introducing a template in a ton
of places.

Rather than doing that, I changed the code that was using SwitchEnumInstBase as
an ADT to instead use a proper ADT SwitchEnumBranch. I am happy to change the
name as people see fit (maybe SwitchEnumTerm?).
2020-11-08 19:52:02 -08:00
Andrew Trick
223ee10939 EdgeThreadingCloner. Remove splitCriticalEdges calls. 2020-11-03 01:40:00 -08:00
Andrew Trick
824cf85165 Fix SimplifyCFG jump-threading to cleanup after itself.
Avoid extra blocks and/or extra iterations in the simplify loop.
2020-11-03 01:40:00 -08:00
Andrew Trick
8278332a92 SILCloner should not introduce new critical edges.
Also, it must update the DomTree for any CFG changes except for the
addition of cloned blocks.
2020-11-03 01:30:58 -08:00
Andrew Trick
f5b51474a7 Fix the SimplifyCFG ThreadInfo broken abstraction.
In a blatant abuse of OO style, the logic for manipulating the CFG was
encapsulated inside a descriptor of the CFG edge, making it impossible
to work with.
2020-11-03 01:28:21 -08:00
Andrew Trick
51dfc63d51 SimplifyCFG: fix indentation so subsequent diffs are clean. 2020-11-03 01:10:30 -08:00
Andrew Trick
871518ca1a SimplifyCFG; add jump-threading successors to the worklist.
This is required for SimplifyCFG to iterate over simplified blocks
when it does critical edge splitting.
2020-10-29 15:25:20 -07:00
Andrew Trick
8991192793 SimplifyCFG; trampoline cleanup
Remove a pile of crufty "trampoline" elimination code that didn't make
sense.
2020-10-29 15:25:20 -07:00
Andrew Trick
b1fbc2b389 Rewrite SimplifyCFG's trampoline removal.
Avoid introducing new critical edges. Passes will end up resplitting
them, forcing SimplifyCFG to continually rerun. Also, we want to allow
SIL utilities to assume no critical edges, and avoid the need for
several passes to internally split edges and modify the CFG for no
reason.

Also, handle arbitrary block arguments which may be trivial and
unrelated to the real optimizations that trampoline removal exposes,
such as "unwrapping" enumeration-type arguments.

The previous code was an example of how to write an unstable
optimization. It could be defeated by other code in the function that
isn't directly related to the SSA graph being optimized. In general,
when an optimization can be defeated by unrelated code in the
function, that leads to instability which can be very hard to track
down (I spent multiple full days on this one). In this case, we have
enum-type branch args which need to be simplified by unwrapping
them. But, the existence of a trivial and entirely unrelated block
argument would defeat the optimization.
2020-10-29 15:25:20 -07:00
Andrew Trick
d8dcf61026 Improve SimplifyCFG: remove conditional branches to the same target. 2020-10-26 13:12:22 -07:00
Andrew Trick
d82e0ff781 SILInliner: Critical edges have no code size impact.
I think unconditional branches should be free, period. They will
mostly be removed during LLVM code gen. However, fixing this requires
significant adjustments to inlining heuristics to avoid microbenchmark
regressions at -Osize. So, instead I am just making this less
sensitive to critical edges for the sake of pipeline stability.
2020-10-26 10:49:18 -07:00
Erik Eckstein
6c85f267bf SimplifyCFG: fix a crash caused by unreachable CFG cycles with block arguments.
When SimplifyCFG (temporarily) produces an unreachable CFG cycle, some other transformations in SimplifyCFG didn't deal with this situation correctly.

Unfortunately I couldn't create a SIL test case for this bug, so I just added a swift test case.

https://bugs.swift.org/browse/SR-13650
rdar://problem/69942431
2020-10-15 15:04:16 +02:00
Joe Groff
a664a33b52 SIL: Add instructions to represent async suspend points.
`get_async_continuation[_addr]` begins a suspend operation by accessing the continuation value that can resume
the task, which can then be used in a callback or event handler before executing `await_async_continuation` to
suspend the task.
2020-10-01 14:21:52 -07:00
Erik Eckstein
0a71d0fbea SimplifyCFG: allow jump-threading for switch_enum_data_addr instructions.
If the branch-block injects a certain enum case and the destination switches on that enum, it's worth jump threading. E.g.

  inject_enum_addr %enum : $*Optional<T>, #Optional.some
  ... // no memory writes here
  br DestBB
DestBB:
  ... // no memory writes here
  switch_enum_addr %enum : $*Optional<T>, case #Optional.some ...

This enables removing all code with optionals in a loop, which iterates over an array of address-only elements, e.g.

  func test<T>(_ items: [T]) {
    for i in items {
      print(i)
    }
  }
2020-09-30 16:44:58 +02:00
Andrew Trick
5ae231eaab Rename getFieldNo() to getFieldIndex().
Do I really need to justify this?
2020-09-24 22:44:13 -07:00
Anthony Latsis
9fd1aa5d59 [NFC] Pre- increment and decrement where possible 2020-06-01 15:39:29 +03:00
Arnold Schwaighofer
147144baa6 SIL: Thread type expansion context through to function convention apis
This became necessary after recent function type changes that keep
substituted generic function types abstract even after substitution to
correctly handle automatic opaque result type substitution.

Instead of performing the opaque result type substitution as part of
substituting the generic args the underlying type will now be reified as
part of looking at the parameter/return types which happens as part of
the function convention apis.

rdar://62560867
2020-05-04 13:53:30 -07:00
Meghana Gupta
013387eceb Update Devirtualizer's analysis invalidation (#31284)
* Update Devirtualizer's analysis invalidation

castValueToABICompatibleType can change CFG, Devirtualizer uses this api but doesn't check if it modified the cfg
2020-04-27 18:30:33 -07:00
Erik Eckstein
1de19a1b32 SimplifyCFG: fix a compile time problem with block merging
When merging many blocks to a single block (in the wrong order), instructions are getting moved over and over again.
This is quadratic and can result in very long compile times for large functions.
To fix this, always move the instructions of the smaller block into the larger block.

rdar://problem/56268570
2020-04-10 20:10:24 +02:00
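
The fix in commit 1de19a1b32 above amounts to choosing the merge direction by size. Here is a minimal sketch, assuming a block is just an ordered sequence of instruction names; `Block`, `MergeResult`, and `mergeBlocks` are hypothetical and do not mirror the SIL data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

using Block = std::deque<std::string>;

struct MergeResult {
  Block merged;      // pred's instructions followed by succ's
  std::size_t moved; // how many instructions had to be moved
};

// Merge `pred` with its unique successor `succ`, moving whichever block is
// smaller into the larger one.
MergeResult mergeBlocks(Block pred, Block succ) {
  if (succ.size() <= pred.size()) {
    std::size_t moved = succ.size();
    for (auto &inst : succ)
      pred.push_back(std::move(inst)); // append succ to the end of pred
    return {std::move(pred), moved};
  }
  std::size_t moved = pred.size();
  for (auto it = pred.rbegin(); it != pred.rend(); ++it)
    succ.push_front(std::move(*it)); // prepend pred, preserving its order
  return {std::move(succ), moved};
}
```

When a long chain of small blocks is folded into one growing block, always moving the growing block costs a quadratic number of moves in total, while moving the small side keeps each merge proportional to the small block.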
Slava Pestov
9ec80df97e SIL: Remove curried SILDeclRefs 2020-03-19 02:20:21 -04:00
Andrew Trick
38c29e231e Generalize and fix SinkAddressProjections.
Fixes a potential real bug in the case that SinkAddressProjections moves
projections without notifying SimplifyCFG of the change. This could
fail to update Analyses (probably won't break anything in practice).

Introduce SILInstruction::isPure. Among other things, this can tell
you if it's safe to duplicate instructions at their
uses. SinkAddressProjections should check this before sinking uses. I
couldn't find a way to expose this as a real bug, but it is a
theoretical bug.

Add the SinkAddressProjections functionality to the BasicBlockCloner
utility. Enable address projection sinking for all BasicBlockCloner
clients (the four different kinds of jump-threading that use it). This
brings the compiler much closer to banning all address phis.

The "bugs" were originally introduced a week ago here:

commit f22371bf0b (fork/fix-address-phi, fix-address-phi)
Author: Andrew Trick <atrick@apple.com>
Date:   Tue Sep 17 16:45:51 2019

    Add SIL SinkAddressProjections utility to avoid address phis.

    Enable this utility during jump-threading in SimplifyCFG.

    Ultimately, the SIL verifier should prevent all address-phis and we'll
    need to use this utility in a few more places.

    Fixes <rdar://problem/55320867> SIL verification failed: Unknown
    formal access pattern: storage
2019-11-14 16:11:00 -08:00
Andrew Trick
71523642ce Fix logic related to isTriviallyDuplicatable.
In SILInstruction::isTriviallyDuplicatable():

- Make deallocating instructions trivially duplicatable. They are by
  any useful definition--duplicating an instruction does not imply
  reordering it. Tail duplication was already treating deallocations
  as duplicatable, but doing it inconsistently. Sometimes it checks
  isTriviallyDuplicatable, and sometimes it doesn't, which appears to
  have been an accident. Disallowing duplication of deallocations will
  cause severe performance regressions. Instead, consistently allow
  them to be duplicated, making tail duplication more powerful, which
  could expose other bugs.

- Do not duplicate on-stack AllocRefInst (without special
  consideration). This is a correctness fix that apparently was never
  exposed.

Fix SILLoop::canDuplicate():

- Handle isDeallocatingStack. It's not clear how we were avoiding an
  assertion before when a stack allocatable reference was confined to
  a loop--probably just by luck.

- Handle begin/end_access inside a loop. This is extremely important
  and probably prevented many loop optimizations from working with
  exclusivity.

Update LoopRotate canDuplicateOrMoveToPreheader(). This is NFC.
2019-11-13 18:39:23 -08:00
Arnold Schwaighofer
8aaa7b4dc1 SILOptimizer: Pipe through TypeExpansionContext 2019-11-11 14:21:52 -08:00
Jordan Rose
171ff440fc Remove swift::reversed in favor of llvm::reverse (#27610)
The former predates the latter, but we don't need it anymore! The
latter has more features anyway.

No functionality change.
2019-10-10 17:16:09 -07:00
Andrew Trick
bddc69c8a6 Organize SILOptimizer/Utils headers. Remove Local.h.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.

New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h

Removed:
- Local.h
- Two conflicting CFG.h files

This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.

Move the control flow utilities out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.

Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.

Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
2019-10-02 11:34:54 -07:00
Michael Gottesman
aa00865715 [simplify-cfg] Add a visitedBlocks set to hasSameUltimateSuccessor to prevent infinite loops.
Previously, we were not properly handling blocks that we could visit multiple
times. In this commit, I added a SmallPtrSet to ensure that we handle all of the
same cases that we handled previously.

The key reason that we want to follow this approach rather than something else
is that the previous algorithm on purpose allowed for side-entrances from other
checks since often times when we have multiple checks, all of the .none branches
funnel together into a single ultimate block.

This can be seen by the need of this code to support the test two_chained_calls
in simplify_switch_enum_objc.sil.

rdar://55861081
2019-09-30 17:03:06 -07:00
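
The visited-set technique from commit aa00865715 above can be shown on a toy graph. This sketch assumes each block has at most one successor and uses std::unordered_set where the commit uses SmallPtrSet; `findUltimateSuccessor` and `sameUltimateSuccessor` are hypothetical helpers, not the function named in the commit.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// succ[b] is the single successor of block b, or -1 if b has none.
// Follows the chain from `start` and returns the last block reached,
// or -1 if the chain runs into a cycle.
int findUltimateSuccessor(const std::vector<int> &succ, int start) {
  std::unordered_set<int> visited;
  int current = start;
  while (succ[current] != -1) {
    if (!visited.insert(current).second)
      return -1; // already seen: without the set this loop never ends
    current = succ[current];
  }
  return current;
}

// Two blocks agree if both chains end in the same block. Chains are allowed
// to join along the way, which is the "side entrance" case.
bool sameUltimateSuccessor(const std::vector<int> &succ, int a, int b) {
  int endA = findUltimateSuccessor(succ, a);
  int endB = findUltimateSuccessor(succ, b);
  return endA != -1 && endA == endB;
}
```

The set is local to each walk, so a block reached from two different starting points is still handled; only revisiting a block within the same walk is treated as a cycle.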
Michael Gottesman
fab7752983 [simplify-cfg] Add option sil-simplify-cfg-simplify-unconditional-branches for testing purposes
I am going to use this in a subsequent commit to make sure we do not infinite
loop upon a test case.

rdar://55861081
2019-09-30 17:03:06 -07:00
Andrew Trick
f22371bf0b Add SIL SinkAddressProjections utility to avoid address phis.
Enable this utility during jump-threading in SimplifyCFG.

Ultimately, the SIL verifier should prevent all address-phis and we'll
need to use this utility in a few more places.

Fixes <rdar://problem/55320867> SIL verification failed: Unknown
formal access pattern: storage
2019-09-18 18:02:59 -07:00