In the common case, we cannot have redundant @owned and @guaranteed phi
args. However, if the incoming values have none ownership and are passed
as an @owned or a @guaranteed phi arg, the phi arg can be considered
redundant.
For such a redundant @owned phi arg, create a copy_value and use the
copy to replace all uses of the redundant phi arg.
For redundant @guaranteed phi args, the optimization is currently
disabled. We cannot simply create a borrow scope and use the newly
borrowed value as the replacement; we would also have to rewrite the
consuming uses of the @guaranteed value so that the newly created borrow
scope stays within the scope of its operand.
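Below is a minimal sketch of the @owned case in the compiler's C++. The helper name replaceRedundantOwnedPhi is hypothetical and the ownership/builder calls only approximate the SIL APIs; the real pass additionally rewrites the incoming branches so the now-unused phi arg can be removed.

```cpp
#include "swift/SIL/SILArgument.h"
#include "swift/SIL/SILBuilder.h"

using namespace swift;

// Hypothetical helper: the phi arg is @owned, but its (identical) incoming
// value has None ownership, so the phi is redundant.
static bool replaceRedundantOwnedPhi(SILPhiArgument *phi,
                                     SILValue incomingValue) {
  if (phi->getOwnershipKind() != OwnershipKind::Owned ||
      incomingValue->getOwnershipKind() != OwnershipKind::None)
    return false;

  // Create a copy_value at the top of the phi's block and use the copy to
  // replace all uses of the redundant phi arg, so that consuming uses still
  // see an owned value.
  SILBasicBlock *phiBlock = phi->getParent();
  SILBuilderWithScope builder(&*phiBlock->begin());
  CopyValueInst *copy =
      builder.createCopyValue(phiBlock->begin()->getLoc(), incomingValue);
  phi->replaceAllUsesWith(copy);
  return true;
}
```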
If we know that we have a FunctionRefInst (and not another variant of FunctionRefBaseInst), we also know that getting the referenced function will never return null (in contrast to FunctionRefBaseInst::getReferencedFunctionOrNull).
NFC
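As a minimal illustration (the wrapper getCallee below is hypothetical and not part of the change):

```cpp
#include "swift/SIL/SILInstruction.h"

using namespace swift;

static SILFunction *getCallee(SILInstruction *inst) {
  if (auto *fri = dyn_cast<FunctionRefInst>(inst))
    return fri->getReferencedFunction();          // guaranteed non-null
  if (auto *frbi = dyn_cast<FunctionRefBaseInst>(inst))
    return frbi->getReferencedFunctionOrNull();   // may be null
  return nullptr;
}
```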
This PR simplifies the handling of store [assign] in Mem2Reg by always
converting it to a store [init].
It also fixes an edge case where a store [assign] in a non-entry block was
not the first store to an alloc_stack that has uses in multiple blocks.
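A rough sketch of the rewrite, assuming Mem2Reg already tracks the running value stored at the address and that the store's qualifier can be changed in place (the helper name and exact calls are approximations):

```cpp
#include "swift/SIL/SILBuilder.h"
#include "swift/SIL/SILInstruction.h"

using namespace swift;

// store [assign] both destroys the old value in memory and stores the new
// one. With the running value known, it can be expressed as an explicit
// destroy_value of the old value followed by a store [init].
static void convertAssignToInit(StoreInst *si, SILValue runningVal) {
  assert(si->getOwnershipQualifier() == StoreOwnershipQualifier::Assign);
  SILBuilderWithScope builder(si);
  builder.createDestroyValue(si->getLoc(), runningVal);
  si->setOwnershipQualifier(StoreOwnershipQualifier::Init);
}
```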
This is needed for non-OSSA SIL: if we see a destroyArray builtin, we can safely remove the otherwise dead array.
This optimization kicks in later in the pipeline, after array semantics calls have already been inlined (in early SIL, DeadObjectElimination can remove arrays based on their semantics).
rdar://73569282
https://bugs.swift.org/browse/SR-14100
A reborrow's base value may change if the base value is also passed as
a branch operand. Record reborrow dependencies so that we can mark
the base value as live if the reborrow is live. This ensures we
do not insert a destroy_value of the base value prematurely, before the
lifetime of the reborrow ends.
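A sketch of the bookkeeping (all names here are illustrative, not the pass's own data structures):

```cpp
#include "swift/SIL/SILValue.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SmallVector.h"

using namespace swift;

namespace {
// For every reborrow, remember the base values it depends on, so that
// marking the reborrow live also marks its base values live.
struct ReborrowDependencies {
  llvm::DenseMap<SILValue, llvm::SmallVector<SILValue, 2>> reborrowToBases;
  llvm::DenseSet<SILValue> liveValues;

  void recordDependency(SILValue reborrow, SILValue base) {
    reborrowToBases[reborrow].push_back(base);
  }

  // Propagate liveness from a reborrow to its base values so that no
  // destroy_value of a base is inserted before the reborrow's lifetime ends.
  void markLive(SILValue value) {
    if (!liveValues.insert(value).second)
      return;
    auto it = reborrowToBases.find(value);
    if (it == reborrowToBases.end())
      return;
    for (SILValue base : it->second)
      markLive(base);
  }
};
} // namespace
```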
Instead make `findJointPostDominatingSet` a stand-alone function.
There is no need to keep the temporary SmallVector alive across multiple calls of findJointPostDominatingSet just to re-use malloc'ed memory. The worklist usually contains far fewer elements than its small size.
My goal was to reduce the size of SILLocation. It now consists only of a storage union, which is basically a pointer plus a bitfield containing the Kind, StorageKind and flags. By far most locations are just a single pointer to an AST node. For the few cases where more data needs to be stored, that data is allocated separately with the SILModule's bump pointer allocator.
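A simplified model of that layout (not the exact declaration, just the idea):

```cpp
#include <cstdint>

class ASTNodeTy;         // stands in for the pointer union of AST nodes
struct ExtendedLocData;  // extra payload, bump-allocated in the SILModule

class SILLocationModel {
  union Storage {
    ASTNodeTy *astNode;        // the common case: one pointer
    ExtendedLocData *extended; // rare case: separately allocated data
  } storage;

  // C bitfields replace the old hand-written KindData bitfield.
  uint8_t kind : 3;
  uint8_t storageKind : 2;
  uint8_t flags : 3;
};
```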
While working on this, I couldn't resist doing a major refactoring to simplify the code:
* removed unused stuff
* The term "DebugLoc" was used for 3 completely different things:
- for `struct SILLocation::DebugLoc` -> renamed it to `FilePosition`
- for `hasDebugLoc()`/`getDebugSourceLoc()` -> renamed it to `hasASTNodeForDebugging()`/`getSourceLocForDebugging()`
- for `class SILDebugLocation` -> kept it as it is (though, `SILScopedLocation` would be a better name, IMO)
* made SILLocation more "functional", i.e. replaced some setters with corresponding constructors
* replaced the hand-written bitfield `KindData` with C bitfields
* updated and improved comments
The JumpThreadingCost map in SimplifyCFG is used to prevent infinite jump threading loops.
The cost was not updated for blocks which are cloned:
jump threading was prevented from infinitely cloning the original block, but not from re-cloning the cloned block.
A test case is already added in 8948f7565a
rdar://73357726, [SR-14068]
rdar://73644659 (Add a safeguard to SimplifyCFG tryJumpThreading to avoid infinite loop peeling)
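An illustrative sketch of the missing update (the helper name is hypothetical; the actual map and call sites live in SimplifyCFG):

```cpp
#include "swift/SIL/SILBasicBlock.h"
#include "llvm/ADT/DenseMap.h"

using namespace swift;

// The cost map caps how often a block may be jump-threaded. When a block is
// cloned, the clone must inherit the accumulated cost of the original block;
// otherwise the clone starts at zero and can itself be re-cloned forever.
static void recordJumpThreadingCost(
    llvm::DenseMap<SILBasicBlock *, unsigned> &jumpThreadingCost,
    SILBasicBlock *origBlock, SILBasicBlock *clonedBlock, unsigned cost) {
  unsigned total = (jumpThreadingCost[origBlock] += cost);
  // The missing piece: carry the cost over to the cloned block as well.
  jumpThreadingCost[clonedBlock] = total;
}
```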
A case of infinite loop peeling was exposed recently:
([SR-14068]: Compiling with optimisation runs indefinitely for grpc-swift)
It was trivially fixed here:
---
commit 8948f7565a (HEAD -> fix-simplifycfg-tramp, public/fix-simplifycfg-tramp)
Author: Andrew Trick <atrick@apple.com>
Date: Tue Jan 26 17:02:37 2021
Fix a SimplifyCFG typo that leads to unbounded optimization
---
However, that fix isn't a strong guarantee against this behavior. The
obvious complete fix is that jump-threading should not affect loop
structure. But changing that requires a performance investigation. In
the meantime this change introduces a simple mechanism that guarantees
that a loop header is not repeatedly cloned.
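A sketch of what such a safeguard can look like (the type and names below are hypothetical, not the exact code in tryJumpThreading):

```cpp
#include "swift/SIL/SILBasicBlock.h"
#include "llvm/ADT/SmallPtrSet.h"

using namespace swift;

namespace {
// Remember every loop header that jump threading has already cloned and
// refuse to thread into it again. This bounds loop peeling even when the
// cost heuristics fail.
struct JumpThreadingSafeguard {
  llvm::SmallPtrSet<SILBasicBlock *, 8> clonedLoopHeaders;

  bool mayCloneLoopHeader(SILBasicBlock *header) const {
    return !clonedLoopHeaders.count(header);
  }

  void noteClonedLoopHeader(SILBasicBlock *origHeader, SILBasicBlock *clone) {
    clonedLoopHeaders.insert(origHeader);
    // Treat the clone as a header too, so re-cloning the clone is rejected.
    clonedLoopHeaders.insert(clone);
  }
};
} // namespace
```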
This safeguard is worthwhile because jump-threading across loop
boundaries kicks in more frequently now that critical edges are
being split within SimplifyCFG.
Note that it is both necessary and desirable to split critical edges
between transformations so that SIL remains in a valid state. That
allows other code in SimplifyCFG to call arbitrary SIL utilities,
allows verifying SimplifyCFG by running verification between
transformations, and simplifies the patterns that SimplifyCFG itself
needs to consider.
Fixes rdar://73357726 ([SR-14068]: Compiling with optimisation runs
indefinitely for grpc-swift)
The root cause of this problem is that SimplifyCFG::tryJumpThreading
jump threads into loops, effectively peeling loops. This is not the
right way to implement loop peeling. That belongs in a loop
optimization pass. There is simply no sane way to control jump
threading if it is allowed across loop boundaries, both from the
standpoint of requiring optimizations to terminate and from the
standpoint of reducing senseless code bloat.
SimplifyCFG does have a mechanism to avoid jump-threading into loops in
most cases. That mechanism would actually prevent the infinite loop
peeling in this particular case if it were implemented correctly. But
the original implementation circa 2014 appears to have a typo.
This commit fixes that obvious bug. I do not think it is sufficient
to ensure we never see the bad behavior. I will file separate bugs for
the broader issue.
This bad behavior was exposed incidentally by splitting critical
edges. Without edge splitting, SimplifyCFG::simplifyBlocks only
performs "jump threading" once, creating a critical edge to the loop
header. Because simplifyBlocks works under the assumption that there
are no critical edges, it never attempts to perform jump threading
again. In other words, the presence of the critical edge "breaks" the
optimization, preventing it from continuing as intended.
With edge splitting, the simplifyBlocks worklist performs "jump
threading" followed by "jump to trampoline" removal, which creates a
new loop-back edge to the original loop header. This is fine. However,
simplifyBlocks iteratively attempts all optimizations up to a fixed
point and it does not stop at loop headers! So, splitting the critical
edge causes simplifyBlocks to work as intended, which leads to
infinite loop peeling. The end result is an infinite sequence of
nested loops. Each peeled iteration is actually within the parent
loop.