Given an aggregate address `%agg` with a nontrivial subobject address
`%sub` from which a trivial subobject address `%triv` is projected,
```
Aggregate <- %agg
Subobject <- %sub
Trivial <- %triv
...
...
```
after `%sub` is destroyed, `%triv` is no longer initialized. As a
result, it's not valid to fold a destroy_addr of `%agg` into a sequence of
`load [copy]`s and `copy_addr`s if there's an access to `%triv` after
the `load [copy]`/`copy_addr` of `%sub` (or some intermediate
subobject).
In other words, transforming
```
copy_addr %sub
load [trivial] %triv
destroy_addr %agg
```
into
```
copy_addr [take] %sub
load [trivial] %triv
```
is invalid.
During destroy_addr folding, prevent that from happening by keeping
track of the trivial fields that have already been visited. If a
trivial field is seen more than once, then bail on folding. This is
the same as what is done for non-trivial fields except that there's no
requirement that all trivial fields be destroyed.
Pass a BasicCalleeAnalysis instance to isDeinitBarrier. This will
enable SSADestroyHoisting to hoist destroy_addrs over applies of
functions that are not themselves deinit barriers.
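For example (a sketch; `@pure_function` is a hypothetical function that
BasicCalleeAnalysis determines is not a deinit barrier):
```
%f = function_ref @pure_function : $@convention(thin) () -> ()
apply %f() : $@convention(thin) () -> ()  // not a deinit barrier
destroy_addr %addr : $*T                  // can be hoisted above the apply
```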
Added a new C++-to-Swift callback for isDeinitBarrier and passed it a
CalleeAnalysis so that it can depend on function effects. For now, the
argument is ignored, and all callers just pass nullptr.
Promoted to API the mayAccessPointer component predicate of
isDeinitBarrier, which needs to remain in C++. That predicate will also
depend on function effects. For that reason, it too is now passed a
BasicCalleeAnalysis and is moved into SILOptimizer.
Also added more conservative versions of isDeinitBarrier and
maySynchronize which never consider side effects.
Arguments whose lifetimes are not lexical should be hoisted without
respect to deinit barriers. On the other hand, inout arguments
explicitly annotated @_lexical should respect deinit barriers.
IterableBackwardReachability requires only an iterable list of gens.
ShrinkBorrowScope, LexicalDestroyHoisting, and SSADestroyHoisting all
need to check quickly whether a given instruction ends a scope. Use a
SmallSetVector rather than a SmallVector for the gens in all three.
Instead of doing one or two non-iterative BackwardReachability runs,
do a single run of IterativeBackwardReachability. During that, pause
after discovery/local dataflow and use VisitBarrierAccessScopes to
determine which end_access instructions in the discovered region are
barriers. Add those instructions as kills to the dataflow. Finally run
the global dataflow.
This enables SSADestroyHoisting to hoist destroys over loops.
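A sketch of the shape this enables (assuming the loop contains no
deinit barriers; `%addr` and `$T` are hypothetical):
```
bb1:                           // loop body; contains no deinit barriers
  cond_br %c, bb1, bb2
bb2:
  destroy_addr %addr : $*T     // can now be hoisted above the loop
```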
It also addresses a correctness issue: access scopes that were open at
barrier blocks were not promoted to barriers, resulting in destroy_addrs
getting hoisted into unrelated access scopes.
Make loads and copy_addrs whose sources are casts of the underlying
storage into barriers to folding. Destroying the target address of the
cast may not be equivalent to destroying the source address: for
example, if the target address has generic type and the source address
is AnyObject, specialization may turn the generic type into a trivial
type; destroying that trivial type fails to destroy the original stored
AnyObject, resulting in a leak.
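A sketch of the problematic pattern (`%src`, `%dst`, and `$T` are
hypothetical):
```
%cast = unchecked_addr_cast %src : $*AnyObject to $*T
copy_addr %cast to [initialization] %dst : $*T
destroy_addr %src : $*AnyObject
```
Folding here would produce `copy_addr [take] %cast`, which destroys the
storage at type `$T`; if specialization makes `T` trivial, the stored
AnyObject is never released.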
Previously, destroy_addrs were folded into copy_addrs and load [copy]s
to produce copy_addr [take]s and load [take]s respectively, but only if
the source of the load/copy was exactly the address being destroyed.
Generalize that to a single-block sequence of copy_addrs and load
[copy]s of projections of the address being destroyed.
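For example (a sketch; `$S` and its fields are hypothetical):
```
%a = struct_element_addr %agg : $*S, #S.a
copy_addr %a to [initialization] %out : $*A
%b = struct_element_addr %agg : $*S, #S.b
%v = load [copy] %b : $*B
destroy_addr %agg : $*S
```
can now fold into
```
%a = struct_element_addr %agg : $*S, #S.a
copy_addr [take] %a to [initialization] %out : $*A
%b = struct_element_addr %agg : $*S, #S.b
%v = load [take] %b : $*B
```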
Only respect deinit barriers when lexical lifetimes are enabled. If
they aren't, hoist destroy_addrs of all addresses aggressively
regardless of whether doing so involves hoisting over deinit barriers.
Previously, SSADestroyHoisting was attempting to check whether an
unknown use of a variable was an address_to_pointer.
UniqueStorageUseVisitor, however, doesn't call back with that
instruction; instead, it adds the instruction's uses to the stack of
uses to visit. So we need to check whether the use's value was produced
by an address_to_pointer or, more generally, whether its type is a
BuiltinRawPointerType.
Previously, FindBarrierAccessScopes::checkReachablePhiBarrier was
looking not at the terminators of the predecessors but rather at the
terminator of the block itself. In cases where the current block's
terminator was in fact a barrier, that resulted in failing to hoist any
live-in access scopes.
Now that the data flow is no longer run twice, the consequence was
worse: in cases where the current block's terminator was a barrier but
no access scope was in play, no barrier would be added at all.
In order to determine which end_access instructions are barriers to
hoisting, a data flow which looks for access scopes containing barriers
is run. Those scopes that do contain barriers are added to a set. When
the second pass runs, the end_access instructions corresponding to
scopes in that set (i.e. the ends of scopes which contain barriers) are
treated as barriers.
In the common case where there are no barrier access scopes, though,
running two dataflows per variable is wasteful. Avoid that by just
checking whether we found any scopes that are barriers. If we didn't,
then we already visited all the barrier instructions and were told by
BackwardReachability which blocks had reachable ends and begins.
Tweaked the first data flow to record the barriers and the blocks in
DeinitBarriers. In DeinitBarriers::compute, if no access scopes that
are barriers were found, stop there. If any were found, clear what had
been recorded so far and run the second data flow.
In order to be able to clear everything, switched from using
BasicBlockSet and BasicBlockSetVector to SmallPtrSet<SILBasicBlock *>
and SmallPtrSetVector<SILBasicBlock *>.
As was done with store [init], transform instructions like
```
copy_addr %n to %m
```
into the sequence
```
destroy_addr %m
copy_addr %n to [initialization] %m
```
in order to create more opportunities for hoisting destroys.
If splitting doesn't result in the destroy actually being hoisted,
recombine the two instructions afterwards.
Previously, all arguments using the inout convention were hoisted
ignoring deinit barriers. That was incorrect because @inout_aliasable
addresses may be modified but are not accessed exclusively. Here,
that's fixed by only allowing arguments with the @inout convention to be
hoisted.
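For illustration (a sketch; `@callee` and the class type `$C` are
hypothetical):
```
sil [ossa] @callee : $@convention(thin) (@inout C, @inout_aliasable C) -> () {
bb0(%0 : $*C, %1 : $*C):
  // Destroys of %0 (@inout) may be hoisted ignoring deinit barriers;
  // destroys of %1 (@inout_aliasable) may not, because the address
  // is not accessed exclusively.
  ...
}
```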
If a load [copy] appears near the end of the scope protecting access to
another address and a destroy_addr of the loaded address appears
afterwards, don't fold the destroy into the scope. The reason is that
doing so could allow a deinit which previously executed outside the
exclusivity scope to subsequently execute within it.
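A sketch of the pattern that must not be folded (`%other` is an
unrelated address protected by the scope):
```
%access = begin_access [modify] [dynamic] %other : $*U
%val = load [copy] %addr : $*T
end_access %access : $*U
destroy_addr %addr : $*T
```
Folding would produce a `load [take]` inside the scope, allowing a
deinit that previously ran outside the exclusivity scope to run within
it.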
Before hoisting destroy_addrs, we split store [assign]s into
destroy_addrs and store [init]s. If those destroy_addrs could not be
hoisted, though, recombine the two back into a store [assign]. Doing so
avoids introducing extra ARC traffic.
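The split looks like this (a sketch):
```
store %val to [assign] %addr : $*T
```
becomes
```
destroy_addr %addr : $*T
store %val to [init] %addr : $*T
```
and the pair is recombined if the destroy_addr stays put.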
Added a second backward reachability data flow that determines whether
any open access scopes contain barriers. The end_access instructions
for those access scopes are themselves barriers.
If the destroy_addr's barrier is an end_access, try to fold with copies
or loads that occur inside the scope, so long as there are no barriers
between the destroy_addr and the instruction it is to be folded with.
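For example (a sketch; `%other` is the address the scope protects):
```
%access = begin_access [modify] [dynamic] %other : $*U
copy_addr %addr to [initialization] %out : $*T
end_access %access : $*U   // the barrier to hoisting
destroy_addr %addr : $*T
```
Provided no barrier stands between them, the copy_addr and destroy_addr
can fold into `copy_addr [take] %addr to [initialization] %out`.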
Extract the code for classifying instructions out of the one data flow
where it is currently used and into the DeinitBarriers type. This will
facilitate a second data flow that needs access to the same info, as
well as the addition of an isBarrier member function to DeinitBarriers
for use by folding.
In addition to hoisting destroy_addrs for alloc_stacks and function
arguments, also hoist begin_access [modify] instructions, starting from
the innermost scopes and proceeding outwards.
Previously, Reachability assumed that phis were not barriers. Here,
handling for barrier phis is added. To that end, a new delegate
callback `checkReachableBarrier(PhiValue)` is added. Before marking the
beginning of a block as reached (or any of its predecessors), check
whether each argument that is a phi is a barrier. If any is, then
reachability is done.
Implemented the new method in SSADestroyHoisting by splitting apart the
classification of an instruction from the work done in response to
visiting one. Then, when visiting a PhiValue, just check whether any of
the predecessors' terminators are classified as barriers. That way,
classifying those terminators as barriers doesn't also record them as
having been reached (which indeed they will not have been if any of the
terminators from which values flow into the phi is a barrier).
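A sketch of a barrier phi (`$C` is a hypothetical nontrivial type):
```
bb1:
  br bb3(%x : $C)          // this terminator feeds the phi
bb2:
  br bb3(%y : $C)
bb3(%phi : @owned $C):     // if %phi is a barrier, reachability stops
                           // without marking the terminators reached
```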
For trivial values, the pattern
```
%val = load [trivial] %addr
destroy_addr %addr
```
arises. Don't fold these two into
```
%val = load [take] %addr
```
because that isn't valid SIL for trivial types in OSSA.
Extract and rewrite the destroy hoisting algorithm originally from
CopyForwarding (in 2014).
This is now a lightweight utility for hoisting destroy_addr
instructions. Shrinking an object's memory lifetime can enable the
removal of copy_addr and other optimizations.
This is extremely low-overhead and can run at any optimization level
without dependency on any analysis.
This algorithm is:
- Incremental
- SSA-based
- Canonical
- Free from alias analysis
See file-level comments.
The immediate purpose is to specify and test the constraints
introduced by adding lexical variable lifetimes to SIL semantics. It
can be used as a template for end_borrow hoisting.
Ultimately, this utility can be invoked within any pass that needs to
optimize a particular uniquely identified address. It will be used to
remove much of the complexity from CopyForwarding.