Currently, memory locations whose type is empty (`SILType::isEmpty`) are
regarded as viable sources for loads.
Previously, though, Mem2Reg only handled loads from empty types formed
only by tupling. Here, support is added for types formed also by
struct'ing. As before, this entails recursively instantiating the empty
types until reaching the innermost empty types (which aggregate nothing)
and then aggregating the resulting instances.
rdar://106224845
Previously, LiveValues consisted always of three values: the value which
was stored, the borrow, and the copy. For store_borrows, there never
was a copy. Treating the two different scenarios as if they were the
same was confusing already. It was only get when we switch to
representing owned lexical lifetimes with move_values.
The type Optional<Ty *> implied that there was a meaningful distinction
between None and Some(nullptr) but that was not the case here. Replaced
it with a bare Ty *.
No need to bind to the StoreBorrowInst to get the source because it's
already bound to the local variable `stored`. And no need to bind to a
more specific type to find the next instruction.
Extend the definition of isGuaranteedLexicalValue--by means of which
Mem2Reg determines whether to borrow introducing a begin_borrow
[lexical] of a value which is store_borrow'd to an alloc_stack
[lexical]--to include every guaranteed lexical value.
Because lexical borrows are already avoided for store_borrows of lexical
values, the function is already misnamed: it's not that Mem2Reg should
necessarily _add_ a lexical lifetime, but rather that it should ensure
that there is one. Considering that we should do the same for owned
lexical values, the renaming will remain appropriate later. Finally,
name it so that it can switch from being a boolean to returning a
tristate (none, guaranteed, owned) when that becomes necessary (as it
will when we need to distinguish among the states to determine what phis
look like).
Now that StackAllocationPromoter::initializationPoints maps to either a
StoreInst or a StoreBorrowInst, there is no longer a subtype of
SILInstruction * at which the BlockToInstMap could be specialized, so
just eliminate the template argument and erase some angle brackets.
Add `deletableInstructions()` and `reverseDeletableInstructions()` in SILBasicBlock.
It allows deleting instructions while iterating over all instructions of the block.
This is a replacement for `InstructionDeleter::updatingRange()`.
It's a simpler implementation than the existing `UpdatingListIterator` and `UpdatingInstructionIteratorRegistry`, because it just needs to keep the prev/next pointers for "deleted" instructions instead of the iterator-registration machinery.
It's also safer, because it doesn't require to delete instructions via a specific instance of an InstructionDeleter (which can be missed easily).
`getValue` -> `value`
`getValueOr` -> `value_or`
`hasValue` -> `has_value`
`map` -> `transform`
The old API will be deprecated in the rebranch.
To avoid merge conflicts, use the new API already in the main branch.
rdar://102362022
Instead of checking if the end_borrow is ending the lifetime of the store_borrow of the asi under consideration,
this code was checking if the store_borrow source is the runningValue which is incorrect in cases where a store_borrow
src to another destination gets replaced during mem2reg. This PR fixes the issue.
With the change to include `SmallVector.h` directly in `LLVM.h` rather
than forward declaring in the only case it matters (ie. Clang <= 5),
these fixes are no longer needed. Since defaulted version is preferred
when there's no better choice (which is presumably the case if that's
how they were originally added), use it instead. Some uses were instead
changed to add `llvm::` so remove that too.
Already, load [take]s of struct_element_addr|tuple_element_addr
projections resulted in Mem2Reg bailing. Expand that to include load
[take]s involving unchecked_addr_cast.
To handle load [take]s of (struct|tuple)_element_addr projections, it
would be necessary to replace the running value with a value obtained
from the original product by recursive destructuring, replacing the
value at the load [take]n address with undef, and then restructuring.
To handle load [take]s of cast projections, it would be necessary to use
unchecked_value_cast instead of unchecked_bitwise_cast. But we would
need to still use unchecked_bitwise_cast in the case of load [copy]
because otherwise we would lose the original value--unchecked_value_cast
forwards ownership, and not all casts can be reversed (because they may
narrow).
For now, just bail out in the face of these complex load [take]s.
When optimizing an enum `store` to an `alloc_stack`, require that all uses are in the same block.
Otherwise it could be a `switch_enum` of an optional where the none-case does not have a destroy of the enum value.
After transforming such an `alloc_stack`, the value would leak in the none-case block.
It fixes the same OSSA verification error as done for TempRValueOpt in a previous commit.
Otherwise they will never compare the same. Also restore the test that was
updated for the previous commit to its old state. I did this to ensure that each
commit would compile successfully and so that I could show the test associated
with this commit.
The reason why I am doing this is that we are going to be enabling lexical
lifetimes early in the pipeline so that I can use it for the move operator's
diagnostics.
To make it easy for passes to know whether or not they should support lexical
lifetimes, I included a query on SILOptions called
supportsLexicalLifetimes. This will return true if the pass (given the passed in
option) should insert the lexical lifetime flag. This ensures that passes that
run in both pipelines (e.x.: AllocBoxToStack) know whether or not to set the
lexical lifetime flag without having to locally reason about it.
This is just chopping off layers of a larger patch I am upstreaming.
NOTE: This is technically NFC since it leaves the default alone of not inserting
lexical lifetimes at all.
Thanks to the lack of critical edges in SIL, if a block B dominated by P
has a successor S which is not dominated by P, then B must have only a
single successor. Used this fact to replace a loop over successors to a
call to getSinglePredecessor.
Also added an assertion that, in the notation above, B is dominated by
P.
Previously, it was asserted that any single-block allocation which had
valid memory after all instructions in the block were visited terminated
in unreachable. That assertion was false--while all paths through that
block must end in unreachable, the block itself need not be terminated
with an unreachable inst. Here, that is corrected by walking forward
from the block until blocks terminated by unreachables or blocks not
dominated by the block containing the alloc_stack are reached. In the
first case, the lifetime is ended just before the unreachable inst. In
the second, the lifetime is ended just before the branch to such a
successor block (i.e. just before the branch to a block which is not
dominated by the block containing the alloc_stack).
Previously, if it was determined that a proactive phi was unnecessary,
it was removed, along with the phis for the lifetime and the original
value of which the proactive phi was a copy. The uses of only one of
the three phis (namely, the proactive phi) was RAUW undef. In the case
where the only usage of the phi was to branch back to the block that
took the phi as an argument, that was a problem. Here, that is fixed by
giving all three phis the same treatment. To avoid duplicating code,
that treatment is pulled out into a new lambda.
This was exposed by adding lifetime versions of some OSSA versions of
mem2reg tests that had been missed previously.
Previously, the flag was a LangOptioins. That didn't make much sense because
this isn't really a user-facing behavior. More importantly, as a member
of that type type it couldn't be accessed when setting up pass
pipelines. Here, the flag is moved to SILOptions.
Previously, Mem2Reg would delete write-only alloc_stacks. That is
incorrect for lexical alloc_stacks--if a var was assigned but never
read, we still want to keep it alive for the duration of that var's
lexical scope.
Previously, the lexical borrow scopes that were introduced to replace
lexical stack allocations were tied to the uses of the stored value.
That meant that the borrow scope could be both too long and also too
short. It could be too long if there were uses of the value after the
dealloc stack and it would be too short if there were not uses of the
value while the value was still stored into the stack allocation.
Here, instead, the lexical borrow scopes are tied to the storage of a
value into the stack allocation. A value's lexical borrow scope begins
when the storage is initialied with the value; its lexical borrow scope
ends when the storage is deinitialized. That corresponds to the range
during which a var in the Swift source has a particular value assigned
to it.
Mem2Reg's implementation is split into a few steps:
(1) visiting the instructions in a basic block which contains all uses
of an alloc_stack
(2.a) visiting the instructions in each basic block which contains a use
of the alloc_stack
(2.b) adding phis
(2.c) using the last stored values as arguments to the new outgoing phi
arguments
(2.c) replacing initial uses of the storage with the new incoming phi
arguments
And here, (1) amounts to a special case of (2.a).
During (1) and (2.a):
(a) lexical borrow scopes are begun after store instructions for the
values that were stored
(b) when possible, lexical borrow scopes are ended before instructions
that deinitialize memory
- destroy_addr
- store [assign]
- load [take]
For (1), that is enough to create valid borrow scopes.
For (2), there are two complications:
(a) Borrow scopes that are begun may not be ended (when visiting a
single block's instructions).
For example, when visiting
bb1:
store %instance to [init] %addr
br bb2
a borrow scope is started after the store but cannot be ended.
(b) There may not be enough information available to end borrow scopes
when visiting instructions that would typically introduce borrow
scopes.
For example, when visiting
bb2:
%copy = load [copy] %addr
%instance = load [take] %addr
br bb3
there is not enough information available to end the borrow scope
that should be ended before the load [take]
To resolve these issues, both sorts of instructions are tracked. For
(a), in StackAllocationPromoter::initializationPoints. For (b), in
StackAllocationPromoter::deinitializationPoints. Finally, a new
step is added:
(2.d) StackAllocationPromoter::endLexicalLifetimes does a forward CFG
walk starting from the out-edge of each of the blocks which began
but did not end lexical lifetimes. At an out-edge, we do a check
regarding unreachables is done, and we may end the borrow scope.
Otherwise, the walk continues to the in-edge of each successor.
At an in-edge, we look for an instruction from (b) (in
unprecedentedDeinitializations) above. If one is found, then we
end the borrow scope before that instruction. Otherwise, the walk
continues to the out-edge of the block.
The ubuntu 18.04 Linux builder fails to build when the SmallVector does
not include the number of elements. This is a known issue and is
incorrect, but that is the version of clang on that OS.
The definition of SmallVector is already available transitively since
there are values of that type, I've just made it explicit.
llvm::SmallVector has a default parameter for the number of elements
while swift::SmallVector doesn't. I've gone ahead and made it explicitly
use the llvm::SmallVector to receive that.
SILGen turns vars into alloc_boxes. When possible, AllocBoxToStack
turns those into alloc_stacks. In order to preserve the lexical
lifetime of those vars, the alloc_stacks are annotated with the
[lexical] attribute. When Mem2Reg runs, it promotes the loads from
and stores into those alloc_stacks to uses of registers. In order to
preserve the lexical lifetime during that transformation, lexical borrow
scopes must be introduces that encode the same lifetime as the
alloc_stacks did. Here, that is done.