Now that in OSSA `partial_apply [on_stack]`s are represented as owned
values rather than stack locations, it is possible for their destroys to
violate stack discipline. A direct lowering of the instructions to
non-OSSA would violate stack nesting.
Previously, when inlining, it was assumed that non-coroutine callees
maintained stack discipline. And, when inlining an OSSA function into a
non-OSSA function, OSSA instructions were lowered directly. The result
was that stack discipline could be violated.
Here, when inlining a function in OSSA form into a function lowered out
of OSSA form, stack nesting is fixed up.
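
As a rough illustration of what "stack discipline" means here, the sketch below checks that a linear sequence of allocation and deallocation events is properly nested. The `StackEvent` type and `obeysStackDiscipline` function are hypothetical names made up for this example. They are not part of the compiler, and the real fix-up works on a control-flow graph rather than a flat list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event model: each event pushes or pops a named stack
// allocation. Illustrative only; not a compiler API.
struct StackEvent {
  enum Kind { Alloc, Dealloc } kind;
  std::string name;
};

// Returns true if every Dealloc releases the most recently allocated,
// still-live location and nothing is left allocated at the end.
bool obeysStackDiscipline(const std::vector<StackEvent> &events) {
  std::vector<std::string> live;
  for (const StackEvent &e : events) {
    if (e.kind == StackEvent::Alloc) {
      live.push_back(e.name);
      continue;
    }
    if (live.empty() || live.back() != e.name)
      return false; // deallocated out of order
    live.pop_back();
  }
  return live.empty();
}
```

A sequence such as alloc a, alloc b, dealloc a, dealloc b is rejected by this check. That is the shape of violation that can arise when an owned closure value is destroyed before a stack location that was allocated after it.
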
Previously, there was an -Xllvm option to verify after all inlining to a
particular caller. That made it a chore to track down which apply's
inlining resulted in invalid code. Here, a new option is added that
verifies after each run of the inliner.
CSE relies on OSSA RAUW for lifetime extension when replacing a redundant instruction.
OSSA RAUW, however, does not handle lifetime extension for escaped base values.
Values escape from ownership via pointer escape, bitwise escape, or forwarding unowned operations,
or by having none or unowned ownership. For all such values, do not look through ownership instructions
while determining equality. It is still possible to CSE such values with equivalent operands, because the
operand use guarantees the lifetime of the base operand.
This is the first slice of bringing up escaping closure support. The support is
based around introducing a new type of SILGen VarLoc: a VarLoc with a box and
without a value. Because the VarLoc only has a box, SILGen always has to
eagerly reproject the address out of the box. The reason I am doing this
is that it makes it easy for the move checker to distinguish between
different accesses to the box that we want to check separately. As such, every
time that we open the box, we insert a mark_must_check
[assignable_but_not_consumable] on that project. If allocbox_to_stack manages to
determine that the box can be stack allocated, we eliminate all of the
mark_must_check and place a new mark_must_check [consumable_and_assignable] on
the alloc_stack. The end result is that we get the old model that we had before
and also can support escaping closures.
ensure that we use consumable_to_assign.
What this patch does is add an extra phase after allocbox_to_stack runs where we look
at uses of the alloc_stack and, if we see any mark_must_check of any kind, we
delete them, insert a single mark_must_check [consumable_and_assignable] on
the alloc_stack, and make all uses of the alloc_stack go through the
mark_must_check.
This has two effects:
1. In a subsequent PR when I add noncopyable semantics for escaping closures,
this will cause allocbox_to_stack to convert such boxes from having escaping
semantics to having non-escaping semantics. Escaping semantics means that we
always reproject out from the box and use mark_must_check
[assignable_but_not_consumable] (since we can't consume from the box, but can
assign to it). In contrast, non-escaping semantics means that the box becomes an
alloc_stack and we use the traditional var checker semantics. NOTE: We can do
this for lets represented as addresses and vars since the typechecker will
validate that the let is never actually written to even if at the SIL level we
would allow that.
2. In cases where we are implementing simple mark_must_check
[consumable_and_assignable] on one of the project_box and capture the box, we
used to have a problem where the direct box uses would be on the alloc_stack and
not go through the mark_must_check. Now, all uses after allocbox_to_stack runs
go through the mark_must_check. This is why I was able to remove instances of
the "compiler does not understand this pattern" errors: with this change, the
compiler can now understand them.
If the partial_apply is already [on_stack], then ClosureLifetimeFixup would
have already turned the captures into borrows and established their final
lifetimes.
This is needed to correctly maintain dependencies from an open-existential instruction to a `keypath` instruction which uses the opened type.
Fixes a SILVerifier crash.
rdar://105517521
Previously, LiveValues consisted always of three values: the value which
was stored, the borrow, and the copy. For store_borrows, there never
was a copy. Treating the two different scenarios as if they were the
same was confusing already. It would only get worse when we switch to
representing owned lexical lifetimes with move_values.
The type Optional<Ty *> implied that there was a meaningful distinction
between None and Some(nullptr) but that was not the case here. Replaced
it with a bare Ty *.
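
To make the point about the redundant state concrete, here is a small sketch that uses `std::optional` as a stand-in for the `Optional` type mentioned above. `Inst`, `hasCopyOptional`, and `hasCopyPointer` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for an instruction type.
struct Inst {
  int id;
};

// Three representable states: no value, a null pointer, and a valid
// pointer. Every caller has to decide what "present but null" means.
bool hasCopyOptional(std::optional<Inst *> copy) {
  return copy.has_value() && *copy != nullptr;
}

// Two states: null (absent) or a valid pointer.
bool hasCopyPointer(Inst *copy) { return copy != nullptr; }
```

When the empty optional and the null pointer are treated identically, as described above, the pointer form carries the same information with one fewer case to reason about.
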
No need to bind to the StoreBorrowInst to get the source because it's
already bound to the local variable `stored`. And no need to bind to a
more specific type to find the next instruction.
Extend the definition of isGuaranteedLexicalValue--by means of which
Mem2Reg determines whether to introduce a begin_borrow
[lexical] of a value which is store_borrow'd to an alloc_stack
[lexical]--to include every guaranteed lexical value.
Because lexical borrows are already avoided for store_borrows of lexical
values, the function is already misnamed: it's not that Mem2Reg should
necessarily _add_ a lexical lifetime, but rather that it should ensure
that there is one. Considering that we should do the same for owned
lexical values, the renaming will remain appropriate later. Finally,
name it so that it can switch from being a boolean to returning a
tristate (none, guaranteed, owned) when that becomes necessary (as it
will when we need to distinguish among the states to determine what phis
look like).
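
One possible shape for such a tristate is sketched below. All of the names here (`Ownership`, `Value`, `LexicalLifetime`, `lexicalLifetimeOf`, `hasLexicalLifetime`) are assumptions invented for the example. They are not the names used in Mem2Reg.

```cpp
#include <cassert>

// Hypothetical ownership model; not the compiler's real types.
enum class Ownership { None, Guaranteed, Owned };

struct Value {
  Ownership ownership;
  bool lexical;
};

// Hypothetical tristate result that could replace a boolean query.
enum class LexicalLifetime { None, Guaranteed, Owned };

LexicalLifetime lexicalLifetimeOf(const Value &v) {
  if (!v.lexical)
    return LexicalLifetime::None;
  switch (v.ownership) {
  case Ownership::Guaranteed:
    return LexicalLifetime::Guaranteed;
  case Ownership::Owned:
    return LexicalLifetime::Owned;
  case Ownership::None:
    break;
  }
  return LexicalLifetime::None;
}

// The boolean form falls out of the tristate while only a yes/no
// answer is needed.
bool hasLexicalLifetime(const Value &v) {
  return lexicalLifetimeOf(v) != LexicalLifetime::None;
}
```

Callers that only need a yes/no answer can keep using the boolean wrapper, while code that constructs phis could switch on the tristate directly.
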
Now that StackAllocationPromoter::initializationPoints maps to either a
StoreInst or a StoreBorrowInst, there is no longer a subtype of
SILInstruction * at which the BlockToInstMap could be specialized, so
just eliminate the template argument and erase some angle brackets.
- SILPackType carries whether the elements are stored directly
in the pack, which we're not currently using in the lowering,
but it's probably something we'll want in the final ABI.
Having this also makes it clear that we're doing the right
thing with substitution and element lowering. I also toyed
with making this a scalar type, which made it necessary in
various places, although eventually I pulled back to the
design where we always use packs as addresses.
- Pack boundaries are a core ABI concept, so the lowering has
to wrap parameter pack expansions up as packs. There are huge
unimplemented holes here where the abstraction pattern will
need to tell us how many elements to gather into the pack,
but a naive approach is good enough to get things off the
ground.
- Pack conventions are related to the existing parameter and
result conventions, but they're different on enough grounds
that they deserve to be separated.
Enables the outlining of the ContiguousArrayStorage<StaticString> used
when initializing a RawRepresentable enum whose RawValue is String into
a global value to continue even when ContiguousArrayStorage has a
lexical lifetime.
Addresses the following regressions:
StringEnumRawValueInitialization   400   7680   +1820.0%   **0.05x**
ArrayLiteral2                       78    647    +729.5%   **0.12x**
DataCreateSmallArray              1750   8850    +405.7%   **0.20x**
seen when enabling lexical lifetimes in the standard library.
Given an aggregate addr `%agg` with trivial subobject addr `%triv` and
nontrivial subobject addr `%sub` which `%triv` is projected from,
```
Aggregate <- %agg
  Subobject <- %sub
    Trivial <- %triv
    ...
  ...
```
after `%sub` is destroyed, `%triv` is no longer initialized. As a
result, it's not valid to fold a destroy_addr of `%agg` into a sequence of
`load [copy]`s and `copy_addr`s if there's an access to `%triv` after
the `load [copy]`/`copy_addr` of `%sub` (or some intermediate
subobject).
In other words, transforming
```
copy_addr %sub
load [trivial] %triv
destroy_addr %agg
```
into
```
copy_addr [take] %sub
load [trivial] %triv
```
is invalid.
During destroy_addr folding, prevent that from happening by keeping
track of the trivial fields that have already been visited. If a
trivial field is seen more than once, then bail on folding. This is
the same as what is done for non-trivial fields except that there's no
requirement that all trivial fields be destroyed.
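
The bookkeeping described above can be sketched on a simplified model in which an aggregate is flattened into leaf fields and each access covers a contiguous range of leaves. The `Access` type and `canFoldDestroy` function are hypothetical. The actual pass works on SIL projections, not index ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: an aggregate is a flat list of leaf fields, and
// each access touches a contiguous range [begin, end) of leaves.
struct Access {
  std::size_t begin, end;
};

// leafIsTrivial[i] says whether leaf i is trivial. Returns true if the
// accesses could be folded with a destroy of the whole aggregate.
bool canFoldDestroy(const std::vector<bool> &leafIsTrivial,
                    const std::vector<Access> &accesses) {
  std::vector<bool> seen(leafIsTrivial.size(), false);
  for (const Access &a : accesses) {
    for (std::size_t i = a.begin; i < a.end; ++i) {
      if (i >= seen.size() || seen[i])
        return false; // out of range, or a leaf (trivial or not) seen twice
      seen[i] = true;
    }
  }
  // Every non-trivial leaf must be covered; trivial leaves need not be.
  for (std::size_t i = 0; i < seen.size(); ++i)
    if (!leafIsTrivial[i] && !seen[i])
      return false;
  return true;
}
```

In this model, `%sub` would cover a non-trivial leaf and the trivial leaf `%triv`. Copying `%sub` and then loading `%triv` visits the trivial leaf twice, so folding is rejected, which matches the invalid transformation shown earlier.
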