Thanks to the invariants of store_borrow, rewriting a store_borrow is a
simple matter of replacing its uses (other than end_borrows) with uses
of the storage of the underlying address-only value that was stored.
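Schematically (illustrative names; %v_addr is the storage that already
holds the stored value %v):
```
// Before rewriting: %v is an address-only opaque value.
%dest = alloc_stack $T
%sb = store_borrow %v to %dest : $*T
apply %f(%sb) : $@convention(thin) (@in_guaranteed T) -> ()
end_borrow %sb : $*T
dealloc_stack %dest : $*T

// After rewriting: the non-end_borrow use of %sb is replaced with a use
// of %v's own storage.
apply %f(%v_addr) : $@convention(thin) (@in_guaranteed T) -> ()
```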
When finding the PrunedLiveness on whose boundary end_borrows are to be
inserted, allow lifetime-ending uses if they are end_borrows. Such
end_borrows validly appear when an inner use is a nested begin_borrow.
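For example (a generic sketch, not taken from the pass's output), an
inner borrow's end_borrow is a lifetime-ending use that sits validly
inside the outer liveness region:
```
%outer = begin_borrow %v : $T
%inner = begin_borrow %outer : $T  // nested begin_borrow: a use of %outer
// ... uses of %inner ...
end_borrow %inner : $T             // lifetime-ending use of %inner, valid
                                   // within the liveness computed for %outer
end_borrow %outer : $T
```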
The begin_apply instruction introduces storage which is only valid until
the coroutine is ended via end_apply/abort_apply. Previously,
AddressLowering assumed the memory was valid "as long as it needed to
be": either as an argument to the function being lowered (and so valid
for the function's whole lifetime) or as a new alloc_stack (whose
lifetime is determined by AddressLowering itself). The result was that
the storage made available by the begin_apply was used after the
end_apply/abort_apply whenever [a copy of] [a projection of] the yield
had uses after the end_apply/abort_apply.
Here, the behavior is fixed. Now, the storage is used "as soon as
possible". There are two cases:
(1) If an indirect value's ownership is transferred into the
caller (i.e. its convention is `@in` or `@in_constant`), the value is
`copy_addr [take]`n into caller-local storage when lowering the
begin_apply.
(2) If an indirect value's ownership is only lent to the caller (i.e.
its convention is `@in_guaranteed`), no equivalent operation is
possible: there is no `copy_addr [borrow]`. On the other hand, none is
needed: uses of the value must occur within the coroutine's range.
Instead, it's necessary to disable the store-copy optimization in this
case: storage must be allocated for a copy of [a projection of] the
yielded value.
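A sketch of case (1), with illustrative names and a hypothetical
@yields @in coroutine %coro; the yield is taken into a caller-local
allocation immediately, so later uses never touch the coroutine's
storage:
```
%local = alloc_stack $T
(%yield, %token) = begin_apply %coro() : $@yield_once @convention(thin) () -> @yields @in T
copy_addr [take] %yield to [init] %local : $*T  // take ownership immediately
end_apply %token
// Later uses of [a copy of] [a projection of] the yield read %local,
// which remains valid after the coroutine's range ends.
destroy_addr %local : $*T
dealloc_stack %local : $*T
```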
It is valid for owned values not to be destroyed on paths terminating in
unreachable. Similarly, it is valid for storage not to be deallocated
on paths terminating in unreachable. However, it is _not_ valid for
storage to be deallocated before being deinitialized, even in paths
terminating in unreachable. Consequently, during AddressLowering,
dealloc_stacks must not be created in dead-end blocks: because the input
OSSA may not destroy an owned value, the output may not deinitialize
owned storage; so a dealloc_stack in an unreachable block could dealloc
storage which was not deinitialized.
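As an illustration (schematic, not pass output), the dead-end path may
leave the storage initialized, so it must not deallocate it either:
```
%s = alloc_stack $T
store %v to [init] %s : $*T
cond_br %c, bb1, bb2

bb1:                       // normal path: deinitialize, then deallocate
  destroy_addr %s : $*T
  dealloc_stack %s : $*T
  // ...

bb2:                       // dead-end path: the owned value need not be
  unreachable              // destroyed, so no dealloc_stack may be
                           // created either
```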
During storage allocation, create all alloc_stacks at the front of the
function entry block, except those for opened existential types. Do not
create any dealloc_stacks, even for opened existential types.
Then, after function rewriting, position the alloc_stacks according to
uses, except those for opened existential types. This means allocating
in the block that's the least common ancestor of all uses. At this
point, create dealloc_stacks on the dominance frontier of the
alloc_stacks, even for allocations for opened existential types.
The behavior for opened existential types diverges from all others in
order to maintain SIL validity (as much as possible) while the pass
runs. It would be invalid to create the alloc_stacks for opened
existential types at the front of the entry block because the type which
is opened is not yet available, but that type's opening must dominate
the corresponding alloc_stack.
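Schematically (the UUID is elided), the dependency that forces the
divergence:
```
// Invalid at the front of the entry block: the opened type does not
// exist there. The opening instruction must dominate the alloc_stack.
%opened = open_existential_addr immutable_access %ex : $*P to $*@opened("...") P
%tmp = alloc_stack $@opened("...") P
```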
On Windows CI builds, we have observed a failure to build the Swift
Standard Library after 10/27 (first noticed on 11/11 snapshot due to
other failures). The failure is an invalid `isa` cast where the
instance is a `nullptr`. The SILPipeline suggests that the
SILOptimizer might be the one triggering the incorrect use of `isa`.
The only use of `isa` introduced in that range in the
SILOptimizer/Mandatory region is this particular instance. Tighten the
assertion to ensure that `oper->getUser()` returns a non-`nullptr`
value.
Thanks to @gwynne for helping narrow down the search area.
`getValue` -> `value`
`getValueOr` -> `value_or`
`hasValue` -> `has_value`
`map` -> `transform`
The old API will be deprecated in the rebranch.
To avoid merge conflicts, use the new API already in the main branch.
rdar://102362022
When loading an argument, first check whether its type is trivial. If
so, produce a `load [trivial]`. Only then produce `load [take]`s or
`load_borrow`s. Fixes an issue where values passed @in_guaranteed were
`load [trivial]`'d and then `end_borrow`'d.
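A sketch of the distinction, with illustrative types:
```
// Trivial type: always load [trivial], regardless of convention.
%i = load [trivial] %int_addr : $*Int

// Non-trivial @in_guaranteed argument: borrow rather than take.
%k = load_borrow %klass_addr : $*Klass
// ... uses of %k ...
end_borrow %k : $Klass
```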
Previously, there were three places where unconditional_checked_cast
instructions were rewritten as unconditional_checked_cast_addr
instructions:
- from address-only
- to address-only
- neither
Here, all three are made to factor through the new
rewriteUnconditionalCheckedCast function.
When address lowering rewrites a return to use an indirect convention,
it creates a ParamDecl, using the function's decl context. Some
functions, such as default argument getters, don't have such contexts,
though. In such cases, fall back to the module as the decl context.
Previously, when lowering destructure_tuple and destructure_struct
instructions, either a load [trivial] or load [take] was always created
for each loadable field. When the operand to the destructure
instruction was @owned, this was the correct behavior; but when the
operand was @guaranteed, it was not. It would result in SIL like
```
(..., %elt_addr, ...) = destructure_ %addr
%value = load [take] %elt_addr
destroy_addr %addr
```
where (1) %elt_addr was destroyed twice (once by the load [take] and
once by the destroy_addr of the aggregate) and (2) the loaded value was
leaked.
Here, this is fixed by creating load_borrows for this case.
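In the same schematic form, the fixed SIL:
```
(..., %elt_addr, ...) = destructure_ %addr
%value = load_borrow %elt_addr
// ... uses of %value within the borrow scope ...
end_borrow %value
destroy_addr %addr
```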
When lowering an unconditional_checked_cast from an address-only value
to a loadable value, the value with which to replace the original is a
load of the destination address of the unconditional_checked_cast_addr
to which the instruction was lowered. In the case of non-trivial
values, that load must be a take.
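Schematically (illustrative names):
```
unconditional_checked_cast_addr T in %src_addr : $*T to U in %dest_addr : $*U
// %dest_addr holds the only copy of the non-trivial result, so the
// replacement value must take it rather than copy it:
%result = load [take] %dest_addr : $*U
```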
When rewriting a store, insert the copy_addr after it rather than before
it.
Previously, the copy_addr was created before the store:
```
alloc_stack %target_addr $Ty, var, name "something"
...
copy_addr [take] %source_addr to [init] %target_addr
store %instance to [init] %addr
```
When deleting the store, debug info salvaging may result in a
debug_value instruction being created before the store,
```
alloc_stack %target_addr $Ty, var, name "something"
...
copy_addr [take] %source_addr to [init] %target_addr
debug_value %source_addr : $*Ty, var, name "something"
store %instance to [init] %addr
```
using %source_addr. If the created copy_addr is a [take], this
results in the debug_value being a use-after-consume in the final SIL:
```
alloc_stack %target_addr $Ty, var, name "something"
...
copy_addr [take] %source_addr to [init] %target_addr
debug_value %source_addr : $*Ty, var, name "something"
```
Instead, the copy_addr is now created after the store:
```
alloc_stack %target_addr $Ty, var, name "something"
...
store %instance to [init] %addr
copy_addr [take] %source_addr to [init] %target_addr
```
So when the debug_value instruction is created before the store during
debug info salvaging
```
alloc_stack %target_addr $Ty, var, name "something"
...
debug_value %source_addr : $*Ty, var, name "something"
store %instance to [init] %addr
copy_addr [take] %source_addr to [init] %target_addr
```
it is also before the newly added copy_addr. The result is that when
the store is deleted, we have valid SIL:
```
alloc_stack %target_addr $Ty, var, name "something"
...
debug_value %source_addr : $*Ty, var, name "something"
copy_addr [take] %source_addr to [init] %target_addr
```
When a coroutine yields a value via an indirect convention, an address
must be yielded. For address-only types, AddressLowering was already
handling this. Here, support is added for all loadable, indirect
operands.
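One plausible shape of the callee-side rewrite, sketched for an owned
(@in) yield with illustrative names (%v_addr stands for an alloc_stack
created by the pass):
```
// Before: a loadable value is yielded with an indirect convention.
yield %v : $Klass, resume bb1, unwind bb2

// After: the value is spilled and its address is yielded instead.
store %v to [init] %v_addr : $*Klass
yield %v_addr : $*Klass, resume bb1, unwind bb2
```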
While visiting the function, record not only applies which have indirect
formal _results_ but also applies which have indirect formal _yields_.
When rewriting indirect applies, check the apply site kind and rewrite
according to it.
Taught ApplyRewriter to handle rewriting indirect loadable yields.
Made the function available at the top-level so that it can be called
outside of the UseRewriter; specifically, so that it can be called when
rewriting begin_apply yields.
All checked casts are emitted as checked_cast_br
instructions. More than just the instructions which produce or consume
an opaque value must be rewritten as checked_cast_addr_br
instructions. In particular, all those instructions for which
canIRGenUseScalarCheckedCastInstructions returns false must be rewritten
as checked_cast_addr_br instructions.
Record the instructions that need such rewriting while visiting values,
and then rewrite them near the end of rewriting.
All checked casts are emitted as unconditional_checked_cast
instructions. More than just the instructions which produce or consume
an opaque value must be rewritten as unconditional_checked_cast_addr
instructions. In particular, all those instructions for which
canIRGenUseScalarCheckedCastInstructions returns false must be rewritten
as unconditional_checked_cast_addr instructions.
Record the instructions that need such rewriting while visiting values,
and then rewrite them near the end of rewriting.
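For instance, a cast between two loadable types that IRGen cannot
perform as a scalar cast ends up routed through memory; a sketch with
illustrative names:
```
store %src to [init] %src_tmp : $*S
unconditional_checked_cast_addr S in %src_tmp : $*S to T in %dest_tmp : $*T
%result = load [take] %dest_tmp : $*T
```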
Instead of waiting until after rewriting everything else, rewrite them
as the terminator results they produce are encountered. This enables
forming projections in the correct locations.
During def rewriting, the def itself can be changed, for example to be a
"dummy" load. In such cases, uses of the new def need to be rewritten,
not uses of the original def.
When a block argument is a terminator result from a try_apply, use the
ApplyRewriter to convert the try_apply.
In the case where the result is stored into an enum, with this change,
the init_enum_data_addr instruction is created prior to the try_apply,
which is necessary in order for it to be passed as an argument to the
try_apply.
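Schematically (illustrative names and types):
```
// The payload address must exist before the try_apply that initializes it.
%payload_addr = init_enum_data_addr %enum_addr : $*Optional<T>, #Optional.some!enumelt
try_apply %f(%payload_addr) : $@convention(thin) () -> (@out T, @error Error), normal bb1, error bb2

bb1(%void : $()):
  inject_enum_addr %enum_addr : $*Optional<T>, #Optional.some!enumelt
  // ...
```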
If a switch_enum instruction (1) exhaustively handles all cases, there
is no default case or block corresponding to it. If (2) it handles all
cases but one, the default case corresponds to the unique unhandled
case. Otherwise, (3) the default case corresponds to all the unhandled
cases.
The first two scenarios were already handled by address lowering.
Here, handling is added for case (3). It is similar to what is already
done for rewriting cases, except that no unchecked_take_enum_data_addr
needs to be created and that the argument is always address-only (being
the same type as the operand of the switch_enum, which is only being
rewritten because it was address-only).
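A sketch of case (3), where the enum has more than one unhandled case
(illustrative):
```
// Before: the default block receives the enum value itself.
switch_enum %e : $E, case #E.a!enumelt: bb1, case #E.b!enumelt: bb2, default bb3

bb3(%unhandled : @owned $E):
  // ...

// After: no unchecked_take_enum_data_addr is created; the default block
// works directly with the operand's storage, of the same address-only
// type $E.
switch_enum_addr %e_addr : $*E, case #E.a!enumelt: bb1, case #E.b!enumelt: bb2, default bb3
```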
The filterDeadArgs function takes a list of dead argument indices
(ordered from least to greatest) and a list of original arguments, and
produces a list of the arguments excluding those at the dead indices.
It does that by iterating from 0 to size(originalArguments) - 1, adding
the original argument at each index to the list of new arguments, so
long as the index is not that of a dead argument. To avoid doing
lookups into a set, this relies on the dead argument indices being
ordered ascending. There is an iterator into the dead argument list
that is incremented only when the current index is dead.
When that iterator is at the end, dereferencing it just gives the size
of the array of dead arguments. So in the case where the first argument
is dead but no other arguments are, and there _are_ other arguments, the
first argument would be skipped, and the second argument's index would
be found to be equal to the dereferenced iterator (1).
Previously, there was no check that the iterator was not at the end.
The result was that the second argument was not added to the new list,
tripping an assertion failure.
Here, it is checked that the iterator is not at the end.
When rewriting uses, it is possible for new uses of a value to be
created, as when a debug_value instruction is created when a store
instruction is deleted. Ensure that all uses are rewritten by adding
all uses to the worklist of uses after rewriting each use.
When casting via unchecked_bitwise_cast, if the destination type is
loadable, don't mark the value it produces as rewritten--that value is
not one that AddressLowering is tracking. Instead, replace its copy_value
uses with load [copy] uses of the address the rewritten instruction
produces.
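A sketch of the result (assuming, hypothetically, that the instruction
is rewritten to an unchecked_addr_cast):
```
// The opaque source's storage is reinterpreted at the destination type.
%cast_addr = unchecked_addr_cast %src_addr : $*S to $*T

// A former copy_value of the cast's loadable result becomes a load [copy].
%copy = load [copy] %cast_addr : $*T
```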
Now that it can be called on partial_apply instructions, the name
insertAfterFullEvaluation no longer describes what the function does.
One could imagine a function that inserted after the applies of
(non-escaping) partial_applies.
Before iterating over an instruction's uses and deleting each, cache
the list of uses. Otherwise, the loop stops after the first use when it
is deleted (and has its NextUse field cleared).