When promoting an `alloc_stack [lexical]` that is "owned" (i.e. not a
store_borrow location), omit lexical lifetimes (represented with
`move_value [lexical]` instructions) if the value being stored is
already lexical. Such moves are redundant.
rdar://99160718
This patch replaces the stateful generation of SILScope information in
SILGenFunction with data derived from the ASTScope hierarchy, which should be
100% in sync with the scopes needed for local variables. The goal is to
eliminate the surprising effects that the stack of cleanup operations can have
on the current state of SILBuilder leading to a fully deterministic (in the
sense of: predictible by a human) association of SILDebugScopes with
SILInstructions. The patch also eliminates the need to many workarounds. There
are still some accomodations for several Sema transformation passes such as
ResultBuilders, which don't correctly update the source locations when moving
around nodes. If these were implemented as macros, this problem would disappear.
This necessary rewrite of the macro scope handling included in this patch also
adds proper support nested macro expansions.
This fixes
rdar://88274783
and either fixes or at least partially addresses the following:
rdar://89252827
rdar://105186946
rdar://105757810
rdar://105997826
rdar://105102288
Previously, when blocks were added to the worklist, only blocks which
were users of the `alloc_stack` instruction were considered. For
"guaranteed alloc_stacks" (`store_borrow` locations), that resulted in
not processing blocks which contained uses of the `store_borrow` but not
the `alloc_stack`. When such a user was an `end_borrow`, the effect was
that no `end_borrow` was created for the newly introduced
`begin_borrow [lexical]`.
Fix this by adding blocks with users of the `store_borrow` to the
worklist.
Inline-always should only be used on relatively small functions. It must not be used on recursive functions.
Add a check that prevents that inlining of large @inline(__always) functions.
https://github.com/apple/swift/issues/64319
rdar://106655649
Currently, CopyPropagation only canonicalizes defs that are "canonical",
that is, the root of the copy_value tree. When that canonical
def is lexical, however, the canonicalization respects deinit barriers.
But copies of lexical values are not themselves lexical, so their
lifetimes can be shortened without respect to deinit barriers.
Here, immediate copies of lexical values are canonicalized before the
lexical values themselves are.
rdar://107197935
Previously, the utility bailed out on lexical lifetimes because it
didn't respect deinit barriers. Here, deinit barriers are found and
added to liveness if the value is lexical. This enables copies to be
propagated without hoisting destroys over deinit barriers.
rdar://104630103
If a move_value is determined to be redundant and removed, take the
opportunity its removal makes available to canonicalize the moved-from
value without the obstruction of the move_value.
Add a run of ComputeSideEffects before the first run of CopyPropagation.
Allow hoisting over applies of functions that are able to be analyzed
not to be deinit barriers at this early point.
SemanticARCOpts already eliminates move values that are redundant that
block its optimizations. But it's always run after CopyPropagation.
Because move_values divide copy-extended lifetimes, move_values obstruct
lifetime canonicalization. If a move_value isn't separating lifetimes
with different characteristics (specifically: lexicallity, escaping),
then it is only obstructing lifetime canonicalization. Remove it
before canonicalizing the lifetime of the moved-from value.
* [Executors][Distributed] custom executors for distributed actor
* harden ordering guarantees of synthesised fields
* the issue was that a non-default actor must implement the is remote check differently
* NonDefaultDistributedActor to complete support and remote flag handling
* invoke nonDefaultDistributedActorInitialize when necessary in SILGen
* refactor inline assertion into method
* cleanup
* [Executors][Distributed] Update module version for NonDefaultDistributedActor
* Minor docs cleanup
* we solved those fixme's
* add mangling test for non-def-dist-actor
Currently, memory locations whose type is empty (`SILType::isEmpty`) are
regarded as viable sources for loads.
Previously, though, Mem2Reg only handled loads from empty types formed
only by tupling. Here, support is added for types formed also by
struct'ing. As before, this entails recursively instantiating the empty
types until reaching the innermost empty types (which aggregate nothing)
and then aggregating the resulting instances.
rdar://106224845
Add an emergency exit to avoid bad compile time problems in rare corner cases.
The introduced limit is more than enough for "real world" code. Even large functions have < 100 locations.
But in some corner cases - especially in generated code - we can run into quadratic complexity for large functions without that limit.
Fixes a compile time problem.
Unfortunately I don't have isolated test cases for these problems.
rdar://106516360
For very large control flow graphs the markControllingTerminatorsLive can stack overflow.
Fix this by doing the work iteratively instead of recursively.
rdar://106198943
MandatoryGenericSpecializer inlines transparent functions that it
specializes.
Now that in OSSA `partial_apply [on_stack]`s are represented as owned
values rather than stack locations, it is possible for their destroys to
violate stack discipline. A direct lowering of the instructions to
non-OSSA would violate stack nesting.
Previously, when inlining during MandatoryGenericSpecializer, it was
assumed that the callee maintained stack discipline. And, when inlining
an OSSA function into a non-OSSA function, OSSA instructions were
lowered directly. The result was that stack discipline would be
violated when directly lowering callees with `partial_apply [on_stack]`s
that violate stack discipline.
Here, when MandatoryGenericSpecializer inlines a specialized generic
function in OSSA form into a function lowered out of OSSA form, stack
nesting is fixed up.
CSE inlines a portion of lazy property getters.
Now that in OSSA `partial_apply [on_stack]`s are represented as owned
values rather than stack locations, it is possible for their destroys to
violate stack discipline. A direct lowering of the instructions to
non-OSSA would violate stack nesting.
Previously, when inlining during CSE, it was assumed that the callee
maintained stack discipline. And, when inlining an OSSA function into a
non-OSSA function, OSSA instructions were lowered directly. The result
was that stack discipline could be violated when directly lowering
callees with `partial_apply [on_stack]`s that violate stack discipline
upon direct lowering.
Here, when CSE inlining a lazy property getter in OSSA form into a
function lowered out of OSSA form, stack nesting is fixed up.
Now that in OSSA `partial_apply [on_stack]`s are represented as owned
values rather than stack locations, it is possible for their destroys to
violate stack discipline. A direct lowering of the instructions to
non-OSSA would violate stack nesting.
Previously, when inlining, it was assumed that non-coroutine callees
maintained stack discipline. And, when inlining an OSSA function into a
non-OSSA function, OSSA instructions were lowered directly. The result
was that stack discipline could be violated.
Here, when inlining a function in OSSA form into a function lowered out
of OSSA form, stack nesting is fixed up.
Previously, there was an -Xllvm option to verify after all inlining to a
particlar caller. That makes it a chore to track down which apply's
inlining resulted in invalid code. Here, a new option is added that
verifies after each run of the inliner.
CSE relies on OSSA RAUW for lifetime extension when replacing a redundant instruction.
OSSA RAUW however does not handle lifetime extension for escaped base values.
Values escape from ownership via pointer escape, bitwise escape, forwarding unowned operations
and have none or unowned ownership. For all such values do not look through ownership instructions
while determining equality. It is possible to CSE such values with equivalent operands, because the
operand use guarantees lifetime of the base operand.
This is the first slice of bringing up escaping closure support. The support is
based around introducing a new type of SILGen VarLoc: a VarLoc with a box and
without a value. Because the VarLoc only has a box, we have to in SILGen always
eagerly reproject out the address from the box. The reason why I am doing this
is that it makes it easy for the move checker to distinguish in between
different accesses to the box that we want to check separately. As such every
time that we open the box, we insert a mark_must_check
[assignable_but_not_consumable] on that project. If allocbox_to_stack manages to
determine that the box can be stack allocated, we eliminate all of the
mark_must_check and place a new mark_must_check [consumable_and_assignable] on
the alloc_stack. The end result is that we get the old model that we had before
and also can support escaping closures.
ensure that we use consumable_to_assign.
What this patch is does is add an extra phase after alloc_box runs where we look
at uses of the alloc_stack and if we see any mark_must_check of any kind, we
delete them and rewrite a single mark_must_check [consumable_and_assignable] on
the alloc_stack and make all uses of the alloc_stack go through the
mark_must_check.
This has two effects:
1. In a subsequent PR when I add noncopyable semantics for escaping closures,
this will cause allocbox to stack to convert such boxes from having escaping
semantics to having non-escaping semantics. Escaping semantics means that we
always reproject out from the box and use mark_must_check
[assignable_but_not_consumable] (since we can't consume from the box, but can
assign to it). In contrast, non-escaping semantics means that the box becomes an
alloc_stack and we use the traditional var checker semantics. NOTE: We can do
this for lets represented as addresses and vars since the typechecker will
validate that the let is never actually written to even if at the SIL level we
would allow that.
2. In cases where we are implementing simple mark_must_check
[consumable_and_assignable] on one of the project_box and capture the box, we
used to have a problem where the direct box uses would be on the alloc_stack and
not go through the mark_must_check. Now, all uses after allocbox_to_stack occur
go through the mark_must_check. This is why I was able to remove instances of
the "compiler does not understand this pattern" errors... since the compiler
with this change can now understand them.