To avoid introducing new copies, which is illegal for move-only values,
don't rewrite `load [take]`s and `copy_addr [take]`s as `load [copy]`s
and `copy_addr`s, respectively, with new `destroy_addr`s after them.
Instead, get the effect of folding such a newly created `destroy_addr`
into the preceding rewritten `load [copy]` or `copy_addr`: leave the
`load [take]` or `copy_addr [take]` unmodified and add no subsequent
`destroy_addr`.
An example for each kind (`load [take]` and `copy_addr [take]`):
```
// Input 1 (`load [take]`)
copy_addr [take] %src to [init] %tmp
%val = load [take] %tmp
// Old Output 1
%val = load [copy] %src
destroy_addr %src
// New Output 1
%val = load [take] %src
```
```
// Input 2 (`copy_addr [take]`)
copy_addr [take] %src to [init] %tmp
copy_addr [take] %tmp to [init] %dst
// Old Output 2
copy_addr %src to [init] %dst
destroy_addr %src
// New Output 2
copy_addr [take] %src to [init] %dst
```
rdar://107839979
In preparation for "folding" an "inserted destroy" into a `load [copy]` or
`copy_addr`, rename the variable that indicates whether the `copyInst`'s
source must be deinitialized after its last "load".
When promoting an `alloc_stack [lexical]` that is "owned" (i.e., not a
`store_borrow` location), omit lexical lifetimes (represented with
`move_value [lexical]` instructions) if the value being stored is
already lexical. Such moves are redundant.
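A minimal sketch of the redundancy, with hypothetical values (`%arg`, `%s`):
```
// %v is already lexical:
%v = move_value [lexical] %arg : $Klass
%s = alloc_stack [lexical] $Klass
store %v to [init] %s : $*Klass

// Promoting %s without this check would wrap the stored value in
// another lexical move:
%m = move_value [lexical] %v : $Klass  // redundant: %v is already lexical

// With the check, %v is used directly and no move_value is emitted.
```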
rdar://99160718
This patch replaces the stateful generation of SILScope information in
SILGenFunction with data derived from the ASTScope hierarchy, which should be
100% in sync with the scopes needed for local variables. The goal is to
eliminate the surprising effects that the stack of cleanup operations can have
on the current state of SILBuilder, leading to a fully deterministic (in the
sense of: predictable by a human) association of SILDebugScopes with
SILInstructions. The patch also eliminates the need for many workarounds. There
are still some accommodations for several Sema transformation passes, such as
ResultBuilders, which don't correctly update source locations when moving
nodes around. If these were implemented as macros, this problem would disappear.
The necessary rewrite of the macro scope handling included in this patch also
adds proper support for nested macro expansions.
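A hedged sketch of the intended effect, with hypothetical file names, line
numbers, and scope IDs: each source-level scope gets a `sil_scope` entry
derived from the ASTScope tree, and every instruction's `scope` reference
follows the lexical structure of the source rather than SILGen's cleanup
state:
```
sil_scope 1 { loc "f.swift":1:6 parent @f : $@convention(thin) () -> () }
sil_scope 2 { loc "f.swift":2:3 parent 1 }   // nested `if` body

// An instruction emitted for a local variable inside the `if`:
debug_value %x : $Int, let, name "x", loc "f.swift":2:9, scope 2
```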
This fixes
rdar://88274783
and either fixes or at least partially addresses the following:
rdar://89252827
rdar://105186946
rdar://105757810
rdar://105997826
rdar://105102288
Previously, when blocks were added to the worklist, only blocks
containing users of the `alloc_stack` instruction were considered. For
"guaranteed alloc_stacks" (`store_borrow` locations), that resulted in
not processing blocks which contained uses of the `store_borrow` but not
of the `alloc_stack`. When such a user was an `end_borrow`, the effect
was that no `end_borrow` was created for the newly introduced
`begin_borrow [lexical]`.
Fix this by adding blocks with users of the `store_borrow` to the
worklist.
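A hedged sketch of the problematic shape (block structure hypothetical):
the `end_borrow` lives in a block that uses only the `store_borrow`'s
result, so seeding the worklist solely with users of the `alloc_stack`
never visits it:
```
bb0:
  %stack = alloc_stack $Klass
  %sb = store_borrow %val to %stack : $*Klass
  br bb1

bb1:                            // uses %sb but not %stack
  end_borrow %sb : $*Klass      // here an end_borrow must also be created
  br bb2                        // for the new begin_borrow [lexical]

bb2:
  dealloc_stack %stack : $*Klass
  // ...
```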
Inline-always should only be used on relatively small functions, and it must not be used on recursive functions.
Add a check that prevents the inlining of large `@inline(__always)` functions.
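A hedged sketch of how the attribute surfaces at the SIL level (function name
and signature hypothetical):
```
// Swift: @inline(__always) func bigRecursive(_ n: Int) -> Int { ... }
// In SIL the attribute becomes [always_inline]; the new check skips
// such functions when they are large (or recursive):
sil [always_inline] @bigRecursive : $@convention(thin) (Int) -> Int
```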
https://github.com/apple/swift/issues/64319
rdar://106655649
Currently, CopyPropagation only canonicalizes defs that are "canonical",
that is, the roots of `copy_value` trees. When such a canonical
def is lexical, canonicalization respects deinit barriers.
But copies of lexical values are not themselves lexical, so their
lifetimes can be shortened without regard to deinit barriers.
Here, immediate copies of lexical values are canonicalized before the
lexical values themselves are.
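A minimal sketch, with hypothetical values: `%copy` is not lexical even
though `%lex` is, so its lifetime may be shortened across deinit barriers,
and canonicalizing `%copy` first takes advantage of that:
```
%lex  = move_value [lexical] %v : $Klass
%copy = copy_value %lex : $Klass   // not itself lexical
%f = function_ref @barrier : $@convention(thin) () -> ()
%r = apply %f() : $@convention(thin) () -> ()   // deinit barrier
destroy_value %copy : $Klass   // may be hoisted above the barrier
destroy_value %lex : $Klass    // must stay below it
```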
rdar://107197935
Previously, the utility bailed out on lexical lifetimes because it
didn't respect deinit barriers. Here, deinit barriers are found and
added to liveness if the value is lexical. This enables copies to be
propagated without hoisting destroys over deinit barriers.
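A hedged sketch, with hypothetical functions `@use` and `@barrier`: with
deinit barriers added to the lexical value's liveness, the copy can be
propagated while the destroy stays put:
```
%lex = move_value [lexical] %v : $Klass
%c = copy_value %lex : $Klass
%f = function_ref @use : $@convention(thin) (@guaranteed Klass) -> ()
%r = apply %f(%c) : $@convention(thin) (@guaranteed Klass) -> ()
destroy_value %c : $Klass
%g = function_ref @barrier : $@convention(thin) () -> ()
%r2 = apply %g() : $@convention(thin) () -> ()  // found, added to liveness
destroy_value %lex : $Klass                     // stays below the barrier

// The copy can be propagated (%c replaced by %lex, the copy and its
// destroy deleted) without hoisting destroy_value %lex above @barrier.
```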
rdar://104630103
If a `move_value` is determined to be redundant and is removed, take the
opportunity to canonicalize the moved-from value without the obstruction
of the `move_value`.
Add a run of ComputeSideEffects before the first run of CopyPropagation.
This allows hoisting over applies of functions that can already be
analyzed, at this early point, not to be deinit barriers.
SemanticARCOpts already eliminates redundant `move_value`s that block its
optimizations, but it is always run after CopyPropagation.
Because `move_value`s divide copy-extended lifetimes, they obstruct
lifetime canonicalization. If a `move_value` isn't separating lifetimes
with different characteristics (specifically: lexicality, escaping),
then it is only obstructing lifetime canonicalization. Remove it
before canonicalizing the lifetime of the moved-from value.
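A minimal sketch, with hypothetical values: the move divides a
copy-extended lifetime into two halves with identical characteristics
(neither side is lexical or escaping), so it only blocks canonicalization:
```
%c = copy_value %v : $Klass
%m = move_value %c : $Klass   // redundant: same lexicality and
                              // escaping on both sides of the divide
// ... uses of %m ...
destroy_value %m : $Klass

// After removing the move, the canonicalizer sees a single
// copy-extended lifetime rooted at %v and can eliminate the copy.
```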
* [Executors][Distributed] custom executors for distributed actors
* harden ordering guarantees of synthesized fields
* the issue was that a non-default actor must implement the is-remote check differently
* add NonDefaultDistributedActor to complete support and remote-flag handling
* invoke nonDefaultDistributedActorInitialize when necessary in SILGen
* refactor inline assertion into a method
* cleanup
* [Executors][Distributed] Update module version for NonDefaultDistributedActor
* Minor docs cleanup
* we resolved those FIXMEs
* add mangling test for non-default distributed actors
Currently, memory locations whose type is empty (`SILType::isEmpty`) are
regarded as viable sources for loads.
Previously, though, Mem2Reg only handled loads from empty types formed
by tupling. Here, support is added for types formed by nesting structs
as well. As before, this entails recursively instantiating the empty
types down to the innermost empty types (which aggregate nothing)
and then aggregating the resulting instances.
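A hedged sketch of the new case, with hypothetical type names: `$Empty` is
empty via a nested struct and an empty tuple, so the promoted load is
replaced by recursively re-aggregating fresh empty values:
```
// Given:  struct Inner {}
//         struct Empty { var i: Inner; var t: () }

// Before: a load from the promoted stack location
%a = alloc_stack $Empty
%v = load [trivial] %a : $*Empty

// After: the load is replaced by re-aggregating the empty value
%i = struct $Inner ()
%t = tuple ()
%v = struct $Empty (%i : $Inner, %t : $())
```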
rdar://106224845
Add an emergency exit to avoid bad compile-time problems in rare corner cases.
The introduced limit is more than enough for "real world" code; even large functions have fewer than 100 locations.
But in some corner cases, especially in generated code, we can run into quadratic complexity for large functions without that limit.
Fixes a compile-time problem.
Unfortunately, I don't have isolated test cases for these problems.
rdar://106516360
For very large control flow graphs, `markControllingTerminatorsLive` can overflow the stack.
Fix this by doing the work iteratively instead of recursively.
rdar://106198943
MandatoryGenericSpecializer inlines transparent functions that it
specializes.
Now that in OSSA `partial_apply [on_stack]`s are represented as owned
values rather than stack locations, it is possible for their destroys to
violate stack discipline. A direct lowering of the instructions to
non-OSSA would violate stack nesting.
Previously, when inlining during MandatoryGenericSpecializer, it was
assumed that the callee maintained stack discipline. And when inlining
an OSSA function into a non-OSSA function, OSSA instructions were
lowered directly. The result was that stack discipline could be
violated when directly lowering callees containing such
`partial_apply [on_stack]`s.
Here, when MandatoryGenericSpecializer inlines a specialized generic
function in OSSA form into a function lowered out of OSSA form, stack
nesting is fixed up.
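A hedged sketch of the kind of callee body that needs the fix-up (names and
types hypothetical): in OSSA the `partial_apply [on_stack]` is an owned
value, so its destroy may be mis-nested relative to another stack
allocation, and a direct lowering of that destroy to a `dealloc_stack`
violates stack discipline:
```
%f  = function_ref @closure : $@convention(thin) (@guaranteed Klass) -> ()
%pa = partial_apply [callee_guaranteed] [on_stack] %f(%k) : $@convention(thin) (@guaranteed Klass) -> ()
%s  = alloc_stack $Int
destroy_value %pa : $@noescape @callee_guaranteed () -> ()  // legal in OSSA
dealloc_stack %s : $*Int

// Direct lowering would emit the deallocations out of order:
//   dealloc_stack %pa   <- pops %pa while %s is still live
//   dealloc_stack %s
```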
CSE inlines a portion of lazy property getters.
Now that in OSSA `partial_apply [on_stack]`s are represented as owned
values rather than stack locations, it is possible for their destroys to
violate stack discipline. A direct lowering of the instructions to
non-OSSA would violate stack nesting.
Previously, when inlining during CSE, it was assumed that the callee
maintained stack discipline. And when inlining an OSSA function into a
non-OSSA function, OSSA instructions were lowered directly. The result
was that stack discipline could be violated when directly lowering
callees containing such `partial_apply [on_stack]`s.
Here, when CSE inlines a lazy property getter in OSSA form into a
function lowered out of OSSA form, stack nesting is fixed up.
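Continuing the sketch above, the fix-up restores proper nesting after
lowering, e.g. by reordering the deallocations:
```
dealloc_stack %s : $*Int
dealloc_stack %pa : $@noescape @callee_guaranteed () -> ()
```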