This pass is only used for functions with performance annotations (@_noLocks, @_noAllocation).
It runs in the mandatory pipeline and specializes all function calls in performance-annotated functions and functions which are called from such functions.
In addition, the pass also does some other related optimizations: devirtualization, constant-folding Builtin.canBeClass, inlining of transparent functions and memory access optimizations.
This patch introduces a new stdlib function called _move:
```Swift
@_alwaysEmitIntoClient
@_transparent
@_semantics("lifetimemanagement.move")
public func _move<T>(_ value: __owned T) -> T {
#if $ExperimentalMoveOnly
Builtin.move(value)
#else
value
#endif
}
```
It is a first attempt at creating a "move" function for Swift, albeit a skeleton
one since we do not yet perform the "no use after move" analysis. But this at
least gets the skeleton into place so we can build the analysis on top of it
and churn the tree in a manageable way. Thus in its current incarnation, all it
does is take in an __owned +1 parameter and return it after moving it through
Builtin.move.
Given that we want to use an OSSA-based analysis for our "no use after move"
analysis and we do not have opaque values yet, we cannot support moving
generic values since they are address only. This has stymied us in the past from
creating this function. With the implementation in this PR, via a bit of
cleverness, we are now able to support this as a generic function over all
concrete types.
The trick is that when we transparent inline _move (to get the builtin), we
perform one level of specialization causing the inlined Builtin.move to be of a
loadable type. If after transparent inlining, we inline builtin "move" into a
context where it is still address only, we emit a diagnostic telling the user
that they applied move to a generic or existential and that this is not yet
supported.
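To make the two kinds of call sites concrete, here is a hedged Swift sketch. The names Klass, moveSketch, concreteCaller, and genericCaller are hypothetical; moveSketch is a plain identity stand-in, since the real _move lowers to Builtin.move only after transparent inlining:

```Swift
final class Klass {}

// Hypothetical stand-in with the same shape as _move; the real _move
// calls Builtin.move once the transparent inliner has exposed it.
func moveSketch<T>(_ value: T) -> T { value }

func concreteCaller(_ k: Klass) -> Klass {
  // T == Klass here: after one level of specialization the inlined
  // Builtin.move would see a loadable type, so this is accepted.
  moveSketch(k)
}

func genericCaller<T>(_ t: T) -> T {
  // T is still address only in this context, so the real _move is
  // diagnosed: "move() used on a generic or existential value".
  moveSketch(t)
}

print(type(of: concreteCaller(Klass())))  // Klass
```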
The reason why we are taking this approach is that we wish to use this to
implement a new (as yet unwritten) diagnostic pass that verifies that _move
(even for non-trivial copyable values) ends the lifetime of the value. This will
ensure that one can write the following code to reliably end the lifetime of a
let binding in Swift:
```Swift
let x = Klass()
let _ = _move(x)
// hypotheticalUse(x)
```
Without the diagnostic pass, if one were to write another hypothetical use of x
after the _move, the compiler would copy x to at least hypotheticalUse(x),
meaning the lifetime of x would not end at the _move, a contradiction.
So to implement this diagnostic pass, we want to use the OSSA infrastructure,
and that only works on objects! So how do we square this circle? By taking
advantage of the mandatory SIL optimizer pipeline! Specifically we take
advantage of the following:
1. Mandatory Inlining and Predictable Dead Allocation Elimination run before any
of the move only diagnostic passes that we run.
2. Mandatory Inlining is able to specialize a callee a single level when it
inlines code. One can take advantage of this, even at -Onone, to
monomorphize code.
and then note that _move is such a simple function that Predictable Dead
Allocation Elimination is able to eliminate without issue the extra alloc_stacks
that appear in the caller after inlining. So we (as the tests show) get SIL
that, for concrete types, looks exactly as if we had just run a move_value for
that specific type as an object, since we promote away the stores/loads in favor
of object operations when we eliminate the allocation.
In order to prevent any issue with this being used in a context where multiple
specializations may occur, I made the inliner emit a diagnostic if it inlines
_move into a function that applies it to an address-only value. The diagnostic
is emitted at the source location where the function call occurs so it is easy
to find, e.g.:
```
func addressOnlyMove<T>(t: T) -> T {
_move(t) // expected-error {{move() used on a generic or existential value}}
}
moveonly_builtin_generic_failure.swift:12:5: error: move() used on a generic or existential value
_move(t)
^
```
To eliminate any potential ABI impact, if someone calls _move in a way that
causes it to be used in a context where the transparent inliner will not inline
it, I taught IRGen that Builtin.move is equivalent to a take from src -> dst and
marked _move as always emit into client (AEIC). I also took advantage of the
feature flag I added in the previous commit in order to prevent any cond_fails
from exposing Builtin.move in the stdlib. If one does not pass in the flag
-enable-experimental-move-only then the function just returns the value without
calling Builtin.move, so we are safe.
rdar://83957028
Previously, when encountering a borrow of a guaranteed value, the
end_borrows of that reborrow were marked alive. Doing only that allows the
end_borrows of the outer borrow scope to be marked as dead. The result
is that uses of the reborrowed value (including its end_borrow) can
outstrip the outer borrow scope, which is illegal.
Here, the outer borrow scope's end_borrows are marked alive. To do
that, the originally borrowed values have to be identified via
findGuaranteedReferenceRoots.
Previously, it was asserted that any single-block allocation whose memory was
still valid after all instructions in the block were visited must be in a block
terminated by unreachable. That assertion was false: while all paths through
that block must end in unreachable, the block itself need not be terminated
with an unreachable inst. Here, that is corrected by walking forward
from the block until reaching blocks terminated by unreachables or blocks not
dominated by the block containing the alloc_stack. In the
first case, the lifetime is ended just before the unreachable inst. In
the second, the lifetime is ended just before the branch to such a
successor block (i.e. just before the branch to a block which is not
dominated by the block containing the alloc_stack).
Adds two new IRGen-level builtins (one for allocating, the other for deallocating), a stdlib shim function for enhanced stack-promotion heuristics, and the proposed public stdlib functions.
Previously, TempRValueElimination would peephole simple alloc_stacks,
even when they were lexical; here, they are left for Mem2Reg to properly
handle.
Previously, SemanticARCOpts would eliminate lexical begin_borrows,
incorrectly allowing the lifetime of the value borrowed by them to be
observably shortened. Here, those borrow scopes are not eliminated if
they are lexical.
Added an executable test that verifies that a local variable strongly
referencing a delegate object keeps that delegate alive through the call
to an object that weakly references the delegate and calls out to it.
Previously, if it was determined that a proactive phi was unnecessary,
it was removed, along with the phis for the lifetime and the original
value of which the proactive phi was a copy. The uses of only one of
the three phis (namely, the proactive phi) were RAUWed with undef. In the case
where the only use of the phi was to branch back to the block that
took the phi as an argument, that was a problem. Here, that is fixed by
giving all three phis the same treatment. To avoid duplicating code,
that treatment is pulled out into a new lambda.
This was exposed by adding lifetime versions of some OSSA versions of
mem2reg tests that had been missed previously.
If a phi argument is dead and reborrowing it was dependent on some
other value, that other value may itself have already been deleted. In
that case, the destroy_value would have been added just before the
terminator of the predecessors of the block which contained the dead
phi. So, when deciding where to insert the end_borrow, iterate backwards
from the end of the block, skipping the terminator and updating the
insertion point every time a destroy_value instruction is encountered,
until we hit an instruction with a different opcode. This ensures that
no matter how many destroy_values may have been added just before the
terminator, the end_borrow will precede them.
This commit just tweaks the preexisting logic that checked for this
condition. Specifically, the previous code didn't handle the case where
the block contains only a terminator and a destroy_value.
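The insertion-point walk can be modeled with a toy example (instruction names as strings; purely illustrative, not the pass's actual data structures):

```Swift
// Toy block: the two destroy_values were added just before the terminator.
let block = ["apply", "destroy_value", "destroy_value", "br"]

// Walk backwards from the end, skipping the terminator and stepping the
// insertion point over every trailing destroy_value.
var insertionIndex = block.count - 1        // just before the terminator
while insertionIndex > 0, block[insertionIndex - 1] == "destroy_value" {
  insertionIndex -= 1
}
print(insertionIndex)  // 1: the end_borrow precedes both destroy_values
```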
Prevent CSE from introducing useless copies, borrows, and
clones. Otherwise it will endlessly clone and re-cse the same
projections.
TODO: Most of these cases can be handled by GuaranteedOwnershipExtension
or extendOwnedLifetime without requiring any copies!
Previously, the flag was a LangOption. That didn't make much sense because
this isn't really user-facing behavior. More importantly, as a member
of that type it couldn't be accessed when setting up pass
pipelines. Here, the flag is moved to SILOptions.
Setup the API for use with SimplifyCFG first, so the OSSA RAUW utility
can be redesigned around it. The functionality is disabled because it
won't be testable until that's all in place.
Previously, Mem2Reg would delete write-only alloc_stacks. That is
incorrect for lexical alloc_stacks: if a var was assigned but never
read, we still want to keep it alive for the duration of that var's
lexical scope.
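For example (a hedged sketch; Resource is a hypothetical class whose deinit makes the lifetime observable):

```Swift
class Resource {
  deinit { print("resource released") }
}

func demo() {
  var r = Resource()   // write-only: assigned but never read
  r = Resource()       // first value released here; still never read
  // With lexical lifetimes, the second value's storage must stay alive
  // until the end of the var's lexical scope rather than being deleted.
  print("end of scope")
}
demo()
```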
Previously, the lexical borrow scopes that were introduced to replace
lexical stack allocations were tied to the uses of the stored value.
That meant that the borrow scope could be both too long and also too
short. It could be too long if there were uses of the value after the
dealloc stack and it would be too short if there were not uses of the
value while the value was still stored into the stack allocation.
Here, instead, the lexical borrow scopes are tied to the storage of a
value into the stack allocation. A value's lexical borrow scope begins
when the storage is initialized with the value; its lexical borrow scope
ends when the storage is deinitialized. That corresponds to the range
during which a var in the Swift source has a particular value assigned
to it.
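In source terms, the intended correspondence looks like this (hedged sketch; Value is a hypothetical class whose deinit marks the end of each assigned value's lifetime):

```Swift
class Value {
  let name: String
  init(_ name: String) { self.name = name }
  deinit { print("end of \(name)") }
}

func demo() {
  var v = Value("first")   // lexical scope for "first" begins at the init store
  v = Value("second")      // "first"'s scope ends at the assign; "second"'s begins
  _ = v
}                          // "second"'s scope ends when the storage is destroyed
demo()
```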
Mem2Reg's implementation is split into a few steps:
(1) visiting the instructions in a basic block which contains all uses
of an alloc_stack
(2.a) visiting the instructions in each basic block which contains a use
of the alloc_stack
(2.b) adding phis
(2.c) using the last stored values as arguments to the new outgoing phi
arguments
(2.d) replacing initial uses of the storage with the new incoming phi
arguments
And here, (1) amounts to a special case of (2.a).
During (1) and (2.a):
(a) lexical borrow scopes are begun after store instructions for the
values that were stored
(b) when possible, lexical borrow scopes are ended before instructions
that deinitialize memory
- destroy_addr
- store [assign]
- load [take]
For (1), that is enough to create valid borrow scopes.
For (2), there are two complications:
(a) Borrow scopes that are begun may not be ended (when visiting a
single block's instructions).
For example, when visiting
bb1:
store %instance to [init] %addr
br bb2
a borrow scope is started after the store but cannot be ended.
(b) There may not be enough information available to end borrow scopes
when visiting instructions that would typically introduce borrow
scopes.
For example, when visiting
bb2:
%copy = load [copy] %addr
%instance = load [take] %addr
br bb3
there is not enough information available to end the borrow scope
that should be ended before the load [take].
To resolve these issues, both sorts of instructions are tracked. For
(a), in StackAllocationPromoter::initializationPoints. For (b), in
StackAllocationPromoter::deinitializationPoints. Finally, a new
step is added:
(2.e) StackAllocationPromoter::endLexicalLifetimes does a forward CFG
walk starting from the out-edge of each of the blocks which began
but did not end lexical lifetimes. At an out-edge, a check
regarding unreachables is done, and we may end the borrow scope.
Otherwise, the walk continues to the in-edge of each successor.
At an in-edge, we look for an instruction from (b) (in
unprecedentedDeinitializations) above. If one is found, then we
end the borrow scope before that instruction. Otherwise, the walk
continues to the out-edge of the block.
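The endLexicalLifetimes walk can be modeled with a toy worklist traversal (a hypothetical minimal CFG as dictionaries, not the pass's actual types):

```Swift
// Toy CFG: block id -> successors, plus the set of blocks containing a
// deinitializing instruction recorded for case (b).
let successors: [Int: [Int]] = [0: [1, 2], 1: [3], 2: [3], 3: []]
let hasDeinit: Set<Int> = [3]

// Forward walk from the out-edge of block 0, which began a lifetime.
var worklist = successors[0] ?? []
var visited: Set<Int> = [0]
var scopeEndBlocks: [Int] = []
while let id = worklist.popLast() {
  guard visited.insert(id).inserted else { continue }
  if hasDeinit.contains(id) {
    scopeEndBlocks.append(id)          // end the borrow scope before it
  } else {
    worklist += successors[id] ?? []   // continue to each successor
  }
}
print(scopeEndBlocks.sorted())  // [3]
```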
The Ubuntu 18.04 Linux builder fails to build when the SmallVector does
not include the number of inline elements. Rejecting that is a known
clang issue, but that is the version of clang on that OS.
The definition of SmallVector is already available transitively since
there are values of that type, I've just made it explicit.
llvm::SmallVector has a default parameter for the number of inline
elements while swift::SmallVector doesn't. I've gone ahead and made the
code explicitly use llvm::SmallVector to get that default.
SILGen turns vars into alloc_boxes. When possible, AllocBoxToStack
turns those into alloc_stacks. In order to preserve the lexical
lifetime of those vars, the alloc_stacks are annotated with the
[lexical] attribute. When Mem2Reg runs, it promotes the loads from
and stores into those alloc_stacks to uses of registers. In order to
preserve the lexical lifetime during that transformation, lexical borrow
scopes must be introduced that encode the same lifetime as the
alloc_stacks did. Here, that is done.
When the -enable-experimental-lexical-lifetimes flag is enabled, the
alloc_stack instructions which the pass replaces alloc_box instructions
with have the lexical attribute.
Split AccessedStorage functionality in two pieces. This doesn't add
any new logic, it just allows utilities to make queries on the access
base. This is important for OSSA where we often need to find the
borrow scope or ownership root that contains an access.
To create OSSA terminator results, use:
- OwnershipForwardingTermInst::createResult(SILType, ValueOwnershipKind)
- SwitchEnumInst::createDefaultResult()
Add support for passing trivial values to nontrivial forwarding
ownership. This effectively converts None to Guaranteed ownership,
which is essential for handling ".none" enums as trivial values while
extracting a nontrivial payload with switch_enum. A copy is generated
if needed to convert back to Owned ownership.
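At the source level, this is the pattern being handled (hedged sketch; Payload and Box are hypothetical names):

```Swift
final class Payload { let x = 42 }

enum Box {
  case empty           // payload-less case: a trivial value in SIL
  case full(Payload)   // nontrivial, reference-counted payload
}

// switch_enum must forward the trivial .empty value (ownership None)
// alongside the borrowed .full payload, so None is converted to
// Guaranteed ownership for the forwarding.
func read(_ b: Box) -> Int {
  switch b {
  case .empty: return 0
  case .full(let p): return p.x
  }
}

print(read(.full(Payload())), read(.empty))  // 42 0
```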