This instruction is similar to a copy_addr except that it marks a move of an
address that has to be checked. In order to keep the memory lifetime verifier
happy, the semantics before the checker runs are the mark_unresolved_move_addr is
equivalent to copy_addr [init] (not copy_addr [take][init]).
The use of this instruction is that Mandatory Inlining converts builtin "move"
to a mark_unresolved_move_addr when inlining the function "_move" (the only
place said builtin is invoked).
This is then run through a special checker (that is later in this PR) that
either proves that the mark_unresolved_move_addr can actually be a move in which
case it converts it to copy_addr [take][init] or if it can not be a move, emit
an error and convert the instruction to a copy_addr [init]. After this is done
for all instructions, we loop back through again and emit an error on any
mark_unresolved_move_addr that were not processed earlier allowing for us to
know that we have completeness.
NOTE: The move kills checker for addresses is going to run after Mandatory
Inlining, but before predictable memory opts and friends.
The reason why I am doing this is that currently checked_cast_br of an AnyObject
may return a different value due to boxing. As an example, this can occur when
performing a checked_cast_br of a __SwiftValue(AnyHashable(Klass())).
To model this in SIL, I added to OwnershipForwardingMixin a bit of information
that states if the instruction directly forwards ownership with a default value
of true. In checked_cast_br's constructor though I specialize this behavior if
the source type is AnyObject and thus mark the checked_cast_br as not directly
forwarding its operand. If an OwnershipForwardingMixin directly forwards
ownership, one can assume that if it forwards ownership, it will always do so in
a way that ensures that each forwarded value is rc-identical to whatever result
it algebraically forwards ownership to. So in the context of checked_cast_br, it
means that the source value is rc-identical to the argument of the success and
default blocks.
I added a verifier check to CheckedCastBr that makes sure that it can never
forward guaranteed ownership and have a source type of AnyObject.
In SemanticARCOpts, I modified the code that builds extended live ranges of
owned values (*) to check if a OwnershipForwardingMixin is directly
forwarding. If it is not directly forwarding, then we treat the use just as an
unknown consuming use. This will then prevent us from converting such an owned
value to guaranteed appropriately in such a case.
I also in SILGen needed to change checked_cast_br emission in SILGenBuilder to
perform an ensurePlusOne on the input operand (converting it to +1 with an
appropriate cleanup) if the source type is an AnyObject. I found this via the
verifier check catching this behavior from SILGen when compiling libswift. This
just shows how important IR verification of invariants can be.
(*) For those unaware, SemanticARCOpts contains a model of an owned value and
all forwarding uses of it called an OwnershipLiveRange.
rdar://85710101
The key thing is that the move checker will not consider the explicit copy value
to be a copy_value that can be rewritten, ensuring that any uses of the result
of the explicit copy_value (consuming or other wise) are not checked.
Similar to the _move operator I recently introduced, this is a transparent
function so we can perform one level of specialization and thus at least be
generic over all concrete types.
This patch introduces a new stdlib function called _move:
```Swift
@_alwaysEmitIntoClient
@_transparent
@_semantics("lifetimemanagement.move")
public func _move<T>(_ value: __owned T) -> T {
#if $ExperimentalMoveOnly
Builtin.move(value)
#else
value
#endif
}
```
It is a first attempt at creating a "move" function for Swift, albeit a skleton
one since we do not yet perform the "no use after move" analysis. But this at
leasts gets the skeleton into place so we can built the analysis on top of it
and churn tree in a manageable way. Thus in its current incarnation, all it does
is take in an __owned +1 parameter and returns it after moving it through
Builtin.move.
Given that we want to use an OSSA based analysis for our "no use after move"
analysis and we do not have opaque values yet, we can not supporting moving
generic values since they are address only. This has stymied us in the past from
creating this function. With the implementation in this PR via a bit of
cleverness, we are now able to support this as a generic function over all
concrete types by being a little clever.
The trick is that when we transparent inline _move (to get the builtin), we
perform one level of specialization causing the inlined Builtin.move to be of a
loadable type. If after transparent inlining, we inline builtin "move" into a
context where it is still address only, we emit a diagnostic telling the user
that they applied move to a generic or existential and that this is not yet
supported.
The reason why we are taking this approach is that we wish to use this to
implement a new (as yet unwritten) diagnostic pass that verifies that _move
(even for non-trivial copyable values) ends the lifetime of the value. This will
ensure that one can write the following code to reliably end the lifetime of a
let binding in Swift:
```Swift
let x = Klass()
let _ = _move(x)
// hypotheticalUse(x)
```
Without the diagnostic pass, if one were to write another hypothetical use of x
after the _move, the compiler would copy x to at least hypotheticalUse(x)
meaning the lifetime of x would not end at the _move, =><=.
So to implement this diagnostic pass, we want to use the OSSA infrastructure and
that only works on objects! So how do we square this circle: by taking advantage
of the mandatory SIL optimzier pipeline! Specifically we take advantage of the
following:
1. Mandatory Inlining and Predictable Dead Allocation Elimination run before any
of the move only diagnostic passes that we run.
2. Mandatory Inlining is able to specialize a callee a single level when it
inlines code. One can take advantage of this to even at -Onone to
monomorphosize code.
and then note that _move is such a simple function that predictable dead
allocation elimination is able to without issue eliminate the extra alloc_stack
that appear in the caller after inlining without issue. So we (as the tests
show) get SIL that for concrete types looks exactly like we just had run a
move_value for that specific type as an object since we promote away the
stores/loads in favor of object operations when we eliminate the allocation.
In order to prevent any issue with this being used in a context where multiple
specializations may occur, I made the inliner emit a diagnostic if it inlines
_move into a function that applies it to an address only value. The diagnostic
is emitted at the source location where the function call occurs so it is easy
to find, e.x.:
```
func addressOnlyMove<T>(t: T) -> T {
_move(t) // expected-error {{move() used on a generic or existential value}}
}
moveonly_builtin_generic_failure.swift:12:5: error: move() used on a generic or existential value
_move(t)
^
```
To eliminate any potential ABI impact, if someone calls _move in a way that
causes it to be used in a context where the transparent inliner will not inline
it, I taught IRGen that Builtin.move is equivalent to a take from src -> dst and
marked _move as always emit into client (AEIC). I also took advantage of the
feature flag I added in the previous commit in order to prevent any cond_fails
from exposing Builtin.move in the stdlib. If one does not pass in the flag
-enable-experimental-move-only then the function just returns the value without
calling Builtin.move, so we are safe.
rdar://83957028
Adds two new IRGen-level builtins (one for allocating, the other for deallocating), a stdlib shim function for enhanced stack-promotion heuristics, and the proposed public stdlib functions.
This API should not handle is phis because the phi argument's ownedhip
is independent from the phi itself. The client assumes that the
returned root is in the same lifetime/scope of the access.
Simple convenience on top of findOwnershipReferenceRoot to handle
guaranteed values forwarded through struct/tuple extract.
Note: eventually this should be handled by a ReferenceRoot abstraction
that stores the component path.
Replaces AccessStorage::isGuaranteedForFunction().
OSSA utilities need to determine whether two addresses with the same
AccessStorage are directly substitutable without "fixing-up" the OSSA
lifetime. Checking whether the access base is within a local borrow
scope is the first step.
Split AccessedStorage functionality in two pieces. This doesn't add
any new logic, it just allows utilities to make queries on the access
base. This is important for OSSA where we often need to find the
borrow scope or ownership root that contains an access.
This patch replace all in-memory objects of DebugValueAddrInst with
DebugValueInst + op_deref, and duplicates logics that handles
DebugValueAddrInst with the latter. All related check in the tests
have been updated as well.
Note that this patch neither remove the DebugValueAddrInst class nor
remove `debug_value_addr` syntax in the test inputs.
It was originally convenient for exclusivity optimization to treat
boxes specially. We wanted to know that the 'Box' kind was always
uniquely identified. But that's not really important. And now that
AccessedStorage is being used more generally, the inconsistency is
problematic.
A consistent model is also must easier to understand and explain.
This also make the implementation of the utility simpler and more powerful.
Functional changes:
isRCIdentical will look through mark_dependence and mark_uninitialized.
findReferenceRoot is used consistently everywhere increasing analysis precision.
AccessPathWithBase::compute can return a valid access path with unidentified base.
In such cases, we cannot LICM stores, because there is no base address to check if it is invariant
AccessPathDefUseTraversal accumulates the offsets that it has seen on
the def-use walk while visiting index_addr and casts.
This is quite tricky because either the AccessPath being matched or
the def-use chain could contain unknown offsets. We need to correctly
classify all cases as either an exact, inner, or overlapping match.
For SIL like this
%91 = integer_literal $Builtin.Int64, 0
%113 = builtin "truncOrBitCast_Int64_Word"(%91 : $Builtin.Int64) : $Builtin.Word
%115 = index_addr %114 : $*MyStruct, %113 : $Builtin.Word
%119 = struct_element_addr %115 : $*MyStruct, #MyStruct.klass
We have Path: (@Unknown,#0)
There are two issues
(1) When AccessPathDefUseTraversal::checkAndUpdateOffset visits the
struct_element_addr, it fails to pop the unknown offset from the
expected path but continues trying to handle the struct_element_addr
and hits an assert. This PR fixes this.
(2) If the builtin "truncOrBitCast_Int64_Word" comes from a literal,
we should not treat it as an unknown offset (given that the literal is
within the size of a word). Added a test for this case, we should be able
to identify the access path accurately in this.
Change the code generation patterns for `async let` bindings to use an ABI based on the following
functions:
- `swift_asyncLet_begin`, which starts an `async let` child task, but which additionally
now associates the `async let` with a caller-owned buffer to receive the result of the task.
This is intended to allow the task to emplace its result in caller-owned memory, allowing the
child task to be deallocated after completion without invalidating the result buffer.
- `swift_asyncLet_get[_throwing]`, which replaces `swift_asyncLet_wait[_throwing]`. Instead of
returning a copy of the value, this entry point concerns itself with populating the local buffer.
If the buffer hasn't been populated, then it awaits completion of the task and emplaces the
result in the buffer; otherwise, it simply returns. The caller can then read the result out of
its owned memory. These entry points are intended to be used before every read from the
`async let` binding, after which point the local buffer is guaranteed to contain an initialized
value.
- `swift_asyncLet_finish`, which replaces `swift_asyncLet_end`. Unlike `_end`, this variant
is async and will suspend the parent task after cancelling the child to ensure it finishes
before cleaning up. The local buffer will also be deinitialized if necessary. This is intended
to be used on exit from an `async let` scope, to handle cleaning up the local buffer if necessary
as well as cancelling, awaiting, and deallocating the child task.
- `swift_asyncLet_consume[_throwing]`, which combines `get` and `finish`. This will await completion
of the task, leaving the result value in the result buffer (or propagating the error, if it
throws), while destroying and deallocating the child task. This is intended as an optimization
for reading `async let` variables that are read exactly once by their parent task.
To avoid an epoch break with existing swiftinterfaces and ABI clients, the old builtins and entry
points are kept intact for now, but SILGen now only generates code using the new interface.
This new interface fixes several issues with the old async let codegen, including use-after-free
crashes if the `async let` was never awaited, and the inability to read from an `async let` variable
more than once.
rdar://77855176
Rather than using group task options constructed from the Swift parts
of the _Concurrency library and passed through `createAsyncTask`'s
options, introduce a separate builtin that always takes a group. Move
the responsibility for creating the options structure into IRGen, so
we don't need to expose the TaskGroupTaskOptionRecord type in Swift.
Introduce a builtin `createAsyncTask` that maps to `swift_task_create`,
and use that for the non-group task creation operations based on the
task-creation flags. `swift_task_create` and the thin function version
`swift_task_create_f` go through the dynamically-replaceable
`swift_task_create_common`, where all of the task creation logic is
present.
While here, move copying of task locals and the initial scheduling of
the task into `swift_task_create_common`, enabling by separate flags.
When an instruction is "deleted" from the SIL, it is put into the SILModule::scheduledForDeletion list.
The instructions in this list are eventually deleted for real in SILModule::flushDeletedInsts(), which is called by the pass manager after each pass run.
In other words: instruction deletion is deferred to the end of a pass.
This avoids dangling instruction pointers within the run of a pass and in analysis caches.
Note that the analysis invalidation mechanism ensures that analysis caches are invalidated before flushDeletedInsts().
Through various means, it is possible for a synchronous actor-isolated
function to escape to another concurrency domain and be called from
outside the actor. The problem existed previously, but has become far
easier to trigger now that `@escaping` closures and local functions
can be actor-isolated.
Introduce runtime detection of such data races, where a synchronous
actor-isolated function ends up being called from the wrong executor.
Do this by emitting an executor check in actor-isolated synchronous
functions, where we query the executor in thread-local storage and
ensure that it is what we expect. If it isn't, the runtime complains.
The runtime's complaints can be controlled with the environment
variable `SWIFT_UNEXPECTED_EXECUTOR_LOG_LEVEL`:
0 - disable checking
1 - warn when a data race is detected
2 - error and abort when a data race is detected
At an implementation level, this introduces a new concurrency runtime
entry point `_checkExpectedExecutor` that checks the given executor
(on which the function should always have been called) against the
executor on which is called (which is in thread-local storage). There
is a special carve-out here for `@MainActor` code, where we check
against the OS's notion of "main thread" as well, so that `@MainActor`
code can be called via (e.g.) the Dispatch library's
`DispatchQueue.main.async`.
The new SIL instruction `extract_executor` performs the lowering of an
actor down to its executor, which is implicit in the `hop_to_executor`
instruction. Extend the LowerHopToExecutor pass to perform said
lowering.
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
Since it is an always take, we know that the original value will always be
invalidated by the checked_cast_addr_br.
This also lets me use this to recognize simple cases of checked casts in
opt-remark-gen.
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
In their previous form, the non-`_f` variants of these entry points were unused, and IRGen
lowered the `createAsyncTask` builtins to use the `_f` variants with a large amount of caller-side
codegen to manually unpack closure values. Amid all this, it also failed to make anyone responsible
for releasing the closure context after the task completed, causing every task creation to leak.
Redo the `swift_task_create_*` entry points to accept the two words of an async closure value
directly, and unpack the closure to get its invocation entry point and initial context size
inside the runtime. (Also get rid of the non-future `swift_task_create` variant, since it's unused
and it's subtly different in a lot of hairy ways from the future forms. Better to add it later
when it's needed than to have a broken unexercised version now.)