Change the code generation patterns for `async let` bindings to use an ABI based on the following
functions:
- `swift_asyncLet_begin`, which starts an `async let` child task, but which additionally
now associates the `async let` with a caller-owned buffer to receive the result of the task.
This is intended to allow the task to emplace its result in caller-owned memory, allowing the
child task to be deallocated after completion without invalidating the result buffer.
- `swift_asyncLet_get[_throwing]`, which replaces `swift_asyncLet_wait[_throwing]`. Instead of
returning a copy of the value, this entry point concerns itself with populating the local buffer.
If the buffer hasn't been populated, then it awaits completion of the task and emplaces the
result in the buffer; otherwise, it simply returns. The caller can then read the result out of
its owned memory. These entry points are intended to be used before every read from the
`async let` binding, after which point the local buffer is guaranteed to contain an initialized
value.
- `swift_asyncLet_finish`, which replaces `swift_asyncLet_end`. Unlike `_end`, this variant
is async and will suspend the parent task after cancelling the child to ensure it finishes
before cleaning up. The local buffer will also be deinitialized if necessary. This is intended
to be used on exit from an `async let` scope, to handle cleaning up the local buffer if necessary
as well as cancelling, awaiting, and deallocating the child task.
- `swift_asyncLet_consume[_throwing]`, which combines `get` and `finish`. This will await completion
of the task, leaving the result value in the result buffer (or propagating the error, if it
throws), while destroying and deallocating the child task. This is intended as an optimization
for reading `async let` variables that are read exactly once by their parent task.
To avoid an epoch break with existing swiftinterfaces and ABI clients, the old builtins and entry
points are kept intact for now, but SILGen now only generates code using the new interface.
This new interface fixes several issues with the old async let codegen, including use-after-free
crashes if the `async let` was never awaited, and the inability to read from an `async let` variable
more than once.
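For illustration, a source-level sketch of the two patterns mentioned above, assuming a toolchain with Swift concurrency enabled. The names `fetchValue`, `readTwice`, and `neverAwaited` are hypothetical and do not come from this change:

```swift
// Hypothetical child-task body, for illustration only.
func fetchValue() async -> Int { 42 }

// Reads the same `async let` binding more than once.
func readTwice() async -> Int {
    async let value = fetchValue()
    let first = await value
    let second = await value
    return first + second
}

// Leaves the scope without ever awaiting the binding; the child task is
// cancelled and awaited implicitly when the scope exits.
func neverAwaited() async -> Bool {
    async let unused = fetchValue()
    return true
}
```

Which runtime entry points each read lowers to is decided by SILGen; the sketch only shows the source shapes involved.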
rdar://77855176
TLDR: I fixed a hole in the assembly-vision opt-remark pass where we were not
emitting a remark for end of scope instructions at the beginning of blocks. Now
all of these instructions (strong_release, end_access) should always reliably
have a remark emitted for them.
----
I think that this is a pragmatic first solution to the problem of
strong_release, release_value being the first instruction of a block. For those
who are unaware, this issue is that for a long time we have searched backwards
first for "end of scope" like instructions. This then allows us to identify the
"end of scope" instruction as happening at the end of the previous statement
which is where the developer thinks it should be:
```
var global: Klass
func bar() -> @owned Klass { global }
func foo() {
// We want the remark for the
bar() // expected-remark {{retain}}
}
```
This makes sense since we want to show end of scope instructions as being
applied to the earlier code whose scope it is ending. We can be clear that it is
at the end of the statement by placing the caret on the end of statement
SourceLoc so there isn't any confusion upon whether or not
That generally has delivered nice looking results, but what if our release is
the first instruction in the block? In that case, we do not have any instruction
that we can immediately use, so traditionally we just gave up and didn't emit
anything. This is not an acceptable solution! We should be able to emit
something for every retain/release in the program if we want users to be able to
rely upon this! Thus we need to be able to get source location information from
somewhere around
First before we begin, my approach here is informed by my seeing over time that
the optimizer does a pretty good job of not breaking SourceLoc info for
terminators.
With that in mind, there are two possible approaches here: using the terminator
from the previous block, or searching forward, at worst taking the SourceLoc of
the current block's terminator (or earlier if we find a good SourceLoc). I
wasn't sure what the correct thing to do was at the time so I didn't fix the
issue. After some thought, I realized that the correct solution is to search
forwards if the backwards search fails. The reason is that, since our remarks
pass runs late in the optimization pipeline, there is a very high likelihood that if
we aren't folded into our previous block that there is a true need in the
program for conditional control flow here. We want to avoid placing the release
out of such pieces of code since it is misleading to the user:
```
switch payload {
case let .x(f):
...
case let .y:
...
}
```
In this example there is a release inside the case for .x but none for .y. In
that case it is possible that we get a release for f since payload is passed in
at +1 at SILGen time. In such a case, using the terminator of the previous block
would mean that we would have the release be marked as on payload instead of
inside the case of .x. By using the terminator of the release's block, we can
So using the terminator from the previous block would be
misleading to the user. Instead it is better to pick a location after the release that way
we know at least the instruction we are inferring from must in some sense be
With this fix, we should now emit locations for all retains and releases. We may
not identify a good source loc for all of them, but we will identify them.
optimization pipeline, if our block was not folded into the previous block there
is a very high likelihood that there is some sort of conditional control flow
that is truly necessary in the program. If we
this
generally implies that there is a real side effect in the program that is
requiring conditional code execution (since the optimizer would have folded it).
The reason why is that we are at least going to hit a terminator or a
side-effecting instruction, both of which generally have debug info preserved
by the optimizer.
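The backwards-then-forwards search described above can be sketched with a toy model. This is not the pass's actual code: `Inst`, `inferLoc`, and the use of a line number in place of a SourceLoc are all assumptions made for illustration.

```swift
// Toy model: each instruction may or may not carry a usable source location.
struct Inst {
    var name: String
    var loc: Int?   // hypothetical stand-in for a SourceLoc (a line number)
}

// Infer a location for the instruction at `index` within a single block.
// Search backwards first; if nothing usable precedes the instruction in
// its block, fall back to a forward search within the same block.
func inferLoc(in block: [Inst], at index: Int) -> Int? {
    for inst in block[..<index].reversed() {
        if let loc = inst.loc { return loc }
    }
    for inst in block[(index + 1)...] {
        if let loc = inst.loc { return loc }
    }
    return nil
}
```

Because both searches stay inside the release's own block, a release that opens a block is attributed to code in that block rather than to the previous block's terminator.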
Rather than using group task options constructed from the Swift parts
of the _Concurrency library and passed through `createAsyncTask`'s
options, introduce a separate builtin that always takes a group. Move
the responsibility for creating the options structure into IRGen, so
we don't need to expose the TaskGroupTaskOptionRecord type in Swift.
Introduce a builtin `createAsyncTask` that maps to `swift_task_create`,
and use that for the non-group task creation operations based on the
task-creation flags. `swift_task_create` and the thin function version
`swift_task_create_f` go through the dynamically-replaceable
`swift_task_create_common`, where all of the task creation logic is
present.
While here, move copying of task locals and the initial scheduling of
the task into `swift_task_create_common`, enabled by separate flags.
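For orientation, a source-level sketch of the two kinds of task creation involved: a child task added to a group, and a non-group task. It assumes a current toolchain where these are spelled `addTask` and `Task { }` (the spellings at the time of this change may have differed); `sumWithGroup` and `unstructured` are hypothetical names.

```swift
// Group child tasks: each `addTask` creates a task that belongs to `group`.
func sumWithGroup(_ values: [Int]) async -> Int {
    await withTaskGroup(of: Int.self) { group in
        for v in values {
            group.addTask { v * 2 }
        }
        var total = 0
        for await partial in group { total += partial }
        return total
    }
}

// Non-group task creation.
func unstructured() async -> Int {
    let task = Task { 21 * 2 }
    return await task.value
}
```

How each form maps onto the builtins is an implementation detail of the library and IRGen; the sketch does not depend on it.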
This is the initial version of a buildable SIL definition in libswift.
It defines an initial set of SIL classes, like Function, BasicBlock, Instruction, Argument, and a few instruction classes.
The interface between C++ and SIL is a bridging layer, implemented in C.
It contains all the required bridging data structures used to access various SIL data structures.
When an instruction is "deleted" from the SIL, it is put into the SILModule::scheduledForDeletion list.
The instructions in this list are eventually deleted for real in SILModule::flushDeletedInsts(), which is called by the pass manager after each pass run.
In other words: instruction deletion is deferred to the end of a pass.
This avoids dangling instruction pointers within the run of a pass and in analysis caches.
Note that the analysis invalidation mechanism ensures that analysis caches are invalidated before flushDeletedInsts().
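A toy model of the deferred-deletion scheme, written in Swift for illustration. The `Module` and `Instruction` classes below are hypothetical stand-ins, not the real SILModule API; only the names `scheduledForDeletion` and `flushDeletedInsts` are taken from the description above.

```swift
// Toy stand-in for a SIL instruction.
final class Instruction {
    let name: String
    var isDeleted = false
    init(_ name: String) { self.name = name }
}

// Toy stand-in for the module that owns the instructions.
final class Module {
    private(set) var instructions: [Instruction] = []
    private(set) var scheduledForDeletion: [Instruction] = []

    func add(_ inst: Instruction) { instructions.append(inst) }

    // "Deleting" only unlinks the instruction and queues it. The object
    // stays alive, so references held by the running pass do not dangle.
    func erase(_ inst: Instruction) {
        instructions.removeAll { $0 === inst }
        inst.isDeleted = true
        scheduledForDeletion.append(inst)
    }

    // In this model, called by a (hypothetical) pass manager after each pass.
    func flushDeletedInsts() {
        scheduledForDeletion.removeAll()
    }
}
```

A pass can keep inspecting an erased instruction until the flush, which is the property the deferred list is meant to provide.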
Instead, put the archetype->instruction map into SILModule.
SILOpenedArchetypesTracker tried to maintain and reconstruct the mapping locally, e.g. during a use of SILBuilder.
Having a "global" map in SILModule makes the whole logic _much_ simpler.
I'm wondering why we didn't do this in the first place.
This requires that opened archetypes must be unique in a module - which makes sense. This was the case anyway, except for keypath accessors (which I fixed in the previous commit) and in some sil test files.
Through various means, it is possible for a synchronous actor-isolated
function to escape to another concurrency domain and be called from
outside the actor. The problem existed previously, but has become far
easier to trigger now that `@escaping` closures and local functions
can be actor-isolated.
Introduce runtime detection of such data races, where a synchronous
actor-isolated function ends up being called from the wrong executor.
Do this by emitting an executor check in actor-isolated synchronous
functions, where we query the executor in thread-local storage and
ensure that it is what we expect. If it isn't, the runtime complains.
The runtime's complaints can be controlled with the environment
variable `SWIFT_UNEXPECTED_EXECUTOR_LOG_LEVEL`:
0 - disable checking
1 - warn when a data race is detected
2 - error and abort when a data race is detected
At an implementation level, this introduces a new concurrency runtime
entry point `_checkExpectedExecutor` that checks the given executor
(on which the function should always have been called) against the
executor on which it is actually called (which is in thread-local storage). There
is a special carve-out here for `@MainActor` code, where we check
against the OS's notion of "main thread" as well, so that `@MainActor`
code can be called via (e.g.) the Dispatch library's
`DispatchQueue.main.async`.
The new SIL instruction `extract_executor` performs the lowering of an
actor down to its executor, which is implicit in the `hop_to_executor`
instruction. Extend the LowerHopToExecutor pass to perform said
lowering.
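The decision logic described above can be modelled roughly as follows. This is a sketch only: the real check lives in the concurrency runtime and compares executor references, whereas `LogLevel`, `CheckResult`, the string-based executors, and the `onMainThread` parameter here are all hypothetical.

```swift
// The three levels of SWIFT_UNEXPECTED_EXECUTOR_LOG_LEVEL listed above.
enum LogLevel: Int {
    case disabled = 0, warn = 1, abort = 2
}

// What the modelled check decided to do.
enum CheckResult: Equatable {
    case ok, skipped, warned, wouldAbort
}

// Compare the executor the function expects against the one it is
// actually running on (nil meaning "no current executor").
func checkExpectedExecutor(expected: String, current: String?,
                           onMainThread: Bool, level: LogLevel) -> CheckResult {
    if level == .disabled { return .skipped }
    if current == expected { return .ok }
    // Carve-out: main-actor code reached on the OS main thread is accepted.
    if expected == "MainActor" && onMainThread { return .ok }
    return level == .warn ? .warned : .wouldAbort
}
```

The model returns `.wouldAbort` instead of terminating so that the behavior at level 2 can be observed without crashing the process.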
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
Since it is an always take, we know that the original value will always be
invalidated by the checked_cast_addr_br.
This also lets me use this to recognize simple cases of checked casts in
opt-remark-gen.
For example, now we get the following diagnostic on globals:
public func getGlobal() -> Klass {
return global // expected-remark @:5 {{retain of type 'Klass'}}
// expected-note @-5:12 {{of 'global'}}
+ // expected-remark @-2:12 {{begin exclusive access to value of type 'Klass'}}
+ // expected-note @-7:12 {{of 'global'}}
+ // expected-remark @-4 {{end exclusive access to value of type 'Klass'}}
+ // expected-note @-9:12 {{of 'global'}}
+
}
and for classes when we can't eliminate the access:
+func simpleInOut() -> Klass {
+ let x = Klass() // expected-remark @:13 {{heap allocated ref of type 'Klass'}}
+ // expected-note @-1:9 {{of 'x'}}
+ simpleInOutUser(&x.next) // expected-remark @:5 {{begin exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-3:9 {{of 'x.next'}}
+ // expected-remark @-2:28 {{end exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-5:9 {{of 'x.next'}}
+ return x
+}
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
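For reference, the source-level construct whose lowering is affected is an unsafe continuation. A minimal sketch using the standard `withUnsafeContinuation` API; the function name `answer` is hypothetical:

```swift
// Suspends the caller, then resumes it with a value written through the
// continuation. Here the resume happens immediately, inside the closure.
func answer() async -> Int {
    await withUnsafeContinuation { (continuation: UnsafeContinuation<Int, Never>) in
        continuation.resume(returning: 42)
    }
}
```

The sketch says nothing about which runtime functions the compiler emits for it; it only shows the initialization and resumption sides discussed above.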
The MemoryLifetimeVerifier has to ignore locations with empty types, e.g. an empty tuple.
So far, the check for empty types didn't check recursively, so it missed e.g. "((), ())".
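A recursive emptiness check of this kind can be sketched on a toy type model. `TypeModel` and `isEmptyType` are hypothetical and much simpler than SIL types; in particular, the model treats every nominal type as non-empty.

```swift
// Toy type model: a named (nominal) type or a tuple of element types.
indirect enum TypeModel {
    case nominal(String)
    case tuple([TypeModel])
}

// A tuple is empty if all of its elements are empty, checked recursively.
// The empty tuple has no elements, so it is trivially empty.
func isEmptyType(_ type: TypeModel) -> Bool {
    switch type {
    case .nominal:
        return false
    case .tuple(let elements):
        return elements.allSatisfy(isEmptyType)
    }
}
```

A non-recursive version that only tests for a tuple with zero elements would accept `()` but reject `((), ())`, which is the gap described above.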
And rename MemoryDataflow -> BitDataflow.
MemoryLifetime contained MemoryLocations, MemoryDataflow and the MemoryLifetimeVerifier.
Three independent things, for which it makes sense to have them in three separated files.
NFC.
In their previous form, the non-`_f` variants of these entry points were unused, and IRGen
lowered the `createAsyncTask` builtins to use the `_f` variants with a large amount of caller-side
codegen to manually unpack closure values. Amid all this, it also failed to make anyone responsible
for releasing the closure context after the task completed, causing every task creation to leak.
Redo the `swift_task_create_*` entry points to accept the two words of an async closure value
directly, and unpack the closure to get its invocation entry point and initial context size
inside the runtime. (Also get rid of the non-future `swift_task_create` variant, since it's unused
and it's subtly different in a lot of hairy ways from the future forms. Better to add it later
when it's needed than to have a broken unexercised version now.)