* [Concurrency] Reduce overhead of Task.yield and Task.sleep
Instead of creating a new task, we create a simple job that wraps a Builtin.RawUnsafeContinuation and resumes the continuation when it is executed. The job instance is allocated on the task local allocator, meaning we don't malloc anything.
* Update stdlib/public/Concurrency/Task.swift
Co-authored-by: Konrad `ktoso` Malawski <konrad.malawski@project13.pl>
Co-authored-by: Konrad `ktoso` Malawski <konrad.malawski@project13.pl>
- Introduce an UnownedSerialExecutor type into the concurrency library.
- Create a SerialExecutor protocol which allows an executor type to
change how it executes jobs.
- Add an unownedExecutor requirement to the Actor protocol.
- Change the ABI for ExecutorRef so that it stores a SerialExecutor
witness table pointer in the implementation field. This effectively
makes ExecutorRef an `unowned(unsafe) SerialExecutor`, except that
default actors are represented without a witness table pointer (just
a bit-pattern).
- Synthesize the unownedExecutor method for default actors (i.e. actors
that don't provide an unownedExecutor property).
- Make synthesized unownedExecutor properties `final`, and give them
a semantics attribute specifying that they're for default actors.
- Split `Builtin.buildSerialExecutorRef` into a few more precise
builtins. We're not using the main-actor one yet, though.
Pitch thread:
https://forums.swift.org/t/support-custom-executors-in-swift-concurrency/44425
The `async` operation is a global function that initiates asynchronous
work on behalf of the synchronous code that calls it. Unlike `detach`,
`async` inherits priority, actor context, and other aspects of the
synchronous code that initiates it, making it a better "default"
operation for creating asynchronous work than `detach`. The `detach`
operation is still important for creating truly detached tasks that
can later be `await`'d or cancelled if needed.
Implements the main entry point for rdar://76927008.
The closure does not escape the startAsyncLet - endAsyncLet scope. Even though it's (potentially) running on a different thread.
The substantial change in the runtime is to not call swift_release on the closure context if it's a non-escaping closure.
Through various means, it is possible for a synchronous actor-isolated
function to escape to another concurrency domain and be called from
outside the actor. The problem existed previously, but has become far
easier to trigger now that `@escaping` closures and local functions
can be actor-isolated.
Introduce runtime detection of such data races, where a synchronous
actor-isolated function ends up being called from the wrong executor.
Do this by emitting an executor check in actor-isolated synchronous
functions, where we query the executor in thread-local storage and
ensure that it is what we expect. If it isn't, the runtime complains.
The runtime's complaints can be controlled with the environment
variable `SWIFT_UNEXPECTED_EXECUTOR_LOG_LEVEL`:
0 - disable checking
1 - warn when a data race is detected
2 - error and abort when a data race is detected
At an implementation level, this introduces a new concurrency runtime
entry point `_checkExpectedExecutor` that checks the given executor
(on which the function should always have been called) against the
executor on which is called (which is in thread-local storage). There
is a special carve-out here for `@MainActor` code, where we check
against the OS's notion of "main thread" as well, so that `@MainActor`
code can be called via (e.g.) the Dispatch library's
`DispatchQueue.main.async`.
The new SIL instruction `extract_executor` performs the lowering of an
actor down to its executor, which is implicit in the `hop_to_executor`
instruction. Extend the LowerHopToExecutor pass to perform said
lowering.
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
For ordinary memory-management reasons, this should only ever
happen when there will be no more uses of the actor outside of the
actor runtime. The actor runtime, meanwhile, doesn't care about
anything except the default-actor control state of the actor. So
we can just allow the rest of the actor to be destructed when it
isn't needed anymore, then destroy the actor state and deallocate
the object when we get around to switching off the executor.
This does assume that the task doesn't do anything which semantically
detects the executor it's on before switching off it, since doing so
might read a bogus executor. However, we should only get an executor
in a zombie state like this when a hop has been removed or reordered,
and detection events should count as inhibiting that and forcing the
true executor to be switched to (and thus detected).
(But maybe lifetime optimization can make this happen? Maybe we
need semantic detection to filter out zombie executors.)
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
Take the existing CompatibilityOverride mechanism and generalize it so it can be used in both the runtime and Concurrency libraries. The mechanism is preprocessor-heavy, so this requires some tricks. Use the SWIFT_TARGET_LIBRARY_NAME define to distinguish the libraries, and use a different .def file and mach-o section name accordingly.
We want the global/main executor functions to be a little more flexible. Instead of using the override mechanism, we expose function pointers that can be set by the compatibility library, or by any other code that wants to use a custom implementation.
rdar://73726764
CFRunLoopRun returns once it finishes, though the kernel may clean us up
before we get there. We effectively have a race condition between the
kernel cleaning us up and returning from a never returning function.
Small programs likely get cleaned up before reaching the ud2 instruction
emitted after the never returning function in swift, so they don't
notice, but programs of a sufficient size do. At that size, the program
will crash after what the programmer expects to be the end of their
program.
Most of the async runtime functions have been changed to not
expect the task and executor to be passed in. When knowing the
task and executor is necessary, there are runtime functions
available to recover them.
The biggest change I had to make to a runtime function signature
was to swift_task_switch, which has been altered to expect to be
passed the context and resumption function instead of requiring
the caller to park the task. This has the pleasant consequence
of allowing the implementation to very quickly turn around when
it recognizes that the current executor is satisfactory. It does
mean that on arm64e we have to sign the continuation function
pointer as an argument and then potentially resign it when
assigning into the task's resume slot.
rdar://70546948
In their previous form, the non-`_f` variants of these entry points were unused, and IRGen
lowered the `createAsyncTask` builtins to use the `_f` variants with a large amount of caller-side
codegen to manually unpack closure values. Amid all this, it also failed to make anyone responsible
for releasing the closure context after the task completed, causing every task creation to leak.
Redo the `swift_task_create_*` entry points to accept the two words of an async closure value
directly, and unpack the closure to get its invocation entry point and initial context size
inside the runtime. (Also get rid of the non-future `swift_task_create` variant, since it's unused
and it's subtly different in a lot of hairy ways from the future forms. Better to add it later
when it's needed than to have a broken unexercised version now.)
I'm about to replace these with new variations that correctly handle spawning a task with
a closure as its initial entry point, without leaking or requiring large amounts of caller-side
code generation.
First, just call an async -> T function instead of forcing the caller
to piece together which case we're in and perform its own copy. This
ensures that the task is actually kept alive properly.
Second, now that we no longer implicitly depend on the waiting tasks
being run synchronously, go ahead and schedule them to run on the
global executor.
This solves some problems which were blocking the work on TLS-ifying
the task/executor state.