The goal here is not to eventually implement a concurrent thread
pool ourselves. We're just making it easier for integrators who
have their own pool and don't want to use Dispatch to build the
Swift concurrency runtime. Just hook the right functions and
you should be fine.
The necessary functions to hook are:
- swift_task_enqueueGlobal
- swift_task_enqueueGlobalAfterDelay
The following functions *would* be necessary to hook:
- swift_task_enqueueMainExecutor
- swift_task_asyncMainDrainQueue (only if you have an async main?)
However, this configuration does not currently properly support
the main executor, and so `@MainActor` should be avoided for now.
rdar://83513751
The goal here is not to eventually implement a concurrent thread
pool ourselves. We're just making it easier for integrators who
have their own pool and don't want to use Dispatch to build the
Swift concurrency runtime. Just hook the right functions and
you should be fine.
The necessary functions to hook are:
- swift_task_enqueueGlobal
- swift_task_enqueueGlobalAfterDelay
The following functions *would* be necessary to hook:
- swift_task_enqueueMainExecutor
- swift_task_asyncMainDrainQueue (only if you have an async main?)
However, this configuration does not currently properly support
the main executor, and so `@MainActor` should be avoided for now.
rdar://83513751
runAsyncAndBlock has long been dead, I'm pulling out the machinery that
lives in the runtime that used to support it. This is the first layer
that can go away.
Change the code generation patterns for `async let` bindings to use an ABI based on the following
functions:
- `swift_asyncLet_begin`, which starts an `async let` child task, but which additionally
now associates the `async let` with a caller-owned buffer to receive the result of the task.
This is intended to allow the task to emplace its result in caller-owned memory, allowing the
child task to be deallocated after completion without invalidating the result buffer.
- `swift_asyncLet_get[_throwing]`, which replaces `swift_asyncLet_wait[_throwing]`. Instead of
returning a copy of the value, this entry point concerns itself with populating the local buffer.
If the buffer hasn't been populated, then it awaits completion of the task and emplaces the
result in the buffer; otherwise, it simply returns. The caller can then read the result out of
its owned memory. These entry points are intended to be used before every read from the
`async let` binding, after which point the local buffer is guaranteed to contain an initialized
value.
- `swift_asyncLet_finish`, which replaces `swift_asyncLet_end`. Unlike `_end`, this variant
is async and will suspend the parent task after cancelling the child to ensure it finishes
before cleaning up. The local buffer will also be deinitialized if necessary. This is intended
to be used on exit from an `async let` scope, to handle cleaning up the local buffer if necessary
as well as cancelling, awaiting, and deallocating the child task.
- `swift_asyncLet_consume[_throwing]`, which combines `get` and `finish`. This will await completion
of the task, leaving the result value in the result buffer (or propagating the error, if it
throws), while destroying and deallocating the child task. This is intended as an optimization
for reading `async let` variables that are read exactly once by their parent task.
To avoid an epoch break with existing swiftinterfaces and ABI clients, the old builtins and entry
points are kept intact for now, but SILGen now only generates code using the new interface.
This new interface fixes several issues with the old async let codegen, including use-after-free
crashes if the `async let` was never awaited, and the inability to read from an `async let` variable
more than once.
rdar://77855176
Tracking this as a single bit is actually largely uninteresting
to the runtime. To handle priority escalation properly, we really
need to track this at a finer grain of detail: recording that the
task is running on a specific thread, enqueued on a specific actor,
or so on. But starting by tracking a single bit is important for
two reasons:
- First, it's more realistic about the performance overheads of
tasks: we're going to be doing this tracking eventually, and
the cost of that tracking will be dominated by the atomic
access, so doing that access now sets the baseline about right.
- Second, it ensures that we've actually got runtime involvement
in all the right places to do this tracking.
A propos of the latter: there was no runtime involvement with
awaiting a continuation, which is a point at which the task
potentially transitions from running to suspended. We must do
the tracking as part of this transition, rather than recognizing
in the run-loops that a task is still active and treating it as
having suspended, because the latter point potentially races with
the resumption of the task. To do this, I've had to introduce
a runtime function, swift_continuation_await, to do this awaiting
rather than inlining the atomic operation on the continuation.
As part of doing this work, I've also fixed a bug where we failed
to load-acquire in swift_task_escalate before walking the task
status records to invoke escalation actions.
I've also fixed several places where the handling of task statuses
may have accidentally allowed the task to revert to uncancelled.
The symbol is swift_async_extendedFramePointerFlags. Since the value doesn't need to be dynamically computed, we save a level of indirection by emitting a fake global variable whose address is the value we want, similar to objc_absolute_packed_isa_class_mask.
This bit is mixed in to the frame pointer address stored on the stack to signal that a frame is an async frame. The compiler can emit code that ORs in the address of this symbol to apply the appropriate flag when it doesn't know the flag statically.
rdar://80277146
Introduce a builtin `createAsyncTask` that maps to `swift_task_create`,
and use that for the non-group task creation operations based on the
task-creation flags. `swift_task_create` and the thin function version
`swift_task_create_f` go through the dynamically-replaceable
`swift_task_create_common`, where all of the task creation logic is
present.
While here, move copying of task locals and the initial scheduling of
the task into `swift_task_create_common`, enabling by separate flags.
The flags that are useful for task creation are a bit different from
the flags that go on a job. Create a separate flag set for task
creation and use that in the API for `swift_task_create`. For now,
have the callers do the remapping.
Collapse the `group` parameter of this API into the task options, and
have existing callers set up the options appropriately. The goal for
this function is to become the centralized entry point for all task
creation, with an extensible interface.
introduce new options parameter to all task spawning
[Concurrency] ABI for asynclet start to accept options
[Concurrency] fix unittest usages of changed task creation ABI
[Concurrency] introduce constants for parameter indexes in ownership
[Concurrency] fix test/SILOptimizer/closure_lifetime_fixup_concurrency.swift
I added Builtin.buildMainActorExecutor before, but because I never
implemented it correctly in IRGen, it's not okay to use it on old
versions, so I had to introduce a new feature only for it.
The shim dispatch queue class in the Concurrency runtime is rather
awful, but I couldn't think of a reasonable alternative without
just entirely hard-coding the witness table in the runtime.
It's not ABI, at least.
Changes the task, taskGroup, asyncLet wait funtion call ABIs.
To reduce code size pass the context parameters and resumption function
as arguments to the wait function.
This means that the suspend point does not need to store parent context
and resumption to the suspend point's context.
```
void swift_task_future_wait_throwing(
OpaqueValue * result,
SWIFT_ASYNC_CONTEXT AsyncContext *callerContext,
AsyncTask *task,
ThrowingTaskFutureWaitContinuationFunction *resume,
AsyncContext *callContext);
```
The runtime passes the caller context to the resume entry point saving
the load of the parent context in the resumption function.
This patch adds a `Metadata *` field to `GroupImpl`. The await entry
pointer no longer pass the metadata pointer and there is a path through
the runtime where the task future is no longer available.
* [Distributed] Initial distributed checking
* [Distributed] initial types shapes and conform to DistributedActor
* [Distributed] Require Codable params and return types
* [Distributed] initial synthesis of fields and constructors
* [Distributed] Field and initializer synthesis
* [Distributed] Codable requirement on distributed funcs; also handle <T: Codable>
* [Distributed] handle generic type params which are Codable in dist func
[Distributed] conformsToProtocol after all
* [Distributed] Implement remote flag on actors
* Implement remote flag on actors
* add test
* actor initializer that sets remote flag
[Distributed] conformances getting there
* [Distributed] dont require async throws; cleanup compile tests
* [Distributed] do not synthesize default implicit init, only our special ones
* [Distributed] properly synth inits and properties; mark actorTransport as _distributedActorIndependent
Also:
- do not synthesize default init() initializer for dist actor
* [Distributed] init(transport:) designated and typechecking
* [Distributed] dist actor initializers MUST delegate to local-init
* [Distributed] check if any ctors in delegation call init(transport:)
* [Distributed] check init(transport:) delegation through many inits; ban invoking init(resolve:using:) explicitly
* [Distributed] disable IRGen test for now
* [Distributed] Rebase cleanups
* [Concurrent] transport and address are concurrent value
* [Distributed] introduce -enable-experimental-distributed flag
* rebase adjustments again
* rebase again...
* [Distributed] distributed functions are implicitly async+throws outside the actor
* [Distributed] implicitly throwing and async distributed funcs
* remove printlns
* add more checks to implicit function test
* [Distributed] resolve initializer now marks the isRemote actor flag
* [Distributed] distributedActor_destroy invoked instead, rather than before normal
* [Distributed] Generate distributed thunk for actors
* [distributed] typechecking for _remote_ functions existing, add tests for remote funcs
* adding one XFAIL'ed task & actor lifetime test
The `executor_deinit1` test fails 100% of the time
(from what I've seen) so I thought we could track
and see when/if someone happens to fix this bug.
Also, added extra coverage for #36298 via `executor_deinit2`
* Fix a memory issue with actors in the runtime system, by @phausler
* add new test that now passes because of patch by @phausler
See previous commit in this PR.
Test is based on one from rdar://74281361
* fix all tests that require the _remote_ function stubs
* Do not infer @actorIndependent onto `let` decls
* REVERT_ME: remove some tests that hacky workarounds will fail
* another flaky test, help build toolchain
* [Distributed] experimental distributed implies experimental concurrency
* [Distributed] Allow distributed function that are not marked async or throws
* [Distributed] make attrs SIMPLE to get serialization generated
* [Distributed] ActorAddress must be Hashable
* [Distributed] Implement transport.actorReady call in local init
* cleanup after rebase
* [Distributed] add availability attributes to all distributed actor code
* cleanup - this fixed some things
* fixing up
* fixing up
* [Distributed] introduce new Distributed module
* [Distributed] diagnose when missing 'import _Distributed'
* [Distributed] make all tests import the module
* more docs on address
* [Distributed] fixup merge issues
* cleanup: remove unnecessary code for now SIMPLE attribute
* fix: fix getActorIsolationOfContext
* [Distributed] cmake: depend on _concurrency module
* fixing tests...
* Revert "another flaky test, help build toolchain"
This reverts commit 83ae6654dd.
* remove xfail
* clenup some IR and SIL tests
* cleanup
* [Distributed] fix cmake test and ScanDependencies/can_import_with_map.swift
* [Distributed] fix flags/build tests
* cleanup: use isDistributed wherever possible
* [Distributed] don't import Dispatch in tests
* dont link distributed in stdlib unittest
* trying always append distributed module
* cleanups
* [Distributed] move all tests to Distributed/ directory
* [lit] try to fix lit test discovery
* [Distributed] update tests after diagnostics for implicit async changed
* [Distributed] Disable remote func tests on Windows for now
* Review cleanups
* [Distributed] fix typo, fixes Concurrency/actor_isolation_objc.swift
* [Distributed] attributes are DistributedOnly (only)
* cleanup
* [Distributed] cleanup: rely on DistributedOnly for guarding the keyword
* Update include/swift/AST/ActorIsolation.h
Co-authored-by: Doug Gregor <dgregor@apple.com>
* introduce isAnyThunk, minor cleanup
* wip
* [Distributed] move some type checking to TypeCheckDistributed.cpp
* [TypeCheckAttr] remove extra debug info
* [Distributed/AutoDiff] fix SILDeclRef creation which caused AutoDiff issue
* cleanups
* [lit] remove json import from lit test suite, not needed after all
* [Distributed] distributed functions only in DistributedActor protocols
* [Distributed] fix flag overlap & build setting
* [Distributed] Simplify noteIsolatedActorMember to not take bool distributed param
* [Distributed] make __isRemote not public
* [Distributed] Fix availability and remove actor class tests
* [actorIndependent] do not apply actorIndependent implicitly to values where it would be illegal to apply
* [Distributed] disable tests until issue fixed
Co-authored-by: Dario Rexin <drexin@apple.com>
Co-authored-by: Kavon Farvardin <kfarvardin@apple.com>
Co-authored-by: Doug Gregor <dgregor@apple.com>
* Revert "[Distributed] disable tests until issue fixed"
This reverts commit 0a04278920.
* Revert "[Distributed] Initial `distributed` actors and functions and new module (#37109)"
This reverts commit 814ede0cf3.
* [Distributed] Initial distributed checking
* [Distributed] initial types shapes and conform to DistributedActor
* [Distributed] Require Codable params and return types
* [Distributed] initial synthesis of fields and constructors
* [Distributed] Field and initializer synthesis
* [Distributed] Codable requirement on distributed funcs; also handle <T: Codable>
* [Distributed] handle generic type params which are Codable in dist func
[Distributed] conformsToProtocol after all
* [Distributed] Implement remote flag on actors
* Implement remote flag on actors
* add test
* actor initializer that sets remote flag
[Distributed] conformances getting there
* [Distributed] dont require async throws; cleanup compile tests
* [Distributed] do not synthesize default implicit init, only our special ones
* [Distributed] properly synth inits and properties; mark actorTransport as _distributedActorIndependent
Also:
- do not synthesize default init() initializer for dist actor
* [Distributed] init(transport:) designated and typechecking
* [Distributed] dist actor initializers MUST delegate to local-init
* [Distributed] check if any ctors in delegation call init(transport:)
* [Distributed] check init(transport:) delegation through many inits; ban invoking init(resolve:using:) explicitly
* [Distributed] disable IRGen test for now
* [Distributed] Rebase cleanups
* [Concurrent] transport and address are concurrent value
* [Distributed] introduce -enable-experimental-distributed flag
* rebase adjustments again
* rebase again...
* [Distributed] distributed functions are implicitly async+throws outside the actor
* [Distributed] implicitly throwing and async distributed funcs
* remove printlns
* add more checks to implicit function test
* [Distributed] resolve initializer now marks the isRemote actor flag
* [Distributed] distributedActor_destroy invoked instead, rather than before normal
* [Distributed] Generate distributed thunk for actors
* [distributed] typechecking for _remote_ functions existing, add tests for remote funcs
* adding one XFAIL'ed task & actor lifetime test
The `executor_deinit1` test fails 100% of the time
(from what I've seen) so I thought we could track
and see when/if someone happens to fix this bug.
Also, added extra coverage for #36298 via `executor_deinit2`
* Fix a memory issue with actors in the runtime system, by @phausler
* add new test that now passes because of patch by @phausler
See previous commit in this PR.
Test is based on one from rdar://74281361
* fix all tests that require the _remote_ function stubs
* Do not infer @actorIndependent onto `let` decls
* REVERT_ME: remove some tests that hacky workarounds will fail
* another flaky test, help build toolchain
* [Distributed] experimental distributed implies experimental concurrency
* [Distributed] Allow distributed function that are not marked async or throws
* [Distributed] make attrs SIMPLE to get serialization generated
* [Distributed] ActorAddress must be Hashable
* [Distributed] Implement transport.actorReady call in local init
* cleanup after rebase
* [Distributed] add availability attributes to all distributed actor code
* cleanup - this fixed some things
* fixing up
* fixing up
* [Distributed] introduce new Distributed module
* [Distributed] diagnose when missing 'import _Distributed'
* [Distributed] make all tests import the module
* more docs on address
* [Distributed] fixup merge issues
* cleanup: remove unnecessary code for now SIMPLE attribute
* fix: fix getActorIsolationOfContext
* [Distributed] cmake: depend on _concurrency module
* fixing tests...
* Revert "another flaky test, help build toolchain"
This reverts commit 83ae6654dd.
* remove xfail
* clenup some IR and SIL tests
* cleanup
* [Distributed] fix cmake test and ScanDependencies/can_import_with_map.swift
* [Distributed] fix flags/build tests
* cleanup: use isDistributed wherever possible
* [Distributed] don't import Dispatch in tests
* dont link distributed in stdlib unittest
* trying always append distributed module
* cleanups
* [Distributed] move all tests to Distributed/ directory
* [lit] try to fix lit test discovery
* [Distributed] update tests after diagnostics for implicit async changed
* [Distributed] Disable remote func tests on Windows for now
* Review cleanups
* [Distributed] fix typo, fixes Concurrency/actor_isolation_objc.swift
* [Distributed] attributes are DistributedOnly (only)
* cleanup
* [Distributed] cleanup: rely on DistributedOnly for guarding the keyword
* Update include/swift/AST/ActorIsolation.h
Co-authored-by: Doug Gregor <dgregor@apple.com>
* introduce isAnyThunk, minor cleanup
* wip
* [Distributed] move some type checking to TypeCheckDistributed.cpp
* [TypeCheckAttr] remove extra debug info
* [Distributed/AutoDiff] fix SILDeclRef creation which caused AutoDiff issue
* cleanups
* [lit] remove json import from lit test suite, not needed after all
* [Distributed] distributed functions only in DistributedActor protocols
* [Distributed] fix flag overlap & build setting
* [Distributed] Simplify noteIsolatedActorMember to not take bool distributed param
* [Distributed] make __isRemote not public
Co-authored-by: Dario Rexin <drexin@apple.com>
Co-authored-by: Kavon Farvardin <kfarvardin@apple.com>
Co-authored-by: Doug Gregor <dgregor@apple.com>
This commit changes JobFlags storage to be 32bits, but leaves the runtime
API expressed in terms of size_t. This allows us to pack an Id in the
32bits we freed up.
The offset of this Id in the AsyncTask is an ABI constant. This way
introspection tools can extract the currently running task identifier
without any need for special APIs.
* [Concurrency] Reduce overhead of Task.yield and Task.sleep
Instead of creating a new task, we create a simple job that wraps a Builtin.RawUnsafeContinuation and resumes the continuation when it is executed. The job instance is allocated on the task local allocator, meaning we don't malloc anything.
* Update stdlib/public/Concurrency/Task.swift
Co-authored-by: Konrad `ktoso` Malawski <konrad.malawski@project13.pl>
Co-authored-by: Konrad `ktoso` Malawski <konrad.malawski@project13.pl>
- Introduce an UnownedSerialExecutor type into the concurrency library.
- Create a SerialExecutor protocol which allows an executor type to
change how it executes jobs.
- Add an unownedExecutor requirement to the Actor protocol.
- Change the ABI for ExecutorRef so that it stores a SerialExecutor
witness table pointer in the implementation field. This effectively
makes ExecutorRef an `unowned(unsafe) SerialExecutor`, except that
default actors are represented without a witness table pointer (just
a bit-pattern).
- Synthesize the unownedExecutor method for default actors (i.e. actors
that don't provide an unownedExecutor property).
- Make synthesized unownedExecutor properties `final`, and give them
a semantics attribute specifying that they're for default actors.
- Split `Builtin.buildSerialExecutorRef` into a few more precise
builtins. We're not using the main-actor one yet, though.
Pitch thread:
https://forums.swift.org/t/support-custom-executors-in-swift-concurrency/44425
The `async` operation is a global function that initiates asynchronous
work on behalf of the synchronous code that calls it. Unlike `detach`,
`async` inherits priority, actor context, and other aspects of the
synchronous code that initiates it, making it a better "default"
operation for creating asynchronous work than `detach`. The `detach`
operation is still important for creating truly detached tasks that
can later be `await`'d or cancelled if needed.
Implements the main entry point for rdar://76927008.
The closure does not escape the startAsyncLet - endAsyncLet scope. Even though it's (potentially) running on a different thread.
The substantial change in the runtime is to not call swift_release on the closure context if it's a non-escaping closure.
Through various means, it is possible for a synchronous actor-isolated
function to escape to another concurrency domain and be called from
outside the actor. The problem existed previously, but has become far
easier to trigger now that `@escaping` closures and local functions
can be actor-isolated.
Introduce runtime detection of such data races, where a synchronous
actor-isolated function ends up being called from the wrong executor.
Do this by emitting an executor check in actor-isolated synchronous
functions, where we query the executor in thread-local storage and
ensure that it is what we expect. If it isn't, the runtime complains.
The runtime's complaints can be controlled with the environment
variable `SWIFT_UNEXPECTED_EXECUTOR_LOG_LEVEL`:
0 - disable checking
1 - warn when a data race is detected
2 - error and abort when a data race is detected
At an implementation level, this introduces a new concurrency runtime
entry point `_checkExpectedExecutor` that checks the given executor
(on which the function should always have been called) against the
executor on which is called (which is in thread-local storage). There
is a special carve-out here for `@MainActor` code, where we check
against the OS's notion of "main thread" as well, so that `@MainActor`
code can be called via (e.g.) the Dispatch library's
`DispatchQueue.main.async`.
The new SIL instruction `extract_executor` performs the lowering of an
actor down to its executor, which is implicit in the `hop_to_executor`
instruction. Extend the LowerHopToExecutor pass to perform said
lowering.
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
For ordinary memory-management reasons, this should only ever
happen when there will be no more uses of the actor outside of the
actor runtime. The actor runtime, meanwhile, doesn't care about
anything except the default-actor control state of the actor. So
we can just allow the rest of the actor to be destructed when it
isn't needed anymore, then destroy the actor state and deallocate
the object when we get around to switching off the executor.
This does assume that the task doesn't do anything which semantically
detects the executor it's on before switching off it, since doing so
might read a bogus executor. However, we should only get an executor
in a zombie state like this when a hop has been removed or reordered,
and detection events should count as inhibiting that and forcing the
true executor to be switched to (and thus detected).
(But maybe lifetime optimization can make this happen? Maybe we
need semantic detection to filter out zombie executors.)
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
Take the existing CompatibilityOverride mechanism and generalize it so it can be used in both the runtime and Concurrency libraries. The mechanism is preprocessor-heavy, so this requires some tricks. Use the SWIFT_TARGET_LIBRARY_NAME define to distinguish the libraries, and use a different .def file and mach-o section name accordingly.
We want the global/main executor functions to be a little more flexible. Instead of using the override mechanism, we expose function pointers that can be set by the compatibility library, or by any other code that wants to use a custom implementation.
rdar://73726764
CFRunLoopRun returns once it finishes, though the kernel may clean us up
before we get there. We effectively have a race condition between the
kernel cleaning us up and returning from a never returning function.
Small programs likely get cleaned up before reaching the ud2 instruction
emitted after the never returning function in swift, so they don't
notice, but programs of a sufficient size do. At that size, the program
will crash after what the programmer expects to be the end of their
program.