* Revert "[Distributed] disable tests until issue fixed"
This reverts commit 0a04278920.
* Revert "[Distributed] Initial `distributed` actors and functions and new module (#37109)"
This reverts commit 814ede0cf3.
* [Distributed] Initial distributed checking
* [Distributed] initial types shapes and conform to DistributedActor
* [Distributed] Require Codable params and return types
* [Distributed] initial synthesis of fields and constructors
* [Distributed] Field and initializer synthesis
* [Distributed] Codable requirement on distributed funcs; also handle <T: Codable>
* [Distributed] handle generic type params which are Codable in dist func
[Distributed] conformsToProtocol after all
* [Distributed] Implement remote flag on actors
* Implement remote flag on actors
* add test
* actor initializer that sets remote flag
[Distributed] conformances getting there
* [Distributed] dont require async throws; cleanup compile tests
* [Distributed] do not synthesize default implicit init, only our special ones
* [Distributed] properly synth inits and properties; mark actorTransport as _distributedActorIndependent
Also:
- do not synthesize default init() initializer for dist actor
* [Distributed] init(transport:) designated and typechecking
* [Distributed] dist actor initializers MUST delegate to local-init
* [Distributed] check if any ctors in delegation call init(transport:)
* [Distributed] check init(transport:) delegation through many inits; ban invoking init(resolve:using:) explicitly
* [Distributed] disable IRGen test for now
* [Distributed] Rebase cleanups
* [Concurrent] transport and address are concurrent value
* [Distributed] introduce -enable-experimental-distributed flag
* rebase adjustments again
* rebase again...
* [Distributed] distributed functions are implicitly async+throws outside the actor
* [Distributed] implicitly throwing and async distributed funcs
* remove printlns
* add more checks to implicit function test
* [Distributed] resolve initializer now marks the isRemote actor flag
* [Distributed] distributedActor_destroy invoked instead, rather than before normal
* [Distributed] Generate distributed thunk for actors
* [distributed] typechecking for _remote_ functions existing, add tests for remote funcs
* adding one XFAIL'ed task & actor lifetime test
The `executor_deinit1` test fails 100% of the time
(from what I've seen) so I thought we could track
and see when/if someone happens to fix this bug.
Also, added extra coverage for #36298 via `executor_deinit2`
* Fix a memory issue with actors in the runtime system, by @phausler
* add new test that now passes because of patch by @phausler
See previous commit in this PR.
Test is based on one from rdar://74281361
* fix all tests that require the _remote_ function stubs
* Do not infer @actorIndependent onto `let` decls
* REVERT_ME: remove some tests that hacky workarounds will fail
* another flaky test, help build toolchain
* [Distributed] experimental distributed implies experimental concurrency
* [Distributed] Allow distributed function that are not marked async or throws
* [Distributed] make attrs SIMPLE to get serialization generated
* [Distributed] ActorAddress must be Hashable
* [Distributed] Implement transport.actorReady call in local init
* cleanup after rebase
* [Distributed] add availability attributes to all distributed actor code
* cleanup - this fixed some things
* fixing up
* fixing up
* [Distributed] introduce new Distributed module
* [Distributed] diagnose when missing 'import _Distributed'
* [Distributed] make all tests import the module
* more docs on address
* [Distributed] fixup merge issues
* cleanup: remove unnecessary code for now SIMPLE attribute
* fix: fix getActorIsolationOfContext
* [Distributed] cmake: depend on _concurrency module
* fixing tests...
* Revert "another flaky test, help build toolchain"
This reverts commit 83ae6654dd.
* remove xfail
* clenup some IR and SIL tests
* cleanup
* [Distributed] fix cmake test and ScanDependencies/can_import_with_map.swift
* [Distributed] fix flags/build tests
* cleanup: use isDistributed wherever possible
* [Distributed] don't import Dispatch in tests
* dont link distributed in stdlib unittest
* trying always append distributed module
* cleanups
* [Distributed] move all tests to Distributed/ directory
* [lit] try to fix lit test discovery
* [Distributed] update tests after diagnostics for implicit async changed
* [Distributed] Disable remote func tests on Windows for now
* Review cleanups
* [Distributed] fix typo, fixes Concurrency/actor_isolation_objc.swift
* [Distributed] attributes are DistributedOnly (only)
* cleanup
* [Distributed] cleanup: rely on DistributedOnly for guarding the keyword
* Update include/swift/AST/ActorIsolation.h
Co-authored-by: Doug Gregor <dgregor@apple.com>
* introduce isAnyThunk, minor cleanup
* wip
* [Distributed] move some type checking to TypeCheckDistributed.cpp
* [TypeCheckAttr] remove extra debug info
* [Distributed/AutoDiff] fix SILDeclRef creation which caused AutoDiff issue
* cleanups
* [lit] remove json import from lit test suite, not needed after all
* [Distributed] distributed functions only in DistributedActor protocols
* [Distributed] fix flag overlap & build setting
* [Distributed] Simplify noteIsolatedActorMember to not take bool distributed param
* [Distributed] make __isRemote not public
Co-authored-by: Dario Rexin <drexin@apple.com>
Co-authored-by: Kavon Farvardin <kfarvardin@apple.com>
Co-authored-by: Doug Gregor <dgregor@apple.com>
- Introduce an UnownedSerialExecutor type into the concurrency library.
- Create a SerialExecutor protocol which allows an executor type to
change how it executes jobs.
- Add an unownedExecutor requirement to the Actor protocol.
- Change the ABI for ExecutorRef so that it stores a SerialExecutor
witness table pointer in the implementation field. This effectively
makes ExecutorRef an `unowned(unsafe) SerialExecutor`, except that
default actors are represented without a witness table pointer (just
a bit-pattern).
- Synthesize the unownedExecutor method for default actors (i.e. actors
that don't provide an unownedExecutor property).
- Make synthesized unownedExecutor properties `final`, and give them
a semantics attribute specifying that they're for default actors.
- Split `Builtin.buildSerialExecutorRef` into a few more precise
builtins. We're not using the main-actor one yet, though.
Pitch thread:
https://forums.swift.org/t/support-custom-executors-in-swift-concurrency/44425
When the closure of startAsyncLet is no-escaping, the captured values (= the partial_apply arguments) must be kept alive until the endAsyncLet builtin.
ClosureLifetimeFixup adds the generated mark_dependence as a second operand to endAsyncLet, which keeps all the arguments alive until this point.
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
This isn't _terribly_ useful as-is, because the only constant mask you can get at from Swift at present is the zeroinitializer, but even that is quite useful for optimizing the repeating: intializer on SIMD. At some future point we should wire up generating constant masks for the .even, .odd, .high and .low properties (and also eventually make shufflevector take non-constant masks in LLVM). But this is enough to be useful, so let's get it in.
The comment in LowerHopToActor explains the design here.
We want SILGen to emit hops to actors, ignoring executors,
because it's easier to fully optimize in a world where deriving
an executor is a non-trivial operation. But we also want something
prior to IRGen to lower the executor derivation because there are
useful static optimizations we can do, such as doing the derivation
exactly once on a dominance path and strength-reducing the derivation
(e.g. exploiting static knowledge that an actor is a default actor).
There are probably phase-ordering problems with doing this so late,
but hopefully they're restricted to situations like actors that
share an executor. We'll want to optimize that eventually, but
in the meantime, this unblocks the executor work.
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
Tasks shouldn't normally hog the actor context indefinitely after making a call that's bound to
that actor, since that prevents the actor from potentially taking on other jobs it needs to
be able to address. Set up SILGen so that it saves the current executor (using a new runtime
entry point) and hops back to it after every actor call, not only ones where the caller context
is also actor-bound.
The added executor hopping here also exposed a bug in the runtime implementation while processing
DefaultActor jobs, where if an actor job returned to the processing loop having already yielded
the thread back to a generic executor, we would still attempt to make the actor give up the thread
again, corrupting its state.
rdar://71905765
In their previous form, the non-`_f` variants of these entry points were unused, and IRGen
lowered the `createAsyncTask` builtins to use the `_f` variants with a large amount of caller-side
codegen to manually unpack closure values. Amid all this, it also failed to make anyone responsible
for releasing the closure context after the task completed, causing every task creation to leak.
Redo the `swift_task_create_*` entry points to accept the two words of an async closure value
directly, and unpack the closure to get its invocation entry point and initial context size
inside the runtime. (Also get rid of the non-future `swift_task_create` variant, since it's unused
and it's subtly different in a lot of hairy ways from the future forms. Better to add it later
when it's needed than to have a broken unexercised version now.)
We expect to iterate on this quite a bit, both publicly
and internally, but this is a fine starting-point.
I've renamed runAsync to runAsyncAndBlock to underline
very clearly what it does and why it's not long for this
world. I've also had to give it a radically different
implementation in an effort to make it continue to work
given an actor implementation that is no longer just
running all work synchronously.
The major remaining bit of actor-scheduling work is to
make swift_task_enqueue actually do something sensible
based on the executor it's been given; currently it's
expecting a flag that IRGen simply doesn't know to set.
In derivatives of loops, no longer allocate boxes for indirect case payloads. Instead, use a custom pullback context in the runtime which contains a bump-pointer allocator.
When a function contains a differentiated loop, the closure context is a `Builtin.NativeObject`, which contains a `swift::AutoDiffLinearMapContext` and a tail-allocated top-level linear map struct (which represents the linear map struct that was previously directly partial-applied into the pullback). In branching trace enums, the payloads of previously indirect cases will be allocated by `swift::AutoDiffLinearMapContext::allocate` and stored as a `Builtin.RawPointer`.
`Builtin.createAsyncTask` takes flags, an optional parent task, and an
async/throwing function to execute, and passes it along to the
`swift_task_create_f` entry point to create a new (potentially child)
task, returning the new task and its initial context.
Implement a new builtin, `cancelAsyncTask()`, to cancel the given
asynchronous task. This lowers down to a call into the runtime
operation `swift_task_cancel()`.
Use this builtin to implement Task.Handle.cancel().
Rather than produce an "unowned" result from `getCurrentAsyncTask()`,
take advantage of the fact that the task is effectively guaranteed in
the scope. Do so be returning it as "unowned", and push an
end_lifetime cleanup to end the lifetime. This eliminates unnecessary
ref-count traffic as well as introducing another use of unowned.
Approach is thanks to Michael Gottesman, bugs are mine.
This introduces a new builtin, `getCurrentAsyncTask()`, that produces a
reference to the current task. This builtin can only be used within
`async` functions, and IR generation merely grabs the task argument
and packages it up.
The type of this function is `() -> Builtin.NativeObject`, because we
don't currently have a Swift-level representation of tasks, and can
probably handle everything through builtins or runtime calls.
Define type signatures and SILGen for the following builtins:
```
/// Applies the {jvp|vjp} of `f` to `arg1`, ..., `argN`.
func applyDerivative_arityN_{jvp|vjp}(f, arg1, ..., argN) -> jvp/vjp return type
/// Applies the transpose of `f` to `arg`.
func applyTranspose_arityN(f, arg) -> transpose return type
/// Makes a differentiable function from the given `original`, `jvp`, and
/// `vjp` functions.
func differentiableFunction_arityN(original, jvp, vjp)
/// Makes a linear function from the given `original` and `transpose` functions.
func linearFunction_arityN(original, transpose)
```
Add SILGen FileCheck tests for all builtins.
(BaseT, @inout @unowned(unsafe) T) -> @guaranteed T
The reason for the weird signature is that currently the Builtin infrastructure
does not handle results well. Also, note that we are not actually performing a
call here. We are SILGening directly so we can create a guaranteed result.
The intended semantics is that one passes in a base value that guarantees the
lifetime of the unowned(unsafe) value. The builtin then:
1. Borrows the base.
2. Loads the trivial unowned (unsafe), converts that value to a guaranteed ref
after unsafely unwrapping the optional.
3. Uses mark dependence to tie the lifetimes of the guaranteed base to the
guaranteed ref.
I also updated my small UnsafeValue.swift test to make sure we get the codegen
we expect.
The signature is:
(T, @inout @unowned(unsafe) Optional<T>) -> ()
The reason for the weird signature is that currently the Builtin infrastructure
does not handle results well.
The semantics of this builtin is that it enables one to store the first argument
into an unowned unsafe address without any reference counting operations. It
does this just by SILGening the relevant code. The optimizer chews through this
code well, so we get the expected behavior.
I also included a small proof of concept to validate that this builtin works as
expected.
and eliminate dead code. This is meant to be a replacement for the utility:
recursivelyDeleteTriviallyDeadInstructions. The new utility performs more aggresive
dead-code elimination for ownership SIL.
This patch also migrates most non-force-delete uses of
recursivelyDeleteTriviallyDeadInstructions to the new utility.
and migrates one force-delete use of recursivelyDeleteTriviallyDeadInstructions
(in IRGenPrepare) to use the new utility.
TLDR: This patch introduces a new kind of builtin, "a polymorphic builtin". One
calls it like any other builtin, e.x.:
```
Builtin.generic_add(x, y)
```
but it has a contract: it must be specialized to a concrete builtin by the time
we hit Lowered SIL. In this commit, I add support for the following generic
operations:
Type | Op
------------------------
FloatOrVector |FAdd
FloatOrVector |FDiv
FloatOrVector |FMul
FloatOrVector |FRem
FloatOrVector |FSub
IntegerOrVector|AShr
IntegerOrVector|Add
IntegerOrVector|And
IntegerOrVector|ExactSDiv
IntegerOrVector|ExactUDiv
IntegerOrVector|LShr
IntegerOrVector|Mul
IntegerOrVector|Or
IntegerOrVector|SDiv
IntegerOrVector|SRem
IntegerOrVector|Shl
IntegerOrVector|Sub
IntegerOrVector|UDiv
IntegerOrVector|Xor
Integer |URem
NOTE: I only implemented support for the builtins in SIL and in SILGen. I am
going to implement the optimizer parts of this in a separate series of commits.
DISCUSSION
----------
Today there are polymorphic like instructions in LLVM-IR. Yet, at the
swift and SIL level we represent these operations instead as Builtins whose
names are resolved by splatting the builtin into the name. For example, adding
two things in LLVM:
```
%2 = add i64 %0, %1
%2 = add <2 x i64> %0, %1
%2 = add <4 x i64> %0, %1
%2 = add <8 x i64> %0, %1
```
Each of the add operations are done by the same polymorphic instruction. In
constrast, we splat out these Builtins in swift today, i.e.:
```
let x, y: Builtin.Int32
Builtin.add_Int32(x, y)
let x, y: Builtin.Vec4xInt32
Builtin.add_Vec4xInt32(x, y)
...
```
In SIL, we translate these verbatim and then IRGen just lowers them to the
appropriate polymorphic instruction. Beyond being verbose, these prevent these
Builtins (which need static types) from being used in polymorphic contexts where
we can guarantee that eventually a static type will be provided.
In contrast, the polymorphic builtins introduced in this commit can be passed
any type, with the proviso that the expert user using this feature can guarantee
that before we reach Lowered SIL, the generic_add has been eliminated. This is
enforced by IRGen asserting if passed such a builtin and by the SILVerifier
checking that the underlying builtin is never called once the module is in
Lowered SIL.
In forthcoming commits, I am going to add two optimizations that give the stdlib
tool writer the tools needed to use this builtin:
1. I am going to add an optimization to constant propagation that changes a
"generic_*" op to the type of its argument if the argument is a type that is
valid for the builtin (i.e. integer or vector).
2. I am going to teach the SILCloner how to specialize these as it inlines. This
ensures that when we transparent inline, we specialize the builtin automatically
and can then form SSA at -Onone using predictable memory access operations.
The main implication around these polymorphic builtins are that if an author is
not able to specialize the builtin, they need to ensure that after constant
propagation, the generic builtin has been DCEed. The general rules are that the
-Onone optimizer will constant fold branches with constant integer operands. So
if one can use a bool of some sort to trigger the operation, one can be
guaranteed that the code will not codegen. I am considering putting in some sort
of diagnostic to ensure that the stdlib writer has a good experience (e.x. get
an error instead of crashing the compiler).
1. builtin "int_expect", which makes the evaluator work on more
integer operations such as left/right shift (with traps) and
integer conversions.
2. builtin "_assertConfiguration", which enables the evaluator
to work with debug stdlib.
3. builtin "ptrtoint", which enables the evaluator to track
StaticString
4. _assertionFailure API, which enables the evaluator to report
stdlib assertion failures encountered during constant evaluation.
Also, enable attaching auxiliary data with the enum "UnknownReason"
and use it to improve diagnostics for UnknownSymbolicValues,
which represent failures during constant evaluation.
Returns `true` if `T.Type` is known to refer to a concrete type. The
implementation allows for the optimizer to specialize this at -O and
eliminate conditional code.
Includes `Swift._isConcrete<T>(T.Type) -> Bool` wrapper function.
We have to be source compatible to be able to parse old swiftinterface files where the old Builtin.condfail is used in inlineable functions.
rdar://problem/53176692
The SIL generation for this builtin also changes: instead of generating the cond_fail instructions upfront, let the optimizer generate it, if the operand is a static string literal.
In worst case, if the second operand is not a static string literal, the Builtin.condfail is lowered at the end of the optimization pipeline with a default message: "unknown program error".
The SIL generation for this builtin also changes: instead of generating the cond_fail instructions upfront, let the optimizer generate it, if the operand is a static string literal.
In worst case, if the second operand is not a static string literal, the Builtin.condfail is lowered at the end of the optimization pipeline with a default message: "unknown program error".
String -> Builtin.RawPointer that given a string constructed from a
literal, returns the address of the string literal in the global string
table of the compiled binary as a pointer.
This fixes the Windows platform, where the aligned allocation path is
not malloc-compatible. It won't have any observable difference on
Darwin or Linux, aside from manually allocated memory on Linux now
being consistently 16-byte aligned (heap objects will still be 8-byte
aligned on Linux).
It is unfortunate that we can't guarantee Swift-allocated memory via
Unsafe*Pointer is malloc compatible on Windows. It would have been
nice for that to be a cross platform guarantee since it's normal to
allocate in C and deallocate in Swift or vice-versa. Now we have to
tell developers to always use _aligned_malloc/_aligned_free when
transitioning between Swift/C if they expect their code to work on
Windows.
Even though this fix isn't required today on Darwin/Linux, it makes
good sense to guarantee that the allocation/deallocation paths are
consistent.
This is done by specifying a constant that stdlib can use to round up
alignment, _swift_MinAllocationAlignment. The runtime asserts that
this constant is greater than MALLOC_ALIGN_MASK for all platforms.
This way, manually allocated buffers will always use the aligned
allocation path. If users specify an alignment less than m
round up so users don't need
to pass the same alignment to deallocate the buffer). This constant
does not need to be ABI.
Alternatives are:
1. Require users of Unsafe*Pointer to specify the same alignment
during deallocation. This is obviously madness.
2. Introduce new runtime entry points:
swift_alignedAlloc/swift_alignedDealloc, introduce corresponding
new builtins, and have Unsafe*Pointer always call those. This would
make the runtime API a little more obvious but would introduce
complexity in other areas of the compiler and it doesn't have any
other significant benefit. Less than 16-byte alignment of manually
allocated buffers on Linux is a non-goal.
* Remove apparently obsolete builtin functions.
- Remove s_to_u_checked_conversion and u_to_s_checked_conversion functions from builtin AST parsing, SIL/IR generation and from SIL optimisations.
* Remove apparently obsolete builtin functions - unit tests.
- Remove unit tests for SIL transformations relating to s_to_u_checked_conversion and u_to_s_checked_conversion builtin functions.