When the differentiating a function containing loops, we allocate a linear map context object on the heap. This context object may store non-trivial objects, such as closures, that need to be released explicitly. Fix the autodiff linear map context allocation builtins to correctly release such objects and not just free the memory they occupy.
The `bare` attribute indicates that the object header is not used throughout the lifetime of the object.
This means, no reference counting operations are performed on the object and its metadata is not used.
The header of bare objects doesn't need to be initialized.
Reformatting everything now that we have `llvm` namespaces. I've
separated this from the main commit to help manage merge-conflicts and
for making it a bit easier to read the mega-patch.
This is phase-1 of switching from llvm::Optional to std::optional in the
next rebranch. llvm::Optional was removed from upstream LLVM, so we need
to migrate off rather soon. On Darwin, std::optional, and llvm::Optional
have the same layout, so we don't need to be as concerned about ABI
beyond the name mangling. `llvm::Optional` is only returned from one
function in
```
getStandardTypeSubst(StringRef TypeName,
bool allowConcurrencyManglings);
```
It's the return value, so it should not impact the mangling of the
function, and the layout is the same as `std::optional`, so it should be
mostly okay. This function doesn't appear to have users, and the ABI was
already broken 2 years ago for concurrency and no one seemed to notice
so this should be "okay".
I'm doing the migration incrementally so that folks working on main can
cherry-pick back to the release/5.9 branch. Once 5.9 is done and locked
away, then we can go through and finish the replacement. Since `None`
and `Optional` show up in contexts where they are not `llvm::None` and
`llvm::Optional`, I'm preparing the work now by going through and
removing the namespace unwrapping and making the `llvm` namespace
explicit. This should make it fairly mechanical to go through and
replace llvm::Optional with std::optional, and llvm::None with
std::nullopt. It's also a change that can be brought onto the
release/5.9 with minimal impact. This should be an NFC change.
Although nonescaping closures are representationally trivial pointers to their
on-stack context, it is useful to model them as borrowing their captures, which
allows for checking correct use of move-only values across the closure, and
lets us model the lifetime dependence between a closure and its captures without
an ad-hoc web of `mark_dependence` instructions.
During ownership elimination, We eliminate copy/destroy_value instructions and
end the partial_apply's lifetime with an explicit dealloc_stack as before,
for compatibility with existing IRGen and non-OSSA aware passes.
The variants are produced by SILGen when opaque values are enabled.
They are necessary because otherwise SILGen would produce
address_to_pointer of values.
They will be lowered by AddressLowering.
The metatype value isn't needed to do the memory layout computation, and
IRGen only looks at the substitutions passed to the generic signature, so
we can reduce SIL code size, and avoid leaving behind useless metadata
accesses, if we don't emit it.
This ensures that opened archetypes always inherit any outer generic parameters from the context in which they reside. This matters because class bounds may bind generic parameters from these outer contexts, and without the outer context you can wind up with ill-formed generic environments like
<τ_0_0, where τ_0_0 : C<T>, τ_0_0 : P>
Where T is otherwise unbound because there is no entry for it among the generic parameters of the environment's associated generic signature.
Opened archetypes can be created in the constraint system, and the
existential type it wraps can contain type variables. This can happen
when the existential type is inferred through a typealias inside a
generic type, and a member reference whose base is the opened existential
gets bound before binding the generic arguments of the parent type.
However, simplifying opened archetypes to replace type variables is
not yet supported, which leads to type variables escaping the constraint
system. We can support cases where the underlying existential type doesn't
depend on the type variables by canonicalizing it when opening the
existential. Cases where the underlying type requires resolved generic
arguments are still unsupported for now.
Required for UnsafeRawPointer.withMemoryReboud(to:).
%out_token = rebind_memory %0 : $Builtin.RawPointer to %in_token
%0 must be of $Builtin.RawPointer type
%in_token represents a cached set of bound types from a prior memory state.
%out_token is an opaque $Builtin.Word representing the previously bound
types for this memory region.
This instruction's semantics are identical to ``bind_memory``, except
that the types to which memory will be bound, and the extent of the
memory region is unknown at compile time. Instead, the bound-types are
represented by a token that was produced by a prior memory binding
operation. ``%in_token`` must be the result of bind_memory or
Required for UnsafeRawPointer.withMemoryRebound(to:)
%token = bind_memory %0 : $Builtin.RawPointer, %1 : $Builtin.Word to $T
%0 must be of $Builtin.RawPointer type
%1 must be of $Builtin.Word type
%token is an opaque $Builtin.Word representing the previously bound types
for this memory region.
This patch introduces a new stdlib function called _move:
```Swift
@_alwaysEmitIntoClient
@_transparent
@_semantics("lifetimemanagement.move")
public func _move<T>(_ value: __owned T) -> T {
#if $ExperimentalMoveOnly
Builtin.move(value)
#else
value
#endif
}
```
It is a first attempt at creating a "move" function for Swift, albeit a skleton
one since we do not yet perform the "no use after move" analysis. But this at
leasts gets the skeleton into place so we can built the analysis on top of it
and churn tree in a manageable way. Thus in its current incarnation, all it does
is take in an __owned +1 parameter and returns it after moving it through
Builtin.move.
Given that we want to use an OSSA based analysis for our "no use after move"
analysis and we do not have opaque values yet, we can not supporting moving
generic values since they are address only. This has stymied us in the past from
creating this function. With the implementation in this PR via a bit of
cleverness, we are now able to support this as a generic function over all
concrete types by being a little clever.
The trick is that when we transparent inline _move (to get the builtin), we
perform one level of specialization causing the inlined Builtin.move to be of a
loadable type. If after transparent inlining, we inline builtin "move" into a
context where it is still address only, we emit a diagnostic telling the user
that they applied move to a generic or existential and that this is not yet
supported.
The reason why we are taking this approach is that we wish to use this to
implement a new (as yet unwritten) diagnostic pass that verifies that _move
(even for non-trivial copyable values) ends the lifetime of the value. This will
ensure that one can write the following code to reliably end the lifetime of a
let binding in Swift:
```Swift
let x = Klass()
let _ = _move(x)
// hypotheticalUse(x)
```
Without the diagnostic pass, if one were to write another hypothetical use of x
after the _move, the compiler would copy x to at least hypotheticalUse(x)
meaning the lifetime of x would not end at the _move, =><=.
So to implement this diagnostic pass, we want to use the OSSA infrastructure and
that only works on objects! So how do we square this circle: by taking advantage
of the mandatory SIL optimzier pipeline! Specifically we take advantage of the
following:
1. Mandatory Inlining and Predictable Dead Allocation Elimination run before any
of the move only diagnostic passes that we run.
2. Mandatory Inlining is able to specialize a callee a single level when it
inlines code. One can take advantage of this to even at -Onone to
monomorphosize code.
and then note that _move is such a simple function that predictable dead
allocation elimination is able to without issue eliminate the extra alloc_stack
that appear in the caller after inlining without issue. So we (as the tests
show) get SIL that for concrete types looks exactly like we just had run a
move_value for that specific type as an object since we promote away the
stores/loads in favor of object operations when we eliminate the allocation.
In order to prevent any issue with this being used in a context where multiple
specializations may occur, I made the inliner emit a diagnostic if it inlines
_move into a function that applies it to an address only value. The diagnostic
is emitted at the source location where the function call occurs so it is easy
to find, e.x.:
```
func addressOnlyMove<T>(t: T) -> T {
_move(t) // expected-error {{move() used on a generic or existential value}}
}
moveonly_builtin_generic_failure.swift:12:5: error: move() used on a generic or existential value
_move(t)
^
```
To eliminate any potential ABI impact, if someone calls _move in a way that
causes it to be used in a context where the transparent inliner will not inline
it, I taught IRGen that Builtin.move is equivalent to a take from src -> dst and
marked _move as always emit into client (AEIC). I also took advantage of the
feature flag I added in the previous commit in order to prevent any cond_fails
from exposing Builtin.move in the stdlib. If one does not pass in the flag
-enable-experimental-move-only then the function just returns the value without
calling Builtin.move, so we are safe.
rdar://83957028
This patch updates the asynchronous main function to run the first thunk
of the function synchronously through a call to `swift_job_run`.
The runloop is killed by exiting or aborting the task that it is running
on. As such, we need to ensure that the task contains an async function
that either calls exit explicitly or aborts. The AsyncEntryPoint, that
contains this code, was added in the previous patch. This patch adds the
pieces for the actual implementation of this behaviour as well as adding
the necessary code to start the runloop.
There are now four layers of main functions before hitting the "real"
code.
@main: This is the actual main entrypoint of the program. This
constructs the task containing @async_main, grabs the main executor,
runs swift_job_run to run the first part synchronously, and finally
kicks off the runloop with a call to _asyncMainDrainQueue. This is
generated in the call to `emitAsyncMainThreadStart`.
@async_main: This thunk exists to ensure that the main function calls
`exit` at some point so that the runloop stops. It also handles emitting
an error if the user-written main function throws.
e.g:
```
func async_main() async -> () {
do {
try await Main.$main()
exit(0)
} catch {
_errorInMain(error)
}
}
```
Main.$main(): This still has the same behaviour as with the
synchronous case. It just calls `try await Main.main()` and exists to
simplify typechecking.
Main.main(): This is the actual user-specified main. It serves the same
purpose as in the synchronous, allowing the programmer to write code,
but it's async!
The control flow in `emitFunctionDefinition` is a little confusing (to
me anyway), so here it is spelled out:
If the main function is synchronous, the `constant.kind` will be a
`SILDeclRef::Kind::EntryPoint`, but the `decl` won't be async, so it
drops down to `emitArtificalTopLevel` anyway.
If the main function is async and we're generating `@main`, the
`constant.kind` will be `SILDeclRef::Kind::AsyncEntryPoint`, so we also
call `emitArtificalTopLevel`. `emitArtificalTopLevel` is responsible for
detecting whether the decl is async and deciding whether to emit code to
extract the argc/argv variables that get passed into the actual main
entrypoint to the program. If we're generating the `@async_main` body,
the kind will be `SILDeclRef::Kind::EntryPoint` and the `decl` will be
async, so we grab the mainEntryPoint decl and call
`emitAsyncMainThreadStart` to generate the wrapping code.
Note; there is a curious change in `SILLocation::getSourceLoc()`
where instead of simply checking `isFilenameAndLocation()`, I change it
to `getStorageKind() == FilenameAndLocationKind`. This is because the
SILLocation returned is to a FilenameAndLocationKind, but the actual
storage returns true for the call to `isNull()` inside of the
`isFilenameAndLocation()` call. This results in us incorrectly falling
through to the `getASTNode()` call below that, which asserts when asked
to get the AST node of a location.
I also did a little bit of refactoring in the SILGenModule for grabbing
intrinsics. Previously, there was only a `getConcurrencyIntrinsic`
function, which would only load FuncDecls out of the concurrency
module. The `exit` function is in the concurrency shims module, so I
refactored the load code to take a ModuleDecl to search from.
The emitBuiltinCreateAsyncTask function symbol is exposed from
SILGenBuiltin so that it is available from SILGenFunction. There is a
fair bit of work involved going from what is available at the SGF to
what is needed for actually calling the CreateAsyncTask builtin, so in
order to avoid additional maintenance, it's good to re-use that.
Right now this does not actually do anything beyond causing a move_value
instruction to be emitted. With time, I am going to use this to map T ->
@_moveOnly T in the fullness of time... but I am going to stage in that part in
a different commit once I add the MoveOnly type itself. I am trying to split up
that larger commit as much as possible to make it easy to review.
Use APIs for creating terminator results that handle forwarding
ownership consistently.
Add ManagedValue::forForwardedRValue(SILValue) to handle cleanups
consistently based on ownership forwarding.
Add SILGenBuilder::createForwardedTermResult(SILType type) for
creating termator results with the correct ownership and cleanups.
Add SILGenBuilder::createTermResult(SILType type, ValueOwnershipKind
ownership) that handles cleanup based on terminator result ownership.
Add SILGenBuilder::createOptionalSomeResult(SwitchEnumInst) so a lot
of code doesn't need to deal with unwrapping Optional types,
terminator results, and ownership rules.
Replace the existing "phi" APIs with a single
SILGenBuilder::createPhi(SILType, ValueOwnershipKind) that handles
cleanup based on phi ownership.
Phis and terminator results are fundamentally different and need to be handled differently everywhere. Remove the confusion where terminator results were generated with a "phi argument" API.
Rather than using group task options constructed from the Swift parts
of the _Concurrency library and passed through `createAsyncTask`'s
options, introduce a separate builtin that always takes a group. Move
the responsibility for creating the options structure into IRGen, so
we don't need to expose the TaskGroupTaskOptionRecord type in Swift.
Introduce a builtin `createAsyncTask` that maps to `swift_task_create`,
and use that for the non-group task creation operations based on the
task-creation flags. `swift_task_create` and the thin function version
`swift_task_create_f` go through the dynamically-replaceable
`swift_task_create_common`, where all of the task creation logic is
present.
While here, move copying of task locals and the initial scheduling of
the task into `swift_task_create_common`, enabling by separate flags.
introduce new options parameter to all task spawning
[Concurrency] ABI for asynclet start to accept options
[Concurrency] fix unittest usages of changed task creation ABI
[Concurrency] introduce constants for parameter indexes in ownership
[Concurrency] fix test/SILOptimizer/closure_lifetime_fixup_concurrency.swift
SILGen this builtin to a mandatory hop_to_executor with an actor type
operand.
e.g.
Task.detached {
Builtin.hopToActor(MainActor.shared)
await suspend()
}
Required to fix a bug in _runAsyncMain.
- Introduce an UnownedSerialExecutor type into the concurrency library.
- Create a SerialExecutor protocol which allows an executor type to
change how it executes jobs.
- Add an unownedExecutor requirement to the Actor protocol.
- Change the ABI for ExecutorRef so that it stores a SerialExecutor
witness table pointer in the implementation field. This effectively
makes ExecutorRef an `unowned(unsafe) SerialExecutor`, except that
default actors are represented without a witness table pointer (just
a bit-pattern).
- Synthesize the unownedExecutor method for default actors (i.e. actors
that don't provide an unownedExecutor property).
- Make synthesized unownedExecutor properties `final`, and give them
a semantics attribute specifying that they're for default actors.
- Split `Builtin.buildSerialExecutorRef` into a few more precise
builtins. We're not using the main-actor one yet, though.
Pitch thread:
https://forums.swift.org/t/support-custom-executors-in-swift-concurrency/44425
Tasks shouldn't normally hog the actor context indefinitely after making a call that's bound to
that actor, since that prevents the actor from potentially taking on other jobs it needs to
be able to address. Set up SILGen so that it saves the current executor (using a new runtime
entry point) and hops back to it after every actor call, not only ones where the caller context
is also actor-bound.
The added executor hopping here also exposed a bug in the runtime implementation while processing
DefaultActor jobs, where if an actor job returned to the processing loop having already yielded
the thread back to a generic executor, we would still attempt to make the actor give up the thread
again, corrupting its state.
rdar://71905765
Most of the async runtime functions have been changed to not
expect the task and executor to be passed in. When knowing the
task and executor is necessary, there are runtime functions
available to recover them.
The biggest change I had to make to a runtime function signature
was to swift_task_switch, which has been altered to expect to be
passed the context and resumption function instead of requiring
the caller to park the task. This has the pleasant consequence
of allowing the implementation to very quickly turn around when
it recognizes that the current executor is satisfactory. It does
mean that on arm64e we have to sign the continuation function
pointer as an argument and then potentially resign it when
assigning into the task's resume slot.
rdar://70546948
In their previous form, the non-`_f` variants of these entry points were unused, and IRGen
lowered the `createAsyncTask` builtins to use the `_f` variants with a large amount of caller-side
codegen to manually unpack closure values. Amid all this, it also failed to make anyone responsible
for releasing the closure context after the task completed, causing every task creation to leak.
Redo the `swift_task_create_*` entry points to accept the two words of an async closure value
directly, and unpack the closure to get its invocation entry point and initial context size
inside the runtime. (Also get rid of the non-future `swift_task_create` variant, since it's unused
and it's subtly different in a lot of hairy ways from the future forms. Better to add it later
when it's needed than to have a broken unexercised version now.)
The underlying runtime functions really want to be able to consume the closure being used
to spawn the task, but the implementation was trying to hide this by introducing a retain
at IRGen time, which is not very ARC-optimizer-friendly. Correctly model the builtin operands
as consumed so that the ownership verifier allows them to be taken +1.
Refactor SILGen's ApplyOptions into an OptionSet, add a
DoesNotAwait flag to go with DoesNotThrow, and sink it
all down into SILInstruction.h.
Then, replace the isNonThrowing() flag in ApplyInst and
BeginApplyInst with getApplyOptions(), and plumb it
through to TryApplyInst as well.
Set the flag when SILGen emits a sync call to a reasync
function.
When set, this disables the SIL verifier check against
calling async functions from sync functions.
Finally, this allows us to add end-to-end tests for
rdar://problem/71098795.