The current code generation will emit an autibsp after adjusting the
stack pointe for the tail call. If callee and caller argument area does
not match this would fail.
Most of the async runtime functions have been changed to not
expect the task and executor to be passed in. When knowing the
task and executor is necessary, there are runtime functions
available to recover them.
The biggest change I had to make to a runtime function signature
was to swift_task_switch, which has been altered to expect to be
passed the context and resumption function instead of requiring
the caller to park the task. This has the pleasant consequence
of allowing the implementation to very quickly turn around when
it recognizes that the current executor is satisfactory. It does
mean that on arm64e we have to sign the continuation function
pointer as an argument and then potentially resign it when
assigning into the task's resume slot.
rdar://70546948
Previously, thick async functions were represented sometimes as a pair
of (AsyncFunctionPointer, nullptr)--when the thick function was produced
via a thin_to_thick_function, e.g.--and sometimes as a pair of
(FunctionPointer, ThickContext)--when the thick function was produced by
a partial_apply--with the size stored in the slot of the ThickContext.
That optimized for the wrong case: partial applies of dynamic async
functions; in that case, there is no appropriate AsyncFunctionPointer to
form when lowering the partial_apply instruction. The far more common
case is to know exactly which function is being partially applied. In
that case, we can form the appropriate AsyncFunctionPointer.
Furthermore, the previous representation made calling a thick function
more complex: it was always necessary to check whether the context was
in fact null and then proceed along two different paths depending.
Here, that behavior is corrected by creating a thunk in a mandatory
IRGen SIL pass in the case that the function that is being partially
applied is dynamic. That new thunk is then partially applied in place
of the original partial_apply of the dynamic function.
The `coro.end.async` intrinsic allow specifying a function that is to be
tail-called as the last thing before returning.
LLVM lowering will inline the `must-tail-call` function argument to
`coro.end.async`. This `must-tail-call` function can contain a
`musttail` call.
```
define @my_must_tail_call_func(void (*)(i64) %fnptr, i64 %args) {
musttail call void %fnptr(i64 %args)
ret void
}
define @async_func() {
...
coro.end.async(..., @my_must_tail_call_func, %return_continuation, i64 %args)
unreachable
}
```
First, just call an async -> T function instead of forcing the caller
to piece together which case we're in and perform its own copy. This
ensures that the task is actually kept alive properly.
Second, now that we no longer implicitly depend on the waiting tasks
being run synchronously, go ahead and schedule them to run on the
global executor.
This solves some problems which were blocking the work on TLS-ifying
the task/executor state.
This is conditional on UseAsyncLowering and in the future should also be
conditional on `clangTargetInfo.isSwiftAsyncCCSupported()` once that
support is merged.
Update tests to work either with swiftcc or swifttailcc.
LLVM does type based analysis on sret storage types. This is a problem
with non-fixed types whose storage type does not contain the non-fixed
part.
rdar://73778591
Compiler:
- Add `Forward` and `Reverse` to `DifferentiabilityKind`.
- Expand `DifferentiabilityMask` in `ExtInfo` to 3 bits so that it now holds all 4 cases of `DifferentiabilityKind`.
- Parse `@differentiable(reverse)` and `@differentiable(_forward)` declaration attributes and type attributes.
- Emit a warning for `@differentiable` without `reverse`.
- Emit an error for `@differentiable(_forward)`.
- Rename `@differentiable(linear)` to `@differentiable(_linear)`.
- Make `@differentiable(reverse)` type lowering go through today's `@differentiable` code path. We will specialize it to reverse-mode in a follow-up patch.
ABI:
- Add `Forward` and `Reverse` to `FunctionMetadataDifferentiabilityKind`.
- Extend `TargetFunctionTypeFlags` by 1 bit to store the highest bit of differentiability kind (linear). Note that there is a 2-bit gap in `DifferentiabilityMask` which is reserved for `AsyncMask` and `ConcurrentMask`; `AsyncMask` is ABI-stable so we cannot change that.
_Differentiation module:
- Replace all occurrences of `@differentiable` with `@differentiable(reverse)`.
- Delete `_transpose(of:)`.
Resolves rdar://69980056.
In __swift_async_resume_project_context, the context is stored into the
extended frame. On arm64e, it is signed first. Previously, that signed
context was returned fro the function. That resulted in code like
pacda x16, x10
str x16, [x9]
ldr x9, [x16, #0x48]
where the context is signed (pacda), stored into the extended frame
(str) and then an attempt is made to load from the signed context (ldr).
Here, the unsigned context is returned from the function.
* Adds support for generating code that uses swiftasync parameter lowering.
* Currently only arm64's llvm lowering supports the swift_async_context_addr intrinsic.
* Add arm64e pointer signing of updated swift_async_context_addr.
This commit needs the PR llvm-project#2291.
* [runtime] unittests should use just-built compiler if the runtime did
This will start to matter with the introduction of usage of swiftasync parameters which only very recent compilers support.
rdar://71499498
Previously, swift_suspend_dispatch was passed a pointer to a function
and createAsyncDispatchFn was passed a FunctionPointer. The latter
constructed a new FunctionPointer using the passed-in function pointer
as the value, because the value inside the FunctionPointer was a value
in a different function.
That worked fine on platforms without pointer authentication. On
arm64e, however, calling a function can use two values from a
FunctionPointer: the pointer to the function and the discriminator. The
result was that on arm64e, swift_suspend_dispatch failed verification
because the discriminator value that was used in the call made by
swift_suspend_dispatch did not originate in that function.
Here, that problem is resolved by passing the discriminator to
swift_suspend_dispatch. Now, createAsyncDispatchFn creates a
FunctionPointer using not just the passed-in function pointer but also
the passed-in discriminator. The result is that swift_suspend_dispatch
no longer fails verification.
A partial apply of an async non-direct function entails storing a
pointer to the function's AsyncFunctionPointer into the thick context
using a specific (IGM.getOptions().PointerAuth.PartialApplyCapture)
ptrauth schema. Previously, the incorrect schema (derived from the
function type) was used to auth the ptr-to-AsyncFunctionPointer and then
to again sign the ptr-to-function. Here that error is corrected. Now,
the pointer-to-AsyncFunctionPointer is auth'd using that specific schema
and then the extracted function pointer is again signed again using it.
For this, store those 3 values on the stack at function entry and update them with the return values of coro_suspend_async intrinsic calls.
This fixes a correctness issue, because the executor may be different after a resume.
It also is more efficient, because this means that the 3 values don't have to preserved in the context over a suspension point.
Now that the convention for partial apply forwarders of async functions
is that the thick context is embedded within the async context, there is
never a need for a placeholder thick context. Here, the placeholder
thick context is only added when the function for which a partial apply
forwarder is being emitted is not async.
Previously, the thick context was passed as a fourth parameter to
partial apply forwarders. Here, the thick context is instead moved into
the async context at the local context position. To support this, the
local context is made always available.
An AsyncFunctionPointer, defined in Task.h, is a struct consisting of
two i32s: (1) the relative address of the async function and (2) the
size of the async context to be allocated when calling that function.
Here, such structs are emitted for every async SILFunction that is
emitted.
Previously, an IRGenFunction was being passed to the functions that
construct an AsyncContextLayout. That was not actually necessary and
prevented construction of the layout in contexts where no IRGenFunction
was present. Here that requirement is eased to requiring an IRGenModule
which is indeed required to construct an AsyncContextLayout.
Previously, the task and executor values passed to a partial apply
forwarder were being ignored. Here, they are passed along to the
partially applied function.
The previous stage of bringup only had async functions taking a single
argument: the async context. The next stage will involve the task and
executor. Here, arguments are added for those values. To begin with,
null is always passed for these values.
The majority of support comes in the form of emitting partial
application forwarders for partial applications of async functions.
Such a partial application forwarder must take an async context which
has been partially populated at the apply site. It is responsible for
populating it "the rest of the way". To do so, like sync partial
application forwarders, it takes a second argument, its context, from
which it pulls the additional arguments which were capture at
partial_apply time.
The size of the async context that is passed to the forwarder, however,
can't be known at the apply site by simply looking at the signature of
the function to be applied (not even by looking at the size associated
with the function in the special async function pointer constant which
will soon be emitted). The reason is that there are an unknown (at the
apply site) number of additional arguments which will be filled by the
partial apply forwarder (and in the case of repeated partial
applications, further filled in incrementally at each level). To enable
this, there will always be a heap object for thick async functions.
These heap objects will always store the size of the async context to be
allocated as their first element. (Note that it may be possible to
apply the same optimization that was applied for thick sync functions
where a single refcounted object could be used as the context; doing so,
however, must be made to interact properly with the async context size
stored in the heap object.)
To continue to allow promoting thin async functions to thick async
functions without incurring a thunk, at the apply site, a null-check
will be performed on the context pointer. If it is null, then the async
context size will be determined based on the signature. (When async
function pointers become pointers to a constant with a size i32 and a
relative address to the underlying function, the size will be read from
that constant.) When it is not-null, the size will be pulled from the
first field of the context (which will in that case be cast to
<{%swift.refcounted, i32}>).
To facilitate sharing code and preserving the original structure of
emitPartialApplicationForwarder (which weighed in at roughly 700 lines
prior to this change), a new small class hierarchy, descending from
PartialApplicationForwarderEmission has been added, with subclasses for
the sync and async case. The shuffling of arguments into and out of the
final explosion that was being performed in the synchronous case has
been preserved there, though the arguments are added and removed through
a number of methods on the superclass with more descriptive names. That
was necessary to enable the async class to handle these different
flavors of parameters correctly.
To get some initial test coverage, the preexisting
IRGen/partial_apply.sil and IRGen/partial_apply_forwarder.sil tests have
been duplicated into the async folder. Those tests cases within these
files which happened to have been crashing have each been extracted into
its own runnable test that both verifies that the compiler does not
crash and also that the partial application forwarder behaves correctly.
The FileChecks in these tests are extremely minimal, providing only
enough information to be sure that arguments are in fact squeezed into
an async context.
Here, the following is implemented:
- Construction of SwiftContext struct with the fields needed for calling
functions.
- Allocating and deallocating these swift context via runtime calls
before calling async functions and after returning from them.
- Storing arguments (including bindings and the self parameter but not
including protocol fields for witness methods) and returns (both
direct and indirect).
- Calling async functions.
Additional things that still need to be done:
- protocol extension methods
- protocol witness methods
- storing yields
- partial applies
This became necessary after recent function type changes that keep
substituted generic function types abstract even after substitution to
correctly handle automatic opaque result type substitution.
Instead of performing the opaque result type substitution as part of
substituting the generic args the underlying type will now be reified as
part of looking at the parameter/return types which happens as part of
the function convention apis.
rdar://62560867
A [onstack] closure does not take ownership of its arguments. It is
therefore not correct to use an initWithTake copy of indirect arguments.
Instead we just capture the address of the indirect argument.
rdar://61261982
* Use in_guaranteed for let captures
With this all let values will be captured with in_guaranteed convention
by the closure. Following are the main changes :
SILGen changes:
- A new CaptureKind::Immutable is introduced, to capture let values as in_guaranteed.
- SILGen of in_guaranteed capture had to be fixed.
in_guaranteed captures as per convention are consumed by the closure. And so SILGen should not generate a destroy_addr for an in_guaranteed capture.
But LetValueInitialization can push Dealloc and Release states of the captured arg in the Cleanup stack, and there is no way to access the CleanupHandle and disable the emission of destroy_addr while emitting the captures in SILGenFunction::emitCaptures.
So we now create, temporary allocation of the in_guaranteed capture iduring SILGenFunction::emitCaptures without emitting destroy_addr for it.
SILOptimizer changes:
- Handle in_guaranteed in CopyForwarding.
- Adjust dealloc_stack of in_guaranteed capture to occur after destroy_addr for on_stack closures in ClosureLifetimeFixup.
IRGen changes :
- Since HeapLayout can be non-fixed now, make sure emitSize is used conditionally
- Don't consider ClassPointerSource kind parameter type for fulfillments while generating code for partial apply forwarder.
The TypeMetadata of ClassPointSource kind sources are not populated in HeapLayout's NecessaryBindings. If we have a generic parameter on the HeapLayout which can be fulfilled by a ClassPointerSource, its TypeMetaData will not be found while constructing the dtor function of the HeapLayout.
So it is important to skip considering sources of ClassPointerSource kind, so that TypeMetadata of a dependent generic parameters gets populated in HeapLayout's NecessaryBindings.
https://forums.swift.org/t/improving-the-representation-of-polymorphic-interfaces-in-sil-with-substituted-function-types/29711
This prepares SIL to be able to more accurately preserve the calling convention of
polymorphic generic interfaces by letting the type system represent "substituted function types".
We add a couple of fields to SILFunctionType to support this:
- A substitution map, accessed by `getSubstitutions()`, which maps the generic signature
of the function to its concrete implementation. This will allow, for instance, a protocol
witness for a requirement of type `<Self: P> (Self, ...) -> ...` for a concrete conforming
type `Foo` to express its type as `<Self: P> (Self, ...) -> ... for <Foo>`, preserving the relation
to the protocol interface without relying on the pile of hacks that is the `witness_method`
protocol.
- A bool for whether the generic signature of the function is "implied" by the substitutions.
If true, the generic signature isn't really part of the calling convention of the function.
This will allow closure types to distinguish a closure being passed to a generic function, like
`<T, U> in (*T, *U) -> T for <Int, String>`, from the concrete type `(*Int, *String) -> Int`,
which will make it easier for us to differentiate the representation of those as types, for
instance by giving them different pointer authentication discriminators to harden arm64e
code.
This patch is currently NFC, it just introduces the new APIs and takes a first pass at updating
code to use them. Much more work will need to be done once we start exercising these new
fields.
This does bifurcate some existing APIs:
- SILFunctionType now has two accessors to get its generic signature.
`getSubstGenericSignature` gets the generic signature that is used to apply its
substitution map, if any. `getInvocationGenericSignature` gets the generic signature
used to invoke the function at apply sites. These differ if the generic signature is
implied.
- SILParameterInfo and SILResultInfo values carry the unsubstituted types of the parameters
and results of the function. They now have two APIs to get that type. `getInterfaceType`
returns the unsubstituted type of the generic interface, and
`getArgumentType`/`getReturnValueType` produce the substituted type that is used at
apply sites.
This change modifies spare bit masks so that they are arranged in
the byte order of the target platform. It also modifies and
consolidates the code that gathers and scatters bits into enum
values.
All enum-related validation tests are now passing on IBM Z (s390x)
which is a big-endian platform.
In `emitPartialApplicationForwarder`, when we handle result types that have a type parameter, the lowered result type could be a struct.
This happens when we have a function result, for instance:
```swift
%0 = function_ref @returns_closure : $@convention(thin) <τ_0_0> (Empty<τ_0_0>) -> (@owned Empty<τ_0_0>, @owned @callee_guaranteed (Empty<τ_0_0>) -> @owned Empty<τ_0_0>)
%1 = partial_apply [callee_guaranteed] %0<S>() : $@convention(thin) <τ_0_0> (Empty<τ_0_0>) -> (@owned Empty<τ_0_0>, @owned @callee_guaranteed (Empty<τ_0_0>) -> @owned Empty<τ_0_0>)
```
In this case, we emit code that casts the struct memberwise.
Resolves [SR-9709](https://bugs.swift.org/browse/SR-9709).