This corresponds to the parameter-passing convention of the Itanium C++
ABI, in which the argument is passed indirectly and possibly modified,
but not destroyed, by the callee.
@in_cxx is handled the same way as @in in callers and as @in_guaranteed
in callees. OwnershipModelEliminator emits the destroy_addr needed to
destroy the argument in the caller.
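As a rough, hypothetical illustration (not from the change): in practice @in_cxx applies to values of imported C++ types with non-trivial copy/destroy, so the Swift struct below is only a stand-in used to show the call shape being described.

```swift
// Stand-in type: a real @in_cxx argument would be an imported C++ value
// type with non-trivial special members, not an ordinary Swift struct.
struct Payload {
    var text: String
}

func takeByValue(_ value: Payload) {
    // For an imported non-trivial C++ type in this position, the lowered
    // parameter would use @in_cxx: passed indirectly, possibly modified,
    // but not destroyed by the callee. The caller destroys it after the
    // call, via the destroy_addr that OwnershipModelEliminator emits.
    _ = value
}

takeByValue(Payload(text: "hello"))
```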
rdar://122707697
This PR implements the first set of changes required to support autodiff for coroutines. It is mostly targeted at `_modify` accessors in the standard library (and beyond), but the overall implementation is quite generic.
Some implementation specifics and known limitations:
- Only `@yield_once` coroutines are naturally supported
- The VJP is itself a coroutine: it yields the results *and* returns a pullback closure as a normal return value. This allows us to capture values produced in the resume part of the coroutine (this is required for defers and other cleanups / commits)
- The pullback is also a coroutine. We assume that a coroutine cannot abort, and therefore we execute the original coroutine in reverse: from the return, through the yield, and back to the entry
- There seems to be no semantically sane way to support `_read` coroutines (we would need to "accept" adjoints via yields), so only coroutines with inout yields are supported, i.e. `_modify` accessors (see the sketch after this list). Pullbacks of such coroutines take an adjoint buffer as an input argument, yield this buffer (to accumulate adjoint values in the caller), and finally return the adjoints indirectly.
- Coroutines (as opposed to normal functions) are not first-class values: there is no AST type for them, so one cannot, e.g., store them in tuples. Everywhere an AST type is required, we have to work around this.
- As there is no AST type for coroutines, there is no way to register a custom derivative for a coroutine. So far only compiler-produced derivatives are supported
- There is a lot in common with normal function applies, but there are subtle yet important differences. I tried to organize the code to enable reuse; this was not always possible everywhere, so some code duplication remains
- The order in which pullback closures are produced in the VJP is a bit different: for normal applies, the VJP produces both the value and the pullback closure via a single nested VJP apply. This is no longer the case with coroutine VJPs: yielded values are produced at the `begin_apply` site and the pullback closure is available only from `end_apply`, so we need to track the order in which pullbacks are produced (and arrange consumption of the values accordingly, effectively delaying them)
- Along the way, some complementary changes were required, e.g. in the mangler / demangler
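As a rough sketch (not taken from the patch, with made-up types) of the kind of code this work is aimed at: a `_modify` accessor is a `@yield_once` coroutine with a single inout yield, and a reverse-mode derivative has to differentiate through its `begin_apply`/`end_apply`.

```swift
import _Differentiation

struct Box: Differentiable {
    var value: Double

    // A @yield_once coroutine with a single inout yield: the shape of
    // accessor this work targets.
    var scaled: Double {
        get { value * 2 }
        _modify {
            var tmp = value * 2
            yield &tmp        // adjoints flow back through this inout yield
            value = tmp / 2
        }
    }
}

// A function whose derivative must differentiate through the _modify
// coroutine call; whether this differentiates end-to-end today depends
// on the limitations listed above.
@differentiable(reverse)
func bump(_ box: Box) -> Double {
    var copy = box
    copy.scaled += 3          // begin_apply / end_apply of the coroutine
    return copy.value
}
```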
This patch covers the generation of derivatives up to the SIL level; however, that is not enough on its own, as codegen of `partial_apply` of a coroutine is completely broken. The fix for this will be submitted separately, as it is not directly autodiff-related.
---------
Co-authored-by: Andrew Savonichev <andrew.savonichev@gmail.com>
Co-authored-by: Richard Wei <rxwei@apple.com>
Introduce SILGen support for reabstraction thunks that change the
error convention, both between indirect and direct errors and for
conversions among error types (e.g., from a concrete error type to
`any Error`).
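A sketch (not from the patch) of a conversion that needs such a thunk, assuming a compiler with typed throws (SE-0413):

```swift
struct ParseError: Error {}

// Throws a concrete error type.
func parse(_ s: String) throws(ParseError) -> Int {
    guard let n = Int(s) else { throw ParseError() }
    return n
}

// Converting to a function value that throws `any Error` requires a
// reabstraction thunk that adjusts how the error is passed.
let untyped: (String) throws -> Int = parse
```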
For now, always use the indirect convention for types containing packs.
This is motivated by the fact that reading from / writing to a pack
currently requires addresses, which aren't materialized for tuples. In
the fullness of time, these values should be direct in opaque values
mode, but for now that can be postponed.
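A sketch (not from the patch) of the kind of type affected; tuples containing pack expansions, like the return type below, are what gets the indirect convention here:

```swift
// A tuple containing a pack expansion: accessing its elements goes
// through pack addressing, hence the indirect convention for now.
func collect<each T>(_ value: repeat each T) -> (repeat each T) {
    return (repeat each value)
}

let pair = collect(1, "two")   // (Int, String)
```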
This required quite a bit of infrastructure for emitting this kind of
tuple expression, although I'm not going to claim these expressions
really work yet; in particular, I know the RValue constructor is going
to try to explode them, which it really shouldn't.
It also doesn't include the caller side of returns, for which I'll need
to teach ResultPlan to do the new abstraction-pattern walk. But that's
next.
- SILPackType carries whether the elements are stored directly
in the pack, which we're not currently using in the lowering,
but it's probably something we'll want in the final ABI.
Having this also makes it clear that we're doing the right
thing with substitution and element lowering. I also toyed
with making this a scalar type, which made it necessary in
various places, although eventually I pulled back to the
design where we always use packs as addresses.
- Pack boundaries are a core ABI concept, so the lowering has
to wrap parameter pack expansions up as packs (a sketch of such
a signature follows this list). There are huge unimplemented
holes here where the abstraction pattern will need to tell us
how many elements to gather into the pack, but a naive approach
is good enough to get things off the ground.
- Pack conventions are related to the existing parameter and
result conventions, but they're different on enough grounds
that they deserve to be separated.
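As a rough sketch (not from the patch) of a signature whose lowering involves the pack boundaries and conventions described above:

```swift
// Each expansion below is its own pack boundary: `first` and `second`
// are gathered into two separate packs at the SIL level and passed
// indirectly (conceptually, with pack conventions such as
// @pack_guaranteed rather than the scalar parameter conventions).
func interleave<each T, each U>(
    _ first: repeat each T,
    with second: repeat each U
) -> (repeat each T, repeat each U) {
    (repeat each first, repeat each second)
}

// interleave(1, true, with: "a") == (1, true, "a")
```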
When branching to the exit block, flatten an @out tuple return value
into its components, as is done for all other return values.
In the exit block, when constructing the @out tuple to return, visit the
tuple-type-tree of the return value to reconstruct the nested tuple
structure: @out tuple returns are not flattened, unlike regular tuple
returns.
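A sketch (not from the patch) of the kind of return value involved, assuming the generic tuple below is returned indirectly via an @out result:

```swift
// The nested tuple structure here is what gets flattened into components
// on the branch to the exit block and then reconstructed in the exit
// block before being stored into the indirect (@out) result.
func makeNested<T>(_ a: T, _ b: T) -> (T, (T, T)) {
    return (a, (b, a))
}

let nested = makeNested(1, 2)   // (1, (2, 1))
```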
Sometimes one just has a SILFunctionConvention instead of the underlying
SILFunctionType (which the SILFunctionConvention contains). This just
shims that API onto the composition type.
This became necessary after recent function type changes that keep
substituted generic function types abstract even after substitution, in
order to correctly handle automatic opaque result type substitution.
Instead of performing the opaque result type substitution as part of
substituting the generic arguments, the underlying type will now be
reified when the parameter/return types are inspected, which happens as
part of the function convention APIs.
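A sketch (not from the patch) of the automatic opaque result type substitution being referred to:

```swift
protocol Shape {}
struct Circle: Shape {}

// Callers see the abstract `some Shape`; conceptually, the underlying
// `Circle` is only substituted in when the result type is inspected,
// e.g. through the function convention APIs mentioned above.
func makeShape() -> some Shape {
    Circle()
}
```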
rdar://62560867
https://forums.swift.org/t/improving-the-representation-of-polymorphic-interfaces-in-sil-with-substituted-function-types/29711
This prepares SIL to be able to more accurately preserve the calling convention of
polymorphic generic interfaces by letting the type system represent "substituted function types".
We add a couple of fields to SILFunctionType to support this:
- A substitution map, accessed by `getSubstitutions()`, which maps the generic signature
of the function to its concrete implementation. This will allow, for instance, a protocol
witness for a requirement of type `<Self: P> (Self, ...) -> ...` for a concrete conforming
type `Foo` to express its type as `<Self: P> (Self, ...) -> ... for <Foo>`, preserving the relation
to the protocol interface without relying on the pile of hacks that is the `witness_method`
protocol (a sketch of this situation follows this list).
- A bool for whether the generic signature of the function is "implied" by the substitutions.
If true, the generic signature isn't really part of the calling convention of the function.
This will allow closure types to distinguish a closure being passed to a generic function, like
`<T, U> in (*T, *U) -> T for <Int, String>`, from the concrete type `(*Int, *String) -> Int`,
which will make it easier for us to differentiate the representation of those as types, for
instance by giving them different pointer authentication discriminators to harden arm64e
code.
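A Swift-level sketch (not from the patch) of the witness situation described in the first bullet above:

```swift
protocol P {
    func run(_ x: Int) -> Int
}

struct Foo: P {
    // The protocol requirement has an unsubstituted type of the shape
    //   <Self: P> (Self, ...) -> ...
    // With substituted function types, this witness can express its type
    // as `<Self: P> (Self, ...) -> ... for <Foo>`, keeping the relation
    // to the protocol interface instead of an unrelated concrete type.
    func run(_ x: Int) -> Int { x + 1 }
}
```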
This patch is currently NFC; it just introduces the new APIs and takes a first pass at updating
code to use them. Much more work will need to be done once we start exercising these new
fields.
This does bifurcate some existing APIs:
- SILFunctionType now has two accessors to get its generic signature.
`getSubstGenericSignature` gets the generic signature that is used to apply its
substitution map, if any. `getInvocationGenericSignature` gets the generic signature
used to invoke the function at apply sites. These differ if the generic signature is
implied.
- SILParameterInfo and SILResultInfo values carry the unsubstituted types of the parameters
and results of the function. They now have two APIs to get that type. `getInterfaceType`
returns the unsubstituted type of the generic interface, and
`getArgumentType`/`getReturnValueType` produce the substituted type that is used at
apply sites.
This pass now canonicalizes results before lowering and handles all combinations
of direct and indirect multiple return values. The logic is much less ad-hoc and
more robust.
try_apply still isn't handled, but should be much easier now.
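A sketch (not from the patch) of a return type that mixes the conventions in question, assuming the usual lowering where concrete trivial results are direct and generic results are indirect:

```swift
// (Int, T) lowers to a mix of results: the Int can be returned directly,
// while the generic T is returned indirectly (via an @out result).
func tagged<T>(_ value: T) -> (Int, T) {
    (1, value)
}

let result = tagged("hello")   // (1, "hello")
```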
Add visitLoadInst, visitStoreInst, visitDebugValueInst, etc.