Although I don't plan to bring over new assertions wholesale
into the current qualification branch, it's entirely possible
that various minor changes in main will use the new assertions;
having this basic support in the release branch will simplify that.
(This is why I'm adding the includes as a separate pass from
rewriting the individual assertions)
This PR implements first set of changes required to support autodiff for coroutines. It mostly targeted to `_modify` accessors in standard library (and beyond), but overall implementation is quite generic.
There are some specifics of implementation and known limitations:
- Only `@yield_once` coroutines are naturally supported
- VJP is a coroutine itself: it yields the results *and* returns a pullback closure as a normal return. This allows us to capture values produced in resume part of a coroutine (this is required for defers and other cleanups / commits)
- Pullback is a coroutine, we assume that coroutine cannot abort and therefore we execute the original coroutine in reverse from return via yield and then back to the entry
- It seems there is no semantically sane way to support `_read` coroutines (as we will need to "accept" adjoints via yields), therefore only coroutines with inout yields are supported (`_modify` accessors). Pullbacks of such coroutines take adjoint buffer as input argument, yield this buffer (to accumulate adjoint values in the caller) and finally return the adjoints indirectly.
- Coroutines (as opposed to normal functions) are not first-class values: there is no AST type for them, one cannot e.g. store them into tuples, etc. So, everywhere where AST type is required, we have to hack around.
- As there is no AST type for coroutines, there is no way one could register custom derivative for coroutines. So far only compiler-produced derivatives are supported
- There are lots of common things wrt normal function apply's, but still there are subtle but important differences. I tried to organize the code to enable code reuse, still it was not always possible, so some code duplication could be seen
- The order of how pullback closures are produced in VJP is a bit different: for normal apply's VJP produces both value and pullback closure via a single nested VJP apply. This is not so anymore with coroutine VJP's: yielded values are produced at `begin_apply` site and pullback closure is available only from `end_apply`, so we need to track the order in which pullbacks are produced (and arrange consumption of the values accordingly – effectively delay them)
- On the way some complementary changes were required in e.g. mangler / demangler
This patch covers the generation of derivatives up to SIL level, however, it is not enough as codegen of `partial_apply` of a coroutine is completely broken. The fix for this will be submitted separately as it is not directly autodiff-related.
---------
Co-authored-by: Andrew Savonichev <andrew.savonichev@gmail.com>
Co-authored-by: Richard Wei <rxwei@apple.com>
LLVM is presumably moving towards `std::string_view` -
`StringRef::startswith` is deprecated on tip. `SmallString::startswith`
was just renamed there (maybe with some small deprecation inbetween, but
if so, we've missed it).
The `SmallString::startswith` references were moved to
`.str().starts_with()`, rather than adding the `starts_with` on
`stable/20230725` as we only had a few of them. Open to switching that
over if anyone feels strongly though.
I am doing this in preparation for adding options to SILParameterInfo/
SILResultInfo that state that a parameter/result is transferring. Even though I
could have just introduced a new bit here, I instead streamlined the interface
of SILParameterInfo/SILResultInfo to use an OptionSet instead of individual bits
to make it easier to add new flags here. The reason why it is easier is that
along API (e.x.: function argument) boundaries one does not have to marshal each
field or pass each field. Instead one can just pass the whole OptionSet as an
opaque thing. Using this I was able to change serialization/deserialization of
SILParameterInfo/SILResultInfo so that one does not need to update them if one
adds new fields!
The reason why I am doing this for both SILParameterInfo/SILResultInfo in the
same commit is because they share code in the demangler that I did not want to
have to duplicate in an intervening commit. By changing them both at the same
type, I didn't have to change anything without an actual need to.
I am doing this in a separate commit from adding transferring support so I can
validate correctness using the tests for the options already supported
(currently only differentiability).
Add `Differentiable` requirements to pattern substitutions / pattern generic signature when calculating constrained function type. Also, add requirements for differentiable results as well.
Fixes#65487
Introduce the notion of "semantic result parameter". Handle differentiation of inouts via semantic result parameter abstraction. Do not consider non-wrt semantic result parameters as semantic results
Fixes#67174
Reformatting everything now that we have `llvm` namespaces. I've
separated this from the main commit to help manage merge-conflicts and
for making it a bit easier to read the mega-patch.
This is phase-1 of switching from llvm::Optional to std::optional in the
next rebranch. llvm::Optional was removed from upstream LLVM, so we need
to migrate off rather soon. On Darwin, std::optional, and llvm::Optional
have the same layout, so we don't need to be as concerned about ABI
beyond the name mangling. `llvm::Optional` is only returned from one
function in
```
getStandardTypeSubst(StringRef TypeName,
bool allowConcurrencyManglings);
```
It's the return value, so it should not impact the mangling of the
function, and the layout is the same as `std::optional`, so it should be
mostly okay. This function doesn't appear to have users, and the ABI was
already broken 2 years ago for concurrency and no one seemed to notice
so this should be "okay".
I'm doing the migration incrementally so that folks working on main can
cherry-pick back to the release/5.9 branch. Once 5.9 is done and locked
away, then we can go through and finish the replacement. Since `None`
and `Optional` show up in contexts where they are not `llvm::None` and
`llvm::Optional`, I'm preparing the work now by going through and
removing the namespace unwrapping and making the `llvm` namespace
explicit. This should make it fairly mechanical to go through and
replace llvm::Optional with std::optional, and llvm::None with
std::nullopt. It's also a change that can be brought onto the
release/5.9 with minimal impact. This should be an NFC change.
- `Mangle::ASTMangler::mangleAutoDiffDerivativeFunction()` and `Mangle::ASTMangler::mangleAutoDiffLinearMap()` accept original function declarations and return a mangled name for a derivative function or linear map. This is called during SILGen and TBDGen.
- `Mangle::DifferentiationMangler` handles differentiation function mangling in the differentiation transform. This part is necessary because we need to perform demangling on the original function and remangle it as part of a differentiation function mangling tree in order to get the correct substitutions in the mangled derivative generic signature.
A mangled differentiation function name includes:
- The original function.
- The differentiation function kind.
- The parameter indices for differentiation.
- The result indices for differentiation.
- The derivative generic signature.
Add base type parameter to `TangentStoredPropertyRequest`.
Use `TypeBase::getTypeOfMember` instead of `VarDecl::getType` to correctly
compute the member type of original stored properties, using the base type.
Resolves SR-13134.
Add request that resolves the "tangent stored property" corresponding to an
original stored property in a `Differentiable`-conforming type.
Enables better non-differentiability differentiation transform diagnostics.
`DifferentiableFunctionInst` now stores result indices.
`SILAutoDiffIndices` now stores result indices instead of a source index.
`@differentiable` SIL function types may now have multiple differentiability
result indices and `@noDerivative` resutls.
`@differentiable` AST function types do not have `@noDerivative` results (yet),
so this functionality is not exposed to users.
Resolves TF-689 and TF-1256.
Infrastructural support for TF-983: supporting differentiation of `apply`
instructions with multiple active semantic results.
Remove all assertions from `AnyFunctionType::getAutoDiffDerivativeFunctionLinearMapType`.
All error cases are represented by `DerivativeFunctionTypeError` now.
Fix `DerivativeFunctionTypeError` error payloads and improve error case naming.
Create `DerivativeFunctionTypeError` representing all potential derivative
function type calculation errors.
Make `AnyFunctionType::getAutoDiffDerivativeFunctionLinearMapType` return
`llvm::Expected<AnyFunctionType *>`. This is much safer and caller-friendly
than performing assertions.
Delete hacks in `@differentiable` and `@derivative` attribute type-checking
for verifying that `Differentiable.TangentVector` type witnesses are valid:
this is no longer necessary.
Robust fix for TF-521: invalid `Differentiable` conformances during
`@derivative` attribute type-checking.
Resolves SR-12793: bad interaction between `@differentiable` and `@derivative`
attribute type-checking and `Differentiable` derived conformances.
Annotate the covered switches with `llvm_unreachable` to avoid the MSVC
warning which does not recognise the covered switches. This allows us
to avoid a spew of warnings.
Define type signatures and SILGen for the following builtins:
```
/// Applies the {jvp|vjp} of `f` to `arg1`, ..., `argN`.
func applyDerivative_arityN_{jvp|vjp}(f, arg1, ..., argN) -> jvp/vjp return type
/// Applies the transpose of `f` to `arg`.
func applyTranspose_arityN(f, arg) -> transpose return type
/// Makes a differentiable function from the given `original`, `jvp`, and
/// `vjp` functions.
func differentiableFunction_arityN(original, jvp, vjp)
/// Makes a linear function from the given `original` and `transpose` functions.
func linearFunction_arityN(original, transpose)
```
Add SILGen FileCheck tests for all builtins.
`@differentiable` attribute on protocol requirements and non-final class members
will produce derivative function entries in witness tables and vtables.
This patch adds an optional derivative function configuration
(`AutoDiffDerivativeFunctionIdentifier`) to `SILDeclRef` to represent these
derivative function entries.
Derivative function configurations consist of:
- A derivative function kind (JVP or VJP).
- Differentiability parameter indices.
Resolves TF-1209.
Enables TF-1212: upstream derivative function entries in witness tables/vtables.
Generate SIL differentiability witnesses from `@differentiable` and
`@derivative` declaration attributes.
Add SILGen utilities for:
- Emiting differentiability witnesses.
- Creating derivative function thunks, which are used as entries in
differentiability witnesses.
When users register a custom derivative function, it is necessary to create a
thunk with the expected derivative type computed from the original function's
type. This is important for consistent typing and consistent differentiability
witness entry mangling.
See `SILGenModule::getOrCreateCustomDerivativeThunk` documentation for details.
Resolves TF-1138.
In order to allow this, I've had to rework the syntax of substituted function types; what was previously spelled `<T> in () -> T for <X>` is now spelled `@substituted <T> () -> T for <X>`. I think this is a nice improvement for readability, but it did require me to churn a lot of test cases.
Distinguishing the substitutions has two chief advantages over the existing representation. First, the semantics seem quite a bit clearer at use points; the `implicit` bit was very subtle and not always obvious how to use. More importantly, it allows the expression of generic function types that must satisfy a particular generic abstraction pattern, which was otherwise impossible to express.
As an example of the latter, consider the following protocol conformance:
```
protocol P { func foo() }
struct A<T> : P { func foo() {} }
```
The lowered signature of `P.foo` is `<Self: P> (@in_guaranteed Self) -> ()`. Without this change, the lowered signature of `A.foo`'s witness would be `<T> (@in_guaranteed A<T>) -> ()`, which does not preserve information about the conformance substitution in any useful way. With this change, the lowered signature of this witness could be `<T> @substituted <Self: P> (@in_guaranteed Self) -> () for <A<T>>`, which nicely preserves the exact substitutions which relate the witness to the requirement.
When we adopt this, it will both obviate the need for the special witness-table conformance field in SILFunctionType and make it far simpler for the SILOptimizer to devirtualize witness methods. This patch does not actually take that step, however; it merely makes it possible to do so.
As another piece of unfinished business, while `SILFunctionType::substGenericArgs()` conceptually ought to simply set the given substitutions as the invocation substitutions, that would disturb a number of places that expect that method to produce an unsubstituted type. This patch only set invocation arguments when the generic type is a substituted type, which we currently never produce in type-lowering.
My plan is to start by producing substituted function types for accessors. Accessors are an important case because the coroutine continuation function is essentially an implicit component of the function type which the current substitution rules simply erase the intended abstraction of. They're also used in narrower ways that should exercise less of the optimizer.
Semantically, an `inout` parameter is both a parameter and a result.
`@differentiable` and `@derivative` attributes now support original functions
with one "semantic result": either a formal result or an `inout` parameter.
Derivative typing rules for functions with `inout` parameters are now defined.
The differential/pullback type of a function with `inout` differentiability
parameters also has `inout` parameters. This is ideal for performance.
Differential typing rules:
- Case 1: original function has no `inout` parameters.
- Original: `(T0, T1, ...) -> R`
- Differential: `(T0.Tan, T1.Tan, ...) -> R.Tan`
- Case 2: original function has a non-wrt `inout` parameter.
- Original: `(T0, inout T1, ...) -> Void`
- Differential: `(T0.Tan, ...) -> T1.Tan`
- Case 3: original function has a wrt `inout` parameter.
- Original: `(T0, inout T1, ...) -> Void`
- Differential: `(T0.Tan, inout T1.Tan, ...) -> Void`
Pullback typing rules:
- Case 1: original function has no `inout` parameters.
- Original: `(T0, T1, ...) -> R`
- Pullback: `R.Tan -> (T0.Tan, T1.Tan, ...)`
- Case 2: original function has a non-wrt `inout` parameter.
- Original: `(T0, inout T1, ...) -> Void`
- Pullback: `(T1.Tan) -> (T0.Tan, ...)`
- Case 3: original function has a wrt `inout` parameter.
- Original: `(T0, inout T1, ...) -> Void`
- Pullback: `(inout T1.Tan) -> (T0.Tan, ...)`
Resolves TF-1164.
The `differentiability_witness_function` instruction looks up a
differentiability witness function (JVP, VJP, or transpose) for a referenced
function via SIL differentiability witnesses.
Add round-trip parsing/serialization and IRGen tests.
Notes:
- Differentiability witnesses for linear functions require more support.
`differentiability_witness_function [transpose]` instructions do not yet
have IRGen.
- Nothing currently generates `differentiability_witness_function` instructions.
The differentiation transform does, but it hasn't been upstreamed yet.
Resolves TF-1141.
Add `SILFunctionType::getAutoDiffTransposeFunctionType`.
It computes the transpose `SILFucntionType` for an original `SILFunctionType`,
given:
- Linearity parameter indices
- Transpose function generic signature (optional)
- Other auxiliary parameters
Add doc comments explaining typing rules, preconditions, and other details.
Add `isTranspose` flag to `autodiff::getConstrainedDerivativeGenericSignature`.
Partially resolves TF-1125.
Unblocks TF-1141: upstream `differentiability_witness_function` instruction.
SIL differentiability witnesses are a new top-level SIL construct mapping
an "original" SIL function and derivative configuration to derivative SIL
functions.
This patch adds `SILDifferentiabilityWitness` IRGen.
`SILDifferentiabilityWitness` has a fixed `{ i8*, i8* }` layout:
JVP and VJP derivative function pointers.
Resolves TF-1146.
`TangentSpace` represents the tangent space of a type.
- For `Differentiable`-conforming types:
- The tangent space is the `TangentVector` associated type.
- For tuple types:
- The tangent space is a tuple of the elements' tangent space types, for the
elements that have a tangent space.
- Other types have no tangent space.
`TypeBase::getAutoDiffTangentSpace` gets the tangent space of a type.
`TangentSpace` is used to:
- Compute the derivative function type of a given original function type.
- Compute the type of tangent/adjoint values during automatic differentiation.
Progress towards TF-828: upstream `@differentiable` attribute type-checking.
The `@derivative` attribute registers a function as a derivative of another
function-like declaration: a `func`, `init`, `subscript`, or `var` computed
property declaration.
The `@derivative` attribute also has an optional `wrt:` clause specifying the
parameters that are differentiated "with respect to", i.e. the differentiation
parameters. The differentiation parameters must conform to the `Differentiable`
protocol.
If the `wrt:` clause is unspecified, the differentiation parameters are inferred
to be all parameters that conform to `Differentiable`.
`@derivative` attribute type-checking verifies that the type of the derivative
function declaration is consistent with the type of the referenced original
declaration and the differentiation parameters.
The `@derivative` attribute is gated by the
`-enable-experimental-differentiable-programming` flag.
Resolves TF-829.