Although I don't plan to bring over new assertions wholesale
into the current qualification branch, it's entirely possible
that various minor changes in main will use the new assertions;
having this basic support in the release branch will simplify that.
(This is why I'm adding the includes as a separate pass from
rewriting the individual assertions)
This PR implements first set of changes required to support autodiff for coroutines. It mostly targeted to `_modify` accessors in standard library (and beyond), but overall implementation is quite generic.
There are some specifics of implementation and known limitations:
- Only `@yield_once` coroutines are naturally supported
- VJP is a coroutine itself: it yields the results *and* returns a pullback closure as a normal return. This allows us to capture values produced in resume part of a coroutine (this is required for defers and other cleanups / commits)
- Pullback is a coroutine, we assume that coroutine cannot abort and therefore we execute the original coroutine in reverse from return via yield and then back to the entry
- It seems there is no semantically sane way to support `_read` coroutines (as we will need to "accept" adjoints via yields), therefore only coroutines with inout yields are supported (`_modify` accessors). Pullbacks of such coroutines take adjoint buffer as input argument, yield this buffer (to accumulate adjoint values in the caller) and finally return the adjoints indirectly.
- Coroutines (as opposed to normal functions) are not first-class values: there is no AST type for them, one cannot e.g. store them into tuples, etc. So, everywhere where AST type is required, we have to hack around.
- As there is no AST type for coroutines, there is no way one could register custom derivative for coroutines. So far only compiler-produced derivatives are supported
- There are lots of common things wrt normal function apply's, but still there are subtle but important differences. I tried to organize the code to enable code reuse, still it was not always possible, so some code duplication could be seen
- The order of how pullback closures are produced in VJP is a bit different: for normal apply's VJP produces both value and pullback closure via a single nested VJP apply. This is not so anymore with coroutine VJP's: yielded values are produced at `begin_apply` site and pullback closure is available only from `end_apply`, so we need to track the order in which pullbacks are produced (and arrange consumption of the values accordingly – effectively delay them)
- On the way some complementary changes were required in e.g. mangler / demangler
This patch covers the generation of derivatives up to SIL level, however, it is not enough as codegen of `partial_apply` of a coroutine is completely broken. The fix for this will be submitted separately as it is not directly autodiff-related.
---------
Co-authored-by: Andrew Savonichev <andrew.savonichev@gmail.com>
Co-authored-by: Richard Wei <rxwei@apple.com>
Unreachable blocks possess some challenges to autodiff since in reverse pass (pullback generation) we need to execute the function backwards, pushing the values from return BB back to entry block. As a result, unreachable blocks might become reachable from the return BB and this might cause all kind of issues.
Optional's `init_enum_data_addr` and `inject_enum_addr` instructions are generated in presence of non-loadable Optional values. The compiler used to treat these instructions as inactive, and this resulted in silent run-time
issues described in #64223.
The patch marks `init_enum_data_addr` as "active" if its Optional operand is also active, and in PullbackCloner we differentiate through it and the related `inject_enum_addr`.
However, we only determine this relation in simple cases when both instructions are in the same block. There is no def-use relation between them (both take the same Optional operand), so if there is more than one set of instructions
operating on the same Optional, or there is some control flow, we currently bail out.
In PullbackCloner, we walk over instructions in reverse order and start from `inject_enum_addr` and its `Optional<Wrapped>.TangentVector` operand. Assuming that is is already initialized, we emit an `unchecked_take_enum_data_addr` and set it as the adjoint buffer of `init_enum_data_addr`. The Optional value is
invalidated, and we have to destroy the enum data address later when we reach `init_enum_data_addr`.
We need a lowered type for branch trace enum in order to compute linear map tuple type. However, the lowering of branch trace enum type depends on the types of its elements (the payloads are linear map tuples of predecessor BB).
As lowered types are cached, we cannot populate branch trace enum entries in the end as we did before: we already used wrong lowered types for linear map tuples.
Traverse basic blocks in reverse post-order traverse order building linear map tuples and branch tracing enums in one go, ensuring that we've done with predecessor BBs before processing the BB itself.
Introduce the notion of "semantic result parameter". Handle differentiation of inouts via semantic result parameter abstraction. Do not consider non-wrt semantic result parameters as semantic results
Fixes#67174
* Don't invalidate the lookup cache in 'getOrCreateSynthesizedFile()'
Adding a synthesized file itself doesn't introduce any decls. Instead,
we should invalidate the right after the actual declrations are added
in the file
* Remove 'SourceLookupCache::invalidate()' method. It was just used
right before the destruction. It was just not necessary
* Include auxiliary decls in 'SourceLookupCache::lookupVisibleDecls()'
Previously, global symbol completion didn't include decls synthesized
by peer macros or freestanding decl macros
* Include "auxiliary decls" in visible member lookup, and visible local
decl lookup
* Hide macro unique names
rdar://110535113
Linear maps are captured in vjp routine via callee-guaranteed partial apply and are passed as @owned references to the enclosing pullback that finally consumes them. Necessary retains are inserted by a partial apply forwarder.
However, this is not the case when the function being differentiated contains loops as heap-allocated context is used and bare pointer is captured by the pullback partial apply. As a result, partial apply forwarder does not retain the linear maps that are owned by a heap-allocated context, however, they are still treated as @owned references and therefore are released in the pullback after the first call. As a result, subsequent pullback calls release linear maps and we'd end with possible use-after-free.
Ensure we retain values when we load values from the context.
Reproducible only when:
* Loops (so, heap-allocated context)
* Pullbacks of thick functions (so context is non-zero)
* Multiple pullback calls
* Some cleanup while there
Fixes#64257
The changes are intentionally were made close to the original implementation w/o possible simplifications to ease the review
Fixes#63207, supersedes #63379 (and fixes#63234)
Start treating the null {Can}GenericSignature as a regular signature
with no requirements and no parameters. This not only makes for a much
safer abstraction, but allows us to simplify a lot of the clients of
GenericSignature that would previously have to check for null before
using the abstraction.
While it is very convenient to default the ExtInfo state when creating
new function types, it also make the intent unclear to those looking to
extend ExtInfo state. For example, did a given call site intend to have
the default ExtInfo state or does it just happen to work? This matters a
lot because function types are regularly unpacked and rebuilt and it's
really easy to accidentally drop ExtInfo state.
By changing the ExtInfo state to an optional, we can track when it is
actually needed.
- `Mangle::ASTMangler::mangleAutoDiffDerivativeFunction()` and `Mangle::ASTMangler::mangleAutoDiffLinearMap()` accept original function declarations and return a mangled name for a derivative function or linear map. This is called during SILGen and TBDGen.
- `Mangle::DifferentiationMangler` handles differentiation function mangling in the differentiation transform. This part is necessary because we need to perform demangling on the original function and remangle it as part of a differentiation function mangling tree in order to get the correct substitutions in the mangled derivative generic signature.
A mangled differentiation function name includes:
- The original function.
- The differentiation function kind.
- The parameter indices for differentiation.
- The result indices for differentiation.
- The derivative generic signature.
In derivatives of loops, no longer allocate boxes for indirect case payloads. Instead, use a custom pullback context in the runtime which contains a bump-pointer allocator.
When a function contains a differentiated loop, the closure context is a `Builtin.NativeObject`, which contains a `swift::AutoDiffLinearMapContext` and a tail-allocated top-level linear map struct (which represents the linear map struct that was previously directly partial-applied into the pullback). In branching trace enums, the payloads of previously indirect cases will be allocated by `swift::AutoDiffLinearMapContext::allocate` and stored as a `Builtin.RawPointer`.
AD-generated data structures (linear map structs and branching trace enums) do not need to be resilient data structures. These decls ade missing a `@frozen` attribute.
Resolves rdar://71319547.
We'll need this to get the right 'selfDC' when name lookup
finds a 'self' declaration in a capture list, eg
class C {
func bar() {}
func foo() {
_ = { [self] in bar() }
}
}
Previously, `LinearMapInfo::shouldDifferentiateInstruction` had a special case
for `copy_value`, returning true for `copy_value` instructions with an active
operand.
This is unexpected and led to "leaked owned value" ownership verification
failures due to unnecessarily cloned `copy_value` instructions during
differential generation.
Now, the special case is removed, fixing the failures.
`shouldDifferentiateInstruction` returns true for `copy_value` instructions
whose operand and result are both active.
Resolves SR-13530.
Fixes foward-mode crashed related to:
- Missing tangent buffers for non-wrt `inout` parameters.
- Tangent buffers not being initialized due to the corresponding original
buffer intialization instructions being non-active.
- Non-varied indirect results not being initialized.
- `emitDestroyValue` crashes due to `TangentVector` value category mismatch.
Resolves TF-984 and SR-13447.
Co-authored-by: Dan Zheng <danielzheng@google.com>
Update differentiation to handle `array.finalize_intrinsic` applications.
`VJPEmitter::visitApplyInst` does standard cloning for these applications.
`PullbackEmitter::visitApplyInst` treats the intrinsic like an identity
function, accumulating result's adjoint into argument's adjoint.
This fixes array literal initialization differentiation.
`DifferentiableFunctionInst` now stores result indices.
`SILAutoDiffIndices` now stores result indices instead of a source index.
`@differentiable` SIL function types may now have multiple differentiability
result indices and `@noDerivative` resutls.
`@differentiable` AST function types do not have `@noDerivative` results (yet),
so this functionality is not exposed to users.
Resolves TF-689 and TF-1256.
Infrastructural support for TF-983: supporting differentiation of `apply`
instructions with multiple active semantic results.
Ensure that semantic member accesors have empty linear map structs.
Semantic member accessor VJPs do not accumulate any callee pullbacks.
Resolves SR-12800: SIL verification error.
Move differentiation-related SILOptimizer files to
{include/swift,lib}/SILOptimizer/Differentiation/.
This reduces directory nesting and gathers files together.