When performing a dynamic cast to an existential type that satisfies
`Sendable` (or `SendableMetatype`), it is unsafe to allow isolated conformances of any
kind to satisfy the existential's protocol requirements. Identify
these cases and mark the corresponding cast instructions with a new flag,
`[prohibit_isolated_conformances]`, which indicates to the
runtime that isolated conformances need to be rejected.
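A rough Swift-level sketch of the situation this flag guards against (the types below are illustrative and assume a compiler with isolated conformances, SE-0470; they are not taken from this change):

protocol P {}

@MainActor
final class Model {}

// A global-actor isolated conformance: usable only on the main actor.
extension Model: @MainActor P {}

func check(_ value: Any) -> Bool {
    // The cast target includes Sendable, so the resulting existential could
    // cross isolation boundaries. The runtime therefore has to reject Model's
    // MainActor-isolated conformance to P; the [prohibit_isolated_conformances]
    // flag on the cast instruction is what tells it to do so.
    return value is any P & Sendable
}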
Since `move_value` is a destroying operation, the adjoint of `y = move_value x` should be `adj[x] += adj[y]; adj[y] = 0` instead of just `adj[x] += adj[y]`.
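An illustrative sketch of that rule, with a dictionary standing in for the pullback's adjoint buffers (the names here are not compiler API):

// Forward:  y = move_value x
// Reverse:  propagate y's adjoint into x, then zero y's adjoint so the
//           adjoint of the consumed value cannot be accumulated twice.
func adjointOfMoveValue(_ adj: inout [String: Double]) {
    adj["x", default: 0] += adj["y", default: 0]
    adj["y"] = 0
}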
As autodiff operates on function types, it is not in general possible to determine the real expansion context of the function being differentiated. Using the minimal expansion context is a conservative approach that should work even when library evolution mode is enabled.
Fixes #55179
The new `@in_cxx` convention corresponds to the parameter-passing convention of the Itanium C++
ABI, in which the argument is passed indirectly and possibly modified,
but not destroyed, by the callee.
`@in_cxx` is handled the same way as `@in` in callers and as `@in_guaranteed` in
callees. OwnershipModelEliminator emits the `destroy_addr` that is
needed to destroy the argument in the caller.
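A conceptual sketch of the ownership contract, using an UnsafeMutablePointer as a stand-in for the indirect argument slot (illustrative only; the real lowering happens in SIL and IRGen):

func callee(_ arg: UnsafeMutablePointer<[Int]>) {
    arg.pointee.append(1)        // the callee may modify the argument in place…
    // …but must not deinitialize it: the value still belongs to the caller.
}

func caller() {
    let slot = UnsafeMutablePointer<[Int]>.allocate(capacity: 1)
    slot.initialize(to: [0])
    callee(slot)
    slot.deinitialize(count: 1)  // the caller destroys the argument after the call
    slot.deallocate()
}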
rdar://122707697
Create two versions of the following functions:
isConsumedParameter
isGuaranteedParameter
SILParameterInfo::isConsumed
SILParameterInfo::isGuaranteed
SILArgumentConvention::isOwnedConvention
SILArgumentConvention::isGuaranteedConvention
These changes will be needed when we add a new convention for
non-trivial C++ types, since the functions will return different answers
depending on whether they are queried for the caller or for the callee. This
commit doesn't change any functionality.
Although I don't plan to bring over new assertions wholesale
into the current qualification branch, it's entirely possible
that various minor changes in main will use the new assertions;
having this basic support in the release branch will simplify that.
(This is why I'm adding the includes as a separate pass from
rewriting the individual assertions.)
A SIL function can now be marked either [serialized] or [serialized_for_package],
the latter only if Package CMO is enabled. The latter kind allows a function to be
serialized even if it contains loadable types. Renamed IsSerialized_t to SerializedKind_t.
The tri-state serialization kind requires validating inlinability
depending on the serialization kinds of the callee vs. the caller; e.g., if the
callee is [serialized_for_package], the caller must _not_ be [serialized].
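A pseudo-Swift sketch of just that rule (the enum and function are illustrative, not the compiler's actual API):

enum SerializedKind { case notSerialized, serialized, serializedForPackage }

func mayInline(callee: SerializedKind, intoCallerOfKind caller: SerializedKind) -> Bool {
    switch (caller, callee) {
    case (.serialized, .serializedForPackage):
        // A [serialized] caller can be exposed outside the package, so it must
        // not inline a [serialized_for_package] callee.
        return false
    default:
        // All other combinations remain subject to the pre-existing checks.
        return true
    }
}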
Renamed `hasValidLinkageForFragileInline` to `canBeInlinedIntoCaller`,
which takes the caller's SerializedKind as an argument. Another argument,
`assumeFragileCaller`, is also added to ensure that the call sites of
this function know the caller is serialized, unless it is called from SIL
inlining optimization passes.
The [serialized_for_package] attribute is allowed on SIL functions, global variables,
v-tables, and witness tables.
Resolves rdar://128406520
getVarInfo() now always returns a variable with a location and scope.
To opt out of this change, getVarInfo(false) returns an incomplete variable.
This can be used to work around bugs, but should only really be used for
printing.
The complete variable info will also contain the type, except for `debug_value`
instructions, as their type depends on another instruction and may be inconsistent
if queried mid-pass.
All locations in debug variables are now also stripped of flags, to avoid
issues when comparing or hashing debug variables.
This PR implements the first set of changes required to support autodiff for coroutines. It is mostly targeted at `_modify` accessors in the standard library (and beyond), but the overall implementation is quite generic.
There are some implementation specifics and known limitations:
- Only `@yield_once` coroutines are naturally supported
- The VJP is a coroutine itself: it yields the results *and* returns a pullback closure as a normal return. This allows us to capture values produced in the resume part of a coroutine (this is required for defers and other cleanups / commits)
- The pullback is a coroutine as well; we assume that a coroutine cannot abort, and therefore we execute the original coroutine in reverse: from the return, via the yield, and back to the entry
- There seems to be no semantically sane way to support `_read` coroutines (as we would need to "accept" adjoints via yields), therefore only coroutines with inout yields are supported (`_modify` accessors). Pullbacks of such coroutines take the adjoint buffer as an input argument, yield this buffer (to accumulate adjoint values in the caller), and finally return the adjoints indirectly.
- Coroutines (as opposed to normal functions) are not first-class values: there is no AST type for them, so one cannot, e.g., store them into tuples. Everywhere an AST type is required, we have to work around this.
- As there is no AST type for coroutines, there is no way to register a custom derivative for a coroutine. So far, only compiler-produced derivatives are supported
- There is a lot in common with normal function applies, but there are subtle yet important differences. I tried to organize the code to enable reuse, but it was not always possible, so some code duplication remains
- The order in which pullback closures are produced in the VJP is a bit different: for normal applies, the VJP produces both the value and the pullback closure via a single nested VJP apply. This is no longer the case for coroutine VJPs: yielded values are produced at the `begin_apply` site and the pullback closure is available only from `end_apply`, so we need to track the order in which pullbacks are produced (and arrange consumption of the values accordingly – effectively delaying them)
- Along the way, some complementary changes were required, e.g., in the mangler / demangler
This patch covers the generation of derivatives up to the SIL level; however, this is not enough, as codegen of a `partial_apply` of a coroutine is completely broken. The fix for this will be submitted separately, as it is not directly autodiff-related.
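For reference, here is a minimal example (not taken from the PR) of the kind of coroutine this targets, namely a `@yield_once` accessor with a single inout yield, i.e. a `_modify` accessor:

struct Wrapper {
    private var storage: Double = 0

    var value: Double {
        get { storage }
        _modify {
            // Lowered to a @yield_once coroutine: the caller mutates the
            // yielded storage in place, then the resume part runs.
            yield &storage
        }
    }
}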
---------
Co-authored-by: Andrew Savonichev <andrew.savonichev@gmail.com>
Co-authored-by: Richard Wei <rxwei@apple.com>
Unreachable blocks pose some challenges for autodiff, since in the reverse pass (pullback generation) we need to execute the function backwards, pushing values from the return BB back to the entry block. As a result, unreachable blocks might become reachable from the return BB, and this can cause all kinds of issues.
Optional's `init_enum_data_addr` and `inject_enum_addr` instructions are generated in the presence of non-loadable Optional values. The compiler used to treat these instructions as inactive, which resulted in the silent run-time
issues described in #64223.
The patch marks `init_enum_data_addr` as "active" if its Optional operand is also active, and in PullbackCloner we differentiate through it and the related `inject_enum_addr`.
However, we only determine this relation in simple cases when both instructions are in the same block. There is no def-use relation between them (both take the same Optional operand), so if there is more than one set of instructions
operating on the same Optional, or there is some control flow, we currently bail out.
In PullbackCloner, we walk over instructions in reverse order and start from `inject_enum_addr` and its `Optional<Wrapped>.TangentVector` operand. Assuming that it is already initialized, we emit an `unchecked_take_enum_data_addr` and set it as the adjoint buffer of `init_enum_data_addr`. The Optional value is
invalidated, and we have to destroy the enum data address later when we reach `init_enum_data_addr`.
We need a lowered type for the branch trace enum in order to compute the linear map tuple type. However, the lowering of the branch trace enum type depends on the types of its elements (the payloads are the linear map tuples of the predecessor BBs).
As lowered types are cached, we can no longer populate the branch trace enum entries at the end as we did before: by then, the wrong lowered types would already have been used for the linear map tuples.
Instead, traverse basic blocks in reverse post-order, building linear map tuples and branch tracing enums in one go, ensuring that we are done with the predecessor BBs before processing the BB itself.
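A generic sketch of such a reverse post-order walk (illustrative, not the compiler's data structures): a post-order DFS over successors, reversed, visits every block only after all of its predecessors (ignoring back edges).

func reversePostOrder(entry: Int, successors: [Int: [Int]]) -> [Int] {
    var visited = Set<Int>()
    var postOrder: [Int] = []
    func visit(_ block: Int) {
        guard visited.insert(block).inserted else { return }
        for succ in successors[block, default: []] { visit(succ) }
        postOrder.append(block)   // appended only after all reachable successors
    }
    visit(entry)
    return Array(postOrder.reversed())
}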
This pass lowers `let` property accesses of classes.
Lowering consists of two tasks:
* In class initializers, insert `end_init_let_ref` instructions at places where all let-fields are initialized.
This strictly separates the life-range of the class into a region where let fields are still written during
initialization and a region where let fields are truly immutable.
* Add the `[immutable]` flag to all `ref_element_addr` instructions (for let-fields) which are in the "immutable"
region. This includes the region after an inserted `end_init_let_ref` in a class initializer, but also all
let-field accesses in functions other than the initializer and the destructor.
This pass should run after DefiniteInitialization but before RawSILInstLowering (because it relies on `mark_uninitialized` still being present in the class initializer).
Note that it's not mandatory to run this pass. If it doesn't run, SIL is still correct.
Simplified example (after lowering):
bb0(%0 : @owned C): // = self of the class initializer
%1 = mark_uninitialized %0
%2 = ref_element_addr %1, #C.l // a let-field
store %init_value to %2
%3 = end_init_let_ref %1 // inserted by lowering
%4 = ref_element_addr [immutable] %3, #C.l // set to immutable by lowering
%5 = load %4
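The Swift-level shape that produces SIL like the above is roughly the following (illustrative):

final class C {
    let l: Int                // the let-field #C.l

    init(value: Int) {
        l = value             // stored while the field is still being initialized
        // All let-fields are initialized at this point, so the pass can insert
        // end_init_let_ref here; later accesses to `l` then use
        // ref_element_addr [immutable].
        print(l)
    }
}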
For some values we cannot compute types for differentiation (for example,
tangent vector type), so it is better to diagnose them earlier. Otherwise we hit
assertions when generating code for such invalid values.
The LIT test is a reduced reproducer from issue #66996. Before the patch, the
compiler crashed while trying to get a tangent vector type for the following
value (a `partial_apply`):
%54 = function_ref @$s4null1o2ffSdAA1FV_tFSdyKXEfu0_ :
$@convention(thin) @substituted <τ_0_0> (@inout_aliasable Double)
-> (@out τ_0_0, @error any Error) for <Double>
%55 = partial_apply [callee_guaranteed] %54(%2) :
$@convention(thin) @substituted <τ_0_0> (@inout_aliasable Double)
-> (@out τ_0_0, @error any Error) for <Double>
Now we emit a diagnostic instead.
The patch resolves issues #66996 and #63331.
When differentiating a function that contains loops, we allocate a linear map context object on the heap. This context object may store non-trivial objects, such as closures, that need to be released explicitly. Fix the autodiff linear map context allocation builtins to correctly release such objects rather than just freeing the memory they occupy.
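A hedged sketch of the kind of function that exercises this path (illustrative; assumes a toolchain where the experimental `_Differentiation` module is available):

import _Differentiation

@differentiable(reverse)
func iterate(_ x: Double) -> Double {
    var result = x
    for _ in 0 ..< 3 {
        // Each iteration pushes a pullback closure into the heap-allocated
        // linear map context; such closures must be released, not merely freed.
        result = result * result
    }
    return result
}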