This PR implements first set of changes required to support autodiff for coroutines. It mostly targeted to `_modify` accessors in standard library (and beyond), but overall implementation is quite generic.
There are some specifics of implementation and known limitations:
- Only `@yield_once` coroutines are naturally supported
- VJP is a coroutine itself: it yields the results *and* returns a pullback closure as a normal return. This allows us to capture values produced in resume part of a coroutine (this is required for defers and other cleanups / commits)
- Pullback is a coroutine, we assume that coroutine cannot abort and therefore we execute the original coroutine in reverse from return via yield and then back to the entry
- It seems there is no semantically sane way to support `_read` coroutines (as we will need to "accept" adjoints via yields), therefore only coroutines with inout yields are supported (`_modify` accessors). Pullbacks of such coroutines take adjoint buffer as input argument, yield this buffer (to accumulate adjoint values in the caller) and finally return the adjoints indirectly.
- Coroutines (as opposed to normal functions) are not first-class values: there is no AST type for them, one cannot e.g. store them into tuples, etc. So, everywhere where AST type is required, we have to hack around.
- As there is no AST type for coroutines, there is no way one could register custom derivative for coroutines. So far only compiler-produced derivatives are supported
- There are lots of common things wrt normal function apply's, but still there are subtle but important differences. I tried to organize the code to enable code reuse, still it was not always possible, so some code duplication could be seen
- The order of how pullback closures are produced in VJP is a bit different: for normal apply's VJP produces both value and pullback closure via a single nested VJP apply. This is not so anymore with coroutine VJP's: yielded values are produced at `begin_apply` site and pullback closure is available only from `end_apply`, so we need to track the order in which pullbacks are produced (and arrange consumption of the values accordingly – effectively delay them)
- On the way some complementary changes were required in e.g. mangler / demangler
This patch covers the generation of derivatives up to SIL level, however, it is not enough as codegen of `partial_apply` of a coroutine is completely broken. The fix for this will be submitted separately as it is not directly autodiff-related.
---------
Co-authored-by: Andrew Savonichev <andrew.savonichev@gmail.com>
Co-authored-by: Richard Wei <rxwei@apple.com>
drop_deinit forwards ownership while effectively stripping the deinitializer. It is similar to a type cast.
Fixes rdar://125590074 ([NonescapableTypes] Nonescapable types
cannot have deinits)
ActorIsolation already has a "I have no value case": unspecified. Lets just use
that.
Just a mistake I made that I am trying to fix before anything further depends on
this code.
When cloning SIL, it's OK for conformances to Copyable or Escapable to
be carried-over as a builtin conformance, rather than an abstract
conformance.
This is a workaround for a bug introduced in
`6cd5468cceacc1d600c7dafdd4debc6740d1dfd6`.
resolves rdar://125460667
* Allow normal function results of @yield_once coroutines
* Address review comments
* Workaround LLVM coroutine codegen problem: it assumes that unwind path never returns.
This is not true to Swift coroutines as unwind path should end with error result.
[region-isolation] A few improvements + If a value is dynamically actor isolated, do not consider it transferred if the transfer statically was to that same actor isolation.
This issue can come up when a value is initially statically disconnected, but
after we performed dataflow, we discovered that it was actually actor isolated
at the transfer point, implying that we are not actually transferring.
Example:
```swift
@MainActor func testGlobalAndGlobalIsolatedPartialApplyMatch2() {
var ns = (NonSendableKlass(), NonSendableKlass())
// Regions: (ns.0, ns.1), {(mainActorIsolatedGlobal), @MainActor}
ns.0 = mainActorIsolatedGlobal
// Regions: {(ns.0, ns.1, mainActorIsolatedGlobal), @MainActor}
// This is not a transfer since ns is already main actor isolated.
let _ = { @MainActor in
print(ns)
}
useValue(ns)
}
```
To do this, I also added to SILFunction an actor isolation that SILGen puts on
the SILFunction during pre function visitation. We don't print it or serialize
it for now.
rdar://123474616
Change FieldSensitive's enum representation to allow distinguishing
among the elements with associated value. Consider
`unchecked_take_enum_data_addr` to consume all other fields than that
taken.
rdar://125113258
* Let the customBits and lastInitializedBitfieldID share a single uint64_t. This increases the number of available bits in SILNode and Operand from 8 to 20. Also, it simplifies the Operand class because no PointerIntPairs are used anymore to store the operand pointer fields.
* Instead make the "deleted" flag a separate bool field in SILNode (instead of encoding it with the sign of lastInitializedBitfieldID). Another simplification
* Enable important invariant checks also in release builds by using `require` instead of `assert`. Not catching such errors in release builds would be a disaster.
* Let the Swift optimization passes use all the available bits and not only a fixed amount of 8 (SILNode) and 16 (SILBasicBlock).
An instruction can consume multiple (discontiguous) fields. Use a
SmallBitVector to track the fields consumed by an instruction rather
than a TypeTreeLeafRange.
rdar://125103951
A `try_apply` with indirect out arguments is only a def for those arguments on
the success path. Model this by sinking the def-ness of the instruction into the
success branch of the try_apply, and introducing a new `DeadToLiveEdge` mode for
block liveness which stops propagation of use-before-def conditions into the
block that introduced the def. Fixes rdar://118567869.
Add PartialApplyInst.hasNoescapeCapture
Add PartialApplyInst.mayEscape
Refactor DiagnoseInvalidEscapingCaptures. This may change functionality because tuples containing a noescape closure are now correctly recognized. Although I'm not sure such tupes can ever be captured directly.
LLVM is presumably moving towards `std::string_view` -
`StringRef::startswith` is deprecated on tip. `SmallString::startswith`
was just renamed there (maybe with some small deprecation inbetween, but
if so, we've missed it).
The `SmallString::startswith` references were moved to
`.str().starts_with()`, rather than adding the `starts_with` on
`stable/20230725` as we only had a few of them. Open to switching that
over if anyone feels strongly though.