There are passes that rely on the isolation being present after
specialization and other optmizations i.e. `SendNonSendable` which
means the clones need to always preserve the isolation of the
original function.
Support folding `differentiable_function_extract` of
`differentiable_function` in presence of borrowed scopes. Such folding
is crucial for VJP inlining, which is required for AutoDiff closure
specialization (ADCS) pass working properly.
Such handling was not required previously, but now ADCS runs in presence
of OSSA, making handling of borrowed scopes essential.
The folding logic is based on similar logic for
`struct`/`struct_extract` simplification.
Note that the `AutoDiff/SILOptimizer/licm_context.swift` test needs to
be modified since it relies on specific inlining behavior. Particularly,
we need to force inlining of the implicitly generated VJP of `B.a()`
into the VJP of `q()`. Without `@inline(__always)`, this particular
inlining decision stops happening on MacOS because the SIL combiner
changes from this patch affect the inlining decisions.
The changes from this patch make some new inlining decisions possible to
be taken befor attempting to inline the VJP of `B.a()` into the VJP of
`q()`. As a result, the VJP of `B.a()` becomes bigger because of other
VJPs being inlined into that, and the inlining cost of `B.a()` VJP
becomes too high when trying to perform inlining inside the VJP of
`q()`.
Depends on #87859 to allow force-inlining in
`AutoDiff/SILOptimizer/licm_context.swift`.
* replace the non-OSSA ClosureSpecializer with the new OSSA ClosureSpecialization pass
* move the OwnershipModelEliminator after the mid-level and closure-specialization pipelines
* add an additional RedundantLoadElimination pass at the begin of the low-level pipeline to compensate for not eliminated loads in OSSA
In #84704, some tests were XFAIL'ed since they've become failing. The
root cause of failures is lack of ownership info on the 2nd run of
AutoDiff closure specialization pass. Ownership info is required for the
pass run after #84704, so at the moment only the 1st run of the pass is
effective.
Several test cases still remain passing because for some cases we are
lucky and all the inlining required for specialization is done before
the 1st pass run. This patch enables such test cases back so we have at
least some test coverage.
Beside supporting OSSA, this change significantly simplifies the pass.
The main change is that instead of starting at a closure (e.g. `partial_apply`) and finding all call sites, we now start at a call site and look for closures for all arguments. This makes a lot of things much simpler, e.g. not so many intermediate data structures are required to track all the states.
I needed to remove the 3 unit tests because the things those tests were testing are not there anymore. However, the pass is tested with a lot of sil tests (and I added quite a few), which should give good test coverage.
The old ClosureSpecializer pass is still kept in place, because at that point in the pipeline we don't have OSSA, yet. Once we have that, we can replace the old pass withe the new one.
However, the autodiff closure specializer already runs in the OSSA pipeline and there the new changes take effect.
Previously, AutoDiff closure specialization pass was triggered only on
VJPs containing single basic block. However, the pass logic allows
running on arbitrary VJPs. This PR enables the pass for all VJPs
unconditionally. So, if the pullback corresponding to multiple-BB VJP
accepts some closures directly as arguments, these closures might become
specialized by the pass. Closures passed via payload of branch tracing
enum are not specialized - this is subject for future changes.
The PR contains several commits.
1. The thing named "call site" in the code is partial_apply of pullback
corresponding to the VJP. This might appear only once, so we drop
support for multiple "call sites".
2. Enhance existing SILOptimizer tests for the pass.
3. Add validation-tests for single basic block case.
4. The change itself - delete check against single basic block.
5. Add validation-tests for multiple basic block case.
6. Add SILOptimizer tests for multiple basic block case.