This new OSSA invariant simplifies many optimizations because they don't have to take care of the corner case of incomplete lifetimes in dead-end blocks.
The implementation basically consists of these changes:
* add the lifetime completion utility
* add a flag in SILFunction which tells optimization that they need to run the lifetime completion utility
* let all optimizations complete lifetimes if necessary
* enable the ownership verifier to check complete lifetimes
These two new invariants eliminate corner cases which caused bugs if optimization didn't handle them.
Also, it will significantly simplify lifetime completion.
The implementation basically consists of these changes:
* add a flag in SILFunction which tells optimization if they need to take care of infinite loops
* add a utility to break infinite loops
* let all optimizations remove unreachable blocks and break infinite loops if necessary
* add verification to check the new SIL invariants
The new `breakIfniniteLoops` utility breaks infinite loops in the control flow by inserting an "artificial" loop exit to a new dead-end block with an `unreachable`.
It inserts a `cond_br` with a `builtin "infinite_loop_true_condition"`:
```
bb0:
br bb1
bb1:
br bb1 // back-end branch
```
->
```
bb0:
br bb1
bb1:
%1 = builtin "infinite_loop_true_condition"() // always true, but the compiler doesn't know
cond_br %1, bb2, bb3
bb2: // new back-end block
br bb1
bb3: // new dead-end block
unreachable
```
Inserts an unreachable after an unconditional fail:
```
%0 = integer_literal 1
cond_fail %0, "message"
// following instructions
```
->
```
%0 = integer_literal 1
cond_fail %0, "message"
unreachable
deadblock:
// following instructions
```
Remove the old SILCombine implementation because it's not working well with OSSA lifetime completion.
This also required to move the `shouldRemoveCondFail` utility function from SILCombine to InstOptUtils.
Usually `explicit_copy_addr` and `explicit_copy_value` don't survive until the first SILCombine pass run anyway.
But if they do, the simplifications need to be registered, otherwise SILCombine will complain.
This is needed in Embedded Swift because the `witness_method` convention requires passing the witness table to the callee.
However, the witness table is not necessarily available.
A witness table is only generated if an existential value of a protocol is created.
This is a rare situation because only witness thunks have `witness_method` convention and those thunks are created as "transparent" functions, which means they are always inlined (after de-virtualization of a witness method call).
However, inlining - even of transparent functions - can fail for some reasons.
This change adds a new EmbeddedWitnessCallSpecialization pass:
If a function with `witness_method` convention is directly called, the function is specialized by changing the convention to `method` and the call is replaced by a call to the specialized function:
```
%1 = function_ref @callee : $@convention(witness_method: P) (@guaranteed C) -> ()
%2 = apply %1(%0) : $@convention(witness_method: P) (@guaranteed C) -> ()
...
sil [ossa] @callee : $@convention(witness_method: P) (@guaranteed C) -> () {
...
}
```
->
```
%1 = function_ref @$e6calleeTfr9 : $@convention(method) (@guaranteed C) -> ()
%2 = apply %1(%0) : $@convention(method) (@guaranteed C) -> ()
...
// specialized callee
sil shared [ossa] @$e6calleeTfr9 : $@convention(method) (@guaranteed C) -> () {
...
}
```
Fixes a compiler crash
rdar://165184147
It eliminates dead access scopes if they are not conflicting with other scopes.
Removes:
```
%2 = begin_access [modify] [dynamic] %1
... // no uses of %2
end_access %2
```
However, dead _conflicting_ access scopes are not removed.
If a conflicting scope becomes dead because an optimization e.g. removed a load, it is still important to get an access violation at runtime.
Even a propagated value of a redundant load from a conflicting scope is undefined.
```
%2 = begin_access [modify] [dynamic] %1
store %x to %2
%3 = begin_access [read] [dynamic] %1 // conflicting with %2!
%y = load %3
end_access %3
end_access %2
use(%y)
```
After redundant-load-elimination:
```
%2 = begin_access [modify] [dynamic] %1
store %x to %2
%3 = begin_access [read] [dynamic] %1 // now dead, but still conflicting with %2
end_access %3
end_access %2
use(%x) // propagated from the store, but undefined here!
```
In this case the scope `%3` is not removed because it's important to get an access violation error at runtime before the undefined value `%x` is used.
This pass considers potential conflicting access scopes in called functions.
But it does not consider potential conflicting access in callers (because it can't!).
However, optimizations, like redundant-load-elimination, can only do such transformations if the outer access scope is within the function, e.g.
```
bb0(%0 : $*T): // an inout from a conflicting scope in the caller
store %x to %0
%3 = begin_access [read] [dynamic] %1
%y = load %3 // cannot be propagated because it cannot be proved that %1 is the same address as %0
end_access %3
```
All those checks are only done for dynamic access scopes, because they matter for runtime exclusivity checking.
Dead static scopes are removed unconditionally.
It hoists `destroy_value` instructions for non-lexical values.
```
%1 = some_ownedValue
...
last_use(%1)
... // other instructions
destroy_value %1
```
->
```
%1 = some_ownedValue
...
last_use(%1)
destroy_value %1 // <- moved after the last use
... // other instructions
```
In contrast to non-mandatory optimization passes, this is the only pass which hoists destroys over deinit-barriers.
This ensures consistent behavior in -Onone and optimized builds.
This is done by splitting the `begin_borrow` of the whole struct into individual borrows of the fields (for trivial fields no borrow is needed).
And then sinking the `struct` to it's consuming use(s).
```
%3 = struct $S(%nonTrivialField, %trivialField) // owned
...
%4 = begin_borrow %3
%5 = struct_extract %4, #S.nonTrivialField
%6 = struct_extract %4, #S.trivialField
use %5, %6
end_borrow %4
...
end_of_lifetime %3
```
->
```
...
%5 = begin_borrow %nonTrivialField
use %5, %trivialField
end_borrow %5
...
%3 = struct $S(%nonTrivialField, %trivialField)
end_of_lifetime %3
```
This optimization is important for Array code where the Array buffer is constantly wrapped into structs and then extracted again to access the buffer.
Beside supporting OSSA, this change significantly simplifies the pass.
The main change is that instead of starting at a closure (e.g. `partial_apply`) and finding all call sites, we now start at a call site and look for closures for all arguments. This makes a lot of things much simpler, e.g. not so many intermediate data structures are required to track all the states.
I needed to remove the 3 unit tests because the things those tests were testing are not there anymore. However, the pass is tested with a lot of sil tests (and I added quite a few), which should give good test coverage.
The old ClosureSpecializer pass is still kept in place, because at that point in the pipeline we don't have OSSA, yet. Once we have that, we can replace the old pass withe the new one.
However, the autodiff closure specializer already runs in the OSSA pipeline and there the new changes take effect.
During inlining, some instructions in the caller may be deleted. So
when publishing this best effort list of instructions which were
"changed" during inlining, don't start from a deleted instruction.
Instead just don't publish, as already happens in other cases.
Without actually addressing
```
// TODO: get a list of cloned instructions from the `inlineFunction`
```
this is somewhat better than checking for having reached the end of the
function during the walk because that will result in falsely
broadcasting that some unchanged instructions have changed whereas this
change only results in not broadcasting which is already being done in
some cases (e.g. when a `begin_apply` is immediately followed by an
`end_apply`).
During inlining, the apply is deleted. So when publishing this best
effort list of which instructions were "changed" during inlining, start
from the instruction before the deleted apply.
(old name: CapturePropagation)
The pass is now rewritten in swift which makes the code smaller and simpler.
Compared to the old pass it has two improvements:
* It can constant propagate whole structs (and not only builtin literals). This is important for propagating "real" Swift constants which have a struct type of e.g. `Int`.
* It constant propagates keypaths even if there are other non-constant closure captures which are not propagated. This is something the old pass didn't do.
rdar://151185177
* move some Cloner utilities from ContextCommon.swift directly into Cloner.swift
* add an `cloneRecursively` overload which doesn't require the `customGetCloned` closure argument
* some small cleanups
So far, constant propagated arguments could only be builtin literals.
Now we support arbitrary structs (with constant arguments), e.g. `Int`.
This requires a small addition in the mangling scheme for function specializations.
Also, the de-mangling tree now looks a bit different to support a "tree" of structs and literals.