The patch implements proper sil-combiner handling of
`differentiable_function` for cases when extractee has non-trivial
ownership. In such casese, it is consumed by the differentiable_function
instruction. We must copy the extractee before the consumption point so
the copy remains live afterward.
Fixes#88816
We cannot use spare bits or other overlapping storage layout tricks with fundamentally
address-only enums, and we can take advantage of this to do borrowing switches or other
in-place projections without copying the value. However, for resilient enums, the
implementation may use spare bit packing, but the type must be handled address-only
outside of its defining module, and we didn't have a way to express that with
borrowing switch. Optimization passes have also been running into problems with the
complexity that we were using `unchecked_take_enum_data_addr` sometimes as a pure
operation. This patch splits the instruction into three:
- `unchecked_inplace_enum_data_addr` represents a nondestructive in-place enum
projection. It is only allowed for enums whose projection operation is
nondestructive.
- `unchecked_take_enum_data_addr` represents a destructive enum projection,
invalidating the enum and leaving the payload to be further consumed.
This matches the current instruction's semantics.
- `unchecked_borrow_enum_data_addr` represents a borrowing enum projection.
The instruction takes a second operand for "scratch" space, which the
enum representation may be copied into in order to avoid invalidating the
enum value, so the result is dependent on the lifetime of both the
original enum and the scratch buffer. This allows for borrowing switches
over resilient enums.
`unchecked_borrow_enum_data_addr` is implemented by taking advantage of the
"address-only enums can't do spare bit optimization" property at runtime.
We inspect the operand type's bitwise-borrowability from its metadata. If
the type is bitwise-borrowable, then we are allowed to bitwise-copy the
enum to the scratch space and apply the projection to the scratch space,
preserving the original value. If the type is not bitwise-borrowable, then
we cannot use spare bit optimization in its layout, so we apply the
projection in-place.
Fixes rdar://174952822.
Required for Builtin.borrowAt. SILGen may generate dead borrow scopes for
loadable values used in a return expression that for a borrow access that uses
guaranteed_address convention.
rdar://175382154 (Enable load_borrow simplification at -Onone)
When the option `-remove-runtime-asserts` is used all `cond_fail` instructions are removed.
However, the cast optimizer inserts such unconditional fails for failing casts. This ended up in an infinite optimization loop in SILCombine.
The fix is
1. don't remove unconditional `cond_fail`s, even if with the `-remove-runtime-asserts` option. This also has the benefit that it enables later optimizations to remove all the dead code after such an unconditional `cond_fail`.
2. Don't optimize a failing cast if it is already preceded by an unconditional `cond_fail`
This bug was introduced by https://github.com/swiftlang/swift/pull/88258
Fixes a compiler hang
rdar://174185165
This optimization handles unconditional `cond_fail` instructions, i.e. `cond_fail`s with a non-zero `integer_literal` operand.
It cuts off the control flow after such a `cond_fail` by inserting an `unreachable` instruction.
However, this optimization cannot be done as instruction simplification, because it can leave OSSA lifetimes uncompleted.
Other simplification may depend on complete lifetimes.
Similar for constant folding failing casts: we also cannot insert an `unreachable` there.
Instead, do this optimization a new function pass (which can do lifetime completion).
Fixes a SIL verification error
rdar://173728487
This optimization replaces (the very inefficient) RawRepresentable comparison to a simple compare of enum tags.
However if the raw type is a custom type we don't know how the comparison is implemented.
A custom raw type can implement the case comparison in a way that comparing different cases will return `true`.
Therefore only do the optimization for known stdlib raw value types.
Fixes a mis-compile
https://github.com/swiftlang/swift/issues/87906
rdar://172746003
Support folding `differentiable_function_extract` of
`differentiable_function` in presence of borrowed scopes. Such folding
is crucial for VJP inlining, which is required for AutoDiff closure
specialization (ADCS) pass working properly.
Such handling was not required previously, but now ADCS runs in presence
of OSSA, making handling of borrowed scopes essential.
The folding logic is based on similar logic for
`struct`/`struct_extract` simplification.
Note that the `AutoDiff/SILOptimizer/licm_context.swift` test needs to
be modified since it relies on specific inlining behavior. Particularly,
we need to force inlining of the implicitly generated VJP of `B.a()`
into the VJP of `q()`. Without `@inline(__always)`, this particular
inlining decision stops happening on MacOS because the SIL combiner
changes from this patch affect the inlining decisions.
The changes from this patch make some new inlining decisions possible to
be taken befor attempting to inline the VJP of `B.a()` into the VJP of
`q()`. As a result, the VJP of `B.a()` becomes bigger because of other
VJPs being inlined into that, and the inlining cost of `B.a()` VJP
becomes too high when trying to perform inlining inside the VJP of
`q()`.
Depends on #87859 to allow force-inlining in
`AutoDiff/SILOptimizer/licm_context.swift`.
When propagating a concrete type of an existential to an apply/try_apply instruction, we need to insert either an `unchecked_addr_cast` for address-only (= general) existentials or an `unchecked_ref_cast` for class existentials.
Fixes a SIL verification crash
rdar://172758483
For example:
```
checked_cast_br B in %1 to X, bb1, bb2
bb1(%2 : @owned $X):
destroy_value %2 // no other instructions in this block
br bb3
bb2(%4 : @owned $B):
destroy_value %4 // no other instructions in this block
br bb3
```
The `move_value` instruction is only used to specify flags.
Remove a `move_value` which either doesn't specify any flags or which flags are not relevant outside the mandatory pipeline, like `[lexical]`.
Rewrite the enum -> single payload optimization to handle non-destructive `unchecked_take_enum_data_addr` correctly.
Also, we can handle more cases now.
Inserts an unreachable after an unconditional fail:
```
%0 = integer_literal 1
cond_fail %0, "message"
// following instructions
```
->
```
%0 = integer_literal 1
cond_fail %0, "message"
unreachable
deadblock:
// following instructions
```
Remove the old SILCombine implementation because it's not working well with OSSA lifetime completion.
This also required to move the `shouldRemoveCondFail` utility function from SILCombine to InstOptUtils.
This can cause several problems, e.g. false performance errors or even mis-compiles.
Instead, just ignore dead-end destroys as we don't want to "move" them away from `unreachable` instructions, anyway.
rdar://167553623
* remove borrow scopes which are borrowing an already guaranteed value
* allow optimizing lexical `begin_borrows` outside the mandatory pipeline
* fix: don't remove `begin_borrow [lexical]` of a `thin_to_thick_function` in the mandatory pipeline
When trying to remove a borrow scope by replacing all guaranteed uses with owned uses, don't bail for values which have debug_value uses.
Also make sure that the debug_value instructions are located within the owned value's lifetime.
Attempt to optimize by forwarding the destroy to operands of forwarding instructions.
```
%3 = struct $S (%1, %2)
destroy_value %3 // the only use of %3
```
->
```
destroy_value %1
destroy_value %2
```
The benefit of this transformation is that the forwarding instruction can be removed.
Also, handle `destroy_value` for phi arguments.
This is a more complex case where the destroyed value comes from different predecessors via a phi argument.
The optimization moves the `destroy_value` to each predecessor block.
```
bb1:
br bb3(%0)
bb2:
br bb3(%1)
bb3(%3 : @owned T):
... // no deinit-barriers
destroy_value %3 // the only use of %3
```
->
```
bb1:
destroy_value %0
br bb3
bb2:
destroy_value %1
br bb3
bb3:
...
```
This showed up on and off again on the source-compatibility testsuite project hummingbird.
The gist of the problem is that transformations may not rewrite the
type of an inlined instance of a variable without also createing a
deep copy of the inlined function with a different name (and e.g., a
specialization suffix). Otherwise the modified inlined variable will
cause an inconsistency when later compiler passes try to create the
abstract declaration of that inlined function as there would be
conflicting declarations for that variable.
Since SILDebugScope isn't yet available in the SwiftCompilerSources
this fix just drop these variables, but it would be absolutely
possible to preserve them by using the same mechanism that SILCloner
uses to create a deep copy of the inlined function scopes.
rdar://163167975
* remove `filterUsers(ofType:)`, because it's a duplication of `users(ofType:)`
* rename `filterUses(ofType:)` -> `filter(usersOfType:)`
* rename `ignoreUses(ofType:)` -> `ignore(usersOfType:)`
* rename `getSingleUser` -> `singleUser`
* implement `singleUse` with `Sequence.singleElement`
* implement `ignoreDebugUses` with `ignore(usersOfType:)`
This is a follow-up of https://github.com/swiftlang/swift/pull/83728/commits/eb1d5f484c9f4dae73a3779191bfdf917fd07a49.
This is done by splitting the `begin_borrow` of the whole struct into individual borrows of the fields (for trivial fields no borrow is needed).
And then sinking the `struct` to it's consuming use(s).
```
%3 = struct $S(%nonTrivialField, %trivialField) // owned
...
%4 = begin_borrow %3
%5 = struct_extract %4, #S.nonTrivialField
%6 = struct_extract %4, #S.trivialField
use %5, %6
end_borrow %4
...
end_of_lifetime %3
```
->
```
...
%5 = begin_borrow %nonTrivialField
use %5, %trivialField
end_borrow %5
...
%3 = struct $S(%nonTrivialField, %trivialField)
end_of_lifetime %3
```
This optimization is important for Array code where the Array buffer is constantly wrapped into structs and then extracted again to access the buffer.
The assertion is hit through `TypeValueInst.simplify` when constructing
an integer literal instruction with a negative 64-bit `Swift.Int` and a
bit width of 32 (the target pointer bit width for arm64_32 watchOS).
This happens because we tell the `llvm::APInt` constructor to treat the
input integer as unsigned by default in `getAPInt`, and a negative
64-bit signed integer does not fit into 32 bits when interpreted as
unsigned.
Fix this by flipping the default signedness assumption for the Swift API
and introducing a convenience method for constructing a 1-bit integer
literal instruction, where the correct signedness assumption depends on
whether you want to use 1 or -1 for 'true'.
In the context of using an integer to construct an `llvm::APInt`, there
are 2 other cases where signedness matters that come to mind:
1. A non-decimal integer literal narrower than 64 bits, such as
`0xABCD`, is used.
2. The desired bit width is >64, since `llvm::APInt` can either
zero-extend or sign-extend the 64-bit integer it accepts.
Neither of these appear to be exercised in SwiftCompilerSources, and
if we ever do, the caller should be responsible for either (1)
appropriately extending the literal manually, e.g.
`Int(Int16(bitPattern: 0xABCD))`, or (2) passing along the appropriate
signedness.
The `explicit_copy_value` and `explicit_copy_addr` instructions are only used for non-copyable diagnostics in the mandatory pipeline.
After that we can replace them by their non-explicit counterparts so that optimizations (which only know of `copy_value` and `copy_addr`) can do their work.
rdar://159039552
When replacing an opened existential type with the concrete type, we didn't consider that the existential archetype can also be a "dependent" type of the root archetype.
For now, just bail in this case. In future we can support dependent archetypes as well.
Fixes a compiler crash.
rdar://158594365