The `projection` flag indicates that `index_addr` projects an element address from an array base address, as opposed to being used for general pointer arithmetic.
When this flag is set, the result address can only reach the single element at the given index — it is not possible to chain multiple `index_addr` instructions to reach other array elements from the result.
Without this flag, the result may be used as the base of another `index_addr`, allowing arithmetic across element boundaries (e.g. an `index_addr` with index 1 followed by an `index_addr` with index 2 reaches the element at offset 3).
An `index_addr [projection]` is mandatory to go from an array base address to an element - even if it's the first element, i.e. the index is zero.
This means that the optimizer must not remove `index_addr [projection]` with a zero index.
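As a rough Swift-level analogy for the non-projection case, chained pointer advances accumulate their offsets in the same way. This is only a sketch: `UnsafePointer` arithmetic stands in for `index_addr` without the flag, and the array contents are made up for illustration.

```swift
// Analogy only: UnsafePointer arithmetic stands in for `index_addr` without
// the `projection` flag. The array contents are arbitrary illustration data.
let elements = [10, 20, 30, 40, 50]

let reached = elements.withUnsafeBufferPointer { buffer -> Int in
  let base = buffer.baseAddress!        // address of element 0 (array is non-empty)
  let first = base.advanced(by: 1)      // like index_addr with index 1
  let second = first.advanced(by: 2)    // like index_addr with index 2 on that result
  return second.pointee                 // element at offset 1 + 2 = 3
}

print(reached)
```

With `[projection]`, the second step would not be allowed: the result of the first step may only be used to access that one element.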
When the operand was changed by a pass but has a different type from
what the undef type should be, pass the correct type to killOperand
so that an undef with the right type is emitted.
When the value of an enum discriminator is known but the payload unknown
(for example, an Optional that we know is non-nil), salvage the value
using a debug reconstruction block that recreates the enum using the
payload and the known discriminator.
Replaces a sequence which is commonly found in generated class destructors:
```
%1 = some_owned_value
%2 = begin_borrow %1 // the only use of %1
%3 = unchecked_ref_cast %2 to $C
%4 = unchecked_ownership_conversion %3, @guaranteed to @owned
end_borrow %2
end_lifetime %1
```
->
```
%1 = some_owned_value
%2 = begin_borrow %1 // now dead
end_borrow %2
%4 = unchecked_ref_cast %1 to $C
```
Immortal foreign reference types never need release or retain operations
in Swift. They are represented as trivial types, i.e., they can be copied/loaded
without any refcount operations. Some peepholes introduce an upcast or
unchecked_ref_cast on the archetype, which is not trivial. As a result,
we ended up with SIL that failed verification because we tried to perform
trivial operations on a non-trivial type.
This PR disables the peephole optimizations when the triviality of the
address type differs from the triviality of the type after the
transformation.
rdar://177549159
- Canonicalize dynamic_pack_index and pack_pack_index to scalar_pack_index.
- Replace opened Pack Element types with concrete types when statically known (i.e. when open_pack_element uses a scalar_pack_index) in SILCloner.
- Add a simplification for tuple_pack_element_addr to replace it with tuple_element_addr when it uses a scalar_pack_index. This is near-identical to the existing SILCombine visitor for tuple_pack_element_addr (which worked for dynamic_pack_index).
TODO: Fix or remove old SILCombine visitors for pack instructions that are broken or made obsolete by this change.
The patch enhances sil-combiner handling of `differentiable_function` by
adding support for `convert_function` which is further used in
`differentiable_function_extract`:
```
%0 = differentiable_function ... %x
%1 = begin_borrow %0
%2 = convert_function %1 to ...
%3 = differentiable_function_extract [xxx] %2
// use of %3
```
-->
```
%0 = differentiable_function ... %x
// use of %x
```
If a tuple_pack_element_addr uses a scalar_pack_index rather than a
dynamic_pack_index, we can replace it with tuple_element_addr, since the
specific tuple element accessed is statically known.
The patch implements proper sil-combiner handling of
`differentiable_function` for cases when extractee has non-trivial
ownership. In such cases, it is consumed by the differentiable_function
instruction. We must copy the extractee before the consumption point so
the copy remains live afterward.
Fixes #88816
We cannot use spare bits or other overlapping storage layout tricks with fundamentally
address-only enums, and we can take advantage of this to do borrowing switches or other
in-place projections without copying the value. However, for resilient enums, the
implementation may use spare bit packing, but the type must be handled address-only
outside of its defining module, and we didn't have a way to express that with
borrowing switch. Optimization passes have also run into problems because
`unchecked_take_enum_data_addr` was sometimes used as a pure operation, which
added complexity. This patch splits the instruction into three:
- `unchecked_inplace_enum_data_addr` represents a nondestructive in-place enum
projection. It is only allowed for enums whose projection operation is
nondestructive.
- `unchecked_take_enum_data_addr` represents a destructive enum projection,
invalidating the enum and leaving the payload to be further consumed.
This matches the current instruction's semantics.
- `unchecked_borrow_enum_data_addr` represents a borrowing enum projection.
The instruction takes a second operand for "scratch" space, which the
enum representation may be copied into in order to avoid invalidating the
enum value, so the result is dependent on the lifetime of both the
original enum and the scratch buffer. This allows for borrowing switches
over resilient enums.
`unchecked_borrow_enum_data_addr` is implemented by taking advantage of the
"address-only enums can't do spare bit optimization" property at runtime.
We inspect the operand type's bitwise-borrowability from its metadata. If
the type is bitwise-borrowable, then we are allowed to bitwise-copy the
enum to the scratch space and apply the projection to the scratch space,
preserving the original value. If the type is not bitwise-borrowable, then
we cannot use spare bit optimization in its layout, so we apply the
projection in-place.
Fixes rdar://174952822.
Required for Builtin.borrowAt. SILGen may generate dead borrow scopes for
loadable values used in a return expression for a borrow access that uses the
guaranteed_address convention.
rdar://175382154 (Enable load_borrow simplification at -Onone)
When the option `-remove-runtime-asserts` is used, all `cond_fail` instructions are removed.
However, the cast optimizer inserts such unconditional fails for failing casts. This ended up in an infinite optimization loop in SILCombine.
The fix has two parts:
1. Don't remove unconditional `cond_fail`s, even with the `-remove-runtime-asserts` option. This also has the benefit that it enables later optimizations to remove all the dead code after such an unconditional `cond_fail`.
2. Don't optimize a failing cast if it is already preceded by an unconditional `cond_fail`.
This bug was introduced by https://github.com/swiftlang/swift/pull/88258
Fixes a compiler hang
rdar://174185165
This optimization handles unconditional `cond_fail` instructions, i.e. `cond_fail`s with a non-zero `integer_literal` operand.
It cuts off the control flow after such a `cond_fail` by inserting an `unreachable` instruction.
However, this optimization cannot be done as instruction simplification, because it can leave OSSA lifetimes uncompleted.
Other simplifications may depend on complete lifetimes.
The same applies to constant folding of failing casts: we cannot insert an `unreachable` there either.
Instead, do this optimization in a new function pass (which can do lifetime completion).
Fixes a SIL verification error
rdar://173728487
This optimization replaces the (very inefficient) RawRepresentable comparison with a simple comparison of enum tags.
However, if the raw type is a custom type, we don't know how the comparison is implemented.
A custom raw type can implement the case comparison in a way that comparing different cases will return `true`.
Therefore only do the optimization for known stdlib raw value types.
Fixes a mis-compile
https://github.com/swiftlang/swift/issues/87906
rdar://172746003
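A minimal sketch of the kind of raw type that makes a tag comparison unsound. The `Parity` type and the `Slot` enum are hypothetical and are not taken from the linked issue; they only show that two different cases can have raw values that compare equal.

```swift
// Hypothetical raw type: two values compare equal when they have the same parity.
struct Parity: Equatable, ExpressibleByIntegerLiteral {
  let value: Int
  init(integerLiteral value: Int) { self.value = value }
  static func == (lhs: Parity, rhs: Parity) -> Bool {
    return lhs.value % 2 == rhs.value % 2
  }
}

// Two distinct cases whose raw values are "equal" according to Parity.
enum Slot: Parity {
  case first = 1
  case third = 3
}

// A comparison based on raw values reports a match even though the enum tags differ.
let rawValuesMatch = Slot.first.rawValue == Slot.third.rawValue
print(rawValuesMatch)
```

A comparison of enum tags would report these two cases as different, so replacing a raw-value comparison with a tag comparison changes behavior for such a type.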
Support folding `differentiable_function_extract` of
`differentiable_function` in the presence of borrow scopes. Such folding
is crucial for VJP inlining, which is required for the AutoDiff closure
specialization (ADCS) pass to work properly.
Such handling was not required previously, but ADCS now runs in the presence
of OSSA, making handling of borrow scopes essential.
The folding logic is based on similar logic for
`struct`/`struct_extract` simplification.
Note that the `AutoDiff/SILOptimizer/licm_context.swift` test needs to
be modified since it relies on specific inlining behavior. In particular,
we need to force inlining of the implicitly generated VJP of `B.a()`
into the VJP of `q()`. Without `@inline(__always)`, this particular
inlining decision stops happening on macOS because the SIL combiner
changes from this patch affect the inlining decisions.
The changes from this patch allow some new inlining decisions to be
made before attempting to inline the VJP of `B.a()` into the VJP of
`q()`. As a result, the VJP of `B.a()` becomes bigger because other
VJPs are inlined into it, and the inlining cost of the `B.a()` VJP
becomes too high when trying to perform inlining inside the VJP of
`q()`.
Depends on #87859 to allow force-inlining in
`AutoDiff/SILOptimizer/licm_context.swift`.
When propagating a concrete type of an existential to an apply/try_apply instruction, we need to insert either an `unchecked_addr_cast` for address-only (= general) existentials or an `unchecked_ref_cast` for class existentials.
Fixes a SIL verification crash
rdar://172758483
For example:
```
checked_cast_br B in %1 to X, bb1, bb2
bb1(%2 : @owned $X):
destroy_value %2 // no other instructions in this block
br bb3
bb2(%4 : @owned $B):
destroy_value %4 // no other instructions in this block
br bb3
```
The `move_value` instruction is only used to specify flags.
Remove a `move_value` which either doesn't specify any flags or whose flags are not relevant outside the mandatory pipeline, like `[lexical]`.
Rewrite the enum -> single payload optimization to handle non-destructive `unchecked_take_enum_data_addr` correctly.
Also, we can handle more cases now.
Inserts an unreachable after an unconditional fail:
```
%0 = integer_literal 1
cond_fail %0, "message"
// following instructions
```
->
```
%0 = integer_literal 1
cond_fail %0, "message"
unreachable
deadblock:
// following instructions
```
Remove the old SILCombine implementation because it's not working well with OSSA lifetime completion.
This also required moving the `shouldRemoveCondFail` utility function from SILCombine to InstOptUtils.
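The block split shown in the SIL example above can be sketched on a toy instruction list. Everything here (`ToyInst`, `cutAfterUnconditionalFail`) is invented for illustration and is not the pass's actual implementation; in particular, OSSA lifetime completion is not modeled.

```swift
// Toy instruction set, invented for this sketch; not the compiler's SIL types.
enum ToyInst: Equatable {
  case integerLiteral(name: String, value: Int)
  case condFail(operand: String, message: String)
  case unreachable
  case other(String)
}

/// Splits `block` after the first `cond_fail` whose operand is a non-zero literal.
/// Returns the truncated block (ending in `unreachable`) and the instructions
/// that end up in a new dead block, or nil if there is no such `cond_fail`.
func cutAfterUnconditionalFail(_ block: [ToyInst]) -> (live: [ToyInst], dead: [ToyInst])? {
  var literals: [String: Int] = [:]
  for (index, inst) in block.enumerated() {
    switch inst {
    case let .integerLiteral(name, value):
      literals[name] = value
    case let .condFail(operand, _):
      if let value = literals[operand], value != 0 {
        var live = Array(block[...index])
        live.append(ToyInst.unreachable)
        let dead = Array(block[(index + 1)...])
        return (live, dead)
      }
    default:
      break
    }
  }
  return nil
}

let block: [ToyInst] = [
  ToyInst.integerLiteral(name: "%0", value: 1),
  ToyInst.condFail(operand: "%0", message: "message"),
  ToyInst.other("following instruction"),
]
let result = cutAfterUnconditionalFail(block)
```

In this sketch the instructions after the `cond_fail` are returned separately, mirroring the `deadblock` in the example, so that a later cleanup can delete them.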
This can cause several problems, e.g. false performance errors or even mis-compiles.
Instead, just ignore dead-end destroys as we don't want to "move" them away from `unreachable` instructions, anyway.
rdar://167553623
* remove borrow scopes which are borrowing an already guaranteed value
* allow optimizing lexical `begin_borrow`s outside the mandatory pipeline
* fix: don't remove `begin_borrow [lexical]` of a `thin_to_thick_function` in the mandatory pipeline
When trying to remove a borrow scope by replacing all guaranteed uses with owned uses, don't bail for values which have debug_value uses.
Also make sure that the debug_value instructions are located within the owned value's lifetime.
Attempt to optimize by forwarding the destroy to operands of forwarding instructions.
```
%3 = struct $S (%1, %2)
destroy_value %3 // the only use of %3
```
->
```
destroy_value %1
destroy_value %2
```
The benefit of this transformation is that the forwarding instruction can be removed.
Also, handle `destroy_value` for phi arguments.
This is a more complex case where the destroyed value comes from different predecessors via a phi argument.
The optimization moves the `destroy_value` to each predecessor block.
```
bb1:
br bb3(%0)
bb2:
br bb3(%1)
bb3(%3 : @owned T):
... // no deinit-barriers
destroy_value %3 // the only use of %3
```
->
```
bb1:
destroy_value %0
br bb3
bb2:
destroy_value %1
br bb3
bb3:
...
```
This showed up on and off again on the source-compatibility testsuite project hummingbird.
The gist of the problem is that transformations may not rewrite the
type of an inlined instance of a variable without also creating a
deep copy of the inlined function with a different name (and e.g., a
specialization suffix). Otherwise the modified inlined variable will
cause an inconsistency when later compiler passes try to create the
abstract declaration of that inlined function as there would be
conflicting declarations for that variable.
Since SILDebugScope isn't yet available in the SwiftCompilerSources,
this fix just drops these variables, but it would be absolutely
possible to preserve them by using the same mechanism that SILCloner
uses to create a deep copy of the inlined function scopes.
rdar://163167975
* remove `filterUsers(ofType:)`, because it's a duplication of `users(ofType:)`
* rename `filterUses(ofType:)` -> `filter(usersOfType:)`
* rename `ignoreUses(ofType:)` -> `ignore(usersOfType:)`
* rename `getSingleUser` -> `singleUser`
* implement `singleUse` with `Sequence.singleElement`
* implement `ignoreDebugUses` with `ignore(usersOfType:)`
This is a follow-up of https://github.com/swiftlang/swift/pull/83728/commits/eb1d5f484c9f4dae73a3779191bfdf917fd07a49.
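For readers unfamiliar with the "single element" idea behind `singleUse`, here is a minimal sketch of such a helper. The name `onlyElement`, its implementation, and the `ToyUse` type are assumptions for illustration; they are not the definitions in SwiftCompilerSources.

```swift
// Hypothetical stand-in for a `Sequence.singleElement`-style helper;
// the name `onlyElement` and this implementation are assumptions.
extension Sequence {
  /// The element of the sequence if it contains exactly one, otherwise nil.
  var onlyElement: Element? {
    var iterator = makeIterator()
    guard let first = iterator.next() else { return nil }  // empty sequence
    guard iterator.next() == nil else { return nil }       // more than one element
    return first
  }
}

// Toy use list, invented for this sketch: find the single non-debug user.
struct ToyUse { let user: String }
let uses = [ToyUse(user: "destroy_value"), ToyUse(user: "debug_value")]
let nonDebugUses = uses.lazy.filter { $0.user != "debug_value" }
let single = nonDebugUses.onlyElement?.user
```

Building the property on a generic `Sequence` helper means it composes with any filtered view of the uses, which is the same shape as combining `ignore(usersOfType:)` with a single-element query.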