Do this even if the function then contains references to other functions with wrong linkage.
MandatoryPerformanceOptimization fixes the linkage afterwards.
This is similar to what we already do with de-virtualizing class and witness methods: https://github.com/swiftlang/swift/pull/76032
rdar://168171608
Wihtout this change an alloc_stack instruction that is defined in a different
basic block than its use could result in instructions that are not dominated by
its operands. In an asserts build this is caught by the SIL verifier, but in a
non-asserts build it can crash the compiler.
Thanks to Erik Eckstein for the actual implementation of the fix!
rdar://168622625
As RedundantLoadElimination processing the loads in reverse control flow order, the replaced loads might accumulate quite a lot of users.
This happens if the are many loads from the same location in a row.
To void quadratic complexity in `uses.replaceAll`, we swap both load instructions and move the uses from the `existingLoad` (which usually has a small number of uses) to this load - and delete the `existingLoad`.
This came up in the Sema/large_int_array.swift.gyb test, which tests a 64k large Int array.
With this fix, the compile time gets down from 3 minutes to 5 seconds.
I also changed the test:
* run the compiler with the timeout script to detect build time regressions
* re-enabled the SIL verifier because the problem there is already fixed (https://github.com/swiftlang/swift/pull/86781)
* run the compiler with -O to also test the whole optimizer pipeline and not only the mandatory pipeline
* assigned the array to a real variable (instead of `_`) to not let the optimizer remove the whole array too early
* run the compiler with and without `-parse-as-library` because this makes a huge difference how the global array is being generated
If a value is "copied" to the stack location via a `load` and `store` instruction pair and the source location is written or de-allocated between both instructions,
the optimization generated wrong SIL.
The fix is to make sure that writes to the source locations are always checked between the `load` and `store`.
rdar://168595700
This new OSSA invariant simplifies many optimizations because they don't have to take care of the corner case of incomplete lifetimes in dead-end blocks.
The implementation basically consists of these changes:
* add the lifetime completion utility
* add a flag in SILFunction which tells optimization that they need to run the lifetime completion utility
* let all optimizations complete lifetimes if necessary
* enable the ownership verifier to check complete lifetimes
These two new invariants eliminate corner cases which caused bugs if optimization didn't handle them.
Also, it will significantly simplify lifetime completion.
The implementation basically consists of these changes:
* add a flag in SILFunction which tells optimization if they need to take care of infinite loops
* add a utility to break infinite loops
* let all optimizations remove unreachable blocks and break infinite loops if necessary
* add verification to check the new SIL invariants
The new `breakIfniniteLoops` utility breaks infinite loops in the control flow by inserting an "artificial" loop exit to a new dead-end block with an `unreachable`.
It inserts a `cond_br` with a `builtin "infinite_loop_true_condition"`:
```
bb0:
br bb1
bb1:
br bb1 // back-end branch
```
->
```
bb0:
br bb1
bb1:
%1 = builtin "infinite_loop_true_condition"() // always true, but the compiler doesn't know
cond_br %1, bb2, bb3
bb2: // new back-end block
br bb1
bb3: // new dead-end block
unreachable
```
Make sure to create `destroy_value` with correct dead_end flags.
Especially, in a dead-end region where the original value's lifetime was ended with a `destroy_value [dead_end]`, the optimization must not re-create a destroy without the dead_end flag.
^ Conflicts:
^ test/SILOptimizer/mandatory-destroy-hoisting.sil
Make sure to create `destroy_value` with correct dead_end flags.
Especially, in a dead-end region where the original value's lifetime was ended with a `destroy_value [dead_end]`, the optimization must not re-create a destroy without the dead_end flag.
^ Conflicts:
^ test/SILOptimizer/destroy-hoisting.sil
This reverts commit b1eb70bf45.
The original PR (#85244) was reverted (#85836) due to rdar://165667449
This version fixes the aforementioned issue, and (potentially) improves overall
debug info retention by salvaging info in erase(instructionIncludingDebugUses:),
rather than inserting many explicit salvageDebugInfo calls throughout the code
to make the test cases pass.
Bridging: Make salvageDebugInfo a method of MutatingContext
This feels more consistent than making it a method of Instruction.
DebugInfo: Salvage from trivially dead instructions
This handles the case we were checking in constantFoldBuiltin, so we do not need to salvage debug info there any more.
Usually `load [take]` is not hoisted anyway, because there must be another instruction in the loop which re-initializes the loaded memory location.
However, this is not necessarily true for alloc_stack locations with dynamic lifetime, e.g. a pack-loop which is specialized for a single pack element (at runtime the loop is only executed once).
Fixes a SIL ownership verifier crash
rdar://167504916
The conformances `Type: Comparable` and `EnumCase: Equatable` were
previously introduced in #85757 for implementing AutoDiff closure
specialization logic. This patch moves the latter directly to the SIL
module. The conformance `Type: Comparable` is deleted, and `sort(by:)`
is used instead of `sort` to mimic the same behavior in tests. These
changes address warnings mentioned in the comment:
https://github.com/swiftlang/swift/pull/85757#issuecomment-3681636278
Specifically, improved debug info retention in:
* tryReplaceRedundantInstructionPair,
* splitAggregateLoad,
* TempLValueElimination,
* Mem2Reg,
* ConstantFolding.
The changes to Mem2Reg allow debug info to be retained in the case tested by
self-nostorage.swift in -O builds, so we have just enabled -O in that file
instead of writing a new test for it.
We attempted to add a case to salvageDebugInfo for unchecked_enum_data, but it
caused crashes in Linux CI that we were not able to reproduce.
Refactor certain functions to make them simpler. and avoid calling
AST.Type.loweredType, which can fail. Instead, access the types of the
function's (SIL) arguments directly.
Correctly handle exploding packs that contain generic or opaque types by using
AST.Type.mapOutOfEnvironment().
@substituted types cause the shouldExplode predicate to be unreliable for AST
types, so restrict it to just SIL.Type. Add test cases for functions that have
@substituted types.
Re-enable PackSpecialization in FunctionPass pipeline.
Add a check to avoid emitting a destructure_tuple of the original function's
return tuple when it is void/().
This is needed in Embedded Swift because the `witness_method` convention requires passing the witness table to the callee.
However, the witness table is not necessarily available.
A witness table is only generated if an existential value of a protocol is created.
This is a rare situation because only witness thunks have `witness_method` convention and those thunks are created as "transparent" functions, which means they are always inlined (after de-virtualization of a witness method call).
However, inlining - even of transparent functions - can fail for some reasons.
This change adds a new EmbeddedWitnessCallSpecialization pass:
If a function with `witness_method` convention is directly called, the function is specialized by changing the convention to `method` and the call is replaced by a call to the specialized function:
```
%1 = function_ref @callee : $@convention(witness_method: P) (@guaranteed C) -> ()
%2 = apply %1(%0) : $@convention(witness_method: P) (@guaranteed C) -> ()
...
sil [ossa] @callee : $@convention(witness_method: P) (@guaranteed C) -> () {
...
}
```
->
```
%1 = function_ref @$e6calleeTfr9 : $@convention(method) (@guaranteed C) -> ()
%2 = apply %1(%0) : $@convention(method) (@guaranteed C) -> ()
...
// specialized callee
sil shared [ossa] @$e6calleeTfr9 : $@convention(method) (@guaranteed C) -> () {
...
}
```
Fixes a compiler crash
rdar://165184147
It eliminates dead access scopes if they are not conflicting with other scopes.
Removes:
```
%2 = begin_access [modify] [dynamic] %1
... // no uses of %2
end_access %2
```
However, dead _conflicting_ access scopes are not removed.
If a conflicting scope becomes dead because an optimization e.g. removed a load, it is still important to get an access violation at runtime.
Even a propagated value of a redundant load from a conflicting scope is undefined.
```
%2 = begin_access [modify] [dynamic] %1
store %x to %2
%3 = begin_access [read] [dynamic] %1 // conflicting with %2!
%y = load %3
end_access %3
end_access %2
use(%y)
```
After redundant-load-elimination:
```
%2 = begin_access [modify] [dynamic] %1
store %x to %2
%3 = begin_access [read] [dynamic] %1 // now dead, but still conflicting with %2
end_access %3
end_access %2
use(%x) // propagated from the store, but undefined here!
```
In this case the scope `%3` is not removed because it's important to get an access violation error at runtime before the undefined value `%x` is used.
This pass considers potential conflicting access scopes in called functions.
But it does not consider potential conflicting access in callers (because it can't!).
However, optimizations, like redundant-load-elimination, can only do such transformations if the outer access scope is within the function, e.g.
```
bb0(%0 : $*T): // an inout from a conflicting scope in the caller
store %x to %0
%3 = begin_access [read] [dynamic] %1
%y = load %3 // cannot be propagated because it cannot be proved that %1 is the same address as %0
end_access %3
```
All those checks are only done for dynamic access scopes, because they matter for runtime exclusivity checking.
Dead static scopes are removed unconditionally.
This is wrong for hoisted load instructions because we don't check for aliasing in the pre-header.
And for side-effect-free instructions it's not really necessary, because that can cleanup CSE afterwards.
Fixes a miscompile
rdar://164034503
It hoists `destroy_value` instructions for non-lexical values.
```
%1 = some_ownedValue
...
last_use(%1)
... // other instructions
destroy_value %1
```
->
```
%1 = some_ownedValue
...
last_use(%1)
destroy_value %1 // <- moved after the last use
... // other instructions
```
In contrast to non-mandatory optimization passes, this is the only pass which hoists destroys over deinit-barriers.
This ensures consistent behavior in -Onone and optimized builds.