The new implementation has several benefits compared to the old C++ implementation:
* It is significantly simpler. It optimizes each load separately instead of all at once with bit-field based dataflow.
* It's using alias analysis more accurately which enables more loads to be optimized
* It avoids inserting additional copies in OSSA
The algorithm is a data flow analysis which starts at the original load and searches for preceding stores or loads by following the control flow in backward direction.
The preceding stores and loads provide the "available values" with which the original load can be replaced.
The old C++ pass didn't catch a few cases.
Also:
* The new pass is significantly simpler: it doesn't perform dataflow for _all_ memory locations at once using bitfields, but handles each store separately. (In both implementations there is a complexity limit in place to avoid quadratic complexity)
* The new pass works with OSSA
It sets the `[bare]` attribute for `alloc_ref` and `global_value` instructions if their header (reference count and metatype) is not used throughout the lifetime of the object.
The `bare` attribute indicates that the object header is not used throughout the lifetime of the value.
This means, no reference counting operations are performed on the object and its metadata is not used.
The header of bare objects doesn't need to be initialized.
The `bare` attribute indicates that the object header is not used throughout the lifetime of the object.
This means, no reference counting operations are performed on the object and its metadata is not used.
The header of bare objects doesn't need to be initialized.
* add the StaticInitCloner utility
* remove bridging of `copyStaticInitializer` and `createStaticInitializer`
* add `Context.mangleOutlinedVariable` and `Context.createGlobalVariable`
* move the apply of partial_apply transformation from simplify-apply to simplify-partial_apply
* delete dead partial_apply instructions
* devirtualize apply, try_apply and begin_apply
This allows to run the NamedReturnValueOptimization only late in the pipeline.
The optimization shouldn't be done before serialization, because it might prevent predictable memory optimizations in the caller after inlining.
It converts a lazily initialized global to a statically initialized global variable.
When this pass runs on a global initializer `[global_init_once_fn]` it tries to create a static initializer for the initialized global.
```
sil [global_init_once_fn] @globalinit {
alloc_global @the_global
%a = global_addr @the_global
%i = some_const_initializer_insts
store %i to %a
}
```
The pass creates a static initializer for the global:
```
sil_global @the_global = {
%initval = some_const_initializer_insts
}
```
and removes the allocation and store instructions from the initializer function:
```
sil [global_init_once_fn] @globalinit {
%a = global_addr @the_global
%i = some_const_initializer_insts
}
```
The initializer then becomes a side-effect free function which let's the builtin-simplification remove the `builtin "once"` which calls the initializer.
If a `debug_step` has the same debug location as a previous or succeeding instruction it is removed.
It's just important that there is at least one instruction for a certain debug location so that single stepping on that location will work.
* for testing: add the option `-simplify-instruction=<instruction-name>` to only run simplification passes for that instruction type
* on the swift side, add `Options.enableSimplification`
* split the `PassContext` into multiple protocols and structs: `Context`, `MutatingContext`, `FunctionPassContext` and `SimplifyContext`
* change how instruction passes work: implement the `simplify` function in conformance to `SILCombineSimplifyable`
* add a mechanism to add a callback for inserted instructions
Replace the generic `List` with the (non-generic) `InstructionList` and `BasicBlockList`.
The `InstructionList` is now a bit different than the `BasicBlockList` because it supports that instructions are deleted while iterating over the list.
Also add a test pass which tests instruction modification while iteration.
The pass to decide which functions should get stack protection was added in https://github.com/apple/swift/pull/60933, but was disabled by default.
This PR enables stack protection by default, but not the possibility to move arguments into temporaries - to keep the risk low.
Moving to temporaries can be enabled with the new frontend option `-enable-move-inout-stack-protector`.
rdar://93677524
This invalidation kind is used when a compute-effects pass changes function effects.
Also, let optimization passes which don't change effects only invalidate the `FunctionBody` and not `Everything`.
Added new C++-to-Swift callback for isDeinitBarrier.
And pass it CalleeAnalysis so it can depend on function effects. For
now, the argument is ignored. And, all callers just pass nullptr.
Promoted to API the mayAccessPointer component predicate of
isDeinitBarrier which needs to remain in C++. That predicate will also
depends on function effects. For that reason, it too is now passed a
BasicCalleeAnalysis and is moved into SILOptimizer.
Also, added more conservative versions of isDeinitBarrier and
maySynchronize which will never consider side-effects.
Computes the side effects for a function, which consists of argument- and global effects.
This is similar to the ComputeEscapeEffects pass, just for side-effects.