For a redundant pair of pointer-address conversions, e.g.
%2 = address_to_pointer %1
%3 = pointer_to_address %2 [strict]
replace all uses of %3 with %1.
It is necessary for opaque values where for casts that will newly start
out as checked_cast_brs and be lowered to checked_cast_addr_brs, since
the latter has the source formal type, IRGen relies on being able to
access it, and there's no way in general to obtain the source formal
type from the source lowered type.
We inline a function (e.g. a struct initializer) into a global init function if the result is part of the initialized global.
Now, also handle functions with indirect return values. Such function can result from not-reabstracted generic specializations.
Handle cases where the result is stored into a temporary alloc_stack or directly stored to (a part) of the global variable.
Sometimes structs are not stored in one piece, but as individual elements. Merge such individual stores to a single store of the whole struct.
This enables generating a statically initialized global.
The new implementation has several benefits compared to the old C++ implementation:
* It is significantly simpler. It optimizes each load separately instead of all at once with bit-field based dataflow.
* It's using alias analysis more accurately which enables more loads to be optimized
* It avoids inserting additional copies in OSSA
The algorithm is a data flow analysis which starts at the original load and searches for preceding stores or loads by following the control flow in backward direction.
The preceding stores and loads provide the "available values" with which the original load can be replaced.
To make it available in other optimizations as well.
Also, a few problems:
* Use destructre instructions when in OSSA
* Don't split the store if it's nominal type has unreferenceable stoarge
* rename it to `trySplit` because it's not guaranteed to work
Also, add the counterpart for load instructions: `LoadInst.trySplit()`
A begin_apply can yield multiple addresses. We need to store the result of the apply in order to distinguish between two AccessBases with different results from the same begin_apply.
`ownership` is a bad name in `LoadInst`, because it hides `Value.ownership`.
Therefore rename it to `loadOwnership`.
Do the same for ownership in StoreInst to be consistent.
Before this change, if a global variable is required to be statically initialized (e.g. due to @_section attribute), we don't allow its type to be a struct, only a scalar type works. This change improves on that by teaching MandatoryPerformanceOptimizations pass to inline struct initializer calls into initializer of globals, as long as they are simple enough so that we can be sure that we don't trigger recursive/infinite inlining.
The old C++ pass didn't catch a few cases.
Also:
* The new pass is significantly simpler: it doesn't perform dataflow for _all_ memory locations at once using bitfields, but handles each store separately. (In both implementations there is a complexity limit in place to avoid quadratic complexity)
* The new pass works with OSSA
For `alloc_ref [bare] [stack]` and `global_value [bare]` omit the object header initialization.
The `bare` flag means that the object header is not used.
This was already done with a peephole optimization inside IRGen for `global_value`. But now rely on the SIL `bare` flag.
It sets the `[bare]` attribute for `alloc_ref` and `global_value` instructions if their header (reference count and metatype) is not used throughout the lifetime of the object.
The `bare` attribute indicates that the object header is not used throughout the lifetime of the value.
This means, no reference counting operations are performed on the object and its metadata is not used.
The header of bare objects doesn't need to be initialized.
The `bare` attribute indicates that the object header is not used throughout the lifetime of the object.
This means, no reference counting operations are performed on the object and its metadata is not used.
The header of bare objects doesn't need to be initialized.
Look through `upcast` and `init_existential_ref` instructions and replace the operand of this cast instruction with the original value.
For example:
```
%2 = upcast %1 : $Derived to $Base
%3 = init_existential_ref %2 : $Base : $Base, $AnyObject
checked_cast_br %3 : $AnyObject to Derived, bb1, bb2
```
This makes it more likely that the cast can be constant folded because the source operand's type is more accurate.
In the example above, the cast reduces to
```
checked_cast_br %1 : $Derived to Derived, bb1, bb2
```
which can be trivially folded to always-succeeds.
Found while looking at `_SwiftDeferredNSDictionary.bridgeValues()`
Generic specialization drops metatype arguments and therefore exposes opportunities to remove dead metatype instructions.
Instead of removing dead metatype instructions before specialization, try to remove them after specialization.
Instead of using `if` in case of checking if `index < end` in `next` function of Stack. We can use `guard` statement to make it more readable and concise.
Optimize the sequence
```
%1 = init_enum_data_addr %enum_addr, #someCaseWithPayload
store %payload to %1
inject_enum_addr %enum_addr, #someCaseWithPayload
```
to
```
%1 = enum $E, #someCaseWithPayload, %payload
store %1 to %enum_addr
```
This sequence of three instructions must appear in consecutive order.
But usually this is the case, because it's generated this way by SILGen.
This comes up in the code for constructing an empty string literal.
With this optimization it's possible to statically initialize empty string global variables.