These are always safe in OSSA since all we are doing here is hoisting the
ref_to_raw_pointer up the def-use chain, without deleting any instructions
unless we know they have no uses (in the strict sense, so a destroy_value
counts as a use). E.g.:
```
%0 = ...
%1 = unchecked_ref_cast %0
%2 = ref_to_raw_pointer %1
```
->
```
%0 = ...
%1 = unchecked_ref_cast %0
%2 = ref_to_raw_pointer %0
```
Notice how we are not changing %1 at all; we are just moving an instantaneous
use earlier. One important thing to realize is that this /does/ require placing
the ref_to_raw_pointer immediately after %0's definition, since %0's lifetime
ends at the unchecked_ref_cast if the value is owned.
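For instance, with an owned value the hoisted form must look like this (a
minimal sketch; the exact instructions and ownership annotations are
illustrative):
```
%0 = ...                    // @owned
%2 = ref_to_raw_pointer %0  // hoisted to just after %0's definition
%1 = unchecked_ref_cast %0  // consumes %0, so no use of %0 may follow it
```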
NOTE: I also identified the tests in sil_combine.sil that relate to these
simplifications and extracted them into sil_combine_casts.sil, with the OSSA
and non-OSSA tests side by side. I am trying to clean up the SILCombine tests
as I update things, so when I find opportunities to move tests into a more
descriptive sub-file, I will do so.
As an aside, to make it easier to transition SILCombine away from using a
central builder, I added a withBuilder method that creates a new SILBuilder at
a requested insertion point and uses the same context as SILCombine's main
builder. Through the use of auto, it also makes for really concise code.
Today, doing this with just the builder, we would write:
```
SILBuilderWithScope builder(insertPt, Builder);
builder.createInst1(insertPt->getLoc(), ...);
builder.createInst2(insertPt->getLoc(), ...);
builder.createInst3(insertPt->getLoc(), ...);
auto *finalValue = builder.createInst4(insertPt->getLoc(), ...);
```
That's a lot of typing and wastes a really commonly used temporary name
(builder) in the local scope! Instead, using this API, one can write:
```
auto *finalValue = withBuilder(insertPt, [&](auto &b, auto l) {
  b.createInst1(l, ...);
  b.createInst2(l, ...);
  b.createInst3(l, ...);
  return b.createInst4(l, ...);
});
```
There is significantly less to type, and auto handles the types for us. The
withBuilder construct is purely syntactic sugar, since we always inline it.
The one optimization we perform here is promoting a fix_lifetime on a loadable
alloc_stack address to a fix_lifetime on the object itself, by loading the
underlying value and applying the fix_lifetime to it.
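A minimal sketch of the pattern (names and types are illustrative; in OSSA the
load would additionally be ownership qualified):
```
%addr = alloc_stack $Klass
...
fix_lifetime %addr : $*Klass
```
->
```
%addr = alloc_stack $Klass
...
%v = load %addr : $*Klass
fix_lifetime %v : $Klass
```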
This is a generic API that, when ownership is enabled, allows one to replace
all uses of a value with a value of differing ownership, transforming/lifetime
extending as appropriate.
This API supports all pairings of ownership /except/ replacing a value that has
OwnershipKind::None with a value that does not have OwnershipKind::None. That
is a more complex optimization that we do not support today. As a result, our
state struct includes a helper routine that callers can use to check whether
the two values they want to process can be handled by the algorithm.
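As an example of one supported pairing (a minimal sketch; the exact
compensation code depends on the ownership kinds involved), replacing an owned
value with a guaranteed value copies the replacement so the copy can take over
the owned uses:
```
// Replace all uses of @owned %1 with @guaranteed %2:
%3 = copy_value %2   // inserted owned copy
// %1's uses (including its destroy_values) are rewritten to use %3.
```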
My motivation is to use this to update InstSimplify and SILCombiner in a less
bug-prone way, rather than just turning things off.
Since this transformation inserts ownership instructions, I have made sure to
test this API in two ways:
1. With the Mandatory Combiner alone (to make sure it works, period).
2. With the Mandatory Combiner + Semantic ARC Opts, to make sure that we can
eliminate the extra ownership instructions it inserts.
As one can see from the tests, the optimizer today is able to handle all of
these transforms except one conditional case where I need to eliminate a dead
phi argument. I have a separate branch that handles that case today, but it
exposed unsafe behavior in ClosureLifetimeFixup that I need to fix before I can
land it. I don't want that to block this PR, since I think the current
low-level ARC optimizer may be able to help here: this is a simple transform
that it performs all the time.
This works around an issue where an apply with an unsubstituted substitution
map causes problems in downstream optimizations.
```
%9 = alloc_stack $@opened("60E354F4-17B9-11EB-9427-ACDE48001122") NonClassProto
copy_addr %8 to [initialization] %9 : $*@opened("60E354F4-17B9-11EB-9427-ACDE48001122") NonClassProto
%11 = witness_method $ConformerClass, #NonClassProto.myVariable!getter : <Self where Self : NonClassProto> (Self) -> () -> SomeValue :
$@convention(witness_method: NonClassProto) <τ_0_0 where τ_0_0 : NonClassProto> (@in_guaranteed τ_0_0) -> SomeValue
apply %11<@opened("60E354F4-17B9-11EB-9427-ACDE48001122") NonClassProto>(%9) : $@convention(witness_method: NonClassProto) <τ_0_0 where τ_0_0 : NonClassProto> (@in_guaranteed τ_0_0) -> SomeValue
```
The problem arises when the devirtualizer replaces
`witness_method $ConformerClass, #NonClassProto.myVariable!getter` with the
underlying implementation. That implementation, for better or worse, is further
constrained to `Self : ConformerClass`, and applying it to an opened
existential that is not class-constrained is a recipe for disaster. The proper
solution would probably be for the devirtualizer to insert the cast if
necessary and update the substitution list.
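A hypothetical sketch of what that could look like for the example above (the
cast instruction and types here are purely illustrative, not the actual
implementation):
```
%cast = unchecked_addr_cast %9 : $*@opened("60E354F4-17B9-11EB-9427-ACDE48001122") NonClassProto to $*ConformerClass
apply %impl<ConformerClass>(%cast) : ...
```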
That fix will be left for another day though.
rdar://70582785
This caused a problem when propagating the concrete type of an existential: if the concrete type is itself an opened existential, it was not added to the OpenedArchetypeTracker.
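A minimal sketch of the shape involved (identifiers and the opened-archetype
name are illustrative):
```
%o = open_existential_addr immutable_access %src : $*P to $*@opened("A") P
%i = init_existential_addr %dst : $*P, $@opened("A") P
copy_addr %o to [initialization] %i : $*@opened("A") P
// The concrete type propagated for %dst is itself the opened archetype
// @opened("A") P, so it must be registered with the OpenedArchetypeTracker.
```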
https://bugs.swift.org/browse/SR-13444
rdar://problem/68077098
Optimize the unconditional_checked_cast_addr in this pattern:
%box = alloc_existential_box $Error, $ConcreteError
%a = project_existential_box $ConcreteError in %box : $Error
store %value to %a : $*ConcreteError
%err = alloc_stack $Error
store %box to %err : $*Error
%dest = alloc_stack $ConcreteError
unconditional_checked_cast_addr Error in %err : $*Error to ConcreteError in %dest : $*ConcreteError
to:
...
retain_value %value : $ConcreteError
destroy_addr %err : $*Error
store %value to %dest : $*ConcreteError
This lets the alloc_existential_box become dead and it can be removed in following optimizations.
The same optimization is also done for conditional_checked_cast_addr.
There is also an implication for debugging:
Each "throw" in the code calls the runtime function swift_willThrow. The function is used by the debugger to set a breakpoint and also add hooks.
This optimization can completely eliminate a "throw", including the runtime call.
So, with optimized code, the user might not see the program break at a throw, even though the source code is actually throwing.
On the other hand, eliminating the existential box is a significant performance win and we don't guarantee any debugging behavior for optimized code anyway. So I think this is a reasonable trade-off.
I added an option "-Xllvm -keep-will-throw-call" to keep the runtime call, which can be used if someone wants to reliably break on "throw" in optimized builds.
rdar://problem/66055678
This reinstates commit d7d829c059 with a fix for C tail-allocated arrays.
Replace a call of the getter of AnyKeyPath._storedInlineOffset with a "constant" offset, in case of a keypath literal.
"Constant" offset means a series of struct_element_addr and tuple_element_addr instructions with a 0-pointer as base address.
These instructions can then be lowered to "real" constants in IRGen for concrete types, or to metatype offset lookups for generic or resilient types.
Replace:
%kp = keypath ...
%offset = apply %_storedInlineOffset_method(%kp)
with:
%zero = integer_literal $Builtin.Word, 0
%null_ptr = unchecked_trivial_bit_cast %zero to $Builtin.RawPointer
%null_addr = pointer_to_address %null_ptr
%projected_addr = struct_element_addr %null_addr
... // other address projections
%offset_ptr = address_to_pointer %projected_addr
%offset_builtin_int = unchecked_trivial_bit_cast %offset_ptr
%offset_int = struct $Int (%offset_builtin_int)
%offset = enum $Optional<Int>, #Optional.some!enumelt, %offset_int
rdar://problem/53309403
Propagate a value from a static "let" global variable.
This optimization is also done by GlobalOpt, but not with de-serialized globals, which can occur with cross-module optimization.
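A minimal sketch of the transformation (names are illustrative):
```
// static let gg = 27
%a = global_addr @gg : $*Int
%v = load %a : $*Int
```
->
```
%lit = integer_literal $Builtin.Int64, 27
%v = struct $Int (%lit : $Builtin.Int64)
```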
Replaces an alloc_stack of an enum by an alloc_stack of the payload if only one enum case (with payload) is stored to that location.
For example:
%loc = alloc_stack $Optional<T>
%payload = init_enum_data_addr %loc
store %value to %payload
...
%take_addr = unchecked_take_enum_data_addr %loc
%l = load %take_addr
is transformed to
%loc = alloc_stack $T
store %value to %loc
...
%l = load %loc
https://bugs.swift.org/browse/SR-12710
This simplifies the handling of the subdirectories in the SIL and
SILOptimizer paths. Create the individual libraries as object libraries,
which allows the analysis of source changes to be limited in scope.
Because these are object libraries, this has zero overhead compared to the
previous implementation. Moreover, string operations over the filenames
are avoided. The cost for this is that any new sub-library needs to be
added to the list rather than added via the special local function.
The static analyzer flags this as a nullptr dereference, since
FullApplySite::isa can fail if given a non-ApplySite. Of course, the
SILInstruction is an apply; we just created it! This commit helps the static
analyzer by preserving that type information: rather than upcasting our
ApplyInst to SILInstruction, we use FullApplySite's ApplyInst constructor
directly.
Add new pattern in SILCombine to optimize redundant thick to objc metatype conversions
Add pattern to catch the following redundant conversion:
%tmp1 = thick_to_objc_metatype %x
%tmp2 = objc_to_thick_metatype %tmp1
...
%tmp3 = <sil operation> %tmp2
to:
%tmp3 = <sil operation> %x
Similarly add pattern for redundant conversion of objc_to_thick_metatype
followed by thick_to_objc_metatype.
Fixes rdar://62932799
This became necessary after recent function type changes that keep
substituted generic function types abstract even after substitution, in
order to correctly handle automatic opaque result type substitution.
Instead of performing the opaque result type substitution as part of
substituting the generic arguments, the underlying type is now reified when
inspecting the parameter/return types, which happens as part of the function
convention APIs.
rdar://62560867
When performing the substitution of the 'concrete' type that happens to
be an opened archetype, we need to force the substitution to actually
call the conformance remapping function.
rdar://62202282
SR-12571
MSVC does not realize that the switch is exhaustive and requires that
the path be explicitly marked as unreachable. This silences the C4715
warning ("not all control paths return a value").
I am going to use this in the mandatory combiner, and it seems like a
generally useful transformation.
I also updated the routine to construct its own SILBuilder that injects a
user-provided SILBuilderContext, eliminating the bad pattern of passing in
SILBuilders.
This should be an NFC change.
To be precise: don't add an instruction's uses to the worklist if it already has more than 10000 elements.
This avoids quadratic behavior for very large functions.
rdar://problem/56268570
Disable `SILCombiner::visitPartialApplyInst` from rewriting a `partial_apply`
with a `@convention(method)` callee to `thin_to_thick_function`.
This fixes SIL verification errors: `thin_to_thick_function` only supports
`@convention(thin)` operands.
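A sketch of the invalid rewrite (the callee and types are illustrative):
```
%pa = partial_apply %m() : $@convention(method) (@guaranteed C) -> ()
```
was rewritten to:
```
%tt = thin_to_thick_function %m : $@convention(method) (@guaranteed C) -> () to $@callee_guaranteed (@guaranteed C) -> ()
// SIL verification error: the operand is not @convention(thin)
```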
Resolves SR-12548.
A partial_apply of a function_ref whose body consists of just an
apply of a witness_method can be simplified to a partial_apply of
the witness_method itself:
sil @foo:
%fn = witness_method ...
%result = apply %fn(...)
return %result
sil @bar:
%fn = function_ref @foo
%closure = partial_apply %fn(...)
===>
sil @bar:
%fn = witness_method ...
%closure = partial_apply %fn(...)