I implemented specifically the promotion from switch_enum_addr -> switch_enum
for loadable types.
NOTE: I did not add support for the inject_enum_addr based conversion to br since that
optimization deletes an edge from the CFG, potentially invalidating CFG
properties. Since OSSA needs to use DeadEndBlocks, we want to avoid changing
reachability in any way with our SILCombine transforms.
All of the non-SILCombiner specific helpers have already been updated for OSSA,
so this was not too bad.
NOTE: I also added two small combines that delete copy_value, destroy_value with
.none arguments. The reason why I added this is that this is a pretty small
addition and many of the tests of this code rely on SILCombine being able to
eliminate such operations on thin_to_thick_function.
NOTE: I also disabled TypePropagation in OSSA, we are going to redo that code
when we bring up opaque values.
From an ownership perspective, this is just eliminating a non-lifetime ending
use. The new instructions we create are not connecting to the underlying object
anymore.
This reverts commit 0a0f8cffc1.
Currently the memory lifetime verifier (I believe due to the switch_enum) is not
caring that we are not emitting destroy_addr for an enum address along paths
where we know the underlying enum case is trivial.
This transform eliminated that switch_enum_addr causing the memory lifetime
verifier to then trigger.
Disabling to unblock the bots!
NOTE: The stdlib count/capacity propagation code is tested in an end<->end
fashion in a separate Swift test. Once I flip the switch, that test will run.
The code is pretty simple, so I feel relatively confident with it.
The one opt we perform here is that we promote fix_lifetime on loadable
alloc_stack addresses to fix_lifetimes on objects by loading the underlying
value and putting the fix lifetime upon it.
This caused a problem when propagating the concrete type of an existential: if the concrete type is itself an opened existential, it was not added to the OpenedArchetypeTracker.
https://bugs.swift.org/browse/SR-13444
rdar://problem/68077098
Optimize the unconditional_checked_cast_addr in this pattern:
%box = alloc_existential_box $Error, $ConcreteError
%a = project_existential_box $ConcreteError in %b : $Error
store %value to %a : $*ConcreteError
%err = alloc_stack $Error
store %box to %err : $*Error
%dest = alloc_stack $ConcreteError
unconditional_checked_cast_addr Error in %err : $*Error to ConcreteError in %dest : $*ConcreteError
to:
...
retain_value %value : $ConcreteError
destroy_addr %err : $*Error
store %value to %dest $*ConcreteError
This lets the alloc_existential_box become dead and it can be removed in following optimizations.
The same optimization is also done for conditional_checked_cast_addr.
There is also an implication for debugging:
Each "throw" in the code calls the runtime function swift_willThrow. The function is used by the debugger to set a breakpoint and also add hooks.
This optimization can completely eliminate a "throw", including the runtime call.
So, with optimized code, the user might not see the program to break at a throw, whereas in the source code it is actually throwing.
On the other hand, eliminating the existential box is a significant performance win and we don't guarantee any debugging behavior for optimized code anyway. So I think this is a reasonable trade-off.
I added an option "-Xllvm -keep-will-throw-call" to keep the runtime call which can be used if someone want's to reliably break on "throw" in optimized builds.
rdar://problem/66055678
Propagate a value from a static "let" global variable.
This optimization is also done by GlobalOpt, but not with de-serialized globals, which can occur with cross-module optimization.
Replaces an alloc_stack of an enum by an alloc_stack of the payload if only one enum case (with payload) is stored to that location.
For example:
%loc = alloc_stack $Optional<T>
%payload = init_enum_data_addr %loc
store %value to %payload
...
%take_addr = unchecked_take_enum_data_addr %loc
%l = load %take_addr
is transformed to
%loc = alloc_stack $T
store %value to %loc
...
%l = load %loc
https://bugs.swift.org/browse/SR-12710
MSVC does not realize that the switch is exhaustive and requires that
the path is explicitly marked as unreachable. This silences the C4715
warning ("not all control paths return a value").
Instead of bailing on ownership functions in SILCombine::run, we will
bail in individual visitors. This way, as SILCombine is updated we can
paritially support ownership across the pass.
Convert sequences of
%payload_addr = init_enum_data_addr %enum_addr
%elem0_addr = tuple_element_addr %payload_addr, 0
%elem1_addr = tuple_element_addr %payload_addr, 1
...
store %payload0 to %elem0_addr
store %payload1 to %elem1_addr
...
inject_enum_addr %enum_addr, $EnumType.case
to
%tuple = tuple (%payload0, %payload1, ...)
%enum = enum $EnumType, $EnumType.case, %tuple
store %enum to %enum_addr
Such patterns are generated for example when using the stdlib enumarated() function.
Part of rdar://problem/33438123
A "copy_addr [take] %src to [initialization] %alloc_stack" is replaced by a "destroy_addr %src" if the alloc_stack is otherwise dead.
This is okay as long as the "moved" object is kept alive otherwise.
This can break if a retain of %src is moved after the copy_addr. It cannot happen with OSSA.
So as soon as we have OSSA, we can remove the check again.
rdar://problem/59509229
Constant-propagate the 0 value when loading "count" or "capacity" from the empty Array, Set or Dictionary storage.
On high-level SIL this optimization is also done by the ArrayCountPropagation pass, but only for Array.
And even for Array it's sometimes needed to propagate the empty-array count when high-level semantics function are already inlined.
Fixes an optimization deficiency for empty OptionSet literals.
https://bugs.swift.org/browse/SR-12046
rdar://problem/58861171
SIL type lowering erases DynamicSelfType, so we generate
incorrect code when casting to DynamicSelfType. Fixing this
requires a fair amount of plumbing, but most of the
changes are mechanical.
Note that the textual SIL syntax for casts has changed
slightly; the target type is now a formal type without a '$',
not a SIL type.
Also, the unconditional_checked_cast_value and
checked_cast_value_br instructions now take the _source_
formal type as well, just like the *_addr forms they are
intended to replace.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.
New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h
Removed:
- Local.h
- Two conflicting CFG.h files
This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.
Move the control flow utilies out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.
Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.
Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
This is the complement to load canonicalization. Although store
canonicalization is not required before diagnostics, it should be
defined in the same utility.
This is a large patch; I couldn't split it up further while still
keeping things working. There are four things being changed at
once here:
- Places that call SILType::isAddressOnly()/isLoadable() now call
the SILFunction overload and not the SILModule one.
- SILFunction's overloads of getTypeLowering() and getLoweredType()
now pass the function's resilience expansion down, instead of
hardcoding ResilienceExpansion::Minimal.
- Various other places with '// FIXME: Expansion' now use a better
resilience expansion.
- A few tests were updated to reflect SILGen's improved code
generation, and some new tests are added to cover more code paths
that previously were uncovered and only manifested themselves as
standard library build failures while I was working on this change.
Basically the pattern to optimize is:
inject_enum_addr %stackloc, #SomeCase
switch_enum_addr %stackloc ...
This works even if the enum is resilient and the case does not have a payload. As long as we don't have opaque values in SIL we need this peephole to optimize the pattern.
This change fixes the code generation for Float.rounded().
rdar://problem/46353885
We can always eliminate ARC operations on the result of a value_to_bridge_object instruction.
There is no need to restrict the operand to a specific SIL pattern.