* add `GlobalVariable.staticInitializerInstructions` to access all initializer instructions of a global
* implement `GlobalVariable.staticInitValue` with `GlobalVariable.staticInitializerInstructions`
* this requires that `InstructionList.reversed()` works without accessing the parent block of the iterator instruction
* allow `Context.erase(instruction:)` to delete instructions from a global's initializer list, which means handling the case where a deleted instruction has no parent function.
APIs on ForwardingInstruction had to be written as static functions taking
a SILInstruction as a parameter, which made them awkward to use.
Introduce a ForwardingOperation wrapper type and move the apis from the
old "mixin" class to the wrapper type.
Add a new API, getForwardedOperands().
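A minimal sketch of the wrapper idea described above, not the actual compiler code. `Instruction` and `ForwardingOp` are hypothetical stand-ins for SILInstruction and ForwardingOperation, and `forwardedOperands()` is a simplified analogue that treats every operand as forwarded:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for SILInstruction; not the real class.
struct Instruction {
  std::string opcode;
  std::vector<int> operands; // operand value ids
  bool forwardsOwnership;
};

// Wrapper in the spirit of ForwardingOperation: it holds the instruction,
// so queries become member functions instead of static helpers that take
// the instruction as a parameter.
class ForwardingOp {
  const Instruction *inst = nullptr;

public:
  explicit ForwardingOp(const Instruction *i) {
    if (i && i->forwardsOwnership)
      inst = i;
  }

  // False when the wrapped instruction is not a forwarding instruction.
  explicit operator bool() const { return inst != nullptr; }

  // Simplified assumption: every operand counts as forwarded.
  const std::vector<int> &forwardedOperands() const {
    assert(inst && "not a forwarding instruction");
    return inst->operands;
  }
};
```

Callers first test the wrapper for validity and then query it, which keeps the "is this a forwarding instruction?" check in one place.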
Previously, the utility bailed out on lexical lifetimes because it
didn't respect deinit barriers. Here, deinit barriers are found and
added to liveness if the value is lexical. This enables copies to be
propagated without hoisting destroys over deinit barriers.
rdar://104630103
* for testing: add the option `-simplify-instruction=<instruction-name>` to only run simplification passes for that instruction type
* on the swift side, add `Options.enableSimplification`
* split the `PassContext` into multiple protocols and structs: `Context`, `MutatingContext`, `FunctionPassContext` and `SimplifyContext`
* change how instruction passes work: implement the `simplify` function in conformance to `SILCombineSimplifyable`
* add a mechanism to add a callback for inserted instructions
To improve the debugging experience of values whose lifetimes are
canonicalized without compromising the semantics expressed in the source
language, when canonicalizing OSSA lifetimes at Onone, lengthen
lifetimes as much as possible without incurring copies that would be
eliminated at O.
rdar://99618502
Andy some time ago already created the new API but didn't go through and update
the old occurrences. I did that in this PR and then deprecated the old API. The
tree is clean, so I could just remove it, but I decided to be nicer to
downstream people by deprecating it first.
* Add the possibility to bisect the individual transforms of SILCombine and SimplifyCFG.
To do so, the `-sil-opt-pass-count` option now accepts the format `<n>.<m>`, where `m` is the sub-pass number.
The sub-pass number limits the number of individual transforms in SILCombine or SimplifyCFG.
* Add an option `-sil-print-last` to print the SIL of the currently optimized function before and after the last pass, which is specified with `-sil-opt-pass-count`.
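As an illustration of the `<n>.<m>` format, here is a small sketch of how such a value could be split into a pass count and an optional sub-pass count. `PassCount`, `parseUnsigned` and `parsePassCount` are hypothetical names, and this is not the parser the compiler actually uses:

```cpp
#include <string>

// Hypothetical parsed form of a `-sil-opt-pass-count` value.
struct PassCount {
  unsigned passes = 0;       // n
  bool hasSubPasses = false; // true if ".m" was given
  unsigned subPasses = 0;    // m
};

// Parses a non-empty run of decimal digits (at most 9, to avoid overflow).
static bool parseUnsigned(const std::string &s, unsigned &out) {
  if (s.empty() || s.size() > 9)
    return false;
  unsigned v = 0;
  for (char c : s) {
    if (c < '0' || c > '9')
      return false;
    v = v * 10 + static_cast<unsigned>(c - '0');
  }
  out = v;
  return true;
}

// Accepts "n" or "n.m"; returns false on malformed input.
bool parsePassCount(const std::string &arg, PassCount &out) {
  PassCount result;
  std::string::size_type dot = arg.find('.');
  if (!parseUnsigned(arg.substr(0, dot), result.passes))
    return false;
  if (dot != std::string::npos) {
    if (!parseUnsigned(arg.substr(dot + 1), result.subPasses))
      return false;
    result.hasSubPasses = true;
  }
  out = result;
  return true;
}
```

In this sketch a value like `12.3` would stop after pass 12 and limit that pass to 3 individual transforms, while a plain `12` leaves the sub-pass limit unset.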
And a few other small related changes:
* remove libswiftPassInvocation from SILInstructionWorklist (because it's not needed)
* replace start/finishPassRun with start/finishFunction/InstructionPassRun
NFC
Required before fixing/re-enabling OSSA RAUW utilities.
Make sure the SILCombine worklist canonicalizes all the copies and
guarantees termination.
Run canonicalization on every existing copy_value once
and once for every new copy_value added during SILCombine.
Only add copies and their uses back to the worklist if
canonicalization deleted an instruction.
Add tracing for sinking forwarding instructions.
And fix the way it handles borrow scopes so we can enable borrow
scope rewriting. Make sure SILCombine only does canonicalization that
operates on a self-contained single-value-lifetime. It's important to
limit SILCombine to transformations where each individual step
converges quickly to a more canonical form. Rewriting borrow scopes
requires the copy propagation pass to coordinate all the individual
transformations.
Make canonicalizeLifetimes a SILCombine utility. This moves complexity
out of the main loop. SILCombine knows which values it wants to
canonicalize and can directly call either canonicalizeValueLifetime or
canonicalizeFunctionArgument for each one.
Respect the -enable/disable-copy-propagation options.
Instruction passes are basically visit functions in SILCombine for a specific instruction type.
With the macro SWIFT_INSTRUCTION_PASS such a pass can be declared in Passes.def.
SILCombine then calls the run function of the pass in libswift.
Track in-use iterators and update them both when instructions are
deleted and when they are added.
Safe iteration in the presence of arbitrary changes now looks like
this:
    for (SILInstruction *inst : deleter.updatingRange(&bb)) {
      modify(inst);
    }
It's not needed anymore with delayed instruction deletion.
It was used for two purposes:
1. For analyses which cache instructions, to avoid dangling instruction pointers
2. For passes which maintain worklists of instructions, to remove deleted instructions from the worklist. This is now done by checking SILInstruction::isDeleted().
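The second point can be sketched as follows. This is only an illustration under the assumption that deleted instructions stay allocated until the end of the pass; `Inst` and `Worklist` are hypothetical stand-ins, not the real SIL classes:

```cpp
#include <vector>

// Hypothetical instruction with a delayed-deletion flag. With delayed
// deletion the object stays valid after being marked deleted.
struct Inst {
  int id;
  bool deleted;
  bool isDeleted() const { return deleted; }
};

class Worklist {
  std::vector<Inst *> items;

public:
  void push(Inst *i) { items.push_back(i); }

  // Pops entries until a live instruction is found. Deleted entries are
  // skipped here instead of being removed eagerly via a delete callback.
  Inst *popLive() {
    while (!items.empty()) {
      Inst *i = items.back();
      items.pop_back();
      if (!i->isDeleted())
        return i;
    }
    return nullptr;
  }
};
```

The worklist never needs to be notified about deletions; stale entries are filtered out lazily when they are popped.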
Instead of caching alias results globally for the module, make AliasAnalysis a FunctionAnalysisBase which caches the alias results per function.
Why?
* So far the result caches could only grow. They were reset when they reached a certain size. This was not ideal. Now, they are invalidated whenever the function changes.
* It was not possible to actually invalidate an alias analysis result. This is required, for example in TempRValueOpt and TempLValueOpt (so far it was done manually with invalidateInstruction).
* Type based alias analysis results were also cached for the whole module, while it is actually dependent on the function, because it depends on the function's resilience expansion. This was a potential bug.
I also added a new PassManager API to directly get a function-based analysis:
getAnalysis(SILFunction *f)
The second change of this commit is the removal of the instruction-index indirection for the cache keys. Now the cache keys directly work on instruction pointers instead of instruction indices. This reduces the number of hash table lookups for a cache lookup from 3 to 1.
This indirection was needed to avoid dangling instruction pointers in the cache keys. But this is not needed anymore, because of the new delayed instruction deletion mechanism.
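To make the caching scheme concrete, here is a rough sketch of a per-function cache keyed directly by instruction pointers. All names (`PerFunctionAliasCache`, `record`, `lookup`, `invalidate`) are hypothetical, and the real AliasAnalysis stores different keys and results:

```cpp
#include <functional>
#include <map>
#include <utility>

struct Inst {};     // hypothetical stand-in for SILInstruction
struct Function {}; // hypothetical stand-in for SILFunction

enum class AliasResult { NoAlias, MayAlias };

class PerFunctionAliasCache {
  using Key = std::pair<const Inst *, const Inst *>;

  // Orders pointer pairs with std::less, which is a total order on pointers.
  struct KeyLess {
    bool operator()(const Key &l, const Key &r) const {
      std::less<const Inst *> lt;
      if (lt(l.first, r.first)) return true;
      if (lt(r.first, l.first)) return false;
      return lt(l.second, r.second);
    }
  };

  std::map<const Function *, std::map<Key, AliasResult, KeyLess>> caches;

public:
  void record(const Function *f, const Inst *a, const Inst *b, AliasResult r) {
    caches[f][Key(a, b)] = r;
  }

  // One map lookup per function cache; no instruction-index indirection.
  bool lookup(const Function *f, const Inst *a, const Inst *b,
              AliasResult &out) const {
    auto fi = caches.find(f);
    if (fi == caches.end()) return false;
    auto it = fi->second.find(Key(a, b));
    if (it == fi->second.end()) return false;
    out = it->second;
    return true;
  }

  // Called when a function changes: only that function's results are dropped.
  void invalidate(const Function *f) { caches.erase(f); }
};
```

Keying on raw pointers is only safe in this sketch because of the assumption stated above: deleted instructions are not freed while cached results may still refer to them.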
To be more explicit, canonicalizeOSSALifetimes is a utility that
re-canonicalizes all at once a set of defs that the caller found by applying
CanonicalizeOSSALifetime::getCanonicalCopiedDef(copy). The reason why I am
doing this is that when we RAUW in OSSA, we sometimes insert additional copies
to make the problem easier for a utility to handle. This lets us canonicalize
away any copies before we even leave the pass.
... with a fix for a non-assert build crash: I used the wrong ilist type for SlabList. This does not explain the crash, though. What I think happened here is that llvm miscompiled and put the llvm_unreachable from the Slab's deleteNode function unconditionally into the SILModule destructor.
Now by using simple_ilist, there is no need for a deleteNode at all.
There are a bunch of optimizations in SILCombine where we try to fold an
ownership forwarding instruction A into another ownership forwarding instruction
B without deleting A. Consider the upcasts in the example below:
```
%0 = upcast %x : $X->Y
%1 = upcast %0 : $Y->Z
```
These sorts of optimizations fold the first instruction into the second like so:
```
%0 = upcast %x : $X->Y
%1 = upcast %x : $X->Z
```
This creates a problem when we are dealing with owned values since we have just
introduced two consumes for %x. To work around this, we have two options:
1. Introduce extra copies.
2. We recognize the situations where we can guarantee that we can delete the
first upcast.
The first choice I believe is not a choice since breaking a forwarding chain of
ownership in favor of extra copies is a less canonical form. That leaves us with
the second form. What are the necessary/sufficient conditions for deleting the
first upcast? Simply, the upcast cannot have any non-debug,
non-consuming uses! In such a case, we know that along all paths through the
program the value has exactly one non-debug use, one of its consuming uses. If,
when optimizing upcasts, we could recognize that pattern, we could duplicate the
inst along paths not through our 2nd upcast and thus delete the original upcast,
fixing the ownership error!
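The condition itself is simple to state in code. The sketch below assumes each use has already been classified; `UseKind` and `hasOnlyDebugOrConsumingUses` are hypothetical names, not existing compiler APIs:

```cpp
#include <vector>

// Hypothetical classification of the uses of a forwarding instruction's result.
enum class UseKind { Debug, Consuming, NonConsuming };

// True if the instruction has no non-debug, non-consuming uses, which is the
// condition described above for being able to delete it after folding.
bool hasOnlyDebugOrConsumingUses(const std::vector<UseKind> &uses) {
  for (UseKind kind : uses) {
    if (kind == UseKind::NonConsuming)
      return false;
  }
  return true;
}
```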
While this is all nice and good there is a problem with this: it doesn't
scale. As I was writing a few optimizations like this I began to note that I had
to write different versions of this same helper for many of the visitors (they
generally varied by how many forwarding instructions they looked through).
As I pondered the above, I chatted a bit with @atrick and during our
conversation, we both realized that it is much easier to solve this problem in
one block and that the condition above would allow us to sink these instructions
into the same block. Thus, if we could check for this condition and
canonicalize the IR to sink these instructions before visiting, we could use
a single helper to handle all of these cases.
This is a generic API that when ownership is enabled allows one to replace all
uses of a value with a value with a differing ownership by transforming/lifetime
extending as appropriate.
This API supports all pairings of ownership /except/ replacing a value with
OwnershipKind::None with a value without OwnershipKind::None. This is a more
complex optimization that we do not support today. As a result, we include on
our state struct a helper routine that callers can use to know if the two values
that they want to process can be handled by the algorithm.
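The helper routine mentioned above could look roughly like this. The enum is a simplified stand-in for the compiler's OwnershipKind, and `canReplaceAllUses` is a hypothetical name for the check:

```cpp
// Simplified stand-in for the compiler's ownership kinds.
enum class OwnershipKind { None, Unowned, Owned, Guaranteed };

// Returns false only for the one unsupported pairing: the old value has
// OwnershipKind::None and the new value does not.
bool canReplaceAllUses(OwnershipKind oldValue, OwnershipKind newValue) {
  bool unsupported = oldValue == OwnershipKind::None &&
                     newValue != OwnershipKind::None;
  return !unsupported;
}
```

A caller would check this before attempting the replacement and leave the instruction alone when it returns false.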
My motivation is to use this to update InstSimplify and SILCombiner in a
less bug-prone way rather than just turning stuff off.
Noting that this transformation inserts ownership instructions, I have made sure
to test this API in two ways:
1. With Mandatory Combiner alone (to make sure it works period).
2. With Mandatory Combiner + Semantic ARC Opts to make sure that we can
eliminate the extra ownership instructions it inserts.
As one can see from the tests, the optimizer today is able to handle all of
these transforms except one conditional case where I need to eliminate a dead
phi arg. I have a separate branch that hits that today but I have exposed unsafe
behavior in ClosureLifetimeFixup that I need to fix first before I can land
that. I don't want that to stop this PR since I think the current low level ARC
optimizer may be able to help me here since this is a simple transform it does
all of the time.
Optimize the unconditional_checked_cast_addr in this pattern:
%box = alloc_existential_box $Error, $ConcreteError
%a = project_existential_box $ConcreteError in %box : $Error
store %value to %a : $*ConcreteError
%err = alloc_stack $Error
store %box to %err : $*Error
%dest = alloc_stack $ConcreteError
unconditional_checked_cast_addr Error in %err : $*Error to ConcreteError in %dest : $*ConcreteError
to:
...
retain_value %value : $ConcreteError
destroy_addr %err : $*Error
store %value to %dest : $*ConcreteError
This lets the alloc_existential_box become dead and it can be removed in following optimizations.
The same optimization is also done for conditional_checked_cast_addr.
There is also an implication for debugging:
Each "throw" in the code calls the runtime function swift_willThrow. The function is used by the debugger to set a breakpoint and also add hooks.
This optimization can completely eliminate a "throw", including the runtime call.
So, with optimized code, the user might not see the program break at a throw, whereas in the source code it is actually throwing.
On the other hand, eliminating the existential box is a significant performance win and we don't guarantee any debugging behavior for optimized code anyway. So I think this is a reasonable trade-off.
I added an option "-Xllvm -keep-will-throw-call" to keep the runtime call, which can be used if someone wants to reliably break on "throw" in optimized builds.
rdar://problem/66055678
To be precise: don't add instruction uses to the worklist if it already has more than 10000 elements.
This avoids quadratic behavior for very large functions.
rdar://problem/56268570
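A minimal sketch of the size cap described above. `BoundedWorklist` and `addUsesOf` are hypothetical names, and instructions are modeled as plain integer ids:

```cpp
#include <cstddef>
#include <vector>

class BoundedWorklist {
  std::vector<int> items; // instruction ids
  std::size_t useLimit;

public:
  explicit BoundedWorklist(std::size_t limit = 10000) : useLimit(limit) {}

  void add(int inst) { items.push_back(inst); }

  // Adds the users of an instruction unless the worklist already has more
  // than `useLimit` elements. Returns whether the users were added.
  bool addUsesOf(const std::vector<int> &users) {
    if (items.size() > useLimit)
      return false;
    items.insert(items.end(), users.begin(), users.end());
    return true;
  }

  std::size_t size() const { return items.size(); }
};
```

Skipping the uses can only cause missed optimization opportunities in very large functions; it does not affect correctness.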
Instead of bailing on ownership functions in SILCombine::run, we will
bail in individual visitors. This way, as SILCombine is updated we can
partially support ownership across the pass.
Changes:
* Allow optimizing partial_apply capturing opened existential: we didn't do this originally because it was complicated to insert the required alloc/dealloc_stack instructions at the right places. Now we have the StackNesting utility, which makes this easier.
* Support indirect-in parameters. Not super important, but why not? It's also easy to do with the StackNesting utility.
* Share code between dead closure elimination and the apply(partial_apply) optimization. It's a bit of refactoring and allowed eliminating some code which is no longer used.
* Fix an ownership problem: We inserted copies of partial_apply arguments _after_ the partial_apply (which consumes the arguments).
* When replacing an apply(partial_apply) -> apply and the partial_apply becomes dead, avoid inserting copies of the arguments twice.
These changes don't have any immediate effect on our current benchmarks, but will allow eliminating curry thunks for existentials.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.
New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h
Removed:
- Local.h
- Two conflicting CFG.h files
This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.
Move the control flow utilities out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.
Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.
Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
In the previous commit, various methods for adding, replacing, and
removing instructions were duplicated from SILCombiner into
SILInstructionWorklist. Here, SILCombiner is modified to call through
to the methods which were added to SILInstructionWorklist.