using the new InstructionDeleter utility. The elimination logic
exploits the fact that constant_evaluable function calls have
already been evaluated and folded, which makes it safe to remove
such calls even when they take @inout parameters.
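For illustration, a minimal sketch of such a removable call (function and value names hypothetical):
```
%f = function_ref @evaluableFn : $@convention(thin) (@inout Int64) -> Int64
%r = apply %f(%addr) : $@convention(thin) (@inout Int64) -> Int64
// %r has already been folded to a constant, so the now-dead apply
// (despite its @inout argument) can simply be deleted
```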
Previously, we were incorrectly handling these test cases since we weren't
properly finding destroys through forwarding instructions. I fixed that in a
previous commit by changing the code here to only support load [copy] without
forwarding instructions.
In this commit, I change the code to instead use the LiveRange abstraction. The
LiveRange abstraction already knows how to find destroys through forwarding
instructions and has this destroy array already computed for us!
So we get better runtime performance of the generated code (due to the better
optimization) and better compile time (since we aren't computing the destroy
list twice)!
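For illustration (types and names hypothetical), a destroy that is only found by looking through a forwarding instruction:
```
%1 = load [copy] %0 : $*Klass
%2 = struct $Wrapper (%1 : $Klass)   // forwarding instruction
...
destroy_value %2 : $Wrapper          // destroy found through the forward
```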
rdar://58289320
This is the first of two commits. This commit is a very simple, easily
cherry-pickable fix, but it does not use the LiveRange infrastructure that would
let us handle forwarding uses here. Instead, we just bail if any use of our
load [copy] is not a destroy_value.
In a subsequent commit, I am going to change this to use the LiveRange
infrastructure so that we handle these cases. Sadly, doing so doesn't
cherry-pick well. =/
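A rough sketch of the distinction (types hypothetical): the first pattern is still optimized, while the second is now bailed on until the LiveRange-based follow-up lands:
```
// handled: every use of the load [copy] is a destroy_value
%1 = load [copy] %0 : $*Klass
destroy_value %1 : $Klass

// bail: a forwarding use stands between the load and its destroy
%2 = load [copy] %0 : $*Klass
%3 = struct $Wrapper (%2 : $Klass)
destroy_value %3 : $Wrapper
```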
rdar://58289320
This commit eliminates the need for mark uninitialized fixup by updating the
compiler so that we now emit:
```
%0 = alloc_box
%1 = mark_uninitialized %0
%2 = project_box %1
...
destroy_value %1
```
Instead of:
```
%0 = alloc_box
%1 = project_box %0
%2 = mark_uninitialized %1
...
destroy_value %0
```
Now that the first type of code is generated, I can change project_box to only
take guaranteed arguments. This will ensure that the OSSA ARC optimizer can
eliminate copies of boxes without needing to understand the usage of the
project_box.
This method returns argument lists, not arguments! In the future we should add
an additional API that returns a flat-mapped range over all such argument lists
to clean up some of this code. But at least now the name is more accurate.
This is going to let me change getUninitializedValue() to special case alloc_box
so that we return the project_box (the address we want to analyze) rather than
the mark_uninitialized (which is on the box).
Specifically, isDefiniteInitFinished is always set to false and
TreatAddressToPointerAsInout is always set to true whenever DI is run, so this
patch just constant-propagates those values and then DCEs as appropriate.
Since collectUses in DI land always asserts that we are processing a pointer,
this code (which works with destructures) can never have been called, meaning
that the code must be dead.
and eliminate dead code. This is meant to be a replacement for the utility
recursivelyDeleteTriviallyDeadInstructions. The new utility performs more
aggressive dead-code elimination for ownership SIL.
This patch also migrates most non-force-delete uses of
recursivelyDeleteTriviallyDeadInstructions to the new utility.
and migrates one force-delete use of recursivelyDeleteTriviallyDeadInstructions
(in IRGenPrepare) to use the new utility.
Specifically:
1. I renamed the method insertAfter -> insertAfterInvocation and added an
exhaustive switch to ensure that we properly update this code if we add new apply
sites.
2. I added a new method insertAfterFullEvaluation that is like
insertAfterInvocation except that the callback is called with insertion points
after the end/abort apply instead of after the initial invocation of the
begin_apply.
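A rough sketch of where the two sets of insertion points land for a begin_apply (coroutine and types hypothetical):
```
(%value, %token) = begin_apply %coro() : $@yield_once @convention(thin) () -> @yields @inout Int64
// insertAfterInvocation: the callback receives an insertion point here
...
end_apply %token
// insertAfterFullEvaluation: the callback receives insertion points here
// (and likewise after any abort_apply)
```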
All non-cond_br SILInstructions are layout-compatible with BranchPropagatedUser
since BPU just stores a PointerIntPair whose low bits remain zero unless one
adds a cond_br. Since in these cases the SILInstructions are all not cond_br,
we can use this alternative API.
I also added some asserts to validate that the assumption that all such
SILInstructions are not cond_br is respected.
Specifically if we had:
```
%1 = enum $Enum, %0
switch_enum %1
```
We would propagate %0 without eliminating the enum in certain cases. Instead, we
now insert unchecked_enum_data right before the branch to ensure that:
1. The types line up.
2. The enum is only consumed along the path through the switch_enum, rather than
us having to manage the lifetime of the enum along other paths.
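A rough sketch of the rewrite, assuming a single payload case #Enum.some whose destination block is bbSome (all names hypothetical):
```
%1 = enum $Enum, #Enum.some!enumelt, %0 : $Payload
...
// inserted right before the branch into the switch_enum's destination:
%2 = unchecked_enum_data %1 : $Enum, #Enum.some!enumelt
br bbSome(%2 : $Payload)
```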
In a forthcoming commit, I am going to need access to the "escaping" uses of a
dead live range. Rather than add /another/ argument to isDeadLiveRange, this
commit refactors the code to be a constructor on a general LiveRange class that
internally contains the lists of destroys, forwarding instructions we can
convert to be guaranteed, and consuming instructions that we do not understand.
Should be NFC.
Previously we were:
1. Doing a linear scan, performing certain optimizations, and setting up a
worklist for future processing.
2. Draining the worklist of changed instructions until we reached a fixed point.
The key thing here is that for (1) to be safe, we would need to not perform any
optimizations on block arguments, since there was an assumption that we would
only perform SSA-like optimizations.
I am going to be adding such optimizations soon, so it makes sense to just
convert the initial traversal to non-destructively seed the worklist and
eliminate that initial optimization pass.
This should be NFC.
All the context dependencies in SIL type lowering have been eradicated, but IRGen's
type info lowering is still context-dependent and doesn't systematically pass generic
contexts around. Sink GenericContextScope bookkeeping entirely into IRGen for now.
function of the OSLogOptimization pass that happens when folding a
guaranteed value with a constant (owned) value. The fix inserts the
constant value at the beginning of the borrowed scope of the guaranteed
value rather than at the definition of the guaranteed value.
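A minimal sketch of the new insertion point, assuming the guaranteed value is introduced by a begin_borrow (all names hypothetical):
```
%b = begin_borrow %v : $T
// the folded owned constant is now materialized here, at the start of
// %b's borrow scope, rather than at %v's definition
%f = function_ref @makeConstant : $@convention(thin) () -> @owned T
%c = apply %f() : $@convention(thin) () -> @owned T
...
end_borrow %b : $T
```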
Update the SIL tests for the new folding pattern and add a test that
catches this bug.
Also, re-enable the OSLogPrototypeExecTest.swift that was disabled
due to this bug.
This is important both for correctness reasons and for triaging reasons. The
correctness is obvious, but for those unaware there are many triaging utilities
that only verify or print pass output if the pass makes a change as indicated by
invalidating state.
Specifically, I was abusing the sort order of some arrays that I then needed to
iterate over, which meant non-determinism. To work around these issues, I made
two changes:
1. Rather than using a bit vector to mark copy_values that were handled as part
of phi handling and thus needing a way to map copy_value -> bit vector index, I
instead just added a separate small ptr set called
copyValueProcessedWithPhiNodes.
2. I refactored/changed how copy cleanups are inserted for phi nodes by
constructing a flat 2-D array that is stable-sorted by the index of the incoming
value associated with each cleanup. An incoming value's index is the count of
copy cleanups at the time we first see that value. Thus, when we do the stable
sort, we visit in cleanup insertion order and also proceed in insertion order
along the incomingValue axis.
This involved fixing a few issues exposed by trying to use the load_borrow code
with load [copy] that were caught in our tests by the ownership
verifier. Specifically:
1. In getUnderlyingBorrowIntroducingValues we were first checking if a user was
a guaranteed forwarding instruction and only then checking if our value was a
value with None ownership that we want to skip. This caused weird behavior where
one could get the wrong borrow introducers if that trivial value came from a
different guaranteed value.
2. I found via a test case in predictable_memaccess_opts that we needed to
insert compensating destroys for phi nodes after we process all of the incoming
values. The reason why this is needed is that multiple phi nodes could use the
same incoming value. In such a case, we need to treat all of the copy_values
inserted for each phi node as users of that incoming value to prevent us from
inserting too many destroy_value. Consider:
```
bb0:
%0 = copy_value
cond_br ..., bb1, bb2
bb1:
br bb3(%0)
bb2:
br bb4(%0)
```
If we processed the phi args in bb3 and bb4 separately, we would insert two
extra destroys in bb1 and bb2.
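With all phi copies treated together as users of %0 before compensating destroys are inserted, the result is roughly (sketch, assuming the phis consume their arguments):
```
bb1:
  %c1 = copy_value %0 : $Klass
  destroy_value %0 : $Klass     // exactly one compensating destroy per path
  br bb3(%c1 : $Klass)
bb2:
  %c2 = copy_value %0 : $Klass
  destroy_value %0 : $Klass
  br bb4(%c2 : $Klass)
```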
The implementation works by processing each phi and, for each incoming value of
the phi, appending the (incoming value, copy we made for the value) pair to an
array. We then stable-sort the array by incoming value only. This allows us to
process ranges of copies with the same underlying incoming value in a second
loop, taking advantage of all of the copies for an incoming value being
contiguous in said array. We still lifetime-extend to the load and insert
destroy_values for the actual phi (not its incoming values) in that first loop.
3. I tightened up the invariant in the AvailableValueAggregator that it always
returns values that are newly copied unless we are performing a take. This
involved changing the code in handlePrimitiveValues to always insert copies when
ownership is enabled.
4. I tightened up the identification of intermediate phi nodes: instead of just
checking that the phi node has a single terminator user, we now check that the
single terminator user has a successor with an inserted phi node that we know
about.
This is analogous to a previous fix where we were eliminating copies that were
post-dominated by dead ends, such that the copy_value did not have any
destroys.
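The pattern in question looks roughly like this (type hypothetical): the copy's uses are post-dominated by a dead end, so no destroy_value ever appears:
```
%1 = copy_value %0 : $Klass
...
unreachable    // dead-end block: %1 never reaches a destroy_value
```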
rdar://56807157
fold a symbolic closure, which is the representation of a closure literal
in the constant evaluator. This commit improves the function
emitCodeForSymbolicValue so that given a symbolic closure it can emit
SIL code for constructing the closure.
This improvement enables folding the arguments array, which is an array
of closures, by its constant value inferred by constant evaluating the
new OSLog calls.
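Roughly, the SIL emitted for a folded symbolic closure might look like this sketch (function name, capture, and types hypothetical):
```
%f = function_ref @closureFn : $@convention(thin) (Int64, @guaranteed C) -> ()
%c = partial_apply [callee_guaranteed] %f(%capture) : $@convention(thin) (Int64, @guaranteed C) -> ()
```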
of the OSLogOptimization pass. This commit contains two changes:
- It handles non-OSSA better (but it is meant to be phased out) so
that array and closure folding can be supported
- It fixes a bug in the OSSA folding by making sure that when an
owned value replaces a guaranteed value, the owned value is
borrowed and the borrow is used in place of the guaranteed value.
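A minimal sketch of that borrow pattern (names and types hypothetical):
```
%c = apply %f() : $@convention(thin) () -> @owned T   // folded owned value
%b = begin_borrow %c : $T
// uses of the original guaranteed value are rewritten to use %b
...
end_borrow %b : $T
destroy_value %c : $T
```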
This entailed untangling a few issues. I have purposefully fixed this only for
load_borrow for now, to keep the change in tree smaller (the reason for my
splitting the code paths in a previous group of commits). I am going to follow
up with a load [copy] version of the patch; I think that the code should be the
same. The interesting part about this is that all of these code conditions are
caught by the ownership verifier, suggesting to me that the code in the
stdlib/overlays is simple enough that we didn't hit these code patterns.
Anyway, the main change here is that we now eliminate control-equivalence
issues arising from @owned values that are not control-equivalent being
forwarded into aggregates and phis. This would cause our outer value to be
destroyed early, since we would be destroying it once for every iteration of
a loop.
The way this is done is as follows (noting that this is only for borrows today; a rough sketch follows the list):
* The AvailableValueAggregator always copies values at available value points to
lifetime extend the values to the load block. In the load block, the
aggregator will take the incoming values and borrow the value to form tuples,
structs as needed. This new guaranteed aggregate is then copied. If we do not
need to form an aggregate, we just copy. Since the copy is in our load block,
we know that we can use it for our load_borrow. Importantly, note that we no
  longer allow these aggregates to forward owned ownership, preventing the
  control-equivalence problem above.
* Since the aggregator may use the SSA updater if it finds multiple available
  values, we need to also make sure that any phis are strongly control
  equivalent with all of their incoming values. So we insert copies along those
  edges and then lifetime extend the phi as appropriate.
* Since none of the copy_values inserted by the available value aggregator in
  the above are ever consumed, we can just add destroy_values in the
  appropriate places and everything works.
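A rough end-to-end sketch of the aggregation, under the assumptions above (all names and types hypothetical):
```
// at each available-value point, lifetime-extend with a copy:
%c0 = copy_value %av0 : $T0
%c1 = copy_value %av1 : $T1
// in the load block, borrow the copies and form the aggregate:
%b0 = begin_borrow %c0 : $T0
%b1 = begin_borrow %c1 : $T1
%agg = struct $S (%b0 : $T0, %b1 : $T1)
%result = copy_value %agg : $S    // the guaranteed aggregate is copied
end_borrow %b1 : $T1
end_borrow %b0 : $T0
// %result lives in the load block, so it can feed the rewritten
// load_borrow; since no inserted copy is ever consumed, destroy_values
// can be added at the appropriate places
```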