Avoid dangling pointers in the connection graph when SILValues are
deleted. This didn't show up in normal compilation because the values
were only used for printing the graph. But now we have more assertions
that check for well-formed graph. These could hit the dangling
pointers.
To guarantee this can't happen, establish a strict invariant that the
mappedValue is only used for nodes that exist in the value-to-node
map. Then clear out mappedValue whenever a value is deleted. Add
verification to ensure that each node's mappedValue is valid and catch
dangling pointers quickly.
This completely changes the way that a node values are set and used
which in turn makes it *much* easier to debug the graph and associate
the nodes with the SIL function. For example, it's normal to view or
dump the graph at a key point. When the nodes are printed, the
associated SILValues are now more precise and consistent. However,
it's also normal to call CGNode::dump() from tracing code or within a
debugger. Previously, there was no way to assocate the output of
CGNode::dump() with the printed or displayed connection graph. Now
both forms of output have identical node IDs!
While we're at it, make sure the hasRC flag is always merged
conservatively in a couple more places.
Fixed <rdar://57290845> While running SILFunctionTransform "TempRValueOpt"
SIL type lowering erases DynamicSelfType, so we generate
incorrect code when casting to DynamicSelfType. Fixing this
requires a fair amount of plumbing, but most of the
changes are mechanical.
Note that the textual SIL syntax for casts has changed
slightly; the target type is now a formal type without a '$',
not a SIL type.
Also, the unconditional_checked_cast_value and
checked_cast_value_br instructions now take the _source_
formal type as well, just like the *_addr forms they are
intended to replace.
I left in the run before DestroyHoisting since I believe that DestroyHoisting
depends a bit on SemanticARCOpts running, but at the same time I don't want to
deal with any regressions that may come from moving DestroyHoisting.
function of the OSLogOptimization pass that happens when folding a
guaranteed value with a constant (owned) value. The fix inserts the
constant value at the beginning of the borrowed scope of the guaranteed
value rather than at the definition of the guaranteed value.
Update the SIL tests for the new folding pattern and add a test that
catches this bug.
Also, re-enable the OSLogPrototypeExecTest.swift that was disabled
due to this bug.
This pipeline just runs the Serialize SIL pass. The reason I am adding this is
that currently if one passes -disable-sil-perf-optzns or mess with
-sil-opt-pass-count, one can cause serialization to not occur, making it
difficult to bisect/turn off-on opts.
This is important both for correctness reasons and for triaging reasons. The
correctness is obvious, but for those unaware there are many triaging utilities
that only verify or print pass output if the pass makes a change as indicated by
invalidating state.
Specifically, I was abusing some sorting behavior on some arrays that I really
needed to iterate over == non-determinism. To work around these issues, I made
two changes:
1. Rather than using a bit vector to mark copy_values that were handled as part
of phi handling and thus needing a way to map copy_value -> bit vector index, I
instead just added a separate small ptr set called
copyValueProcessedWithPhiNodes.
2. I refactored/changed how copy cleanups were inserted for phi nodes by
constructing a flat 2d-array that is stable sorted by the index of the incoming
value associated with the cleanups. An incoming value's index is the count of
the copy cleanup when we see it for the first time. Thus when we do the stable
sort we will be visiting in cleanup insertion order and also will be doing
insertion order along the incomingValue axis.
This involved fixing a few issues exposed by trying to use the load_borrow code
with load [copy] that were caught in our tests by the ownership
verifier. Specifically:
1. In getUnderlyingBorrowIntroducingValues we were first checking if a user was
a guaranteed forwarding instruction and then doing a check if our value was a
value with None ownership that we want to skip. This caused weird behavior where
one could get the wrong borrow introducers if that trivial value came from a
different guaranteed value.
2. I found via a test case in predictable_memaccess_opts that we needed to
insert compensating destroys for phi nodes after we process all of the incoming
values. The reason why this is needed is that multiple phi nodes could use the
same incoming value. In such a case, we need to treat all of the copy_values
inserted for each phi node as users of that incoming value to prevent us from
inserting too many destroy_value. Consider:
```
bb0:
%0 = copy_value
cond_br ..., bb1, bb2
bb1:
br bb3(%0)
bb2:
br bb4(%0)
```
If we processed the phi args in bb3, bb4 separately, we would insert an extra 2
destroys in bb1, bb2.
The implementation works by processing each phi and for each incoming value of
the phi appending the value, copy we made for the value to an array. We then
stable sort the array only by value. This then allows us to process ranges of
copies with the same underlying incoming value in a 2nd loop taking advantage of
all of the copies for an incoming value being contiguous in said array. We still
lifetime extend to the load/insert destroy_value for the actual phi (not its
incoming values) in that first loop.
3. I tightened up the invariant in the AvailableValueAggregator that it always
returns values that are newly copied unless we are performing a take. This
involved changing the code in handlePrimitiveValues to always insert copies when
ownership is enabled.
4. I tightened up the identification of intermediate phi nodes by instead of
just checking if the phi node has a single terminator user to instead it having
a single terminator user that has a successor with an inserted phi node that we
know about.
Remove cruft that I added in the previous commit. This eliminates
unnecessary cleverness so initializePointsTo is now very simple.
Add hand-coded SIL test cases to somewhat verify that the new
algorithm works.
Fixes a potential real bug in the case that SinkAddressProjections moves
projections without notifying SimplifyCFG of the change. This could
fail to update Analyses (probably won't break anything in practice).
Introduce SILInstruction::isPure. Among other things, this can tell
you if it's safe to duplicate instructions at their
uses. SinkAddressProjections should check this before sinking uses. I
couldn't find a way to expose this as a real bug, but it is a
theoretical bug.
Add the SinkAddressProjections functionality to the BasicBlockCloner
utility. Enable address projection sinking for all BasicBlockCloner
clients (the four different kinds of jump-threading that use it). This
brings the compiler much closer to banning all address phis.
The "bugs" were originally introduced a week ago here:
commit f22371bf0b (fork/fix-address-phi, fix-address-phi)
Author: Andrew Trick <atrick@apple.com>
Date: Tue Sep 17 16:45:51 2019
Add SIL SinkAddressProjections utility to avoid address phis.
Enable this utility during jump-threading in SimplifyCFG.
Ultimately, the SIL verifier should prevent all address-phis and we'll
need to use this utility in a few more places.
Fixes <rdar://problem/55320867> SIL verification failed: Unknown
formal access pattern: storage
In SILInstruction::isTriviallyDuplicatable():
- Make deallocating instructions trivially duplicatable. They are by
any useful definition--duplicating an instruction does not imply
reordering it. Tail duplication was already treating deallocations
as duplicatable, but doing it inconsistently. Sometimes it checks
isTriviallyDuplicatable, and sometimes it doesn't, which appears to
have been an accident. Disallowing duplication of deallocations will
cause severe performance regressions. Instead, consistently allow
them to be duplicated, making tail duplication more powerful, which
could expose other bugs.
- Do not duplicate on-stack AllocRefInst (without special
consideration). This is a correctness fix that apparently was never
exposed.
Fix SILLoop::canDuplicate():
- Handle isDeallocatingStack. It's not clear how we were avoiding an
assertion before when a stack allocatable reference was confined to
a loop--probably just by luck.
- Handle begin/end_access inside a loop. This is extremely important
and probably prevented many loop optimizations from working with
exclusivity.
Update LoopRotate canDuplicateOrMoveToPreheader(). This is NFC.
This property will allow alias analysis to be safely optmistic when
querying the connection graph. A node that is known to have a ref
count is know to keep alive everything it points to. Therefore,
calling a deinitializer on a different reference cannot release the RC
node's contents.
This is an analogous fix to a previous fix where we were eliminating copies that
were post-dominated by dead ends such that the copy_value did not have any
destroys.
rdar://56807157
fold a symbolic closure, which is the representation of a closure literal
in the constant evaluator. This commit improves the function
emitCodeForSymbolicValue so that given a symbolic closure it can emit
SIL code for constructing the closure.
This improvement enables folding the arguments array, which is an array
of closures, by its constant value inferred by constant evaluating the
new OSLog calls.
of the OSLogOptimization pass. This commit contain two changes:
- It handles non-OSSA better (but it is meant to be phased out) so
that array and closure folding can be supported
- It fixes a bug in the OSSA folding by making sure that when an
owned value replaces a guaranteed value, the owned value is
borrowed and the borrow is used in place of the guaranteed value.
The two major take aways from this patch are:
(1) Impose graph structure and reduce superfluous nodes and edges.
Incrementally make the connection graph and the APIs used to construct
it more structured.
_This allows node properties based on the SILValue to be reliably added to nodes_
Although that was the initial motiviation, there are other
benefits. Non-content nodes now have verifiable SILValues. Content
nodes now have meaningful SILValues even though they can't be
guaranteed due to merging. As a result it is *much* easier to debug
the graph and correlate it with the SIL. Rather than a web of
connection graph nodes with no identity and edges that don't
correspond to anything in SIL, the graph nodes now have value number
that correspond to the instruction used to dereference the node. The
edges also exhibit structure now. A pointsTo edge now (in practice)
always corresponds to a real pointer deference in the SIL. Doing this
required more than just adding some helpers, it was also necessary to
rewrite the graph merge and update algorithms.
(2) Split up underlying functionality into more explicit steps
Breaks apart the most complex parts of the graph algorithms into small
self-contained, self-checked steps. The purpose of each step is clear
and it's possible to reason about correctness from basic
invariants. Each merge step can now run full graph verification.
This was also done to move toward an invariant that the graph is never
mutated during a query. But to finish that goal, we need to add a
use-point query. With that, there will be no node creation, use point
propagation, new defer edges, etc. after graph building. At the very
least, this will make it sane to debug the output of the analysis.
---
Here is a change-by-change description in diff order:
Replace `updatePointsTo` with `initializePointsTo` and
`mergePointsTo`. Merging is very simple on its own. Initialization
requires some extra consideration for graph invariants. This
separation makes it possible to write stong asserts and to
independently reason about the correctness of each step based on
static invariants.
Replace `getContentNode` with `createContentNode`, and add two higher
level APIs `createMergedContent`, and `getFieldContent`. This makes
explicit the important cases of merging nodes and creating a special
nodes for class fields. This slightly simplifies adding properties to
content nodes and helps understand the structure of the graph.
Factor out an `escapeContentsOf` helper for use elsewhere...
Add a `getValueContent` helper. This is where we can tie the
properties of content nodes to the address values that are used to
address that content. This now also ensures that a Value node's
value field is consistent with all SILValues that map to it.
Add -escapes-internal-verify to check that the graph is in a valid
state after every merge or update step. This verification drove the
partial rewrite of mergeAllScheduledNodes.
ConnectionGraph::defer implementation: explictly handle the three
possible cases of pointsTo initialization or pointsTo merging at the
top level, so that those underlying implementations do not need to
dynamically handle weirdly different scenarios.
ConnectionGraph::initializePointsTo implementation: this simplified
implementation is possible by relying on invariants that can be
checked at each merge/update step. The major functional difference is
that it avoids creating unnecessary pointsTo edges. The previous
implementation often created pointsTo edges when adding defer edges
just to be conservative. Fixing this saved my sanity during debugging
because the pointsTo edges now always correspond to a SIL operations
that dereference the pointer. I'm also arguing without evidence that
this should be much more efficient.
ConnectionGraph::mergeAllScheduledNodes implementation: Add
verification to each step so that we can prove the other utilities
that are used while merging aren't making incorrect assumptions about
the graph state. Remove checks for merged nodes now that the graph is
consistently valid. Also remove a loop at the end that didn't seem to
do anything. The diff is impossible to review, but the idea is
basically the same. As long as it's still possible to scan through the
steps in the new code without getting totally lost, then the goal was
achieved.
ConnectionGraph::mergePointsTo: This is extremely simple now. In all
the places where we used to call updatePointsTo, and now call
mergePointsTo, it's a lot easier for someone debugging the code to
reason about what could possibly happen at that point.
`createMergedContent` is a placeholder for transferring node properties.
The `getFieldContent` helper may seem silly, but I find it helpful to
see all the important ways that content can be created in one place
next to the createContentNode, and I like the way that the creation of
the special "field content" node is more explicit in the source.
ConnectionGraph::mergeFrom implementation: this is only a minor
cleanup to remove some control flow nesting and use the CGNodeWorklist
abstraction.
In AnalyzeInstruction, add EscapeAnalysis::getValueContent helper. It
eliminates an extra step of going through the value node to get at its
content node. This is where we can derive content node properties from
the SILValue that dereferences the content. We can update the content
node's associated value 'V' if it's useful. It's also a place to put
assertions specific to the first level of content.
In AnalyzeInstruction, Array semantic calls: add support for
getValueContent so we can derive node properties. This is also nice
because it's explicit about which nodes are value content vs. field
content.
In AnalyzeInstruction, cleanup Release handling: use the explicit
APIs: getValueContent, getFieldContent, and escapeContentsOf.
In AnalyzeInstruction, assert that load-like things can't produce addresses.
In AnalyzeInstruction, add comments to clarify object projection handling.
In AnalyzeInstruction, add comments to explain store handling.
In AnalyzeInstruction, drop the assumption that all partial applies hold pointers.
In AnalyzeInstruction, handle aggregates differently so that Value
nodes are always consistent with their SILValue and can be
verified. Aggregates nodes are still coalesced if they only have a
single pointer-type subelement. If we arbitrarily coalesced an
aggregate with just one of its subelements then there would be no
consistent way to identify the value that corresponds to a connection
graph node.
This entailed untangling a few issues. I have purposefully only fixed this for
now for load_borrow to make the change in tree smaller (the reason for my
splitting the code paths in a previous group of commits). I am going to follow
up with a load copy version of the patch. I think that the code should be the
same. The interesting part about this is that all of these code conditions are
caught by the ownership verifier, suggesting to me that the code in the
stdlib/overlays is simple enough that we didn't hit these code patterns.
Anyways, the main change here is that we now eliminate control equivalence
issues arising from @owned values that are not control equivalent being
forwarded into aggregates and phis. This would cause our outer value to be
destroyed early since we would be destroying it once for every time iteration in
a loop.
The way that this is done is that (noting that this is only for borrows today):
* The AvailableValueAggregator always copies values at available value points to
lifetime extend the values to the load block. In the load block, the
aggregator will take the incoming values and borrow the value to form tuples,
structs as needed. This new guaranteed aggregate is then copied. If we do not
need to form an aggregate, we just copy. Since the copy is in our load block,
we know that we can use it for our load_borrow. Importantly note how we no
longer allow for these aggregates to forward owned ownership preventing the
control equivalence problem above.
* Since the aggregator may use the SSA updater if it finds multiple available
values, we need to also make sure that any phis are strongly control
equivalent on all of its incoming values. So we insert copies along those
edges and then lifetime extend the phi as appropriate.
* Since in all of the above all copy_value inserted by the available value
aggregator are never consumed, we can just add destroy_values in the
appropriate places and everything works.