This is in prepration for other bug fixes.
Clarify the SIL utilities that return canonical address values for
formal access given the address used by some memory operation:
- stripAccessMarkers
- getAddressAccess
- getAccessedAddress
These are closely related to the code in MemAccessUtils.
Make sure passes use these utilities consistently so that
optimizations aren't defeated by normal variations in SIL patterns.
Create an isLetAddress() utility alongside these basic utilities to
make sure it is used consistently with the address corresponding to
formal access. When this query is used inconsistently, it defeats
optimization. It can also cause correctness bugs because some
optimizations assume that 'let' initialization is only performed on a
unique address value.
Functional changes to Memory Behavior:
- An instruction with side effects now conservatively still has side
effects even when the queried value is a 'let'. Let values are
certainly sensitive to side effects, such as the parent object being
deallocated.
- Return the correct MemBehavior for begin/end_access markers.
For alias analysis query to be generally correct, we need to
effectively merge the escape state and use points for everything in a
defer web.
It was unclear from the current design whether the "escaping" property
applied to the pointer value or its content. The implementation is
inconsistent in how it was treated. It appears that some bugs have
been worked around by propagating forward through defer edges, some
have been worked around by querying the content instead of the
pointer, and others have been worked around be creating fake use
points at block arguments.
If we always simply query the content for escape state and use points,
then we never need to propagate along defer edges. The current code
that propagates escape state along defer edges in one direction is
simply incorrect from the perspective of alias analysis.
One very attractive solution is to merge nodes eagerly without
creating any defer edges, but that would be a much more radical change
even than what I've done here. It would also pose some new issues: how
to resolve the current "node types" when merging and how to deal with
missing content nodes.
This solution of applying escape state to content nodes solves all
these problems without too radical of a change at the expense of
eagerly creating content nodes. (The potential graph memory usage is
not really an issue because it's possible to drastically shrink the
size of the graph anyway in a future commit--I've been able to fit a
node within one cache line). This solution nicely preserves graph
structure which makes it easy to debug and relate to the IR.
Eagerly creating content nodes also solves the missing content node
problem. For example, when querying canEscapeTo, we need to know
whether to look at the escape state for just the pointer value itself,
or also for its content. It may be possible the its content node is
actually part of the same object at the IR level. If the content node
is missing, then we don't know if the object's interior address is not
recognizable/representable or whether we simply never saw an access to
the interior address. We can't simply look at whether the current IR
value happens to be a reference, because that doesn't tell us whether
the graph node may have been merged with a non-reference node or even
with it's own content node. To be correct in general, this query would
need to be extremely conservative. However, if content nodes are
always created for references, then we only need to query the escape
state of a pointer's content node. The content node's flag tells us if
it's an interior node, in which case it will always point to another
content node which also needs to be queried.
For alias analysis query to be generally correct, we need to
effectively merge the escape state and use points for everything in a
defer web.
It was unclear from the current design whether the "escaping" property
applied to the pointer value or its content. The implementation is
inconsistent in how it was treated. It appears that some bugs have
been worked around by propagating forward through defer edges, some
have been worked around by querying the content instead of the
pointer, and others have been worked around be creating fake use
points at block arguments.
If we always simply query the content for escape state and use points,
then we never need to propagate along defer edges. The current code
that propagates escape state along defer edges in one direction is
simply incorrect from the perspective of alias analysis.
One very attractive solution is to merge nodes eagerly without
creating any defer edges, but that would be a much more radical change
even than what I've done here. It would also pose some new issues: how
to resolve the current "node types" when merging and how to deal with
missing content nodes.
This solution of applying escape state to content nodes solves all
these problems without too radical of a change at the expense of
eagerly creating content nodes. (The potential graph memory usage is
not really an issue because it's possible to drastically shrink the
size of the graph anyway in a future commit--I've been able to fit a
node within one cache line). This solution nicely preserves graph
structure which makes it easy to debug and relate to the IR.
Eagerly creating content nodes also solves the missing content node
problem. For example, when querying canEscapeTo, we need to know
whether to look at the escape state for just the pointer value itself,
or also for its content. It may be possible the its content node is
actually part of the same object at the IR level. If the content node
is missing, then we don't know if the object's interior address is not
recognizable/representable or whether we simply never saw an access to
the interior address. We can't simply look at whether the current IR
value happens to be a reference, because that doesn't tell us whether
the graph node may have been merged with a non-reference node or even
with it's own content node. To be correct in general, this query would
need to be extremely conservative. However, if content nodes are
always created for references, then we only need to query the escape
state of a pointer's content node. The content node's flag tells us if
it's an interior node, in which case it will always point to another
content node which also needs to be queried.
This is the first in a series of patches that reworks
EscapeAnalysis. For this patch, I extracted every change that does not
introduce new features, rewrite logic, or appear to change
functionality.
These cleanups were done in preparation for:
- adding a graph representation for reference counted objects
- rewriting parts to the query logic
- ...which then allows the analysis to safely assume that all
exclusive arguments are unique
- ...which then allows more aggressive optimization of local variables
that don't escape
There are two possible functional changes:
1. getUnderlyingAddressRoot in InstructionUtils now sees through OSSA
instructions: begin_borrow and copy_value
2. The getPointerBase helper in EscapeAnalysis now sees through all of
these reference and pointer casts:
+ case ValueKind::UncheckedRefCastInst:
+ case ValueKind::ConvertFunctionInst:
+ case ValueKind::UpcastInst:
+ case ValueKind::InitExistentialRefInst:
+ case ValueKind::OpenExistentialRefInst:
+ case ValueKind::RawPointerToRefInst:
+ case ValueKind::RefToRawPointerInst:
+ case ValueKind::RefToBridgeObjectInst:
+ case ValueKind::BridgeObjectToRefInst:
+ case ValueKind::UncheckedAddrCastInst:
+ case ValueKind::UnconditionalCheckedCastInst:
+ case ValueKind::RefTo##Name##Inst:
+ case ValueKind::Name##ToRefInst:
This coalesces a whole bunch of nodes together that were just there
because of casts. The existing code was already doing this for one
level of casts, but there was a bug that prevented it from happening
transitively. So, in theory, anything that breaks with this fix could
also break without this fix, but may not have been exposed. The fact
that this analysis coalesces address-to-reference casts at all is what
caused me to spent vast amounts of time debugging any time I tried to
force some structure on the graph via assertions. If it is done at
all, it should be done everywhere consistently to expose issues as
early as possible.
Here is a description of the changes in diff order. If something in
the diff is not listed here, then the code probably just moved around
in the file:
Rename isNotAliasedIndirectParameter to
isExclusiveIndirectParameter. The argument may be aliased in the
caller's scope and it's contents may have already escaped.
Add comments to SILType APIs (isTrivial, isReferenceCounted) that give
answers about the AST type which don't really make sense for address
SILTypes.
Add comments about CGNode's 'Value' field. I spent lots of time
attempting to tighten this down with asserts, but it's only possible
for non-content nodes. For content nodes, the node's value is highly
unpredictable and basically nonsense but needed for debugging.
Add comments about not assuming that the content nodes pointsTo edge
represents physical indirection. This matters when reasoning about
aliasing and it's a tempting assumption to make.
Add a CGNode::mergeProperties placeholder for adding node properties.
Factor out a CGNode::canAddDeferred helper for use later.
Rename `setPointsTo` to `setPointsToEdge` because it actually creates
an edge rather than just setting `pointsTo`.
Add CGNode::getValue() and related helpers to help maintain invariants.
Factor out a `markEscaping` helper.
Clean up the `escapesInsideFunction` helper.
Add node visitor helpers: visitSuccessors, visitDefers. This made is
much easier to prototype utilities.
Add comments to clarify the `pointsTo` invariant. In particular, an
entire defer web may have a null pointsTo for a while.
Add an `activeWorklist` to avoid nasty bugs when composing multiple
helpers that each use the worklist.
Remove the `EA` argument from `getNode`. I ended up needing access to
the `EA` context from the ConnectionGraph many times during
prototyping and passing `this` was all the `getNode` calls was very
silly.
Add graph visitor helpers: backwardTraverse, forwardTraverseDefer,
forwardTraversePointsToEdges, and mayReach for ease in developing new
logic and utilities.
Add isExclusiveArgument helper and distinguish exclusive arguments
from local objects. Confusing these properties could lead to scary
bugs. For example, unlike a local object, an exclusive argument's
contents may still escape even when the content's connection graph
node is non-escaping!
Add isUniquelyIdentified helper when we want to treat exclusive
arguments and local objects uniformly.
getUnderlyingAddressRoot now looks through OSSA instructions.
Rename `getAccessedMemory` to `getDirectlyAccessedMemory` with
comments. This is another dangerous API because it assumes the memory
access to a given address won't touch memory represented by different
graph nodes, but graph edges don't necessarily represent physical
indirection. Further clarify this issue in comments in
AliasAnalysis.cpp.
Factor out a 'findRecursiveRefType' helper from the old
'mayContainReference' for checking whether types may or must contain
references. Support both kinds of queries so the analysis can be
certain when a pointer's content is a physical heap object.
Factor out 'getPointerBase' and 'getPointerRoot' helpers that follow
address projections within what EscapeAnalysis will consider a single
node.
Create a CGNodeWorklist abstraction to safely formalize the worklist
mechanism that's used all over the place. In one place, there were
even two separate independent lists used independently (nodes added to
one list could appear to be in the other list).
The CGNodeMap abstraction did not significantly change, it was just moved.
Added 'dumpCG' for dumping .dot files making it possible to remote debug.
Added '-escapes-enable-graphwriter' option to dump .dot files, since
they are so much more useful than the textual dump of the connection
graph, which lacks node identity!
At some point, pass definitions were heavily macro-ized. Pass
descriptive names were added in two places. This is not only redundant
but a source of confusion. You could waste a lot of time grepping for
the wrong string. I removed all the getName() overrides which, at
around 90 passes, was a fairly significant amount of code bloat.
Any pass that we want to be able to invoke by name from a tool
(sil-opt) or pipeline plan *should* have unique type name, enum value,
commend-line string, and name string. I removed a comment about the
various inliner passes that contradicted that.
Side note: We should be consistent with the policy that a pass is
identified by its type. We have a couple passes, LICM and CSE, which
currently violate that convention.
(libraries now)
It has been generally agreed that we need to do this reorg, and now
seems like the perfect time. Some major pass reorganization is in the
works.
This does not have to be the final word on the matter. The consensus
among those working on the code is that it's much better than what we
had and a better starting point for future bike shedding.
Note that the previous organization was designed to allow separate
analysis and optimization libraries. It turns out this is an
artificial distinction and not an important goal.