Create a Predecessor abstraction for readability. Eliminates frequent
occurence of inscrutable getInt() and getPointer() calls.
Create an escapesInsideFunction() API that only uses mapped
values. This prevents bugs where a content node's value would be
mistakenly used to check for escapes.
Avoid dangling pointers in the connection graph when SILValues are
deleted. This didn't show up in normal compilation because the values
were only used for printing the graph. But now we have more assertions
that check for well-formed graph. These could hit the dangling
pointers.
To guarantee this can't happen, establish a strict invariant that the
mappedValue is only used for nodes that exist in the value-to-node
map. Then clear out mappedValue whenever a value is deleted. Add
verification to ensure that each node's mappedValue is valid and catch
dangling pointers quickly.
This completely changes the way that a node values are set and used
which in turn makes it *much* easier to debug the graph and associate
the nodes with the SIL function. For example, it's normal to view or
dump the graph at a key point. When the nodes are printed, the
associated SILValues are now more precise and consistent. However,
it's also normal to call CGNode::dump() from tracing code or within a
debugger. Previously, there was no way to assocate the output of
CGNode::dump() with the printed or displayed connection graph. Now
both forms of output have identical node IDs!
While we're at it, make sure the hasRC flag is always merged
conservatively in a couple more places.
Fixed <rdar://57290845> While running SILFunctionTransform "TempRValueOpt"
This property will allow alias analysis to be safely optmistic when
querying the connection graph. A node that is known to have a ref
count is know to keep alive everything it points to. Therefore,
calling a deinitializer on a different reference cannot release the RC
node's contents.
The two major take aways from this patch are:
(1) Impose graph structure and reduce superfluous nodes and edges.
Incrementally make the connection graph and the APIs used to construct
it more structured.
_This allows node properties based on the SILValue to be reliably added to nodes_
Although that was the initial motiviation, there are other
benefits. Non-content nodes now have verifiable SILValues. Content
nodes now have meaningful SILValues even though they can't be
guaranteed due to merging. As a result it is *much* easier to debug
the graph and correlate it with the SIL. Rather than a web of
connection graph nodes with no identity and edges that don't
correspond to anything in SIL, the graph nodes now have value number
that correspond to the instruction used to dereference the node. The
edges also exhibit structure now. A pointsTo edge now (in practice)
always corresponds to a real pointer deference in the SIL. Doing this
required more than just adding some helpers, it was also necessary to
rewrite the graph merge and update algorithms.
(2) Split up underlying functionality into more explicit steps
Breaks apart the most complex parts of the graph algorithms into small
self-contained, self-checked steps. The purpose of each step is clear
and it's possible to reason about correctness from basic
invariants. Each merge step can now run full graph verification.
This was also done to move toward an invariant that the graph is never
mutated during a query. But to finish that goal, we need to add a
use-point query. With that, there will be no node creation, use point
propagation, new defer edges, etc. after graph building. At the very
least, this will make it sane to debug the output of the analysis.
---
Here is a change-by-change description in diff order:
Replace `updatePointsTo` with `initializePointsTo` and
`mergePointsTo`. Merging is very simple on its own. Initialization
requires some extra consideration for graph invariants. This
separation makes it possible to write stong asserts and to
independently reason about the correctness of each step based on
static invariants.
Replace `getContentNode` with `createContentNode`, and add two higher
level APIs `createMergedContent`, and `getFieldContent`. This makes
explicit the important cases of merging nodes and creating a special
nodes for class fields. This slightly simplifies adding properties to
content nodes and helps understand the structure of the graph.
Factor out an `escapeContentsOf` helper for use elsewhere...
Add a `getValueContent` helper. This is where we can tie the
properties of content nodes to the address values that are used to
address that content. This now also ensures that a Value node's
value field is consistent with all SILValues that map to it.
Add -escapes-internal-verify to check that the graph is in a valid
state after every merge or update step. This verification drove the
partial rewrite of mergeAllScheduledNodes.
ConnectionGraph::defer implementation: explictly handle the three
possible cases of pointsTo initialization or pointsTo merging at the
top level, so that those underlying implementations do not need to
dynamically handle weirdly different scenarios.
ConnectionGraph::initializePointsTo implementation: this simplified
implementation is possible by relying on invariants that can be
checked at each merge/update step. The major functional difference is
that it avoids creating unnecessary pointsTo edges. The previous
implementation often created pointsTo edges when adding defer edges
just to be conservative. Fixing this saved my sanity during debugging
because the pointsTo edges now always correspond to a SIL operations
that dereference the pointer. I'm also arguing without evidence that
this should be much more efficient.
ConnectionGraph::mergeAllScheduledNodes implementation: Add
verification to each step so that we can prove the other utilities
that are used while merging aren't making incorrect assumptions about
the graph state. Remove checks for merged nodes now that the graph is
consistently valid. Also remove a loop at the end that didn't seem to
do anything. The diff is impossible to review, but the idea is
basically the same. As long as it's still possible to scan through the
steps in the new code without getting totally lost, then the goal was
achieved.
ConnectionGraph::mergePointsTo: This is extremely simple now. In all
the places where we used to call updatePointsTo, and now call
mergePointsTo, it's a lot easier for someone debugging the code to
reason about what could possibly happen at that point.
`createMergedContent` is a placeholder for transferring node properties.
The `getFieldContent` helper may seem silly, but I find it helpful to
see all the important ways that content can be created in one place
next to the createContentNode, and I like the way that the creation of
the special "field content" node is more explicit in the source.
ConnectionGraph::mergeFrom implementation: this is only a minor
cleanup to remove some control flow nesting and use the CGNodeWorklist
abstraction.
In AnalyzeInstruction, add EscapeAnalysis::getValueContent helper. It
eliminates an extra step of going through the value node to get at its
content node. This is where we can derive content node properties from
the SILValue that dereferences the content. We can update the content
node's associated value 'V' if it's useful. It's also a place to put
assertions specific to the first level of content.
In AnalyzeInstruction, Array semantic calls: add support for
getValueContent so we can derive node properties. This is also nice
because it's explicit about which nodes are value content vs. field
content.
In AnalyzeInstruction, cleanup Release handling: use the explicit
APIs: getValueContent, getFieldContent, and escapeContentsOf.
In AnalyzeInstruction, assert that load-like things can't produce addresses.
In AnalyzeInstruction, add comments to clarify object projection handling.
In AnalyzeInstruction, add comments to explain store handling.
In AnalyzeInstruction, drop the assumption that all partial applies hold pointers.
In AnalyzeInstruction, handle aggregates differently so that Value
nodes are always consistent with their SILValue and can be
verified. Aggregates nodes are still coalesced if they only have a
single pointer-type subelement. If we arbitrarily coalesced an
aggregate with just one of its subelements then there would be no
consistent way to identify the value that corresponds to a connection
graph node.
This is the first in a series of patches that reworks
EscapeAnalysis. For this patch, I extracted every change that does not
introduce new features, rewrite logic, or appear to change
functionality.
These cleanups were done in preparation for:
- adding a graph representation for reference counted objects
- rewriting parts to the query logic
- ...which then allows the analysis to safely assume that all
exclusive arguments are unique
- ...which then allows more aggressive optimization of local variables
that don't escape
There are two possible functional changes:
1. getUnderlyingAddressRoot in InstructionUtils now sees through OSSA
instructions: begin_borrow and copy_value
2. The getPointerBase helper in EscapeAnalysis now sees through all of
these reference and pointer casts:
+ case ValueKind::UncheckedRefCastInst:
+ case ValueKind::ConvertFunctionInst:
+ case ValueKind::UpcastInst:
+ case ValueKind::InitExistentialRefInst:
+ case ValueKind::OpenExistentialRefInst:
+ case ValueKind::RawPointerToRefInst:
+ case ValueKind::RefToRawPointerInst:
+ case ValueKind::RefToBridgeObjectInst:
+ case ValueKind::BridgeObjectToRefInst:
+ case ValueKind::UncheckedAddrCastInst:
+ case ValueKind::UnconditionalCheckedCastInst:
+ case ValueKind::RefTo##Name##Inst:
+ case ValueKind::Name##ToRefInst:
This coalesces a whole bunch of nodes together that were just there
because of casts. The existing code was already doing this for one
level of casts, but there was a bug that prevented it from happening
transitively. So, in theory, anything that breaks with this fix could
also break without this fix, but may not have been exposed. The fact
that this analysis coalesces address-to-reference casts at all is what
caused me to spent vast amounts of time debugging any time I tried to
force some structure on the graph via assertions. If it is done at
all, it should be done everywhere consistently to expose issues as
early as possible.
Here is a description of the changes in diff order. If something in
the diff is not listed here, then the code probably just moved around
in the file:
Rename isNotAliasedIndirectParameter to
isExclusiveIndirectParameter. The argument may be aliased in the
caller's scope and it's contents may have already escaped.
Add comments to SILType APIs (isTrivial, isReferenceCounted) that give
answers about the AST type which don't really make sense for address
SILTypes.
Add comments about CGNode's 'Value' field. I spent lots of time
attempting to tighten this down with asserts, but it's only possible
for non-content nodes. For content nodes, the node's value is highly
unpredictable and basically nonsense but needed for debugging.
Add comments about not assuming that the content nodes pointsTo edge
represents physical indirection. This matters when reasoning about
aliasing and it's a tempting assumption to make.
Add a CGNode::mergeProperties placeholder for adding node properties.
Factor out a CGNode::canAddDeferred helper for use later.
Rename `setPointsTo` to `setPointsToEdge` because it actually creates
an edge rather than just setting `pointsTo`.
Add CGNode::getValue() and related helpers to help maintain invariants.
Factor out a `markEscaping` helper.
Clean up the `escapesInsideFunction` helper.
Add node visitor helpers: visitSuccessors, visitDefers. This made is
much easier to prototype utilities.
Add comments to clarify the `pointsTo` invariant. In particular, an
entire defer web may have a null pointsTo for a while.
Add an `activeWorklist` to avoid nasty bugs when composing multiple
helpers that each use the worklist.
Remove the `EA` argument from `getNode`. I ended up needing access to
the `EA` context from the ConnectionGraph many times during
prototyping and passing `this` was all the `getNode` calls was very
silly.
Add graph visitor helpers: backwardTraverse, forwardTraverseDefer,
forwardTraversePointsToEdges, and mayReach for ease in developing new
logic and utilities.
Add isExclusiveArgument helper and distinguish exclusive arguments
from local objects. Confusing these properties could lead to scary
bugs. For example, unlike a local object, an exclusive argument's
contents may still escape even when the content's connection graph
node is non-escaping!
Add isUniquelyIdentified helper when we want to treat exclusive
arguments and local objects uniformly.
getUnderlyingAddressRoot now looks through OSSA instructions.
Rename `getAccessedMemory` to `getDirectlyAccessedMemory` with
comments. This is another dangerous API because it assumes the memory
access to a given address won't touch memory represented by different
graph nodes, but graph edges don't necessarily represent physical
indirection. Further clarify this issue in comments in
AliasAnalysis.cpp.
Factor out a 'findRecursiveRefType' helper from the old
'mayContainReference' for checking whether types may or must contain
references. Support both kinds of queries so the analysis can be
certain when a pointer's content is a physical heap object.
Factor out 'getPointerBase' and 'getPointerRoot' helpers that follow
address projections within what EscapeAnalysis will consider a single
node.
Create a CGNodeWorklist abstraction to safely formalize the worklist
mechanism that's used all over the place. In one place, there were
even two separate independent lists used independently (nodes added to
one list could appear to be in the other list).
The CGNodeMap abstraction did not significantly change, it was just moved.
Added 'dumpCG' for dumping .dot files making it possible to remote debug.
Added '-escapes-enable-graphwriter' option to dump .dot files, since
they are so much more useful than the textual dump of the connection
graph, which lacks node identity!
By convention, most structs and classes in the Swift compiler include a `dump()` method which prints debugging information. This method is meant to be called only from the debugger, but this means they’re often unused and may be eliminated from optimized binaries. On the other hand, some parts of the compiler call `dump()` methods directly despite them being intended as a pure debugging aid. clang supports attributes which can be used to avoid these problems, but they’re used very inconsistently across the compiler.
This commit adds `SWIFT_DEBUG_DUMP` and `SWIFT_DEBUG_DUMPER(<name>(<params>))` macros to declare `dump()` methods with the appropriate set of attributes and adopts this macro throughout the frontend. It does not pervasively adopt this macro in SILGen, SILOptimizer, or IRGen; these components use `dump()` methods in a different way where they’re frequently called from debugging code. Nor does it adopt it in runtime components like swiftRuntime and swiftReflection, because I’m a bit worried about size.
Despite the large number of files and lines affected, this change is NFC.
In such a case the overridden function gets a new separate vtable entry.
With this change, the computation of class method callees only uses the information in sil_vtables (instead of ClassDecl members).
Fixes a compiler crash in various optimization passes.
rdar://problem/56146633
In canEscapeToUsePoint only check the content node if it's a reference (see comment why this is needed).
For all other node types, especially addresses, handle defer edges by propagating use-point infomation backward in the graph.
This makes escape analysis more precise with address types, e.g. don't consider an inout address to escape to an apply if just the loaded value is passed to an apply argument.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.
New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h
Removed:
- Local.h
- Two conflicting CFG.h files
This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.
Move the control flow utilies out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.
Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.
Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
Previously, CallerAnalysis::FunctionInfo.foundAllCallers(), which is
documented to return true only when specialization of a function will
not require a thunk, returned true for functions which are possibly used
externally. Now, that member function only returns false for functions
which may be used externally since dead code elimination will not be
able to remove them.
Added getPartiallyPostDomReleaseSet to
ConsumedArgToEpilogueReleaseMatcher. Given an argument, the new method
returns the array of releases of the argument if there is an array
thereof and if the releases therein do not jointly post-dominate the
argument.
Excerpted from @gottesm's https://github.com/apple/swift/pull/16756.
- Replaced usage of raw map and vector with the type that wraps the
combination (BlotSetVector); that provided a significant deduplication
since a sizeable portion of the worklist's implementation was in
vector and map management now provided by the BlotSetVector.
- Templated over the type of map and vector used by the blot set vector.
- Added SmallSILInstructionWorklist where the map and vector are
specified to be SmallDenseMap and SmallVector respectively.
- Replaced usages of bare ValueBase with usages of SILValue.
- Renamed zap to resetChecked.
Added a bit of functionality to BlotSetVector, specifically to support
SILInstructionWorklist:
- Made insert return not just the index that the potentially-inserted
item is at on return but additionally whether an insertion occurred,
matching the behavior of llvm::DenseMap::insert.
- Added a method to reserve capacity in the backing vector and map:
BlotSetVector::reserve.
- Added a method to free extra storage used by the backing vector and
map: BlotSetVector::clear.
- Modified SmallBlotSetVector's template parameters so that only a value
and size can be specified--that type will always use SmallVector and
SmallDenseMap for its superclass' VectorT and MapT template
parameters.
- Updated variable names.
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances in the swift repo.
Describe the algorithm in the file-level doc comment. The basic
algorithm in the referenced paper is similar, but the most
interesting/important information is how it is adapted to SIL.
This analysis helper was inverting the result for builtins. Builtins
such as "copyMemory" were treated as never using a value.
This manifested in a crash in TestFoundation. NSDictionary's
initializer released the incoming array before copying it. This
crashed later during dictionary destruction.
The crash was hidden by a secondary bug in mayHaveSymmetricInterference
that effectively ignored the result from canNeverUseValue.
Rename the helper to canUseObject, invert the result for builtins, and
fix mayHaveSymmetricInterference to respect the result of
canUseObject.
Note that instructions that cannot access a referenced object
obviously cannot not "interfere" with a release.
Fixing these bugs now allows ARC optimization around dealloc_stack and
other operations that don't care about the reference count.
PointerUnion was generalized to be variadic. Replaced uses of
PointerUnion3 with PointerUnion
See also: git-svn-id:
https://llvm.org/svn/llvm-project/llvm/trunk@360962
91177308-0d34-0410-b5e6-96231b3b80d8
This will make the forthcoming CanonicalizeInstruction interface more
clear.
This is generally the better approach to utilities that mutate the
instruction stream. It avoids the temptation to assume that only a
single instruction will be deleted or that only instructions before
the current iterator will be deleted. This often happens to work but
eventually fails in the presense of debug and end-of-scope
instructions.
A function returning an iterator has a more clear contract than one
accepting some iterator reference of unknown
providence. Unfortunately, it doesn't work at the lowest level of
utilities, such as recursivelyDeleteTriviallyDeadInstructions, where
we want to handle instruction batches.
This is a large patch; I couldn't split it up further while still
keeping things working. There are four things being changed at
once here:
- Places that call SILType::isAddressOnly()/isLoadable() now call
the SILFunction overload and not the SILModule one.
- SILFunction's overloads of getTypeLowering() and getLoweredType()
now pass the function's resilience expansion down, instead of
hardcoding ResilienceExpansion::Minimal.
- Various other places with '// FIXME: Expansion' now use a better
resilience expansion.
- A few tests were updated to reflect SILGen's improved code
generation, and some new tests are added to cover more code paths
that previously were uncovered and only manifested themselves as
standard library build failures while I was working on this change.
Directly implement the data flow. Eliminate the extraneous work.
Remove cubic behavior. Do a single iteration of the data flow state at
each program point only performing the necessary set operations. At
unidentified access, clear the sets for simplicity and efficiency.
This cleanup results in significant functional changes:
- Allowing scopes to merge even if they are enclosed.
- Handling unidentified access conservatively.
Note that most of the added lines of code are comments.
Somehow this cleanup incidentally fixes:
<rdar://problem/48514339> swift compiler hangs building project
(I expected the subsequent loop summary cleanup to fix that problem.)
The previous design was customized to perfoming IPO with
GenericSideEffectAnalysis. Expose the underlying logic as a utility so
that AccessEnforcementOpts can use it to summarize loops (in addition
to call sites).
Improves SwiftSyntax release build speed by 4x.
Limit the size of the sets tracked by inter procedural
AccessedStorageAnalysis. There is a lot of leeway in this limit since
"normal" code doesn't come close to hitting it and SwiftSyntax compile
time isn't noticeably affected until 10x this limit.
This change also avoids reanalyzing function bodies once results have
bottomed out and avoids copying sets in the common case.
Fixes <rdar://problem/46905624> Release build time regression, a lot of time spent on AccessEnforcementOpts::run().
This is just a band aid. The fundamental problem is really:
<rdar://problem/47195282> Recomputing BottomUpIPAnalysis takes most of
the SwiftSyntax compile time.
SwiftSyntax compile time is still at least an order of magnitude
longer than it should be.