The main changes are:
*) Rewrite everything in swift. So far, parts of memory-behavior analysis were already implemented in swift. Now everything is done in swift and lives in `AliasAnalysis.swift`. This is a big code simplification.
*) Support many more instructions in the memory-behavior analysis - especially OSSA instructions, like `begin_borrow`, `end_borrow`, `store_borrow`, `load_borrow`. The computation of end_borrow effects is now much more precise. Also, partial_apply is now handled more precisely.
*) Simplify and reduce type-based alias analysis (TBAA). The complexity of the old TBAA comes from old days where the language and SIL didn't have strict aliasing and exclusivity rules (e.g. for inout arguments). Now TBAA is only needed for code using unsafe pointers. The new TBAA handles this - and not more. Note that TBAA for classes is already done in `AccessBase.isDistinct`.
*) Handle aliasing in `begin_access [modify]` scopes. We already supported truly immutable scopes like `begin_access [read]` or `ref_element_addr [immutable]`. For `begin_access [modify]` we know that there are no other reads or writes to the access-address within the scope.
*) Don't cache memory-behavior results. It turned out that the hit-miss rate was pretty bad (~ 1:7). The overhead of the cache lookup took as long as recomputing the memory behavior.
Sometimes when building the SwiftCompilerSources with a host compiler, linking fails with unresolved symbols for DenseMap and unique_ptr destroys.
This looks like a problem with C++ interop: the compiler thinks that destructors for some Analysis classes are materialized in the SwiftCompilerSources, but they are not.
Explicitly defining those destructors fixes the problem.
This is code that I am fairly familiar with but it still took a day of
investigation to figure out how it is supposed to be used now in the
presence of bridging.
This primarily involved ruling out the possibity that the mid-level
Swift APIs could at some point call into the lower-level C++ APIs.
The biggest problem was that AliasAnalysis::getMemoryBehaviorOfInst()
was declared as a public interface, and it's name indicates that it
computes the memory behavior. But it is just a wrapper around a Swift
API and never actually calls into any of the C++ logic that is
responsible for computing memory behavior!
The `isEscaping` function is called a lot from ARCSequenceOpt and ReleaseHoisting.
To avoid quadratic complexity for large functions, limit the amount of work what the EscapeUtils are allowed to to.
This keeps the complexity linear.
The arbitrary limit is good enough for almost all functions.
It lets the EscapeUtils do several hundred up/down walks which is much more than needed in most cases.
Fixes a compiler hang
https://github.com/apple/swift/issues/63846
rdar://105795976
Instead of caching alias results globally for the module, make AliasAnalysis a FunctionAnalysisBase which caches the alias results per function.
Why?
* So far the result caches could only grow. They were reset when they reached a certain size. This was not ideal. Now, they are invalidated whenever the function changes.
* It was not possible to actually invalidate an alias analysis result. This is required, for example in TempRValueOpt and TempLValueOpt (so far it was done manually with invalidateInstruction).
* Type based alias analysis results were also cached for the whole module, while it is actually dependent on the function, because it depends on the function's resilience expansion. This was a potential bug.
I also added a new PassManager API to directly get a function-base analysis:
getAnalysis(SILFunction *f)
The second change of this commit is the removal of the instruction-index indirection for the cache keys. Now the cache keys directly work on instruction pointers instead of instruction indices. This reduces the number of hash table lookups for a cache lookup from 3 to 1.
This indirection was needed to avoid dangling instruction pointers in the cache keys. But this is not needed anymore, because of the new delayed instruction deletion mechanism.
If the "regular" alias analysis thinks that an instruction may write to an address, check if the instruction is in an immutable scope of V.
That means that even if we don't know anything about the instruction (e.g. a call to an unknown function), we can be sure that it cannot write to the address.
An immutable scope is for example a read-only begin_access/end_access scope.
Another example is a borrow scope of an immutable copy-on-write buffer, for example:
%b = begin_borrow %array_buffer
%addr = ref_element_addr [immutable] %b : $BufferType, #BufferType.someField
This removes the ambiguity when casting from a SingleValueInstruction to SILNode, which makes the code simpler. E.g. the "isRepresentativeSILNode" logic is not needed anymore.
Also, it reduces the size of the most used instruction class - SingleValueInstruction - by one pointer.
Conceptually, SILInstruction is still a SILNode. But implementation-wise SILNode is not a base class of SILInstruction anymore.
Only the two sub-classes of SILInstruction - SingleValueInstruction and NonSingleValueInstruction - inherit from SILNode. SingleValueInstruction's SILNode is embedded into a ValueBase and its relative offset in the class is the same as in NonSingleValueInstruction (see SILNodeOffsetChecker).
This makes it possible to cast from a SILInstruction to a SILNode without knowing which SILInstruction sub-class it is.
Casting to SILNode cannot be done implicitly, but only with an LLVM `cast` or with SILInstruction::asSILNode(). But this is a rare case anyway.
This removes the ambiguity when casting from a SingleValueInstruction to SILNode, which makes the code simpler. E.g. the "isRepresentativeSILNode" logic is not needed anymore.
Also, it reduces the size of the most used instruction class - SingleValueInstruction - by one pointer.
Conceptually, SILInstruction is still a SILNode. But implementation-wise SILNode is not a base class of SILInstruction anymore.
Only the two sub-classes of SILInstruction - SingleValueInstruction and NonSingleValueInstruction - inherit from SILNode. SingleValueInstruction's SILNode is embedded into a ValueBase and its relative offset in the class is the same as in NonSingleValueInstruction (see SILNodeOffsetChecker).
This makes it possible to cast from a SILInstruction to a SILNode without knowing which SILInstruction sub-class it is.
Casting to SILNode cannot be done implicitly, but only with an LLVM `cast` or with SILInstruction::asSILNode(). But this is a rare case anyway.
When a non-value instruction (e.g. a destroy_addr) was deleted, the corresponding cache entry in MemoryBehaviorCache was not invalidated.
If a new instruction was allocated at the same memory location, the old - and invalid - cache entry was re-used.
This bug triggered a SIL memory lifetime failure in TempRValueElimination.
https://bugs.swift.org/browse/SR-13985
rdar://problem/72614608
This change also fixes another problem (which I found by inspection): Individual result values of MultipleValueInstructions were not invalidated correctly.
It's important that fundamental APIs don't lie to their users.
Make it clear that this API always returns true for deinitialization,
even if we could for example analyze the destructor and determine that
there aren't any actual writes!
When instructions are changed within a pass in a way that affects subsequent alias queries in the same pass run,
their alias analysis information must be invalidated.
Otherwise it can result in miscompiles and/or invalid SIL.
rdar://71924430
This is in prepration for other bug fixes.
Clarify the SIL utilities that return canonical address values for
formal access given the address used by some memory operation:
- stripAccessMarkers
- getAddressAccess
- getAccessedAddress
These are closely related to the code in MemAccessUtils.
Make sure passes use these utilities consistently so that
optimizations aren't defeated by normal variations in SIL patterns.
Create an isLetAddress() utility alongside these basic utilities to
make sure it is used consistently with the address corresponding to
formal access. When this query is used inconsistently, it defeats
optimization. It can also cause correctness bugs because some
optimizations assume that 'let' initialization is only performed on a
unique address value.
Functional changes to Memory Behavior:
- An instruction with side effects now conservatively still has side
effects even when the queried value is a 'let'. Let values are
certainly sensitive to side effects, such as the parent object being
deallocated.
- Return the correct MemBehavior for begin/end_access markers.
Use the new EscapeAnalysis infrastructure to make ARC code motion and
ARC sequence opts much more powerful and fix a latent bug in
AliasAnalysis.
Adds a new API `EscapeAnalysis::mayReleaseContent()`. This replaces
all uses if `EscapeAnalysis::canEscapeValueTo()`, which affects
`AliasAnalysis::can[Apply|Builtin]DecrementRefCount()`.
Also rewrite `AliasAnalysis::mayValueReleaseInterferWithInstruction` to
directly use `EscapeAnalysis::mayReleaseContent`.
The new implementation in `EscapeAnalysis::mayReleaseContent()`
generalizes the logic to handle more cases while avoiding an incorrect
assumption in the prior code. In particular, it adds support for
disambiguating local references from accessed addresses. This helps
handle cases in which inlining was defeating ARC optimization. The
incorrect assumption was that a non-escaping address is never
reachable via a reference. However, if a reference does not escape,
then an address into its object also does not escape.
The bug in `AliasAnalysis::mayValueReleaseInterfereWithInstruction()`
appears not to have broken anything yet because it is always called by
`AliasAnalysis::mayHaveSymmetricInteference()`, which later checks
whether the accessed address may alias with the released reference
using a separate query, `EscapeAnalysis::canPointToSameMemory()`. This
happens to work because an address into memory that is directly
released when destroying a reference necesasarilly points to the same
memory object. For this reason, I couldn't figure out a simple way to
hand-code SIL tests to expose this bug.
The changes in diff order:
Replace EscapeAnalysis `canEscapeToValue` with `mayReleaseContent` to
make the semantics clear. It queries: "Can the given reference release
the content pointed to the given address".
Change `AliasAnalysis::canApplyDecrementRefCount` to use
`mayReleaseContent` instead if 'canEscapeToValue'.
Change `AliasAnalysis::mayValueReleaseInterferWithInstruction`: after
getting the memory address accessed by the instruction, simply call
`EscapeAnalysis::mayReleaseContent`, which now implements all the
logic. This avoids the bad assumption made by AliasAnalysis.
Handle two cases in mayReleaseContent: non-escaping instruction
addresses and non-escaping referenecs. Fix the non-escaping address
case by following all content nodes to determine whether the address
is reachable from the released reference. Introduce a new optimization
for the case in which the reference being released is allocated
locally.
The following test case is now optimized in arcsequenceopts.sil:
remove_as_local_object_indirectly_escapes_to_callee. It was trying to
test that ARC optimization was not too aggressive when it removed a
retain/release of a child object whose parent container is still in
use. But the retain/release should be removed. The original example
already over-releases the parent object.
Add new unit tests to late_release_hoisting.sil.
This is a large patch; I couldn't split it up further while still
keeping things working. There are four things being changed at
once here:
- Places that call SILType::isAddressOnly()/isLoadable() now call
the SILFunction overload and not the SILModule one.
- SILFunction's overloads of getTypeLowering() and getLoweredType()
now pass the function's resilience expansion down, instead of
hardcoding ResilienceExpansion::Minimal.
- Various other places with '// FIXME: Expansion' now use a better
resilience expansion.
- A few tests were updated to reflect SILGen's improved code
generation, and some new tests are added to cover more code paths
that previously were uncovered and only manifested themselves as
standard library build failures while I was working on this change.
I believe that these were in SILInstruction for historic reasons. This is a
separate API on top of SILInstruction so it makes sense to pull it out into its
own header.
This name makes it clear that the function has not yet been deleted and also
contrasts with the past tense used in the API notifyAddedOrModifiedFunction to
show that said function has already added/modified the function.
The name notifyAddFunction is actively harmful since the pass manager uses this
entrypoint to notify analyses of added *OR* modified functions. It is up to the
caller analysis to distinguish in between these cases.
I am not vouching for the design, just trying to make names match the
current behavior.
Generally in the SIL/SILOptimizer libraries we have been putting kinds in the
swift namespace, not a nested scope in a type in swift (see ValueKind as an
example of this).
introduce a common superclass, SILNode.
This is in preparation for allowing instructions to have multiple
results. It is also a somewhat more elegant representation for
instructions that have zero results. Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction. Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.
A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.
Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
There are now separate functions for function addition and deletion instead of InvalidationKind::Function.
Also, there is a new function for witness/vtable invalidations.
rdar://problem/29311657
We now consider effect of deinit in addition to the released value.
rdar://25362826
This is the only 10%+ regression i measured on my machine. no performance improvement.
Sim2DArray | 326 | 366 | +12.3% | **0.89x**
As there are no instructions left which produce multiple result values, this is a NFC regarding the generated SIL and generated code.
Although this commit is large, most changes are straightforward adoptions to the changes in the ValueBase and SILValue classes.
If we use a shared valueenumerator, imagine the case when one of the AAcache or MBcache
is cleared and we clear the valueenumerator.
This could give rise to collisions (false positives) in the not-yet-cleared cache!
(Headers first)
It has been generally agreed that we need to do this reorg, and now
seems like the perfect time. Some major pass reorganization is in the
works.
This does not have to be the final word on the matter. The consensus
among those working on the code is that it's much better than what we
had and a better starting point for future bike shedding.
Note that the previous organization was designed to allow separate
analysis and optimization libraries. It turns out this is an
artificial distinction and not an important goal.