`getValue` -> `value`
`getValueOr` -> `value_or`
`hasValue` -> `has_value`
`map` -> `transform`
The old API will be deprecated in the rebranch.
To avoid merge conflicts, use the new API already in the main branch.
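A minimal sketch of call sites after the rename. It uses std::optional as a stand-in, on the assumption that the renamed methods mirror its interface. lookupName and the other function names are hypothetical and only serve the illustration. std::optional gains transform only in C++23, so that part is guarded by a feature-test macro.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper (not from the commit): only id 1 has a name.
std::optional<std::string> lookupName(int id) {
  if (id == 1)
    return std::string("entry");
  return std::nullopt;
}

// Returns the length of the name, or 0 if there is none.
std::size_t nameLength(int id) {
  std::optional<std::string> name = lookupName(id);
  if (!name.has_value())       // was: hasValue()
    return 0;
  return name.value().size();  // was: getValue()
}

// Returns the name, or a fallback string if there is none.
std::string nameOrDefault(int id) {
  return lookupName(id).value_or("<unknown>");  // was: getValueOr(...)
}

#if defined(__cpp_lib_optional) && __cpp_lib_optional >= 202110L
// transform (was: map) applies a function to the contained value, if any,
// and otherwise yields an empty optional.
std::optional<std::size_t> nameLengthOpt(int id) {
  return lookupName(id).transform(
      [](const std::string &s) { return s.size(); });
}
#endif
```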
rdar://102362022
Instead of caching alias results globally for the module, make AliasAnalysis a FunctionAnalysisBase which caches the alias results per function.
Why?
* So far the result caches could only grow. They were reset when they reached a certain size. This was not ideal. Now, they are invalidated whenever the function changes.
* It was not possible to actually invalidate an alias analysis result. This is required, for example, in TempRValueOpt and TempLValueOpt (so far it was done manually with invalidateInstruction).
* Type-based alias analysis results were also cached for the whole module, although they actually depend on the function, because they depend on the function's resilience expansion. This was a potential bug.
I also added a new PassManager API to directly get a function-based analysis:
getAnalysis(SILFunction *f)
The second change of this commit is the removal of the instruction-index indirection for the cache keys. Now the cache keys directly work on instruction pointers instead of instruction indices. This reduces the number of hash table lookups for a cache lookup from 3 to 1.
This indirection was needed to avoid dangling instruction pointers in the cache keys. But this is not needed anymore, because of the new delayed instruction deletion mechanism.
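The shape of the change can be sketched roughly as follows. This is a toy model, not the actual implementation: Function, Instruction, FunctionAliasCache, ToyPassManager and notifyFunctionChanged are hypothetical stand-ins, and the alias computation itself is a placeholder. It only illustrates a cache that lives per function, is keyed directly by instruction pointers, and is dropped when the function changes.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Toy stand-ins; the real SILFunction and SILInstruction are far richer.
struct Function {};
struct Instruction {};

enum class AliasResult { NoAlias, MayAlias, MustAlias };

// One cache per function, keyed directly by a pair of instruction pointers.
class FunctionAliasCache {
  using Key = std::pair<const Instruction *, const Instruction *>;
  struct KeyHash {
    std::size_t operator()(const Key &k) const {
      std::size_t h1 = std::hash<const Instruction *>()(k.first);
      std::size_t h2 = std::hash<const Instruction *>()(k.second);
      return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
  };
  std::unordered_map<Key, AliasResult, KeyHash> cache;

public:
  std::size_t numComputations = 0;  // counts cache misses, for illustration

  AliasResult alias(const Instruction *a, const Instruction *b) {
    Key key(a, b);
    auto it = cache.find(key);
    if (it != cache.end())
      return it->second;  // cache hit
    ++numComputations;
    // Placeholder for the real (expensive) alias computation.
    AliasResult result =
        (a == b) ? AliasResult::MustAlias : AliasResult::MayAlias;
    cache.emplace(key, result);
    return result;
  }

  // Called when the function changes: all cached results are dropped.
  void invalidate() { cache.clear(); }
  std::size_t size() const { return cache.size(); }
};

// Toy pass manager handing out the per-function analysis.
class ToyPassManager {
  std::unordered_map<const Function *, FunctionAliasCache> perFunction;

public:
  FunctionAliasCache &getAnalysis(const Function *f) { return perFunction[f]; }

  void notifyFunctionChanged(const Function *f) {
    auto it = perFunction.find(f);
    if (it != perFunction.end())
      it->second.invalidate();
  }
};
```

Keying by pointer is only safe in this sketch because nothing frees an Instruction while its pointer is still in the cache; in the compiler that guarantee comes from the delayed instruction deletion mentioned above.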
The client of this interface naturally expects to get back the
incoming phi value. Ignoring dominance and SIL ownership, the incoming
phi value and the block argument should be substitutable.
This method was actually returning the incoming operand for
checked_cast and switch_enum terminators, which is deeply misleading
and has been the source of bugs.
If the client wants to peek through casts and enums, it should do so
explicitly. getSingleTerminatorOperand[s]() will do just that.
This patch fixes a number of issues:
The analysis was using EpilogueARCContext as a temporary when computing. This is
a performance problem since EpilogueARCContext contains all of the memory used
in the analysis. So essentially, we were mallocing tons of memory every time we
missed the analysis cache. This patch changes the pass to instead have 1
EpilogueARCContext whose internal state is cleared in between invocations. Since
the data structures (see below) used after this patch do not shrink memory after
being cleared, this should cause us to have far less memory churn.
The analysis was managing its block state data structure by allocating the
individual block state structs using a BumpPtrAllocator/DenseMap stored in
EpilogueARCContext. The individual state structures were allocated from the
BumpPtrAllocator and the DenseMap then mapped a specific SILBasicBlock to its
State data structure. Ignoring that we were mallocing this memory every time we
computed rather than reusing global state, this pessimizes performance on small
functions significantly. This is because the BumpPtrAllocator by default
initially heap-allocates a page and DenseMap initially mallocs a 64-entry hash
table. Thus for a 1 block function, we would be allocating a large amount of
memory that is just unneeded.
Instead this patch changes the analysis to use a std::vector in combination with
PostOrderFunctionInfo to manage the per block state. The way this works is that
PostOrderFunctionInfo already contains a map from a SILBasicBlock to its post
order number. So, when we are allocating memory for each block, we visit the CFG
in post order. Thus we know that each block's state will be stored in the vector
at vector[post order number].
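The indexing scheme described above can be sketched as follows. This is a toy model under stated assumptions: ToyCFG, ToyPostOrderInfo, BlockState, ToyEpilogueContext and makeDiamondCFG are hypothetical names, blocks are plain indices, block 0 is the entry, and every block is assumed reachable from it. It is not the code in this patch.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG: blocks are indices, successors[i] lists the successors of block i.
struct ToyCFG {
  std::vector<std::vector<std::size_t>> successors;
};

// Stand-in for PostOrderFunctionInfo: maps each block to its post order number.
class ToyPostOrderInfo {
  std::vector<std::size_t> blocksInPostOrder;
  std::vector<std::size_t> numberOfBlock;

  void visit(const ToyCFG &cfg, std::size_t block, std::vector<bool> &seen) {
    seen[block] = true;
    for (std::size_t succ : cfg.successors[block])
      if (!seen[succ])
        visit(cfg, succ, seen);
    // A block is numbered only after all of its successors were visited.
    numberOfBlock[block] = blocksInPostOrder.size();
    blocksInPostOrder.push_back(block);
  }

public:
  explicit ToyPostOrderInfo(const ToyCFG &cfg)
      : numberOfBlock(cfg.successors.size(), 0) {
    std::vector<bool> seen(cfg.successors.size(), false);
    if (!cfg.successors.empty())
      visit(cfg, 0, seen);
  }
  const std::vector<std::size_t> &postOrder() const {
    return blocksInPostOrder;
  }
  std::size_t getPONumber(std::size_t block) const {
    return numberOfBlock[block];
  }
};

struct BlockState {
  std::size_t block;
  bool visited;
};

// One long-lived context whose vector is cleared and refilled per function.
class ToyEpilogueContext {
  std::vector<BlockState> states;

public:
  void init(const ToyPostOrderInfo &po) {
    states.clear();  // drops the elements but keeps the allocated capacity
    for (std::size_t block : po.postOrder())
      states.push_back(BlockState{block, false});
  }
  // The state of a block lives at states[post order number of the block].
  BlockState &getState(const ToyPostOrderInfo &po, std::size_t block) {
    return states[po.getPONumber(block)];
  }
  std::size_t capacity() const { return states.capacity(); }
};

// Diamond-shaped example CFG: 0 -> {1, 2}, 1 -> 3, 2 -> 3.
ToyCFG makeDiamondCFG() {
  ToyCFG cfg;
  cfg.successors.resize(4);
  cfg.successors[0].push_back(1);
  cfg.successors[0].push_back(2);
  cfg.successors[1].push_back(3);
  cfg.successors[2].push_back(3);
  return cfg;
}
```

Because the states are pushed while walking the blocks in post order, no separate block-to-state map is needed, and reinitializing the same context for another function of similar size reuses the existing allocation.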
This has a number of nice effects:
1. By eliminating the need for the DenseMap, in large test cases, we are
significantly reducing the memory overhead (by 24 bytes per basic block assuming
8 byte ptrs).
2. We will use far less memory when applying this analysis to small functions.
rdar://33841629
For a long time, we have:
1. Created methods on SILArgument that only work on either function arguments or
block arguments.
2. Created code paths in the compiler that only allow for "function"
SILArguments or "block" SILArguments.
This commit refactors SILArgument into two subclasses, SILPHIArgument and
SILFunctionArgument, and separates the function and block APIs onto the subclasses
(leaving the common APIs on SILArgument). It also goes through and changes all
places in the compiler that conditionalize on one of the forms of SILArgument to
just use the relevant subclass. This is made easier by the relevant APIs not
being on SILArgument anymore. If you take a quick look through you will see that
the API now expresses a lot more of its intention.
The reason why I am performing this refactoring now is that SILFunctionArguments
have a ValueOwnershipKind defined by the given function's signature. On the
other hand, SILBlockArguments have a stored ValueOwnershipKind. Rather than
store ValueOwnershipKind in both instances and in the function case have a dead
variable, I decided to just bite the bullet and fix this.
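A rough sketch of the resulting split, for illustration only. All names here (ToyArgument, ToyFunctionArgument, ToyPhiArgument, ToyFunctionSignature, OwnershipKind) are hypothetical stand-ins, the set of ownership kinds is simplified, and virtual dispatch is used for brevity; none of this is meant to show how the compiler actually implements its class hierarchy.

```cpp
#include <vector>

// Simplified stand-in for ValueOwnershipKind.
enum class OwnershipKind { Owned, Guaranteed, Unowned };

// Toy function signature: one ownership kind per parameter.
struct ToyFunctionSignature {
  std::vector<OwnershipKind> parameterOwnership;
};

// Common base: only the APIs that make sense for both kinds of argument.
class ToyArgument {
public:
  enum class Kind { Function, Phi };
  virtual ~ToyArgument() = default;

  Kind getKind() const { return kind; }
  unsigned getIndex() const { return index; }
  virtual OwnershipKind getOwnershipKind() const = 0;

protected:
  ToyArgument(Kind kind, unsigned index) : kind(kind), index(index) {}

private:
  Kind kind;
  unsigned index;
};

// Function argument: ownership is derived from the signature, nothing stored.
class ToyFunctionArgument : public ToyArgument {
  const ToyFunctionSignature &signature;

public:
  ToyFunctionArgument(const ToyFunctionSignature &signature, unsigned index)
      : ToyArgument(Kind::Function, index), signature(signature) {}

  OwnershipKind getOwnershipKind() const override {
    return signature.parameterOwnership[getIndex()];
  }
};

// Block (phi) argument: ownership is stored in the argument itself.
class ToyPhiArgument : public ToyArgument {
  OwnershipKind ownership;

public:
  ToyPhiArgument(OwnershipKind ownership, unsigned index)
      : ToyArgument(Kind::Phi, index), ownership(ownership) {}

  OwnershipKind getOwnershipKind() const override { return ownership; }
};
```

With this layout the function case carries no dead ownership field, and code that only handles one form of argument can take the subclass type directly instead of checking the kind at runtime.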
rdar://29671437
This was already done for getSuccessorBlocks() to distinguish getting successor
blocks from getting the full list of SILSuccessors via getSuccessors(). This
commit just makes all of the successor/predecessor code follow that naming
convention.
Some examples:
getSingleSuccessor() => getSingleSuccessorBlock().
isSuccessor() => isSuccessorBlock().
getPreds() => getPredecessorBlocks().
Really, IMO, we should consider renaming SILSuccessor to a more verbose name so
that it is clear that it is more of an internal detail of SILBasicBlock's
implementation rather than something that one should consider as a part of one's
mental model of the IR when one really wants to be thinking about predecessor
and successor blocks. But that is not what this commit is trying to change; it
is just trying to eliminate a bit of technical debt by making the naming
conventions here consistent.
It makes sense to turn the new epilogue retain/release matcher into an Analysis.
It is currently a data flow with an entry API point. This saves on compilation time,
even though it does not seem to be very expensive right now. But it is an iterative
data flow which could be expensive with large CFGs.
rdar://28178736