Although I don't plan to bring over new assertions wholesale
into the current qualification branch, it's entirely possible
that various minor changes in main will use the new assertions;
having this basic support in the release branch will simplify that.
(This is why I'm adding the includes as a separate pass from
rewriting the individual assertions.)
`getValue` -> `value`
`getValueOr` -> `value_or`
`hasValue` -> `has_value`
`map` -> `transform`
The old API will be deprecated in the rebranch.
To avoid merge conflicts, use the new API already in the main branch.
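The new names match the member names of std::optional. A minimal sketch of the renamed calls, assuming C++17 std::optional; `parseDigit`, `mapOptional`, `digitOrZero` and `doubledDigit` are hypothetical helpers made up for this illustration. std::optional::transform only exists from C++23 on, so the mapping step is spelled by hand here:

```cpp
#include <cassert>
#include <optional>

// Hypothetical lookup that may fail; stands in for any API returning an optional.
std::optional<int> parseDigit(char c) {
  if (c < '0' || c > '9')
    return std::nullopt;
  return c - '0';
}

// Hand-written equivalent of `transform` (formerly `map`) for pre-C++23 compilers.
template <typename T, typename Fn>
auto mapOptional(const std::optional<T> &opt, Fn fn)
    -> std::optional<decltype(fn(*opt))> {
  if (!opt.has_value()) // was: hasValue()
    return std::nullopt;
  return fn(*opt);
}

int digitOrZero(char c) {
  return parseDigit(c).value_or(0); // was: getValueOr(0)
}

int doubledDigit(char c) {
  auto doubled = mapOptional(parseDigit(c), [](int d) { return d * 2; });
  return doubled.has_value() ? doubled.value() : -1; // was: getValue()
}
```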
rdar://102362022
This removes the ambiguity when casting from a SingleValueInstruction to SILNode, which makes the code simpler. E.g. the "isRepresentativeSILNode" logic is not needed anymore.
Also, it reduces the size of the most used instruction class - SingleValueInstruction - by one pointer.
Conceptually, SILInstruction is still a SILNode. But implementation-wise SILNode is not a base class of SILInstruction anymore.
Only the two sub-classes of SILInstruction - SingleValueInstruction and NonSingleValueInstruction - inherit from SILNode. SingleValueInstruction's SILNode is embedded into a ValueBase and its relative offset in the class is the same as in NonSingleValueInstruction (see SILNodeOffsetChecker).
This makes it possible to cast from a SILInstruction to a SILNode without knowing which SILInstruction sub-class it is.
Casting to SILNode cannot be done implicitly, but only with an LLVM `cast` or with SILInstruction::asSILNode(). But this is a rare case anyway.
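A rough sketch of the resulting class shape, using simplified stand-in types rather than the real SIL classes. All names below are illustrative assumptions. The real code relies on the node sitting at the same offset in both sub-classes; this portable sketch dispatches on a flag instead, so it only shows why the conversion is unambiguous and has to be explicit:

```cpp
#include <cassert>

// Simplified stand-ins; none of these are the real SIL classes.
struct Node {
  int id = 0;
};

struct Instruction { // deliberately NOT derived from Node
  bool isSingleValue;
  explicit Instruction(bool sv) : isSingleValue(sv) {}
  Node *asNode(); // explicit conversion, defined below
};

struct ValueBase : Node {}; // a value embeds the Node

struct SingleValueInst : Instruction, ValueBase {
  SingleValueInst() : Instruction(true) {}
};

struct NonSingleValueInst : Instruction, Node {
  NonSingleValueInst() : Instruction(false) {}
};

// Each sub-class has exactly one Node base, so both conversions are
// unambiguous. The flag dispatch replaces the equal-offset trick.
Node *Instruction::asNode() {
  if (isSingleValue)
    return static_cast<SingleValueInst *>(this);
  return static_cast<NonSingleValueInst *>(this);
}
```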
While doing bottom up dataflow, if we encounter an
unmatched retain instruction, that can pair with a 'KnownSafe'
already visited release instruction, we turn off KnownSafety if the two
RCIdentities mayAlias.
This is done in BottomUpRefCountState::checkAndResetKnownSafety.
In order to determine if a retain is unmatched, we look at
IncToDecStateMap. If a retain was matched during bottom up dataflow, it
is always found in IncToDecStateMap with value of the matched release's
BottomUpRefCountState.
Similarly, during top down dataflow, if we encounter an unmatched
release instruction, that can pair with a 'KnownSafe' already
visited retain instruction, we turn off KnownSafety if the two RCIdentities
mayAlias.
This is done in TopDownRefCountState::checkAndResetKnownSafety.
In order to determine if a release is unmatched, we look at
DecToIncStateMap. If a release was matched during top down dataflow, it
is always found in DecToIncStateMap with value of the matched retain's
TopDownRefCountState.
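The bottom up half of this check can be sketched roughly as follows; the top down half is symmetric. This is a toy model, not the actual implementation: instructions and RC identities are plain integers, `mayAlias` is a made-up placeholder for the real alias query, and the signature of `checkAndResetKnownSafety` is an assumption.

```cpp
#include <cassert>
#include <map>

using InstId = int; // hypothetical instruction handle
using RootId = int; // hypothetical RC identity root

struct BottomUpState {
  RootId root;    // RC identity tracked by an already visited release
  bool knownSafe; // whether that release is currently considered KnownSafe
};

// Toy alias query: roots may alias when equal or when either is unknown (< 0).
bool mayAlias(RootId a, RootId b) { return a == b || a < 0 || b < 0; }

// Returns true if KnownSafe was cleared on `state`.
bool checkAndResetKnownSafety(
    InstId retain, RootId retainRoot, BottomUpState &state,
    const std::map<InstId, BottomUpState> &incToDecStateMap) {
  // A retain matched during the bottom up dataflow is always in the map.
  bool matched = incToDecStateMap.count(retain) != 0;
  if (matched || !state.knownSafe || !mayAlias(retainRoot, state.root))
    return false;
  state.knownSafe = false;
  return true;
}
```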
For ARCLoopOpts, during bottom up and top down traversal of a region with
a nested loop, we find if the retain/release in the loop summary was
matched or not by looking at the persistent RefCountInstToMatched map.
This map is populated when processing the nested loop region from the
IncToDecStateMap/DecToIncStateMap, which get thrown away after the loop
region is processed.
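One way to picture the persistent map, as a sketch under assumptions: `recordLoopSummary` and `wasMatched` are hypothetical helper names, instructions are plain integers, and the per-region map is reduced to a retain-to-release table.

```cpp
#include <cassert>
#include <map>
#include <set>

using InstId = int; // hypothetical instruction handle

// Persists across regions: retain/release -> was it matched?
using RefCountInstToMatchedMap = std::map<InstId, bool>;

// Record match results for the instructions summarized by a nested loop
// before the per-region map is discarded.
void recordLoopSummary(const std::set<InstId> &summarizedInsts,
                       const std::map<InstId, InstId> &incToDec, // temporary
                       RefCountInstToMatchedMap &persistent) {
  for (InstId inst : summarizedInsts)
    persistent[inst] = incToDec.count(inst) != 0;
}

// Query used later, while traversing the enclosing region.
bool wasMatched(const RefCountInstToMatchedMap &persistent, InstId inst) {
  auto it = persistent.find(inst);
  return it != persistent.end() && it->second;
}
```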
This fixes the bugs in both ARCSequenceOpts without loop
support and with loop support.
* Remove NewInsts from ARCSequenceOpts
* Remove more instances of InsertPts
* Address comments from #33504
* Make bottom up loop traversal simpler. Use better apis
* Update LoopRegion printer with more info
Historically TermInsts were handled specially while visiting blocks in
bottom up order. TermInsts were not visited while traversing a block.
They were visited while traversing successors, and the most conservative
terminator of all predecessors would affect the refcount state in the
dataflow.
This was needed because ARCSequenceOpts also computed 'insertion points'
for the purposes of ARC code motion. ARC code motion was then removed
from ARCSequenceOpts and this code remained unchanged.
With this change, ARC-significant terminators are handled like all other
instructions while processing basic blocks bottom up.
Also, updateForSameLoopInst, updateForDifferentLoopInst, and
updateForPredTerminators all serve a similar purpose with subtle differences.
This change removes some of the resulting code duplication.
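As a toy illustration of the simplified traversal (the types and `visitBottomUp` are invented for this sketch and do not mirror the real visitor): once the terminator is treated as the last instruction of its block, a plain reverse walk reaches it first with no special case.

```cpp
#include <cassert>
#include <vector>

enum class Effect { None, Use, Decrement };

// Hypothetical instruction record; the last one in a block is its terminator.
struct Inst {
  Effect effect;
};

// Walk the block bottom up and collect the ARC-relevant effects in the
// order they are encountered. The terminator needs no separate handling.
std::vector<Effect> visitBottomUp(const std::vector<Inst> &block) {
  std::vector<Effect> seen;
  for (auto it = block.rbegin(); it != block.rend(); ++it)
    if (it->effect != Effect::None)
      seen.push_back(it->effect);
  return seen;
}
```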
Remove code duplication of isARCSignificantTerminator. Create
ARCSequenceOptUtils for common utils used in ARCSequenceOpts.
This was already done for getSuccessorBlocks() to distinguish getting successor
blocks from getting the full list of SILSuccessors via getSuccessors(). This
commit just makes all of the successor/predecessor code follow that naming
convention.
Some examples:
getSingleSuccessor() => getSingleSuccessorBlock().
isSuccessor() => isSuccessorBlock().
getPreds() => getPredecessorBlocks().
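The distinction the names encode can be sketched like this, using simplified stand-ins for SILBasicBlock and SILSuccessor. The method names follow the convention above, but the bodies are assumptions made for illustration:

```cpp
#include <cassert>
#include <vector>

struct Block;

// Edge object, analogous in spirit to SILSuccessor (heavily simplified).
struct Successor {
  Block *target;
};

struct Block {
  std::vector<Successor> succs;

  // The full list of edge objects.
  const std::vector<Successor> &getSuccessors() const { return succs; }

  // Only the blocks the edges point at.
  std::vector<Block *> getSuccessorBlocks() const {
    std::vector<Block *> out;
    for (const Successor &s : succs)
      out.push_back(s.target);
    return out;
  }

  // The only successor block, or nullptr if there is not exactly one.
  Block *getSingleSuccessorBlock() const {
    return succs.size() == 1 ? succs.front().target : nullptr;
  }

  bool isSuccessorBlock(const Block *b) const {
    for (const Successor &s : succs)
      if (s.target == b)
        return true;
    return false;
  }
};
```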
Really, IMO, we should consider renaming SILSuccessor to a more verbose name so
that it is clear that it is more of an internal detail of SILBasicBlock's
implementation rather than something that one should consider as a part of one's
mental model of the IR when one really wants to be thinking about predecessor
and successor blocks. But that is not what this commit is trying to change, it
is just trying to eliminate a bit of technical debt by making the naming
conventions here consistent.
Before this commit all code relating to handling arguments in SILBasicBlock had
somewhere in the name BB. This is redundant given that the class's name is
already SILBasicBlock. This commit drops those names.
Some examples:
getBBArg() => getArgument()
BBArgList => ArgumentList
bbarg_begin() => args_begin()
I see some small performance improvements on a few benchmarks, but they
are likely to be due to noise.
The compilation pipeline is very epilogue release friendly at the moment, i.e.
we do not move the epilogue release of a function until very late in the pipeline.
Therefore, this global data flow is somewhat overkill for now. I am going to change
the pass pipeline next so that we can move epilogue releases freely and the data
flow will become useful.
I do not see a compile time increase.
rdar://26446587
As promised, we separate the duty of moving retain/release pairs from the
task of removing them. The task of moving retains and releases is now in
Retain Release Code Motion, committed in 51b1c0bc68.
This speeds up compilation and reduces memory consumption for test cases
with large CFGs. The specific test case that spawned this fix was a large
function with many dictionary assignments:
    public func func_0(dictIn : [String : MyClass]) -> [String : MyClass] {
      var dictOut : [String : MyClass] = [:]
      dictOut["key5000"] = dictIn["key500"]
      dictOut["key5010"] = dictIn["key501"]
      dictOut["key5020"] = dictIn["key502"]
      dictOut["key5030"] = dictIn["key503"]
      dictOut["key5040"] = dictIn["key504"]
      ...
    }
This continued for 10k - 20k values.
This commit reduces the compile time by 2.5x and reduces the amount of
memory allocated by ARC by 2.6x (the memory allocation number includes
memory that is subsequently freed).
rdar://24350646
In all of the cases where this is being used, we already immediately perform an
unreachable if we find a TermKind::Invalid. So simplify the code and move it
into the conversion switch itself.
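A small sketch of the pattern, with invented enums (`InstKind` and `toTermKind` are hypothetical names) and std::abort standing in for llvm_unreachable: the conversion itself traps on a non-terminator kind, so callers no longer need an Invalid case to check.

```cpp
#include <cassert>
#include <cstdlib>

enum class InstKind { Add, Branch, Return }; // hypothetical instruction kinds
enum class TermKind { Branch, Return };      // no Invalid member needed

// Convert an instruction kind to a terminator kind. A non-terminator kind is
// a programming error and traps here instead of at every call site.
TermKind toTermKind(InstKind kind) {
  switch (kind) {
  case InstKind::Branch:
    return TermKind::Branch;
  case InstKind::Return:
    return TermKind::Return;
  case InstKind::Add:
    break;
  }
  std::abort(); // stand-in for llvm_unreachable("not a terminator")
}
```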
Previously, we relied on a quirk in the ARC optimizer so that we only
need to visit terminators top down. This simplified the dataflow. Sadly,
try_apply changes this since it is a terminator that provides a call
with the value, causing this assumption to break program correctness.
Now during the bottom up traversal, while performing the dataflow for a
block B, we (after visiting all instructions), visit B's predecessors to
see if any of them have a terminator that is a use or decrement. We then
take the most conservative result among all of the terminators and
advance the sequence accordingly.
I do not think that we can have multiple such predecessors today, since no
interesting terminator can have critical edges to its successors. Thus if
our block is a successor of any such block, it cannot have any other
predecessors. Taking the most conservative result is mainly future proofing
in case that restriction is ever lifted.
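The "most conservative terminator among the predecessors" step could look roughly like this. It is a sketch with invented types; treating a decrement as more conservative than a use is an assumption of the sketch, encoded in the order of the enum values:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Ordered from least to most conservative (assumed ordering).
enum class TermEffect { None = 0, Use = 1, Decrement = 2 };

// Hypothetical block: only the effect of its terminator and its predecessors.
struct Block {
  TermEffect terminatorEffect = TermEffect::None;
  std::vector<const Block *> preds;
};

// After visiting all instructions of `b` bottom up, look at the terminators
// of its predecessors and keep the most conservative effect.
TermEffect mostConservativePredTerminator(const Block &b) {
  TermEffect result = TermEffect::None;
  for (const Block *pred : b.preds)
    result = std::max(result, pred->terminatorEffect);
  return result;
}
```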
rdar://23853221
SR-102
(libraries now)
It has been generally agreed that we need to do this reorg, and now
seems like the perfect time. Some major pass reorganization is in the
works.
This does not have to be the final word on the matter. The consensus
among those working on the code is that it's much better than what we
had and a better starting point for future bike shedding.
Note that the previous organization was designed to allow separate
analysis and optimization libraries. It turns out this is an
artificial distinction and not an important goal.