This is similar and yet different from epilogue release matcher. Particularly
how retain is found and when to bail. Therefore this is put into a different
class than ConsumedArgToEpilogueReleaseMatcher
This is currently a NFC other than some basic testing using the epilogue dumper.
When we have a single release that can be traced back to a SILArgument.
i.e. the released value is RC-identical to the SILArgument. we do not
need projection to check whether there are overlapping/uncovered releases
for the SILargument (which will result we exit the epilogue walking sequnece).
This brings the # of Total owned args -> guaranteed args from 118 and 149.
There are 23 owned args which we can not find epilogue releases for at this
point and many (if not most) is a result of partial_apply which rc-identity
nor projection can handle.
with this commit i see differences only in 2 benchmarks. baseline is without this
change. I am looking at the SuperChars regression.
StrToInt 9297 8402 -895 -9.6% 1.11x
SuperChars 676554 748375 +71821 +10.6% 0.90x (!)
optimization.
We get some improvements on the # of parameters converted to guanrateed
from owned on the stdlib.
before
======
103 sil-function-signature-opts - Total owned args -> guaranteed args
after
======
118 sil-function-signature-opts - Total owned args -> guaranteed args
I see the following improvements by running benchmarks with and without this
change. Only difference >=1.05X
ErrorHandling 8154 7497 -657 -8.1% 1.09x
LinkedList 9973 9529 -444 -4.5% 1.05x
ObjectAllocation 239 222 -17 -7.1% 1.08x
RC4 23167 21993 -1174 -5.1% 1.05x (!)
This is part of rdar://22380547
When we have all the epilogue releases. Make sure they cover all the non-trivial
parts of the base. Otherwise, treat as if we've found no releases for the base.
Currently. this is a NFC other than epilogue dumper. I will wire it up with
function signature with next commit.
This is part of rdar://22380547
This speeds and reduces memory consumption of test cases with large
CFGs. The specific test case that spawned this fix was a large function
with many dictionary assignments:
public func func_0(dictIn : [String : MyClass]) -> [String : MyClass] {
var dictOut : [String : MyClass] = [:]
dictOut["key5000"] = dictIn["key500"]
dictOut["key5010"] = dictIn["key501"]
dictOut["key5020"] = dictIn["key502"]
dictOut["key5030"] = dictIn["key503"]
dictOut["key5040"] = dictIn["key504"]
...
}
This continued for 10k - 20k values.
This commit reduces the compile time by 2.5x and reduces the amount of
memory allocated by ARC by 2.6x (the memory allocation number includes
memory that is subsequently freed).
rdar://24350646
top level driver . Move the top level driver of the pairing analysis into
ARCSequenceOpts and have ARCSequenceOpts use ARCMatchingSetBuilder directly.
This patch is the first in a series of patches that improve ARC compile
time performance by ensuring that ARC only visits the full CFG at most
one time.
Previously when ARC was split into an analysis and a pass, the split in
the codebase occurred at the boundary in between ARCSequenceOpts and
ARCPairingAnalysis. I used a callback to allow ARCSequenceOpts to inject
code into ARCPairingAnalysis.
Now that the analysis has been moved together with the pass this
unnecessarily complicates the code. More importantly though it creates
obstacles towards reducing compile time by visiting the CFG only once.
Specifically, we need to visit the full cfg once to gather interesting
instructions. Then when performing the actual dataflow analysis, we only
visit the interesting instructions. This causes an interesting problem
since retains/releases can have dependencies on each other implying that
I need to be able to update where various "interesting instructions" are
located after ARC moves it. The "interesting instruction" information is
stored at the pairing analysis level, but the moving/removal of
instructions is injected in via the callback.
By moving the top level driver part of ARCPairingAnalysis into
ARCSequenceOpts, we simplify the code by eliminating the dependency
injection callback and also make it easier to manage the cached CFG
state in the face of the ARC optimizer moving/removing retains/releases.
So instead of only being able to match %1 and release %1 in (1). we
can also match %1 with (release %2, and release%3, i.e. exploded release_value)
in (2).
(1)
foo(%1)
strong_release %1
(2)
foo(%1)
%2 = struct_extract %1, field_a
%3 = struct_extract %1, field_b
strong_release %2
strong_release %3
This will allow function signature to better move the release instructions to
the callers.
Currently, this is a NFC other than testing using the epilogue match dumper.
There's a group of methods in `DeclContext` with names that start with *is*,
such as `isClassOrClassExtensionContext()`. These names suggests a boolean
return value, while the methods actually return a type declaration. This
patch replaces the *is* prefix with *getAs* to better reflect their interface.
We don't use this information after this point and are already
invalidating this (and everything else for this function body) at the
end of the pass, so this is not necessary.
NFC.
Prior to splitting devirtualization out of inlining we needed this to
ensure that we would not go into an infinite loop in certain cases.
Now that these are split out and we no longer iterate within the inliner
over new opportunities, we only need to check for self-recursive
functions.
NFC.