Till now there was no way in SIL to explicitly express a dependency of an instruction on any opened archetypes used by it. This was a cause of many errors and correctness issues. In many cases the code was moved around without taking into account these dependencies, which resulted in breaking the invariant that any uses of an opened archetype should be dominated by the definition of this archetype.
This patch does the following:
- Map opened archetypes to the instructions defining them, i.e. to open_existential instructions.
- Introduce a helper class SILOpenedArchetypesTracker for creating and maintaining such mappings.
- Introduce a helper class SILOpenedArchetypesState for providing a read-only API for looking up available opened archetypes.
- Each SIL instruction which uses an opened archetype as a type gets an additional opened archetype operand representing a dependency of the instruction on this archetype. These opened archetypes operands are an in-memory representation. They are not serialized. Instead, they are re-constructed when reading binary or textual SIL files.
- SILVerifier was extended to conduct more thorough checks related to the usage of opened archetypes.
Each time we delete the pass manager, we delete the analyses it
holds. In this analysis we were holding onto memory that wasn't getting
released when the analysis was deleted.
Fixes rdar://problem/26245872.
replaced by retain release code motion. This code has been disabled for sometime now.
This should bring the retain release code motion into a close. The retain release
code motion pipeline looks like this. There could be some minor cleanups after this though.
1. We perform a global data flow for retain release code motion in RRCM (RetainReleaseCodeMotion)
2. We perform a local form of retain release code motion in SILCodeMotion. This is more
for cases which can not be handled in RRCM. e.g. sinking into a switch is more efficiently
done in a local form, the retain is not needed on the None block. Release on SILArgument needs
to be split to incoming values, this can not be done in RRCM and other cases.
3. We do not perform code motion in ASO, only elimination which are very important.
Some modifications to test cases, they look different, but functionally the same.
RRCM has this canonicalization effect, i.e. it uses the rc root, instead of
the SSA value the retain/release is currently using. As a result some test cases need
to be modified.
I also removed some test cases that do not make sense anymore and lot of duplicate test
cases between earlycodemotion.sil and latecodemotion.sil. These tests cases only have retains
and should be used to test early code motion.
Several functionalities have been added to FSO over time and the logic has become
muddled.
We were always looking at a static image of the SIL and try to reason about what kind of
function signature related optimizations we can do.
This can easily lead to muddled logic. e.g. we need to consider 2 different function
signature optimizations together instead of independently.
Split 1 single function to do all sorts of different analyses in FSO into several
small transformations, each of which does a specific job. After every analysis, we produce
a new function and eventually we collapse all intermediate thunks to in a single thunk.
With this change, it will be easier to implement function signature optimization as now
we can do them independently now.
Small modifications to the test cases.
Shaves about 19% of the time from the construction of these sets. The
SmallVector size was chosen to minimize the number of dynamic
allocations we end up doing while building the stdlib. This should be a
reasonable size for most projects, too. It's a bit wasteful in space,
but the total amount of allocated space here is pretty small to begin
with.
For details see the comment in ConditionForwarding.cpp.
This optimization pass helps to optimize loops iterating over closed ranges, e.g. for i in 0...n { }
Several functionalities have been added to FSO over time and the logic has become
muddled.
We were always looking at a static image of the SIL and try to reason about what kind of
function signature related optimizations we can do.
This can easily lead to muddled logic. e.g. we need to consider 2 different function
signature optimizations together instead of independently.
Split 1 single function to do all sorts of different analyses in FSO into several
small transformations, each of which does a specific job. After every analysis, we produce
a new function and eventually we collapse all intermediate thunks to in a single thunk.
With this change, it will be easier to implement function signature optimization as now
we can do them independently now.
Minimal modifications to the test cases.
We now consider effect of deinit in addition to the released value.
rdar://25362826
This is the only 10%+ regression i measured on my machine. no performance improvement.
Sim2DArray | 326 | 366 | +12.3% | **0.89x**
Change the optimizer to only make specializations [fragile] if both the
original callee is [fragile] *and* the caller is [fragile].
Otherwise, the specialized callee might be [fragile] even if it is never
called from a [fragile] function, which inhibits the optimizer from
devirtualizing calls inside the specialization.
This opens up some missed optimization opportunities in the performance
inliner and devirtualization, which currently reject fragile->non-fragile
references:
TEST | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
--- | --- | --- | --- | ---
DictionaryRemoveOfObjects | 38391 | 35859 | -6.6% | **1.07x**
Hanoi | 5853 | 5288 | -9.7% | **1.11x**
Phonebook | 18287 | 14988 | -18.0% | **1.22x**
SetExclusiveOr_OfObjects | 20001 | 15906 | -20.5% | **1.26x**
SetUnion_OfObjects | 16490 | 12370 | -25.0% | **1.33x**
Right now, passes other than performance inlining and devirtualization
of class methods are not checking invariants on [fragile] functions
at all, which was incorrect; as part of the work on building the
standard library with -enable-resilience, I added these checks, which
regressed performance with resilience disabled. This patch makes up for
these regressions.
Furthermore, once SIL type lowering is aware of resilience, this will
allow the stack promotion pass to make further optimizations after
specializing [fragile] callees.
We ended up adding the same instruction twice to a SmallVector of
instructions to be deleted. To avoid this, we'll track these
to-be-deleted instructions in a SmallSetVector instead.
We were also failing to add an instruction that we can delete to the set
of instructions to be deleted, so I fixed that as well.
I've added a test case, but it's currently disabled because fixing this
turned up another issue in the same code which I still need to take a
look at.
Fixes rdar://problem/25369617.
We can remove the retain/release pair preceeding the builtins based on the
knowledge that the lifetime of the reference is guaranteed by someone hanging on
to the reference elsewhere.
Eventually, we decided to do this
1. Have the function signature opts (used to be called the cloner to create
the optimized function.
2. Mark the thunk as always_inline
3. Rely on the inliner to inline the thunk to get the benefit of calling optimized
function directly.
We decided to use the inliner to rewrite the caller's callsites.
And eventually I will turn FunctionSignatureAnalysis into a Utility.
As its data should only be used and kept in the cloner pass.