Some parts were using the more modern style that we use in the optimizer, in
which ivars and local variables are camelCase instead of CamelCase.
The current dumping format consists of one row of information per function. This
will become unwieldy to write patterns for when I add additional state to
FunctionInfo.
Instead, this commit converts the dumping format of the caller analysis into a
multi-line YAML format. The format looks as follows:
---
calleeName: closure1
hasCaller: false
minPartialAppliedArgs: 1
partialAppliers:
- partial_apply_one_arg
- partial_apply_two_args1
fullAppliers:
...
This can easily expand over time as we add to the queries that caller analysis
can answer.
As an additional advantage, many YAML parsers can handle multiple YAML documents
in sequence in a stream. This means that by running the caller-analysis-printer
pass via sil-opt, one now gets a YAML description of the caller analysis state,
ready for further analysis.
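As a rough sketch, assuming one consumes the dump with LLVM's YAMLParser (the
countDocuments helper below is purely illustrative), iterating the stream visits
one document per function:

#include "llvm/ADT/StringRef.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/YAMLParser.h"

// Count the per-function documents in a caller-analysis dump; a real consumer
// would walk each document's root mapping instead of just checking it.
unsigned countDocuments(llvm::StringRef dump) {
  llvm::SourceMgr sm;
  llvm::yaml::Stream stream(dump, sm);
  unsigned numDocs = 0;
  for (auto di = stream.begin(), de = stream.end(); di != de; ++di)
    if (di->getRoot()) // each document describes one function
      ++numDocs;
  return numDocs;
}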
This converts a DenseMap to a SmallMapVector and a SetVector to a
SmallSetVector. Both of the former create large malloc'd data structures by
default, which makes little sense when there are many functions that do not use
a partial apply or do not have many applies.
Additionally, by changing the DenseMap to a MapVector container, this commit
eliminates a potential source of non-determinism in the compiler, since we often
iterate over the DenseMap to produce the results. Today all of the usages of the
DenseMap in this way are safe, but to defensively future-proof this analysis, it
makes sense to use a MapVector here.
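As a rough sketch of the shape of the change (the FunctionInfo fields below are
illustrative, not the analysis's actual layout): small inline storage avoids the
heap allocation in the common case, and insertion-ordered iteration keeps any
output derived from iterating the map deterministic.

#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/SetVector.h"

class SILFunction; // forward declaration; only pointers are stored below

// Illustrative per-function state; the real FunctionInfo has more fields.
struct FunctionInfo {
  llvm::SmallSetVector<SILFunction *, 2> partialAppliers; // was SetVector
  llvm::SmallSetVector<SILFunction *, 2> fullAppliers;
};

// Was llvm::DenseMap<SILFunction *, FunctionInfo>: hash-ordered iteration and
// a large malloc'd bucket array even for a handful of entries. SmallMapVector
// keeps the first few entries inline and iterates in insertion order.
llvm::SmallMapVector<SILFunction *, FunctionInfo, 4> funcInfos;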
This patch adds SIL-level debug info support for variables whose
static type is rewritten by an optimizer transformation. When a
function is (generic-)specialized or inlined, the static types of
inlined variables may change as they are remapped into the generic
environment of the inlined call site. With this patch, all inlined
SILDebugScopes that point to functions with a generic signature are
recursively rewritten to point to clones of the original function with
new unique mangled names. The new mangled names consist of the old
mangled names plus the new substitutions, similar (or, respectively,
identical) to how generic specialization is handled.
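A very rough sketch of the renaming step, with a hypothetical stand-in type
(the real code deals in SILFunction/SILDebugScope and uses the specialization
mangler to produce the suffix):

#include <map>
#include <memory>
#include <string>

struct Function {
  std::string mangledName; // stand-in for a SILFunction's mangled name
};

// Get or create the uniquely named clone that rewritten debug scopes point to:
// the new mangled name is the old one plus a string describing the
// substitutions at the inlined call site.
Function *getOrCreateDebugClone(
    const Function &original, const std::string &substitutionSuffix,
    std::map<std::string, std::unique_ptr<Function>> &clones) {
  std::string newName = original.mangledName + substitutionSuffix;
  std::unique_ptr<Function> &slot = clones[newName];
  if (!slot)
    slot = std::make_unique<Function>(Function{newName});
  return slot.get();
}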
On libSwiftCore.dylib (x86_64), this yields a 17% increase in unique
source vars and a ~24% increase in variables with a debug location.
rdar://problem/28859432
rdar://problem/34526036
Fixes <rdar://40555427> [SR-7773]:
SILCombiner::propagateConcreteTypeOfInitExistential fails to fully propagate
type substitutions.
Fixes <rdar://problem/40923849>
SILCombiner::propagateConcreteTypeOfInitExistential crashes on protocol
compositions.
This rewrite fixes several fundamental bugs in the SILCombiner optimization that
propagates concrete types. In particular, the pass needs to handle:
- Arguments of callee Self type in non-self position.
- Indirect and direct return values of Self type.
- Types that indirectly depend on Self within callee function signature.
- Protocol composition existentials.
- All of the above need to work for protocol extensions as well as witness methods.
- For protocol extensions, conformance lookup should be based on the existential's conformance list.
Additionally, the optimization should not depend on a SILFunction's DeclContext,
which is not serialized. (In fact, we should prevent SIL passes from using
DeclContext). Furthermore, the code needs to be expressed in a way that one can
reason about correctness and invariants.
The root cause of these bugs is that SIL passes are written based on untested
assumptions about the Swift type system. A SIL pass needs to handle all
verifiable SIL
input because passes need to be composable. Bail-out logic can be added to
simplify the design; however, _the bail-out logic itself cannot make any
assumptions about the language or type system_ that aren't clearly and
explicitly enforced in the SIL verifier. This is a common mistake and major
source of bugs.
I created as many unit tests as I reasonably could to prevent this code from
regressing. Creating enough unit tests to cover all corner cases that were
broken in the original code would be intractable. But the code has been
simplified such that many corner cases disappear.
This opens up some opportunity for generalizing the optimization and eliminating
special cases. However, I want this PR to be limited to fixing correctness
issues only. In the long term, it would be preferable to replace this
optimization entirely with a much more powerful general type propagation pass.
I am tuning a new argument explosion heuristic to reduce code size. One part of
the heuristic I am playing with is the part of the algorithm that attempts to
figure out whether we could eliminate additional arguments when we run FSO a
second time, after the owned->guaranteed transformation has eliminated an
additional release. Today we do this unconditionally. I am trying to do it in a
more conservative way where we only do it if we know that we aren't going to
increase the number of arguments too much.
rdar://41146023
This is particularly egregious since we are only /reading/ from the DenseSet. So
we are basically mallocing/copying a DenseSet just to read from it... I don't
think I need to say more.
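A minimal sketch of the pattern (function names are hypothetical): taking the
set by value copies, and thus mallocs, the whole hash table on every call, while
a const reference reads the caller's set in place.

#include "llvm/ADT/DenseSet.h"

class SILFunction; // forward declaration; only pointers are stored in the set

// Before: every call copies the set just to query it.
bool isMarkedByValue(llvm::DenseSet<SILFunction *> marked, SILFunction *f) {
  return marked.count(f) != 0;
}

// After: a const reference performs the same read without any copy.
bool isMarked(const llvm::DenseSet<SILFunction *> &marked, SILFunction *f) {
  return marked.count(f) != 0;
}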
rdar://41146023
@effects is too low-level and not meant for general use outside
the standard library. Therefore it deserves to be underscored like
other such attributes.
I am doing this so I can start writing DI tests without this lowering occurring.
There never was a real reason for this code to be in DI beyond convenience. Now
it just makes writing tests more difficult. To prevent any test delta, I changed
all current DI tests to run this pass after DI.
SubstitutionMaps are now just trivial pointer-sized values, so
pass them by value instead.
I did have to move a couple of functors from Type.h to SubstitutionMap.h
to resolve some issues with forward declarations.
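A rough illustration of why by-value is now the right convention (a sketch only;
the real type is swift::SubstitutionMap): a wrapper around a single pointer is
trivially copyable and pointer-sized, so passing it by value costs no more than
passing a reference and avoids an extra indirection.

#include <type_traits>

class Storage; // stand-in for the uniqued backing storage

// Illustrative pointer-sized value type; queries would forward to *storage.
class MapSketch {
  Storage *storage = nullptr;
public:
  bool empty() const { return storage == nullptr; }
};

static_assert(sizeof(MapSketch) == sizeof(void *),
              "pointer-sized: as cheap to copy as to reference");
static_assert(std::is_trivially_copyable<MapSketch>::value,
              "no nontrivial copy work when passed by value");

// So signatures take the map by value rather than by const reference:
void remapTypes(MapSketch subs);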
There are ~100 significant benchmark regressions (of ~350) with -O
-enforce-exclusivity=checked.
This optimization roughly cuts the overhead in half for almost all of those
regressions. These are the top 30 improvements with the optimization enabled.
XorShift....................................................2.83x
ReversedArray...............................................2.76x
RangeIterationSigned........................................2.67x
ExclusivityGlobal...........................................2.57x
Random......................................................2.44x
ReversedDictionary..........................................2.41x
GeekbenchGEMM...............................................2.35x
ArrayInClass................................................2.31x
StringWalk..................................................2.29x
Ary.........................................................2.25x
Ary3........................................................2.25x
Ary2........................................................2.21x
MultiFileTogether...........................................2.17x
MultiFileSeparate...........................................2.17x
RecursiveOwnedParameter.....................................2.14x
LevenshteinDistance.........................................2.04x
HashTest....................................................1.97x
Voronoi.....................................................1.94x
NopDeinit...................................................1.92x
Life........................................................1.89x
Richards....................................................1.84x
Rectangles..................................................1.74x
MatMul......................................................1.71x
LinkedList..................................................1.51x
GeekbenchFFT................................................1.47x
Xcbuild_OutputByteStreamPerfTests...........................1.39x
ObjectAllocation............................................1.33x
MapReduceLazyCollection.....................................1.30x
Prims.......................................................1.28x
CharIndexing_tweet_unicodeScalars_Backwards.................1.28x
Use AccessedStorageAnalysis to find access markers with no nested conflicts.
This optimization analyzes the scope of each access to determine
whether it contains a potentially conflicting access. If not, then it
can be demoted to an instantaneous check, which still catches
conflicts on any enclosing outer scope.
This removes up to half of the runtime calls associated with
exclusivity checking.
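A conceptual sketch of the scope check described above, using hypothetical
stand-in types rather than the real SIL access-marker API:

#include <vector>

// Hypothetical model of an access and its scope; illustrative only.
struct Access {
  unsigned storageID; // identifies the accessed storage
  bool isWrite;       // read or modify
};

struct AccessScope {
  Access outer;                  // the access opened by this marker
  std::vector<Access> nested;    // accesses that may execute inside the scope
  bool noNestedConflict = false; // true => the check can be made instantaneous
};

// If nothing inside the scope can conflict with the outer access, demote the
// scoped check to an instantaneous one. Conflicts against enclosing outer
// scopes are still caught because those markers are left in place.
void demoteIfConflictFree(AccessScope &scope) {
  for (const Access &inner : scope.nested) {
    bool sameStorage = inner.storageID == scope.outer.storageID;
    bool eitherWrites = inner.isWrite || scope.outer.isWrite;
    if (sameStorage && eitherWrites)
      return; // a potential nested conflict: keep the full scoped check
  }
  scope.noNestedConflict = true;
}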
The actual algorithm used here has not changed at all, so this is basically an
NFC commit. What this PR does is change the underlying routine to return the
operands that it computes internally rather than transforming the operand list
into the user list before returning. This enables the callers of the
optimization to find the operand number associated with each use, which makes
working with instructions that have multiple operands much easier: one does not
need to rederive the operand number from the user instruction/SILValue pair.
getRCUsers() now works by running getRCUses() internally and then mapping the
operand list to the user list.
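A small sketch of that relationship, with hypothetical stand-in types (the real
interfaces work in terms of SILValue, Operand, and SILInstruction):

#include <vector>

struct Instruction; // stand-in for SILInstruction

// A use edge: the instruction that uses the value plus the operand slot it
// occupies, so the operand number is available without rederiving it.
struct Operand {
  Instruction *user;
  unsigned operandNumber;
};

struct RCUseAnalysis {
  std::vector<Operand> uses; // the operands the algorithm computes internally

  // getRCUses(): hand back the operands themselves; callers can see both the
  // user and the operand number.
  std::vector<Operand *> getRCUses() {
    std::vector<Operand *> result;
    for (Operand &use : uses)
      result.push_back(&use);
    return result;
  }

  // getRCUsers(): implemented on top of getRCUses() by mapping each operand
  // to its user instruction.
  std::vector<Instruction *> getRCUsers() {
    std::vector<Instruction *> users;
    for (Operand *use : getRCUses())
      users.push_back(use->user);
    return users;
  }
};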
rdar://38196046