* [sil-performance-inliner] Re-factor the isProfitableToInline logic. NFC.
* [sil-performance-inliner] Introduce a pre-filter to decide if a generic function should be inlined
This logic is unconditional and does not require any complex cost models.
The outcome of the check is one of:
- yes, inline this generic function
- no, do not inline this generic function
- None, don't know if it should be inlined. Further, more complex checks are required.
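
The three-way outcome above can be modeled as an optional boolean. This is a minimal Python sketch, not the inliner's actual code: the function names are hypothetical, and the `never_inline` input and the fallback cost model are assumptions added only to exercise all three outcomes.

```python
from typing import Callable, Optional


def generic_prefilter(always_inline: bool,
                      transparent: bool,
                      never_inline: bool) -> Optional[bool]:
    """Cheap, unconditional check: True (yes), False (no), or None (don't know)."""
    if never_inline:  # assumption: some attribute forbids inlining outright
        return False
    if always_inline or transparent:
        return True
    return None  # undecided: more complex checks are required


def should_inline(always_inline: bool,
                  transparent: bool,
                  never_inline: bool,
                  cost_model: Callable[[], bool]) -> bool:
    """Consult the pre-filter first; run the expensive check only if needed."""
    verdict = generic_prefilter(always_inline, transparent, never_inline)
    if verdict is not None:
        return verdict
    return cost_model()  # hypothetical stand-in for a cost-based decision
```

Returning None rather than a default boolean keeps "undecided" distinct from "no", so the caller cannot mistake a missing verdict for a rejection.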
* [sil-performance-inliner] Handle inlining of generic functions into cold blocks
Generic functions should be inlined into cold blocks only if they should be unconditionally inlined (e.g. when they are always_inline or transparent).
* Allow generic inlining under -sil-inline-generics.
This is an NFC change, since verification will still be behind the flag. But this
will allow me to move copy_value, destroy_value in front of the
EnableSILOwnership flag and verify via SILGen that we are always using those
instructions.
rdar://28851920
Use the following options to enable this flag: -Xllvm -sil-inline-generics
Generic inlining is now handled by dedicated logic in isProfitableToInlineGeneric. This makes the logic easier to find, and easier to extend and improve in the future.
The initial policy for generic inlining is:
- unconditionally inline generic functions if the -sil-inline-generics flag is used.
- If the flag is not used, only perform generic inlining of always_inline and transparent functions.
There are slight code-size regressions in the standard library with this policy. They will be addressed by future work on generic inlining.
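
The initial policy above reduces to a two-step check. The sketch below is a Python model of that policy only; the function and parameter names are illustrative and do not mirror the real isProfitableToInlineGeneric signature.

```python
def inline_generic_by_initial_policy(always_inline: bool,
                                     transparent: bool,
                                     inline_generics_flag: bool = False) -> bool:
    """Model of the initial generic-inlining policy described above."""
    if inline_generics_flag:
        # -sil-inline-generics was passed: inline every generic function.
        return True
    # Without the flag, only always_inline and transparent functions qualify.
    return always_inline or transparent
```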
- Move the common performance inliner functionality into PerformanceInlinerUtils.cpp.
- Move the functionality specific to non-generic inlining into NonGenericPerformanceInliner.cpp
- Temporarily disable the inlining of generics. It will be enabled in the subsequent commit.
The behaviour of ilist has changed in LLVM. It is no longer permissible to
dereference the `end()` value. Add a check to ensure that we do not
accidentally dereference the iterator.
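
The guard described above can be illustrated with a Python list standing in for the instruction list. This is only an analogy (LLVM's ilist is a C++ container), and `next_instruction` is a hypothetical helper: the position one past the last element plays the role of `end()` and is never read.

```python
from typing import List, Optional


def next_instruction(block: List[str], index: int) -> Optional[str]:
    """Return the instruction after `index`, or None when that would be end()."""
    next_index = index + 1
    if next_index >= len(block):
        # One past the last element corresponds to end(); do not dereference it.
        return None
    return block[next_index]
```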
The new instructions are: ref_tail_addr, tail_addr and a new attribute [ tail_elems ] for alloc_ref.
For details see docs/SIL.rst
As these new instructions are not generated so far, this is NFC.
delete it and recreate new one
This is a compilation time improvement
There are a few small modifications to the tests, as we previously created
different but equivalent retain/release instructions even though we could
reuse the old ones.
rdar://28329689
When applying substitutions to substitution lists in SIL, we would
unpack the ArrayRef<Substitution> into a SubstitutionMap on each
iteration over the original ArrayRef<Substitution>. Discourage
this sort of thing by removing the API in question and refactoring
surrounding code.
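
The cost being removed here is a map rebuilt inside a loop. The Python sketch below models a substitution as a (parameter, replacement) pair of plain strings, which is an assumption for illustration; it contrasts rebuilding the lookup map on every iteration with building it once.

```python
from typing import List, Tuple

Substitution = Tuple[str, str]  # (generic parameter, replacement type), simplified


def substitute_rebuilding_map(original: List[Substitution],
                              applied: List[Substitution]) -> List[Substitution]:
    """Old pattern: the lookup map is rebuilt for every element of `original`."""
    result = []
    for param, replacement in original:
        sub_map = dict(applied)  # rebuilt on each iteration
        result.append((param, sub_map.get(replacement, replacement)))
    return result


def substitute_with_hoisted_map(original: List[Substitution],
                                applied: List[Substitution]) -> List[Substitution]:
    """Refactored pattern: build the map once, then reuse it for every element."""
    sub_map = dict(applied)
    return [(param, sub_map.get(replacement, replacement))
            for param, replacement in original]
```

Both functions return the same list; only the number of map constructions differs.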
It makes sense to turn the new epilogue retain/release matcher into an Analysis.
It is currently a data flow with an entry API point. This saves on compilation time,
even though it does not seem to be very expensive right now. But it is an iterative
data flow, which could be expensive with large CFGs.
rdar://28178736
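
One way to read the compile-time argument above is that an analysis computes its result once per function and reuses it until invalidated. The Python sketch below assumes that reading; the class and its `runs` counter are hypothetical and are not the pass manager's API.

```python
from typing import Callable, Dict, Generic, TypeVar

R = TypeVar("R")


class CachedFunctionAnalysis(Generic[R]):
    """Caches a per-function result so the data flow is not rerun per query."""

    def __init__(self, compute: Callable[[str], R]) -> None:
        self._compute = compute
        self._cache: Dict[str, R] = {}
        self.runs = 0  # number of times the underlying computation actually ran

    def get(self, function_name: str) -> R:
        if function_name not in self._cache:
            self.runs += 1
            self._cache[function_name] = self._compute(function_name)
        return self._cache[function_name]

    def invalidate(self, function_name: str) -> None:
        """Drop the cached result, e.g. after the function body has changed."""
        self._cache.pop(function_name, None)
```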
When devirtualizing witness method and class method calls, we
transform apply instructions operating on the result of a SIL
witness_method or class_method instruction to direct calls of
a function_ref.
The generic signature of the dynamic call site might not match
the generic signature of the static thunk, so the substitution
list from the dynamic apply instruction cannot be used directly;
instead, we must transform it to a substitution list suitable
for the static thunk.
- With witness methods, the method is called using the protocol
requirement's signature, <Self : P, ...>, however the
witness thunk has a generic signature derived from the
concrete witness.
For example, the requirement might have a signature
<Self : P, T>, where the concrete witness thunk might
have a signature <X, Y>, where the concrete conforming type
is G<X, Y>.
At the call site, we substitute Self := G<X', Y'>; however
to be able to call the witness thunk directly, we need to
form substitutions X := X' and Y := Y'.
- A similar situation occurs with class methods when the
dynamically-dispatched call is performed against a derived
class, but devirtualization actually finds the method on a
base class of the derived class.
The base class may have a different number of generic
parameters than the derived class, either because the
derived class makes some generic parameters of the base
class concrete, or because the derived class introduces new
generic parameters of its own.
In both cases, we need to consider the generic signature of the
dynamic call site (the protocol requirement or the derived
class method) as well as the generic signature of the static
thunk, and carefully remap the substitutions from one form
into another.
Previously the optimizer would implicitly rely on substitutions
being in AllArchetypes order, in particular that concatenating
outer substitutions with inner substitutions makes sense.
This assumption is about to go away, so this patch refactors
the optimizer to use some new abstractions for remapping
substitution lists.
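
The witness-method and class-method cases described above can be sketched as two small remapping functions. This is a simplified Python model, not the optimizer's abstractions: types are flat strings, nested generic arguments are not handled, and all function names are hypothetical.

```python
from typing import Dict, List


def remap_witness_substitutions(thunk_params: List[str],
                                self_type_args: List[str]) -> Dict[str, str]:
    """Given Self := G<args...>, form thunk substitutions param := arg."""
    if len(thunk_params) != len(self_type_args):
        raise ValueError("conforming type arity does not match the thunk signature")
    return dict(zip(thunk_params, self_type_args))


def remap_class_substitutions(base_params: List[str],
                              superclass_args: List[str],
                              derived_subs: Dict[str, str]) -> Dict[str, str]:
    """Map call-site substitutions on a derived class onto the base class.

    `superclass_args` spells the base class's arguments in terms of the derived
    class's parameters or concrete types, e.g. ["T", "Int"] for
    `class Derived<T> : Base<T, Int>`.
    """
    if len(base_params) != len(superclass_args):
        raise ValueError("superclass arity does not match the base signature")
    return {param: derived_subs.get(arg, arg)
            for param, arg in zip(base_params, superclass_args)}
```

For the witness example in the text, `remap_witness_substitutions(["X", "Y"], ["X'", "Y'"])` yields the substitutions X := X' and Y := Y'.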
When performing CSE of open_existential_ref instructions, we replace the new archetype with the old archetype by cloning the uses and re-mapping the archetypes. But we also need to consider that some uses of an open_existential_ref instruction (e.g. loads) may produce results that depend on the opened archetype being replaced. Therefore, for every such use, its own uses (and their uses, recursively) must also be cloned and type-remapped if they depend on the opened archetype being replaced.
Fixes rdar://28136015 and https://bugs.swift.org/browse/SR-2545
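
The recursive step above amounts to a worklist walk over the def-use graph. The Python sketch below is a model under stated assumptions: instructions are plain strings, `uses` maps an instruction to its users, and `result_depends_on_archetype` is a hypothetical set naming the instructions whose results mention the opened archetype.

```python
from collections import deque
from typing import Dict, List, Set


def users_to_clone(root: str,
                   uses: Dict[str, List[str]],
                   result_depends_on_archetype: Set[str]) -> List[str]:
    """Collect, in breadth-first order, every instruction that must be cloned."""
    to_clone: List[str] = []
    seen = {root}
    worklist = deque([root])  # the root's result is the opened archetype itself
    while worklist:
        inst = worklist.popleft()
        for user in uses.get(inst, []):
            if user in seen:
                continue
            seen.add(user)
            to_clone.append(user)  # it consumes an archetype-dependent value
            if user in result_depends_on_archetype:
                worklist.append(user)  # its own users must be visited as well
    return to_clone
```

Users of an instruction whose result does not depend on the archetype are never visited, which keeps the cloned region as small as the dependency allows.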
This should have identical behavior to the old epilogue retain matcher.
I do not see a performance improvement.
The compilation time does not show up in Instruments either.
This patch is rather large, since it was hard to make this change
incrementally, but most of the changes are mechanical.
Now that we have a lighter-weight data structure in the AST for mapping
interface types to archetypes and vice versa, use that in SIL instead of
a GenericParamList.
This means that when serializing a SILFunction body, we no longer need to
serialize references to archetypes from other modules.
Several methods used for forming substitutions can now be moved from
GenericParamList to GenericEnvironment.
Also, GenericParamList::cloneWithOuterParameters() and
GenericParamList::getEmpty() can now go away, since they were only used
when SILGen-ing witness thunks.
Finally, when printing generic parameters with identical names, the
SIL printer used to number them from highest depth to lowest, by
walking generic parameter lists starting with the innermost one.
Now, ambiguous generic parameters are numbered from lowest depth
to highest, by walking the generic signature, which means test
output in one of the SILGen tests has changed.
If a SILBuilder creates a new instruction based on an old instruction and a new instruction is supposed to use some opened archetypes, one needs to set a proper opened archetypes context in the builder based on the opened archetypes used by the old instruction.
This fixes rdar://28024272
This adds the typedef and switches uses of NodeType * to NodeRef. This is in
preparation for the eventual NodeRef-ization of the GraphTraits in LLVM. NFC.