Currently if a caller is > 400 blocks, the inliner bails out of finding
inlinable targets. This is incorrect behavior for inline always functions. In
such cases, we should continue inlining inline always functions and skip any
functions that are not inline always.
rdar://45976860
This is a performance hack: inlining a coroutine can promote heap
allocations of the frame to stack allocations, which is valuable
out of proportion to the normal weight. There are surely more
principled ways of getting this effect, though.
Mostly functionally neutral:
- may fix latent bugs.
- may reduce useless basic blocks after inlining.
This rewrite encapsulates the cloner's internal state, providing a
clean API for the CRTP subclasses. The subclasses are rewritten to use
the exposed API and extension points. This makes it much easier to
understand, work with, and extend SIL cloners, which are central to
many optimization passes. Basic SIL invariants are now clearly
expressed and enforced. There is no longer a intricate dance between
multiple levels of subclasses operating on underlying low-level data
structures. All of the logic needed to keep the original SIL in a
consistent state is contained within the SILCloner itself. Subclasses
only need to be responsible for their own modifications.
The immediate motiviation is to make CFG updates self-contained so
that SIL remains in a valid state. This will allow the removal of
critical edge splitting hacks and will allow general SIL utilities to
take advantage of the fact that we don't allow critical edges.
This rewrite establishes a simple principal that should be followed
everywhere: aside from the primitive mutation APIs on SIL data types,
each SIL utility is responsibile for leaving SIL in a valid state and
the logic for doing so should exist in one central location.
This includes, for example:
- Generating a valid CFG, splitting edges if needed.
- Returning a valid instruction iterator if any instructions are removed.
- Updating dominance.
- Updating SSA (block arguments).
(Dominance info and SSA properties are fundamental to SIL verification).
LoopInfo is also somewhat fundamental to SIL, and should generally be
updated, but it isn't required.
This also fixes some latent bugs related to iterator invalidation in
recursivelyDeleteTriviallyDeadInstructions and SILInliner. Note that
the SILModule deletion callback should be avoided. It can be useful as
a simple cache invalidation mechanism, but it is otherwise bug prone,
too limited to be very useful, and basically bad design. Utilities
that mutate should return a valid instruction iterator and provide
their own deletion callbacks.
A few places around the compiler were checking for this module by its
name. The implementation still checks by name, but at least that only
has to occur in one place.
(Unfortunately I can't eliminate the string constant altogether,
because the implicit import for SwiftOnoneSupport happens by name.)
No functionality change.
When compiling benchmark/single-source/DictTest.swift, a multiplication can overflow in SILPerformanceInliner::isProfitableToInline(). Notice when the value is large enough to overflow and zero out the value it would have been subtracted from.
To do so this commit does a few different things:
1. I changed SILOptFunctionBuilder to notify the pass manager's logging
functionality when new functions are added to the module and to notify analyses
as well. NOTE: This on purpose does not put the new function on the pass manager
worklist since we do not want to by mistake introduce a large amount of
re-optimizations. Such a thing should be explicit.
2. I eliminated SILModuleTransform::notifyAddFunction. This just performed the
operations from 1. Now that SILOptFunctionBuilder performs this operation for
us, it is not needed.
3. I changed SILFunctionTransform::notifyAddFunction to just add the function to
the passmanager worklist. It does not need to notify the pass manager's logging
or analyses that a new function was added to the module since
SILOptFunctionBuilder now performs that operation. Given its reduced
functionality, I changed the name to addFunctionToPassManagerWorklist(...). The
name is a little long/verbose, but this is a feature since one should think
before getting the pass manager to rerun transforms on a function. Also, giving
it a longer name calls out the operation in the code visually, giving this
operation more prominance when reading code. NOTE: I did the rename using
Xcode's refactoring functionality!
rdar://42301529
This works around a potential circular dependence issue where TypeSubstCloner
needs access to SILOptFunctionBuilder but is in libswiftSIL.
rdar://42301529
@effects is too low a level, and not meant for general usage outside
the standard library. Therefore it deserves to be underscored like
other such attributes.
* allow small class methods to be inlined with -Osize
* inline pure calls: references to objects, which are initialized with constants, are considered as constant arguments
* allow small class methods to be inlined with -Osize
* inline pure calls: references to objects, which are initialized with constants, are considered as constant arguments
This commit is mostly refactoring.
*) Introduce a new OptimizationMode enum and use that in SILOptions and IRGenOptions
*) Allow the optimization mode also be specified for specific SILFunctions. This is not used in this commit yet and thus still a NFC.
Also, fixes a minor bug: we didn’t run mandatory IRGen passes for functions with @_semantics("optimize.sil.never")
This brings the capability from clang to save remarks in an external YAML files.
YAML files can be viewed with tools like the opt-viewer.
Saving the remarks is activated with the new option -save-optimization-record.
Similarly to -emit-tbd, I've only added support for single-compile mode for now.
In this case the default filename is determined by
getOutputFilenameFromPathArgOrAsTopLevel, i.e. unless explicitly specified
with -save-optimization-record-path, the file is placed in the directory of the
main output file as <modulename>.opt.yaml.
This allows reporting successful and unsuccessful optimizations similar to
clang/llvm.
This first patch adds support for the
options -Rpass=<pass-name-regex> -Rpass-missed=<pass-name-regex>. These allow
reporting successful/unsuccessful optimization on the compiler output for passes
specified by the regex. I've also added one missed and one passed remark type
to the inliner to test the infrastructure.
Clang also has the option of collecting these records in an external YAML data
file. This will be added in a later patch.
A few notes:
* The goal is to use this facility for both user-lever "performance" warnings
and expert-level performance analysis. There will probably be a flag in the
future differentiating the verbosity.
* The intent is match clang/llvm as much as it makes sense. On the other hand I
did make some changes. Unlike in llvm, the emitter is not a pass which
simplifies things. Also the remark class hierarchy is greatly simplified since
we don't derive from DiagnosticInfo. We also don't derive from Diagnostic to
support the streaming API for arbitrary named-value pairs.
* Currently function names are printed mangled which should be fixed.
introduce a common superclass, SILNode.
This is in preparation for allowing instructions to have multiple
results. It is also a somewhat more elegant representation for
instructions that have zero results. Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction. Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.
A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.
Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
The reason to do this is:
1. The check in SILInliner if we can inline can be done without triggering
side-effects.
2. This enables us to know if inlining will succeed before attempting to inline.
This enables for arguments to be adjusted with new SILInstructions and the like
before inlining occurs. I use this in a forthcoming patch that updates mandatory
inlining for ownership.
rdar://31521023
If some functions are explicitly annotated by developers as @inline(__always) or @_transparent, they should always be a subject for the inlining of generics, even if this kind of inlining is not enabled currently for all functions.
If some functions are explicitly annotated by developers as @inline(__always) or @_transparent, they should always be a subject for the inlining of generics, even if this kind of inlining is not enabled currently for all functions.
A function is pure if it has no side-effects.
If there is a call of a pure function with constant arguments, it always makes sense to inline it, because we know that the whole computation will be constant folded.
Replace `NameOfType foo = dyn_cast<NameOfType>(bar)` with DRY version `auto foo = dyn_cast<NameOfType>(bar)`.
The DRY auto version is by far the dominant form already used in the repo, so this PR merely brings the exceptional cases (redundant repetition form) in line with the dominant form (auto form).
See the [C++ Core Guidelines](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es11-use-auto-to-avoid-redundant-repetition-of-type-names) for a general discussion on why to use `auto` to avoid redundant repetition of type names.
SubstitutionMap::lookupConformance() would map archetypes out
of context to compute a conformance path. Do the same thing
in SubstitutionMap::lookupSubstitution().
The DenseMap of replacement types in a SubstitutionMap now
always has GenericTypeParamTypes as keys.
This simplifies some code and brings us one step closer to
a more efficient representation of SubstitutionMaps.
At some point, pass definitions were heavily macro-ized. Pass
descriptive names were added in two places. This is not only redundant
but a source of confusion. You could waste a lot of time grepping for
the wrong string. I removed all the getName() overrides which, at
around 90 passes, was a fairly significant amount of code bloat.
Any pass that we want to be able to invoke by name from a tool
(sil-opt) or pipeline plan *should* have unique type name, enum value,
commend-line string, and name string. I removed a comment about the
various inliner passes that contradicted that.
Side note: We should be consistent with the policy that a pass is
identified by its type. We have a couple passes, LICM and CSE, which
currently violate that convention.
This is important in case the inline restarts the pass pipeline.
In a sub-sequent invocation of the inlined it should receive a cleaned-up function so that it can make better estimations for further inlining.
As a compensation, reduce the caller-block limit of the inliner.
And add an overall block limit which is also taken into account for always-inline functions.