When enabling the option `-sil-opt-profile-repeat=<n>`, the optimizer runs passes n times and reports the total runtime at the end of the pass pipeline.
This is useful to profile a specific optimization pass with `sil-opt`.
For example, to profile the stack promotion pass:
```
sil-opt -stack-promotion -sil-opt-profile-repeat=10000 -o /dev/null test.sil
```
These sets are _much_ more efficient than `Set<Value>` and `Set<Instruction>` because they bridge to the efficient `NodeSet`.
Insertions/deletions are just bit operations.
* split the PassUtils.swift file into PassContext.swift and Passes.swift
* rework `Builder` bridging allowing more insertion point variations, e.g. inserting at the end of a block.
* add Builder.create functions for more instructions
* add `PassContext.splitBlock`
* move SIL modification functions from PassContext to extensions of the relevant types (e.g. instructions).
* rename `Location.bridgedLocation` -> `Location.bridged`
* C++: add a function `getDestructors(SILType type, bool isExactType)’: if the type is a final class or `isExactType` is true, then return the one and only destructor of that class.
* swift: add `getDestructor(ofExactType type: Type)` and `getIncompleteCallees`
* swift: remove `getDestructor` from the PassContext. The API of the `calleeAnalysis` can be used instead.
Allow other compilation units to refer to isFunctionSelectedForPrinting
by declaring it extern. Maybe at some point it might be nice to put
this into a header related to printing.
* Add the possibility to bisect the individual transforms of SILCombine and SimplifyCFG.
To do so, the `-sil-opt-pass-count` option now accepts the format `<n>.<m>`, where `m` is the sub-pass number.
The sub-pass number limits the number of individual transforms in SILCombine or SimplifyCFG.
* Add an option `-sil-print-last` to print the SIL of the currently optimized function before and after the last pass, which is specified with `-sil-opt-pass-count`.
* add `BasicBlockSet`
* add `BasicBlockWorklist`
* add `BasicBlockRange`, which defines a range of blocks from a common dominating “begin” block to a set of “end” blocks.
* add `InstructionRange`, which is similar to `BasicBlockRange`, just on instruction level. It can be used for value lifetime analysis.
* rename `StackList` -> `Stack` and move it to `Optimizer/DataStructures`
* rename `PassContext.passContext` to `PassContext._bridged`
* add notify-functions to PassContext
And a few other small related changes:
* remove libswiftPassInvocation from SILInstructionWorklist (because it's not needed)
* replace start/finishPassRun with start/finishFunction/InstructionPassRun
NFC
Use MemoryLayout.stride instead pf MemoryLayout.size. This fixes a buffer overflow bug in case of unaligned elements.
Plus some other bug fixes for stacklists which get larger than a single slab.
Also, add the `append(contentsOf:)` method.
* unify FunctionPassContext and InstructionPassContext
* add a modification API: PassContext.setOperand
* automatic invalidation notifications when the SIL is modified
Instruction passes are basically visit functions in SILCombine for a specific instruction type.
With the macro SWIFT_INSTRUCTION_PASS such a pass can be declared in Passes.def.
SILCombine then calls the run function of the pass in libswift.
StackList is a very efficient data structure for worklist type things.
This is a port of the C++ utility with the same name.
Compared to Array, it does not require any memory allocations.
With the macro SWIFT_FUNCTION_PASS a new libswift function pass can be defined in Passes.def.
The SWIFT_FUNCTION_PASS_WITH_LEGACY is similar, but it allows to keep an original C++ “legacy” implementation of the pass, which is used if the compiler is not built with libswift.
It's not needed anymore with delayed instruction deletion.
It was used for two purposes:
1. For analysis, which cache instructions, to avoid dangling instruction pointers
2. For passes, which maintain worklists of instructions, to remove a deleted instructions from the worklist. This is now done by checking SILInstruction::isDeleted().
When an instruction is "deleted" from the SIL, it is put into the SILModule::scheduledForDeletion list.
The instructions in this list are eventually deleted for real in SILModule::flushDeletedInsts(), which is called by the pass manager after each pass run.
In other words: instruction deletion is deferred to the end of a pass.
This avoids dangling instruction pointers within the run of a pass and in analysis caches.
Note that the analysis invalidation mechanism ensures that analysis caches are invalidated before flushDeletedInsts().
* rename -sil-print-only-function to -sil-print-function and -sil-print-only-functions to -sil-print-functions
* to print single functions, don't require -Xllvm -sil-print-all. It's now sufficient to use e.g. -Xllvm -sil-print-function=<f>
But it's still possible to select functions with -sil-print-function(s) for other print options, -sil-print-after.
This removes the ambiguity when casting from a SingleValueInstruction to SILNode, which makes the code simpler. E.g. the "isRepresentativeSILNode" logic is not needed anymore.
Also, it reduces the size of the most used instruction class - SingleValueInstruction - by one pointer.
Conceptually, SILInstruction is still a SILNode. But implementation-wise SILNode is not a base class of SILInstruction anymore.
Only the two sub-classes of SILInstruction - SingleValueInstruction and NonSingleValueInstruction - inherit from SILNode. SingleValueInstruction's SILNode is embedded into a ValueBase and its relative offset in the class is the same as in NonSingleValueInstruction (see SILNodeOffsetChecker).
This makes it possible to cast from a SILInstruction to a SILNode without knowing which SILInstruction sub-class it is.
Casting to SILNode cannot be done implicitly, but only with an LLVM `cast` or with SILInstruction::asSILNode(). But this is a rare case anyway.
This removes the ambiguity when casting from a SingleValueInstruction to SILNode, which makes the code simpler. E.g. the "isRepresentativeSILNode" logic is not needed anymore.
Also, it reduces the size of the most used instruction class - SingleValueInstruction - by one pointer.
Conceptually, SILInstruction is still a SILNode. But implementation-wise SILNode is not a base class of SILInstruction anymore.
Only the two sub-classes of SILInstruction - SingleValueInstruction and NonSingleValueInstruction - inherit from SILNode. SingleValueInstruction's SILNode is embedded into a ValueBase and its relative offset in the class is the same as in NonSingleValueInstruction (see SILNodeOffsetChecker).
This makes it possible to cast from a SILInstruction to a SILNode without knowing which SILInstruction sub-class it is.
Casting to SILNode cannot be done implicitly, but only with an LLVM `cast` or with SILInstruction::asSILNode(). But this is a rare case anyway.
The PassManager should transform all functions in bottom up order.
This is necessary because when optimizations like inlining looks at the
callee function bodies to compute profitability, the callee functions
should have already undergone optimizations to get better profitability
estimates.
The PassManager builds its function worklist based on bottom up order
on initialization. However, newly created SILFunctions due to
specialization etc, are simply appended to the function worklist. This
can cause us to make bad inlining decisions due to inaccurate
profitability estimates. This change now updates the function worklist such
that, all the callees of the newly added SILFunction are proccessed
before it by the PassManager.
Fixes rdar://52202680
* Fix the mid-level pass pipeline.
Module passes need to be in a separate pipeline, otherwise the
pipeline restart mechanism will be broken.
This makes GlobalOpt and serialization run earlier in the
pipeline. There's no explicit reason for them to be run later, in the
middle of a function pass pipeline.
Also, pipeline boundaries, like serialization and module passes should
be explicit at the the top level function that creates the pass
pipelines.
* SILOptimizer: Add enforcement of function-pass pipelines.
Don't allow module passes to be inserted within a function pass
pipeline. This silently breaks the function pipeline both interfering
with analysis and the normal pipeline restart mechanism.
* Add misssing pass in addFunctionPasses
Co-authored-by: Andrew Trick <atrick@apple.com>
-sil-verify-all flag will verify analyses before and after a pass to
confirm correct invalidations. But if an analysis was never
constructed or invalidated as per current pass order,
it may never detect insufficient invalidations.
-sil-verify-force-analysis will force construct an analysis so that we
can better check for insufficient invalidations.
It is also terribly slow compared to -sil-verify-all.
A request is intended to be a pure function of its inputs. That function could, in theory, fail. In practice, there were basically no requests taking advantage of this ability - the few that were using it to explicitly detect cycles can just return reasonable defaults instead of forwarding the error on up the stack.
This is because cycles are checked by *the Evaluator*, and are unwound by the Evaluator.
Therefore, restore the idea that the evaluate functions are themselves pure, but keep the idea that *evaluation* of those requests may fail. This model enables the best of both worlds: we not only keep the evaluator flexible enough to handle future use cases like cancellation and diagnostic invalidation, but also request-based dependencies using the values computed at the evaluation points. These aforementioned use cases would use the llvm::Expected interface and the regular evaluation-point interface respectively.