This is in response to [SR-3334], which pointed out a large cost of iterating
over ReversedRandomAccessCollections at optimization level -Onone. By adding
iteration through a.reversed() to the _Prespecialize struct, we get roughly the
same performance as forward iteration.
We preserve the current behavior of always assuming Any ownership and use
default arguments to hide this change most of the time. There are now asserts
in SILBasicBlock::{create,replace,insert}{PHI,Function}Argument to ensure that
one can only create SILFunctionArguments in entry blocks and SILPHIArguments in
non-entry blocks. This will ensure that the code in tree maintains the API
distinction even if we are not yet using the full distinction between the two.
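The following is a minimal standalone sketch of the invariant these asserts
enforce, using toy types rather than the real SILBasicBlock API:

    // Toy sketch: function arguments only in the entry block, PHI arguments
    // only in non-entry blocks. All names here are illustrative.
    #include <cassert>
    #include <vector>

    struct Argument { bool isPHI; };

    struct Block {
      bool isEntryBlock;
      std::vector<Argument> arguments;

      void createFunctionArgument() {
        assert(isEntryBlock &&
               "function arguments may only be created in the entry block");
        arguments.push_back({/*isPHI=*/false});
      }

      void createPHIArgument() {
        assert(!isEntryBlock &&
               "PHI arguments may only be created in non-entry blocks");
        arguments.push_back({/*isPHI=*/true});
      }
    };

    int main() {
      Block entry{/*isEntryBlock=*/true, {}};
      Block body{/*isEntryBlock=*/false, {}};
      entry.createFunctionArgument(); // ok
      body.createPHIArgument();       // ok
      // entry.createPHIArgument();   // would trip the assert
    }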
Once the verifier is finished being upstreamed, I am going to audit the
createPHIArgument call sites for the proper ownership. This is because I will
be able to use the verifier to properly debug the code. At that point, I will
also start serializing/printing/parsing the ownership kind of SILPHIArguments,
but let's take things one step at a time and move incrementally.
In the process, I also discovered a CSE bug. I am not sure how it ever worked.
Basically, we replace an argument with a new argument of a different type, but
then rewrite the uses of the old argument to refer to the old argument instead
of the new argument.
rdar://29671437
This simplifies the SILType substitution APIs and brings them in line with Doug and Slava's refactorings to improve AST-level type substitution. NFC intended.
The purpose of this change is to test whether the new mangling is equivalent to
the old mangling.
Both mangled strings are created and demangled, and the resulting demangle
trees are checked for equivalence.
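As a rough standalone sketch of that check, with a toy node type standing in
for the real demangler's node type:

    // Toy demangle-tree equivalence check; the shape of the comparison is the
    // point, not the node representation.
    #include <cstddef>
    #include <memory>
    #include <string>
    #include <vector>

    struct Node {
      std::string kind;                            // e.g. "Structure"
      std::string text;                            // payload, if any
      std::vector<std::unique_ptr<Node>> children;
    };

    static bool treesEqual(const Node &a, const Node &b) {
      if (a.kind != b.kind || a.text != b.text ||
          a.children.size() != b.children.size())
        return false;
      for (std::size_t i = 0; i < a.children.size(); ++i)
        if (!treesEqual(*a.children[i], *b.children[i]))
          return false;
      return true;
    }

    // Usage: demangle both the old-style and the new-style mangled string into
    // such trees and assert that treesEqual(oldTree, newTree) holds.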
This was already done for getSuccessorBlocks() to distinguish getting successor
blocks from getting the full list of SILSuccessors via getSuccessors(). This
commit just makes all of the successor/predecessor code follow that naming
convention.
Some examples:
getSingleSuccessor() => getSingleSuccessorBlock().
isSuccessor() => isSuccessorBlock().
getPreds() => getPredecessorBlocks().
Really, IMO, we should consider renaming SILSuccessor to a more verbose name so
that it is clear that it is more of an internal detail of SILBasicBlock's
implementation than something one should consider a part of one's mental model
of the IR, where one really wants to be thinking about predecessor and
successor blocks. But that is not what this commit is trying to change; it is
just trying to eliminate a bit of technical debt by making the naming
conventions here consistent.
Before this commit, all code relating to handling arguments in SILBasicBlock
had 'BB' somewhere in the name. This is redundant given that the class's name
is already SILBasicBlock. This commit drops the 'BB' from those names.
Some examples:
getBBArg() => getArgument()
BBArgList => ArgumentList
bbarg_begin() => args_begin()
This eliminates all inline creation of SILBasicBlock via placement new.
There are a few reasons to do this:
1. A SILBasicBlock is always created with a parent function. This commit
formalizes this in the SILBasicBlock API by only allowing SILFunctions to
create SILBasicBlocks. This is implemented via the type system by making all
SILBasicBlock constructors private. Since SILFunction is a friend of
SILBasicBlock, SILFunction can still create a SILBasicBlock without issue (see
the sketch below).
2. Since all SILBasicBlocks will be created in only a few functions, it becomes
very easy to determine, using Instruments, the amount of memory being allocated
for SILBasicBlocks by simply inverting the call tree in Allocations.
With LTO+PGO, normal inlining can occur if profitable, so there shouldn't be
overhead that we care about in shipping compilers.
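Here is a minimal standalone sketch of the pattern from point 1, using toy
Function/Block types rather than the actual SILFunction/SILBasicBlock code:

    #include <memory>
    #include <vector>

    class Function;

    class Block {
      friend class Function;            // only Function may construct Blocks
      explicit Block(Function *parent) : parent(parent) {}
      Function *parent;

    public:
      Function *getParent() const { return parent; }
    };

    class Function {
      std::vector<std::unique_ptr<Block>> blocks;

    public:
      Block *createBasicBlock() {
        // The private constructor is accessible here thanks to the friendship.
        blocks.emplace_back(new Block(this));
        return blocks.back().get();
      }
    };

    int main() {
      Function f;
      Block *bb = f.createBasicBlock(); // fine
      // Block stray(nullptr);          // error: constructor is private
      return bb->getParent() == &f ? 0 : 1;
    }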
Today, loads and stores are treated as having @unowned(unsafe) ownership
semantics. This leaves the user to specify ownership changes on the loaded or
stored value independently of the load/store by inserting ARC operations. With
the change to Semantic SIL, this will no longer be true. Instead, loads and
stores have ownership semantics that one must reason about, such as copy, take,
and trivial.
This change moves us closer to that world by eliminating the default
OwnershipQualification argument from create{Load,Store}. This means that the
compiler developer cannot ignore reasoning about the ownership semantics of the
memory operation that they are creating.
Operationally, this is an NFC change since I have just gone through the
compiler and updated all places where we create loads and stores to pass the
former default argument ({Load,Store}OwnershipQualifier::Unqualified)
explicitly to SILBuilder::create{Load,Store}(...). For now, one can just do
that in situations
where one needs to create loads/stores, but over time, I am going to tighten the
semantics up via the verifier.
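A toy sketch of the API shape: the qualifier values mirror those named above,
but the Builder type here is purely illustrative, not the real SILBuilder.

    // With a defaulted qualifier, a call site could silently ignore ownership:
    //   builder.createLoad(addr);
    // Without the default, every call site must state its intent:
    //   builder.createLoad(addr, LoadOwnershipQualifier::Unqualified);
    #include <cstdio>

    enum class LoadOwnershipQualifier { Unqualified, Take, Copy, Trivial };

    struct Builder {
      // No default argument: the qualifier must always be spelled out.
      void createLoad(const char *addr, LoadOwnershipQualifier qualifier) {
        std::printf("load [%d] %s\n", static_cast<int>(qualifier), addr);
      }
    };

    int main() {
      Builder builder;
      builder.createLoad("%0", LoadOwnershipQualifier::Unqualified);
    }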
rdar://28685236
When applying substitutions to substitution lists in SIL, we would
unpack the ArrayRef<Substitution> into a SubstitutionMap on each
iteration over the original ArrayRef<Substitution>. Discourage
this sort of thing by removing the API in question and refactoring
surrounding code.
This patch is rather large, since it was hard to make this change
incrementally, but most of the changes are mechanical.
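In generic terms, the hazard looks like the following; the types are toy
stand-ins, not the actual SIL substitution code:

    #include <map>
    #include <utility>
    #include <vector>

    using Subst = std::pair<int, int>;       // stand-in for Substitution

    static std::map<int, int> buildMap(const std::vector<Subst> &subs) {
      return std::map<int, int>(subs.begin(), subs.end()); // O(n log n) build
    }

    // Before: rebuild the map on every iteration, roughly O(n^2 log n) overall.
    static int countSlow(const std::vector<Subst> &subs) {
      int hits = 0;
      for (const Subst &s : subs)
        hits += buildMap(subs).count(s.first);
      return hits;
    }

    // After: build the map once and reuse it inside the loop.
    static int countFast(const std::vector<Subst> &subs) {
      const auto map = buildMap(subs);
      int hits = 0;
      for (const Subst &s : subs)
        hits += map.count(s.first);
      return hits;
    }

    int main() {
      std::vector<Subst> subs = {{1, 10}, {2, 20}, {3, 30}};
      return countSlow(subs) == countFast(subs) ? 0 : 1;
    }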
Now that we have a lighter-weight data structure in the AST for mapping
interface types to archetypes and vice versa, use that in SIL instead of
a GenericParamList.
This means that when serializing a SILFunction body, we no longer need to
serialize references to archetypes from other modules.
Several methods used for forming substitutions can now be moved from
GenericParamList to GenericEnvironment.
Also, GenericParamList::cloneWithOuterParameters() and
GenericParamList::getEmpty() can now go away, since they were only used
when SILGen-ing witness thunks.
Finally, when printing generic parameters with identical names, the
SIL printer used to number them from highest depth to lowest, by
walking generic parameter lists starting with the innermost one.
Now, ambiguous generic parameters are numbered from lowest depth
to highest, by walking the generic signature, which means test
output in one of the SILGen tests has changed.
This function takes a substitution array and produces a
contextual type substitution map, so it is the contextual
type equivalent of GenericSignature::getSubstitutionMap(),
which produces an interface type substitution map.
The new version takes a GenericSignature, just like the new
getForwardingSubstitutions(), so that it can walk the
requirements of the signature rather than walking the
AllArchetypes list.
Also, this new version now produces a mapping from
archetypes to conformances in addition to the type mapping,
which will allow it to be used in a few places that had
hand-coded logic.
The generic specializer would get out of control in certain cases and would not
stop generating specializations of generic functions until it ran out of
memory.
rdar://21260480
Applying this patch triggered an assert while building libswiftOnoneSupport:
    --- a/lib/SILOptimizer/PassManager/Passes.cpp
    +++ b/lib/SILOptimizer/PassManager/Passes.cpp
    @@ -283,6 +283,9 @@ void swift::runSILOptimizationPasses(SILModule &Module) {
       PM.setStageName("HighLevel+EarlyLoopOpt");
       // FIXME: update this to be a function pass.
       PM.addEagerSpecializer();
    +
    +  AddSimplifyCFGSILCombine(PM);
    +
       AddSSAPasses(PM, OptimizationLevelKind::HighLevel);
       AddHighLevelLoopOptPasses(PM);
       PM.runOneIteration();
I don't have a reduced testcase, but presumably Erik will commit the above
change soon.
Fixes <rdar://problem/25646947>.
Two fixes to optimization passes to maintain restrictions about what
[fragile] functions can reference:
- When devirtualizing witness methods, don't devirtualize if the caller
is fragile and the callee is not. This matches existing logic in
class devirtualization.
- When performing generic or function signature specialization, don't
specialize non-fragile functions referenced from fragile functions.
Since @_transparent functions are allowed to call 'static inline'
imported functions, also be sure to mark the foreign-to-native thunk
for such a function as [fragile].
With this patch, the standard library and performance test suite
now build with -enable-resilience.
No new tests for this stuff here -- the existing tests together
with an -enable-resilience build provide coverage.
Closes out <https://bugs.swift.org/browse/SR-267> and
<https://bugs.swift.org/browse/SR-268>.
Change the optimizer to only make specializations [fragile] if both the
original callee is [fragile] *and* the caller is [fragile].
Otherwise, the specialized callee might be [fragile] even if it is never
called from a [fragile] function, which inhibits the optimizer from
devirtualizing calls inside the specialization.
This opens up some missed optimization opportunities in the performance
inliner and devirtualization, which currently reject fragile->non-fragile
references:
TEST | OLD_MIN | NEW_MIN | DELTA (%) | SPEEDUP
--- | --- | --- | --- | ---
DictionaryRemoveOfObjects | 38391 | 35859 | -6.6% | **1.07x**
Hanoi | 5853 | 5288 | -9.7% | **1.11x**
Phonebook | 18287 | 14988 | -18.0% | **1.22x**
SetExclusiveOr_OfObjects | 20001 | 15906 | -20.5% | **1.26x**
SetUnion_OfObjects | 16490 | 12370 | -25.0% | **1.33x**
Until now, passes other than performance inlining and devirtualization of class
methods were not checking invariants on [fragile] functions at all, which was
incorrect; as part of the work on building the standard library with
-enable-resilience, I added these checks, which regressed performance with
resilience disabled. This patch makes up for those regressions.
Furthermore, once SIL type lowering is aware of resilience, this will
allow the stack promotion pass to make further optimizations after
specializing [fragile] callees.
We ended up adding the same instruction twice to a SmallVector of
instructions to be deleted. To avoid this, we'll track these
to-be-deleted instructions in a SmallSetVector instead.
We were also failing to add an instruction that we can delete to the set
of instructions to be deleted, so I fixed that as well.
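A small standalone demonstration, with plain ints rather than SIL instructions,
of why the SmallSetVector helps here:

    #include "llvm/ADT/SetVector.h"
    #include "llvm/ADT/SmallVector.h"
    #include <cassert>

    int main() {
      // A plain vector happily records the same element twice, so we would
      // later try to delete the same instruction twice.
      llvm::SmallVector<int, 8> toDelete;
      toDelete.push_back(42);
      toDelete.push_back(42);
      assert(toDelete.size() == 2);

      // A SmallSetVector keeps insertion order but ignores duplicate inserts.
      llvm::SmallSetVector<int, 8> toDeleteSet;
      toDeleteSet.insert(42);
      toDeleteSet.insert(42);
      assert(toDeleteSet.size() == 1);
      return 0;
    }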
I've added a test case, but it's currently disabled because fixing this
turned up another issue in the same code which I still need to take a
look at.
Fixes rdar://problem/25369617.
This was mistakenly reverted in an attempt to fix buildbots.
Unfortunately it's now smashed into one commit.
---
Introduce @_specialize(<type list>) internal attribute.
This attribute can be attached to generic functions. The attribute's
arguments must be a list of concrete types to be substituted in the
function's generic signature. Any number of specializations may be
associated with a generic function.
This attribute provides a hint to the compiler. At -O, the compiler
will generate the specified specializations and emit calls to the
specialized code in the original generic function guarded by type
checks.
The current attribute is designed to be an internal tool for
performance experimentation. It does not affect the language or
API. This work may be extended in the future to add user-visible
attributes that do provide API guarantees and/or direct dispatch to
specialized code.
This attribute works on any generic function: a freestanding function
with generic type parameters, a nongeneric method declared in a
generic class, a generic method in a nongeneric class or a generic
method in a generic class. A function's generic signature is a
concatenation of the generic context and the function's own generic
type parameters.
e.g.

    struct S<T> {
      var x: T
      @_specialize(Int, Float)
      mutating func exchangeSecond<U>(u: U, _ t: T) -> (U, T) {
        x = t
        return (u, x)
      }
    }

    // Substitutes: <T, U> with <Int, Float> producing:
    // S<Int>::exchangeSecond<Float>(u: Float, t: Int) -> (Float, Int)
---
[SILOptimizer] Introduce an eager-specializer pass.
This pass finds generic functions with @_specialized attributes and
generates specialized code for the attribute's concrete types. It
inserts type checks and guarded dispatch at the beginning of the
generic function for each specialization. Since we don't currently
expose this attribute as API and don't specialize vtables and witness
tables yet, the only way to reach the specialized code is by calling
the generic function which performs the guarded dispatch.
In the future, we can build on this work in several ways:
- cross module dispatch directly to specialized code
- dynamic dispatch directly to specialized code
- automated specialization based on less specific hints
- partial specialization
- and so on...
I reorganized and refactored the optimizer's generic utilities to
support direct function specialization as opposed to apply
specialization.
Temporarily reverting @_specialize because stdlib unit tests are
failing on an internal branch during deserialization.
This reverts commit e2c43cfe14, reversing
changes made to 9078011f93.
This pass finds generic functions with @_specialized attributes and
generates specialized code for the attribute's concrete types. It
inserts type checks and guarded dispatch at the beginning of the
generic function for each specialization. Since we don't currently
expose this attribute as API and don't specialize vtables and witness
tables yet, the only way to reach the specialized code is by calling
the generic function which performs the guarded dispatch.
In the future, we can build on this work in several ways:
- cross module dispatch directly to specialized code
- dynamic dispatch directly to specialized code
- automated specialization based on less specific hints
- partial specialization
- and so on...
I reorganized and refactored the optimizer's generic utilities to
support direct function specialization as opposed to apply
specialization.
Pre-specializations were only used by Onone builds, but were kept inside the
standard library dylib anyway. This commit moves all the pre-specializations
into a dedicated Swift module and a dynamic library, which are only used by
Onone builds.
This reduces the code size of libswiftCore.dylib by 4%-5%.
With this re-abstraction, a specialized function has the same calling
convention as if it had been written with the specialized types in the first
place.
In general this results in fewer alloc_stacks and loads/stores.
It can also eliminate some re-abstraction thunks, e.g. if a generic closure is
used in a non-generic context.
In some (hopefully rare) cases it may require adding re-abstraction thunks.
In case a function has multiple indirect results, only the first is converted
to a direct result. This is an open TODO.
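As a loose C-style analogy, not actual SIL, for what the convention change
means:

    #include <cstddef>
    #include <cstdio>
    #include <cstring>

    // Generic convention: values of unknown type are passed and returned
    // indirectly through memory (the analogue of extra alloc_stack/load/store).
    static void doubleIt_generic(const void *in, void *out, std::size_t size) {
      // For the sake of the example, pretend the opaque payload is an int.
      int value;
      std::memcpy(&value, in, size);
      value *= 2;
      std::memcpy(out, &value, size);
    }

    // Specialized convention: with Int substituted, the value is taken and
    // returned directly, with no temporaries or extra copies.
    static int doubleIt_int(int value) { return value * 2; }

    int main() {
      int in = 21, out = 0;
      doubleIt_generic(&in, &out, sizeof(int));
      std::printf("%d %d\n", out, doubleIt_int(in));
      return 0;
    }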
Pre-specializations were only used by Onone builds, but were kept inside the
standard library dylib anyway. This commit moves all the pre-specializations
into a dedicated Swift module and a dynamic library, which are only used by
Onone builds.
This reduces the code size of libswiftCore.dylib by 5%.