The old invalidation lattice was incorrect because changes to control flow could cause changes to the
call graph, so we've decided to change the way passes invalidate analysis. In the new scheme, the lattice
is replaced with a list of traits that passes preserve or invalidate. The current traits are Calls and Branches.
Now, passes report which traits they preserve, which is the opposite of the previous implementation where
passes needed to report what they invalidate.
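A minimal sketch of the preserve-based scheme, assuming a bitmask representation. Only the trait names Calls and Branches come from this commit; `TraitSet`, `Analysis`, `dependsOn`, and `notifyPassRan` are hypothetical names for illustration, not the compiler's API.

```cpp
// Hypothetical trait set; this commit names only Calls and Branches.
using TraitSet = unsigned;

enum PreservedTraits : TraitSet {
  PreserveNothing  = 0,
  PreserveCalls    = 1u << 0,
  PreserveBranches = 1u << 1,
};

// Hypothetical analysis that records which traits its results depend on.
struct Analysis {
  TraitSet dependsOn;
  bool valid;

  explicit Analysis(TraitSet deps) : dependsOn(deps), valid(true) {}

  // A pass reports what it preserved. The analysis stays valid only if
  // every trait it depends on is in the preserved set.
  void notifyPassRan(TraitSet preserved) {
    if ((dependsOn & ~preserved) != 0)
      valid = false;
  }
};
```

Under this sketch, a pass that forgets to report anything preserves nothing, so the default is conservative.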
Note: I tried to limit this commit to mechanical changes to ease the review. I will clean up some
of the code in a following commit.
Swift SVN r26449
With this change we will devirtualize in trivial cases where mandatory
inlining has exposed opportunities due to substituting types, for
example substituting a struct type into a witness_method where we can
now easily determine exactly what method will be called.
This makes it possible to use @transparent on struct methods that are
dispatched via generic functions, resulting in the opportunity to emit
diagnostics for these methods as well as eliminate the overhead of the
indirect call.
I saw a handful of 10+% perf improvements at -Onone on our benchmarks.
In theory this should allow us to remove the overloads for ++/-- in
FixedPoint.swift.gyb without a performance penalty (and with the proper
overflow diagnostics), but unfortunately if we were to do so, we would
currently dispatch to functions that lack runtime overflow
checks (rdar://problem/20226526).
Swift SVN r26397
Make the clients remove the apply, which paves the way for the clients
to potentially update the call graph when inlining is successful.
Swift SVN r26075
This means that even transparent functions are not inlined into thunks.
Part of rdar://problem/19701613.
This reduces the size of protocol witnesses in the dylib by 24% resulting in a total code size reduction of 5% (in the dylib).
There are no significant changes in the benchmarks.
Swift SVN r25037
This allows us to inline @transparent functions in cases like this:
    @transparent
    func partial<T, U>(x: T, f: (T) -> U) -> U {
      return f(x)
    }

    func applyPartial<U>(x: Int32, f: (Int32) -> U) -> U {
      return partial(x, f)
    }
I had planned on enabling this same behavior in the performance inliner
first in order to be able to test the underlying functionality more
thoroughly, but I hit a blocking issue pretty quickly in type
lowering (rdar://problem/19387372).
Given that @transparent is not currently user-facing, it seems
reasonable to go ahead and enable this now and fix any new issues it
exposes since it should be easy to work around those issues by not using
@transparent.
There were lots of performance differences as a result of this change,
mostly positive. Below I list the 10 largest improvements for each of
-Onone, -O, and -Ounchecked, along with all regressions greater than
10%. I will be opening radars for these.
-Onone
---------------------------------------
ArrayOfPOD 104.200%
ArrayOfGenericPOD 40.700%
TwoSum 37.800%
EditDistance 35.600%
GenericStack 33.500%
SwiftStructuresStack 29.700%
SwiftStructuresInsertionSort 29.300%
Havlak 27.200%
NestedLoop 26.800%
Life 26.400%
SwiftStructuresTrie -15.500%
-O
---------------------------------------
TwoSum 46.200%
GenericStack 37.900%
SwiftStructuresStack 37.100%
Dictionary 30.200%
Forest 27.800%
NSDictionaryImplicitConversion 24.400%
Prims 23.500%
Dictionary2 19.600%
DollarFilter 17.000%
SwiftStructuresQueue 16.600%
NSStringConversion -22.600%
SwiftStructuresTrie -25.900%
PopFrontArray -44.100%
-Ounchecked
---------------------------------------
SwiftStructuresStack 38.900%
GenericStack 37.400%
NSDictionaryImplicitConversion 21.200%
TwoSum 20.500%
Histogram 16.600%
DollarFilter 15.900%
DollarFunction 13.600%
ArrayLiteral 12.900%
Forest 12.300%
Prims 10.300%
ImageProc -10.900%
InsertionSort -11.200%
StrToInt -11.800%
NBody -14.900%
SwiftStructuresTrie -29.900%
Swift SVN r24263
It had exposed a problem with the MemBehavior on a couple of SIL
instructions, which resulted in code motion moving a retain across an
instruction that can release (fixed in r23722).
From the original commit message:
Remove restriction on substituting existentials during mandatory inlining.
Issues around this have now been resolved, so we should now support
anything that Sema lets through.
Fixes rdar://problem/17769717.
Swift SVN r23729
without a valid SILDebugScope. An assertion in IRGenSIL prevents future
optimizations from regressing in this regard.
Introducing SILBuilderWithScope and SILBuilderwithPostprocess to ease the
transition.
This patch is large, but mostly mechanical.
<rdar://problem/18494573> Swift: Debugger is not stopping at the set breakpoint
Swift SVN r22978
This is controlled by a new isWholeModule() attribute in SILModule.
It gives about 9% code size reduction on the benchmark executables.
For test-suite reasons it is currently not done for the stdlib.
Swift SVN r22491
This prevented dead function removal of inlined dead functions. Besides the stdlib, it's mostly
an issue of SIL size (and therefore compile time), because LLVM did remove such functions anyway.
Swift SVN r22301
Fixes part of <rdar://problem/16196801>.
Inline generic functions, but only when:
- There are no unbound archetypes being substituted (due to various
assumptions in TypeSubstCloner about having all concrete types).
- No substitution is an existential (due to
<rdar://problem/17431105>, <rdar://problem/17544901>, and
<rdar://problem/17714025>).
This gets things limping along, but we really need to fix the above
limitations so that mandatory inlining never fails.
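The two restrictions above amount to a predicate over the substitution list. A rough sketch, assuming a simplified model in which each substitution's replacement type is classified up front; `TypeKind`, `Substitution`, and `canInlineGenericApply` are hypothetical names, not the compiler's types.

```cpp
#include <vector>

// Hypothetical classification of the replacement type in a substitution.
enum class TypeKind { Concrete, UnboundArchetype, Existential };

struct Substitution {
  TypeKind replacement;
};

// Returns true only when neither restriction applies: no unbound
// archetypes and no existentials among the substituted types.
bool canInlineGenericApply(const std::vector<Substitution> &subs) {
  for (const Substitution &s : subs) {
    if (s.replacement == TypeKind::UnboundArchetype)
      return false;
    if (s.replacement == TypeKind::Existential)
      return false;
  }
  return true;
}
```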
This doesn't enable inlining generics in the performance inliner. There
is no reason it shouldn't work as well, but there is no compelling
reason to do so now and it could have unintended effects on performance.
Some highlights from PreCommitBench -
O0:
            old (ms)  new (ms)  delta (ms)  speedup
ForLoops     1127.00    294.00      833.00   283.3%
LinkedList    828.00    165.00      663.00   401.8%
R17315246     982.00    288.00      694.00   241.0%
SmallPT      3018.00   1388.00     1630.00   117.4%
StringWalk   1276.00     89.00     1187.00  1333.7%
-- most others improve ~10% --
O3:
            old (ms)  new (ms)  delta (ms)  speedup
Ackermann    4138.00   3724.00      414.00    11.1%
Life           59.00     64.00        5.00    -7.8%
Phonebook    2103.00   1815.00      288.00    15.9%
R17315246     430.00    582.00      152.00   -26.1%
StringWalk   1173.00   1097.00       76.00     6.9%
Ofast:
            old (ms)  new (ms)  delta (ms)  speedup
Ackermann    3505.00   3715.00      210.00    -5.7%
Life           49.00     41.00        8.00    19.5%
Memset        684.00    554.00      130.00    23.5%
Phonebook    2166.00   1769.00      397.00    22.4%
StringWalk    829.00    790.00       39.00     4.9%
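For reading the speedup column: the figures appear consistent with (old / new - 1) * 100. That formula is inferred from the rows, not stated by the benchmark harness, and `speedupPercent` is a hypothetical helper name.

```cpp
#include <cassert>

// Speedup as a percentage of the new time: positive when the new build is
// faster, negative when it is slower. Inferred from the tables above.
double speedupPercent(double oldMs, double newMs) {
  assert(newMs > 0.0 && "new time must be positive");
  return (oldMs / newMs - 1.0) * 100.0;
}
```

For example, ForLoops at O0 gives 1127 / 294 - 1, roughly 2.833, which matches the reported 283.3%. Life at O3 gives 59 / 64 - 1, roughly -0.078, matching -7.8%.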
I've opened the following to track remaining issues that need to be
fixed before we can inline all transparent function applications:
<rdar://problem/17431105>
<rdar://problem/17544901>
<rdar://problem/17714025>
<rdar://problem/17768777>
<rdar://problem/17768931>
<rdar://problem/17769717>
Swift SVN r20378
Move the check for transparency into the caller of getCalleeFunction()
rather than returning nullptr from getCalleeFunction() if the apply
isn't transparent.
Also remove an assert and a nullptr check that can't fail under
reasonable circumstances based on the current SIL design.
Swift SVN r16902
Now the pass does not need to know about the pass manager. We also don't have
runOnFunction or runOnModule anymore because the transformation knows
which module it is processing. The pass itself knows how to invalidate the
analysis, based on the injected pass manager that is internal to the
transformation.
Now our DCE transformation looks like this:
    class DCE : public SILModuleTransform {
      void run() {
        performSILDeadCodeElimination(getModule());
        invalidateAnalysis(SILAnalysis::InvalidationKind::All);
      }
    };
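A self-contained sketch of the injection pattern described here, using stand-in types. `FakeModule`, `FakePassManager`, `ModuleTransform`, `inject`, and `ExampleDCE` are hypothetical; the real SILModuleTransform and pass manager are more involved.

```cpp
#include <cassert>

// Hypothetical stand-ins for the module and the pass manager.
struct FakeModule {};

struct FakePassManager {
  int invalidations = 0;
  void invalidateAllAnalysis() { ++invalidations; }
};

class ModuleTransform {
  FakePassManager *PM = nullptr;
  FakeModule *M = nullptr;

public:
  virtual ~ModuleTransform() = default;

  // The pass manager injects itself and the module before calling run(),
  // so run() needs no parameters.
  void inject(FakePassManager *pm, FakeModule *m) {
    PM = pm;
    M = m;
  }

  virtual void run() = 0;

protected:
  FakeModule *getModule() {
    assert(M && "module was not injected");
    return M;
  }

  // The transform forwards invalidation to the injected pass manager.
  void invalidateAnalysis() {
    assert(PM && "pass manager was not injected");
    PM->invalidateAllAnalysis();
  }
};

class ExampleDCE : public ModuleTransform {
public:
  void run() override {
    FakeModule *module = getModule();
    (void)module; // a real pass would transform the module here
    invalidateAnalysis();
  }
};
```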
Swift SVN r13598
Thanks to the way we've set up our diagnostics engine, there's not actually
a reason for /everything/ to get rebuilt when /one/ diagnostic changes.
I've split them up into five categories for now: Parse, Sema, SIL, IRGen,
and Frontend, plus a set of "Common" diagnostics that are used in multiple
areas of the compiler. We can massage this later.
No functionality change, but should speed up compile times!
Swift SVN r12438
Teach the mandatory inliner to drop debug_value[_addr] instructions
when inlining. Otherwise, we get debug_value's for all of the arguments
splattered all over the place. We want transparent functions to be
effectively ((nodebug)) in C parlance.
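As an illustration, dropping debug instructions while cloning a callee body can be sketched as a filter. The opcode strings follow the instruction names above; `Inst`, `isDebugInst`, and `cloneForMandatoryInlining` are hypothetical and not the inliner's actual interface.

```cpp
#include <string>
#include <vector>

// Hypothetical instruction model: just an opcode name.
struct Inst {
  std::string opcode;
};

bool isDebugInst(const Inst &inst) {
  return inst.opcode == "debug_value" || inst.opcode == "debug_value_addr";
}

// Copies the callee body into the caller, skipping debug_value[_addr] so
// the inlined code carries no per-argument debug instructions.
std::vector<Inst> cloneForMandatoryInlining(const std::vector<Inst> &calleeBody) {
  std::vector<Inst> cloned;
  for (const Inst &inst : calleeBody) {
    if (!isDebugInst(inst))
      cloned.push_back(inst);
  }
  return cloned;
}
```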
Swift SVN r12404
hanging off partial_apply's when it is cleaning them up. These
occur when autoclosure arguments are marked let, because that is
how debug info for lets is recorded.
Swift SVN r12369
In general, this forces SILGen and IRGen code that's grabbing
a declaration to state whether it's doing so to define it.
Change SIL serialization to serialize the linkage of functions
and global variables, which means also serializing declarations.
Change the deserializer to use this stored linkage, even when
only deserializing a declaration, and to call a callback to
inform the client that it has deserialized a new entity.
Take advantage of that callback in the linking pass to alter
the deserialized linkage as appropriate for the fact that we
imported the declaration. This computation should really take
advantage of the relationship between modules, but currently
it does not.
Swift SVN r12090
- Enhance SILBuilder::emitStrongRelease to be smarter.
- Start using emitStrongRelease in type lowering, SILGen,
CapturePromotion (replacing its implementation of the
same logic), and MandatoryInlining (one more place)
- Rename the primitive createStrongRetain/ReleaseInst
instructions to lose their suffix.
- Now that createStrongRetain/ReleaseInst are not special
cases from the naming perspective, remove some special cases
from DeserializeSIL and ParseSIL.
Swift SVN r10449
They are the same as createStrongRetainInst and createStrongReleaseInst, but
peephole away FunctionRefInst. It turns out that there are only a couple of
places in SILGen where this behavior is necessary, and this tramples on the
general pattern used in SILBuilder.
Swift SVN r10448
scanning up the local block to see if it immediately cancels a retain
operation.
Use this in mandatory inlining to zap more retains and releases. Before
this patch, the result LogicValue allocation blocked this optimization,
preventing the partial_apply from being deleted from the case in
rdar://15328833.
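A simplified sketch of the cancellation idea, assuming a block modeled as a vector with the insertion point at the end. It only checks the single preceding instruction, whereas the change described here scans up the block. `Inst`, `operand`, and this `emitStrongRelease` signature are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical instruction model.
struct Inst {
  std::string opcode; // e.g. "strong_retain", "strong_release"
  int operand;        // hypothetical value id
};

// Emits a release of `value` at the end of `block`. If the instruction just
// before the insertion point is a retain of the same value, the pair
// cancels: the retain is erased and no release is emitted.
// Returns true if a release instruction was actually emitted.
bool emitStrongRelease(std::vector<Inst> &block, int value) {
  if (!block.empty() && block.back().opcode == "strong_retain" &&
      block.back().operand == value) {
    block.pop_back();
    return false;
  }
  block.push_back(Inst{"strong_release", value});
  return true;
}
```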
Swift SVN r10447
wraps emitDestroyValue, and create SILBuilders locally when
needed (they are cheap to construct) instead of creating them
once and passing them around.
Swift SVN r10444
invalidating iterators. The off-by-one would lead to it failing to
remove retain/release pairs around partial_apply's on inlined closures,
preventing the partial_apply itself from being removed, and thus making
it look like lots of values (including self) escape.
Fixing this bug allows us to re-enable the String append test and un-XFAIL
three perf tests.
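The original code is not shown here, so the following is only a generic illustration of erasing while iterating without skipping an element. `Inst` and `removeRetainReleases` are hypothetical names.

```cpp
#include <list>
#include <string>

// Hypothetical instruction model.
struct Inst {
  std::string opcode;
};

// Erases every retain/release in the block and returns how many were removed.
int removeRetainReleases(std::list<Inst> &block) {
  int removed = 0;
  for (std::list<Inst>::iterator it = block.begin(); it != block.end();) {
    if (it->opcode == "strong_retain" || it->opcode == "strong_release") {
      // erase() returns the next valid iterator. Incrementing again here
      // would skip the element that follows the erased one.
      it = block.erase(it);
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}
```

Two adjacent retains are the case that an extra increment would miss: the second one would be stepped over and left in place.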
Swift SVN r10425
it inlines after it was invalidated. This was exposed by an unrelated
change I'm about to land. Fixing this required reworking a few things, and
unfortunately seems to have broken the string O(1) optimization again.
Swift SVN r9905
- Introduce emitTupleExtract / emitStructExtract, which fold when their operand is a tuple/struct.
- Rename SILBuilder::createTupleExtractInst -> createTupleExtract, "Inst" isn't used as a suffix.
- Switch capture promotion and DI to use the new functions.
This trims 300 lines out of the stdlib.
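A rough sketch of what folding in emitTupleExtract could look like, assuming a toy value model. `Value`, `Builder`, and the field names are hypothetical stand-ins, not SILBuilder's real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical value model: opaque values, tuples, and tuple extracts.
struct Value {
  enum Kind { Opaque, Tuple, TupleExtract } kind = Opaque;
  std::vector<Value *> elements; // used when kind == Tuple
  Value *operand = nullptr;      // used when kind == TupleExtract
  std::size_t index = 0;         // used when kind == TupleExtract
};

struct Builder {
  std::vector<std::unique_ptr<Value>> owned; // instructions created so far

  // Always creates a new extract instruction.
  Value *createTupleExtract(Value *op, std::size_t i) {
    owned.push_back(std::unique_ptr<Value>(new Value()));
    Value *v = owned.back().get();
    v->kind = Value::TupleExtract;
    v->operand = op;
    v->index = i;
    return v;
  }

  // Folds when the operand is itself a tuple: returns the element directly
  // and creates no instruction. Otherwise falls back to createTupleExtract.
  Value *emitTupleExtract(Value *op, std::size_t i) {
    if (op->kind == Value::Tuple) {
      assert(i < op->elements.size() && "element index out of range");
      return op->elements[i];
    }
    return createTupleExtract(op, i);
  }
};
```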
Swift SVN r9897
SILFunction that it references. Use this in the mandatory inlining
pass to remove deserialized transparent functions, to clean up the
-emit-sil output of the compiler (and presumably speed up compile
time). This implements rdar://15272652
Swift SVN r9699