Re-apply b00dcbe with a small test update, and a small change in pass
ordering.
I measure around a 10% reduction in compile times of release no-assert
builds of the stdlib and StdlibUnitTest.
For release + debug-swift builds, I see a 20% reduction in stdlib compile
time.
My latest measurements show a few regressions at -O:
Calculator
NSError
SetIsSubsetOf
Sim2DArray
There is a small (0.1%) reduction in the libswiftCore.dylib size.
Being able to remove these is a consequence of the reordering that
happened in e50daa6.
The opened archetype already has metatypes stripped off.
The problem was in code that tried to propagate the type from open_existentials
in static existential calls.
%0 = metatype thick ClientSocket.Type
%1 = init_existential_metatype %0 : thick ClientSocket.Type, thick Socket.Type
%2 = open_existential_metatype %1 : thick Socket.Type to thick (@opened(...) Socket).Type
%3 = witness_method opened(...) Socket, #Socket.newWithConfig!1, %2
try_apply %3<@opened(...) Socket>(%2)
We would read the type of '%2', which is a metatype of '@opened(...)', in the
substitution replacement code and compare it to the substitution, which is just
'@opened(...)'. We already computed the archetype earlier, so just use that
instead.
SR-811
rdar://24825970
This reapplies commit 255c52de9f.
Original commit message:
Serialize debug scope and location info in the SIL assembler language.
At the moment it is only possible to test the effects that SIL
optimization passes have on debug information by observing the
effects of a full .swift -> LLVM IR compilation. This change enables us
to write targeted testcases for single SIL optimization passes.
The new syntax is as follows:
    sil-scope-ref ::= 'scope' [0-9]+
    sil-scope ::= 'sil_scope' [0-9]+ '{'
                    sil-loc
                    'parent' scope-parent
                    ('inlined_at' sil-scope-ref )?
                  '}'
    scope-parent ::= sil-function-name ':' sil-type
    scope-parent ::= sil-scope-ref
    sil-loc ::= 'loc' string-literal ':' [0-9]+ ':' [0-9]+
Each instruction may have a debug location and a SIL scope reference
at the end. Debug locations consist of a filename, a line number, and
a column number. If the debug location is omitted, it defaults to the
location in the SIL source file. SIL scopes describe the original
position, within the lexical scope structure, of the Swift expression
from which a SIL instruction was generated. SIL scopes also hold
inlining information.
<rdar://problem/22706994>
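To make the grammar above concrete, here is a minimal sketch of a reader for
a single sil_scope declaration. It is not the SIL parser from the compiler:
the regular expressions, the function name parse_scope, the returned
dictionary layout, and the sample input are all assumptions made for
illustration. Only the productions quoted in the message are modelled.

```python
import re

# Hypothetical helper, not part of the Swift compiler. The regex mirrors the
# productions sil-scope, sil-loc, scope-parent and sil-scope-ref shown above.
SCOPE_RE = re.compile(
    r'sil_scope\s+(?P<id>[0-9]+)\s*\{\s*'
    r'loc\s+"(?P<file>[^"]*)"\s*:\s*(?P<line>[0-9]+)\s*:\s*(?P<col>[0-9]+)\s+'
    r'parent\s+(?P<parent>.+?)'
    r'(?:\s+inlined_at\s+scope\s+(?P<inlined_at>[0-9]+))?'
    r'\s*\}'
)


def parse_scope(text):
    """Parse one sil_scope declaration into a plain dictionary."""
    m = SCOPE_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError("not a sil_scope declaration")

    # scope-parent is either a scope reference or 'function-name : type'.
    parent_text = m.group("parent").strip()
    ref = re.fullmatch(r'scope\s+([0-9]+)', parent_text)
    if ref:
        parent = {"scope": int(ref.group(1))}
    else:
        name, _, ty = parent_text.partition(":")
        parent = {"function": name.strip(), "type": ty.strip()}

    inlined_at = m.group("inlined_at")
    return {
        "id": int(m.group("id")),
        "loc": (m.group("file"), int(m.group("line")), int(m.group("col"))),
        "parent": parent,
        "inlined_at": int(inlined_at) if inlined_at is not None else None,
    }


# Illustrative input only; the file name and numbers are made up.
example = '''
sil_scope 2 {
  loc "main.swift":3:5
  parent scope 1
  inlined_at scope 7
}
'''
print(parse_scope(example))
```

The optional inlined_at clause is returned as None when absent, which matches
the '?' in the grammar.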
Builtin.once() expects thin functions, so we don't need to try to walk
through thin_to_thick_function here.
I suspect this might have been a vestige of having used apply for these
at one point.
Pre-specializations were only used by Onone builds, but were kept inside the standard library dylib anyway. This commit moves all the pre-specializations into a dedicated Swift module and a dynamic library, which are only used by Onone builds.
This reduces the code size of libswiftCore.dylib by 4%-5%.
I measure around a 10% reduction in compile times of release no-assert
builds of the stdlib and StdlibUnitTest.
For release + debug-swift builds, I see a 20% reduction in stdlib compile
time.
I saw no reproducible regressions in the benchmarks, and a few
improvements.
There is a small (0.1%) reduction in the libswiftCore.dylib size.
Being able to remove these is a consequence of the reordering that
happened in e50daa6.
The end goal here is to end up with a good pass ordering that will allow
us to only run one set of these passes, rather than running them
twice. This is a start in that direction.
No real impact measured on compile times as of this change. On
benchmarks I see a mix of regressions and improvements.
-O improvements:
Calculator -17.6% 1.21x
Chars -54.4% 2.19x
PolymorphicCalls -14.7% 1.17x
SetIsSubsetOf -14.1% 1.16x
Sim2DArray -14.1% 1.16x
StrToInt -30.4% 1.44x
-O regressions:
CaptureProp +32.9% 0.75x
DictionarySwap +36.0% 0.74x
XorLoop +39.8% 0.72x
-Ounchecked improvements:
Chars -58.0% 2.38x
-Ounchecked regressions:
CaptureProp +33.3% 0.75x
-Onone improvements:
StrToInt -14.9% 1.18x
StringWalk -47.6% 1.91x
StringWithCString -17.2% 1.21x
(many more smaller improvements)
-Onone regressions:
Calculator +21.5% 0.82x
OpenClose +10.1% 0.91x
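The two numeric columns above appear to be related: assuming the first is the
relative change in run time and the second is the old/new run time ratio, the
ratio can be recomputed from the percentage. The sketch below checks that
reading against three rows of the table; the function name is made up for
illustration.

```python
def speedup_from_percent(delta_percent):
    """Convert a relative run time change (in percent) to an old/new ratio.

    Assumes new_time = old_time * (1 + delta_percent / 100).
    """
    return 100.0 / (100.0 + delta_percent)


# Rows taken from the -O results above.
for name, delta in [("Calculator", -17.6), ("Chars", -54.4), ("CaptureProp", 32.9)]:
    print(f"{name}: {speedup_from_percent(delta):.2f}x")
# Calculator: 1.21x
# Chars: 2.19x
# CaptureProp: 0.75x
```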
We ignore substitutions from the conformance, using the Self type
substitution from the call site instead.
The new SILFunctionType::getDefaultWitnessMethodProtocol() method
is used to figure out what "shape" the Self substitutions need
to take.
This is cleaner than it was before the method was added, but is
still a bit of a hack; more and more it appears that we need to
stop thinking of witness_method as a separate calling convention,
and design what @rjmccall described as "abstraction patterns for
generic signatures" instead.
We were using stripCasts in some places and getRCIdentityRoot in others.
stripCasts is not identical to getRCIdentityRoot.
In particular, it does not look through struct_extract, tuple_extract,
unchecked_enum_data.
Created struct and tuple test cases to make sure things are optimized
as they should be.
We already had a test case for unchecked_enum_data.
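The difference between the two helpers can be shown with a toy model. This is
not the compiler's code: the Value class and both Python functions are
hypothetical stand-ins, and the real getRCIdentityRoot applies extra
conditions that are ignored here. The sketch only encodes the statement above
that one walk stops at struct_extract, tuple_extract and unchecked_enum_data
while the other looks through them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Value:
    """Toy SIL value: an opcode plus the single operand it forwards, if any."""
    opcode: str
    operand: Optional["Value"] = None


CASTS = {"upcast", "unchecked_ref_cast"}
PROJECTIONS = {"struct_extract", "tuple_extract", "unchecked_enum_data"}
LOOK_THROUGH = CASTS | PROJECTIONS


def strip_casts(v):
    # Looks through casts only; stops at the first projection.
    while v.opcode in CASTS and v.operand is not None:
        v = v.operand
    return v


def rc_identity_root(v):
    # Simplified assumption: also looks through the projections listed above.
    while v.opcode in LOOK_THROUGH and v.operand is not None:
        v = v.operand
    return v


arg = Value("argument")
chain = Value("upcast", Value("struct_extract", arg))

print(strip_casts(chain).opcode)       # struct_extract
print(rc_identity_root(chain).opcode)  # argument
```

Mixing the two helpers means the same chain can resolve to two different
roots, which is the inconsistency this change removes.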
We were giving special handling to ApplyInst when we were attempting to use
getMemoryBehavior(). This commit changes the special handling to work on all
full apply sites instead of just ApplyInst. Additionally, we look through
partial_apply and thin_to_thick_function.
I also added a dumper called BasicInstructionPropertyDumper that just dumps the
results of SILInstruction::get{Memory,Releasing}Behavior() for all instructions
in order to verify this behavior.
With this re-abstraction a specialized function has the same calling convention as if it had been written with the specialized types in the first place.
In general this results in fewer alloc_stacks and load/stores.
It can also eliminate some re-abstraction thunks, e.g. if a generic closure is used in a non-generic context.
In some (hopefully rare) cases it may require adding re-abstraction thunks.
If a function has multiple indirect results, only the first is converted to a direct result. This is an open TODO.
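The multiple-results limitation can be sketched as follows. This is a toy
model under stated assumptions, not the specializer's implementation: results
are modelled as (type name, convention) pairs, the function name is
hypothetical, and the conversion is applied unconditionally, whereas a real
implementation would also have to decide whether the specialized type can be
returned directly at all.

```python
def reabstract_results(results):
    """Convert only the first indirect result to a direct one.

    `results` is a list of (type_name, convention) pairs, where convention
    is either "direct" or "indirect". Later indirect results are left alone,
    mirroring the open TODO described above.
    """
    converted = False
    out = []
    for type_name, convention in results:
        if convention == "indirect" and not converted:
            out.append((type_name, "direct"))
            converted = True
        else:
            out.append((type_name, convention))
    return out


print(reabstract_results([("Int", "indirect"), ("String", "indirect")]))
# [('Int', 'direct'), ('String', 'indirect')]
```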
Currently the array.get_element calls return the element as an indirect result.
The generic specializer will change so that the element can be returned as a direct result.
iterator/pointer comparison issue that yields undefined behavior. This updates
Swift for the landing of this change in swift-llvm/stable.
I am going to cherry-pick the given change into swift-llvm/stable since there is no
reason not to do this now and it will prevent more of these conversions from
creeping into the code base.
We really want to avoid as much undefined behavior as we possibly can.
We were handling regular uses, but not handling promotions in things
like debug_value_addr.
This was exposed by some pass ordering changes I have in an upcoming
commit.
Pre-specializations were only used by Onone builds, but were kept inside the standard library dylib anyway. This commit moves all the pre-specializations into a dedicated Swift module and a dynamic library, which are only used by Onone builds.
This reduces the code size of libswiftCore.dylib by 5%.