The field's ordinal value is used by the Projection abstraction, which is
the basis for efficiently comparing and sorting access paths in SIL. The
ordinal must be cached before it is used by any SIL pass, including the
verifier; otherwise recomputing it on every query causes widespread
quadratic complexity.
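To illustrate the shape of the fix, here is a minimal C++ sketch; it is not
the actual SILOptimizer code, and the names are made up. The point is that a
cached ordinal makes each comparison O(1), whereas recomputing it by walking
the type's field list makes every comparison, and hence every sort, far more
expensive.

  #include <algorithm>
  #include <vector>

  // Hypothetical stand-in for a Projection over a struct/class field.
  struct Projection {
    unsigned FieldOrdinal; // computed once and cached before any pass runs

    bool operator<(const Projection &Other) const {
      // O(1) comparison. Without the cache, recovering the ordinal would
      // mean walking the type's stored-property list on every comparison,
      // making each sort of an access path quadratic in the field count.
      return FieldOrdinal < Other.FieldOrdinal;
    }
  };

  // Comparing and sorting access paths stays cheap: each comparison is a
  // plain integer compare.
  void sortAccessPath(std::vector<Projection> &Path) {
    std::sort(Path.begin(), Path.end());
  }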
Fixes <rdar://problem/50353228> Swift compile time regression with optimizations enabled
In production code, a file that was taking 40 minutes to compile now
takes 1 minute, with more than half of the time in LLVM.
Here's a short script that reproduces the problem. It used to take 30s
and now takes 0.06s:
// swift genlazyinit.swift > lazyinit.sil
// sil-opt ./lazyinit.sil --access-enforcement-opts
let NumProperties = 300

print("""
sil_stage canonical
import Builtin
import Swift
import SwiftShims
public class LazyProperties {
""")

for i in 0..<NumProperties {
  print("""
  // public lazy var i\(i): Int { get set }
  @_hasStorage @_hasInitialValue final var __lazy_storage__i\(i): Int? { get set }
  """)
}

print("""
}
// LazyProperties.init()
sil @$s4lazy14LazyPropertiesCACycfc : $@convention(method) (@owned LazyProperties) -> @owned LazyProperties {
bb0(%0 : $LazyProperties):
%enum = enum $Optional<Int>, #Optional.none!enumelt
""")

for i in 0..<NumProperties {
  let adr = (i * 4) + 2
  let access = adr + 1
  print("""
  %\(adr) = ref_element_addr %0 : $LazyProperties, #LazyProperties.__lazy_storage__i\(i)
  %\(access) = begin_access [modify] [dynamic] %\(adr) : $*Optional<Int>
  store %enum to %\(access) : $*Optional<Int>
  end_access %\(access) : $*Optional<Int>
  """)
}

print("""
return %0 : $LazyProperties
} // end sil function '$s4lazy14LazyPropertiesCACycfc'
""")
Add an interprocedural analysis pass that summarizes the dynamically
enforced formal accesses within a function. These summaries will be
used by a new AccessEnforcementOpts pass to locally fold access scopes
and remove dynamic checks based on whole-module analysis.
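As a rough sketch of what such a summary could look like (hypothetical,
simplified types; the real pass's data structures differ):

  #include <map>
  #include <set>
  #include <string>
  #include <tuple>

  // Hypothetical, simplified model of a per-function access summary.
  enum class AccessKind { Read, Modify };

  struct StorageAccess {
    std::string Storage; // stand-in for an identified storage location
    AccessKind Kind;
    bool operator<(const StorageAccess &O) const {
      return std::tie(Storage, Kind) < std::tie(O.Storage, O.Kind);
    }
  };

  // One summary per function: every dynamically enforced formal access
  // that may occur while it executes, including through its callees.
  using FunctionSummary = std::set<StorageAccess>;
  using ModuleSummaries = std::map<std::string, FunctionSummary>;

  // A client pass can ask whether calling a summarized function may
  // conflict with an access scope that is currently open at the callsite.
  bool mayConflict(const FunctionSummary &Callee, const StorageAccess &Open) {
    for (const StorageAccess &A : Callee)
      if (A.Storage == Open.Storage &&
          (A.Kind == AccessKind::Modify || Open.Kind == AccessKind::Modify))
        return true;
    return false;
  }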
Introduce a common superclass, SILNode.
This is in preparation for allowing instructions to have multiple
results. It is also a somewhat more elegant representation for
instructions that have zero results. Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction. Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.
A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.
Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
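The shape of the hierarchy, and the equality pitfall mentioned above, can be
sketched in a few lines of C++. This is a simplified model, not the actual
class definitions:

  #include <cassert>

  // Simplified model of the hierarchy; not the actual class definitions.
  class SILNode { public: virtual ~SILNode() = default; };

  // Anything that can be used as a SIL value.
  class ValueBase : public SILNode {};

  // Any instruction, including ones with zero results.
  class SILInstruction : public SILNode {};

  // An instruction known to have exactly one result inherits from both,
  // so one object contains *two* distinct SILNode subobjects.
  class SingleValueInstruction : public SILInstruction, public ValueBase {};

  int main() {
    SingleValueInstruction Inst;
    // Upcasting along the two paths yields two different SILNode
    // addresses for the same object -- this is why raw SILNode pointer
    // equality tests need care (see the comment on SILNode).
    SILNode *ViaInst = static_cast<SILInstruction *>(&Inst);
    SILNode *ViaValue = static_cast<ValueBase *>(&Inst);
    assert(ViaInst != ViaValue && "same object, two SILNode subobjects");
  }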
Applying nontrivial generic arguments to a nontrivial SIL layout requires lowered SILType substitution, which requires a SILModule. NFC so far; this is just an API change.
The new instructions are ref_tail_addr, tail_addr, and a new attribute [tail_elems] for alloc_ref.
For details see docs/SIL.rst.
As these new instructions are not generated yet, this is NFC.
Several functionalities have been added to FSO over time, and the logic has become
muddled.
We were always looking at a static image of the SIL and trying to reason about which
function signature optimizations we could apply. This easily leads to muddled logic,
e.g. we need to consider two different function signature optimizations together
instead of independently.
Split the single function that performed all the different analyses in FSO into several
small transformations, each of which does a specific job. After every analysis we produce
a new function, and eventually we collapse all the intermediate thunks into a single thunk.
With this change it will be easier to implement function signature optimizations, as we
can now do them independently.
Small modifications to the test cases.
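Schematically, the new structure behaves like a pipeline of small
transformations; this sketch is purely illustrative and none of the names
come from the actual code:

  #include <functional>
  #include <vector>

  // Purely schematic: each small FSO transformation takes a function and,
  // if profitable, returns a rewritten function (leaving behind a thunk
  // with the old signature).
  struct Function { /* stand-in for a SIL function */ };

  using Transform = std::function<Function(const Function &)>;

  Function runFSOPipeline(Function F, const std::vector<Transform> &Steps) {
    // Each step starts from the previous step's output, so the individual
    // optimizations no longer need to reason about each other.
    for (const Transform &Step : Steps)
      F = Step(F);
    // Finally, all intermediate thunks are collapsed into a single thunk
    // from the original signature to the fully optimized function.
    return F;
  }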
This splits the function signature module pass into two function passes.
Doing so allows us to rewrite callers to use the FSO-optimized function
prior to attempting inlining, while still allowing a substantial amount
of optimization on the current function before attempting FSO on it.
It also helps us move to a model in which a module pass is not used unless
necessary.
I see neither regressions nor improvements on the performance test suite.
functionsignopts.sil and functionsignopt_sroa.sil are modified because the
mangler now takes into account information in the projection tree.
Split the function signature analysis into a separate analysis pass.
This pass is run on every function, and the optimized signature is returned through
getArgDescList and getResultDescList.
The next step is to split the cloning and callsite rewriting into their own function passes.
rdar://24730896
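Schematically, the analysis interface looks like this (a sketch: the
getArgDescList and getResultDescList names are from the actual change, but
the descriptor fields shown here are invented for illustration):

  #include <vector>

  // Sketch of the analysis result; descriptor contents are made up.
  struct ArgumentDescriptor { bool IsDead = false; };
  struct ResultDescriptor { bool IsUnused = false; };

  class FunctionSignatureInfo {
    std::vector<ArgumentDescriptor> ArgDescList;
    std::vector<ResultDescriptor> ResultDescList;

  public:
    // The analysis computes the optimized signature once per function;
    // later phases (cloning, callsite rewriting) only read these lists.
    const std::vector<ArgumentDescriptor> &getArgDescList() const {
      return ArgDescList;
    }
    const std::vector<ResultDescriptor> &getResultDescList() const {
      return ResultDescList;
    }
  };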
"
analysis pass.
This pass is run on every function and the optimized signature is return'ed through the
getArgDescList and getResultDescList.
Next step is to split to cloning and callsite rewriting into their own function passes.
rdar://24730896
This change adds an option for how IsLive is defined/computed. The ProjectionTree
can now choose to ignore epilogue releases and mark a node as dead if its only non-debug
user is an epilogue release.
It can also, as before, mark a node as alive even if its only user is an epilogue release.
Imagine a case where one passes in an array and never accesses its owner
except to release it. In such a case, we *do* want to be able to eliminate
that argument even though there is a release in the function epilogue.
This helps get rid of the retain and release pair at the callsite, i.e.
the guaranteed parameter is eliminated.
rdar://21114206
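In pseudo-C++, the two liveness policies differ only in whether an epilogue
release counts as a real use. This is a sketch with made-up names, not the
ProjectionTree code:

  #include <vector>

  enum class UserKind { Debug, EpilogueRelease, Other };

  // A node is live if it has a user that "counts". Under the new option an
  // epilogue release no longer keeps a node alive, so an argument whose
  // owner is only released in the epilogue becomes dead and can be
  // eliminated, removing the retain/release pair at the callsite.
  bool isLive(const std::vector<UserKind> &Users, bool IgnoreEpilogueReleases) {
    for (UserKind U : Users) {
      if (U == UserKind::Debug)
        continue; // debug users never keep a node alive
      if (U == UserKind::EpilogueRelease && IgnoreEpilogueReleases)
        continue; // optionally treat epilogue releases as non-uses
      return true; // found a real use
    }
    return false;
  }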
LSValue::reduce reduces a set of LSValues (mapped to a set of LSLocations) to
a single LSValue, which can then be used as the forwarding value for the location.
Previously, we expanded into intermediate nodes and leaf nodes and then went
bottom-up, trying to create a single LSValue out of the given LSValues.
Instead, we now use recursion to go top-down. This simplifies the code, and it
is fine because we do not expect to run into type trees that are too deep.
Existing test cases ensure correctness.
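The top-down shape of the new reduction can be sketched with a toy type tree
(hypothetical types; the real code works over LSLocation/LSValue and SIL
types):

  #include <memory>
  #include <string>
  #include <vector>

  // Toy type-tree node: either a leaf with a known forwarding value, or
  // an aggregate whose value must be assembled from its children.
  struct Node {
    std::string AvailableValue; // non-empty when this node has one value
    std::vector<std::unique_ptr<Node>> Children;
  };

  // Top-down recursion: if this node already has a single value, use it;
  // otherwise recurse into the children and aggregate their results.
  // Type trees are shallow in practice, so recursion depth is not a concern.
  std::string reduce(const Node &N) {
    if (!N.AvailableValue.empty())
      return N.AvailableValue;
    std::string Aggregate = "(";
    for (const auto &Child : N.Children)
      Aggregate += reduce(*Child) + ",";
    return Aggregate + ")";
  }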
When we have all the epilogue releases, make sure they cover all the non-trivial
parts of the base. Otherwise, treat it as if we've found no releases for the base.
Currently this is NFC apart from the epilogue dumper. I will wire it up with
function signature optimization in the next commit.
This is part of rdar://22380547
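The coverage rule amounts to an all-or-nothing check, sketched here with
stand-in types:

  #include <set>
  #include <string>

  // All-or-nothing: accept the epilogue releases only if every non-trivial
  // leaf of the base is covered by some release; otherwise behave exactly
  // as if no releases had been found.
  bool coversAllNonTrivialParts(const std::set<std::string> &NonTrivialLeaves,
                                const std::set<std::string> &ReleasedLeaves) {
    for (const std::string &Leaf : NonTrivialLeaves)
      if (!ReleasedLeaves.count(Leaf))
        return false; // partial coverage => treat as no releases
    return true;
  }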
So instead of only being able to match %1 with the release of %1 as in (1), we
can also match %1 with the releases of %2 and %3 (i.e. an exploded
release_value) as in (2).

(1)
  foo(%1)
  strong_release %1

(2)
  foo(%1)
  %2 = struct_extract %1, field_a
  %3 = struct_extract %1, field_b
  strong_release %2
  strong_release %3

This will allow function signature optimization to better move the release
instructions to the callers.
Currently, this is NFC apart from tests using the epilogue match dumper.
Previously, we exploded an argument down to the most-derived fields, i.e. the fields
that can no longer be exploded further, and in the spliced (newly created) function
we formed aggregates if necessary.
Changing this to explode only to the deepest level actually accessed enables us to
create projection tree nodes only for the fields whose level is accessed, instead of
for all fields on all levels.
Note: this also changes the definition of a leaf node. A leaf node now means a node
that has no children in the current explosion (it could, however, have children if
exploded further).
I am refining the old projection tree first before (mostly) copying it to create the
new projection tree.
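As a toy illustration of exploding only along accessed paths (hypothetical
code; the real ProjectionTree works over SIL types):

  #include <map>
  #include <memory>
  #include <string>
  #include <vector>

  // Toy projection tree: children are created only when a deeper level is
  // actually accessed, instead of eagerly for all fields on all levels.
  struct TreeNode {
    std::map<std::string, std::unique_ptr<TreeNode>> Children;

    // A leaf under the new definition: no children in the *current*
    // explosion, even if the underlying type could be exploded further.
    bool isLeaf() const { return Children.empty(); }
  };

  // Record one accessed path, e.g. {"a", "b"} for x.a.b; nodes are created
  // only along this path.
  void recordAccess(TreeNode &Root, const std::vector<std::string> &Path) {
    TreeNode *N = &Root;
    for (const std::string &Field : Path) {
      std::unique_ptr<TreeNode> &Child = N->Children[Field];
      if (!Child)
        Child = std::make_unique<TreeNode>();
      N = Child.get();
    }
  }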