Replaces an alloc_stack of an enum by an alloc_stack of the payload if only one enum case (with payload) is stored to that location.
For example:
%loc = alloc_stack $Optional<T>
%payload = init_enum_data_addr %loc
store %value to %payload
...
%take_addr = unchecked_take_enum_data_addr %loc
%l = load %take_addr
is transformed to
%loc = alloc_stack $T
store %value to %loc
...
%l = load %loc
https://bugs.swift.org/browse/SR-12710
I am going to use this in mandatory combine, and it seems like a generally
useful transformation.
I also updated the routine to construct its own SILBuilder that injects a user
passed in SILBuilderContext eliminating the bad pattern of passing in
SILBuilders.
This should be an NFC change.
Changes:
* Allow optimizing partial_apply capturing opened existential: we didn't do this originally because it was complicated to insert the required alloc/dealloc_stack instructions at the right places. Now we have the StackNesting utility, which makes this easier.
* Support indirect-in parameters. Not super important, but why not? It's also easy to do with the StackNesting utility.
* Share code between dead closure elimination and the apply(partial_apply) optimization. It's a bit of refactoring and allowed to eliminate some code which is not used anymore.
* Fix an ownership problem: We inserted copies of partial_apply arguments _after_ the partial_apply (which consumes the arguments).
* When replacing an apply(partial_apply) -> apply and the partial_apply becomes dead, avoid inserting copies of the arguments twice.
These changes don't have any immediate effect on our current benchmarks, but will allow eliminating curry thunks for existentials.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.
New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h
Removed:
- Local.h
- Two conflicting CFG.h files
This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.
Move the control flow utilies out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.
Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.
Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
In the previous commit, various methods for adding, replacing, and
removing instructions were duplicate from SILCombiner into
SILInstructionWorklist. Here, SILCombiner is modified to call through
to the methods which were added to SILInstructionWorklist.
- Replaced usage of raw map and vector with the type that wraps the
combination (BlotSetVector); that provided a significant deduplication
since a sizeable portion of the worklist's implementation was in
vector and map management now provided by the BlotSetVector.
- Templated over the type of map and vector used by the blot set vector.
- Added SmallSILInstructionWorklist where the map and vector are
specified to be SmallDenseMap and SmallVector respectively.
- Replaced usages of bare ValueBase with usages of SILValue.
- Renamed zap to resetChecked.
Added a bit of functionality to BlotSetVector, specifically to support
SILInstructionWorklist:
- Made insert return not just the index that the potentially-inserted
item is at on return but additionally whether an insertion occurred,
matching the behavior of llvm::DenseMap::insert.
- Added a method to reserve capacity in the backing vector and map:
BlotSetVector::reserve.
- Added a method to free extra storage used by the backing vector and
map: BlotSetVector::clear.
- Modified SmallBlotSetVector's template parameters so that only a value
and size can be specified--that type will always use SmallVector and
SmallDenseMap for its superclass' VectorT and MapT template
parameters.
- Updated variable names.
In the previous commit, SILInstructionWorklist was added as a verbatim
extraction (modulo some minor style tweaks) of SILCombineWorklist.
Here, SILCombine is moved over to using that renamed type.
Returns `true` if `T.Type` is known to refer to a concrete type. The
implementation allows for the optimizer to specialize this at -O and
eliminate conditional code.
Includes `Swift._isConcrete<T>(T.Type) -> Bool` wrapper function.
If the keypath argument of a keypath access function is a keypath literal instruction, generate the projection inline and remove the access function.
For example, replaces (simplified SIL):
%kp = keypath ... stored_property #Foo.bar
apply %keypath_runtime_function(%root_object, %kp, %addr)
with:
%addr = struct_element_addr %root_object, #Foo.bar
load/store %addr
Currently this only handles stored property patterns.
rdar://problem/36244734
This is the complement to load canonicalization. Although store
canonicalization is not required before diagnostics, it should be
defined in the same utility.
NOTE: I changed all places that the CastOptimizer is created to just pass in
nullptr for now so this is NFC.
----
Right now the interface of the CastOptimizer is muddled and confused. Sometimes
it is returning a value that should be used by the caller, other times it is
returning an instruction that is meant to be reprocessed by the caller.
This series of patches is attempting to clean this up by switching to the
following model:
1. If we are optimizing a cast of a value, we return a SILValue. If the cast
fails, we return an empty SILValue().
2. If we are optimizing a cast of an address, we return a boolean value to show
success/failure and require the user to use the SILBuilderContext to get the
cast if they need to.
Generalizes the ConcreteExistentialInfo abstraction so it can be used
both by the ExistentialSpecializer and SILCombine, allowing redundant
code in ExistentialSpecializer.cpp to be deleted.
Splits OpenedArchetypeInfo from ConcreteExistentialInfo. Adds a
ConcreteOpenedArchetypeInfo convenience wrapper around them both, for
use wherever we were originally using ConcreteExistentialInfo.
Splits getAddressOfStackInit into getStackInitInst, This is cleaner and
allows both the ExistentialSpecializer and SILCombine to handle more
interesting cases in the future, like unconditional_checked_cast.
Creates utilities, initializeSubstitutionMap, and
initializeConcreteTypeDef to simplify an generalize
ConcreteExistentialInfo.
While rewriting ExistentialSpecializer to use the new
abstraction, I fixed a latent bug in which is was using a SIL
argument index as a function type parameter index (this would
have broken up if/when we decide to enable calls with indirect
results).
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
I am going to add the code in a bit that does the notifications. I tried to pass
down the builder instead of the pass manager. I also tried not to change the
formatting.
rdar://42301529
Fixes <rdar://40555427> [SR-7773]:
SILCombiner::propagateConcreteTypeOfInitExistential fails to full propagate type
substitutions.
Fixes <rdar://problem/40923849>
SILCombiner::propagateConcreteTypeOfInitExistential crashes on protocol
compositions.
This rewrite fixes several fundamental bugs in the SILCombiner optimization that
propagates concrete types. In particular, the pass needs to handle:
- Arguments of callee Self type in non-self position.
- Indirect and direct return values of Self type.
- Types that indirectly depend on Self within callee function signature.
- Protocol composition existentials.
- All of the above need to work for protocol extensions as well as witness methods.
- For protocol extensions, conformance lookup should be based on the existential's conformance list.
Additionally, the optimization should not depend on a SILFunction's DeclContext,
which is not serialized. (In fact, we should prevent SIL passes from using
DeclContext). Furthermore, the code needs to be expressed in a way that one can
reason about correctness and invariants.
The root cause of these bugs is that SIL passes are written based on untested
assumptions of Swift type system. A SIL pass needs to handle all verifiable SIL
input because passes need to be composable. Bail-out logic can be added to
simplify the design; however, _the bail-out logic itself cannot make any
assumptions about the language or type system_ that aren't clearly and
explicitly enforced in the SIL verifier. This is a common mistake and major
source of bugs.
I created as many unit tests as I reasonably could to prevent this code from
regressing. Creating enough unit tests to cover all corner cases that were
broken in the original code would be intractable. But the code has been
simplified such that many corner cases disappear.
This opens up some oportunity for generalizing the optimization and eliminating
special cases. However, I want this PR to be limited to fixing correctness
issues only. In the long term, it would be preferable to replace this
optimization entirely with a much more powerful general type propagation pass.
For example:
%0 = string_literal "abc"
%1 = integer_literal 0x8000000000000000
%2 = builtin "stringObjectOr_Int64" (%0, %1)
%3 = integer_literal 0x4000000000000000
%4 = builtin "and_Int64" (%2, %3)
In this case we know that %4 is 0.
Certain patterns of directly applied partial_apply's were not being
inlined. This can happen when a closure is defined in the
implementation of a generic witness method. I noticed this issue while
debugging SR-6254: "Fix (or explain) strange ways to make code >3
times faster/slower".
After the witness method is devirtualization and specialized, the
closure application looks like this:
%pa = partial_apply [callee_guaranteed] %fn() : $@convention(thin) (@in Int) -> @out Int
%cvt = convert_escape_to_noescape %pa
apply %cvt(...)
SILCombine already removes the partial_apply and convert_escape_to_noescape generating:
%thick = thin_to_thick %fn
apply %thick(...)
However, surprisingly, neither the inliner nor SILCombine can handle this.
I cleaned up the code in SILCombine's apply visitor that handles
thin_to_thick and generalized it to handle this and other patterns.
I also added a restriction to this code so that it doesn't break SIL
ownership in the future, as I discussed with Arnold.
Certain patterns of directly applied partial_apply's were not being
inlined. This can happen when a closure is defined in the
implementation of a generic witness method. I noticed this issue while
debugging SR-6254: "Fix (or explain) strange ways to make code >3
times faster/slower".
After the witness method is devirtualization and specialized, the
closure application looks like this:
%pa = partial_apply [callee_guaranteed] %fn() : $@convention(thin) (@in Int) -> @out Int
%cvt = convert_escape_to_noescape %pa
apply %cvt(...)
SILCombine already removes the partial_apply and convert_escape_to_noescape generating:
%thick = thin_to_thick %fn
apply %thick(...)
However, surprisingly, neither the inliner nor SILCombine can handle this.
I cleaned up the code in SILCombine's apply visitor that handles
thin_to_thick and generalized it to handle this and other patterns.
I also added a restriction to this code so that it doesn't break SIL
ownership in the future, as I discussed with Arnold.
Local.cpp was ~3k lines of which 1.5k (i.e. 1/2) was the cast optimizer. This
commit extracts the cast optimizer into its own .cpp and .h file. It is large
enough to stand on its own and allows for Local.cpp to return to being a small
group of helper functions.
I am making some changes in this area due to the change in certain function
conventions caused by the +0-normal-arg work. I am just trying to leave the area
a little cleaner than before.
* Reduce array abstraction on apple platforms dealing with literals
Part of the ongoing quest to reduce swift array literal abstraction
penalties: make the SIL optimizer able to eliminate bridging overhead
when dealing with array literals.
Introduce a new classify_bridge_object SIL instruction to handle the
logic of extracting platform specific bits from a Builtin.BridgeObject
value that indicate whether it contains a ObjC tagged pointer object,
or a normal ObjC object. This allows the SIL optimizer to eliminate
these, which allows constant folding a ton of code. On the example
added to test/SILOptimizer/static_arrays.swift, this results in 4x
less SIL code, and also leads to a lot more commonality between linux
and apple platform codegen when passing an array literal.
This also introduces a couple of SIL combines for patterns that occur
in the array literal passing case.
* Implement a few silcombine transformations for arrays
- Useless existential_ref <-> class conversions.
- mark_dependence_inst depending on uninteresting instructions.
- release(init_existential_ref(x)) -> release(x) when hasOneUse(x)
- Update COWArrayOpt to handle the new forms generated by this.
these aren't massive performance wins, but do shrink the size of SIL when
dealing with arrays.
* Generalize testcase to work on linux and on mac when checking stdlib is enabled.
introduce a common superclass, SILNode.
This is in preparation for allowing instructions to have multiple
results. It is also a somewhat more elegant representation for
instructions that have zero results. Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction. Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.
A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.
Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
It is important to de-serialize the devirtualized function (and its callees), especially because we must make sure that all transparent functions are de-serialized.
SILCombine did not do that. But as we have the same optimization in the Devirtualizer, it's not needed to duplicate the code in SILCombine.
The only reason we had this peephole in SILCombine is that the Devirtualizer pass could not handle partial applies.
So with this change the Devirtualizer can now also handle partial applies.
Fixes rdar://problem/30544344 (again, after my first attempt failed)
Not sure why but this was another "toxic utility method".
Most of the usages fell into one of three categories:
- The base value was always non-null, so we could just call
getCanonicalType() instead, making intent more explicit
- The result was being compared for equality, so we could
skip canonicalization and call isEqual() instead, removing
some boilerplate
- Utterly insane code that made no sense
There were only a couple of legitimate uses, and even there
open-coding the conditional null check made the code clearer.
Also while I'm at it, make the SIL open archetypes tracker
more typesafe by passing around ArchetypeType * instead of
Type and CanType.