In many places, we're interested in whether a type with archetypes *might be* a superclass of another type with the right bindings, particularly in the optimizer. Provide a separate Type::isBindableToSuperclassOf method that performs this check. Use it in the devirtualizer to fix rdar://problem/24993618. Using it might unblock other places where the optimizer is conservative, but we can fix those separately.
We were creating new uses of an argument just prior to erasing it from
the block argument list.
We need to replace references to that value in the side structure we
generate with references to the new value that we're replacing it with.
Fixes SR-884 / rdar://problem/25008398.
LSValue::reduce reduces a set of LSValues (mapped to a set of LSLocations) to
a single LSValue.
It can then be used as the forwarding value for the location.
Previously, we expanded into intermediate nodes and leaf nodes and then went
bottom up, trying to create a single LSValue out of the given LSValues.
Instead, we now use recursion to go top down. This simplifies the code, and it
is fine because we do not expect to run into type trees that are too deep.
Existing test cases ensure correctness.
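The top-down shape of the reduction can be sketched as below. This is only an
illustration under simplified assumptions: TypeNode, ValueMap and this
free-standing reduce() are hypothetical stand-ins, not the real LSValue /
LSLocation classes, and values are modeled as plain strings.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the expansion of a type into locations: a node is
// either a leaf location or an aggregate with child fields.
struct TypeNode {
  std::string path;                // e.g. "s", "s.a", "s.b.x"
  std::vector<TypeNode> children;  // empty for leaf locations
};

// Hypothetical value model: location path -> name of the available value.
typedef std::map<std::string, std::string> ValueMap;

// Top-down reduction: on success, write a single value expression for `node`
// into `out` and return true. Return false if some leaf underneath `node` has
// no known value.
bool reduce(const TypeNode &node, const ValueMap &values, std::string &out) {
  // A value recorded for this exact location can be used directly.
  ValueMap::const_iterator it = values.find(node.path);
  if (it != values.end()) {
    out = it->second;
    return true;
  }
  // A leaf without a recorded value cannot be reduced.
  if (node.children.empty())
    return false;
  // Otherwise reduce every field recursively and aggregate the results.
  std::string aggregate = "aggregate(";
  for (std::size_t i = 0; i < node.children.size(); ++i) {
    std::string child;
    if (!reduce(node.children[i], values, child))
      return false;
    if (i != 0)
      aggregate += ", ";
    aggregate += child;
  }
  out = aggregate + ")";
  return true;
}
```

The recursion depth equals the depth of the type tree, which is why shallow
type trees keep this approach cheap.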
For forwarding on alloc_stacks, we can invalidate the forwardable bit when we
hit the corresponding dealloc_stack.
This helps compilation time as we do not need to propagate these bits down
to subsequent basic blocks.
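A minimal sketch of the idea, assuming one forwardable bit per tracked
location. Kind, Inst and processBlock are hypothetical names used for
illustration only; they are not the actual pass implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction model; not the real SIL classes.
enum class Kind { Store, Load, DeallocStack };

struct Inst {
  Kind kind;
  std::size_t location;  // index of the tracked alloc_stack location
};

// Compute which locations are still forwardable at the end of a block.
// `forwardable` holds one bit per tracked location on entry to the block.
std::vector<bool> processBlock(const std::vector<Inst> &block,
                               std::vector<bool> forwardable) {
  for (const Inst &inst : block) {
    switch (inst.kind) {
    case Kind::Store:
      forwardable[inst.location] = true;   // a stored value can be forwarded
      break;
    case Kind::DeallocStack:
      forwardable[inst.location] = false;  // memory is gone; stop tracking it
      break;
    case Kind::Load:
      break;                               // loads do not change the set
    }
  }
  return forwardable;
}
```

Because the bit is cleared at the deallocation, successors of the block never
see it set and have nothing to propagate for that location.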
Reinstates commit 0c2ca94ef7 with two bug fixes:
*) use-after-free ASan crash
*) wrong check in ValueLifetimeAnalysis::isWithinLifetime
Also includes some refactoring.
We were giving special handling to ApplyInst when we were attempting to use
getMemoryBehavior(). This commit changes the special handling to work on all
full apply sites instead of just ApplyInst. Additionally, we look through partial
applies and thin to thick functions.
I also added a dumper called BasicInstructionPropertyDumper that just dumps the
results of SILInstruction::get{Memory,Releasing}Behavior() for all instructions
in order to verify this behavior.
With this re-abstraction, a specialized function has the same calling convention as if it had been written with the specialized types in the first place.
In general this results in fewer alloc_stacks and loads/stores.
It can also eliminate some re-abstraction thunks, e.g. if a generic closure is used in a non-generic context.
In some (hopefully rare) cases it may require adding re-abstraction thunks.
In case a function has multiple indirect results, only the first is converted to a direct result. This is an open TODO.
We were handling regular uses, but not handling promotions in things
like debug_value_addr.
This was exposed by some pass ordering changes I have in an upcoming
commit.
For a release on a guaranteed function parameter, we know right away
that it's not the final release and therefore does not call deinit.
Therefore we know it does not read or write memory other than the reference
count.
This reduces the compilation time of dead store elimination and redundant load
elimination, as we would otherwise need to query alias analysis to make sure
tracked locations do not alias with it.
After collecting enough information in the first iteration of the
data flow, we skip the second (last) iteration for blocks
without loads, as we will not forward any load there.
This improves compilation time of redundant load elimination.
When we emit calls to existential methods, SILGen produces a sequence of the
three instructions below:
  %2 = open_existential_addr %0 : $*Pingable to $*@opened("1E467EB8-...") Pingable
  %3 = witness_method $@opened("1E467EB8-...") Pingable, #Pingable.ping!1
  apply %3<@opened("1E467EB8-...") Pingable>(%2)
This commit adds a new CSE-like pass that finds sequences of calls to protocol
methods and reuses the first two instructions open_existential_addr and
witness_method. The optimization finds arguments that must not alias and may not
escape and combines all of the existential method calls to use the same method
lookup. The optimization handles control flow by finding the top dominating
open_existential instruction, and uses that instruction.
Related to rdar://22704464.
Similarly to how we've always handled parameter types, we
now recursively expand tuples in result types and separately
determine a result convention for each result.
The most important code-generation change here is that
indirect results are now returned separately from each
other and from any direct results. It is generally far
better, when receiving an indirect result, to receive it
as an independent result; the caller is much more likely
to be able to directly receive the result in the address
they want to initialize, rather than having to receive it
in temporary memory and then copy parts of it into the
target.
The most important conceptual change here that clients and
producers of SIL must be aware of is the new distinction
between a SILFunctionType's *parameters* and its *argument
list*. The former is just the formal parameters, derived
purely from the parameter types of the original function;
indirect results are no longer in this list. The latter
includes the indirect result arguments; as always, all
the indirect results strictly precede the parameters.
Apply instructions and entry block arguments follow the
argument list, not the parameter list.
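The parameter/argument-list distinction can be sketched as follows. This is
a simplified model only: FunctionTypeModel and argumentList are hypothetical
names and are not the SILFunctionType API.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a lowered function type.
struct FunctionTypeModel {
  std::vector<std::string> indirectResults;  // result addresses supplied by the caller
  std::vector<std::string> parameters;       // formal parameters only
};

// The argument list is all indirect results followed by all parameters.
std::vector<std::string> argumentList(const FunctionTypeModel &fn) {
  std::vector<std::string> args = fn.indirectResults;
  args.insert(args.end(), fn.parameters.begin(), fn.parameters.end());
  return args;
}
```

In this model, a function with two indirect results and one parameter has a
parameter list of length one and an argument list of length three, and an
apply of it would pass three arguments.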
A relatively minor change is that there can now be multiple
direct results, each with its own result convention.
This is a minor change because I've chosen to leave
return instructions as taking a single operand and
apply instructions as producing a single result; when
the type describes multiple results, they are implicitly
bound up in a tuple. It might make sense to split these
up and allow e.g. return instructions to take a list
of operands; however, it's not clear what to do on the
caller side, and this would be a major change that can
be separated out from this already over-large patch.
Unsurprisingly, the most invasive changes here are in
SILGen; this requires substantial reworking of both call
emission and reabstraction. It also proved important
to switch several SILGen operations over to work with
RValue instead of ManagedValue, since otherwise they
would be forced to spuriously "implode" buffers.
Now that we process functions in bottom-up order in the pass manager and
have a mechanism to restart the pass pipeline on the current
function (or on a newly created callee function), we can split these
passes back out from the inliner and end up with the same benefits we
had from initially integrating them. We get the further benefit of fully
optimizing newly created callee functions before continuing with the
function that resulted in the creation of those callee
functions (e.g. as a result of a specialization pass running).
This is done by splitting the transformation into an analysis phase and a transformation phase (which does not use the dominator tree anymore).
The dominator tree is recalculated once after the whole function is processed.
This change eventually solves the compile time problem of rdar://problem/24410167.
If a class has @objc ancestry, it can be dynamically overridden, and
therefore we don't know the default case even if we see the full class
hierarchy.
rdar://23228386
The main intention for this change is to eliminate the use of the post/dominator trees in this transformation.
These were re-calculated on every conversion, which caused long compile times for functions with a lot of switch_enum instructions: rdar://problem/24410167
Besides that, the code for collecting the target block's predecessors is now simpler. It's not necessary to handle arbitrary control flow paths because jump threading simplifies the CFG anyway.
Now SimplifyCFG does not use the PostDominanceAnalysis anymore.