Beside fixing the compiler crash, this change also improves the stack-nesting correction mechanisms in the inliners:
* Instead of trying to correct the nesting after each inlining of a callee, correct the nesting once when inlining is finished for a caller function.
This fixes a potential compile time problem, because StackNesting iterates over the whole function.
In worst case this can lead to quadratic behavior in case many begin_apply instructions with overlapping stack locations are inlined.
* Because we are doing it only once for a caller, we can remove the complex logic for checking if it is necessary.
We can just do it unconditionally in case any coroutine gets inlined.
The inliners iterate over all instruction of a function anyway, so this does not increase the computational complexity (StackNesting is roughly linear with the number of instructions).
rdar://problem/47615442
Instead of some special treatment of unreachable blocks, model unreachable as implicitly deallocating all alive stack locations at that point.
This requires an additional forward-dataflow pass. But it now correctly models the problem and fixes a compiler crash.
rdar://problem/47402694
Generalizes the ConcreteExistentialInfo abstraction so it can be used
both by the ExistentialSpecializer and SILCombine, allowing redundant
code in ExistentialSpecializer.cpp to be deleted.
Splits OpenedArchetypeInfo from ConcreteExistentialInfo. Adds a
ConcreteOpenedArchetypeInfo convenience wrapper around them both, for
use wherever we were originally using ConcreteExistentialInfo.
Splits getAddressOfStackInit into getStackInitInst, This is cleaner and
allows both the ExistentialSpecializer and SILCombine to handle more
interesting cases in the future, like unconditional_checked_cast.
Creates utilities, initializeSubstitutionMap, and
initializeConcreteTypeDef to simplify an generalize
ConcreteExistentialInfo.
While rewriting ExistentialSpecializer to use the new
abstraction, I fixed a latent bug in which is was using a SIL
argument index as a function type parameter index (this would
have broken up if/when we decide to enable calls with indirect
results).
This cannot be correctly done as a SILCombine because it must create
new instructions at a previous location. Move the optimization into
CastOptimizer. Insert the new metatype instructions in the correct
spot. And manually do the replaceAllUsesWith and eraseInstruction
steps.
Fixes <rdar://problem/46746188> crash in swift_getObjCClassFromObject.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
This allows Swift code to implement a fast path via a protocol type
check as follows:
if let existentialVal = genericVal as? SomeProtocol {
// do something fast.
}
Fixes <rdar://problem/46322928> Failure to devirtualize a protocol
method applied to an opened existential blocks implemention of
DataProtocol.
Note: the approach of devirtualization via backward pattern matching
is fundamentally wrong and will never be fully general. It should be a
forward type propagation.
This is in preparation for verifying that when ownership verification is enabled
that only enums and trivial values can have any ownership. I am doing this in
preparation for eliminating ValueOwnershipKind::Trivial.
rdar://46294760
Implements a constant interpreter that can deal with basic integer operations.
Summary of the features that it includes:
* builtin integer values, and builtin integer insts
* struct and tuple values, and insts that construct and extract them (necessary to use stdlib integers)
* function referencing and application (necessary to call stdlib integer functions)
* error handling data structures and logic, for telling you why your value is not evaluatable
* metatype values (not necessary for integers, but it's only a few extra lines, so I thought it would be more trouble than it's worth to put them in a separate PR)
* conditional branches (ditto)
Inlining has always been quadratic for no good reason. There was a
special hack for single-block callees that allowed linear inlining.
Instead, the now iterates over blocks and instructions in reverse,
splitting blocks as it inlines. There no longer needs to be special
case for single block callees, and the inliner is linear for all kinds
of callees.
This further simplifies and cleans up the code. There are just a few
basic invariants that the common inliner needs to provide about how
blocks are split and laid out. We can do this if we don't add hacks
for special cases within the inliner. Those invariants allow the
inliner clients to be much simpler and more efficient.
PerformanceInliner still needs to be fixed.
Fixes SR-9223: Inliner exhibits slow compilation time with a large
static array.
A recent SILCloner rewrite removed a special case hack for single
basic block callee functions:
commit c6865c0dff
Merge: 76e6c4157e9e440d13a6
Author: Andrew Trick <atrick@apple.com>
Date: Thu Oct 11 14:23:32 2018
Merge pull request #19786 from atrick/silcloner-cleanup
SILCloner and SILInliner rewrite.
Instead, the new inliner simply merges trivial unconditional branches
after inlining the return block. This way, the CFG is always in
canonical state after inlining. This is more robust, and avoids
interfering with subsequent SIL passes when non-single-block callees
are inlined.
The problem is that inlining a series of calls within a large block
could result in interleaved block splitting and merging operations,
which is quadratic in the block size. This showed up when inlining the
tens of thousands of array subscript calls emitted for a large array
initialization.
The first half of the fix is to simply defer block merging until all
calls are inlined. We can't expect SimplifyCFG to run immediately
after inlining, nor would we want to do that, *especially* for
mandatory inlining. This fix instead exposes block merging as a
trivial utility.
Note: by eliminating some unconditional branches, this change could
reduce the number of debug locations emitted. This does not
fundamentally change any debug information guarantee, and I was unable
to observe any behavior difference in the debugger.
Rewrite the SILCLoners used in SimplifyCFG. For convenience, there is
now simply a BasicBlockCloner and a SILFunctionCloner. It's pretty
obvious what they do and almost impossible to use incorrectly.
This is worthwhile on its own just to make the usage clear, but the
real reason is that after this cleanup, it will be possible to remove
many extraneous calls to global critical edge splitting related to
cloning.
Mostly functionally neutral:
- may fix latent bugs.
- may reduce useless basic blocks after inlining.
This rewrite encapsulates the cloner's internal state, providing a
clean API for the CRTP subclasses. The subclasses are rewritten to use
the exposed API and extension points. This makes it much easier to
understand, work with, and extend SIL cloners, which are central to
many optimization passes. Basic SIL invariants are now clearly
expressed and enforced. There is no longer a intricate dance between
multiple levels of subclasses operating on underlying low-level data
structures. All of the logic needed to keep the original SIL in a
consistent state is contained within the SILCloner itself. Subclasses
only need to be responsible for their own modifications.
The immediate motiviation is to make CFG updates self-contained so
that SIL remains in a valid state. This will allow the removal of
critical edge splitting hacks and will allow general SIL utilities to
take advantage of the fact that we don't allow critical edges.
This rewrite establishes a simple principal that should be followed
everywhere: aside from the primitive mutation APIs on SIL data types,
each SIL utility is responsibile for leaving SIL in a valid state and
the logic for doing so should exist in one central location.
This includes, for example:
- Generating a valid CFG, splitting edges if needed.
- Returning a valid instruction iterator if any instructions are removed.
- Updating dominance.
- Updating SSA (block arguments).
(Dominance info and SSA properties are fundamental to SIL verification).
LoopInfo is also somewhat fundamental to SIL, and should generally be
updated, but it isn't required.
This also fixes some latent bugs related to iterator invalidation in
recursivelyDeleteTriviallyDeadInstructions and SILInliner. Note that
the SILModule deletion callback should be avoided. It can be useful as
a simple cache invalidation mechanism, but it is otherwise bug prone,
too limited to be very useful, and basically bad design. Utilities
that mutate should return a valid instruction iterator and provide
their own deletion callbacks.
With removing of pinning and with addressors, the pattern matching did not work anymore.
The good thing is that the SIL is now much simpler and we can handle the 2D case without pattern matching at all.
This removes a lot of code from COWArrayOpts.
rdar://problem/43863081
NOTE: This is not the final form of how operand ownership restraints will be
represented. This patch is instead an incremental change that extracts out this
functionality from the ownership verifier as a pure refactor.
rdar://44667493
I believe that these were in SILInstruction for historic reasons. This is a
separate API on top of SILInstruction so it makes sense to pull it out into its
own header.
The optimizations now handle the ref_tail_addr instructions for detecting element addresses
(in addition to the array semantics function _getElementAddress).
After _modify for Array subscript lands, we can get rid of _getElementAddress at all.
SIL passes were violating the existing invariant on non-cond-br
critical edges in several places. I fixed the places that I could
find. Wherever there was a post-pass to "clean up" critical edges, I
replaced it with a a call to verification that the critical edges
aren't broken in the first place.
We still need to eliminate critical edges entirely before enabling
ownership SIL.
This utility works by taking in a function_ref and then traverses the transitive
uses of the function_ref until it finds either a use it does not understand
"escape" or an "apply" instruction. It returns a result structure that contains
the final found applications and more importantly a bool telling the caller if
we found any "escaping" uses.
This is intended to be an inverse operation to ApplySite::getCalleeOrigin(). As
such it has a bunch of assertions in it that check that the two stay in sync.
rdar://41146023
In order to make this reasonable, I needed to shift responsibilities
around a little; the devirtualization operation is now responsible for
replacing uses of the original apply. I wanted to remove the
phase-separation completely, but there was optimization-remark code
relying on the old apply site not having been deleted yet.
The begin_apply aspects of this aren't testable independently of
replacing materializeForSet because coroutines are currently never
called indirectly.
With this change and some other changes that I am committing in parallel, the
stdlib and all of the overlays send all proper pass manager notifications.
rdar://42301529