This bug triggered an "instruction isn't dominated by its operand" verifier crash, or, if the verifier doesn't run, a crash later in IRGen.
rdar://94376582
IterableBackwardReachability just requires an iterable list of gens.
ShrinkBorrowScope, LexicalDestroyHoisting, and SSADestroyHoisting all
need to be able to quickly check whether a given instruction is scope
ending. Use a SmallSetVector rather than a SmallVector for the gens in
all three.
Instead of doing one or two non-iterative BackwardReachability runs,
do a single run of IterativeBackwardReachability. During that, pause
after discovery/local dataflow and use VisitBarrierAccessScopes to
determine which end_access instructions in the discovered region are
barriers. Add those instructions as kills to the dataflow. Finally run
the global dataflow.
Enables SSADestroyHoisting to hoist destroys over loops.
Addresses a correctness issue where access scopes which were open at
barrier blocks were not promoted to barriers, resulting in destroy_addrs
getting hoisted into unrelated access scopes.
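The hazard, schematically (a sketch; types and function names are hypothetical):

  %scope = begin_access [modify] [dynamic] %global : $*X
  %fn = function_ref @mayDeinit : $@convention(thin) () -> ()
  %r = apply %fn() : $@convention(thin) () -> ()   // deinit barrier inside the open scope
  end_access %scope : $*X                          // therefore itself a barrier
  destroy_addr %storage : $*X                      // must not be hoisted above the end_access

Because the scope contains a barrier, its end_access is added as a kill;
otherwise the destroy_addr could be hoisted into the unrelated access scope.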
We must not pre-specialize imported code (unless this is explicitly
requested by the importing module).
Therefore, don't pre-specialize `shared` definitions based on their
pre-specialization attributes.
Rather, only pre-specialize if the pre-specialization is called for
using a `target: "theFunctionToSpecialize"` parameter.
Run OnonePrespecializations before serialization so that module native functions
are not yet marked `shared` and can be identified as native.
rdar://92337361
Jump threading in an unreachable CFG region can lead to a crash (in an assert compiler) or a hang (in a no-assert compiler) in `ValueBase::replaceAllUsesWith`.
Unfortunately I couldn't come up with an isolated SIL test case.
rdar://92267349
IRGen only implements box lowering for single-field boxes at the moment.
We can represent closure contexts that don't capture type info as just
capturing a tuple of the values, so let's do that for now to allow
for initial end-to-end testing of the pass.
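For example (a sketch with hypothetical values and types), a context capturing
an Int and a class reference becomes a single-field box of a tuple:

  %ctx = alloc_box ${ var (Int, Klass) }
  %addr = project_box %ctx : ${ var (Int, Klass) }, 0
  %i = tuple_element_addr %addr : $*(Int, Klass), 0
  store %x to [trivial] %i : $*Int
  %k = tuple_element_addr %addr : $*(Int, Klass), 1
  store %y to [init] %k : $*Klass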
- Add a `[reflection]` bit to `alloc_box` instructions, to indicate that a box
should be allocated with reflection metadata attached.
- Add a `@captures_generics` attribute to SILLayouts, to indicate a type layout
that captures the generic arguments it's substituted with, meaning it can
recreate the generic environment without additional ABI-level arguments, like
a generic partial application can.
This will turn `partial_apply` instructions into explicit box construction and
extraction code sequences. To begin with, recognize when a private function
is only used in partial applications and directly modify the function to be
usable as a closure invocation function. This simplifies the lowering in IRGen
and avoids generating a "partial application forwarder" thunk.
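Schematically (a simplified sketch; names are hypothetical), a partial
application of a private function

  %f = function_ref @closureImpl : $@convention(thin) (Int, @owned Klass) -> ()
  %c = partial_apply [callee_guaranteed] %f(%k) : $@convention(thin) (Int, @owned Klass) -> ()

becomes explicit box construction, with @closureImpl itself rewritten to take
the box as its context parameter:

  %ctx = alloc_box [reflection] ${ var Klass }
  %addr = project_box %ctx : ${ var Klass }, 0
  store %k to [init] %addr : $*Klass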
Sometimes the def-use chain between `end_cow_mutation` and `begin_cow_mutation` contains a phi argument that is wrapped in a struct (e.g. `Array`).
The PhiExpansionPass is supposed to clean that up, but that only works if there are no "unknown" uses of the phi.
With this change, COWOpts can handle such patterns without relying on the PhiExpansionPass.
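For example (a schematic sketch with hypothetical types):

bb1:
  %e = end_cow_mutation %b1 : $Buffer
  %s1 = struct $MyArray (%e : $Buffer)     // the buffer is re-wrapped in its owning struct
  br bb3(%s1 : $MyArray)
bb2:
  br bb3(%s2 : $MyArray)                   // %s2 defined elsewhere
bb3(%arr : $MyArray):                      // the phi carries the struct, not the buffer
  %b = struct_extract %arr : $MyArray, #MyArray.buffer
  (%unique, %m) = begin_cow_mutation %b : $Buffer

COWOpts can now connect the end_cow_mutation with the begin_cow_mutation
through the struct wrapping and the phi directly.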
rdar://91964659
When canonicalizing an owned value's lifetime, also check whether the
value is dead. If it is, track it for deletion. In particular, this
eliminates dead copy_values.
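A minimal sketch:

  %c = copy_value %v : $Klass
  destroy_value %c : $Klass     // %c has no other uses: the whole copy is dead

Both instructions are deleted along with the rest of the dead lifetime.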
Allow round-tripping access to global variables. Previously,
AccessedStorage asserted that global variables were always associated
with a VarDecl. This was to ensure that AccessEnforcementWMO always
recognized the global. Failing to recognize access to a global will
cause a miscompile.
SILGlobalVariable now has all the information needed by
SIL, in particular the 'isLet' flag. Simply replace VarDecl with
SILGlobalVariable in AccessEnforcementWMO to eliminate the need for
the assert.
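Schematically (a sketch with hypothetical names), the 'let'-ness is carried by
the SIL global itself, so AccessEnforcementWMO can recognize accesses without
consulting a VarDecl:

sil_global hidden [let] @globalX : $Int

  %g = global_addr @globalX : $*Int
  %a = begin_access [read] [dynamic] %g : $*Int
  %v = load %a : $*Int
  end_access %a : $*Int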
Make loads and copy_addrs whose source is a cast of the underlying
storage barriers to folding. Destroying the target address may not be
equivalent to destroying the source address: for example, if the target
address has a generic type and the source address is an AnyObject,
specialization may turn the generic type into a trivial type; destroying
that trivial type fails to destroy the original stored AnyObject,
resulting in a leak.
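Schematically (hypothetical names), the pattern that must not be folded:

  %p = unchecked_addr_cast %src : $*AnyObject to $*T
  copy_addr %p to [initialization] %dst : $*T
  destroy_addr %src : $*AnyObject

Folding would replace the pair with `copy_addr [take] %p to [initialization] %dst : $*T`;
the original AnyObject would then only be consumed as a $T, and if
specialization makes $T trivial, nothing releases it and it leaks.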
Previously, destroy_addrs were folded into copy_addrs and load [copy]s
to produce copy_addr [take]s and load [take]s respectively, but only if
the source of the load/copy was exactly the address being destroyed.
Generalize that to a single-block sequence of copy_addrs and load
[copy]s of projections of the address being destroyed.
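For example (a sketch with hypothetical types, assuming the projected loads
jointly cover the destroyed value and no barrier intervenes in the block):

  %f0 = struct_element_addr %a : $*Pair, #Pair.first
  %x = load [copy] %f0 : $*Klass
  %f1 = struct_element_addr %a : $*Pair, #Pair.second
  %y = load [copy] %f1 : $*Klass
  destroy_addr %a : $*Pair

folds to

  %f0 = struct_element_addr %a : $*Pair, #Pair.first
  %x = load [take] %f0 : $*Klass
  %f1 = struct_element_addr %a : $*Pair, #Pair.second
  %y = load [take] %f1 : $*Klass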
Only respect deinit barriers when lexical lifetimes are enabled. If
they aren't, hoist destroy_addrs of all addresses aggressively
regardless of whether doing so involves hoisting over deinit barriers.
Enable the caller and callee to be printed as inlining runs. The
printing is filtered based on -sil-print-function/-sil-print-functions
and includes finer-grained info than those options already provide. The
caller can be printed before and after each callee is inlined, as well
as the callee on its own as it exists when inlining occurs.
Previously, SSADestroyHoisting was attempting to check whether an
unknown use of a variable was an address_to_pointer.
UniqueStorageUseVisitor, however, doesn't call back with that
instruction; instead, it adds the instruction's uses to the stack of
uses to visit. So we need to check whether the used value was produced
by an address_to_pointer or, more generally, whether its type is
Builtin.RawPointer.
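Schematically (hypothetical names):

  %p = address_to_pointer %storage : $*X to $Builtin.RawPointer
  %f = function_ref @takesPointer : $@convention(thin) (Builtin.RawPointer) -> ()
  %r = apply %f(%p) : $@convention(thin) (Builtin.RawPointer) -> ()

Here the visitor reports the apply's use of %p rather than the
address_to_pointer itself.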
Mem2Reg already bailed on load [take]s of
struct_element_addr|tuple_element_addr projections. Expand that to
include load [take]s involving unchecked_addr_cast.
To handle load [take]s of (struct|tuple)_element_addr projections, it
would be necessary to replace the running value with a value obtained
from the original aggregate by recursively destructuring it, replacing
the value at the taken address with undef, and then restructuring.
To handle load [take]s of cast projections, it would be necessary to use
unchecked_value_cast instead of unchecked_bitwise_cast. But we would
still need to use unchecked_bitwise_cast in the load [copy] case,
because otherwise we would lose the original value: unchecked_value_cast
forwards ownership, and not all casts can be reversed (because they may
narrow).
For now, just bail out in the face of these complex load [take]s.
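The pattern now bailed on looks like this (hypothetical types):

  %s = alloc_stack $S
  store %v to [init] %s : $*S
  %c = unchecked_addr_cast %s : $*S to $*T
  %t = load [take] %c : $*T
  dealloc_stack %s : $*S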
The main point of this change is to make sure that a shared function always has a body, both in the optimizer pipeline and in the swiftmodule file.
This is important because the compiler always needs to emit code for a shared function. Shared functions cannot be referenced from outside the module.
In several corner cases we failed to maintain this invariant, which resulted in unresolved-symbol linker errors.
As a side effect of this change, we can drop the shared_external SIL linkage and the IsSerializable flag, which simplifies the serialization and linkage concept.
When optimizing an enum `store` to an `alloc_stack`, require that all uses are in the same block.
Otherwise it could be a `switch_enum` of an optional where the none-case does not have a destroy of the enum value.
After transforming such an `alloc_stack`, the value would leak in the none-case block.
This fixes the same OSSA verification error as the fix for TempRValueOpt in a previous commit.
When optimizing an enum `store` to an `alloc_stack`, require that all uses are in the same block.
Otherwise it could be a `switch_enum` of an optional where the none-case does not have a destroy of the enum value.
After transforming such an `alloc_stack`, the value would leak in the none-case block.
Fixes an OSSA verification error.
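A sketch of the problematic multi-block pattern (hypothetical types):

  %s = alloc_stack $Optional<Klass>
  store %e to [init] %s : $*Optional<Klass>
  switch_enum_addr %s : $*Optional<Klass>, case #Optional.some!enumelt: bb1, case #Optional.none!enumelt: bb2
bb1:
  %d = unchecked_take_enum_data_addr %s : $*Optional<Klass>, #Optional.some!enumelt
  %k = load [take] %d : $*Klass
  destroy_value %k : $Klass
  dealloc_stack %s : $*Optional<Klass>
  br bb3
bb2:
  dealloc_stack %s : $*Optional<Klass>    // no destroy: the stored value is known to be none here
  br bb3

Rewriting this to use %e directly would require a destroy of %e in bb2 that
never existed in the original, so the value would leak there.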
Previously, FindBarrierAccessScopes::checkReachablePhiBarrier was
looking at the terminator of the block itself rather than at the
terminators of its predecessors. In cases where the current block's
terminator was in fact a barrier, that resulted in failing to hoist
any live-in access scopes.
Now that we aren't running the data flow twice, the result was worse: in
cases where the current block's terminator was a barrier but there was
no access scope in play, no barrier would be added at all.
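For example (a sketch with hypothetical names), when deciding whether a
destroy can be hoisted past the start of bb2, it is bb1's terminator that must
be consulted:

bb1:
  %f = function_ref @mayThrowAndDeinit : $@convention(thin) () -> @error Error
  try_apply %f() : $@convention(thin) () -> @error Error, normal bb2, error bb3
bb2(%r : $()):
  destroy_addr %storage : $*X   // hoisting past the block entry must stop at bb1's try_apply

The try_apply of an unknown function is assumed here to be a deinit barrier.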
In order to determine which end_access instructions are barriers to
hoisting, a data flow which looks for access scopes containing barriers
is run. Those scopes that do contain barriers are added to a set. When
the second pass runs, the end_access instructions corresponding to
scopes in that set (i.e. the ends of scopes which contain barriers) are
treated as barriers.
In the common case where there are no barrier access scopes, though,
running two dataflows per variable is wasteful. Avoid that by just
checking whether we found any scopes that are barriers. If we didn't,
then we already visited all the barrier instructions and were told by
BackwardReachability which blocks had reachable ends and begins.
Tweaked the first data flow to record the barriers and the blocks in
DeinitBarriers. In DeinitBarriers::compute, if no access scopes that
are barriers were found, stop there. If any were found, clear what
had been recorded so far and run the second data flow.
In order to be able to clear everything, switched from using
BasicBlockSet and BasicBlockSetVector to SmallPtrSet<SILBasicBlock *>
and SmallPtrSetVector<SILBasicBlock *>.
As was done with store [init], transform instructions like
copy_addr %n to %m
into the sequence
destroy_addr %m
copy_addr %n to [initialization] %m
in order to create more opportunities for hoisting destroys.
After hoisting, if the split doesn't result in the destroy actually
being hoisted, recombine the two instructions.
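After the split, hoisting can move the destroy above intervening non-barrier
code (a sketch):

  destroy_addr %m
  ...                                   // barrier-free instructions
  copy_addr %n to [initialization] %m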