IRGen only implements box lowering for single-field boxes at the moment.
We can represent closure contexts that don't capture type info as just
capturing a tuple of the values, so let's do that for now to allow
for initial end-to-end testing of the pass.
- Add a `[reflection]` bit to `alloc_box` instructions, to indicate that a box
should be allocated with reflection metadata attached.
- Add a `@captures_generics` attribute to SILLayouts, to indicate a type layout
that captures the generic arguments it's substituted with, meaning it can
recreate the generic environment without additional ABI-level arguments, like
a generic partial application can.
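A sketch of how the first of these might print in textual SIL (the exact spelling and placement of the attribute is an assumption here, mirroring existing alloc_box attributes such as [dynamic_lifetime]):

  // Box allocated with reflection metadata attached (hypothetical printing).
  %box = alloc_box [reflection] ${ var Int }
  %addr = project_box %box : ${ var Int }, 0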
This will turn `partial_apply` instructions into explicit box construction and
extraction code sequences. To begin with, recognize when a private function
is only used in partial applications and directly modify the function to be
usable as a closure invocation function. This simplifies the lowering in IRGen
and avoids generating a "partial application forwarder" thunk.
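As a very rough sketch of the direction (the callee, capture types, and exact sequence are hypothetical, not the final lowering), a partial application of two trivial captures could become explicit construction of a box holding a tuple of the captured values, with the private callee rewritten in place to take that box as its context:

  // Before: the context is implicit in the partial application.
  %closure = partial_apply [callee_guaranteed] %callee(%a, %b) : $@convention(thin) (Int, Int, Int) -> Int

  // After (sketch): the context is an explicit single-field box of the captures.
  %box = alloc_box ${ var (Int, Int) }
  %ctx = project_box %box : ${ var (Int, Int) }, 0
  %captures = tuple (%a : $Int, %b : $Int)
  store %captures to [trivial] %ctx : $*(Int, Int)
  // %callee itself is modified to serve as the closure invocation function,
  // so IRGen does not need to emit a partial application forwarder thunk.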
Sometimes the def-use chain between `end_cow_mutation` and `begin_cow_mutation` has a phi term which is wrapped in a struct (e.g. `Array`).
The PhiExpansionPass is supposed to clean that up, but this only works if there are no "unknown" uses of the phi term.
With this change, COWOpts can handle such patterns without relying on the PhiExpansionPass.
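The pattern looks roughly like this (the types, field, and block structure are illustrative):

  bb1:
    %b1 = end_cow_mutation %buf : $Buffer
    %a1 = struct $MyArray (%b1 : $Buffer)
    br bb3(%a1 : $MyArray)
  bb2:
    ...                                 // some other MyArray value %a2
    br bb3(%a2 : $MyArray)
  bb3(%arr : $MyArray):
    %b2 = struct_extract %arr : $MyArray, #MyArray.buffer
    (%unique, %b3) = begin_cow_mutation %b2 : $Buffer

COWOpts can now look through the struct wrapping and the phi to pair the end_cow_mutation in bb1 with the begin_cow_mutation in bb3.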
rdar://91964659
When canonicalizing an owned value's lifetime, also check whether the
value is dead. If it is, track it for deletion. In particular, this
eliminates dead copy_values.
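For example (a minimal sketch with a hypothetical class type), a copy whose only use is its own destroy is dead, and both instructions are deleted:

  %copy = copy_value %v : $Klass
  destroy_value %copy : $Klass   // %copy has no other uses: both go away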
Allow round-tripping access to global variables. Previously,
AccessedStorage asserted that global variables were always associated
with a VarDecl. This was to ensure that AccessEnforcementWMO always
recognized the global. Failing to recognize access to a global will
cause a miscompile.
SILGlobalVariable now has all the information needed by SIL, in
particular the 'isLet' flag. Simply replace VarDecl with
SILGlobalVariable in AccessEnforcementWMO to eliminate the need for the
assert.
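For context (a sketch; the global and function are hypothetical), a round-tripped or hand-written .sil file can contain a global and accesses to it with no associated VarDecl at all, and AccessedStorage must still identify the access:

  sil_global @g : $Builtin.Int64

  sil @read_g : $@convention(thin) () -> Builtin.Int64 {
  bb0:
    %addr = global_addr @g : $*Builtin.Int64
    %access = begin_access [read] [dynamic] %addr : $*Builtin.Int64
    %value = load %access : $*Builtin.Int64
    end_access %access : $*Builtin.Int64
    return %value : $Builtin.Int64
  }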
Make loads and copy_addrs of casts of the underlying storage barriers to
folding. Destroying the target address may not be equivalent to
destroying the source address: for example, if the target address has a
generic type and the source address is typed AnyObject, specialization
may turn the generic type into a trivial type; the destruction of that
trivial type fails to destroy the original stored AnyObject, resulting
in a leak.
Previously, destroy_addrs were folded into copy_addrs and load [copy]s
to produce copy_addr [take]s and load [take]s respectively, but only if
the source of the load/copy was exactly the address being destroyed.
Generalize that to a single-block sequence of copy_addrs and load
[copy]s of projections of the address being destroyed.
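A sketch of the basic fold and of the new barrier (names and types are illustrative):

  // Basic fold: the source of the copy is exactly the destroyed address.
  copy_addr %src to [initialization] %dst : $*Klass
  destroy_addr %src : $*Klass
  //   ==>
  copy_addr [take] %src to [initialization] %dst : $*Klass

  // Barrier: the copy reads through a cast of the destroyed address.
  %cast = unchecked_addr_cast %obj : $*AnyObject to $*T
  copy_addr %cast to [initialization] %dst2 : $*T
  destroy_addr %obj : $*AnyObject
  // Folding to 'copy_addr [take] %cast' would destroy through $*T; once T is
  // specialized to a trivial type, the stored AnyObject would leak.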
Only respect deinit barriers when lexical lifetimes are enabled. If
they aren't, hoist destroy_addrs of all addresses aggressively
regardless of whether doing so involves hoisting over deinit barriers.
Enable the caller and callee to be printed as inlining runs. The
printing is filtered based on -sil-print-function/-sil-print-functions
and provides finer-grained output than those options do on their own.
The caller can be printed before and after each callee is inlined, and
the callee itself can be printed as it exists at the time it is inlined.
Previously, SSADestroyHoisting attempted to check whether an unknown
use of a variable was an address_to_pointer.
UniqueStorageUseVisitor, however, never calls back with that
instruction; instead, it adds that instruction's uses to the stack of
uses to visit. So we need to check whether the use was produced by an
address_to_pointer or, more generally, whether its type is a
BuiltinRawPointerType.
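The pattern in question, roughly (names are hypothetical):

  %ptr = address_to_pointer %addr : $*Klass to $Builtin.RawPointer
  // The visitor never reports the address_to_pointer itself as the unknown
  // use; it reports the uses of %ptr, so the check must look at whether the
  // used value's type is Builtin.RawPointer.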
Previously, load [take]s of struct_element_addr or tuple_element_addr
projections already caused Mem2Reg to bail. Expand that to also include
load [take]s involving unchecked_addr_cast.
To handle load [take]s of (struct|tuple)_element_addr projections, it
would be necessary to replace the running value with a value obtained
from the original product by recursive destructuring, replacing the
value at the load [take]n address with undef, and then restructuring.
To handle load [take]s of cast projections, it would be necessary to use
unchecked_value_cast instead of unchecked_bitwise_cast. But we would
still need to use unchecked_bitwise_cast in the load [copy] case because
otherwise we would lose the original value: unchecked_value_cast
forwards ownership, and not all casts can be reversed (because they may
narrow).
For now, just bail out in the face of these complex load [take]s.
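A sketch of the kind of load [take] that now causes a bail-out (types are illustrative):

  %stk = alloc_stack $Optional<Klass>
  store %v to [init] %stk : $*Optional<Klass>
  %cast = unchecked_addr_cast %stk : $*Optional<Klass> to $*Klass
  %taken = load [take] %cast : $*Klass
  // Promoting %stk would require rewriting this with unchecked_value_cast of
  // the running value, which forwards ownership and cannot always be reversed.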
The main point of this change is to make sure that a shared function always has a body, both in the optimizer pipeline and in the swiftmodule file.
This is important because the compiler always needs to emit code for a shared function; shared functions cannot be referenced from outside the module.
In several corner cases we failed to maintain this invariant, which resulted in unresolved-symbol linker errors.
As a side effect of this change we can drop the shared_external SIL linkage and the IsSerializable flag, which simplifies the serialization and linkage concept.
When optimizing an enum `store` to an `alloc_stack`, require that all uses are in the same block.
Otherwise it could be a `switch_enum` of an optional where the none-case does not have a destroy of the enum value.
After transforming such an `alloc_stack`, the value would leak in the none-case block.
This fixes the same OSSA verification error that was fixed for TempRValueOpt in a previous commit.
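Schematically, the problematic shape is (types are hypothetical; the some case consumes the payload, the none case touches nothing):

  bb0:
    %stk = alloc_stack $Optional<Klass>
    store %opt to [init] %stk : $*Optional<Klass>
    switch_enum_addr %stk : $*Optional<Klass>, case #Optional.some!enumelt: bb1, case #Optional.none!enumelt: bb2
  bb1:
    %payload = unchecked_take_enum_data_addr %stk : $*Optional<Klass>, #Optional.some!enumelt
    %k = load [take] %payload : $*Klass
    ...
  bb2:
    // no destroy here: a none optional holds nothing to destroy in memory
    ...

Replacing the stack slot with the stored enum value would leave that owned value unconsumed on the bb2 path.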
When optimizing an enum `store` to an `alloc_stack`, require that all uses are in the same block.
Otherwise it could be a `switch_enum` of an optional where the none-case does not have a destroy of the enum value.
After transforming such an `alloc_stack`, the value would leak in the none-case block.
Fixes an OSSA verification error.
Previously, FindBarrierAccessScopes::checkReachablePhiBarrier looked at
the terminator of the block itself rather than at the terminators of its
predecessors. In cases where the current block's terminator was in fact
a barrier, that resulted in failing to hoist any live-in access scopes.
Now that the data flow is no longer run twice, the consequence is worse:
in cases where the current block's terminator was a barrier but there
was no access scope in play, no barrier would be added at all.
In order to determine which end_access instructions are barriers to
hoisting, a data flow which looks for access scopes containing barriers
is run. Those scopes that do contain barriers are added to a set. When
the second pass runs, the end_access instructions corresponding to
scopes in that set (i.e. the ends of scopes which contain barriers) are
treated as barriers.
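For example (a sketch; the apply and addresses are illustrative), a deinit barrier inside an open access scope makes that scope's end_access a barrier as well:

  %scope = begin_access [modify] [static] %other : $*Klass
  %r = apply %mayRunDeinit() : $@convention(thin) () -> ()  // deinit barrier
  end_access %scope : $*Klass                // therefore also a barrier
  destroy_addr %storage : $*Klass            // not hoisted past the end_access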
In the common case where there are no barrier access scopes, though,
running two dataflows per variable is wasteful. Avoid that by just
checking whether we found any scopes that are barriers. If we didn't,
then we already visited all the barrier instructions and were told by
BackwardReachability which blocks had reachable ends and begins.
Tweaked the first data flow to record the barriers and the blocks in
DeinitBarriers. In DeinitBarriers::compute, if no access scopes that
are barriers were found, stop there. If any were found, clear what
had been recorded so far and run the second data flow.
In order to be able to clear everything, switched from using
BasicBlockSet and BasicBlockSetVector to SmallPtrSet<SILBasicBlock *>
and SmallPtrSetVector<SILBasicBlock *>.
As was done with store [init], transform instructions like
copy_addr %n to %m
into the sequence
destroy_addr %m
copy_addr %n to [initialization] %m
in order to create more opportunities for hoisting destroys.
After hoisting, if splitting did not result in the destroy actually
being hoisted, recombine the two instructions.
Previously, all arguments using an inout convention were hoisted
ignoring deinit barriers. That was incorrect because @inout_aliasable
addresses may be modified but are not guaranteed exclusive access.
Here, that's fixed by only allowing arguments with the @inout
convention to be hoisted.
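For reference (a sketch; the functions are hypothetical), the two conventions appear in SIL function types as:

  // @inout guarantees exclusive access for the duration of the call, so its
  // destroys may still be hoisted without regard to deinit barriers.
  sil @takesInout : $@convention(thin) (@inout Klass) -> ()

  // @inout_aliasable permits modification without exclusivity (e.g. a mutable
  // capture of a non-escaping closure), so it is no longer hoisted this way.
  sil @capturesVar : $@convention(thin) (@inout_aliasable Klass) -> ()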
If a load [copy] appears near the end of the scope protecting access to
another address and a destroy_addr of the loaded address appears
afterwards, don't fold the destroy into the scope. The reason is that
doing so could allow a deinit which previously executed outside the
exclusivity scope to subsequently execute within it.
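The hazardous shape looks roughly like this (a sketch; names are illustrative):

  %access = begin_access [modify] [static] %other : $*Other
  ...
  %v = load [copy] %addr : $*Klass
  end_access %access : $*Other
  destroy_addr %addr : $*Klass
  // Folding would produce 'load [take] %addr' inside the scope, moving the
  // deinit of the stored value into the exclusivity scope.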
Mandatory copy propagation was primarily a stop-gap until lexical
lifetimes were implemented. It supposedly made variables' lifetimes
more consistent between -O and -Onone builds. Now that lexical
lifetimes are enabled, it is no longer needed for that purpose (and
will never satisfactorily meet that goal anyway).
Mandatory copy propagation may be enabled again later as a -Onone
optimization. But that requires a more careful audit of the effect on
debug information.
For now, it should be disabled.
Assertion failed: (succeed && "should be filtered by
FindBorrowScopeUses"), function canonicalizeFunctionArgument, file
CanonicalizeBorrowScope.cpp, line 798
Canonicalization for guaranteed function arguments is triggered by
SILCombine without any up-front analysis. Because the canonicalization
rewrites the function argument's copies in place, it must always
succeed.
Fix the visitBorrowScopeUses utility to be aware that it is being
invoked on a function argument and avoid bailing out.
Before hoisting destroy_addrs, we split store [assign]s into
destroy_addrs and store [init]s. If those destroy_addrs were not able
to be hoisted, though, recombine the two back into a store [assign].
Doing so avoids introducing extra ARC traffic.
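Concretely (a sketch):

  store %v to [assign] %addr : $*Klass
  //   is split into
  destroy_addr %addr : $*Klass
  store %v to [init] %addr : $*Klass
  // so the destroy_addr can be hoisted; if it is not, the pair is recombined
  // into the original store [assign].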
Added a second backward reachability data flow that determines whether
any open access scopes contain barriers. The end_access instructions
for those access scopes are themselves barriers.
If the destroy_addr's barrier is an end_access, try to fold with copies
or loads that occur inside the scope so long as there are no barriers
between the destroy_addr and the instruction it is to be folded with.
Extract the code for classifying instructions out of the one data flow
where it is currently used and into the DeinitBarriers type. This will
facilitate a second data flow which needs access to the same info, as
well as adding an isBarrier member function to DeinitBarriers for use by
folding.
In addition to hoisting destroy_addrs for alloc_stacks and function
arguments, also hoist destroy_addrs for begin_access [modify] scopes.
Hoist starting from the innermost scopes and proceeding outwards.