This patch replace all in-memory objects of DebugValueAddrInst with
DebugValueInst + op_deref, and duplicates logics that handles
DebugValueAddrInst with the latter. All related check in the tests
have been updated as well.
Note that this patch neither remove the DebugValueAddrInst class nor
remove `debug_value_addr` syntax in the test inputs.
In OSSA RLE for loops, in certain cases SSAUpdater will not create a new
SILPhiArgument to be used as the forwarding value. Based on dominator info
it may return the newly copied available value as the forwarding value.
This newly copied available value in the dominating predecessor
will have destroy values at leaking blocks.
Rename makeNewValueAvailable to makeValueAvailable and handle users so that only
additional required destroy_values are inserted.
Don't let debug_value instructions bail the optimization.
This fixes a couple of performance regressions, which were introduced by adding more debug_value instructions (https://github.com/apple/swift/pull/38736).
rdar://82327743
Due to mismatch in the instructions handled in DestroyHoisting::getUsedLocationsOfInst
and MemoryLocations::analyzeLocationUsesRecursively, certain users of addresses
were not considered and the destroys were hoisted before valid uses causing use-after-frees
After hoisting the destroy of the copy src, debug_value_addr users of the
copy src become dead, this PR cleans them so that MemoryLifetimeVerifier does not
complain later.
OSSA rauw cleans up end of scope markers before rauw'ing.
This can lead to inadvertant deleting of end_lifetime, later
resulting in an ownership verifier error indicating a leak.
This PR stops treating end_lifetime scope ending like end_borrow/end_access.
SROA and Mem2Reg now can leverage DIExpression -- op_fragment, more
specifically -- to generate correct debug info for optimized SIL. Some
important highlights:
- The new swift::salvageDebugInfo, similar to llvm::salvageDebugInfo,
tries to restore / transfer debug info from a deleted instruction.
Currently I only implemented this for store instruction whose
destination is an alloc_stack value.
- Since we now have source-variable-specific SIL location inside a
`debug_value` instruction (and its friends), this patch teaches
SILCloner and SILInliner to remap the debug scope there in addition
to debug scope of the instruction.
- DCE now does not remove `debug_value` instruction whose associating
with a function argument SSA value that is not used elsewhere. Since
that SSA value will not disappear so we should keep the debug info.
Change the code generation patterns for `async let` bindings to use an ABI based on the following
functions:
- `swift_asyncLet_begin`, which starts an `async let` child task, but which additionally
now associates the `async let` with a caller-owned buffer to receive the result of the task.
This is intended to allow the task to emplace its result in caller-owned memory, allowing the
child task to be deallocated after completion without invalidating the result buffer.
- `swift_asyncLet_get[_throwing]`, which replaces `swift_asyncLet_wait[_throwing]`. Instead of
returning a copy of the value, this entry point concerns itself with populating the local buffer.
If the buffer hasn't been populated, then it awaits completion of the task and emplaces the
result in the buffer; otherwise, it simply returns. The caller can then read the result out of
its owned memory. These entry points are intended to be used before every read from the
`async let` binding, after which point the local buffer is guaranteed to contain an initialized
value.
- `swift_asyncLet_finish`, which replaces `swift_asyncLet_end`. Unlike `_end`, this variant
is async and will suspend the parent task after cancelling the child to ensure it finishes
before cleaning up. The local buffer will also be deinitialized if necessary. This is intended
to be used on exit from an `async let` scope, to handle cleaning up the local buffer if necessary
as well as cancelling, awaiting, and deallocating the child task.
- `swift_asyncLet_consume[_throwing]`, which combines `get` and `finish`. This will await completion
of the task, leaving the result value in the result buffer (or propagating the error, if it
throws), while destroying and deallocating the child task. This is intended as an optimization
for reading `async let` variables that are read exactly once by their parent task.
To avoid an epoch break with existing swiftinterfaces and ABI clients, the old builtins and entry
points are kept intact for now, but SILGen now only generates code using the new interface.
This new interface fixes several issues with the old async let codegen, including use-after-free
crashes if the `async let` was never awaited, and the inability to read from an `async let` variable
more than once.
rdar://77855176
With such alloc_stack we can pass over-consumed values to a
proactively added phi. Even though such a SIL does not have
issues at runtime, they raise an ownership verification error.
Mem2Reg creates phis proactively that may be unnecessary.
Unnecessary phis are those without uses or operands to other
proactive phis that are unnecessary.
Even though this does not translate to a real issue at runtime, we can
see ownership verification errors. This PR identifies such phis and deletes them.
Optimize code like:
puts("\(String.self)")
Optimizing string interpolation and optimizing C-strings are both done in StringOptimization.
A second run of the StringOptimization is needed in the pipeline to optimize such code, because the result of the interpolation-optimization must be cleaned up so that the C-String optimization can kick in.
Also, StringOptimization must handle struct_extract(struct(literal)), where the struct_extract may be in a called function.
To solve a phase ordering problem with inlining String semantics and inlining the `String(stringInterpolation: DefaultStringInterpolation)` constructor, we do a simple analysis of the callee. Doing this simple "interprocedural" analysis avoids relying on inlining that String constructor.
rdar://74941849
TLDR: I fixed a whole in the assembly-vision opt-remark pass where we were not
emitting a remark for end of scope instructions at the beginning of blocks. Now
all of these instructions (strong_release, end_access) should always reliably
have a remark emitted for them.
----
I think that this is a pragmatic first solution to the problem of
strong_release, release_value being the first instruction of a block. For those
who are unaware, this issue is that for a long time we have searched backwards
first for "end of scope" like instructions. This then allows us to identify the
"end of scope" instruction as happening at the end of the previous statement
which is where the developer thinks it should be:
```
var global: Klass
func bar() -> @owned Klass { global }
func foo() {
// We want the remark for the
bar() // expected-remark {{retain}}
}
```
This makes sense since we want to show end of scope instructions as being
applied to the earlier code whose scope it is ending. We can be clear that it is
at the end of the statement by placing the carrot on the end of statement
SourceLoc so there isn't any confusion upon whether or not
That generally has delivered nice looking results, but what if our release is
the first instruction in the block? In that case, we do not have any instruction
that we can immediately use, so traditionally we just gave up and didn't emit
anything. This is not an acceptable solution! We should be able to emit
something for every retain/release in the program if we want users to be able to
rely upon this! Thus we need to be able to get source location information from
somewhere around
First before we begin, my approach here is informed by my seeing over time that
the optimizer does a pretty good job of not breaking SourceLoc info for
terminators.
With that in mind, there are two possible approaches here: using the terminator
from the previous block and searching forward at worst taking the SourceLoc of
the current block's terminator (or earlier if we find a good SourceLoc). I
wasn't sure what the correct thing to do was at the time so I didn't fix the
issue. After some thought, I realized that the correct solution is to if we fail
a backwards search, search forwards. The reason why is that since our remarks
runs late in the optimization pipeline, there is a very high likelihood that if
we aren't folded into our previous block that there is a true need in the
program for conditional control flow here. We want to avoid placing the release
out of such pieces of code since it is misleading to the user:
```
In this example there is a release inside the case for .x but none for .y. In
that case it is possible that we get a release for .f since payload is passed in
at +1 at SILGen time. In such a case, using the terminator of the previous block
would mean that we would have the release be marked as on payload instead of
inside the case of .x. By using the terminator of the releases block, we can
switch payload {
case let .x(f):
...
case let .y:
...
}
```
So using the terminator from the previous block would be
misleading to the user. Instead it is better to pick a location after the release that way
we know at least the instruction we are inferring from must in some sense be
With this fix, we should not emit locations for all retains, releases. We may
not identify a good source loc for all of them, but we will identify them.
optimization pipeline, if our block was not folded into the previous block there
is a very high liklihood that there is some sort of conditional control flow
that is truly necessary in the program. If we
this
generally implies that there is a real side effect in the program that is
requiring conditional code execution (since the optimizer would have folded it).
The reason why is that we are at least going to hit a terminator or a
side-effect having instruction that generally have debug info preserved by the
optimizer.
This rewrites functionality that was mostly disabled but is now ready
to be enabled.
Allow lifetime canonicalization of owned values and function arguments
as a simple stand-alone utility. This is now being called from within
SILCombine, so we should only do the kind of canonicalization that
makes sense in that context.
Canonicalizing other borrow scopes should *not* be invoked as a
single-value cleanup because it affects other lifetimes outside the
borrow scope boundary. It is a somewhat complicated process that
hoists and sinks forwarding instructions and can generate surrounding
compensation code. The copy propagation pass knows how to post-process
the related lifetimes in just the right order. So borrow scope
rewriting should only be done in the copy propagation pass.
Similarly, only do simple canonicalization of owned values and
function arguments at -Onone.
The feature to canoncalize borrow scopes is now ready to be
enabled (-canonical-ossa-rewrite-borrows), but flipping the switch
should be a separate commit. So most of the functionality that was
affected is not exposed by this PR.
Changes:
Split canonicalization of owned lifetimes vs. borrowed lifetimes into
separate utilities. The owned lifetime utility is now back to being
the simple utility that I originally envisioned. So not much happened
to it other than removing complexity.
We now have a separate entry point for finding the starting point for
rewriting borrow scopes:
CanonicalizeBorrowScope::getCanonicalBorrowedDef.
We now have a utility that defines forwarding instructions that we can
treat consistently as part of a guaranteed lifetime,
CanonicalizeBorrowScope::isRewritableOSSAForward.
We now have a utility that defines the uses of a borrowed value that
are considered part of its lifetime,
CanonicalizeBorrowScope::visitBorrowScopeUses. This single utility is
used to implement three different parts of the alogrithm:
1. Find any uses of the borrowed value that need to be propagated
outside the borrow scope
2. RewriteInnerBorrowUses for SILFunction arguments and borrow scopes
with no outer uses.
3. RewriteOuterBorrowUses for borrow scopes with outer uses. Handling
these involves creating new copies outside the borrow scope and
hoisting forwarding instructions.
The end result is that a lot of borrow scopes can be eliminated and
owned values can be forwarded to destructures, reducing copies and
destroys.
If we stop generating borrow scopes for all interior pointers, then
we'll need to design a comparable optimization that works on
"implicit" borrow scopes:
%ownedDef = ...
%element struct_extract %ownedDef
%copy = copy_value %element
apply(@guaranteed %element)
apply(@owned %copy)
destroy %ownedDef
Should be:
%ownedDef = ...
%borrowedElement = destructure_struct @guaranteed %ownedDef
apply(@guaranteed %borrowedElement)
%ownedElement = destructure_struct %ownedDef
apply(@owned %copy)