The reason is that, due to ARCCodeMotion, the retains/releases could have moved far enough that
the SourceLoc on them is incorrect. That is why, in the tests that I
had to update, I am moving the expected retain remark from the body of
the function to the return statement, since the retain is now right before the return.
I also went in and cleaned up the logic here a little bit. We now have a notion
of instructions for which we /always/ infer SourceLocs (rr, i.e. retains/releases)
and ones whose own location we use if it is a valid non-inlined location (e.g.
allocations).
This interfered a little bit with my ability to run SIL tests, since the SIL tests
were relying on this not happening to rr so that we would emit remarks on the rr
instructions themselves. I added an option that disables the always-infer
behavior for that test.
That being said, at this point it seems to me that the SourceLoc inference is
really tied to OptRemarkGenerator, and I am going to see if I can move it
there. But that is for a future commit on another day.
A class which is marked as internal or public can be visible outside
of the current file, where indirect use of the VWT is possible.
If the VWT is modified and inlined, it is possible that the offsets will
no longer match, resulting in an invalid dispatch. Limit the pass to
when WMO is enabled, or to when the type is private in non-WMO cases.
A key concept in late ARC optimization is "RC Identity". In short, a result of
an instruction is rc-identical to an operand of the instruction if one can
safely move a retain (release) on the result from after the instruction to a
retain (release) on the operand before the instruction without changing the
program semantics. This creates a simple model where one can work on
equivalence classes of rc-identical values (generally using a dominating
definition as the representative) and thus optimize/pair retains and releases.
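To make this concrete, here is a minimal sketch (hypothetical SIL; Derived and
Base are made-up class types) using upcast, whose result is rc-identical to its
operand:
```
// The result of an upcast is rc-identical to its operand, so a retain on the
// result after the instruction...
%1 = upcast %0 : $Derived to $Base
strong_retain %1 : $Base
// ...can instead be placed on the operand before the instruction without
// changing program semantics:
strong_retain %0 : $Derived
%1 = upcast %0 : $Derived to $Base
```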
When preparing for late ARC optimization, the optimizer will normalize aggregate
ARC operations (retain_value, release_value) into individual strong_retain,
strong_release operations on the non-trivial leaf types of the
aggregate. As an example, a retain_value on a KlassPair would be canonicalized
into two strong_retains, one for the lhs and one for the rhs. When this is done,
the optimizer generally just creates new struct_extracts at the point where the
retain is. In such a case, we may find that the debug_value for the underlying
value is actually on a reformed aggregate whose underlying parts we are
retaining:
```
bb0(%0 : $Builtin.NativeObject):
strong_retain %0 : $Builtin.NativeObject
%1 = struct $Array(%0 : $Builtin.NativeObject, ...)
debug_value %1 : $Array, ...
```
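As a rough sketch (hypothetical SIL; KlassPair is the two-Klass struct from the
larger example further below), the canonicalization step described above looks
like this:
```
// Before: one aggregate ARC operation.
retain_value %0 : $KlassPair
// After: one strong_retain per non-trivial leaf, with new struct_extracts
// created right at the point of the retain.
%1 = struct_extract %0 : $KlassPair, #KlassPair.lhs
strong_retain %1 : $Klass
%2 = struct_extract %0 : $KlassPair, #KlassPair.rhs
strong_retain %2 : $Klass
```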
By looking through rc-identical uses, we can handle a large subset of the
debug_value-on-reformed-aggregate cases above without much effort: ones where
there is a single owning pointer, like Array.
To handle more complex cases we would have to calculate an inverse access path needed to get
back to our value and somehow deal with all of the complexity therein (I am sure
we can do it, I just haven't thought through all of the details).
The only interesting behavior this results in is that when we emit
diagnostics, we just use the name of the rc-identical transitive-use debug_value
without a projection path. This is because the source location associated with
that debug_value belongs to a separate value that is rc-identical to the actual
value that we visited during our opt-remark traversal up the def-use
graph. Consider the example below, noting the comments in the SIL that show
what I attempted to explain above.
```
struct KlassPair {
var lhs: Klass
var rhs: Klass
}
struct StateWithOwningPointer {
var state: TrivialState
var owningPtr: Klass
}
sil @theFunction : $@convention(thin) () -> () {
bb0:
%0 = apply %getKlassPair() : $@convention(thin) () -> @owned KlassPair
// This debug_value's name can be combined...
debug_value %0 : $KlassPair, name "myPair"
// ... with the access path from the struct_extract here...
%1 = struct_extract %0 : $KlassPair, #KlassPair.lhs
// ... to emit a nice diagnostic that 'myPair.lhs' is being retained.
strong_retain %1 : $Klass
// In contrast, in the case below we rely on looking through rc-identity uses
// to find the debug_value. In this case, the source info associated with the
// debug_value (%3) is no longer associated with the underlying access path we
// have been tracking upwards (%1a is in our access path list). Instead, we
// know that the debug_value is rc-identical to whatever value we were
// originally tracking up (%1a), and thus the correct identifier to use is the
// debug_value's name alone (without an access path), since that source
// identifier must name some value in the source that is by itself rc-identical
// to whatever is being manipulated. If we were to emit the access path here
// for an rc-identical use, we would get "myAdditionalState.owningPtr", which
// is misleading since ArrayWrapperWithMoreState does not have a field named
// 'owningPtr'; that field belongs to StateWithOwningPointer. Using the name
// alone is still correct, since rc-identity means a retain_value on the value
// carrying the debug_value is equivalent to one on the access path value we
// found by walking up the def-use graph from our strong_retain's operand.
%0a = apply %getStateWithOwningPointer() : $@convention(thin) () -> @owned StateWithOwningPointer
%1a = struct_extract %0a : $StateWithOwningPointer, #StateWithOwningPointer.owningPtr
strong_retain %1a : $Klass
%2 = struct $Array(%1a : $Klass, ...)
%3 = struct $ArrayWrapperWithMoreState(%2 : $Array, %moreState : MoreState)
debug_value %3 : $ArrayWrapperWithMoreState, name "myAdditionalState"
}
```
In such a case, we use the name from the debug_value and use the debug_value
itself as the loc. This will let me write SIL test cases for
opt-remark-gen.
I am going to add SIL test cases (as well as Swift ones) for future changes
using this technique. In forthcoming commits I will fill in some tests
for the current opt-remark generation here; I just want to get this part in.
We cannot prove that the whole struct is overwritten between two lazy property getters.
We would need AliasAnalysis for this, but it is currently not used in CSE.
rdar://problem/67734844
SemanticARCOpts keeps growing, with various optimizations attached to a single
"optimization" manager. Move it to its own folder in preparation for splitting it
into multiple different optimizations and utility files.
Specifically, we now properly identify the SILLocation of a source location's
final inlined call site and put the remark there instead of in the
callee.
This eliminates opt-remarks on initializers and friends, e.g.:
```
struct KlassPair {
var lhs: Klass // expected-remark {{retain of type 'Klass'}}
// expected-note @-1 {{of 'self.lhs'}}
// expected-remark @-2 {{release of type 'Klass'}}
// expected-note @-3 {{of 'self.lhs'}}
var rhs: Klass // expected-remark {{retain of type 'Klass'}}
// expected-note @-1 {{of 'self.rhs'}}
// expected-remark @-2 {{release of type 'Klass'}}
// expected-note @-3 {{of 'self.rhs'}}
}
```
becomes:
```
struct KlassPair {
var lhs: Klass
var rhs: Klass
}
```
I also added an -Xllvm option (-optremarkgen-visit-implicit-autogen-funcs) that
forces these remarks to be emitted, as a knob for compiler developers. There
is a test that validates the behavior.
LLVM, as of 77e0e9e17daf0865620abcd41f692ab0642367c4, now builds with
-Wsuggest-override. Let's clean up the Swift sources rather than disable
the warning locally.
TLDR: This fixes an ownership verifier assert caused by not placing end_borrows
along paths where an enum provably holds a trivial case. It only happens if
all non-trivial cases of a switch_enum go to "dead end blocks" where the program
will end and we leak objects.
The Problem
-----------
The actual bug here only occurs in cases where we have a switch_enum on an enum
with mixed trivial and non-trivial cases and all of the non-trivial payloaded cases
go to "dead end blocks". As an example, let's look at a simple switch_enum over an
optional where the .some case is a dead end block and we leak the Klass object
into program termination:
```
%0 = load [copy] %mem : $*Optional<Klass>
switch_enum %0 : $Optional<Klass>, case #Optional.some: bbDeadEnd, case #Optional.none: bbContinue
bbDeadEnd(%0a : @owned $Klass): // %0 is leaked into program end!
unreachable
bbContinue:
... // program continue.
```
In this case, if we were only looking at final destroying uses, we would pass a
def without any uses to the ValueLifetimeChecker, which would leave us with no
frontier at all and hence no inserted end_borrows, yielding:
```
%0 = load_borrow %mem : $*Optional<Klass>
switch_enum %0 : $Optional<Klass>, case #Optional.some: bbDeadEnd, case #Optional.none: bbContinue
bbDeadEnd(%0a : @guaranteed $Klass): // %0 is leaked into program end and
// doesn't need an end_borrow!
unreachable
bbContinue:
... // program continue... we need an end_borrow here though!
```
This then trips the ownership verifier, since switch_enum is a transforming
terminator that acts like a forwarding instruction, implying that we need an
end_borrow on the base value along all non-dead-end paths through the program.
Importantly, this is not actually a leak of a value or unsafe behavior, since the
only time we would enter unsafe territory is along paths where the enum was
actually trivial; there, the load_borrow just loaded the trivial enum
value.
The Fix
-------
In order to work around this, I realized that the right solution is to also
include the forwarding consuming uses (in this case, the switch_enum use) when
determining the lifetime; this solves the problem.
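As a hedged sketch, once the switch_enum is counted as a lifetime-ending use of
%0, the frontier is non-empty and the earlier broken example gets the end_borrow
it was missing on the non-dead-end path:
```
%0 = load_borrow %mem : $*Optional<Klass>
switch_enum %0 : $Optional<Klass>, case #Optional.some: bbDeadEnd, case #Optional.none: bbContinue
bbDeadEnd(%0a : @guaranteed $Klass): // still leaks into program end; no end_borrow needed here.
unreachable
bbContinue:
end_borrow %0 : $Optional<Klass> // the end_borrow the broken version was missing.
... // program continue.
```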
That being said, after I made that change, I noticed that I needed to remove my
previous manner of computing the insertion point to use for arguments when
finding the lifetime with ValueLifetimeAnalysis. Previously, since I was using
only the destroying uses, I knew that the destroy_value could not be the first
instruction in the block of my argument, since I handled that case individually
before using the ValueLifetimeAnalysis. That invariant is no longer true, as can
be seen in the case above if %0 came from a SILArgument itself instead of a load
[copy] and we were converting that argument to be a guaranteed argument.
To fix this, I taught ValueLifetimeAnalysis how to handle defs that are
arguments. The key thing I noticed while reading the code is that the analysis
generally only cared about the def's parent block. Beyond that, the def
being an instruction was only needed to determine whether a user is earlier in
the same block than the def instruction. Those concerns do not apply to SILArguments,
which dominate all instructions in their block, so in this patch we just
skip those conditional checks when the def is a SILArgument. The rest of the code
that uses the parent block is the same for both SILArguments and SILInstructions.
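For illustration, a minimal hedged sketch (made-up block and value names) of the
new situation the analysis now accepts: the def is a block argument whose
lifetime-ending use is the very first instruction of that same block. With an
instruction def we would have to check whether the user comes before the def;
an argument dominates every instruction in its block, so that check is skipped.
```
bbN(%arg : @owned $Klass):
destroy_value %arg : $Klass // first instruction of the def's block; fine for an argument def.
...
```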
rdar://65244617
Specifically:
1. I made methods and variables camelCase.
2. I expanded out variable names (e.g. bb -> block, predBB -> predBlocks, U -> wrappedUse).
3. I changed typedef -> using.
4. I changed a few C-style for loops into for-each loops using llvm::enumerate.
NOTE: I left the parts needed for syncing to LLVM in the old style since LLVM
needs these to exist for CRTP to work correctly for the SILSSAUpdater.
For now I am trying out /not/ expanding when ownership is enabled. I still fixed
the problems in it, though. I also put in a force-expand-everything pass to make
sure that in ossa we can still successfully go down this code path if we want
to.
The reason I think this is the right thing to do is that lower aggregate instrs
was originally written to help the low-level ARC optimizer (which only runs on
non-ossa code). To the high-level ARC optimizer this is just noise, so not
expanding keeps the IR simpler.
Another thing to note is that I updated the code so it should work in both ossa
and non-ossa. In order to not perturb the code too much, I had to add a small
optimization to TypeLowering where, when ownership is disabled, we do not reform
aggregates when copying; instead, we just return the passed-in value. This makes
the pass behave the same when optimizing in both modes.
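As a hedged sketch of what this means at the SIL level (hypothetical output;
KlassPair is the two-Klass struct from earlier), copying an aggregate in ossa
reforms it, while with ownership disabled the copy reduces to retains and the
original value is reused:
```
// ossa: copy each non-trivial leaf and reform the aggregate.
%1 = struct_extract %0 : $KlassPair, #KlassPair.lhs
%2 = copy_value %1 : $Klass
%3 = struct_extract %0 : $KlassPair, #KlassPair.rhs
%4 = copy_value %3 : $Klass
%5 = struct $KlassPair(%2 : $Klass, %4 : $Klass)
// non-ossa with the TypeLowering tweak: retain the leaves and keep using %0.
%1 = struct_extract %0 : $KlassPair, #KlassPair.lhs
strong_retain %1 : $Klass
%2 = struct_extract %0 : $KlassPair, #KlassPair.rhs
strong_retain %2 : $Klass
// No new aggregate is formed; the "copy" is just %0 itself.
```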