Now we get opt-remarks like the following:
```
public func condCast5<NS, T>(_ ns: NS) -> T? {
// Make sure the colon info is right so that the arrow is under the a.
//
// Today, we lose that x was assigned.
if let x = ns as? T { // expected-remark @:17 {{conditional runtime cast of value with type 'NS' to 'T'}}
return x // expected-note @-5:32 {{of 'ns'}}
}
return nil
}
```
For example, now we get the following diagnostic on globals:
public func getGlobal() -> Klass {
return global // expected-remark @:5 {{retain of type 'Klass'}}
// expected-note @-5:12 {{of 'global'}}
+ // expected-remark @-2:12 {{begin exclusive access to value of type 'Klass'}}
+ // expected-note @-7:12 {{of 'global'}}
+ // expected-remark @-4 {{end exclusive access to value of type 'Klass'}}
+ // expected-note @-9:12 {{of 'global'}}
+
}
and for classes when we can't eliminate the access:
+func simpleInOut() -> Klass {
+ let x = Klass() // expected-remark @:13 {{heap allocated ref of type 'Klass'}}
+ // expected-note @-1:9 {{of 'x'}}
+ simpleInOutUser(&x.next) // expected-remark @:5 {{begin exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-3:9 {{of 'x.next'}}
+ // expected-remark @-2:28 {{end exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-5:9 {{of 'x.next'}}
+ return x
+}
The reason why is that due to ARCCodeMotion, they could have moved enough that
the SourceLoc on them is incorrect. That is why you can see in the tests that I
had to update, I am moving the retain to the return statement from the body of
the function since the retain was now right before the return.
I also went in and cleaned up the logic here a little bit. What we do now is
that we have a notion of instructions that we /always/ infer SourceLocs for (rr)
and ones that if we have a valid non-inlined location we use (e.x.:
allocations).
This mucked a little bit with my ability to run SIL tests since the SIL tests
were relying on this not happening to rr so that we would emit remarks on the rr
instructions themselves. I added an option that disables the always infer
behavior for this test.
That being said at this point to me it seems like the SourceLoc inference stuff
is really tied to OptRemarkGenerator and I am going to see if I can move it to
there. But that is for a future commit on another day.
A key concept in late ARC optimization is "RC Identity". In short, a result of
an instruction is rc-identical to an operand of the instruction if one can
safely move a retain (release) from before the instruction on the result to one
after on the operand without changing the program semantics. This creates a
simple model where one can work on equivalence classes of rc-identical values
(using a dominating definition generally as the representative) and thus
optimize/pair retain, release.
When preparing for late ARC optimization, the optimizer will normalize aggregate
ARC operations (retain_value, release_value) into singular strong_retain,
strong_release operations on leaf types of the aggregate that are
non-trivial. As an example, a retain_value on a KlassPair would be canonicalized
into two strong_retain, one for the lhs and one for the rhs. When this is done,
the optimizer generally just creates new struct_extract at the point where the
retain is. In such a case, we may have that the debug_value for the underlying
type is actually on a reformed aggregate whose underlying parts we are
retaining:
```
bb0(%0 : $Builtin.NativeObject):
strong_retain %0
%1 = struct $Array(%0 : $Builtin.NativeObject, ...)
debug_value %1 : $Array, ...
```
By looking through RC identical uses, we can handle a large subset of these
cases without much effort: ones were there is a single owning pointer like Array.
To handle more complex cases we would have to calculate an inverse access path needed to get
back to our value and somehow deal with all of the complexity therein (I am sure
we can do it I just haven't thought through all of the details).
The only interesting behavior that this results in is that when we emit
diagnostics, we just use the rc-identical transitive use debug_value's name
without a projection path. This is because the source location associated with
that debug_value is with a separate value that is rc-identical to the actual
value that we visited during our opt-remark traversal up the def-use
graph. Consider the following example below, noting the comments that show in
the SIL itself what I attempted to explain above.
```
struct KlassPair {
var lhs: Klass
var rhs: Klass
}
struct StateWithOwningPointer {
var state: TrivialState
var owningPtr: Klass
}
sil @theFunction : $@convention(thin) () -> () {
bb0:
%0 = apply %getKlassPair() : $@convention(thin) () -> @owned KlassPair
// This debug_value's name can be combined...
debug_value %0 : $KlassPair, name "myPair"
// ... with the access path from the struct_extract here...
%1 = struct_extract %0 : $KlassPair, #KlassPair.lhs
// ... to emit a nice diagnostic that 'myPair.lhs' is being retained.
strong_retain %1 : $Klass
// In contrast in the case below, we rely on looking through rc-identity uses
// to find the debug_value. In this case, the source info associated with the
// debug_value (%2) is no longer associated with the underlying access path we
// have been tracking upwards (%1 is in our access path list). Instead, we
// know that the debug_value is rc-identical to whatever value we were
// originally tracking up (%1) and thus the correct identifier to use is the
// direct name of the identifier alone (without access path) since that source
// identifier must be some value in the source that by itself is rc-identical
// to whatever is being manipulated. Thus if we were to emit the access path
// here for na rc-identical use we would get "myAdditionalState.owningPtr"
// which is misleading since ArrayWrapperWithMoreState does not have a field
// named 'owningPtr', its subfield array does. That being said since
// rc-identity means a retain_value on the value with the debug_value upon it
// is equivalent to the access path value we found by walking up the def-use
// graph from our strong_retain's operand.
%0a = apply %getStateWithOwningPointer() : $@convention(thin) () -> @owned StateWithOwningPointer
%1 = struct_extract %0a : $StateWithOwningPointer, #StateWithOwningPointer.owningPtr
strong_retain %1 : $Klass
%2 = struct $Array(%0 : $Builtin.NativeObject, ...)
%3 = struct $ArrayWrapperWithMoreState(%2 : $Array, %moreState : MoreState)
debug_value %2 : $ArrayWrapperWithMoreState, name "myAdditionalState"
}
```
Instead, in such a case, we use the name on the debug_value instead and just use
as the loc the debug_value itself. This will let me write SIL test cases for
opt-remark-gen.
I am going to add SIL test cases for future changes (as well as swift ones)
using this technique. I am going to in forthcoming commits fill in some tests
for the current opt-remark-generation here. I just want to get in this part.
Specifically, now we properly identify the SILLocation for the final inlined
call site of a Source Location and put the remark there instead of in the
callee.
This eliminates opt-remarks on initializers/friends, e.x.:
```
struct KlassPair {
var lhs: Klass // expected-remark {{retain of type 'Klass'}}
// expected-note @-1 {{of 'self.lhs'}}
// expected-remark @-2 {{release of type 'Klass'}}
// expected-note @-3 {{of 'self.lhs'}}
var rhs: Klass // expected-remark {{retain of type 'Klass'}}
// expected-note @-1 {{of 'self.rhs'}}
// expected-remark @-2 {{release of type 'Klass'}}
// expected-note @-3 {{of 'self.rhs'}}
}
```
becomes:
```
struct KlassPair {
var lhs: Klass
var rhs: Klass
}
```
I also added an -Xllvm option (-optremarkgen-visit-implicit-autogen-funcs) that
will force these to be emitted as a compiler knob for compiler developers. There
is a test that validates the behavior.
This ensures that we are able to properly look through struct_extract,
tuple_extract in cases where we have an aggregate with multiple non-trivial
subtypes (implying it is not-rc identical with those sub-types).
The end result is that we now emit good diagnostics for things like this:
```
func returnStructWithOwnerOwner(x: StructWithOwner) -> Klass {
return x.owner // expected-remark {{retain}}
// expected-note @-7:33 {{of 'x.owner'}}
}
```
More specifically, if one wants to force emit /all/ opt-remarks on a function, mark it with:
```
@_semantics("optremark")
```
If one wants to emit opt-remarks only for a specific SIL pass (like lets say
sil-opt-remark-gen), one can write:
```
@_semantics("optremark.sil-opt-remark-gen")
```
I made the pattern matching strict so if you just put in a '.' or add additional
suffixes, it will not pattern match. I think that this is sufficient for a
prototyping tool.
This is useful if one wants to play around with opt-remarks when optimizing code
in Xcode or any IDE that can use serialized diagnostics.
So now if one looks at remarks in an IDE, you will see in the NOTE itself the
decl to look for. I am going to expand this to include handling of projection
paths.
I noticed that I kept on needing to hack on this as I added stuff to
OptRemarkGenerator and felt the need to tie it even closer to
OptRemarkGenerator. As I began to think about it, this is really something
specific to that algorithm, so it really doesn't make sense to have it as part
of the general SIL library.
This is a NFC refactoring.
In all of these cases, we already had a SILFunction and were just grabbing its
SILModule instead of passing it in. So this is just an NFC change.
The reason why I am doing this is so that I can force emit opt-remarks on
functions with the semantics attribute "optremark", so I need to be able to
access the SILFunction in the optimization remark infrastructure.
This is just an initial implementation and doesn't handle all cases. This works
by grabbing the rc-identity root of a retain/release operation and then seeing:
1. If it is an argument, we use the ValueDecl of the argument.
2. If it is an instruction result, we try to infer the ValueDecl by looking for debug_value uses.
3. In the future, we should add support for globals and looking at addresses.
I may do 3 since it is important to have support for that due to inout objects.
In order to test this, I implemented a small source loc inference routine for
instructions without valid SILLocations. This is an optional nob that the
opt-remark writer can optionally enable on a per remark basis. The current
behaviors are just forward/backward scans in the same basic block. If we scan
forwards, if we find a valid SourceLoc, we just use ethat. If we are scanning
backwards, instead we grab the SourceRange and if it is valid use the end source
range of the given instruction. This seems to give a good result for retain
(forward scan) and release (backward scan).
The specific reason that I did that is that my test case for this are
retain/release operations. Often times these operations due to code-motion are
moved around (and rightly to prevent user confusion) given by optimizations auto
generated SIL locations. Since that is the test case I am using, to test this I
needed said engine.
We do allow for limited def-use traversal to do things like find debug_value.
But nothing like a full data flow pass. As a proof of concept I just emit
opt-remarks when we see retains, releases. Eventually I would like to also print
out which lvalue the retain is from but that is more than I want to do with this
proof of concept.