NOTE: I am just adding coverage that we support these instructions. One can only
use this with Error today and Error is always Sendable. So this is just going
for completeness.
Semantically a mark_dependence returns a value that is equal to its first
parameter with the extra semantics that any destroys of the 2nd operand cannot
occur before any uses of the result of the instruction. From a region
perspective this suggests that the instruction should be an assign from the
first operand onto the result and act as a require on the result. Semantically
the requirement that the 2nd operand cannot be destroyed before any uses of the
result does not expose any memory or state from the first operand implying that
we don't need to merge it into the result region. The restriction is purely to
tell the optimizer what it can/cannot do rather.
We already only supported Ownership SSA since we run early in the pipeline
before OSSA is lowered. This just formalizes this behavior. I am marking these
instructions as Asserting (even though we will never see them) so I can
semantically be sure that all of the instructions are covered without using an
"unsupported" like moniker that I fear will lead to new instructions being added
as unsupported. Better to have a semantic thing for new instruction adders to
use.
This involved fixing a few different bugs.
1. We were just performing dataflow by setting that only the initial block needs
to be updated. This means that if there isn’t anything in the initial dataflow
block, we won’t visit any successor blocks. Instead the correct thing to do here
is to visit all blocks in the initial round.
2. I also needed to fix a separate issue where we were updating our union-find
data structure incorrectly as found by an assert on transfernonsendable.swift
that was triggered once I fixed 1. Put simply, we needed to set a new max label
+ 1 when our new max element is less than or equal to the old max label + 1…
before we just did less than so if we had a new max element that is the same as
our fresh label, we wouldn’t increment the fresh label.
rdar://119584497
More specifically this patch does the following:
* Rather than having a large switch with misc code, I changed the partition op
translator to use a visitor that defines a declaration for all SILInstructions
and in translateSILInstruction visits all such instructions. This ensures via
the linker that when ever a new SILInstruction is added, a link error occurs.
* Rather than just have misc translation code from the switch in the visitor, I
created a new enum called TranslationSemantics that describes the semantics for
instructions and made it so that the visitor methods return an instance of the
enum. This enum is then switched over to determine the action to perform. This
handles the vast majority of cases and allows for a reader of the translation
code to read a small amount of code (< 20-30 lines) to understand at a glance
the available semantics rather than having to read a huge switch. To make it
easy to implement the constant semantics that most instructions have, I followed
what we did in OperandOwnership (the model that I followed here) by using
preprocessor macros to define explicitly the semantics for each instruction. In
the case where special handling is needed, we can create a custom method, handle
the translation by hand, and then return TranslationSemantics::Special to signal
to the handling loop to just not do anything.
Some notes:
This is not emitted by SILGen. This is just intended to be used so I can write
SIL test cases for transfer non sendable. I did this by adding an
ActorIsolationCrossing field to all FullApplySites rather than adding it into
the type system on a callee. The reason that this makes sense from a modeling
perspective is that an actor isolation crossing is a caller concept since it is
describing a difference in between the caller's and callee's isolation. As a
bonus it makes this a less viral change.
For simplicity, I made it so that the isolation is represented as an optional
modifier on the instructions:
apply [callee_isolation=XXXX] [caller_isolation=XXXX]
where XXXX is a printed representation of the actor isolation.
When neither callee or caller isolation is specified then the
ApplyIsolationCrossing is std::nullopt. If only one is specified, we make the
other one ActorIsolation::Unspecified.
This required me to move ActorIsolationCrossing from AST/Expr.h ->
AST/ActorIsolation.h to work around compilation issues... Arguably that is where
it should exist anyways so it made sense.
rdar://118521597
I left them as friends since that was in the original code. There isn't a reason
to do this and break the encapsulation of Partition. I just added reasonable
helpers that give PartitionOpEvaluator all of the functionality it needs.
I did this by abstracting the representative value of an equivalence class into
two cases: the normal case of actually having a value and a second case which is
used only to inject into a region an actor derived value.
rdar://119113563
Previously I avoided doing this since the only problem would be that in a case
where we had two transfer instructions that were in an if-else block, we would
just emit an error for one:
```swift
if boolValue {
transfer(x)
} else {
transfer(x) // Only emit error for this transfer!
}
useValue(x)
```
Now that we are tracking at the transfer point if any element in the transfer
was captured in a closure, this becomes an actual semantic issue since if we
track the transfer instruction that isn't reachable from the closure capture, we
will not become more pessimistic:
```swift
if boolValue {
closure = { useInOut(&x) }
transfer(x)
} else {
transfer(x)
}
// Since we grab from the else block, sendableField is allowed to be accessed
// since we do not track that x was captured by reference in a closure.
x.sendableField
useValue(x)
```
To be truly safe, we need to emit both errors.
rdar://119048779
If the var is captured in a closure before it is transferred, it is not safe to
access the Sendable field since we may race on accessing the field with an
assignment to the field in another concurrency domain.
rdar://115124361
Specifically:
1. If the value is transferred such that it becomes part of an actor region, the
value is permanently part of the actor region as one would normally have.
2. If the value is just used in an async let or is used by a nonisolated async
function within the async let then while the async let is alive it cannot be
used. But once the async let has been awaited upon, we allow for it to be used
again.
rdar://117506395
Specifically:
1. Classes. We allow for access to Sendable let fields.
2. Structs. We allow for access to Sendable let/var fields.
3. Tuples. We allow for access to Sendable let/var fields.
I am going to finish enums in a subsequent PR since I found that I need to mark
a bunch more instructions as look through to get that to work (e.x.:
load/load_borrow need to be viewed as a cast from address -> object so that we
can emit errors on the uses of the load instead of the load itself). These are
more invasive so I want to do it a little later.
rdar://115124361
[region-isolation] Since we now propagate the transferred instruction, use that to emit the error instead of attempting to infer the transfer instruction for a requires
This involved me removing the complex logic for emitting diagnostics we have
today in favor of a simple diagnostic that works by:
1. Instead of searching for transfer instructions, we use the transfer
instruction that we propagated by the dataflow... so there is no way for us to
possible not identify a transfer instruction... we always have it.
2. Instead of emitting diagnostics for all requires, just emit a warning for the
first require we see along a path. The reason that we need to do this is that in
certain cases we will have multiple "require" instructions from slightly
different source locations (e.x.: differing by column) but on the same line. I
saw this happen specifically with print where we treat stores into the array as
a require as well as the actual call to print itself where we pass the array.
An additional benefit of this is that this allowed me to get rid of the
cache of already seen require instructions. By doing this, we now emit errors
when the same apply needs to be required by different transfer instructions for
different arguments.
NOTE: I left in the original implementation code to make it easier to review
this new code. I deleted it in the next commit. Otherwise the git diff for this
patch is too difficult to read.
This is another NFC refactor in preparation for changing how we emit
errors. Specifically, we need access to not only the instruction, but also the
specific operand that the transfer occurs at. This ensures that we can look up
the specific type information later when we emit an error rather than tracking
this information throughout the entire pass.
This came up while I was debugging test cases from the other parts of this
work. The specific issue was around a pointer_to_address from a
RawPointer (which is considered non-Sendable) to a Sendable type. We were
identifying the RawPointer as being the representative of the Sendable value
implying we were processing Sendable values like they were
non-Sendable. =><=. I wish we had SIL test cases for region isolation since I
would add one for this...
Currently when one says that an instruction is not a "look through" instruction,
each of its results gets a separate element number and we track these results as
independent entities that can be in a region. The one issue with this is
whenever we perform this sort of operation we actually are at the same time
performing a require on the operand of the instruction. This causes us to emit
errors on non-side effect having instructions when we really want to emit an
error on their side-effect having results. As an example of the world before
this patch, the following example would force the struct_element_addr to have a
require so we would emit an error on it instead of the apply (the thing that we
actually care about):
```
%0 = ...
// We transferred %0, so we cannot use it again.
apply %transfer(%0)
// We track %1 and %0 as separate elements and we treat this as an assignment of
// %0 into %1 which forces %0 to be required live at this point causing us to
// emit an error here...
%1 = struct_element_addr %0
// Instead of in the SIL here on the actual side-effect having instruction.
apply %actualUse(%1)
```
the solution is to make instructions like struct_element_addr lookthrough
instructions which force their result to just be the same element as their
operand. As part of doing this, we have to ensure that getUnderlyingTrackedValue
knows how to look through these types. This ensures that they are not considered
roots.
----
As an aside to implement this I needed to compose some functionality ontop of
getUnderlyingObject (specifically the look through behavior on destructures) in
a new helper routine called getUnderlyingTrackedObjectValue(). It just in a loop
calls getUnderlyingObject() and looks through destructures until its iterator
doesn't change.
getExprForPartitionOp(...) just returned the expression from the loc of op.currentInst:
SILInstruction *sourceInstr = op.getSourceInst(/*assertNonNull=*/true);
Expr *expr = sourceInstr->getLoc().getAsASTNode<Expr>();
Instead of mucking around with exprs, just use the SILLocation from the
SILInstruction.
I also changed how we unique transfer instructions to just use the transfer
instruction itself instead of the AST/Expr of the transfer instruction.
Was experimenting with making PartitionOps a noncopyable type and I discovered
these places where we copy PartitionOps when we could use a const reference. It
is good not to copy PartitionOps since they potentially contain a heap allocated
array.
Sadly, my change to make PartitionOps noncopyable will have to wait until a
forthcoming commit here I overhaul how we emit errors since that older code
copies PartitionOps a lot and I would rather just delete that code and then fix
PartitionOps. But these are on the surface safe changes that makes sense to get
in separately to make that next patch easier to review.
What this does is really split the one dataflow we are performing into two
dataflows we perform at the same time. The first dataflow is the region dataflow
that we already have with transferring never occurring. The second dataflow is a
simple gen/kill dataflow where we gen on a transfer instruction and kill on
AssignFresh. What it tracks are regions where a specific element is transferred
and propagates the region until the element is given a new value. This of course
means that once the dataflow has converged, we have to emit an error not if the
value was transferred, but if any value in its region was transferred.
There isn't a strong reason to use a callback here since we aren't ever
composing transformations on partition ops. Better instead to go for simplicity
and just iterate directly. If we start doing complex transformations over
partition ops with composable APIs, we can always add this back in later. As a
nice benefit, one doesn't need to worry that the callback API is hiding actual
complexity... since just by using a for loop we communicate that nothing
interesting is happening here.
Just reducing the amount of code surface area in the pass.
This is only used in one place in partition analysis which is a data structure
that does computation. In contrast, we want BlockPartitionState to be more of a
POD type of data that each BasicBlock has mapped to it.
Simplifying the code.
This also let me get rid of the translator field in BasicBlockState. We only
need to pass it in as an argument to the constructor to initialize our
translation. It doesn't need to be stored anymore.
PR #69652 protected one call of `printID` but left another two in the
file. Create two small lambdas to print the ID with `printID` or just
print `NOASSERTS` depending on `NDEBUG` being defined. Change all the
callsites of `printID` to use that lambda.
I also included changes to the rest of the SIL optimizer pipeline to ensure that
the part of the optimizer pipeline before we lower tuple_addr_constructor (which
is right after we run TransferNonSendable) work as before.
The reason why I am doing this is that this ensures that diagnostic passes can
tell the difference in between:
```
x = (a, b, c)
```
and
```
x.0 = a
x.1 = b
x.2 = c
```
This is important for things like TransferNonSendable where assigning over the
entire tuple element is treated differently from if one were to initialize it in
pieces using projections.
rdar://117880194