This requires two major changes.
The first is that we need to teach SILGen that the isolation of an initializer
is essentially dynamic (as far as SILGen is concerned) --- that it needs to emit
code in order to get the isolation reference. To make this work, I needed to
refactor how we store the expected executor of a function so that it's not
always a constant value; instead, we'll need to emit code that DI will lower
properly. Fortunately, I can largely build on top of the work that Doug previously
did to support #isolation in these functions. The SIL we emit here around delegating
initializer calls is not ideal --- the breadcrumb hop ends up jumping to the
generic executor, and then DI actually emits the hop to the actor. This is a little
silly, but it's hard to eliminate without special-casing the self-rebinding, which
honestly we should consider rather than the weirdly global handling of that in
SILGen today. The optimizer should eliminate this hop pretty reliably, at least.
The second is that we need to teach DI to handle the pattern of code we get in
delegating initializers, where the builtin actually has to be passed the self var
rather than a class reference. This is because we don't *have* a class reference
that's consistently correct in these cases. This ended up being a fairly
straightforward generalization.
I also taught the hop_to_executor optimizer to skip over the initialization of
the default-actor header; there are a lot of simple cases where we still do emit
the prologue generic-executor hop, but at least the most trivial case is handled.
To do this better, we'd need to teach this bit of the optimizer that the properties
of self can be stored to in an initializer prior to the object having escaped, and
we don't have that information easily at hand, I think.
Fixes rdar://87485045.
This corresponds to the parameter-passing convention of the Itanium C++
ABI, in which the argument is passed indirectly and possibly modified,
but not destroyed, by the callee.
@in_cxx is handled the same way as @in in callers and @in_guaranteed in
callees. OwnershipModelEliminator emits the call to destroy_addr that is
needed to destroy the argument in the caller.
rdar://122707697
Distributed actors can be treated as actors by accessing the `asLocalActor`
property. When lowering `#isolation` in a distributed actor initializer,
use a separate builtin `flowSensitiveDistributedSelfIsolation` to
capture the conformance to `DistributedActor`, and have Definite
Initialization introduce the call to the `asLocalActor` getter when
needed.
Actor initializers have a flow-sensitive property where they are isolated
to the actor being initialized only after the actor instance itself is
fully-initialized. However, this behavior was not being reflected in
the expansion of `#isolation`, which was always expanding to `self`,
even before `self` is fully formed.
This led to a source compatibility issue with code that used the async
for..in loop within an actor initializer *prior* to the point where the
actor was fully initialized, because the type checker is introducing
the `#isolation` (SE-0421) but Definite Initialization properly rejects
the use of `self` before it is initialized.
Address this issue by delaying the expansion of `#isolation` until
after the actor is fully initialized. In SILGen, we introduce a new
builtin for this case (and *just* this case) called
`flowSensitiveSelfIsolation`, which takes in `self` as its argument
and produces an `(any Actor)?`. Definite initialization does not treat
this as a use of `self`. Rather, it tracks these builtins and
replaces them either with `self` (if it is fully-initialized at this
point) or `nil` (if it is not fully-initialized at this point),
mirroring the flow-sensitive isolation semantics described in SE-0327.
Fixes rdar://127080037.
Although I don't plan to bring over new assertions wholesale
into the current qualification branch, it's entirely possible
that various minor changes in main will use the new assertions;
having this basic support in the release branch will simplify that.
(This is why I'm adding the includes as a separate pass from
rewriting the individual assertions)
LLVM is presumably moving towards `std::string_view` -
`StringRef::startswith` is deprecated on tip. `SmallString::startswith`
was just renamed there (maybe with some small deprecation inbetween, but
if so, we've missed it).
The `SmallString::startswith` references were moved to
`.str().starts_with()`, rather than adding the `starts_with` on
`stable/20230725` as we only had a few of them. Open to switching that
over if anyone feels strongly though.
I also included changes to the rest of the SIL optimizer pipeline to ensure that
the part of the optimizer pipeline before we lower tuple_addr_constructor (which
is right after we run TransferNonSendable) work as before.
The reason why I am doing this is that this ensures that diagnostic passes can
tell the difference in between:
```
x = (a, b, c)
```
and
```
x.0 = a
x.1 = b
x.2 = c
```
This is important for things like TransferNonSendable where assigning over the
entire tuple element is treated differently from if one were to initialize it in
pieces using projections.
rdar://117880194
I was originally hoping to reuse mark_must_check for multiple types of checkers.
In practice, this is not what happened... so giving it a name specifically to do
with non copyable types makes more sense and makes the code clearer.
Just a pure rename.
Adjust DI to treat init accessor properties that have only 'accesses'
or no restrictions as if they are stored properties, this means that
if such property doesn't have a default initializer users would have
to reference it explicitly in their custom initializers.
We also need to suppress default init synthesis for such cases which
would be done in a followup commit.
Resolves: rdar://113401979
DI marks all of of the previously initialized properties and Raw SIL
lowering emits `destroy_addr` before calling init accessor for such
properties to destroy previously set value.
- Adds a missing check to `collectClassSelfUses` to find assign_or_init instructions;
- RawSIL lowering should start emitting access around synthesized member references.
- Record all properties listed in `accesses` as loads;
- Record all properties listed in `initialized` as init-or-assign;
- Detect situations when double-init could happen i.e. if one of
the properties listed in `initializes` attribute is explicitly
initialized before init accessor call.
- SILPackType carries whether the elements are stored directly
in the pack, which we're not currently using in the lowering,
but it's probably something we'll want in the final ABI.
Having this also makes it clear that we're doing the right
thing with substitution and element lowering. I also toyed
with making this a scalar type, which made it necessary in
various places, although eventually I pulled back to the
design where we always use packs as addresses.
- Pack boundaries are a core ABI concept, so the lowering has
to wrap parameter pack expansions up as packs. There are huge
unimplemented holes here where the abstraction pattern will
need to tell us how many elements to gather into the pack,
but a naive approach is good enough to get things off the
ground.
- Pack conventions are related to the existing parameter and
result conventions, but they're different on enough grounds
that they deserve to be separated.
Adding a type wrapper attribute on a protocol does two things:
- Synthesizes `associatedtype $Storage` declaration with `internal` access
- Synthesizes `var $storage: <#Wrapper#><Self, Self.$Storage>` value requirement
`_storage` is used primary to support member-by-member initialization
of `$Storage`, so we need to refer to `$Storage` to determine whether
a partial field of `_storage` is `let` or not.
`_storage` is special because it emits assignments to compiler
synthesized stored property `$storage`, without that `self`
cannot be fully initialized.
This is staging for type wrappers that need to inject
`self.$storage = ...` call to user-defined initializers,
to be able to do that logic needs to be able to find
`self` that is being constructed.
Andy some time ago already created the new API but didn't go through and update
the old occurences. I did that in this PR and then deprecated the old API. The
tree is clean, so I could just remove it, but I decided to be nicer to
downstream people by deprecating it first.
This is just the very beginning... I still need to implement more parts of
SILGen for this. But all great things start small. I am going to iterate on top
of this and just wanted to get some initial parts of the work in as I go.
The resignID call within the initializer moved into DI, because an assignment to
the actorSystem, and thus initialization of the id, is no longer guaranteed to happen.
Thus, while we previously could model the resignation as a clean-up emitted on all
unwind paths in an initializer, now it's conditional, based on whether the id was
initialized. This is exactly what DI is designed to do, so we inject the resignation
call just before we call the identity's deinit.
This attribute is designed for let-bound variables whose initializing
assignment is synthesized by the compiler. This assignment is
expected to happen at some point before DefiniteInitialization has
run, which is the pass that verifies whether the compiler truly
initialized the variable.
I generally expect that this will never be a user-facing feature, and
that the synthesized assignment happens in SILGen.
Flow-isolation is a diagnostic SIL pass that finds
unsafe accesses to properties in initializers and
deinitializers that cannot gain isolation to otherwise
protect those accesses from concurrent modifications.
See SE-327 for more details about how and why it exists.
This commit includes changes and features like:
- The removal of the escaping-use restriction
- Flow-isolation that works properly with `defer` statements
- Flow-isolation with an emphasis on helpful diagnostics.
It also includes known issues like:
- Local / nonescaping functions are not analyzed by
flow-isolation, despite it being technically possible.
The main challenge in supporting it efficiently is that
such functions do not have a single exit-point, like
a `defer`. In particular, arbitrary functions can throw
so there are points where nonisolation should _not_ flow
out of the function at a call-site in the initializer, etc.
- The implementation of the flow-isolation pass is not
particularly memory efficient; it relies on BitDataflow
even though the particular flow problem is simple.
So, a more efficient implementation would be specialized for
this particular problem, etc.
There are also some changes to the Swift language itself: defer
will respect its context when deciding its property access kind.
Previously, a defer in an initializer would always access a stored
property through its accessor methods, instead of doing so directly
like its enclosing function might. This inconsistency is unfortunate,
so for Swift 6+ we make this consistent. For Swift 5, only a defer
in a function that is a member of the following kinds of types
will gain this consistency:
- an actor type
- any nominal type that is actor-isolated, excluding UnsafeGlobalActor.
These types are still rather new, so there is much less of a chance of
breaking expected behaviors around defer. In particular, the danger is
that users are relying on the behavior of defer triggering a property
observer within an init or deinit, when it would not be triggering it
without the defer.
This instruction is similar to a copy_addr except that it marks a move of an
address that has to be checked. In order to keep the memory lifetime verifier
happy, the semantics before the checker runs are the mark_unresolved_move_addr is
equivalent to copy_addr [init] (not copy_addr [take][init]).
The use of this instruction is that Mandatory Inlining converts builtin "move"
to a mark_unresolved_move_addr when inlining the function "_move" (the only
place said builtin is invoked).
This is then run through a special checker (that is later in this PR) that
either proves that the mark_unresolved_move_addr can actually be a move in which
case it converts it to copy_addr [take][init] or if it can not be a move, emit
an error and convert the instruction to a copy_addr [init]. After this is done
for all instructions, we loop back through again and emit an error on any
mark_unresolved_move_addr that were not processed earlier allowing for us to
know that we have completeness.
NOTE: The move kills checker for addresses is going to run after Mandatory
Inlining, but before predictable memory opts and friends.