Currently, the convention for partially applying an async function is to
form a thick function consisting of (1) a pointer to the partial apply
forwarder and (2) a thick context containing the async context size.
Previously, the partial application of async methods--specifically, for
which the partial application of the non-async analog produced a simple
partial apply (meaning that their context was a single ref-counted
object in the form of self)--would result in a simple partial
application. Here, that error is addressed by never allowing the
partial application of async methods to produce a simple partial apply.
Eventually, when the complex temporary ABI for partial application of
async functions is unrolled, this change will need to be unrolled as
well.
This commit depends on changes to the coroutine-splitting pass in LLVM. Shadow
copies are also turned off for async function arguments, because they make it
impossible to track debug info during coroutine splitting. Instead we are
relying on LLVM's CoroSplit.cpp to emit shadow copies. The Swift frontend gives
CoroSplit license to move do this by describing the arguments using a
dbg.declare intrinsic, even though it points to chain of load/GEP/bitcase
instructions into the Swift context function argument.
rdar://71866936
For this, store those 3 values on the stack at function entry and update them with the return values of coro_suspend_async intrinsic calls.
This fixes a correctness issue, because the executor may be different after a resume.
It also is more efficient, because this means that the 3 values don't have to preserved in the context over a suspension point.
Previously, the thick context was passed as a fourth parameter to
partial apply forwarders. Here, the thick context is instead moved into
the async context at the local context position. To support this, the
local context is made always available.
As part of concurrency bringup, some workarounds were put in place to
enable the async cc execution tests to continue to run while enabling
async function SIL verification (specifically, that an async function
must be called from an async function) to land before we have the
Task.runDetached mechanism. Specifically, these workaround allow @main
to be annotated @async but continue to be emitted as if it were not
@async.
Now that we have a better mechanism in the form of runAsync, use that
instead.
rdar://problem/70597390
Previously, when lowering the entry point of an async function, the
parameters were lowered to explosions that matched those of sync
functions, namely native explosions. That is incorrect for async
functions where the structured values are within the async context.
Here, that error is fixed, by adding a new customization point to
NativeCCEntryPointArgumentEmission which behaves as before for sync
functions but which simply extracts an argument from the async context
for async functions.
rdar://problem/71260972
An AsyncFunctionPointer, defined in Task.h, is a struct consisting of
two i32s: (1) the relative address of the async function and (2) the
size of the async context to be allocated when calling that function.
Here, such structs are emitted for every async SILFunction that is
emitted.
I have a need to have SwitchEnum{,Addr}Inst have different base classes
(TermInst, OwnershipForwardingTermInst). To do this I need to add a template to
SwitchEnumInstBase so I can switch that BaseTy. Sadly since we are using
SwitchEnumInstBase as an ADT type as well as an actual base type for
Instructions, this is impossible to do without introducing a template in a ton
of places.
Rather than doing that, I changed the code that was using SwitchEnumInstBase as
an ADT to instead use a proper ADT SwitchEnumBranch. I am happy to change the
name as possible see fit (maybe SwitchEnumTerm?).
Previously, an IRGenFunction was being passed to the functions that
construct an AsyncContextLayout. That was not actually necessary and
prevented construction of the layout in contexts where no IRGenFunction
was present. Here that requirement is eased to requiring an IRGenModule
which is indeed required to construct an AsyncContextLayout.
This instructions ensures that all instructions, which need to run on the specified executor actually run on that executor.
For details see the description in SIL.rst.
Because all async functions have the same signature, namely
void(%swift.task*, %swift.executor*, %swift.context*)
it is always possible to provide access to the three argument values
(the current task, current executor, and current context) within an
IRGenFunction which is async. Here, that is provided in the form of
IRGenFunction::getAsyncExecutor,IRGenFunction::getAsyncContext, and
IRGenFunction::getAsyncTask.
The previous stage of bringup only had async functions taking a single
argument: the async context. The next stage will involve the task and
executor. Here, arguments are added for those values. To begin with,
null is always passed for these values.
The majority of support comes in the form of emitting partial
application forwarders for partial applications of async functions.
Such a partial application forwarder must take an async context which
has been partially populated at the apply site. It is responsible for
populating it "the rest of the way". To do so, like sync partial
application forwarders, it takes a second argument, its context, from
which it pulls the additional arguments which were capture at
partial_apply time.
The size of the async context that is passed to the forwarder, however,
can't be known at the apply site by simply looking at the signature of
the function to be applied (not even by looking at the size associated
with the function in the special async function pointer constant which
will soon be emitted). The reason is that there are an unknown (at the
apply site) number of additional arguments which will be filled by the
partial apply forwarder (and in the case of repeated partial
applications, further filled in incrementally at each level). To enable
this, there will always be a heap object for thick async functions.
These heap objects will always store the size of the async context to be
allocated as their first element. (Note that it may be possible to
apply the same optimization that was applied for thick sync functions
where a single refcounted object could be used as the context; doing so,
however, must be made to interact properly with the async context size
stored in the heap object.)
To continue to allow promoting thin async functions to thick async
functions without incurring a thunk, at the apply site, a null-check
will be performed on the context pointer. If it is null, then the async
context size will be determined based on the signature. (When async
function pointers become pointers to a constant with a size i32 and a
relative address to the underlying function, the size will be read from
that constant.) When it is not-null, the size will be pulled from the
first field of the context (which will in that case be cast to
<{%swift.refcounted, i32}>).
To facilitate sharing code and preserving the original structure of
emitPartialApplicationForwarder (which weighed in at roughly 700 lines
prior to this change), a new small class hierarchy, descending from
PartialApplicationForwarderEmission has been added, with subclasses for
the sync and async case. The shuffling of arguments into and out of the
final explosion that was being performed in the synchronous case has
been preserved there, though the arguments are added and removed through
a number of methods on the superclass with more descriptive names. That
was necessary to enable the async class to handle these different
flavors of parameters correctly.
To get some initial test coverage, the preexisting
IRGen/partial_apply.sil and IRGen/partial_apply_forwarder.sil tests have
been duplicated into the async folder. Those tests cases within these
files which happened to have been crashing have each been extracted into
its own runnable test that both verifies that the compiler does not
crash and also that the partial application forwarder behaves correctly.
The FileChecks in these tests are extremely minimal, providing only
enough information to be sure that arguments are in fact squeezed into
an async context.
Add AccesssedStorage::compute and computeInScope to mirror AccessPath.
Allow recovering the begin_access for Nested storage.
Adds AccessedStorage.visitRoots().
Things that have come up recently but are somewhat blocked on this:
- Moving AccessMarkerElimination down in the pipeline
- SemanticARCOpts correctness and improvements
- AliasAnalysis improvements
- LICM performance regressions
- RLE/DSE improvements
Begin to formalize the model for valid memory access in SIL. Ignoring
ownership, every access is a def-use chain in three parts:
object root -> formal access base -> memory operation address
AccessPath abstracts over this path and standardizes the identity of a
memory access throughout the optimizer. This abstraction is the basis
for a new AccessPathVerification.
With that verification, we now have all the properties we need for the
type of analysis requires for exclusivity enforcement, but now
generalized for any memory analysis. This is suitable for an extremely
lightweight analysis with no side data structures. We currently have a
massive amount of ad-hoc memory analysis throughout SIL, which is
incredibly unmaintainable, bug-prone, and not performance-robust. We
can begin taking advantage of this verifably complete model to solve
that problem.
The properties this gives us are:
Access analysis must be complete over memory operations: every memory
operation needs a recognizable valid access. An access can be
unidentified only to the extent that it is rooted in some non-address
type and we can prove that it is at least *not* part of an access to a
nominal class or global property. Pointer provenance is also required
for future IRGen-level bitfield optimizations.
Access analysis must be complete over address users: for an identified
object root all memory accesses including subobjects must be
discoverable.
Access analysis must be symmetric: use-def and def-use analysis must
be consistent.
AccessPath is merely a wrapper around the existing accessed-storage
utilities and IndexTrieNode. Existing passes already very succesfully
use this approach, but in an ad-hoc way. With a general utility we
can:
- update passes to use this approach to identify memory access,
reducing the space and time complexity of those algorithms.
- implement an inexpensive on-the-fly, debug mode address lifetime analysis
- implement a lightweight debug mode alias analysis
- ultimately improve the power, efficiency, and maintainability of
full alias analysis
- make our type-based alias analysis sensistive to the access path
Previously, EmitPolymorphicParameters dealt directly with an Explosion
from which it pulled values. In one place, there was a conditional
check for async which handled some cases. There was however another
place where the polymorphic parameter was pulled directly from the
explosion. That missed case resulted in attempting to pull a
polymorphic parameter directly from an Explosion which contains only a
%swift.context* per the async calling convention.
Here, those parameters are now pulled from an EntryPointArgumentEmission
subclasses of which are able to provide the relevant definition of what
pulling a parameter means.
rdar://problem/70144083
Previously, the AsyncContextLayout did not make space for the trailing
witness fields (self metadata and self witness table) and the
AsyncNativeCCEntryPointArgumentEmission could consequently not vend
these fields. Here, the fields are added to the layout.
Previously the methods for getting the index into the layout were public
and were being used to directly access the underlying buffer. Here,
that abstraction leakage is fixed and field access is forced to go
through the appropriate methods.
Here, the following is implemented:
- Construction of SwiftContext struct with the fields needed for calling
functions.
- Allocating and deallocating these swift context via runtime calls
before calling async functions and after returning from them.
- Storing arguments (including bindings and the self parameter but not
including protocol fields for witness methods) and returns (both
direct and indirect).
- Calling async functions.
Additional things that still need to be done:
- protocol extension methods
- protocol witness methods
- storing yields
- partial applies
`get_async_continuation[_addr]` begins a suspend operation by accessing the continuation value that can resume
the task, which can then be used in a callback or event handler before executing `await_async_continuation` to
suspend the task.