So far, static arrays had to be put into a writable section, because the isa pointer and the (immortal) ref count field were initialized dynamically at the first use of such an array.
But with a new runtime library, which exports the symbols for the (immortal) ref count field and the isa pointer, it's possible to put the whole array into a read-only section. I.e. make it a constant global.
rdar://94185998
I wrote out this whole analysis of why different existential types
might have the same logical content, and then I turned around and
immediately uniqued existential shapes purely by logical content
rather than the (generalized) formal type. Oh well. At least it's
not too late to make ABI changes like this.
We now store a reference to a mangling of the generalized formal
type directly in the shape. This type alone is sufficient to unique
the shape:
- By the nature of the generalization algorithm, every type parameter
in the generalization signature should be mentioned in the
generalized formal type in a deterministic order.
- By the nature of the generalization algorithm, every other
requirement in the generalization signature should be implied
by the positions in which generalization type parameters appear
(e.g. because the formal type is C<T> & P, where C constrains
its type parameter for well-formedness).
- The requirement signature and type expression are extracted from
the existential type.
As a result, we no longer rely on computing a unique hash at
compile time.
Storing this separately from the requirement signature potentially
allows runtimes with general shape support to work with future
extensions to existential types even if they cannot demangle the
generalized formal type.
Storing the generalized formal type also allows us to easily and
reliably extract the formal type of the existential. Otherwise,
it's quite a heroic endeavor to match requirements back up with
primary associated types. Doing so would also only allows us to
extract *some* matching formal type, not necessarily the *right*
formal type. So there's some good synergy here.
Avoid a reabstraction thunk every time an async let entry point is emitted,
by setting the context abstraction level while we emit the implicit closure.
Adjust some surrounding logic that breaks when we do this:
- When lowering a non-throwing function type against a throwing abstraction
pattern, include the error type in the lowered substituted type. Async
throwing and nonthrowing functions would require another thunk to convert
away the throwingness of the async context, which would defeat the purpose.
- Adjust the code in IRGen that pads the initial context size for `async let`
entry points so that it works when the entry point has not yet emitted, by
marking the async function pointer to be padded later if it isn't defined
yet.
Enable testing makes `internal` types visible from outside the module.
We can no longer treat super classes as resilient.
Follow-up to #41044.
rdar://90489618
We fixed a bug in the concurrency runtime (rdar://problem/90357994) that would
lead to memory corruption when a new `async let` child task tried to use the
last 16 bytes of the preallocated slab from its parent to seed its own task
allocator. To work around this bug in older OSes that shipped with the bug,
artificially pad the async coroutine context size for any async functions
used as an `async let` entry point, which will prevent the runtime from
going down the path of trying to use the preallocated storage at all.
rdar://90506708
This adds a new reflection record type carrying spare bit information for multi-payload enums.
The compiler includes this for any type that might need it in order to accurately reflect the contents of the enum. The RemoteMirror library will use this if present to determine how to project the contents of the enum. If not present (for example, in older binaries), the RemoteMirror library falls back on an internal calculation of the spare bitmask.
A few notes:
* The internal calculation is not perfect. In particular, it does not support MPEs that contain other enums (e.g., optionals). It should accurately refuse to project any MPE that it does not correctly support.
* The new reflection field is designed to be expandable; this might someday avoid the need for a new section.
Resolves rdar://61158214
The special convention currently means three things:
1. There is no async FP symbol for the function. Calls should go directly
to the function symbol, and an async context of fixed static size should be
allocated. This is mandatory for calling runtime-provided async functions.
2. The callee context should be allocated but not initialized. The main
context pointer passed should be the caller's context, and the continuation
function pointer and callee context should be passed as separate arguments.
The function will resume the continuation function pointer with the caller's
context. This is a micro-optimization appropriate for functions that are
expected to frequently return immediately; other functions shouldn't bother.
3. Generic arguments should be suppressed. This is a microoptimization for
certain specific runtime functions where we happen to know that the runtime
already stores the appropriate information internally. Other functions
probably don't want this.
Obviously, these different treatments should be split into different
predicates so that functions can opt in to different subsets of them.
I've also set the code up so that runtime functions can more easily
request a specific static async context size. Previously, it was a
confusingly embedded assumption that the static context size was always
exactly two pointers more than the header.
* [Distributed] Implement func metadata and executeDistributedTarget
dont expose new entrypoints
able to get all the way to calling _execute
* [Distributed] reimplement distributed get type info impls
* [Distributed] comment out distributed_actor_remoteCall for now
* [Distributed] disable test on linux for now
Direct pointer to the accessor cannot be called at runtime,
so here is how everything is stored:
- For `distributed` and `async` functions -> async function pointer;
- For regular functions -> function pointer.
Uses a dedicated section in the binary to emit records about
functions that can be looked up by name at the runtime, and
then called through a fully-abstracted entry point whose
arguments can be constructed in code.
A "accessible" function that can be looked up based on a string key,
and then called through a fully-abstracted entry point whose arguments
can be constructed in code.
With PE/COFF, one cannot reference a data symbol directly across the
binary module boundary. Instead, the reference must be indirected
through the Import Address Table (IAT) to allow for position
independence.
When generating a reference to a AsyncFunctionPointer ({i8*, i32}), we
tag the pointer as being indirected by tagging bit 1 (with the
assumption that native alignment will ensure 4/8 byte alignment, freeing
the bottom 2 bits at least for bit-packing). We tweak the
v-table/witness table emission such that all references to the
AsyncFunctionPointer are replaced with the linker synthetic import
symbol with the bit packing:
~~~
.quad __imp_$s1L1CC1yyYaKFTu+1
~~~
rather than
~~~
.quad $s1L1CC1yyYaKFTu
~~~
Upon access of the async function pointer reference, we open-code the
check for the following:
~~~
pointer = (pointer & 1) ? *(void **)(pointer & ~1) : pointer;
~~~
Thanks to @DougGregor for the discussion and the suggestion for the
pointer tagging. Thanks to @aschwaighofer for pointers to the code that
I had missed. Also, thanks to @SeanROlszewski for the original code
sample that led to the reduced test case.
Fixes: SR-15399
Define the possible runtime effects of an instruction in an enum `RuntimeEffect`.
Add a new utility `swift:getRuntimeEffect` to estimate the runtime effects of an instruction.
Also, add a mechanism to validate the correctness of the analysis in IRGen: annotate all runtime functions in RuntimeFunctions.def with the actual effect what the runtime function has or can have. Then check if the effects of emitted runtime functions for an instruction match what `getRuntimeEffect` predicts.
This check is only enabled on demand by defining the CHECK_RUNTIME_EFFECT_ANALYSIS macro in RuntimeEffect.h
@objc actors implicitly inherit from the new, hidden
`SwiftNativeNSObject` class that inherits from `NSObject` yet provides
Swift-native reference counting, which is important for the actor
runtime's handling of zombies. However, `SwiftNativeNSObject` is only
available in the Swift runtime in newer OS versions (e.g., macOS
12.0/iOS 15.0), and is available in the back-deployed _Concurrency
library, but there is no stable place to link against for
back-deployed code. Tricky, tricky.
When back-deploying @objc actors, record `NSObject` as the superclass
in the metadata in the binary, because we cannot reference
`SwiftNativeNSObject`. Then, emit a static initializer to
dynamically look up `SwiftNativeNSObject` by name (which will find it
in either the back-deployment library, on older systems, or in the
runtime for newer systems), then swizzle that in as the superclass of
the @objc actor.
Fixes rdar://83919973.
A new LLVM IR affordance that allows expressing conditions under which globals
can be removed/dropped (even when marked with @llvm.used) is being discussed at:
- <https://reviews.llvm.org/D104496>
- <https://lists.llvm.org/pipermail/llvm-dev/2021-September/152656.html>
This is a preliminary implementation that marks runtime lookup records (namely
protocol records, type descriptors records and protocol conformance records)
with the !llvm.used.conditional descriptors. That allows link-time / LTO-time
removal of these records (by GlobalDCE) based on whether they're actually used
within the linkage unit. Effectively, this allows libraries that have a limited
and known set of clients, to be optimized against the client at LTO time, and
significantly reduce the code size of that library.
Parts of the implementation:
- New -conditional-runtime-records frontend flag to enable using !llvm.used.conditional
- IRGen code that emits these records can now emit these either as a single contiguous
array (asContiguousArray = true, the old way), which is used for JIT mode, or
as indivial globals (asContiguousArray = false), which is necessary for the
!llvm.used.conditional stripping to work.
- When records are emitted as individual globals, they have new names of
"\01l_protocol_" + mangled name of the protocol descriptor, and similarly for
other records.
- Fixed existing tests to account for individual records instead of a single array
- Added an IR level test, and an end-to-end execution test to demonstrate that
the !llvm.used.conditional-based stripping actually works.
When we deploy to a minimum target that is not known to support extended
frame information the function prolog of Swift async functions will
contain a reference to swift_async_extendedFramePointerFlags. This
reference needs to be weak like any async symbols.
rdar://83412550
- Virtual calls are done via a @llvm.type.checked.load instrinsic call with a type identifier
- Type identifier of a vfunc is the base method's mangling
- Type descriptors and class metadata get !type markers that list offsets and type identifiers of all vfuncs
- The -enable-llvm-vfe frontend flag enables VFE
- Two added tests verify the behavior on IR and by executing a program
Tracking this as a single bit is actually largely uninteresting
to the runtime. To handle priority escalation properly, we really
need to track this at a finer grain of detail: recording that the
task is running on a specific thread, enqueued on a specific actor,
or so on. But starting by tracking a single bit is important for
two reasons:
- First, it's more realistic about the performance overheads of
tasks: we're going to be doing this tracking eventually, and
the cost of that tracking will be dominated by the atomic
access, so doing that access now sets the baseline about right.
- Second, it ensures that we've actually got runtime involvement
in all the right places to do this tracking.
A propos of the latter: there was no runtime involvement with
awaiting a continuation, which is a point at which the task
potentially transitions from running to suspended. We must do
the tracking as part of this transition, rather than recognizing
in the run-loops that a task is still active and treating it as
having suspended, because the latter point potentially races with
the resumption of the task. To do this, I've had to introduce
a runtime function, swift_continuation_await, to do this awaiting
rather than inlining the atomic operation on the continuation.
As part of doing this work, I've also fixed a bug where we failed
to load-acquire in swift_task_escalate before walking the task
status records to invoke escalation actions.
I've also fixed several places where the handling of task statuses
may have accidentally allowed the task to revert to uncancelled.
Same issue as with TBDGen. Clang doesn't hold onto the Datalayout object
anymore, but instead keeps the description string that is constructed on
the fly. This patch updates IRGen to behave the same way.
Rather than using group task options constructed from the Swift parts
of the _Concurrency library and passed through `createAsyncTask`'s
options, introduce a separate builtin that always takes a group. Move
the responsibility for creating the options structure into IRGen, so
we don't need to expose the TaskGroupTaskOptionRecord type in Swift.
introduce new options parameter to all task spawning
[Concurrency] ABI for asynclet start to accept options
[Concurrency] fix unittest usages of changed task creation ABI
[Concurrency] introduce constants for parameter indexes in ownership
[Concurrency] fix test/SILOptimizer/closure_lifetime_fixup_concurrency.swift