Not all targets have a 16-byte type alignment guarantee. For the types
which are not naturally aligned, provide a type-specific `operator new`
overload to ensure that we properly align the type on allocation;
otherwise we risk under-aligned allocations.
This should no longer be needed with C++17 and newer, which do a
two-phase `operator new` lookup that prefers
`operator new(std::size_t, std::align_val_t)` when needed. The base type
would then be fully preprocessed away. The empty base class optimization
should help ensure that we do not pay any extra size cost for the
alignment fixes.
As we are a C++14 codebase, we must locally implement some of the
standard type_traits utilities, namely `void_t`. We take the minimal
definition here, assuming that the compiler is up to date with the C++14
defect report resolutions that fixed an issue with SFINAE. We use SFINAE
to detect the presence of the `operator new` overload and guide the
over-alignment, which is inherited through the new
`swift::overaligned_type<>` base type.
Annotate the known classes which request explicit alignment beyond
pointer alignment. This list was identified by
`git grep ' alignas(.*) '`.
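A minimal C++14 sketch of the detection described above, assuming
illustrative names and a posix_memalign-based allocation; the runtime's
actual trait and allocator differ:
```
#include <cstddef>
#include <stdlib.h>
#include <type_traits>

namespace swift {

// Minimal C++14 stand-in for std::void_t (C++17); relies on the defect
// resolution that lets the unused pack participate in SFINAE.
template <typename...>
using void_t = void;

// Base type providing an aligned operator new; the empty base class
// optimization keeps it from adding any storage to derived types.
template <std::size_t Alignment>
struct overaligned_type {
  void *operator new(std::size_t size) {
    void *memory = nullptr;
    // posix_memalign is purely illustrative; the runtime uses its own
    // allocation path.
    return ::posix_memalign(&memory, Alignment, size) == 0 ? memory : nullptr;
  }
  void operator delete(void *memory) { ::free(memory); }
};

// SFINAE detection of a member operator new, whether declared directly or
// inherited from overaligned_type. The trait name is hypothetical.
template <typename T, typename = void>
struct has_aligned_operator_new : std::false_type {};

template <typename T>
struct has_aligned_operator_new<
    T, void_t<decltype(T::operator new(std::size_t()))>> : std::true_type {};

// Example: a class annotated for 16-byte alignment.
struct alignas(16) Example : overaligned_type<16> {
  char storage[32];
};

static_assert(has_aligned_operator_new<Example>::value,
              "Example should pick up the over-aligned operator new");

} // namespace swift
```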
When SWIFT_COMPACT_ABSOLUTE_FUNCTION_POINTER is enabled, relative direct
pointers whose pointees are functions will be turned into absolute
pointers at compile time.
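A hedged sketch of the difference; the type names and field widths here
are illustrative, not the runtime's actual templates:
```
#include <cstdint>

// Relative direct pointer: a signed 32-bit offset from the field's own
// address to the pointee, resolved with pointer arithmetic at runtime.
struct RelativeDirectFunctionPointer {
  std::int32_t RelativeOffset;

  void *get() const {
    auto base = reinterpret_cast<std::intptr_t>(&RelativeOffset);
    return reinterpret_cast<void *>(base + RelativeOffset);
  }
};

// With SWIFT_COMPACT_ABSOLUTE_FUNCTION_POINTER enabled, the slot instead
// holds an absolute address emitted by the compiler, so no arithmetic is
// needed at runtime.
struct AbsoluteFunctionPointer {
  void *Pointer;

  void *get() const { return Pointer; }
};
```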
The immediate use case is only concretely-constrained existential
types, which could use a much simpler representation, but I've
future-proofed the representation as much as I can; thus, the
requirement signature can have arbitrary parameters and
requirements, and the type can have an arbitrary type as the
sub-expression. The latter is also necessary for existential
metatypes.
The chief implementation complexity here is that we must be able
to agree on the identity of an existential type that might be
produced by substitution. Thus, for example, `any P<T>` when
`T == Int` must resolve to the same type metadata as
`any P<Int>`. To handle this, we identify the "shape" of the
existential type, consisting of those parts which cannot possibly
be the result of substitution, and then abstract the substitutable
"holes" as an application of a generalization signature. That
algorithm will come in a later patch; this patch just represents
it.
Uniquing existential shapes from the requirements would be quite
complex because of all the symbolic mangled names they use.
This is particularly true because it's not reasonable to require
translation units to agree about what portions they mangle vs.
reference symbolically. Instead, we expect the compiler to do
a cryptographic hash of a mangling of the shape, then use that
as the unique key identifying the shape.
This is just the core representation and runtime interface; other
parts of the runtime, such as dynamic casting and demangling
support, will come later.
The `AllocationPool` type is reflected upon and inspected by debugging
tools. This requires that the atomically wrapped type store its value at
offset 0. However, `std::atomic` makes no such guarantee, and a differing
layout would render the debug utilities useless. Switch from
`std::atomic` to `swift::atomic`. To do so, we also need to introduce a
`compare_exchange_strong` helper which maps to
`compare_exchange_strong_explicit`, mirroring `compare_exchange_weak`.
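A minimal sketch of the layout property and the new helper, assuming
Clang/GCC atomic builtins in place of the runtime's actual
`swift::atomic` implementation; the class and member names are
illustrative:
```
// The single data member guarantees the wrapped value sits at offset 0 in a
// standard-layout type, which is what the debugging tools rely on;
// std::atomic makes no such layout promise. The _n builtins used below
// require T to be an integral or pointer type.
template <typename T>
class offset_zero_atomic {
  T Value;

public:
  constexpr offset_zero_atomic(T initial) : Value(initial) {}

  T load(int order = __ATOMIC_SEQ_CST) const {
    return __atomic_load_n(&Value, order);
  }

  void store(T desired, int order = __ATOMIC_SEQ_CST) {
    __atomic_store_n(&Value, desired, order);
  }

  bool compare_exchange_weak(T &expected, T desired,
                             int success = __ATOMIC_SEQ_CST,
                             int failure = __ATOMIC_SEQ_CST) {
    return __atomic_compare_exchange_n(&Value, &expected, desired,
                                       /*weak=*/true, success, failure);
  }

  // The helper introduced by this change mirrors compare_exchange_weak but
  // maps to the strong (non-spuriously-failing) form.
  bool compare_exchange_strong(T &expected, T desired,
                               int success = __ATOMIC_SEQ_CST,
                               int failure = __ATOMIC_SEQ_CST) {
    return __atomic_compare_exchange_n(&Value, &expected, desired,
                                       /*weak=*/false, success, failure);
  }
};
```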
* When ptrauth-copying vtables/wtables, allow NULL entries (due to VFE)
* Mark virtual-function-elimination-generics-exec.swift UNSUPPORTED: arm64e until the rebranch
* Fix test expectations
The current system is based on MetadataCompletionQueueEntry
objects which are allocated and then enqueued on dependencies.
Blocking is achieved using a condition variable associated
with the lock on the appropriate metadata cache. Condition
variables are inherently susceptible to priority inversions
because the waiting threads have no dynamic knowledge of
which thread will notify the condition. In the current system,
threads that unblock dependencies synchronously advance their
dependent metadata completions, which means the signaling
thread is unreliable even if we could represent it in condition
variables. As a result, the current system is wholly unsuited
for eliminating these priority inversions.
An AtomicWaitQueue is an object containing a lock. The queue
is eagerly allocated, and the lock is held, whenever a thread
is doing work that other threads might wish to block on. In
the metadata completion system, this means whenever we construct
a metadata cache entry and the metadata isn't already allocated
and transitively complete after said construction. Blocking
is done by safely acquiring a shared reference to the queue
object (which, in the current implementation, requires briefly
taking a lock that's global to the surrounding metadata cache)
and then acquiring the contained lock. For typical lock
implementations, this avoids priority inversions by temporarily
propagating the priority of waiting threads to the locking
threads.
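A simplified sketch of that pattern, with std::mutex and std::shared_ptr
standing in for the runtime's priority-donating lock and its own
reference counting; all names are illustrative:
```
#include <memory>
#include <mutex>

// The queue is just a holder for the lock the worker keeps held while it is
// doing work that others may want to block on.
struct WaitQueue {
  std::mutex WorkerLock;
};

struct CacheEntry {
  std::mutex CacheLock;              // lock global to the surrounding cache
  std::shared_ptr<WaitQueue> Queue;  // non-null while work is in flight
};

// Worker side: publish a queue and hold its lock for the duration of the work.
std::shared_ptr<WaitQueue> beginWork(CacheEntry &entry) {
  auto queue = std::make_shared<WaitQueue>();
  queue->WorkerLock.lock();
  std::lock_guard<std::mutex> guard(entry.CacheLock);
  entry.Queue = queue;
  return queue;
}

// Unblocking dependents is simply unpublishing the queue and releasing its
// lock (on the same thread that called beginWork).
void finishWork(CacheEntry &entry, const std::shared_ptr<WaitQueue> &queue) {
  {
    std::lock_guard<std::mutex> guard(entry.CacheLock);
    entry.Queue = nullptr;
  }
  queue->WorkerLock.unlock();
}

// Waiter side: briefly take the cache lock to grab a reference to the queue,
// then block on the queue's lock. A priority-donating lock implementation
// would propagate the waiter's priority to the worker here.
void waitForWork(CacheEntry &entry) {
  std::shared_ptr<WaitQueue> queue;
  {
    std::lock_guard<std::mutex> guard(entry.CacheLock);
    queue = entry.Queue;
  }
  if (!queue)
    return;  // the work already finished
  std::lock_guard<std::mutex> blocked(queue->WorkerLock);
  // Once we get here the worker has released the lock; the waiter would now
  // re-check the cache and try to advance the dependent completion itself.
}
```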
Dependencies are unblocked by simply releasing the lock held
in the queue. The unblocking thread doesn't know exactly what
metadata are blocked on it and doesn't make any effort to
directly advance their completion; instead, the blocking
thread will wake up and then attempt to advance the dependent
metadata completion itself, eliminating a source of priority
overhang that affected the old system. Successive rounds of
unblocking (e.g. when a metadata makes partial progress but
isn't yet complete) can be achieved by creating a new queue
and unlocking the old one. We can still record dependencies
and use them to dynamically diagnose metadata cycles.
The new system allocates more eagerly than the old one.
Formerly, metadata completions which were never blocked never
needed to allocate a MetadataCompletionQueueEntry; we were
then unable to actually deallocate those entries once they
were allocated. The new system will allocate a queue for
most metadata completions, although, on the positive side,
we can reliably deallocate these queues. Cache entries are
also now slightly smaller because some of the excess storage
for status has been folded into the queue.
The fast path of an actual read of the metadata remains a
simple load-acquire. Slow paths may require a bit more
locking. On Darwin, the metadata cache lock can now use
os_unfair_lock instead of pthread_mutex_t (which is a massive
improvement) because it does not need to support associated
condition variables.
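For reference, a minimal sketch of that fast path (names are
hypothetical):
```
#include <atomic>

struct Metadata;

// Returns the metadata if it has already been published as complete;
// otherwise the caller falls through to the locking slow path.
const Metadata *tryFastPath(const std::atomic<const Metadata *> &slot) {
  return slot.load(std::memory_order_acquire);
}
```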
The excess locking could be eliminated with some sort of
generational scheme. Sadly, those are not portable, and I
didn't want to take it on up-front.
rdar://76127798
MetadataAllocator should never return NULL, but bugs or corruption could potentially make that happen. On the large path, switch from malloc to swift_slowAlloc, which aborts on failure. On the pool path, check for a NULL allocation pointer, and log a bunch of information about the allocation request and the allocator's current state.
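A hedged sketch of the pool-path check; the diagnostic fields and helper name are illustrative, not the allocator's actual state:
```
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Abort with a diagnostic rather than returning a null "allocation" from a
// corrupted pool.
static void *checkPoolAllocation(void *allocation, std::size_t size,
                                 std::size_t alignMask, void *poolCursor,
                                 std::size_t poolBytesRemaining) {
  if (allocation)
    return allocation;
  std::fprintf(stderr,
               "fatal error: metadata allocation of size %zu (alignMask %zu) "
               "returned null; pool cursor %p, %zu bytes remaining\n",
               size, alignMask, poolCursor, poolBytesRemaining);
  std::abort();
}
```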
rdar://84503396
Mangling can fail, usually because the Node structure has been built
incorrectly or because something isn't supported by the old remangler.
We shouldn't just terminate the program when that happens, particularly
if it happens because someone has passed bad data to the demangler.
rdar://79725187
If anyone else is building Windows ARM64, they should be using a new
enough Visual Studio. This workaround is difficult to keep working
properly, and the CI hosts should hopefully have a new enough Visual
Studio installation to enable the ARM64 builds of the runtime. If they
do not, we can re-evaluate whether to reinstate the workaround. This
allows building part of the runtime with Visual Studio 2022 and reduces
the maintenance overhead for the runtime.
Moved the test for the metadata kind to MetadataLookup.cpp.
Added an assertion (for debug builds) to Metadata.cpp to catch the case where
something manages to bypass that test.
Added a special test for getObjCClassByMangledName; this needs testing
separately as it uses the DecodedMetadataBuilder, which doesn't get exercised
by the normal demangling tests.
Added all the test cases from rdar://63485806, rdar://63488139, rdar://63496478,
rdar://63410196 and rdar://68449341. The test cases from rdar://63485806 are
disabled for now because the problem there is the error handling mechanism (or
lack thereof), rather than us not handling errors.
Fixes the remaining cases from
rdar://63488139
rdar://63496478
Added SWIFT_RUNTIME_WEAK_IMPORT/CHECK/USE macros.
Everything supports fast dealloc except x86 iOS simulators, so we no longer need
to look up objc_has_weak_formation_callout.
Added direct references for
objc_setHook_lazyClassNamer
_objc_realizeClassFromSwift
objc_setHook_getClass
os_system_version_get_current_version
_dyld_is_objc_constant
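A hedged sketch of what such macros typically expand to on Darwin; the
expansions and the example symbol below are assumptions rather than the
runtime's actual definitions:
```
// Declare a symbol weakly so the binary still loads on OS versions that do
// not provide it, then branch on its address before using it.
#define SWIFT_RUNTIME_WEAK_IMPORT __attribute__((weak_import))
#define SWIFT_RUNTIME_WEAK_CHECK(symbol) (&(symbol) != nullptr)
#define SWIFT_RUNTIME_WEAK_USE(expr) (expr)

// Hypothetical stand-in for one of the entry points listed above.
extern "C" SWIFT_RUNTIME_WEAK_IMPORT void example_weak_entry_point(void);

static void useEntryPointIfAvailable() {
  if (SWIFT_RUNTIME_WEAK_CHECK(example_weak_entry_point))
    SWIFT_RUNTIME_WEAK_USE(example_weak_entry_point());
}
```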
Implement name mangling, type metadata, runtime demangling, etc. for
global-actor qualified function types. Ensure that the manglings
round-trip through the various subsystems.
Implements rdar://78269642.
Previously, AsyncFunctionPointer constants were signed as code. That
was incorrect considering that these constants are in fact data. Here,
that is fixed.
rdar://76118522
* Move differentiability kinds from target function type metadata to trailing objects so that we don't exhaust all remaining bits of function type metadata.
* Differentiability kind is now stored in a tail-allocated word when function type flags say it's differentiable, located immediately after the normal function type metadata's contents (with proper alignment in between); a sketch of this layout follows the grammar below.
* Add new runtime function `swift_getFunctionTypeMetadataDifferentiable` which handles differentiable function types.
* Fix mangling of different differentiability kinds in function types. Mangle it like `ConcurrentFunctionType` so that we can drop special cases for escaping functions.
```
function-signature ::= params-type params-type async? sendable? throws? differentiable? // results and parameters
...
differentiable ::= 'jf' // @differentiable(_forward) on function type
differentiable ::= 'jr' // @differentiable(reverse) on function type
differentiable ::= 'jd' // @differentiable on function type
differentiable ::= 'jl' // @differentiable(_linear) on function type
```
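A hedged sketch of reading the tail-allocated kind; the type and field
names below are illustrative rather than the runtime's actual
declarations:
```
#include <cstddef>
#include <cstdint>

enum class DifferentiabilityKind : std::uint32_t {
  NonDifferentiable, Forward, Reverse, Normal, Linear
};

struct FunctionTypeFlags {
  std::uintptr_t Bits;
  bool isDifferentiable() const { return Bits & 0x1; }  // bit position illustrative
};

struct FunctionTypeMetadata {
  const void *Kind;        // kind/value-witness header, elided
  FunctionTypeFlags Flags;
  // parameter and result metadata pointers follow in the normal contents...
};

// When the flags say the type is differentiable, the kind is stored in a word
// immediately after the normal contents, rounded up to pointer alignment.
DifferentiabilityKind
getDifferentiabilityKind(const FunctionTypeMetadata *metadata,
                         std::size_t normalContentsSize) {
  if (!metadata->Flags.isDifferentiable())
    return DifferentiabilityKind::NonDifferentiable;
  auto start = reinterpret_cast<std::uintptr_t>(metadata);
  auto aligned = (start + normalContentsSize + alignof(void *) - 1) &
                 ~std::uintptr_t(alignof(void *) - 1);
  return *reinterpret_cast<const DifferentiabilityKind *>(aligned);
}
```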
Resolves rdar://75240064.
The previous fix here switched dyn_cast for dyn_cast_or_null, but this left us with an assertion failure in cast<>. Instead, explicitly check for NULL at the top of the function.
Somehow, clang was generating code that accepted nil and returned nil in swift_getObjCClassFromMetadata, but is no longer doing so. Some code relies on this to work, so switch to dyn_cast_or_null to accept nil explicitly.
rdar://74895271
Take the existing CompatibilityOverride mechanism and generalize it so it can be used in both the runtime and Concurrency libraries. The mechanism is preprocessor-heavy, so this requires some tricks. Use the SWIFT_TARGET_LIBRARY_NAME define to distinguish the libraries, and use a different .def file and mach-o section name accordingly.
We want the global/main executor functions to be a little more flexible. Instead of using the override mechanism, we expose function pointers that can be set by the compatibility library, or by any other code that wants to use a custom implementation.
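A hedged sketch of the function-pointer hook pattern; the hook name and signature are illustrative, not the actual exported symbols:
```
#include <cstdio>

struct Job;  // opaque here

// Default behavior used when no compatibility library has installed a hook.
static void defaultEnqueueGlobal(Job *job) {
  std::printf("enqueue job %p on the built-in global executor\n",
              static_cast<void *>(job));
}

// Exposed, overridable hook; a compatibility library (or any embedder) can
// assign to it to take over the behavior.
void (*swift_enqueueGlobal_hook)(Job *job) = nullptr;

void enqueueGlobal(Job *job) {
  if (swift_enqueueGlobal_hook)
    swift_enqueueGlobal_hook(job);
  else
    defaultEnqueueGlobal(job);
}
```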
rdar://73726764
Some ObjC runtime calls are weak or strong depending on the deployment target. When strong, we get warnings that the NULL checks always succeed; silence them.
Some of the adjacent code looked up functions using dlsym when they aren't provided by the SDK. Our current minimum SDK always has them, so remove the dlsym workaround.
In the uncached case, we'd scan conformances, cache them, then re-query the cache. This worked fine when the cache always grew, but now we clear the cache when loading new Swift images into the process. If that happens between the scan and the re-query, we lose the entry and return a false negative.
Instead, track what we've found in the scan in a separate local table, then query that after completing the scan.
While we're in there, fix a bug in TypeLookupError where operator= accidentally copied this->Context instead of other.Context. This caused the runtime to crash when trying to print error messages due to the false negative.
Add a no-parameter constructor to TypeLookupErrorOr<> to distinguish the case where it's being initialized with nothing from the case where it's being initialized with nullptr.
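A hedged sketch of the operator= fix; the member and helper names are illustrative, not TypeLookupError's actual layout:
```
class TypeLookupErrorSketch {
  using Context = void *;
  Context Ctx = nullptr;
  Context (*CopyFn)(Context) = nullptr;
  void (*FreeFn)(Context) = nullptr;

public:
  TypeLookupErrorSketch &operator=(const TypeLookupErrorSketch &other) {
    if (this == &other)
      return *this;
    if (FreeFn)
      FreeFn(Ctx);
    CopyFn = other.CopyFn;
    FreeFn = other.FreeFn;
    // The bug was copying this->Ctx here instead of other.Ctx, leaving the
    // destination pointing at its own stale context.
    Ctx = CopyFn ? CopyFn(other.Ctx) : nullptr;
    return *this;
  }
};
```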
Instead of scribbling each allocation as it's parceled out, scribble the entire chunk up-front. Then, when handing out an allocation, check to make sure it still has the right scribbled data in it.
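A hedged sketch of the scheme; the scribble byte and helper names are illustrative:
```
#include <cassert>
#include <cstddef>
#include <cstring>

static const unsigned char ScribbleByte = 0xAA;

// Scribble the whole chunk as soon as it is acquired from the system.
void scribbleChunk(void *chunk, std::size_t chunkSize) {
  std::memset(chunk, ScribbleByte, chunkSize);
}

// When parceling out an allocation, verify it still holds the scribble; any
// mismatch means something wrote into pool memory that was never allocated.
void *allocateFromChunk(unsigned char *&cursor, std::size_t size) {
  unsigned char *allocation = cursor;
  cursor += size;
  for (std::size_t i = 0; i < size; ++i)
    assert(allocation[i] == ScribbleByte &&
           "pool memory modified before it was allocated");
  return allocation;
}
```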
To prevent rdar://problem/68997282 from regressing, verify at runtime in
debug builds that in calls to swift_allocateGenericValueMetadata the
extraDataSize argument matches the OffsetInWords and SizeInWords
specified by the GenericMetadataPartialPattern available within the
pattern argument.
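A hedged sketch of what such a check could look like; the pattern layout
and the exact size relation below are assumptions:
```
#include <cassert>
#include <cstddef>
#include <cstdint>

struct GenericMetadataPartialPattern {
  std::uint16_t OffsetInWords;  // field widths illustrative
  std::uint16_t SizeInWords;
};

// Debug-build check: the caller-supplied extraDataSize must agree with the
// extra-data pattern attached to the metadata pattern, when one exists.
void checkExtraDataSize(std::size_t extraDataSize,
                        const GenericMetadataPartialPattern *extraData) {
#ifndef NDEBUG
  if (extraData) {
    std::size_t expected =
        (extraData->OffsetInWords + extraData->SizeInWords) * sizeof(void *);
    assert(extraDataSize == expected &&
           "extraDataSize disagrees with the pattern's extra data");
  }
#else
  (void)extraDataSize;
  (void)extraData;
#endif
}
```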
initClassFieldOffsetVector writes the instanceStart and size to the class's rodata. In some cases they already match, and this write will dirty memory unnecessarily, and prevent the compiler from emitting those rodatas into read-only memory.
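A hedged sketch of the write-avoidance; the field and type names are illustrative:
```
#include <cstdint>

struct ClassROData {
  std::uint32_t InstanceStart;
  std::uint32_t InstanceSize;
};

// Only write when the values actually change, so a rodata whose values
// already match does not get dirtied unnecessarily.
void updateClassExtents(ClassROData *rodata, std::uint32_t instanceStart,
                        std::uint32_t instanceSize) {
  if (rodata->InstanceStart != instanceStart)
    rodata->InstanceStart = instanceStart;
  if (rodata->InstanceSize != instanceSize)
    rodata->InstanceSize = instanceSize;
}
```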
rdar://problem/71119533
* [Runtime] Switch MetadataCache to ConcurrentReadableHashMap.
Use StableAddressConcurrentReadableHashMap. MetadataCacheEntry's methods for awaiting a particular state assume a stable address: they repeatedly examine `this` in a loop while waiting on a condition variable, so we give entries a stable address to accommodate that. Some of these caches may be able to tolerate unstable addresses if this code is changed to perform the necessary table lookup each time through the loop instead. Some of them store metadata inline, and we assume metadata never moves, so those will have to stay this way.
* Have StableAddressConcurrentReadableHashMap remember the last found entry and check that before doing a more expensive lookup (sketched below, after this list).
* Make a SmallMutex type that stores the mutex data out of line, and use it to get LockingConcurrentMapStorage to fit into the available space on 32-bit.
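A hedged sketch of the last-found-entry optimization; the map and entry interface are stand-ins, not the actual ConcurrentReadableHashMap API:
```
#include <atomic>
#include <mutex>
#include <unordered_map>

// Entries are assumed to have stable addresses, never be removed, and expose
// a matches(key) predicate.
template <typename Key, typename Entry>
class LastFoundCachingMap {
  std::mutex Lock;
  std::unordered_map<Key, Entry *> Map;  // stand-in for the concurrent map
  std::atomic<Entry *> LastFound{nullptr};

public:
  Entry *find(const Key &key) {
    // Cheap check against the most recently returned entry first.
    if (Entry *last = LastFound.load(std::memory_order_acquire))
      if (last->matches(key))
        return last;

    // Fall back to the full lookup.
    std::lock_guard<std::mutex> guard(Lock);
    auto it = Map.find(key);
    if (it == Map.end())
      return nullptr;
    LastFound.store(it->second, std::memory_order_release);
    return it->second;
  }
};
```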
rdar://problem/70220660
Add a new entry point for getting generic metadata which adds the
canonical metadata records attached to the nominal type descriptor to
the metadata cache.
Change the implementation of the primary entry-point
swift_getGenericMetadata to stop looking through canonical
prespecialized records.
Change the implementation of swift_getCanonicalSpecializedMetadata to
use the caching token attached to the nominal type descriptor to add
canonical prespecialized metadata records to the metadata cache only
once rather than using the cache variables to limit the number of times
the attempt was made.
When swift_compareTypeContextDescriptors was added, it did not auth the
TypeContextDescriptor arguments that were passed to it. Fix that here.
There are no uses of this function yet, so there are no signings that
need to be added.