The Swift compiler incorrectly sets the LLVM "ReadNone" attribute when
declaring swift_getObjCClassFromObject and swift_projectBox. This
means that the LLVM ARC optimizer will hoist ARC "release" operations
above these runtime calls. Since the implementation of the calls reads
the object, this causes a use-after-free crash.
This problem is easy to reproduce with
swift_getObjCClassFromObject. It was only exposed recently because the
SIL optimizer now shrinks object lifetimes, making it easier for LLVM
optimizations to kick in. It is difficult to expose the problem with
swift_projectBox, but the bug/fix is still obvious.
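For illustration, a minimal sketch of the shape of the fix in LLVM-API
terms (not the actual IRGen change; attribute names are from the LLVM
of this era, before `memory(...)` effects):

```cpp
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"

// Sketch: these runtime functions read the object they are passed, so
// "readnone" (no memory access at all) is incorrect; "readonly" is the
// strongest attribute they can safely carry.
void fixRuntimeFnAttrs(llvm::Module &M) {
  for (const char *name : {"swift_getObjCClassFromObject", "swift_projectBox"}) {
    if (llvm::Function *fn = M.getFunction(name)) {
      fn->removeFnAttr(llvm::Attribute::ReadNone);
      fn->addFnAttr(llvm::Attribute::ReadOnly);
    }
  }
}
```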
Fixes rdar://73820091: use-after-free application crash.
It would be more abstractly correct if this got DI support so
that we destroy the member if the constructor terminates
abnormally, but we can get to that later.
In derivatives of loops, no longer allocate boxes for indirect case payloads. Instead, use a custom pullback context in the runtime that contains a bump-pointer allocator.
When a function contains a differentiated loop, the closure context is a `Builtin.NativeObject` containing a `swift::AutoDiffLinearMapContext` and a tail-allocated top-level linear map struct (the linear map struct that was previously partial-applied directly into the pullback). In branching trace enums, payloads of previously indirect cases are allocated by `swift::AutoDiffLinearMapContext::allocate` and stored as `Builtin.RawPointer`s.
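A minimal sketch of what such a bump-pointer context could look like;
the real `swift::AutoDiffLinearMapContext` differs in detail:

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sketch: payloads allocated here live until the whole context is
// destroyed, so loop iterations can cheaply allocate linear-map data
// without per-box reference counting.
class LinearMapContextSketch {
  static constexpr size_t SlabSize = 4096;
  std::vector<char *> slabs;
  char *next = nullptr;
  char *end = nullptr;

  void grow(size_t atLeast) {
    size_t size = atLeast > SlabSize ? atLeast : SlabSize;
    char *slab = static_cast<char *>(std::malloc(size));
    slabs.push_back(slab);
    next = slab;
    end = slab + size;
  }

public:
  // Returns suitably aligned storage; never freed individually.
  void *allocate(size_t size, size_t align) {
    uintptr_t p = reinterpret_cast<uintptr_t>(next);
    uintptr_t aligned = (p + align - 1) & ~(uintptr_t)(align - 1);
    if (aligned + size > reinterpret_cast<uintptr_t>(end)) {
      grow(size + align);
      p = reinterpret_cast<uintptr_t>(next);
      aligned = (p + align - 1) & ~(uintptr_t)(align - 1);
    }
    next = reinterpret_cast<char *>(aligned + size);
    return reinterpret_cast<void *>(aligned);
  }

  ~LinearMapContextSketch() {
    for (char *slab : slabs)
      std::free(slab);
  }
};
```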
Thick async functions store their async context size in the closure
context. Only if the closure context is nil can we assume that the
partial_apply_forwarder function is the address of an async function
pointer struct.
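A sketch of the resulting rule, with hypothetical layouts for both
structs:

```cpp
#include <cstddef>
#include <cstdint>

// Mirrors the async function pointer struct: a relative pointer to the
// function plus the expected size of its initial async context.
struct AsyncFunctionPointerSketch {
  int32_t relativeFunction;
  uint32_t expectedContextSize;
};

// Hypothetical thick-closure context layout that carries the size.
struct ThickAsyncContextSketch {
  size_t asyncContextSize;
};

size_t initialContextSize(const void *fnOrForwarder,
                          const void *closureContext) {
  if (closureContext == nullptr) {
    // No context: the function value is an async function pointer struct.
    auto *afp = static_cast<const AsyncFunctionPointerSketch *>(fnOrForwarder);
    return afp->expectedContextSize;
  }
  // Otherwise the thick closure context itself stores the size.
  return static_cast<const ThickAsyncContextSketch *>(closureContext)
      ->asyncContextSize;
}
```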
`Builtin.createAsyncTask` takes flags, an optional parent task, and an
async/throwing function to execute, and passes them along to the
`swift_task_create_f` entry point to create a new (potentially child)
task, returning the new task and its initial context.
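Sketched with approximate types; the exact runtime signature may differ
across runtime versions:

```cpp
#include <cstddef>

struct AsyncTask;
struct AsyncContext;

// The pair returned to the caller of the builtin.
struct AsyncTaskAndContext {
  AsyncTask *task;              // the newly created (possibly child) task
  AsyncContext *initialContext; // its initial async context
};

using TaskFunction = void(AsyncContext *);

extern "C" AsyncTaskAndContext
swift_task_create_f(size_t flags,           // task-creation flags
                    AsyncTask *parent,      // optional parent task, may be null
                    TaskFunction *function, // async/throwing function to run
                    size_t initialContextSize);
```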
Implement a new builtin, `cancelAsyncTask()`, to cancel the given
asynchronous task. This lowers to a call into the runtime operation
`swift_task_cancel()`.
Use this builtin to implement `Task.Handle.cancel()`.
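Declaration sketch of the lowering target:

```cpp
struct AsyncTask;

// Marks the given task as cancelled.
extern "C" void swift_task_cancel(AsyncTask *task);
```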
Add a new entry point for getting generic metadata which adds the
canonical metadata records attached to the nominal type descriptor to
the metadata cache.
Change the implementation of the primary entry point
swift_getGenericMetadata to stop looking through canonical
prespecialized records.
Change the implementation of swift_getCanonicalSpecializedMetadata to
use the caching token attached to the nominal type descriptor to add
canonical prespecialized metadata records to the metadata cache only
once, rather than relying on the cache variables to limit how many
times the attempt is made.
The new function swift_getCanonicalSpecializedMetadata takes a metadata
request, a prespecialized non-canonical metadata, and a cache as its
arguments. Its job is either to bless the provided prespecialized
metadata as canonical, if there is not currently a canonical metadata
record for the type it describes, or else to return the existing
canonical metadata.
When called, the function checks the metadata cache for a preexisting
entry for this metadata. If none is found, the passed-in prespecialized
metadata is added to the cache. Otherwise, the metadata record found in
the cache is returned.
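A minimal sketch of that bless-or-return behavior, using an ordinary
locked map in place of the runtime's concurrent metadata cache and the
descriptor's caching token:

```cpp
#include <mutex>
#include <unordered_map>

struct MetadataSketch {
  const void *typeDescriptor; // identifies the type this record describes
};

static std::mutex cacheLock;
static std::unordered_map<const void *, MetadataSketch *> canonicalCache;

MetadataSketch *
getCanonicalSpecializedMetadataSketch(MetadataSketch *candidate) {
  std::lock_guard<std::mutex> guard(cacheLock);
  // If no canonical record exists yet, the candidate is blessed as
  // canonical; otherwise the preexisting record wins.
  auto result =
      canonicalCache.try_emplace(candidate->typeDescriptor, candidate);
  return result.first->second;
}
```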
rdar://problem/56995359
The new function swift_compareProtocolConformanceDescriptors calls
through to the preexisting code in MetadataCacheKey, which has been
extracted from MetadataCacheKey::compareWitnessTables into a new
public static function
MetadataCacheKey::compareProtocolConformanceDescriptors.
The new function's availability is "future" for now.
The new function `swift_compareTypeContextDescriptors` is equivalent to
a call through to swift::equalContexts. The implementation is the same
as that of swift::equalContexts with the following removals:
- Handling of context descriptors whose kind lies outside the range
ContextDescriptorKind::Type_First...ContextDescriptorKind::Type_Last.
Because both arguments are TypeContextDescriptors, their kinds are
known to fall within that range.
- Casting to TypeContextDescriptor. The arguments are already of that
type.
For now, the new function has "future" availability.
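A simplified sketch of the comparison's shape; the real
swift::equalContexts also handles module and extension contexts:

```cpp
#include <cstring>

struct TypeContextDescriptorSketch {
  const TypeContextDescriptorSketch *parent; // enclosing context, or null
  const char *name;                          // the type's name
};

bool equalTypeContexts(const TypeContextDescriptorSketch *a,
                       const TypeContextDescriptorSketch *b) {
  if (a == b)
    return true; // identical records are trivially equal
  if (!a || !b)
    return false;
  if (std::strcmp(a->name, b->name) != 0)
    return false;
  // Equal names must also live in equal enclosing contexts.
  return equalTypeContexts(a->parent, b->parent);
}
```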
Use getTypeByMangledName when abstract metadata state is requested
This can significantly reduce the code size of apps constructing deeply
nested types with conditional conformances.
Requires a new runtime.
rdar://57157619
When we generate code that asks for complete metadata for a fully concrete specific type that
doesn't have trivial metadata access, like `(Int, String)` or `[String: [Any]]`,
generate a cache variable that points to a mangled name, and use a common accessor function
that turns that cache variable into a pointer to the instantiated metadata. This saves a bunch
of code size, and should have minimal runtime impact, since the demangling of any string only
has to happen once.
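A sketch of that accessor pattern, using a hypothetical low-bit tag to
distinguish "still a mangled name" from "already metadata" and an
assumed demangling hook; the real cache-variable encoding differs:

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

struct Metadata;

// Assumed hook standing in for the runtime's demangling entry point.
Metadata *demangleTypeName(const char *name, size_t length);

Metadata *instantiateConcreteTypeFromMangledName(
    std::atomic<uintptr_t> &cache) {
  uintptr_t value = cache.load(std::memory_order_acquire);
  if (!(value & 1)) // already replaced by the instantiated metadata
    return reinterpret_cast<Metadata *>(value);
  auto *name = reinterpret_cast<const char *>(value & ~uintptr_t(1));
  Metadata *metadata = demangleTypeName(name, std::strlen(name));
  // Racing threads compute the same metadata, so a plain store is fine;
  // the demangling of any given string effectively happens once.
  cache.store(reinterpret_cast<uintptr_t>(metadata),
              std::memory_order_release);
  return metadata;
}
```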
This mostly just works, though it exposed a couple of issues:
- Mangling a type ref including objc protocols didn't cause the objc protocol record to get
instantiated. Fixed as part of this patch.
- The runtime type demangler doesn't correctly handle retroactive conformances. If there are
multiple retroactive conformances in a process at runtime, then even though the mangled string
refers to a specific conformance, the runtime still just picks one without listening to the
mangler. This is left to fix later, rdar://problem/53828345.
There is some more follow-up work that we can do to further improve the gains:
- We could improve the runtime-provided entry points, adding versions that don't require size
to be cached, and which can handle arbitrary metadata requests. This would allow for mangled
names to also be used for incomplete metadata accesses and improve code size of some generic
type accessors. However, we'd only be able to take advantage of the new entry points in
OSes that ship a new runtime.
- We could choose to always symbolic reference all type references, which would generally reduce
the size of mangled strings, as well as make runtime demangling more efficient, since it wouldn't
need to hit the runtime caches. This would however require that we be able to handle symbolic
references across files in the MetadataReader in order to avoid regressing remote mirror
functionality.
Backward-deploy the new dynamic-replacement runtime functions.
The recent change to how we do dynamic replacements added two new
runtime functions. This patch adds those functions to the
Compatibility50 static archive.
This allows backward deployment to a Swift 5.0 runtime.
Patch by Erik Eckstein, with a modification to call the standard
library's implementation (marked as weak) when it is available.
This ensures we can change the implementation in the future and are not
ABI-locked.
rdar://problem/51601233
Instead of a thunk, insert the dispatch into the original function.
If the original function should be executed, the prolog just jumps to the "real" code in the function. Otherwise, the replacement function is called.
There is one little complication here: when the replacement function calls the original function, the original function should not dispatch to the replacement again.
To pass this information, we use a flag in thread-local storage.
The flag is set and read by two new runtime functions.
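A sketch of the dispatch using a plain thread-local flag; in the real
implementation the flag lives behind the two runtime functions:

```cpp
#include <cstdio>

static thread_local bool callOriginalNext = false;

void replacement(); // forward declaration

void original() {
  // Prolog dispatch: unless the flag says "run the original", jump to
  // the current replacement instead.
  if (!callOriginalNext) {
    replacement();
    return;
  }
  callOriginalNext = false;
  std::puts("original body");
}

void replacement() {
  std::puts("replacement body");
  // Call back into the original without re-entering the replacement.
  callOriginalNext = true;
  original();
}
```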
rdar://problem/51043781
When backward deploying to an OS that may not have these entry points, weak-link them so that they
can be used conditionally in availability contexts that check for them.
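The pattern, sketched with a hypothetical entry-point name:

```cpp
// On an OS whose runtime lacks this symbol, the address resolves to null.
extern "C" void swift_someNewEntryPoint(void) __attribute__((weak_import));

void callIfAvailable() {
  if (&swift_someNewEntryPoint != nullptr)
    swift_someNewEntryPoint();
}
```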
rdar://problem/50731151
This is essentially a long-belated follow-up to Arnold's #12606.
The key observation here is that the enum-tag-single-payload witnesses
are strictly more powerful than the XI witnesses: you can simulate
the XI witnesses by using an extra case count that's <= the XI count.
Of course the result is less efficient than the XI witnesses, but
that's less important than overall code size, and we can work on
fast-paths for that.
The extra inhabitant count is stored in a 32-bit field (always present)
following the ValueWitnessFlags, which now occupy a fixed 32 bits.
This inflates non-XI VWTs on 32-bit targets by a word, but the net effect
on XI VWTs is to shrink them by two words, which is likely to be the
more important change. Also, being able to access the XI count directly
should be a nice win.
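The approximate shape of the layout after this change (field names are
sketches, not the exact runtime declarations):

```cpp
#include <cstddef>
#include <cstdint>

struct ValueWitnessFlagsSketch {
  uint32_t data; // alignment mask, non-POD/non-inline bits, etc.
};

struct TypeLayoutSketch {
  size_t size;
  size_t stride;
  ValueWitnessFlagsSketch flags; // now a fixed 32 bits
  uint32_t extraInhabitantCount; // always present, directly readable
};
```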
Currently ignored, but this will allow future compilers to pass down source location information for cast
failure runtime errors without backward deployment constraints.
Introduce a new runtime entry point, swift_getAssociatedConformanceWitness(),
which extracts an associated conformance witness from a witness table.
Teach IRGen to use this entry point rather than loading the witness
from the witness table and calling it directly.
There’s no advantage to doing this now, but it is staging for changing the
representation of associated conformances in witness tables.
Runtime functions need to use the Swift calling convention for any function
returning MetadataResponse, so that we get the two values returned in separate
registers.
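For example (hypothetical entry point; `swiftcall` is Clang's spelling
of the Swift calling convention):

```cpp
#include <cstddef>

struct Metadata;

// Two pointer-sized values: the metadata and its completeness state.
struct MetadataResponse {
  const Metadata *value;
  size_t state;
};

// With the Swift calling convention, both fields come back in registers
// rather than through an sret buffer.
extern "C" __attribute__((swiftcall)) MetadataResponse
swift_getSomeMetadataSketch(size_t request);
```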
Fixes rdar://problem/45042971 and rdar://problem/45851050.
This runtime function doesn’t always perform instantiation; it’s how we
get a witness table given a conformance, type, and set of instantiation
arguments. Name it accordingly.
Witness table accessors return a witness table for a given type's
conformance to a protocol. They are called directly from IRGen
(when we need the witness table instance) and from runtime conformance
checking (swift_conformsToProtocol digs the access function out of the
protocol conformance record). They perform two interesting functions:
1) For witness tables requiring instantiation, they call
swift_instantiateWitnessTable directly.
2) For synthesized witness tables that might not be unique, they call
swift_getForeignWitnessTable.
Extend swift_instantiateWitnessTable() to handle both runtime
uniquing (for #2) and witness tables that don't have a "generic
table", i.e., that don't need any actual instantiation. Use it as the
universal entry point for "get a witness table given a specific
conformance descriptor and type", eliminating witness table accessors
entirely.
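Pseudologic of the universal entry point; the descriptor fields and
helpers below are hypothetical simplifications of the real layout:

```cpp
struct Metadata;
struct WitnessTable;

struct ConformanceDescriptorSketch {
  const WitnessTable *pattern; // witness table pattern, stored directly
  bool hasGenericTable;        // needs real instantiation
  bool mightNotBeUnique;       // e.g. synthesized foreign conformances
};

// Stubs for the two non-trivial paths.
const WitnessTable *instantiateAndCache(const ConformanceDescriptorSketch *,
                                        const Metadata *, void **);
const WitnessTable *uniqueForeignTable(const ConformanceDescriptorSketch *);

const WitnessTable *
instantiateWitnessTableSketch(const ConformanceDescriptorSketch *conf,
                              const Metadata *type,
                              void **instantiationArgs) {
  if (conf->hasGenericTable)
    return instantiateAndCache(conf, type, instantiationArgs);
  if (conf->mightNotBeUnique)
    return uniqueForeignTable(conf);
  return conf->pattern; // no instantiation needed: the pattern is the table
}
```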
Make a few related simplifications:
* Drop the "pattern" from the generic witness table. Instead, store
the pattern in the main part of the conformance descriptor, always.
* Drop the "conformance kind" from the protocol conformance
descriptor, since it was only there to distinguish between witness
table (pattern) vs. witness table accessor.
* Internalize swift_getForeignWitnessTable(); IRGen no longer needs to
call it.
Reduces the code size of the standard library (+assertions build) by
~149k.
Addresses rdar://problem/45489388.
Collapse the generic witness table, which was used only as a uniquing
data structure during witness table instantiation, into the protocol
conformance record. This colocates all of the constant protocol conformance
metadata and makes it possible for us to recover the generic witness table
from the conformance descriptor (including looking at the pattern itself).
Rename swift_getGenericWitnessTable() to swift_instantiateWitnessTable()
to make it clearer what its purpose is, and take the conformance descriptor
directly.
Have clients pass the requirement base descriptor to
swift_getAssociatedTypeWitness(), so that the witness index is just one
subtraction away, avoiding several dependent loads (witness table ->
conformance descriptor -> protocol descriptor -> requirement offset)
in the hot path.
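The resulting hot path, sketched; the real code also adds a small fixed
offset for the witness table header:

```cpp
#include <cstddef>
#include <cstdint>

struct ProtocolRequirementSketch {
  uint32_t flags;
  int32_t defaultImplementation; // a relative pointer in the real layout
};

const void *getAssociatedTypeWitnessSketch(
    void *const *witnessTable,
    const ProtocolRequirementSketch *requirementBase,
    const ProtocolRequirementSketch *assocTypeRequirement) {
  // Requirement descriptors are contiguous, so one subtraction yields
  // the slot index; no chain of dependent descriptor loads.
  std::ptrdiff_t index = assocTypeRequirement - requirementBase;
  return witnessTable[index];
}
```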
Rather than rely on the metadata initialization function to compute and
fill in the superclass, use the mangled superclass name to construct the
superclass metadata.
swift_getAssociatedTypeWitness() is logically readnone because it is the
only function that accesses associated type witness information within the
witness table. Mark calls to it as readnone and nounwind.
Rather than storing associated type metadata access functions in
witness tables, initially store a pointer to a mangled type name.
On first access, demangle that type name and replace the witness
table entry with the resulting type metadata.
This reduces the code size of protocol conformances, because we no
longer need to create associated type metadata access functions for
every associated type, and the mangled names are much smaller (and
sharable). The same code size improvements apply to defaulted
associated types for resilient protocols, although those are more
rare. Witness tables themselves are slightly smaller, because we
don’t need separate private entries in them to act as caches.
On the caller side, associated type metadata is always produced via
a call to swift_getAssociatedTypeWitness(), which handles the demangling
and caching behavior.
In all, this reduces the size of the standard library by ~70k. There
are additional code-size wins that are possible with follow-on work:
* We can stop emitting type metadata access functions for non-resilient
types that have constant metadata (like `Int`), because they’re only
currently used as associated type metadata access functions.
* We can stop emitting separate associated type reflection metadata,
because the reflection infrastructure can use these mangled names
directly.
If a class has a backward deployment layout:
- We still want to emit it using the FixedClassMetadataBuilder.
- We still want it to appear in the objc_classes section, and get an
OBJC_CLASS_$_ symbol if it's @objc.
- However, we want to use the singleton metadata initialization pattern
in the metadata accessor.
- We want to emit metadata for all field types, and call the
swift_updateClassMetadata() function to initialize the class
metadata.
For now, this function just performs the idempotent initialization of
invoking a static method on the class, causing it to be realized with
the Objective-C runtime.
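A sketch of that idempotent initialization on Apple platforms (the
function name and the particular message sent here are illustrative,
not the real implementation):

```cpp
#include <objc/message.h>
#include <objc/runtime.h>

extern "C" Class swift_updateClassMetadataSketch(Class cls) {
  // Sending any message (here, `self`) forces the Objective-C runtime
  // to realize the class; repeating it is harmless, which is what makes
  // the initialization idempotent.
  return ((Class (*)(Class, SEL))objc_msgSend)(cls, sel_registerName("self"));
}
```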
Previously we would emit class metadata for classes with resilient
ancestry, and relocate it at runtime once the correct size was known.
However most of the fields were blank, so it makes more sense to
construct the metadata from scratch, and store the few bits that we
do need in a true-const pattern where we can use relative pointers.
They were, already, but remove the isConstant parameter to
getAddrOfTypeMetadataPattern(), and just assert that it's true for
patterns in defineTypeMetadata() instead.
Also, metadata patterns are i8*, not i8**. In fact they don't contain any
absolute pointers at all.
Should be NFC other than the LLVM type change.