When generic metadata for a class is requested in the same module where
the class is defined, rather than a call to the generic metadata
accessor or to a variant of typeForMangledNode, a call to a new
accessor--a canonical specialized generic metadata accessor--is emitted.
The new function is defined schematically as follows:
MetadataResponse `canonical specialized metadata accessor for C<K>`(MetadataRequest request) {
(void)`canonical specialized metadata accessor for superclass(C<K>)`(::Complete)
(void)`canonical specialized metadata accessor for generic_argument_class(C<K>, 1)`(::Complete)
...
(void)`canonical specialized metadata accessor for generic_argument_class(C<K>, count)`(::Complete)
auto *metadata = objc_opt_self(`canonical specialized metadata for C<K>`);
return {metadata, MetadataState::Complete};
}
where generic_argument_class(C<K>, N) denotes the Nth generic argument
which is both (1) itself a specialized generic type and is also (2) a
class. These calls to the specialized metadata accessors for these
related types ensure that all generic class types are registered with
the Objective-C runtime.
To enable these new canonical specialized generic metadata accessors,
metadata for generic classes is prespecialized as needed. So are the
metaclasses and the corresponding rodata.
Previously, the lazy objc naming hook was registered during process
execution when the first generic class metadata was instantiated. Since
that instantiation may occur "before process launch" (i.e. if the
generic metadata is prespecialized), the lazy naming hook is now
installed at process launch.
The `Qo` operator expects to consume a type name and a list (terminated with a `y` empty list marker) from the stack. After popping the list, it doesn't check whether the stack is empty, so `$syQo` crashes (it pops down to the `y` then tries to pop again).
This PR just adds the obvious check to guard against this.
Resolves rdar://63128307
A malformed mangled name that ends in a truncated symbolic
reference could trigger a read beyond the end of the name.
This is because the code that grabs the next four bytes
bypasses the existing bounds checks. Insert an explicit
bounds check to guard against this.
Mangle `@noDerivative` parameters to fix type reconstruction errors.
Resolves SR-12650. The new mangling is non-breaking.
When differentiation supports multiple result indices and `@noDerivative`
results are added, we can reuse some of this mangling support.
Resolve mangled names containing symbolic references to indirect opaque type descriptors from other
dylibs by demangling the referenced symbol name, like we do for other kinds of context descriptor.
Add an OpaqueArchetypeTypeRef that can represent unresolved opaque types in the Reflection library.
Add mangling scheme for `@differentiable` and `@differentiable(linear)` function
types. Mangling support is important for debug information, among other things.
Update docs and add tests.
Resolves TF-948.
In order to allow this, I've had to rework the syntax of substituted function types; what was previously spelled `<T> in () -> T for <X>` is now spelled `@substituted <T> () -> T for <X>`. I think this is a nice improvement for readability, but it did require me to churn a lot of test cases.
Distinguishing the substitutions has two chief advantages over the existing representation. First, the semantics seem quite a bit clearer at use points; the `implicit` bit was very subtle and not always obvious how to use. More importantly, it allows the expression of generic function types that must satisfy a particular generic abstraction pattern, which was otherwise impossible to express.
As an example of the latter, consider the following protocol conformance:
```
protocol P { func foo() }
struct A<T> : P { func foo() {} }
```
The lowered signature of `P.foo` is `<Self: P> (@in_guaranteed Self) -> ()`. Without this change, the lowered signature of `A.foo`'s witness would be `<T> (@in_guaranteed A<T>) -> ()`, which does not preserve information about the conformance substitution in any useful way. With this change, the lowered signature of this witness could be `<T> @substituted <Self: P> (@in_guaranteed Self) -> () for <A<T>>`, which nicely preserves the exact substitutions which relate the witness to the requirement.
When we adopt this, it will both obviate the need for the special witness-table conformance field in SILFunctionType and make it far simpler for the SILOptimizer to devirtualize witness methods. This patch does not actually take that step, however; it merely makes it possible to do so.
As another piece of unfinished business, while `SILFunctionType::substGenericArgs()` conceptually ought to simply set the given substitutions as the invocation substitutions, that would disturb a number of places that expect that method to produce an unsubstituted type. This patch only set invocation arguments when the generic type is a substituted type, which we currently never produce in type-lowering.
My plan is to start by producing substituted function types for accessors. Accessors are an important case because the coroutine continuation function is essentially an implicit component of the function type which the current substitution rules simply erase the intended abstraction of. They're also used in narrower ways that should exercise less of the optimizer.
Archiving expects to be able to instantiate generic classes by name. This previously worked if you had instantiated the specialization in question, because the ObjC runtime would be able to look it up. Now, if the name hasn't been created, Swift has to look it up. demangleObjCTypeName doesn't do generics, so this failed.
rdar://problem/57674583
- Ensure QuotedString prints characters as `char` and not as an integer-like `unsigned char`
- Assert when trying to add a null child node to a Demangle::Node
Teach SILGen to emit a separate SIL function to capture the
initialization of the backing storage type for a wrapped property
based on the wrapped value. This eliminates manual code expansion at
every use site.
This makes for a cleaner and less implicit-context-heavy API, and makes it easier for symbolic
reference resolvers to do context-dependent things (like map the in-memory base address back to a
remote address in MetadataReader).
The archetype mangling does not have enough information to accurately recover the associated type
at runtime. This fixes rdar://problem/54084733.
Although this changes the mangling in both runtime and symbols, this should not affect ABI, because
there is no way for associated types of opaque types to be surfaced in the types of public
declarations today.
When we generate code that asks for complete metadata for a fully concrete specific type that
doesn't have trivial metadata access, like `(Int, String)` or `[String: [Any]]`,
generate a cache variable that points to a mangled name, and use a common accessor function
that turns that cache variable into a pointer to the instantiated metadata. This saves a bunch
of code size, and should have minimal runtime impact, since the demangling of any string only
has to happen once.
This mostly just works, though it exposed a couple of issues:
- Mangling a type ref including objc protocols didn't cause the objc protocol record to get
instantiated. Fixed as part of this patch.
- The runtime type demangler doesn't correctly handle retroactive conformances. If there are
multiple retroactive conformances in a process at runtime, then even though the mangled string
refers to a specific conformance, the runtime still just picks one without listening to the
mangler. This is left to fix later, rdar://problem/53828345.
There is some more follow-up work that we can do to further improve the gains:
- We could improve the runtime-provided entry points, adding versions that don't require size
to be cached, and which can handle arbitrary metadata requests. This would allow for mangled
names to also be used for incomplete metadata accesses and improve code size of some generic
type accessors. However, we'd only be able to take advantage of the new entry points in
OSes that ship a new runtime.
- We could choose to always symbolic reference all type references, which would generally reduce
the size of mangled strings, as well as make runtime demangling more efficient, since it wouldn't
need to hit the runtime caches. This would however require that we be able to handle symbolic
references across files in the MetadataReader in order to avoid regressing remote mirror
functionality.
When mangling a dependent protocol conformance ref, the mangler currently uses `0_` to mean an unknown index and `N_` to mean the index `N - 1`. Unfortunately, this is somewhat confused: `0_` is actually the mangling for index 1, and index 0 is supposed to be mangled as just `_`, so true indexes are actually offset by 2. So the first thing to do here is to clarify what's going on throughout the mangler, demangler, and ABI documentation.
Also, the demangler attempts to produce a `DependentProtocolConformance*` node with the appropriate child nodes and an optional index payload. Unfortunately, demangle nodes cannot have both children and a value payload, so whenever it creates a node with an index payload, the demangler will assert. It does this whenever the mangled index is not 0; since (per above) the mangler always produces a non-zero mangled index in this production, the demangler will always assert when processing these. So clearly this is well-tested code, since +asserts builds will always trigger the demangler when mangling a name in the first place. To fix this, we need to make the index a child of the mangling node instead of its payload; at the same time, we can make it store the semantically correct index value and just introduce a new `UnknownIndex` node to handle the `0_` case. This is easy because all current clients ignore this information.
Finally, due to an apparent copy-and-paste error, the demangler attempts to produce a `DependentProtocolConformanceRoot` node for associated protocol conformances; this is easily resolved.
This fixes the crash in SR-10926 (rdar://51710424). The obscurity of this crash --- which originally made us think it might be related to Error self-conformance --- is because it is only triggered when a function signature takes advantage of a concrete-but-dependent retroactive conformance, which (to be both concrete and dependent) must furthermore be conditional. Testing the other cases besides a root conformance requires an even more obscure testcase.
If somebody called `demangleType` or `demangleSymbol` using a demangler that was already
in the middle of demangling a string, then the state for the new demangler would clobber the old
demangler. This manifested in rdar://problem/50380275 because, as part of demangling a
string with a symbolic reference to a private type's context, we would demangle the debug string
for the referenced context using the same demangler. If there were additional operators in the
original string after the symbolic reference, these never got demangled because the demangler
was now in the state of having completed demangling the other string, so we would drop nesting,
and incorrectly report field types of, for instance, `Array<PrivateStruct>` or
`(PrivateStruct, OtherStruct)` as just being `PrivateStruct`.
This is a likely situation to be in, especially now that `Demangler` objects also serve as
arena allocators for their demangled nodes, so change the top-level demangler entry points
to use an RAII object to push and pop the existing state instead of unilaterally clobbering
the existing state.
Our mangling did not encode if an Objective-C block was escaping or
not. This is not a huge problem in practice, but for debug info we
want type reconstruction to round-trip exactly. There was a previous
workaround to paper over this specific problem.
Remove the workaround, and add a new 'XL' mangling for escaping
blocks. Since we don't actually want to break ABI compatibility,
only use the new mangling in DWARF debug info.
Fix a trio of issues involving mangling for opaque result types:
* Symbolic references to opaque type descriptors are not substitutions
* Mangle protocol extension contexts correctly
* Mangle generic arguments for opaque result types of generic functions
The (de-)serialization of generic parameter lists for opaque type
declarations is important for the last bullet, to ensure that the
mangling of generic arguments of opaque result types works across
module boundaries.
Fixes the rest of rdar://problem/50038754.
This is to support dynamic function replacement of functions with opaque
result type.
This approach requires that all state is thrown away (that could contain the
old returned type for an opaque type) between replacements.
rdar://48887938
The demangler can be initialized with a preallocated memory on the stack. Only in case of an overflow, the bump pointer allocator mallocs new memory.
Also, support that a new instance of a demangler can "borrow" the free memory from an existing demangler. This is useful because in the runtime the demangler is invoked recursively. With this feature, all the nested demanglers can share a single stack allocated space.
This is done by disallowing nodes with children to also have index or text payloads.
In some cases those payloads were not needed anyway, because the information can be derived later.
In other cases the fix was to insert an additional child node with the index/text payload.
Also, implement single or double children as "inline" children, which avoids needing a separate node vector for children.
All this reduces the needed size for node trees by over 2x.
New(er) grammar:
// same module as conforming type, or non-unique
protocol-conformance-ref ::= protocol 'HP'
// same module as protocol
protocol-conformance-ref ::= protocol 'Hp'
// retroactive
protocol-conformance-ref ::= protocol module
We don't make use of this distinction anywhere yet, but we could in
the future.
Fix to 510b64fcd5. The mangling operator "HP" has to distinguish
between "protocol" and "protocol module", not between the presence
or absence of protocol-conformance-ref. New grammar:
protocol-conformance-ref ::= protocol
protocol-conformance-ref ::= protocol module 'HP'
rdar://problem/46735592, again
Due to some unfortunate refactoring, protocol-conformance-ref is a
nonterminal in the mangling grammar that doesn't have its own
operator:
```
protocol-conformance-ref ::= protocol module?
```
Both "module" and "protocol" can be an "identifier", which introduces
a mangling collision. Address the mangling collision by using the
operator "HP".
Fixes rdar://problem/46735592.
Start emitting associated conformance requirement descriptors for
inherited protocols, so we have a symbol to reference from resilient
witness tables and mangled names in the future.
Use a general ‘type’ production for the conforming type of an associated
witness table accessor mangling, so that we can mangle base protocol
witness table accessors. These entities are always internal symbols, so the
mangling itself doesn’t affect the ABI.