Archiving expects to be able to instantiate generic classes by name. This previously worked if you had instantiated the specialization in question, because the ObjC runtime would be able to look it up. Now, if the name hasn't been created, Swift has to look it up. demangleObjCTypeName doesn't do generics, so this failed.
rdar://problem/57674583
This makes for a cleaner and less implicit-context-heavy API, and makes it easier for symbolic
reference resolvers to do context-dependent things (like map the in-memory base address back to a
remote address in MetadataReader).
When mangling a dependent protocol conformance ref, the mangler currently uses `0_` to mean an unknown index and `N_` to mean the index `N - 1`. Unfortunately, this is somewhat confused: `0_` is actually the mangling for index 1, and index 0 is supposed to be mangled as just `_`, so true indexes are actually offset by 2. So the first thing to do here is to clarify what's going on throughout the mangler, demangler, and ABI documentation.
Also, the demangler attempts to produce a `DependentProtocolConformance*` node with the appropriate child nodes and an optional index payload. Unfortunately, demangle nodes cannot have both children and a value payload, so whenever it creates a node with an index payload, the demangler will assert. It does this whenever the mangled index is not 0; since (per above) the mangler always produces a non-zero mangled index in this production, the demangler will always assert when processing these. So clearly this is well-tested code, since +asserts builds will always trigger the demangler when mangling a name in the first place. To fix this, we need to make the index a child of the mangling node instead of its payload; at the same time, we can make it store the semantically correct index value and just introduce a new `UnknownIndex` node to handle the `0_` case. This is easy because all current clients ignore this information.
Finally, due to an apparent copy-and-paste error, the demangler attempts to produce a `DependentProtocolConformanceRoot` node for associated protocol conformances; this is easily resolved.
This fixes the crash in SR-10926 (rdar://51710424). The obscurity of this crash --- which originally made us think it might be related to Error self-conformance --- is because it is only triggered when a function signature takes advantage of a concrete-but-dependent retroactive conformance, which (to be both concrete and dependent) must furthermore be conditional. Testing the other cases besides a root conformance requires an even more obscure testcase.
If somebody called `demangleType` or `demangleSymbol` using a demangler that was already
in the middle of demangling a string, then the state for the new demangler would clobber the old
demangler. This manifested in rdar://problem/50380275 because, as part of demangling a
string with a symbolic reference to a private type's context, we would demangle the debug string
for the referenced context using the same demangler. If there were additional operators in the
original string after the symbolic reference, these never got demangled because the demangler
was now in the state of having completed demangling the other string, so we would drop nesting,
and incorrectly report field types of, for instance, `Array<PrivateStruct>` or
`(PrivateStruct, OtherStruct)` as just being `PrivateStruct`.
This is a likely situation to be in, especially now that `Demangler` objects also serve as
arena allocators for their demangled nodes, so change the top-level demangler entry points
to use an RAII object to push and pop the existing state instead of unilaterally clobbering
the existing state.
The demangler can be initialized with a preallocated memory on the stack. Only in case of an overflow, the bump pointer allocator mallocs new memory.
Also, support that a new instance of a demangler can "borrow" the free memory from an existing demangler. This is useful because in the runtime the demangler is invoked recursively. With this feature, all the nested demanglers can share a single stack allocated space.
This is done by disallowing nodes with children to also have index or text payloads.
In some cases those payloads were not needed anyway, because the information can be derived later.
In other cases the fix was to insert an additional child node with the index/text payload.
Also, implement single or double children as "inline" children, which avoids needing a separate node vector for children.
All this reduces the needed size for node trees by over 2x.
New(er) grammar:
// same module as conforming type, or non-unique
protocol-conformance-ref ::= protocol 'HP'
// same module as protocol
protocol-conformance-ref ::= protocol 'Hp'
// retroactive
protocol-conformance-ref ::= protocol module
We don't make use of this distinction anywhere yet, but we could in
the future.
Due to some unfortunate refactoring, protocol-conformance-ref is a
nonterminal in the mangling grammar that doesn't have its own
operator:
```
protocol-conformance-ref ::= protocol module?
```
Both "module" and "protocol" can be an "identifier", which introduces
a mangling collision. Address the mangling collision by using the
operator "HP".
Fixes rdar://problem/46735592.
Introduce complete mangling for references to protocol conformances:
* Mangle requirements of conditional conformances when present.
* Mangle conformance access paths for generic environment-dependent
conformances.
* Abstract protocol conformance references so we can introduce
symbolic references for them.
Extending the mangling of symbolic references to also include indirect
symbolic references. This allows mangled names to refer to context
descriptors (both type and protocol) not in the current source file.
For now, only permit indirect symbolic references within the current module,
because remote mirrors (among other things) is unable to handle relocations.
Co-authored-by: Joe Groff <jgroff@apple.com>
This makes resolving mangled names to nominal types in the same module more efficient, and for eventual secrecy improvements, also allows types in the same module to be referenced from mangled typerefs without encoding any source-level name information about them.
A "retroactive" protocol conformance is a conformance that is provided
by a module that is neither the module that defines the protocol nor
the module that defines the conforming type. It is possible for such
conformances to conflict at runtime, if defined in different modules
that were not both visible to the compiler at the same time.
When mangling a bound generic type, also mangle retroactive protocol
conformances that were needed to satisfy the generic requirements of
the generic type. This prevents name collisions between (e.g.) types
formed using retroactive conformances from different modules. The
impact on the size of the mangling is expected to be relatively small,
because most conformances are not retroactive.
Fixes the ABI part of rdar://problem/14375889.
This new format more efficiently represents existing information, while
more accurately encoding important information about nested generic
contexts with same-type and layout constraints that need to be evaluated
at runtime. It's also designed with an eye to forward- and
backward-compatible expansion for ABI stability with future Swift
versions.
This makes them consistent no matter what shenanigans are pulled by
the importer, particularly NS_ENUM vs. NS_OPTIONS and NS_SWIFT_NAME.
The 'NSErrorDomain' API note /nearly/ works with this, but the
synthesized error struct is still mangled as a Swift declaration,
which means it's not rename-stable. See follow-up commits.
The main place where this still falls down is NS_STRING_ENUM: when
this is applied, a typedef is imported as a unique struct, but without
it it's just a typealias for the underlying type. There's also still a
problem with synthesized conformances, which have a module mangled
into the witness table symbol even though that symbol is linkonce_odr.
rdar://problem/31616162
Make function mangling backward compatible based on the prefix
of the mangled symbol, which is used to distinguish between names
with old/new parameter label mangling schemes.
Resolves: rdar://problem/36357120
Demangle such suffixes as "unmangled suffix"
IRGen still uses '.<n>' to disambiguate partial apply thunks and outlined copy functions.
rdar://problem/32934962
Either the demangling completely succeeds or it fails. Don't demangle to something like: [...] with unmangled suffix "..."
This avoids getting really stupid demangled names for symbols which are actually not swift symbols.
- Allow them to use substitutions.
- Consistently use 'a' as a mangling operator.
- For generic typealiases, include the alias as context for any generic
parameters.
Typealiases don't show up in symbol names, which always refer to
canonical types, but they are mangled for debug info and for USRs
(unique identifiers used by SourceKit), so it's good to get this
right.
A protocol composition with an explicit 'AnyObject' member is
now mangled as <protocol list> 'Xl'. For subclass existentials,
I changed the mangling from <protocol list> <class> 'XE' to
<protocol list> <class> 'Xl'.
Not used for anything just yet.
Previously it was part of swiftBasic.
The demangler library does not depend on llvm (except some header-only utilities like StringRef). Putting it into its own library makes sure that no llvm stuff will be linked into clients which use the demangler library.
This change also contains other refactoring, like moving demangler code into different files. This makes it easier to remove the old demangler from the runtime library when we switch to the new symbol mangling.
Also in this commit: remove some unused API functions from the demangler Context.
fixes rdar://problem/30503344