Yet more preprocessor metaprogramming to eliminate per-macro-role boilerplate
in the compiler. This time, focused on mangling, demangling, and remangling
of the accessor macro roles.
Using symbolic references instead of a text based mangling avoids the
expensive type descriptor scan when objective c protocols are requested.
rdar://111536582
Macro expansions are currently written to disk using the mangled name of
the macro. Do not use operators that only differ in case-sensitivity to
avoid issues on case-insensitive filesystems.
Resolves rdar://109371653.
Extend the name mangling scheme for macro expansions to cover attached
macros, and use that scheme for the names of macro expansions buffers.
Finishes rdar://104038303, stabilizing file/buffer names for macro
expansion buffers.
Use the name mangling scheme we've devised for macro expansions to
back the implementation of the macro expansion context's
`getUniqueName` operation. This way, we guarantee that the names
provided by macro expansions don't conflict, as well as making them
demangleable so we can determine what introduced the names.
When a declaration has a structural opaque return type like:
func foo() -> Bar<some P>
then to mangle that return type `Bar<some P>`, we have to mangle the `some P`
part by referencing its defining declaration `foo()`, which in turn includes
its return type `Bar<some P>` again (this time using a special mangling for
`some P` that prevents infinite recursion). Since we mangle `Bar<some P>`
once as part of mangling the declaration, and we register substitutions for
bound generic types when they're complete, we end up registering the
substitution for `Bar<some P>` twice, once as the return type of the
declaration name, and again as the actual type. This would be fine, except
that the mangler doesn't check for key collisions, and it picks
substitution indexes based on the number of entries in its hash map, so
the duplicated substitution ends up corrupting the substitution sequence,
causing the mangler to produce an invalid mangled name.
Fixing that exposes us to another problem in the remangler: the AST
mangler keys substitutions by type identity, but the remangler
uses the value of the demangled nodes to recognize substitutions.
The mangling for `Bar<current declaration's opaque return type>` can
appear multiple times in a demangled tree, but referring to different
declarations' opaque return types, and the remangler would reconstruct
an incorrect mangled name when this happens. To avoid this, change the
way the demangler represents `OpaqueReturnType` nodes so that they
contain a backreference to the declaration they represent, so that
substitutions involving different declarations' opaque return types
don't get confused.
For performance annotations we need the generic specializer to trop non-generic metatype argumentrs
(which we don't do in general). For this we need a separate mangling.
Upgrade the old mangling from a list of argument types to a
list of requiremnets. For now, only same-type requirements
may actually be mangled since those are all that are available
to the surface language.
Reconstruction of existential types now consists of demangling (a list of)
base protocol(s), decoding the constraints, and converting the same-type
constraints back into a list of arguments.
rdar://96088707
The layout of constant static arrays differs from non-constant static arrays.
Therefore use a different mangling to get symbol mismatches if for some reason two modules don't agree on which version a static array is.
I wrote out this whole analysis of why different existential types
might have the same logical content, and then I turned around and
immediately uniqued existential shapes purely by logical content
rather than the (generalized) formal type. Oh well. At least it's
not too late to make ABI changes like this.
We now store a reference to a mangling of the generalized formal
type directly in the shape. This type alone is sufficient to unique
the shape:
- By the nature of the generalization algorithm, every type parameter
in the generalization signature should be mentioned in the
generalized formal type in a deterministic order.
- By the nature of the generalization algorithm, every other
requirement in the generalization signature should be implied
by the positions in which generalization type parameters appear
(e.g. because the formal type is C<T> & P, where C constrains
its type parameter for well-formedness).
- The requirement signature and type expression are extracted from
the existential type.
As a result, we no longer rely on computing a unique hash at
compile time.
Storing this separately from the requirement signature potentially
allows runtimes with general shape support to work with future
extensions to existential types even if they cannot demangle the
generalized formal type.
Storing the generalized formal type also allows us to easily and
reliably extract the formal type of the existential. Otherwise,
it's quite a heroic endeavor to match requirements back up with
primary associated types. Doing so would also only allows us to
extract *some* matching formal type, not necessarily the *right*
formal type. So there's some good synergy here.
The `Qr` mangling is used to refer to the opaque type within the
declaration that produces the opaque type. When there are multiple
opaque types, e.g., due to structural or named opaque result types, it
does not specify which of the opaque type parameters it refers to.
Introduce a new mangling `QR INDEX` for opaque type parameters after
the first, retaining the `Qr` mangling for the first opaque type
parameter. This way, existing (non-structural) uses of opaque result
types retain the same manglings, but uses of structural or named
opaque result types (new features) will have distinct manglings.
Note that this mangling within a declaration is only used for the
declaration itself, and not for references to the opaque type of the
declaration, so there is no impact on the runtime demangler.
This cleans up 90 instances of this warning and reduces the build spew
when building on Linux. This helps identify actual issues when
building which can get lost in the stream of warning messages. It also
helps restore the ability to build the compiler with gcc.
Distributed thunks were using the same mangling as direct method
reference thunks (i.e., for "super" calls). Although not technically
conflicting so long as actors never gain inheritance, it's confusing
and could cause problems in the future. So, introduce a distinct
mangling for distributed thunks and plumb them through the demangling
and remangler.
Because DEMANGLER_ASSERT() might cause the remanglers to return a ManglingError
with the code ManglingError::AssertionFailed, it's useful to have a line number
in the ManglingError as well as the other information. This is also potentially
helpful for other cases where the code is used multiple times in the remanglers.
rdar://79725187