mirror of
https://github.com/apple/swift.git
synced 2025-12-14 20:36:38 +01:00
1244 lines
56 KiB
ReStructuredText
:orphan:
|
|
|
|
.. @raise litre.TestsAreMissing
|
|
.. _ABI:
|
|
|
|
The Swift ABI
|
|
=============
|
|
|
|
.. contents::
|
|
|
|
Hard Constraints on Resilience
|
|
------------------------------
|
|
|
|
The root of a class hierarchy must remain stable, on pain of
invalidating the metaclass hierarchy. Note that a Swift class without an
explicit base class is implicitly rooted in the SwiftObject
Objective-C class.
|
|
|
|
Type Layout
|
|
-----------
|
|
|
|
Fragile Struct and Tuple Layout
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Structs and tuples currently share the same layout algorithm, noted as the
|
|
"Universal" layout algorithm in the compiler implementation. The algorithm
|
|
is as follows:
|
|
|
|
- Start with a **size** of **0** and an **alignment** of **1**.
- Iterate through the fields, in element order for tuples, or in ``var``
  declaration order for structs. For each field:

  * Update **size** by rounding up to the **alignment of the field**, that is,
    increasing it to the least value greater than or equal to **size** and
    evenly divisible by the **alignment of the field**.
  * Assign the **offset of the field** to the current value of **size**.
  * Update **size** by adding the **size of the field**.
  * Update **alignment** to the max of **alignment** and the
    **alignment of the field**.

- The final **size** and **alignment** are the size and alignment of the
  aggregate. The **stride** of the type is the final **size** rounded up to
  **alignment**.
|
|
|
|
Note that this differs from C or LLVM's normal layout rules in that *size*
|
|
and *stride* are distinct; whereas C layout requires that an embedded struct's
|
|
size be padded out to its alignment and that nothing be laid out there,
|
|
Swift layout allows an outer struct to lay out fields in the inner struct's
|
|
tail padding, alignment permitting. Unlike C, zero-sized structs and tuples
|
|
are also allowed, and take up no storage in enclosing aggregates. The Swift
|
|
compiler emits LLVM packed struct types with manual padding to get the
|
|
necessary control over the binary layout. Some examples:
|
|
|
|
::

  // LLVM <{ i64, i8 }>
  struct S {
    var x: Int
    var y: UInt8
  }

  // LLVM <{ i8, [7 x i8], <{ i64, i8 }>, i8 }>
  struct S2 {
    var x: UInt8
    var s: S
    var y: UInt8
  }

  // LLVM <{}>
  struct Empty {}

  // LLVM <{ i64, i64 }>
  struct ContainsEmpty {
    var x: Int
    var y: Empty
    var z: Int
  }
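The layout algorithm described above can be sketched in C as follows. This is an illustrative model only, not the compiler's implementation: ``FieldLayout``, ``AggregateLayout``, ``round_up``, and ``layout_aggregate`` are hypothetical names, and the usage note assumes a 64-bit target where ``Int`` has size and alignment 8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one field: size and alignment in bytes. */
typedef struct {
    size_t size;
    size_t alignment; /* must be a power of two */
} FieldLayout;

/* Hypothetical result of laying out a struct or tuple. */
typedef struct {
    size_t size;
    size_t alignment;
    size_t stride;
} AggregateLayout;

/* Round `value` up to the next multiple of `alignment` (a power of two). */
static size_t round_up(size_t value, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Lay out `count` fields in order; the offset of field i is written to
   offsets[i]. Starts from size 0 and alignment 1, as in the text. */
static AggregateLayout layout_aggregate(const FieldLayout *fields, size_t count,
                                        size_t *offsets) {
    AggregateLayout result = { 0, 1, 0 };
    for (size_t i = 0; i < count; ++i) {
        result.size = round_up(result.size, fields[i].alignment);
        offsets[i] = result.size;
        result.size += fields[i].size;
        if (fields[i].alignment > result.alignment)
            result.alignment = fields[i].alignment;
    }
    result.stride = round_up(result.size, result.alignment);
    return result;
}
```

Under those assumptions, feeding in ``S`` as fields ``{8, 8}`` and ``{1, 1}`` gives size 9, alignment 8, and stride 16. Feeding in ``S2`` as ``{1, 1}``, ``{9, 8}``, ``{1, 1}`` places ``y`` at offset 17, inside what C would treat as the tail padding of ``s``.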
|
|
|
|
Class Layout
|
|
~~~~~~~~~~~~
|
|
|
|
Swift relies on the following assumptions about the Objective-C runtime,
|
|
which are therefore now part of the Objective-C ABI:
|
|
|
|
- 32-bit platforms never have tagged pointers. ObjC pointer types are
|
|
either nil or an object pointer.
|
|
|
|
- On x86-64, a tagged pointer either sets the lowest bit of the pointer
|
|
or the highest bit of the pointer. Therefore, both of these bits are
|
|
zero if and only if the value is not a tagged pointer.
|
|
|
|
- On ARM64, a tagged pointer always sets the highest bit of the pointer.
|
|
|
|
- 32-bit platforms never perform any isa masking. ``object_getClass``
|
|
is always equivalent to ``*(Class*)object``.
|
|
|
|
- 64-bit platforms perform isa masking only if the runtime exports a
|
|
symbol ``uintptr_t objc_debug_isa_class_mask;``. If this symbol
|
|
is exported, ``object_getClass`` on a non-tagged pointer is always
|
|
equivalent to ``(Class)(objc_debug_isa_class_mask & *(uintptr_t*)object)``.
|
|
|
|
- The superclass field of a class object is always stored immediately
|
|
after the isa field. Its value is either nil or a pointer to the
|
|
class object for the superclass; it never has other bits set.
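The pointer-tagging and isa-masking assumptions above can be expressed as a few bit tests. The sketch below works on raw 64-bit values so it can be compiled anywhere; the function names are hypothetical, and ``isa_class_mask`` merely stands in for whatever value the runtime exports as ``objc_debug_isa_class_mask``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86-64: a value is a tagged pointer iff its lowest or highest bit is set. */
static bool is_tagged_pointer_x86_64(uint64_t pointer_bits) {
    return (pointer_bits & 1u) != 0 || (pointer_bits >> 63) != 0;
}

/* ARM64: every tagged pointer has its highest bit set, so a clear
   highest bit rules out a tagged pointer. */
static bool may_be_tagged_pointer_arm64(uint64_t pointer_bits) {
    return (pointer_bits >> 63) != 0;
}

/* On a runtime that masks isa, the class pointer is the raw isa word
   ANDed with the exported mask. The mask value is supplied by the caller. */
static uint64_t class_bits_from_isa(uint64_t raw_isa, uint64_t isa_class_mask) {
    assert(isa_class_mask != 0);
    return raw_isa & isa_class_mask;
}
```

On 32-bit platforms none of these checks are needed, since the text states that such platforms have neither tagged pointers nor isa masking.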
|
|
|
|
The following assumptions are part of the Swift ABI:
|
|
|
|
- Swift class pointers are never tagged pointers.
|
|
|
|
TODO
|
|
|
|
Fragile Enum Layout
|
|
~~~~~~~~~~~~~~~~~~~
|
|
|
|
In laying out enum types, the ABI attempts to avoid requiring additional
|
|
storage to store the tag for the enum case. The ABI chooses one of five
|
|
strategies based on the layout of the enum:
|
|
|
|
Empty Enums
|
|
```````````
|
|
|
|
In the degenerate case of an enum with no cases, the enum is an empty type.
|
|
|
|
::

  enum Empty {} // => empty type
|
|
|
|
Single-Case Enums
|
|
`````````````````
|
|
|
|
In the degenerate case of an enum with a single case, there is no
|
|
discriminator needed, and the enum type has the exact same layout as its
|
|
case's data type, or is empty if the case has no data type.
|
|
|
|
::

  enum EmptyCase { case X }             // => empty type
  enum DataCase { case Y(Int, Double) } // => LLVM <{ i64, double }>
|
|
|
|
C-Like Enums
|
|
````````````
|
|
|
|
If none of the cases has a data type (a "C-like" enum), then the enum
|
|
is laid out as an integer tag with the minimal number of bits to contain
|
|
all of the cases. The machine-level layout of the type then follows LLVM's
|
|
data layout rules for integer types on the target platform. The cases are
|
|
assigned tag values in declaration order.
|
|
|
|
::

  enum EnumLike2 { // => LLVM i1
    case A         // => i1 0
    case B         // => i1 1
  }

  enum EnumLike8 { // => LLVM i3
    case A         // => i3 0
    case B         // => i3 1
    case C         // => i3 2
    case D         // etc.
    case E
    case F
    case G
    case H
  }
|
|
|
|
Discriminator values after the one used for the last case become *extra
|
|
inhabitants* of the enum type (see `Single-Payload Enums`_).
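As a rough illustration of the C-like strategy, the tag width is the smallest number of bits that can number all cases from zero. The helpers below are hypothetical names, and the leftover count only considers values within that minimal tag width.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal number of tag bits needed to number `num_cases` cases from 0. */
static unsigned tag_bits_for_cases(uint32_t num_cases) {
    assert(num_cases >= 1);
    unsigned bits = 0;
    while (((uint64_t)1 << bits) < num_cases)
        ++bits;
    return bits;
}

/* Number of discriminator values left over after the last case. */
static uint64_t unused_tag_values(uint32_t num_cases) {
    return ((uint64_t)1 << tag_bits_for_cases(num_cases)) - num_cases;
}
```

For ``EnumLike2`` this yields 1 bit and for ``EnumLike8`` it yields 3 bits, matching the ``i1`` and ``i3`` annotations in the examples.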
|
|
|
|
Single-Payload Enums
|
|
````````````````````
|
|
|
|
If an enum has a single case with a data type and one or more no-data cases
|
|
(a "single-payload" enum), then the case with data type is represented using
|
|
the data type's binary representation, with added zero bits for tag if
|
|
necessary. If the data type's binary representation
|
|
has **extra inhabitants**, that is, bit patterns with the size and alignment of
|
|
the type but which do not form valid values of that type, they are used to
|
|
represent the no-data cases, with extra inhabitants in order of ascending
|
|
numeric value matching no-data cases in declaration order. If the type
|
|
has *spare bits* (see `Multi-Payload Enums`_), they are used to form extra
|
|
inhabitants. The enum value is then represented as an integer with the storage
|
|
size in bits of the data type. Extra inhabitants of the payload type not used
|
|
by the enum type become extra inhabitants of the enum type itself.
|
|
|
|
::

  enum CharOrSectionMarker { => LLVM i32
    case Paragraph            => i32 0x0020_0000
    case Char(UnicodeScalar)  => i32 (zext i21 %Char to i32)
    case Chapter              => i32 0x0020_0001
  }

  CharOrSectionMarker.Char('\x00') => i32 0x0000_0000
  CharOrSectionMarker.Char('\u10FFFF') => i32 0x0010_FFFF

  enum CharOrSectionMarkerOrFootnoteMarker { => LLVM i32
    case CharOrSectionMarker(CharOrSectionMarker) => i32 %CharOrSectionMarker
    case Asterisk                                 => i32 0x0020_0002
    case Dagger                                   => i32 0x0020_0003
    case DoubleDagger                             => i32 0x0020_0004
  }
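The ``CharOrSectionMarker`` example can be modeled with plain integer arithmetic. This sketch assumes, as the example does, that ``UnicodeScalar`` is treated as an ``i21`` and that extra inhabitants are handed out in ascending order starting at the first value that does not fit in 21 bits. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* UnicodeScalar is modeled as an i21, so 21 bits hold every payload value. */
#define PAYLOAD_BITS 21u
/* Smallest bit pattern that is not a valid i21: the first extra inhabitant. */
#define FIRST_EXTRA_INHABITANT (UINT32_C(1) << PAYLOAD_BITS)

/* The payload case is the zero-extended payload itself. */
static uint32_t encode_char(uint32_t scalar) {
    assert(scalar < FIRST_EXTRA_INHABITANT);
    return scalar;
}

/* No-data cases take extra inhabitants in declaration order
   (index 0 for Paragraph, 1 for Chapter). */
static uint32_t encode_no_data_case(uint32_t index) {
    return FIRST_EXTRA_INHABITANT + index;
}

/* A value holds a payload iff it lies below the first extra inhabitant. */
static int has_payload(uint32_t bits) {
    return bits < FIRST_EXTRA_INHABITANT;
}
```

The nested enum in the example continues the same numbering: ``Asterisk`` takes the next unused extra inhabitant, which this model computes as ``encode_no_data_case(2)``.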
|
|
|
|
If the data type has no extra inhabitants, or there are not enough extra
|
|
inhabitants to represent all of the no-data cases, then a tag bit is added
|
|
to the enum's representation. The tag bit is set for the no-data cases, which
|
|
are then assigned values in the data area of the enum in declaration order.
|
|
|
|
::

  enum IntOrInfinity { => LLVM <{ i64, i1 }>
    case NegInfinity    => <{ i64, i1 }> {    0, 1 }
    case Int(Int)       => <{ i64, i1 }> { %Int, 0 }
    case PosInfinity    => <{ i64, i1 }> {    1, 1 }
  }

  IntOrInfinity.Int(    0) => <{ i64, i1 }> {     0, 0 }
  IntOrInfinity.Int(20721) => <{ i64, i1 }> { 20721, 0 }
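When no extra inhabitants are available, the added tag bit can be pictured as a second field. The struct below is only a model of ``<{ i64, i1 }>``: the ``i1`` is widened to a byte, C padding rules differ from the packed LLVM type, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models LLVM <{ i64, i1 }>; the i1 tag is widened to a byte here. */
typedef struct {
    int64_t data;
    uint8_t tag;
} IntOrInfinityBits;

/* The payload case stores the value with the tag bit clear. */
static IntOrInfinityBits encode_int(int64_t value) {
    IntOrInfinityBits bits = { value, 0 };
    return bits;
}

/* No-data cases set the tag bit and store their declaration-order
   index (0 for NegInfinity, 1 for PosInfinity) in the data area. */
static IntOrInfinityBits encode_no_data_case(int64_t index) {
    assert(index >= 0);
    IntOrInfinityBits bits = { index, 1 };
    return bits;
}
```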
|
|
|
|
Multi-Payload Enums
|
|
```````````````````
|
|
|
|
If an enum has more than one case with data type, then a tag is necessary to
|
|
discriminate the data types. The ABI will first try to find common
|
|
**spare bits**, that is, bits in the data types' binary representations which are
|
|
either fixed-zero or ignored by valid values of all of the data types. The tag
|
|
will be scattered into these spare bits as much as possible. Currently only
|
|
spare bits of primitive integer types, such as the high bits of an ``i21``
|
|
type, are considered. The enum data is represented as an integer with the
|
|
storage size in bits of the largest data type.
|
|
|
|
::

  enum TerminalChar {             => LLVM i32
    case Plain(UnicodeScalar)     => i32     (zext i21 %Plain     to i32)
    case Bold(UnicodeScalar)      => i32 (or (zext i21 %Bold      to i32), 0x0020_0000)
    case Underline(UnicodeScalar) => i32 (or (zext i21 %Underline to i32), 0x0040_0000)
    case Blink(UnicodeScalar)     => i32 (or (zext i21 %Blink     to i32), 0x0060_0000)
    case Empty                    => i32 0x0080_0000
    case Cursor                   => i32 0x0080_0001
  }
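The ``TerminalChar`` encoding can be reproduced by placing the tag directly above the 21 payload bits. This is a simplified model that assumes the spare bits are contiguous, as they are in this example; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAYLOAD_BITS 21u
#define PAYLOAD_MASK ((UINT32_C(1) << PAYLOAD_BITS) - 1u)

/* Payload case number `tag` (declaration order among payload cases):
   the tag is stored in the spare bits above the payload. */
static uint32_t encode_payload_case(uint32_t tag, uint32_t scalar) {
    assert(scalar <= PAYLOAD_MASK);
    return (tag << PAYLOAD_BITS) | scalar;
}

/* No-data cases share the tag after the last payload case and store
   their declaration-order index in the data area. */
static uint32_t encode_no_data_case(uint32_t num_payload_cases, uint32_t index) {
    assert(index <= PAYLOAD_MASK);
    return (num_payload_cases << PAYLOAD_BITS) | index;
}

static uint32_t decode_tag(uint32_t bits) { return bits >> PAYLOAD_BITS; }
static uint32_t decode_payload(uint32_t bits) { return bits & PAYLOAD_MASK; }
```

With four payload cases, ``Empty`` and ``Cursor`` are ``encode_no_data_case(4, 0)`` and ``encode_no_data_case(4, 1)``, which reproduces the ``0x0080_0000`` and ``0x0080_0001`` values in the example.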
|
|
|
|
If there are not enough spare bits to contain the tag, then additional bits are
|
|
added to the representation to contain the tag. Tag values are
|
|
assigned to data cases in declaration order. If there are no-data cases, they
|
|
are collected under a common tag, and assigned values in the data area of the
|
|
enum in declaration order.
|
|
|
|
::

  class Bignum {}

  enum IntDoubleOrBignum { => LLVM <{ i64, i2 }>
    case Int(Int)          => <{ i64, i2 }> {           %Int,            0 }
    case Double(Double)    => <{ i64, i2 }> { (bitcast  %Double to i64), 1 }
    case Bignum(Bignum)    => <{ i64, i2 }> { (ptrtoint %Bignum to i64), 2 }
  }
|
|
|
|
Existential Container Layout
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Values of protocol type, protocol composition type, or "any" type
|
|
(``protocol<>``) are laid out using **existential containers** (so-called
|
|
because these types are "existential types" in type theory).
|
|
|
|
Opaque Existential Containers
|
|
`````````````````````````````
|
|
|
|
If there is no class constraint on a protocol or protocol composition type,
|
|
the existential container has to accommodate a value of arbitrary size and
|
|
alignment. It does this using a **fixed-size buffer**, which is three pointers
|
|
in size and pointer-aligned. This either directly contains the value, if its
|
|
size and alignment are both less than or equal to the fixed-size buffer's, or
|
|
contains a pointer to a side allocation owned by the existential container.
|
|
The type of the contained value is identified by its `type metadata`_ record,
|
|
and witness tables for all of the required protocol conformances are included.
|
|
The layout is as if declared in the following C struct::
|
|
|
|
  struct OpaqueExistentialContainer {
    void *fixedSizeBuffer[3];
    Metadata *type;
    WitnessTable *witnessTables[NUM_WITNESS_TABLES];
  };
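The inline-versus-side-allocation decision described above reduces to a size and alignment comparison against the three-pointer buffer. The following sketch uses hypothetical helper names and C11's ``_Alignof``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { FIXED_BUFFER_WORDS = 3 };

/* A value is stored directly in the fixed-size buffer only if both its
   size and its alignment fit; otherwise the buffer holds a pointer to a
   side allocation. */
static bool fits_in_fixed_buffer(size_t size, size_t alignment) {
    assert(alignment != 0);
    return size <= FIXED_BUFFER_WORDS * sizeof(void *)
        && alignment <= _Alignof(void *);
}

/* Total container size: buffer, type metadata pointer, witness tables. */
static size_t opaque_existential_size(size_t num_witness_tables) {
    return (FIXED_BUFFER_WORDS + 1 + num_witness_tables) * sizeof(void *);
}
```

For example, a value of a single-protocol type occupies five pointers under this model regardless of the size of the value it holds.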
|
|
|
|
Class Existential Containers
|
|
````````````````````````````
|
|
|
|
If one or more of the protocols in a protocol or protocol composition type
|
|
have a class constraint, then only class values can be stored in the existential
|
|
container, and a more efficient representation is used. Class instances are
|
|
always a single pointer in size, so a fixed-size buffer and potential side
|
|
allocation is not needed, and class instances always have a reference to their
|
|
own type metadata, so the separate metadata record is not needed. The
|
|
layout is thus as if declared in the following C struct::
|
|
|
|
  struct ClassExistentialContainer {
    HeapObject *value;
    WitnessTable *witnessTables[NUM_WITNESS_TABLES];
  };
|
|
|
|
Note that if no witness tables are needed, such as for the "any class" type
|
|
``protocol<class>`` or an Objective-C protocol type, then the only element of
|
|
the layout is the heap object pointer. This is ABI-compatible with ``id``
|
|
and ``id <Protocol>`` types in Objective-C.
|
|
|
|
Type Metadata
|
|
-------------
|
|
|
|
The Swift runtime keeps a **metadata record** for every type used in a program,
|
|
including every instantiation of generic types. These metadata records can
|
|
be used by (TODO: reflection and) debugger tools to discover information about
|
|
types. For non-generic nominal types, these metadata records are generated
|
|
statically by the compiler. For instances of generic types, and for intrinsic
|
|
types such as tuples, functions, protocol compositions, etc., metadata records
|
|
are lazily created by the runtime as required. Every type has a unique metadata
|
|
record; two **metadata pointer** values are equal iff the types are equivalent.
|
|
|
|
In the layout descriptions below, offsets are given relative to the
|
|
metadata pointer as an index into an array of pointers. On a 32-bit platform,
|
|
**offset 1** means an offset of 4 bytes, and on 64-bit platforms, it means
|
|
an offset of 8 bytes.
|
|
|
|
Common Metadata Layout
|
|
~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
All metadata records share a common header, with the following fields:
|
|
|
|
- The **value witness table** pointer references a vtable of functions
|
|
that implement the value semantics of the type, providing fundamental
|
|
operations such as allocating, copying, and destroying values of the type.
|
|
The value witness table also records the size, alignment, stride, and other
|
|
fundamental properties of the type. The value witness table pointer is at
|
|
**offset -1** from the metadata pointer, that is, the pointer-sized word
|
|
**immediately before** the pointer's referenced address.
|
|
|
|
- The **kind** field is a pointer-sized integer that describes the kind of type
|
|
the metadata describes. This field is at **offset 0** from the metadata
|
|
pointer.
|
|
|
|
The current kind values are as follows:
|
|
|
|
* `Struct metadata`_ has a kind of **1**.
|
|
* `Enum metadata`_ has a kind of **2**.
|
|
* **Opaque metadata** has a kind of **8**. This is used for compiler
|
|
``Builtin`` primitives that have no additional runtime information.
|
|
* `Tuple metadata`_ has a kind of **9**.
|
|
* `Function metadata`_ has a kind of **10**.
|
|
* `Protocol metadata`_ has a kind of **12**. This is used for
|
|
protocol types, for protocol compositions, and for the "any" type
|
|
``protocol<>``.
|
|
* `Metatype metadata`_ has a kind of **13**.
|
|
* `Class metadata`_, instead of a kind, has an *isa pointer* in its kind slot,
|
|
pointing to the class's metaclass record. This isa pointer is guaranteed
|
|
to have an integer value larger than **4096** and so can be discriminated
|
|
from non-class kind values.
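To make the header layout concrete, the sketch below treats a metadata record as an array of pointer-sized words and a metadata pointer as the address of the word at offset 0. The enumerator and function names are hypothetical; the kind values are the ones listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    KIND_STRUCT = 1,
    KIND_ENUM = 2,
    KIND_OPAQUE = 8,
    KIND_TUPLE = 9,
    KIND_FUNCTION = 10,
    KIND_PROTOCOL = 12,
    KIND_METATYPE = 13,
    MAX_NON_CLASS_KIND = 4096 /* isa pointers are guaranteed to be larger */
};

/* The value witness table pointer is the word immediately before the
   address point, i.e. offset -1. */
static uintptr_t value_witness_table_word(const uintptr_t *metadata) {
    assert(metadata != 0);
    return metadata[-1];
}

/* Class metadata stores an isa pointer, not a small integer, at offset 0. */
static bool is_class_metadata(const uintptr_t *metadata) {
    return metadata[0] > MAX_NON_CLASS_KIND;
}
```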
|
|
|
|
Struct Metadata
|
|
~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, struct metadata records
|
|
contain the following fields:
|
|
|
|
- The `nominal type descriptor`_ is referenced at **offset 1**.
|
|
|
|
- A reference to the **parent** metadata record is stored at **offset 2**. For
|
|
structs that are members of an enclosing nominal type, this is a reference
|
|
to the enclosing type's metadata. For top-level structs, this is null.
|
|
|
|
TODO: The parent pointer is currently always null.
|
|
|
|
- A vector of **field offsets** begins at **offset 3**. For each field of the
|
|
struct, in ``var`` declaration order, the field's offset in bytes from the
|
|
beginning of the struct is stored as a pointer-sized integer.
|
|
|
|
- If the struct is generic, then the
|
|
`generic parameter vector`_ begins at **offset 3+n**, where **n** is the
|
|
number of fields in the struct.
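Reading a struct metadata record then comes down to indexing from the address point. The accessors below are a hypothetical sketch that again models the record as an array of pointer-sized words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field offsets start at offset 3, one pointer-sized integer per field. */
static uintptr_t struct_field_offset(const uintptr_t *metadata,
                                     size_t field_index) {
    assert(metadata != NULL);
    return metadata[3 + field_index];
}

/* For a generic struct with `num_fields` fields, the generic parameter
   vector starts right after the field offsets, at offset 3+n. */
static size_t generic_parameter_vector_offset(size_t num_fields) {
    return 3 + num_fields;
}
```

For a struct with fields at byte offsets 0 and 8, the words at offsets 3 and 4 hold 0 and 8, and a generic parameter vector, if present, begins at offset 5.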
|
|
|
|
Enum Metadata
|
|
~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, enum metadata records
|
|
contain the following fields:
|
|
|
|
- The `nominal type descriptor`_ is referenced at **offset 1**.
|
|
|
|
- A reference to the **parent** metadata record is stored at **offset 2**. For
|
|
enums that are members of an enclosing nominal type, this is a reference to
|
|
the enclosing type's metadata. For top-level enums, this is null.
|
|
|
|
TODO: The parent pointer is currently always null.
|
|
|
|
- If the enum is generic, then the
|
|
`generic parameter vector`_ begins at **offset 3**.
|
|
|
|
Tuple Metadata
|
|
~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, tuple metadata records
|
|
contain the following fields:
|
|
|
|
- The **number of elements** in the tuple is a pointer-sized integer at
|
|
**offset 1**.
|
|
- The **labels string** is a pointer to a list of consecutive null-terminated
|
|
label names for the tuple at **offset 2**. Each label name is given as a
|
|
null-terminated, UTF-8-encoded string in sequence. If the tuple has no
|
|
labels, this is a null pointer.
|
|
|
|
TODO: The labels string pointer is currently always null, and labels are
|
|
not factored into tuple metadata uniquing.
|
|
|
|
- The **element vector** begins at **offset 3** and consists of a vector of
|
|
type-offset pairs. The metadata for the *n*\ th element's type is a pointer
|
|
at **offset 3+2*n**. The offset in bytes from the beginning of the tuple to
|
|
the beginning of the *n*\ th element is at **offset 3+2*n+1**.
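The element vector's type-offset pairs can be indexed as shown in this sketch. The function names are hypothetical, and the record is modeled as an array of pointer-sized words with the metadata pointer at offset 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of elements, stored at offset 1. */
static uintptr_t tuple_element_count(const uintptr_t *metadata) {
    return metadata[1];
}

/* Metadata pointer (as a raw word) for the n-th element: offset 3+2*n. */
static uintptr_t tuple_element_type_word(const uintptr_t *metadata, size_t n) {
    assert(n < tuple_element_count(metadata));
    return metadata[3 + 2 * n];
}

/* Byte offset of the n-th element within the tuple: offset 3+2*n+1. */
static uintptr_t tuple_element_offset(const uintptr_t *metadata, size_t n) {
    assert(n < tuple_element_count(metadata));
    return metadata[3 + 2 * n + 1];
}
```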
|
|
|
|
Function Metadata
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, function metadata records
|
|
contain the following fields:
|
|
|
|
- The number of arguments to the function is stored at **offset 1**.
|
|
- A reference to the **result type** metadata record is stored at
|
|
**offset 2**. If the function has multiple returns, this references a
|
|
`tuple metadata`_ record.
|
|
- The **argument vector** begins at **offset 3** and consists of pointers to
|
|
metadata records of the function's arguments.
|
|
|
|
If the function takes any **inout** arguments, a pointer to each argument's
|
|
metadata record will be appended separately, the lowest bit being set if it is
|
|
**inout**. Because of pointer alignment, the lowest bit will always be free to
|
|
hold this tag.
|
|
|
|
If the function takes no **inout** arguments, there will be only one pointer in
|
|
the vector for the following cases:
|
|
|
|
* 0 arguments: a `tuple metadata`_ record for the empty tuple
|
|
* 1 argument: the first and only argument's metadata record
|
|
* >1 argument: a `tuple metadata`_ record containing the arguments
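The **inout** tag relies on metadata pointers being at least 2-byte aligned, so the lowest bit is free. A minimal sketch of packing and unpacking such an entry, with hypothetical function names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an argument vector entry; the lowest bit marks an inout argument. */
static uintptr_t make_argument_entry(const void *metadata, bool is_inout) {
    uintptr_t bits = (uintptr_t)metadata;
    assert((bits & 1u) == 0); /* alignment keeps the low bit free */
    return is_inout ? (bits | 1u) : bits;
}

static bool argument_is_inout(uintptr_t entry) {
    return (entry & 1u) != 0;
}

/* Strip the tag to recover the metadata pointer. */
static const void *argument_metadata(uintptr_t entry) {
    return (const void *)(entry & ~(uintptr_t)1);
}
```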
|
|
|
|
Protocol Metadata
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, protocol metadata records
|
|
contain the following fields:
|
|
|
|
- A **layout flags** word is stored at **offset 1**. The bits of this word
|
|
describe the `existential container layout`_ used to represent
|
|
values of the type. The word is laid out as follows:
|
|
|
|
* The **number of witness tables** is stored in the least significant 31 bits.
|
|
Values of the protocol type contain this number of witness table pointers
|
|
in their layout.
|
|
* The **class constraint** is stored at bit 31. This bit is set if the type
|
|
is **not** class-constrained, meaning that struct, enum, or class values
|
|
can be stored in the type. If not set, then only class values can be stored
|
|
in the type, and the type uses a more efficient layout.
|
|
|
|
Note that the field is pointer-sized, even though only the lowest 32 bits are
|
|
currently inhabited on all platforms. These values can be derived from the
|
|
`protocol descriptor`_ records, but are pre-calculated for convenience.
|
|
|
|
- The **number of protocols** that make up the protocol composition is stored at
|
|
**offset 2**. For the "any" types ``protocol<>`` or ``protocol<class>``, this
|
|
is zero. For a single-protocol type ``P``, this is one. For a protocol
|
|
composition type ``protocol<P, Q, ...>``, this is the number of protocols.
|
|
|
|
- The **protocol descriptor vector** begins at **offset 3**. This is an inline
|
|
array of pointers to the `protocol descriptor`_ for every protocol in the
|
|
composition, or the single protocol descriptor for a protocol type. For
|
|
an "any" type, there is no protocol descriptor vector.
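The layout flags word can be decoded with two masks. The sketch below uses hypothetical names and also derives the size of the corresponding existential container in pointer-sized words, based on the two container structs shown earlier (three buffer words plus a metadata pointer for opaque containers, one object pointer for class containers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WITNESS_TABLE_COUNT_MASK UINT32_C(0x7FFFFFFF)
#define NOT_CLASS_CONSTRAINED_BIT (UINT32_C(1) << 31)

/* Least significant 31 bits: number of witness tables. */
static uint32_t witness_table_count(uintptr_t layout_flags) {
    return (uint32_t)(layout_flags & WITNESS_TABLE_COUNT_MASK);
}

/* Bit 31 is set when the type is NOT class-constrained. */
static bool is_class_constrained(uintptr_t layout_flags) {
    return (layout_flags & NOT_CLASS_CONSTRAINED_BIT) == 0;
}

/* Container size in pointer-sized words. */
static size_t existential_container_words(uintptr_t layout_flags) {
    size_t base = is_class_constrained(layout_flags) ? 1 : 4;
    return base + witness_table_count(layout_flags);
}
```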
|
|
|
|
Metatype Metadata
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, metatype metadata records
|
|
contain the following fields:
|
|
|
|
- A reference to the metadata record for the **instance type** that the metatype
|
|
represents is stored at **offset 1**.
|
|
|
|
Class Metadata
|
|
~~~~~~~~~~~~~~
|
|
|
|
Class metadata is designed to interoperate with Objective-C; all class metadata
|
|
records are also valid Objective-C ``Class`` objects. Class metadata pointers
|
|
are used as the values of class metatypes, so a derived class's metadata
|
|
record also serves as a valid class metatype value for all of its ancestor
|
|
classes.
|
|
|
|
- The **destructor pointer** is stored at **offset -2** from the metadata
|
|
pointer, behind the value witness table. This function is invoked by Swift's
|
|
deallocator when the class instance is destroyed.
|
|
- The **isa pointer** pointing to the class's Objective-C-compatible metaclass
|
|
record is stored at **offset 0**, in place of an integer kind discriminator.
|
|
- The **super pointer** pointing to the metadata record for the superclass is
|
|
stored at **offset 1**. If the class is a root class, it is null.
|
|
- Two words are reserved for use by the Objective-C runtime at **offset 2**
|
|
and **offset 3**.
|
|
- The **rodata pointer** is stored at **offset 4**; it points to an Objective-C
|
|
compatible rodata record for the class. This pointer value includes a tag.
|
|
The **low bit is always set to 1** for Swift classes and always set to 0 for
|
|
Objective-C classes.
|
|
- The **class flags** are a 32-bit field at **offset 5**.
|
|
- The **instance address point** is a 32-bit field following the class flags.
|
|
A pointer to an instance of this class points this number of bytes after the
|
|
beginning of the instance.
|
|
- The **instance size** is a 32-bit field following the instance address point.
|
|
This is the number of bytes of storage present in every object of this type.
|
|
- The **instance alignment mask** is a 16-bit field following the instance size.
|
|
This is a set of low bits which must not be set in a pointer to an instance
|
|
of this class.
|
|
- The **runtime-reserved field** is a 16-bit field following the instance
|
|
alignment mask. The compiler initializes this to zero.
|
|
- The **class object size** is a 32-bit field following the runtime-reserved
|
|
field. This is the total number of bytes of storage in the class metadata
|
|
object.
|
|
- The **class object address point** is a 32-bit field following the class
  object size. This is the offset in bytes from the beginning of the class
  metadata object to its address point.
|
|
- The `nominal type descriptor`_ for the most-derived class type is referenced
|
|
at an offset immediately following the class object address point. This is
|
|
**offset 8** on a 64-bit platform or **offset 11** on a 32-bit platform.
|
|
- For each Swift class in the class's inheritance hierarchy, in order starting
|
|
from the root class and working down to the most derived class, the following
|
|
fields are present:
|
|
|
|
* First, a reference to the **parent** metadata record is stored.
|
|
For classes that are members of an enclosing nominal type, this is a
|
|
reference to the enclosing type's metadata. For top-level classes, this is
|
|
null.
|
|
|
|
TODO: The parent pointer is currently always null.
|
|
|
|
* If the class is generic, its `generic parameter vector`_ is stored inline.
|
|
* The **vtable** is stored inline and contains a function pointer to the
|
|
implementation of every method of the class in declaration order.
|
|
* If the layout of a class instance is dependent on its generic parameters,
|
|
then a **field offset vector** is stored inline, containing offsets in
|
|
bytes from an instance pointer to each field of the class in declaration
|
|
order. (For classes with fixed layout, the field offsets are accessible
|
|
statically from global variables, similar to Objective-C ivar offsets.)
|
|
|
|
Note that none of these fields are present for Objective-C base classes in
|
|
the inheritance hierarchy.
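The fixed part of the class metadata layout, from the address point onward, can be written as a C struct to check the stated offsets. This is a sketch with hypothetical type and field names; the destructor pointer (offset -2) and value witness table pointer (offset -1) precede the struct and are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    void *isa;                          /* offset 0 */
    void *super;                        /* offset 1 */
    void *objc_reserved[2];             /* offsets 2 and 3 */
    uintptr_t rodata;                   /* offset 4, low bit tags Swift classes */
    uint32_t class_flags;               /* offset 5 */
    uint32_t instance_address_point;
    uint32_t instance_size;
    uint16_t instance_alignment_mask;
    uint16_t runtime_reserved;
    uint32_t class_object_size;
    uint32_t class_object_address_point;
    const void *nominal_type_descriptor; /* offset 8 (64-bit) or 11 (32-bit) */
} ClassMetadataHeader;

/* The low bit of the rodata pointer is 1 for Swift classes. */
static bool is_swift_class(const ClassMetadataHeader *metadata) {
    assert(metadata != NULL);
    return (metadata->rodata & 1u) != 0;
}
```

The 32-bit and 16-bit fields add up to 24 bytes, which is three words on a 64-bit platform and six on a 32-bit platform. That is why the descriptor lands at offset 8 or offset 11 respectively.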
|
|
|
|
Generic Parameter Vector
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Metadata records for instances of generic types contain information about their
|
|
generic parameters. For each parameter of the type, a reference to the metadata
|
|
record for the type argument is stored. After all of the type argument
|
|
metadata references, for each type parameter, if there are protocol
|
|
requirements on that type parameter, a reference to the witness table for each
|
|
protocol it is required to conform to is stored in declaration order.
|
|
|
|
For example, given a generic type with the parameters ``<T, U, V>``, its
|
|
generic parameter record will consist of references to the metadata records
|
|
for ``T``, ``U``, and ``V`` in succession, as if laid out in a C struct::
|
|
|
|
  struct GenericParameterVector {
    TypeMetadata *T, *U, *V;
  };
|
|
|
|
If we add protocol requirements to the parameters, for example,
|
|
``<T: Runcible, U: protocol<Fungible, Ansible>, V>``, then the type's generic
|
|
parameter vector contains witness tables for those protocols, as if laid out::
|
|
|
|
  struct GenericParameterVector {
    TypeMetadata *T, *U, *V;
    RuncibleWitnessTable *T_Runcible;
    FungibleWitnessTable *U_Fungible;
    AnsibleWitnessTable *U_Ansible;
  };
|
|
|
|
Nominal Type Descriptor
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
The metadata records for class, struct, and enum types contain a pointer to a
|
|
**nominal type descriptor**, which contains basic information about the nominal
|
|
type such as its name, members, and metadata layout. For a generic type, one
|
|
nominal type descriptor is shared for all instantiations of the type. The
|
|
layout is as follows:
|
|
|
|
- The **kind** of type is stored at **offset 0**, which is as follows:
|
|
|
|
* **0** for a class,
|
|
* **1** for a struct, or
|
|
* **2** for an enum.
|
|
|
|
- The mangled **name** is referenced as a null-terminated C string at
|
|
**offset 1**. This name includes no bound generic parameters.
|
|
- The following four fields depend on the kind of nominal type.
|
|
|
|
* For a struct or class:
|
|
|
|
+ The **number of fields** is stored at **offset 2**. This is the length
|
|
of the field offset vector in the metadata record, if any.
|
|
+ The **offset to the field offset vector** is stored at **offset 3**.
|
|
This is the offset in pointer-sized words of the field offset vector for
|
|
the type in the metadata record. If no field offset vector is stored
|
|
in the metadata record, this is zero.
|
|
+ The **field names** are referenced as a doubly-null-terminated list of
|
|
C strings at **offset 4**. The order of names corresponds to the order
|
|
of fields in the field offset vector.
|
|
+ The **field type accessor** is a function pointer at **offset 5**. If
|
|
non-null, the function takes a pointer to an instance of type metadata
|
|
for the nominal type, and returns a pointer to an array of type metadata
|
|
references for the types of the fields of that instance. The order matches
|
|
that of the field offset vector and field name list.
|
|
|
|
* For an enum:
|
|
|
|
+ The **number of payload cases** and **payload size offset** are stored
|
|
at **offset 2**. The least significant 24 bits are the number of payload
|
|
cases, and the most significant 8 bits are the offset of the payload
|
|
size in the type metadata, if present.
|
|
+ The **number of no-payload cases** is stored at **offset 3**.
|
|
+ The **case names** are referenced as a doubly-null-terminated list of
|
|
C strings at **offset 4**. The names are ordered such that payload cases
|
|
come first, followed by no-payload cases. Within each half of the list,
|
|
the order of names corresponds to the order of cases in the enum
|
|
declaration.
|
|
+ The **case type accessor** is a function pointer at **offset 5**. If
|
|
non-null, the function takes a pointer to an instance of type metadata
|
|
for the enum, and returns a pointer to an array of type metadata
|
|
references for the types of the cases of that instance. The order matches
|
|
that of the case name list. This function is similar to the field type
|
|
accessor for a struct, except also the least significant bit of each
|
|
element in the result is set if the enum case is an **indirect case**.
|
|
|
|
- If the nominal type is generic, a pointer to the **metadata pattern** that
|
|
is used to form instances of the type is stored at **offset 6**. The pointer
|
|
is null if the type is not generic.
|
|
|
|
- The **generic parameter descriptor** begins at **offset 7**. This describes
|
|
the layout of the generic parameter vector in the metadata record:
|
|
|
|
* The **offset of the generic parameter vector** is stored at **offset 7**.
|
|
This is the offset in pointer-sized words of the generic parameter vector
|
|
inside the metadata record. If the type is not generic, this is zero.
|
|
  * The **total number of type parameters** is stored at **offset 8**. This
    count includes associated types of type parameters with protocol
    constraints.
  * The **number of primary type parameters** is stored at **offset 9**. This
    count includes only the primary formal type parameters.
|
|
* For each type parameter **n**, the following fields are stored:
|
|
|
|
+ The **number of witnesses** for the type parameter is stored at
|
|
**offset 10+n**. This is the number of witness table pointers that are
|
|
stored for the type parameter in the generic parameter vector.
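The packed enum field at offset 2 splits into a 24-bit count and an 8-bit offset. The sketch below assumes the packed value is handled as a 32-bit quantity, which the text implies but does not state outright. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Least significant 24 bits: number of payload cases. */
static uint32_t payload_case_count(uint32_t packed) {
    return packed & UINT32_C(0x00FFFFFF);
}

/* Most significant 8 bits: offset of the payload size in the type
   metadata, if present. */
static uint32_t payload_size_offset(uint32_t packed) {
    return packed >> 24;
}

/* Inverse of the two accessors above. */
static uint32_t pack_enum_case_word(uint32_t num_payload_cases,
                                    uint32_t size_offset) {
    assert(num_payload_cases <= UINT32_C(0x00FFFFFF));
    assert(size_offset <= 0xFFu);
    return (size_offset << 24) | num_payload_cases;
}
```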
|
|
|
|
Note that there is no nominal type descriptor for protocols or protocol types.
|
|
See the `protocol descriptor`_ description below.
|
|
|
|
Protocol Descriptor
|
|
~~~~~~~~~~~~~~~~~~~
|
|
|
|
`Protocol metadata`_ contains references to zero, one, or more **protocol
descriptors** that describe the protocols that values of the type are required
to conform to. The protocol descriptor is laid out to be compatible with
Objective-C ``Protocol`` objects. The layout is as follows:
|
|
|
|
- An **isa** placeholder is stored at **offset 0**. This field is populated by
|
|
the Objective-C runtime.
|
|
- The mangled **name** is referenced as a null-terminated C string at
|
|
**offset 1**.
|
|
- If the protocol inherits one or more other protocols, a pointer to the
|
|
**inherited protocols list** is stored at **offset 2**. The list starts with
|
|
the number of inherited protocols as a pointer-sized integer, and is followed
|
|
by that many protocol descriptor pointers. If the protocol inherits no other
|
|
protocols, this pointer is null.
|
|
- For an ObjC-compatible protocol, its **required instance methods** are stored
|
|
at **offset 3** as an ObjC-compatible method list. This is null for native
|
|
Swift protocols.
|
|
- For an ObjC-compatible protocol, its **required class methods** are stored
|
|
at **offset 4** as an ObjC-compatible method list. This is null for native
|
|
Swift protocols.
|
|
- For an ObjC-compatible protocol, its **optional instance methods** are stored
|
|
at **offset 5** as an ObjC-compatible method list. This is null for native
|
|
Swift protocols.
|
|
- For an ObjC-compatible protocol, its **optional class methods** are stored
|
|
at **offset 6** as an ObjC-compatible method list. This is null for native
|
|
Swift protocols.
|
|
- For an ObjC-compatible protocol, its **instance properties** are stored
|
|
at **offset 7** as an ObjC-compatible property list. This is null for native
|
|
Swift protocols.
|
|
- The **size** of the protocol descriptor record is stored as a 32-bit integer
|
|
at **offset 8**. This is currently 72 on 64-bit platforms and 40 on 32-bit
|
|
platforms.
|
|
- **Flags** are stored as a 32-bit integer after the size. The following bits
|
|
are currently used (counting from least significant bit zero):
|
|
|
|
* **Bit 0** is the **Swift bit**. It is set for all protocols defined in
|
|
Swift and unset for protocols defined in Objective-C.
|
|
* **Bit 1** is the **class constraint bit**. It is set if the protocol is
|
|
**not** class-constrained, meaning that any struct, enum, or class type
|
|
may conform to the protocol. It is unset if only classes can conform to
|
|
the protocol. (The inverted meaning is for compatibility with Objective-C
|
|
protocol records, in which the bit is never set. Objective-C protocols can
|
|
only be conformed to by classes.)
|
|
* **Bit 2** is the **witness table bit**. It is set if dispatch to the
|
|
protocol's methods is done through a witness table, which is either passed
|
|
as an extra parameter to generic functions or included in the `existential
|
|
container layout`_ of protocol types. It is unset if dispatch is done
|
|
through ``objc_msgSend`` and requires no additional information to accompany
|
|
a value of conforming type.
|
|
* **Bit 31** is set by the Objective-C runtime when it has done its
|
|
initialization of the protocol record. It is unused by the Swift runtime.
|
|
|
|
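The layout above can be checked with a small calculation. The following is
only a sketch: it assumes that each "offset N" counts pointer-sized words
(an inference from the 72-byte and 40-byte sizes quoted above, not something
the list states), and the field and flag names are illustrative labels chosen
for this example rather than identifiers from the runtime.

```python
# Pointer-sized fields, in the order given by the list above.
# The names are illustrative labels, not runtime identifiers.
POINTER_FIELDS = [
    "isa", "name", "inherited_protocols",
    "required_instance_methods", "required_class_methods",
    "optional_instance_methods", "optional_class_methods",
    "instance_properties",
]

# Flag bits described above, counting from the least significant bit.
FLAG_BITS = {
    0: "swift",
    1: "not_class_constrained",
    2: "needs_witness_table",
    31: "objc_runtime_initialized",
}


def descriptor_layout(pointer_size):
    """Return ({field: byte offset}, record size) for one pointer width.

    Assumes 'offset N' in the text means N pointer-sized words.
    """
    offsets = {name: i * pointer_size for i, name in enumerate(POINTER_FIELDS)}
    end_of_pointers = len(POINTER_FIELDS) * pointer_size
    offsets["size"] = end_of_pointers        # 32-bit integer at word offset 8
    offsets["flags"] = end_of_pointers + 4   # 32-bit integer after the size
    return offsets, end_of_pointers + 8


def decode_flags(flags):
    """Return the set of flag labels whose bits are set in `flags`."""
    return {name for bit, name in FLAG_BITS.items() if flags & (1 << bit)}


offsets64, size64 = descriptor_layout(8)
offsets32, size32 = descriptor_layout(4)
print(size64, size32)
print(sorted(decode_flags(0b111)))
```

Under that assumption the computed record sizes agree with the values stated
for 64-bit and 32-bit platforms, which is the main reason to read the offsets
as word counts.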

Heap Objects
------------

Heap Metadata
~~~~~~~~~~~~~

Heap Object Header
~~~~~~~~~~~~~~~~~~

Mangling
--------
::

  mangled-name ::= '_T' global

All Swift-mangled names begin with this prefix.

Globals
~~~~~~~

::

  global ::= 't' type // standalone type (for DWARF)
  global ::= 'M' type // type metadata (address point)
                      // -- type starts with [BCOSTV]
  global ::= 'Mf' type // 'full' type metadata (start of object)
  global ::= 'MP' type // type metadata pattern
  global ::= 'Ma' type // type metadata access function
  global ::= 'ML' type // type metadata lazy cache variable
  global ::= 'Mm' type // class metaclass
  global ::= 'Mn' nominal-type // nominal type descriptor
  global ::= 'Mp' protocol // protocol descriptor
  global ::= 'PA' .* // partial application forwarder
  global ::= 'PAo' .* // ObjC partial application forwarder
  global ::= 'w' value-witness-kind type // value witness
  global ::= 'Wa' protocol-conformance // protocol witness table accessor
  global ::= 'WG' protocol-conformance // generic protocol witness table
  global ::= 'WI' protocol-conformance // generic protocol witness table instantiation function
  global ::= 'Wl' type protocol-conformance // lazy protocol witness table accessor
  global ::= 'WL' protocol-conformance // lazy protocol witness table cache variable
  global ::= 'Wo' entity // witness table offset
  global ::= 'WP' protocol-conformance // protocol witness table
  global ::= 'Wt' protocol-conformance identifier // associated type metadata accessor
  global ::= 'WT' protocol-conformance identifier nominal-type // associated type witness table accessor
  global ::= 'Wv' directness entity // field offset
  global ::= 'WV' type // value witness table
  global ::= entity // some identifiable thing
  global ::= 'TO' global // ObjC-as-swift thunk
  global ::= 'To' global // swift-as-ObjC thunk
  global ::= 'TD' global // dynamic dispatch thunk
  global ::= 'Td' global // direct method reference thunk
  global ::= 'TR' reabstract-signature // reabstraction thunk helper function
  global ::= 'Tr' reabstract-signature // reabstraction thunk

  global ::= 'TS' specializationinfo '_' mangled-name
  specializationinfo ::= 'g' passid (type protocol-conformance* '_')+ // Generic specialization info.
  specializationinfo ::= 'f' passid (funcsigspecializationarginfo '_')+ // Function signature specialization kind
  passid ::= integer // The id of the pass that generated this specialization.
  funcsigspecializationarginfo ::= 'cl' closurename type* // Closure specialized with closed over types in argument order.
  funcsigspecializationarginfo ::= 'n' // Unmodified argument
  funcsigspecializationarginfo ::= 'cp' funcsigspecializationconstantpropinfo // Constant propagated argument
  funcsigspecializationarginfo ::= 'd' // Dead argument
  funcsigspecializationarginfo ::= 'g' 's'? // Owned => Guaranteed and Exploded if 's' present.
  funcsigspecializationarginfo ::= 's' // Exploded
  funcsigspecializationarginfo ::= 'k' // Exploded
  funcsigspecializationconstantpropinfo ::= 'fr' mangled-name
  funcsigspecializationconstantpropinfo ::= 'g' mangled-name
  funcsigspecializationconstantpropinfo ::= 'i' 64-bit-integer
  funcsigspecializationconstantpropinfo ::= 'fl' float-as-64-bit-integer
  funcsigspecializationconstantpropinfo ::= 'se' stringencoding 'v' md5hash

  global ::= 'TV' global // vtable override thunk
  global ::= 'TW' protocol-conformance entity
                         // protocol witness thunk
  entity ::= nominal-type // named type declaration
  entity ::= static? entity-kind context entity-name
  entity-kind ::= 'F' // function (ctor, accessor, etc.)
  entity-kind ::= 'v' // variable (let/var)
  entity-kind ::= 'i' // subscript ('i'ndex) itself (not the individual accessors)
  entity-kind ::= 'I' // initializer
  entity-name ::= decl-name type // named declaration
  entity-name ::= 'A' index // default argument generator
  entity-name ::= 'a' addressor-kind decl-name type // mutable addressor
  entity-name ::= 'C' type // allocating constructor
  entity-name ::= 'c' type // non-allocating constructor
  entity-name ::= 'D' // deallocating destructor; untyped
  entity-name ::= 'd' // non-deallocating destructor; untyped
  entity-name ::= 'g' decl-name type // getter
  entity-name ::= 'i' // non-local variable initializer
  entity-name ::= 'l' addressor-kind decl-name type // non-mutable addressor
  entity-name ::= 'm' decl-name type // materializeForSet
  entity-name ::= 's' decl-name type // setter
  entity-name ::= 'U' index type // explicit anonymous closure expression
  entity-name ::= 'u' index type // implicit anonymous closure
  entity-name ::= 'w' decl-name type // willSet
  entity-name ::= 'W' decl-name type // didSet
  static ::= 'Z' // entity is a static member of a type
  decl-name ::= identifier
  decl-name ::= local-decl-name
  decl-name ::= private-decl-name
  local-decl-name ::= 'L' index identifier // locally-discriminated declaration
  private-decl-name ::= 'P' identifier identifier // file-discriminated declaration
  reabstract-signature ::= ('G' generic-signature)? type type
  addressor-kind ::= 'u' // unsafe addressor (no owner)
  addressor-kind ::= 'O' // owning addressor (non-native owner)
  addressor-kind ::= 'o' // owning addressor (native owner)
  addressor-kind ::= 'p' // pinning addressor (native owner)

An ``entity`` starts with a ``nominal-type-kind`` (``[COPV]``), a
substitution (``[Ss]``) of a nominal type, or an ``entity-kind``
(``[FIiv]``).

An ``entity-name`` starts with ``[AaCcDdgilmsUuWw]`` or a ``decl-name``.
A ``decl-name`` starts with ``[LP]`` or an ``identifier`` (``[0-9oX]``).

A ``context`` starts with either an ``entity``, an ``extension`` (which starts
with ``[Ee]``), or a ``module``, which might be an ``identifier`` (``[0-9oX]``)
or a substitution of a module (``[Ss]``).

A global mangling starts with an ``entity`` or ``[MTWw]``.

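Because the leading characters are distinct, a tool can guess the kind of a
global from its first one or two characters after ``_T``. The following is a
minimal sketch of that lookup. The table is a hand-copied subset of the
``global`` productions above, the function name ``classify_global`` is
invented for this example, and the sample symbols are hypothetical strings
used only to exercise the prefixes.

```python
# Subset of the 'global' productions, keyed by their leading characters.
GLOBAL_KINDS = {
    "M": "type metadata (address point)",
    "Mf": "'full' type metadata",
    "MP": "type metadata pattern",
    "Ma": "type metadata access function",
    "ML": "type metadata lazy cache variable",
    "Mm": "class metaclass",
    "Mn": "nominal type descriptor",
    "Mp": "protocol descriptor",
    "w": "value witness",
    "WP": "protocol witness table",
    "WV": "value witness table",
    "Wv": "field offset",
    "TO": "ObjC-as-swift thunk",
    "To": "swift-as-ObjC thunk",
    "TW": "protocol witness thunk",
}


def classify_global(symbol):
    """Return a description of the global's kind, or None if not mangled."""
    if not symbol.startswith("_T"):
        return None
    rest = symbol[2:]
    # Prefer the longer prefix so that 'Mf' wins over plain 'M'.
    for length in (2, 1):
        kind = GLOBAL_KINDS.get(rest[:length])
        if kind is not None:
            return kind
    return "other (an entity, or a production not in this table)"


print(classify_global("_TMfC3zim4zang"))
print(classify_global("_TMC3zim4zang"))
```

Trying the two-character prefix first relies on the note in the grammar that
the type following a plain ``M`` starts with ``[BCOSTV]``, so it cannot be
confused with the two-letter metadata codes.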

If a partial application forwarder is for a static symbol, its name will
start with the sequence ``_TPA_`` followed by the mangled symbol name of the
forwarder's destination.

A generic specialization mangling consists of a header, specifying the types
and conformances used to specialize the generic function, followed by the
full mangled name of the original unspecialized generic symbol.

The first identifier in a ``<private-decl-name>`` is a string that represents
the file the original declaration came from. It should be considered unique
within the enclosing module. The second identifier is the name of the entity.

Not all declarations marked ``private`` will use the
``<private-decl-name>`` mangling; if the entity's context is enough to uniquely
identify the entity, the simple ``identifier`` form is preferred.

The types in a ``<reabstract-signature>`` are always non-polymorphic
``<impl-function-type>`` types.

Direct and Indirect Symbols
~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

  directness ::= 'd' // direct
  directness ::= 'i' // indirect

A direct symbol resolves directly to the address of an object. An
indirect symbol resolves to the address of a pointer to the object.
They are distinct manglings to make a certain class of bugs
immediately obvious.

The terminology is slightly overloaded when discussing offsets. A
direct offset resolves to a variable holding the true offset. An
indirect offset resolves to a variable holding an offset to be applied
to type metadata to get the address of the true offset. (Offset
variables are required when the object being accessed lies within a
resilient structure. When the layout of the object may depend on
generic arguments, these offsets must be kept in metadata. Indirect
field offsets are therefore required when accessing fields in generic
types where the metadata itself has unknown layout.)

Declaration Contexts
~~~~~~~~~~~~~~~~~~~~

::

  context ::= module
  context ::= extension
  context ::= entity
  module ::= substitution // other substitution
  module ::= identifier // module name
  module ::= known-module // abbreviation
  extension ::= 'E' module entity
  extension ::= 'e' module generic-signature entity

These manglings identify the enclosing context in which an entity was declared,
such as its enclosing module, function, or nominal type.

An ``extension`` mangling is used whenever an entity's declaration context is
an extension *and* the entity being extended is in a different module. In this
case the extension's module is mangled first, followed by the entity being
extended. If the extension and the extended entity are in the same module, the
plain ``entity`` mangling is preferred. If the extension is constrained, the
constraints on the extension are mangled in its generic signature.

When mangling the context of a local entity within a constructor or
destructor, the non-allocating or non-deallocating variant is used.

Types
~~~~~

::

  type ::= 'Bb' // Builtin.BridgeObject
  type ::= 'BB' // Builtin.UnsafeValueBuffer
  type ::= 'Bf' natural '_' // Builtin.Float<n>
  type ::= 'Bi' natural '_' // Builtin.Int<n>
  type ::= 'BO' // Builtin.ObjCPointer
  type ::= 'Bo' // Builtin.ObjectPointer
  type ::= 'Bp' // Builtin.RawPointer
  type ::= 'Bv' natural type // Builtin.Vec<n>x<type>
  type ::= 'Bw' // Builtin.Word
  type ::= nominal-type
  type ::= associated-type
  type ::= 'a' context identifier // Type alias (DWARF only)
  type ::= 'b' type type // objc block function type
  type ::= 'c' type type // C function pointer type
  type ::= 'F' throws-annotation? type type // function type
  type ::= 'f' throws-annotation? type type // uncurried function type
  type ::= 'G' type <type>+ '_' // generic type application
  type ::= 'K' type type // @auto_closure function type
  type ::= 'M' type // metatype without representation
  type ::= 'XM' metatype-repr type // metatype with representation
  type ::= 'P' protocol-list '_' // protocol type
  type ::= 'PM' type // existential metatype without representation
  type ::= 'XPM' metatype-repr type // existential metatype with representation
  type ::= archetype
  type ::= 'R' type // inout
  type ::= 'T' tuple-element* '_' // tuple
  type ::= 't' tuple-element* '_' // variadic tuple
  type ::= 'Xo' type // @unowned type
  type ::= 'Xu' type // @unowned(unsafe) type
  type ::= 'Xw' type // @weak type
  type ::= 'XF' impl-function-type // function implementation type
  type ::= 'Xf' type type // @thin function type
  nominal-type ::= known-nominal-type
  nominal-type ::= substitution
  nominal-type ::= nominal-type-kind declaration-name
  nominal-type-kind ::= 'C' // class
  nominal-type-kind ::= 'O' // enum
  nominal-type-kind ::= 'V' // struct
  declaration-name ::= context decl-name
  archetype ::= 'Q' index // archetype with depth=0, idx=N
  archetype ::= 'Qd' index index // archetype with depth=M+1, idx=N
  archetype ::= associated-type
  archetype ::= qualified-archetype
  associated-type ::= substitution
  associated-type ::= 'Q' protocol-context // self type of protocol
  associated-type ::= 'Q' archetype identifier // associated type
  qualified-archetype ::= 'Qq' index context // archetype+context (DWARF only)
  protocol-context ::= 'P' protocol
  tuple-element ::= identifier? type
  metatype-repr ::= 't' // Thin metatype representation
  metatype-repr ::= 'T' // Thick metatype representation
  metatype-repr ::= 'o' // ObjC metatype representation
  throws-annotation ::= 'z' // 'throws' annotation on function types


  type ::= 'u' generic-signature type // generic type
  type ::= 'x' // generic param, depth=0, idx=0
  type ::= 'q' generic-param-index // dependent generic parameter
  type ::= 'q' type assoc-type-name // associated type of non-generic param
  type ::= 'w' generic-param-index assoc-type-name // associated type
  type ::= 'W' generic-param-index assoc-type-name+ '_' // associated type at depth

  generic-param-index ::= 'x' // depth = 0, idx = 0
  generic-param-index ::= index // depth = 0, idx = N+1
  generic-param-index ::= 'd' index index // depth = M+1, idx = N

``<type>`` never begins or ends with a number.
``<type>`` never begins with an underscore.
``<type>`` never begins with ``d``.
``<type>`` never begins with ``z``.

Note that protocols mangle differently as types and as contexts. A protocol
context always consists of a single protocol name and so mangles without a
trailing underscore. A protocol type can have zero, one, or many protocol bounds
which are juxtaposed and terminated with a trailing underscore.

::

  assoc-type-name ::= ('P' protocol-name)? identifier
  assoc-type-name ::= substitution

Associated types use an abbreviated mangling when the base generic parameter
or associated type is constrained by a single protocol requirement. The
associated type in this case can be referenced unambiguously by name alone.
If the base has multiple conformance constraints, then the protocol name is
mangled in to disambiguate.

::

  impl-function-type ::=
    impl-callee-convention impl-function-attribute* generic-signature? '_'
    impl-parameter* '_' impl-result* '_'
  impl-callee-convention ::= 't' // thin
  impl-callee-convention ::= impl-convention // thick, callee transferred with given convention
  impl-convention ::= 'a' // direct, autoreleased
  impl-convention ::= 'd' // direct, no ownership transfer
  impl-convention ::= 'D' // direct, no ownership transfer,
                          // dependent on 'self' parameter
  impl-convention ::= 'g' // direct, guaranteed
  impl-convention ::= 'e' // direct, deallocating
  impl-convention ::= 'i' // indirect, ownership transfer
  impl-convention ::= 'l' // indirect, inout
  impl-convention ::= 'G' // indirect, guaranteed
  impl-convention ::= 'o' // direct, ownership transfer
  impl-convention ::= 'z' impl-convention // error result
  impl-function-attribute ::= 'Cb' // compatible with C block invocation function
  impl-function-attribute ::= 'Cc' // compatible with C global function
  impl-function-attribute ::= 'Cm' // compatible with Swift method
  impl-function-attribute ::= 'CO' // compatible with ObjC method
  impl-function-attribute ::= 'Cw' // compatible with protocol witness
  impl-function-attribute ::= 'N' // noreturn
  impl-function-attribute ::= 'G' // generic
  impl-parameter ::= impl-convention type
  impl-result ::= impl-convention type

For the most part, manglings follow the structure of formal language
types. However, in some cases it is more useful to encode the exact
implementation details of a function type.

Any ``<impl-function-attribute>`` productions must appear in the order
in which they are specified above: e.g. a noreturn C function is
mangled with ``CcN``.

Note that the convention and function-attribute productions do not
need to be disambiguated from the start of a ``<type>``.

Generics
~~~~~~~~

::

  protocol-conformance ::= ('u' generic-signature)? type protocol module

``<protocol-conformance>`` refers to a type's conformance to a protocol. The
named module is the one containing the extension or type declaration that
declared the conformance.

::

  // Property behavior conformance
  protocol-conformance ::= ('u' generic-signature)?
    'b' identifier context identifier protocol

Property behaviors are implemented using private protocol conformances.

::

  generic-signature ::= (generic-param-count+)? ('R' requirement*)? 'r'
  generic-param-count ::= 'z' // zero parameters
  generic-param-count ::= index // N+1 parameters
  requirement ::= type-param protocol-name // protocol requirement
  requirement ::= type-param type // base class requirement
                                  // type starts with [CS]
  requirement ::= type-param 'z' type // 'z'ame-type requirement

  // Special type mangling for type params that saves the initial 'q' on
  // generic params
  type-param ::= generic-param-index // generic parameter
  type-param ::= 'w' generic-param-index assoc-type-name // associated type
  type-param ::= 'W' generic-param-index assoc-type-name+ '_'

A generic signature begins by describing the number of generic parameters at
each depth of the signature, followed by the requirements. As a special case,
the absence of any ``generic-param-count`` values indicates a single generic
parameter at the outermost depth::

  urFq_q_ // <T_0_0> T_0_0 -> T_0_0
  u_0_rFq_qd_0_ // <T_0_0><T_1_0, T_1_1> T_0_0 -> T_1_1

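The parameter-count header of a generic signature can be decoded
mechanically. The sketch below handles only signatures without an ``R``
requirement list, and it expects the text that follows the leading ``u``.
The function names are invented for this example, and the ``z_r`` input in
the test is a hypothetical signature rather than one taken from compiler
output.

```python
def parse_index(text, pos):
    """Decode one <index> at text[pos]; return (value, position after it).

    '_' encodes 0 and '<natural>_' encodes natural + 1.
    """
    end = text.index("_", pos)  # raises ValueError if no '_' follows
    digits = text[pos:end]
    if digits and not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"malformed index at position {pos}")
    value = int(digits) + 1 if digits else 0
    return value, end + 1


def generic_param_counts(signature):
    """Return the number of generic parameters at each depth."""
    counts = []
    pos = 0
    while pos < len(signature) and signature[pos] not in "Rr":
        if signature[pos] == "z":
            counts.append(0)            # 'z' means zero parameters
            pos += 1
        else:
            n, pos = parse_index(signature, pos)
            counts.append(n + 1)        # an index N means N+1 parameters
    if pos == len(signature):
        raise ValueError("signature has no 'R' or 'r' terminator")
    # Special case: no counts at all means one parameter at depth 0.
    return counts or [1]


print(generic_param_counts("r"))      # header of 'urFq_q_'
print(generic_param_counts("_0_r"))   # header of 'u_0_rFq_qd_0_'
```

For the two examples above this yields one parameter at depth 0, and one
parameter at depth 0 followed by two at depth 1, matching the ``<T_0_0>`` and
``<T_0_0><T_1_0, T_1_1>`` annotations.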

Value Witnesses
~~~~~~~~~~~~~~~

TODO: document these

::

  value-witness-kind ::= 'al' // allocateBuffer
  value-witness-kind ::= 'ca' // assignWithCopy
  value-witness-kind ::= 'ta' // assignWithTake
  value-witness-kind ::= 'de' // deallocateBuffer
  value-witness-kind ::= 'xx' // destroy
  value-witness-kind ::= 'XX' // destroyBuffer
  value-witness-kind ::= 'Xx' // destroyArray
  value-witness-kind ::= 'CP' // initializeBufferWithCopyOfBuffer
  value-witness-kind ::= 'Cp' // initializeBufferWithCopy
  value-witness-kind ::= 'cp' // initializeWithCopy
  value-witness-kind ::= 'TK' // initializeBufferWithTakeOfBuffer
  value-witness-kind ::= 'Tk' // initializeBufferWithTake
  value-witness-kind ::= 'tk' // initializeWithTake
  value-witness-kind ::= 'pr' // projectBuffer
  value-witness-kind ::= 'xs' // storeExtraInhabitant
  value-witness-kind ::= 'xg' // getExtraInhabitantIndex
  value-witness-kind ::= 'Cc' // initializeArrayWithCopy
  value-witness-kind ::= 'Tt' // initializeArrayWithTakeFrontToBack
  value-witness-kind ::= 'tT' // initializeArrayWithTakeBackToFront
  value-witness-kind ::= 'ug' // getEnumTag
  value-witness-kind ::= 'up' // destructiveProjectEnumData
  value-witness-kind ::= 'ui' // destructiveInjectEnumTag

``<value-witness-kind>`` differentiates the kinds of value
witness functions for a type.

Identifiers
~~~~~~~~~~~

::

  identifier ::= natural identifier-start-char identifier-char*
  identifier ::= 'o' operator-fixity natural operator-char+

  operator-fixity ::= 'p' // prefix operator
  operator-fixity ::= 'P' // postfix operator
  operator-fixity ::= 'i' // infix operator

  operator-char ::= 'a' // & 'and'
  operator-char ::= 'c' // @ 'commercial at'
  operator-char ::= 'd' // / 'divide'
  operator-char ::= 'e' // = 'equals'
  operator-char ::= 'g' // > 'greater'
  operator-char ::= 'l' // < 'less'
  operator-char ::= 'm' // * 'multiply'
  operator-char ::= 'n' // ! 'not'
  operator-char ::= 'o' // | 'or'
  operator-char ::= 'p' // + 'plus'
  operator-char ::= 'q' // ? 'question'
  operator-char ::= 'r' // % 'remainder'
  operator-char ::= 's' // - 'subtract'
  operator-char ::= 't' // ~ 'tilde'
  operator-char ::= 'x' // ^ 'xor'
  operator-char ::= 'z' // . 'zperiod'

``<identifier>`` is run-length encoded: the natural indicates how many
characters follow. Operator characters are mapped to letter characters as
given. In neither case can an identifier start with a digit, so
there's no ambiguity with the run-length.

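The run-length scheme is simple enough to sketch directly. The following
covers ASCII identifiers and ASCII operators only. The function names and the
fixity keywords (``"prefix"``, ``"postfix"``, ``"infix"``) are conventions of
this example, not of the compiler.

```python
# ASCII operator characters and the letters that replace them,
# transcribed from the operator-char table above.
OPERATOR_CHARS = dict(zip("&@/=><*!|+?%-~^.", "acdeglmnopqrstxz"))

# Fixity keywords used by this sketch, mapped to the grammar's letters.
FIXITY = {"prefix": "p", "postfix": "P", "infix": "i"}


def mangle_identifier(name):
    """Run-length encode an ASCII identifier: length, then the name."""
    if not name or not name.isascii():
        raise ValueError("this sketch handles non-empty ASCII names only")
    if name[0].isdigit():
        raise ValueError("an identifier cannot start with a digit")
    return f"{len(name)}{name}"


def mangle_operator(op, fixity):
    """Mangle an ASCII operator as 'o', fixity, length, mapped letters."""
    letters = "".join(OPERATOR_CHARS[ch] for ch in op)
    return f"o{FIXITY[fixity]}{len(letters)}{letters}"


print(mangle_identifier("zang"))
print(mangle_operator("+", "infix"))
print(mangle_operator("==", "infix"))
```

Because the encoded name can never begin with a digit, a reader of the
mangled string can always tell where the length ends and the name begins.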

::

  identifier ::= 'X' natural identifier-start-char identifier-char*
  identifier ::= 'X' 'o' operator-fixity natural identifier-char*

Identifiers that contain non-ASCII characters are encoded using the Punycode
algorithm specified in RFC 3492, with the modifications that ``_`` is used
as the encoding delimiter, and uppercase letters A through J are used in place
of digits 0 through 9 in the encoding character set. The mangling then
consists of an ``X`` followed by the run length of the encoded string and the
encoded string itself. For example, the identifier ``vergüenza`` is mangled
to ``X12vergenza_JFa``. (The encoding in standard Punycode would be
``vergenza-95a``.)

Operators that contain non-ASCII characters are mangled by first mapping the
ASCII operator characters to letters as for pure ASCII operator names, then
Punycode-encoding the substituted string. The mangling then consists of
``Xo`` followed by the fixity, run length of the encoded string, and the encoded
string itself. For example, the infix operator ``«+»`` is mangled to
``Xoi7p_qcaDc`` (``p_qcaDc`` being the encoding of the substituted
string ``«p»``).

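Both examples can be reproduced by post-processing a standard Punycode
encoder. The sketch below uses Python's built-in ``punycode`` codec and
applies only the two modifications named above. It is not the compiler's
encoder: in particular, how a name with no ASCII characters at all is
delimited is an assumption here, and the function names are invented for
this example.

```python
# ASCII operator characters and their replacement letters (see Identifiers).
OPERATOR_CHARS = dict(zip("&@/=><*!|+?%-~^.", "acdeglmnopqrstxz"))
DIGITS_TO_LETTERS = str.maketrans("0123456789", "ABCDEFGHIJ")


def swift_punycode(text):
    """Standard Punycode with '_' as delimiter and A-J in place of 0-9."""
    encoded = text.encode("punycode").decode("ascii")
    basic, delimiter, tail = encoded.rpartition("-")
    # Only the encoded tail uses digits as encoding characters; the basic
    # (ASCII) part before the delimiter is copied through unchanged.
    tail = tail.translate(DIGITS_TO_LETTERS)
    if not delimiter:
        return tail  # assumption: no delimiter when there is no ASCII part
    return f"{basic}_{tail}"


def mangle_unicode_identifier(name):
    encoded = swift_punycode(name)
    return f"X{len(encoded)}{encoded}"


def mangle_unicode_operator(op, fixity_letter):
    # Map ASCII operator characters to letters first, then encode.
    substituted = "".join(OPERATOR_CHARS.get(ch, ch) for ch in op)
    encoded = swift_punycode(substituted)
    return f"Xo{fixity_letter}{len(encoded)}{encoded}"


print(mangle_unicode_identifier("vergüenza"))
print(mangle_unicode_operator("«+»", "i"))
```

Splitting at the last ``-`` matters: digits in the ASCII part of a name are
ordinary characters and must not be rewritten to letters.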

Substitutions
~~~~~~~~~~~~~

::

  substitution ::= 'S' index

``<substitution>`` is a back-reference to a previously mangled entity. The mangling
algorithm maintains a mapping of entities to substitution indices as it runs.
When an entity that can be represented by a substitution (a module, nominal
type, or protocol) is mangled, a substitution is first looked for in the
substitution map, and if it is present, the entity is mangled using the
associated substitution index. Otherwise, the entity is mangled normally, and
it is then added to the substitution map and associated with the next
available substitution index.

For example, in mangling a function type
``(zim.zang.zung, zim.zang.zung, zim.zippity) -> zim.zang.zoo`` (with module
``zim`` and class ``zim.zang``),
the recurring contexts ``zim``, ``zim.zang``, and ``zim.zang.zung``
will be mangled using substitutions after being mangled
for the first time. The first argument type will mangle in long form,
``CC3zim4zang4zung``, and in doing so, ``zim`` will acquire substitution ``S_``,
``zim.zang`` will acquire substitution ``S0_``, and ``zim.zang.zung`` will
acquire ``S1_``. The second argument is the same as the first and will mangle
using its substitution, ``S1_``. The
third argument type will mangle using the substitution for ``zim``,
``CS_7zippity``. (It also acquires substitution ``S2_`` which would be used
if it mangled again.) The result type will mangle using the substitution for
``zim.zang``, ``CS0_3zoo`` (and acquire substitution ``S3_``). The full
function type thus mangles as ``fTCC3zim4zang4zungS1_CS_7zippity_CS0_3zoo``.

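The worked example can be replayed with a toy mangler that keeps the
substitution map described above. This is a sketch only: the ``Mangler``
class and its methods are invented for illustration, it treats every nominal
type as a class, and it supports just enough of the grammar (modules, nested
classes, and a function type over a tuple of classes) to reproduce the
string above.

```python
class Mangler:
    """Toy mangler that tracks substitutions for modules and classes."""

    def __init__(self):
        self.substitutions = {}  # entity path (tuple) -> substitution index

    @staticmethod
    def index(n):
        # <index>: '_' for 0, otherwise (n - 1) followed by '_'.
        return "_" if n == 0 else f"{n - 1}_"

    def _with_substitution(self, key, mangle_long_form):
        if key in self.substitutions:
            return "S" + self.index(self.substitutions[key])
        text = mangle_long_form()
        # The entity is registered only after it has been mangled, so any
        # enclosing contexts receive lower indices than the entity itself.
        self.substitutions[key] = len(self.substitutions)
        return text

    def module(self, name):
        return self._with_substitution((name,), lambda: f"{len(name)}{name}")

    def class_type(self, *path):
        """Mangle a class given as (module, outer class, ..., name)."""
        def long_form():
            parent = path[:-1]
            if len(parent) == 1:
                context = self.module(parent[0])
            else:
                context = self.class_type(*parent)
            return f"C{context}{len(path[-1])}{path[-1]}"
        return self._with_substitution(path, long_form)

    def function_type(self, params, result):
        args = "".join(self.class_type(*p) for p in params)
        return "fT" + args + "_" + self.class_type(*result)


mangler = Mangler()
mangled = mangler.function_type(
    [("zim", "zang", "zung"), ("zim", "zang", "zung"), ("zim", "zippity")],
    ("zim", "zang", "zoo"),
)
print(mangled)
```

Registering an entity after its long form has been produced is what gives
``zim`` the first index, ``zim.zang`` the second, and ``zim.zang.zung`` the
third, as in the walkthrough above.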

::

  substitution ::= 's'

The special substitution ``s`` is used for the ``Swift`` standard library
module.

Predefined Substitutions
~~~~~~~~~~~~~~~~~~~~~~~~

::

  known-module ::= 's' // Swift
  known-module ::= 'SC' // C
  known-module ::= 'So' // Objective-C
  known-nominal-type ::= 'Sa' // Swift.Array
  known-nominal-type ::= 'Sb' // Swift.Bool
  known-nominal-type ::= 'Sc' // Swift.UnicodeScalar
  known-nominal-type ::= 'Sd' // Swift.Float64
  known-nominal-type ::= 'Sf' // Swift.Float32
  known-nominal-type ::= 'Si' // Swift.Int
  known-nominal-type ::= 'SP' // Swift.UnsafePointer
  known-nominal-type ::= 'Sp' // Swift.UnsafeMutablePointer
  known-nominal-type ::= 'SQ' // Swift.ImplicitlyUnwrappedOptional
  known-nominal-type ::= 'Sq' // Swift.Optional
  known-nominal-type ::= 'SR' // Swift.UnsafeBufferPointer
  known-nominal-type ::= 'Sr' // Swift.UnsafeMutableBufferPointer
  known-nominal-type ::= 'SS' // Swift.String
  known-nominal-type ::= 'Su' // Swift.UInt

``<known-module>`` and ``<known-nominal-type>`` are built-in substitutions for
certain common entities. Like any other substitution, they all start
with 'S'.

The Objective-C module is used as the context for mangling Objective-C
classes as ``<type>``\ s.

Indexes
~~~~~~~

::

  index ::= '_' // 0
  index ::= natural '_' // N+1
  natural ::= [0-9]+

``<index>`` is a production for encoding numbers in contexts that can't
end in a digit; it's optimized for encoding smaller numbers.
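
The encoding is easiest to see as a pair of functions. This is a minimal
sketch of the two productions above; the names ``encode_index`` and
``decode_index`` are chosen for this example.

```python
def encode_index(n):
    """Encode a non-negative integer as an <index>."""
    if n < 0:
        raise ValueError("an index cannot be negative")
    # 0 is a bare '_'; any other value N is written as N - 1 followed by '_'.
    return "_" if n == 0 else f"{n - 1}_"


def decode_index(text):
    """Decode a complete <index> string back to its integer value."""
    if not text.endswith("_"):
        raise ValueError("an index always ends with '_'")
    digits = text[:-1]
    if not digits:
        return 0
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected only decimal digits before '_'")
    return int(digits) + 1


print([encode_index(n) for n in range(4)])
```

Zero costs a single character, and the trailing underscore guarantees that
the encoding never ends in a digit, so it can be followed directly by a
run-length-encoded identifier.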