mirror of
https://github.com/apple/swift.git
synced 2025-12-21 12:14:44 +01:00
1236 lines
56 KiB
ReStructuredText
:orphan:
|
|
|
|
.. @raise litre.TestsAreMissing
|
|
.. _ABI:
|
|
|
|
The Swift ABI
|
|
=============
|
|
|
|
.. contents::
|
|
|
|
Hard Constraints on Resilience
|
|
------------------------------
|
|
|
|
The root of a class hierarchy must remain stable, on pain of
|
|
invalidating the metaclass hierarchy. Note that a Swift class without an
|
|
explicit base class is implicitly rooted in the SwiftObject
|
|
Objective-C class.
|
|
|
|
Type Layout
|
|
-----------
|
|
|
|
Fragile Struct and Tuple Layout
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Structs and tuples currently share the same layout algorithm, known as the
|
|
"Universal" layout algorithm in the compiler implementation. The algorithm
|
|
is as follows:
|
|
|
|
- Start with a **size** of **0** and an **alignment** of **1**.
|
|
- Iterate through the fields, in element order for tuples, or in ``var``
|
|
declaration order for structs. For each field:
|
|
|
|
* Update **size** by rounding up to the **alignment of the field**, that is,
|
|
increasing it to the least value greater than or equal to **size** and evenly
|
|
divisible by the **alignment of the field**.
|
|
* Assign the **offset of the field** to the current value of **size**.
|
|
* Update **size** by adding the **size of the field**.
|
|
* Update **alignment** to the max of **alignment** and the
|
|
**alignment of the field**.
|
|
|
|
- The final **size** and **alignment** are the size and alignment of the
|
|
aggregate. The **stride** of the type is the final **size** rounded up to
|
|
**alignment**.
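The steps above can be sketched in C as follows. This is only an illustration of the algorithm as described here, not the compiler's implementation; the names ``FieldType``, ``Layout``, ``round_up``, and ``universal_layout`` are hypothetical, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stddef.h>

/* Size and alignment of one field (hypothetical helper type). */
typedef struct { size_t size, alignment; } FieldType;

/* Result of laying out an aggregate (hypothetical helper type). */
typedef struct { size_t size, alignment, stride; } Layout;

/* Round value up to the next multiple of a power-of-two alignment. */
static size_t round_up(size_t value, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Lay out the fields in order, writing each field's offset to offsets[i]. */
Layout universal_layout(const FieldType *fields, size_t count, size_t *offsets) {
    Layout result = { 0, 1, 0 };            /* size 0, alignment 1 */
    for (size_t i = 0; i < count; i++) {
        /* Pad up to the field's alignment, then place the field. */
        result.size = round_up(result.size, fields[i].alignment);
        offsets[i] = result.size;
        result.size += fields[i].size;
        if (fields[i].alignment > result.alignment) {
            result.alignment = fields[i].alignment;
        }
    }
    /* Stride is the final size rounded up to the final alignment. */
    result.stride = round_up(result.size, result.alignment);
    return result;
}
```

For an 8-byte, 8-aligned field followed by a 1-byte field, this sketch yields a size of 9, an alignment of 8, and a stride of 16.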
|
|
|
|
Note that this differs from C or LLVM's normal layout rules in that *size*
|
|
and *stride* are distinct; whereas C layout requires that an embedded struct's
|
|
size be padded out to its alignment and that nothing be laid out there,
|
|
Swift layout allows an outer struct to lay out fields in the inner struct's
|
|
tail padding, alignment permitting. Unlike C, zero-sized structs and tuples
|
|
are also allowed, and take up no storage in enclosing aggregates. The Swift
|
|
compiler emits LLVM packed struct types with manual padding to get the
|
|
necessary control over the binary layout. Some examples:
|
|
|
|
::

  // LLVM <{ i64, i8 }>
  struct S {
    var x: Int
    var y: UInt8
  }

  // LLVM <{ i8, [7 x i8], <{ i64, i8 }>, i8 }>
  struct S2 {
    var x: UInt8
    var s: S
    var y: UInt8
  }

  // LLVM <{}>
  struct Empty {}

  // LLVM <{ i64, i64 }>
  struct ContainsEmpty {
    var x: Int
    var y: Empty
    var z: Int
  }
|
|
|
|
Class Layout
|
|
~~~~~~~~~~~~
|
|
|
|
Swift relies on the following assumptions about the Objective-C runtime,
|
|
which are therefore now part of the Objective-C ABI:
|
|
|
|
- 32-bit platforms never have tagged pointers. ObjC pointer types are
|
|
either nil or an object pointer.
|
|
|
|
- On x86-64, a tagged pointer either sets the lowest bit of the pointer
|
|
or the highest bit of the pointer. Therefore, both of these bits are
|
|
zero if and only if the value is not a tagged pointer.
|
|
|
|
- On ARM64, a tagged pointer always sets the highest bit of the pointer.
|
|
|
|
- 32-bit platforms never perform any isa masking. ``object_getClass``
|
|
is always equivalent to ``*(Class*)object``.
|
|
|
|
- 64-bit platforms perform isa masking only if the runtime exports a
|
|
symbol ``uintptr_t objc_debug_isa_class_mask;``. If this symbol
|
|
is exported, ``object_getClass`` on a non-tagged pointer is always
|
|
equivalent to ``(Class)(objc_debug_isa_class_mask & *(uintptr_t*)object)``.
|
|
|
|
- The superclass field of a class object is always stored immediately
|
|
after the isa field. Its value is either nil or a pointer to the
|
|
class object for the superclass; it never has other bits set.
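The tagged-pointer and isa-masking assumptions above amount to simple bit tests. The following C sketch illustrates them on raw 64-bit values; the function names are hypothetical, and the mask is passed in as a parameter rather than read from ``objc_debug_isa_class_mask``.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-64: a tagged pointer sets the lowest bit or the highest bit. */
bool is_tagged_pointer_x86_64(uint64_t bits) {
    const uint64_t low_bit = UINT64_C(1);
    const uint64_t high_bit = UINT64_C(1) << 63;
    return (bits & (low_bit | high_bit)) != 0;
}

/* ARM64: a tagged pointer always sets the highest bit. */
bool is_tagged_pointer_arm64(uint64_t bits) {
    return (bits >> 63) != 0;
}

/* 64-bit isa masking: the class pointer is the first word of a
 * non-tagged object, masked with the runtime-provided mask. */
uint64_t class_bits_from_isa_word(uint64_t isa_word, uint64_t isa_class_mask) {
    return isa_word & isa_class_mask;
}
```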
|
|
|
|
The following assumptions are part of the Swift ABI:
|
|
|
|
- Swift class pointers are never tagged pointers.
|
|
|
|
TODO
|
|
|
|
Fragile Enum Layout
|
|
~~~~~~~~~~~~~~~~~~~
|
|
|
|
In laying out enum types, the ABI attempts to avoid requiring additional
|
|
storage to store the tag for the enum case. The ABI chooses one of five
|
|
strategies based on the layout of the enum:
|
|
|
|
Empty Enums
|
|
```````````
|
|
|
|
In the degenerate case of an enum with no cases, the enum is an empty type.
|
|
|
|
::
|
|
|
|
enum Empty {} // => empty type
|
|
|
|
Single-Case Enums
|
|
`````````````````
|
|
|
|
In the degenerate case of an enum with a single case, there is no
|
|
discriminator needed, and the enum type has the exact same layout as its
|
|
case's data type, or is empty if the case has no data type.
|
|
|
|
::
|
|
|
|
enum EmptyCase { case X } // => empty type
|
|
enum DataCase { case Y(Int, Double) } // => LLVM <{ i64, double }>
|
|
|
|
C-Like Enums
|
|
````````````
|
|
|
|
If none of the cases has a data type (a "C-like" enum), then the enum
|
|
is laid out as an integer tag with the minimal number of bits to contain
|
|
all of the cases. The machine-level layout of the type then follows LLVM's
|
|
data layout rules for integer types on the target platform. The cases are
|
|
assigned tag values in declaration order.
|
|
|
|
::

  enum EnumLike2 { // => LLVM i1
    case A         // => i1 0
    case B         // => i1 1
  }

  enum EnumLike8 { // => LLVM i3
    case A         // => i3 0
    case B         // => i3 1
    case C         // => i3 2
    case D         // etc.
    case E
    case F
    case G
    case H
  }
|
|
|
|
Discriminator values after the one used for the last case become *extra
|
|
inhabitants* of the enum type (see `Single-Payload Enums`_).
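As a rough illustration, the tag width of a C-like enum and its number of extra inhabitants can be computed as below. The function names are hypothetical, and the sketch assumes that extra inhabitants are counted within the minimal-width integer type (for example ``i3``), not within its storage byte.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal number of bits able to hold case_count distinct tag values. */
unsigned tag_bits_for_case_count(uint32_t case_count) {
    assert(case_count >= 1);
    unsigned bits = 0;
    /* Grow the width until 2^bits >= case_count. */
    while ((UINT64_C(1) << bits) < case_count) {
        bits++;
    }
    return bits;
}

/* Tag values after the one used by the last case. */
uint64_t extra_inhabitant_count(uint32_t case_count) {
    unsigned bits = tag_bits_for_case_count(case_count);
    return (UINT64_C(1) << bits) - case_count;
}
```

Under these assumptions, an enum with five no-data cases needs a 3-bit tag and leaves three unused tag values.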
|
|
|
|
Single-Payload Enums
|
|
````````````````````
|
|
|
|
If an enum has a single case with a data type and one or more no-data cases
|
|
(a "single-payload" enum), then the case with data type is represented using
|
|
the data type's binary representation, with added zero bits for tag if
|
|
necessary. If the data type's binary representation
|
|
has **extra inhabitants**, that is, bit patterns with the size and alignment of
|
|
the type but which do not form valid values of that type, they are used to
|
|
represent the no-data cases, with extra inhabitants in order of ascending
|
|
numeric value matching no-data cases in declaration order. If the type
|
|
has *spare bits* (see `Multi-Payload Enums`_), they are used to form extra
|
|
inhabitants. The enum value is then represented as an integer with the storage
|
|
size in bits of the data type. Extra inhabitants of the payload type not used
|
|
by the enum type become extra inhabitants of the enum type itself.
|
|
|
|
::
|
|
|
|
enum CharOrSectionMarker { => LLVM i32
|
|
case Paragraph => i32 0x0020_0000
|
|
case Char(UnicodeScalar) => i32 (zext i21 %Char to i32)
|
|
case Chapter => i32 0x0020_0001
|
|
}
|
|
|
|
CharOrSectionMarker.Char('\x00') => i32 0x0000_0000
|
|
CharOrSectionMarker.Char('\u10FFFF') => i32 0x0010_FFFF
|
|
|
|
enum CharOrSectionMarkerOrFootnoteMarker { => LLVM i32
|
|
case CharOrSectionMarker(CharOrSectionMarker) => i32 %CharOrSectionMarker
|
|
case Asterisk => i32 0x0020_0002
|
|
case Dagger => i32 0x0020_0003
|
|
case DoubleDagger => i32 0x0020_0004
|
|
}
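The encoding used in the example above can be sketched in C. This assumes, as the example does, that the payload is a 21-bit scalar stored in an ``i32`` and that the first extra inhabitant is ``0x0020_0000``; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { SCALAR_BITS = 21 };  /* the payload is an i21 */
#define FIRST_EXTRA_INHABITANT (UINT32_C(1) << SCALAR_BITS)  /* 0x0020_0000 */

/* Payload case: the scalar is zero-extended to 32 bits. */
uint32_t encode_char(uint32_t scalar) {
    assert(scalar < FIRST_EXTRA_INHABITANT);
    return scalar;
}

/* No-data case: the index-th extra inhabitant, in ascending numeric order. */
uint32_t encode_no_data_case(uint32_t index) {
    return FIRST_EXTRA_INHABITANT + index;
}

/* A value holds a payload if it lies below the first extra inhabitant. */
bool holds_char(uint32_t bits) {
    return bits < FIRST_EXTRA_INHABITANT;
}
```

In the nested enum, the two extra inhabitants already consumed by ``CharOrSectionMarker`` are skipped, so ``Asterisk`` corresponds to index 2 in this sketch.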
|
|
|
|
If the data type has no extra inhabitants, or there are not enough extra
|
|
inhabitants to represent all of the no-data cases, then a tag bit is added
|
|
to the enum's representation. The tag bit is set for the no-data cases, which
|
|
are then assigned values in the data area of the enum in declaration order.
|
|
|
|
::

  enum IntOrInfinity { => LLVM <{ i64, i1 }>
    case NegInfinity   => <{ i64, i1 }> {    0, 1 }
    case Int(Int)      => <{ i64, i1 }> { %Int, 0 }
    case PosInfinity   => <{ i64, i1 }> {    1, 1 }
  }

  IntOrInfinity.Int(    0) => <{ i64, i1 }> {     0, 0 }
  IntOrInfinity.Int(20721) => <{ i64, i1 }> { 20721, 0 }
|
|
|
|
Multi-Payload Enums
|
|
```````````````````
|
|
|
|
If an enum has more than one case with data type, then a tag is necessary to
|
|
discriminate the data types. The ABI will first try to find common
|
|
**spare bits**, that is, bits in the data types' binary representations which are
|
|
either fixed-zero or ignored by valid values of all of the data types. The tag
|
|
will be scattered into these spare bits as much as possible. Currently only
|
|
spare bits of primitive integer types, such as the high bits of an ``i21``
|
|
type, are considered. The enum data is represented as an integer with the
|
|
storage size in bits of the largest data type.
|
|
|
|
::
|
|
|
|
enum TerminalChar { => LLVM i32
|
|
case Plain(UnicodeScalar) => i32 (zext i21 %Plain to i32)
|
|
case Bold(UnicodeScalar) => i32 (or (zext i21 %Bold to i32), 0x0020_0000)
|
|
case Underline(UnicodeScalar) => i32 (or (zext i21 %Underline to i32), 0x0040_0000)
|
|
case Blink(UnicodeScalar) => i32 (or (zext i21 %Blink to i32), 0x0060_0000)
|
|
case Empty => i32 0x0080_0000
|
|
case Cursor => i32 0x0080_0001
|
|
}
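The spare-bit packing in the ``TerminalChar`` example can be illustrated in C as below. The sketch assumes that the tag occupies the bits directly above the 21-bit payload, as in the example; in general the tag may be scattered across non-contiguous spare bits. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { PAYLOAD_BITS = 21 };  /* bits used by a UnicodeScalar payload */
#define PAYLOAD_MASK ((UINT32_C(1) << PAYLOAD_BITS) - 1)

/* Combine a tag and a payload into one 32-bit value. */
uint32_t pack_tagged(uint32_t tag, uint32_t payload) {
    assert(payload <= PAYLOAD_MASK);
    assert(tag < (UINT32_C(1) << (32 - PAYLOAD_BITS)));
    return (tag << PAYLOAD_BITS) | payload;
}

/* Recover the tag from the spare bits. */
uint32_t unpack_tag(uint32_t bits) {
    return bits >> PAYLOAD_BITS;
}

/* Recover the payload by masking off the spare bits. */
uint32_t unpack_payload(uint32_t bits) {
    return bits & PAYLOAD_MASK;
}
```

Here tags 0 through 3 select ``Plain``, ``Bold``, ``Underline``, and ``Blink``, while tag 4 is shared by the no-data cases, which are numbered in the payload area.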
|
|
|
|
If there are not enough spare bits to contain the tag, then additional bits are
|
|
added to the representation to contain the tag. Tag values are
|
|
assigned to data cases in declaration order. If there are no-data cases, they
|
|
are collected under a common tag, and assigned values in the data area of the
|
|
enum in declaration order.
|
|
|
|
::

  class Bignum {}

  enum IntDoubleOrBignum { => LLVM <{ i64, i2 }>
    case Int(Int)          => <{ i64, i2 }> {           %Int,            0 }
    case Double(Double)    => <{ i64, i2 }> { (bitcast  %Double to i64), 1 }
    case Bignum(Bignum)    => <{ i64, i2 }> { (ptrtoint %Bignum to i64), 2 }
  }
|
|
|
|
Existential Container Layout
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Values of protocol type, protocol composition type, or "any" type
|
|
(``protocol<>``) are laid out using **existential containers** (so-called
|
|
because these types are "existential types" in type theory).
|
|
|
|
Opaque Existential Containers
|
|
`````````````````````````````
|
|
|
|
If there is no class constraint on a protocol or protocol composition type,
|
|
the existential container has to accommodate a value of arbitrary size and
|
|
alignment. It does this using a **fixed-size buffer**, which is three pointers
|
|
in size and pointer-aligned. This either directly contains the value, if its
|
|
size and alignment are both less than or equal to the fixed-size buffer's, or
|
|
contains a pointer to a side allocation owned by the existential container.
|
|
The type of the contained value is identified by its `type metadata`_ record,
|
|
and witness tables for all of the required protocol conformances are included.
|
|
The layout is as if declared in the following C struct::
|
|
|
|
struct OpaqueExistentialContainer {
|
|
void *fixedSizeBuffer[3];
|
|
Metadata *type;
|
|
WitnessTable *witnessTables[NUM_WITNESS_TABLES];
|
|
};
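The inline-storage rule described above can be written as a small predicate. This is a sketch only: ``fits_in_fixed_size_buffer`` is a hypothetical name, and the host's pointer size and alignment stand in for the target's.

```c
#include <stdbool.h>
#include <stddef.h>

enum { FIXED_BUFFER_WORDS = 3 };  /* the buffer is three pointers in size */

/* True if a value of the given size and alignment can be stored directly
 * in the fixed-size buffer instead of in a side allocation. */
bool fits_in_fixed_size_buffer(size_t size, size_t alignment) {
    const size_t buffer_size = FIXED_BUFFER_WORDS * sizeof(void *);
    const size_t buffer_alignment = _Alignof(void *);
    return size <= buffer_size && alignment <= buffer_alignment;
}
```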
|
|
|
|
Class Existential Containers
|
|
````````````````````````````
|
|
|
|
If one or more of the protocols in a protocol or protocol composition type
|
|
have a class constraint, then only class values can be stored in the existential
|
|
container, and a more efficient representation is used. Class instances are
|
|
always a single pointer in size, so a fixed-size buffer and potential side
|
|
allocation are not needed, and class instances always have a reference to their
|
|
own type metadata, so the separate metadata record is not needed. The
|
|
layout is thus as if declared in the following C struct::
|
|
|
|
  struct ClassExistentialContainer {
    HeapObject *value;
    WitnessTable *witnessTables[NUM_WITNESS_TABLES];
  };
|
|
|
|
Note that if no witness tables are needed, such as for the "any class" type
|
|
``protocol<class>`` or an Objective-C protocol type, then the only element of
|
|
the layout is the heap object pointer. This is ABI-compatible with ``id``
|
|
and ``id <Protocol>`` types in Objective-C.
|
|
|
|
Type Metadata
|
|
-------------
|
|
|
|
The Swift runtime keeps a **metadata record** for every type used in a program,
|
|
including every instantiation of generic types. These metadata records can
|
|
be used by (TODO: reflection and) debugger tools to discover information about
|
|
types. For non-generic nominal types, these metadata records are generated
|
|
statically by the compiler. For instances of generic types, and for intrinsic
|
|
types such as tuples, functions, protocol compositions, etc., metadata records
|
|
are lazily created by the runtime as required. Every type has a unique metadata
|
|
record; two **metadata pointer** values are equal iff the types are equivalent.
|
|
|
|
In the layout descriptions below, offsets are given relative to the
|
|
metadata pointer as an index into an array of pointers. On a 32-bit platform,
|
|
**offset 1** means an offset of 4 bytes, and on 64-bit platforms, it means
|
|
an offset of 8 bytes.
|
|
|
|
Common Metadata Layout
|
|
~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
All metadata records share a common header, with the following fields:
|
|
|
|
- The **value witness table** pointer references a vtable of functions
|
|
that implement the value semantics of the type, providing fundamental
|
|
operations such as allocating, copying, and destroying values of the type.
|
|
The value witness table also records the size, alignment, stride, and other
|
|
fundamental properties of the type. The value witness table pointer is at
|
|
**offset -1** from the metadata pointer, that is, the pointer-sized word
|
|
**immediately before** the pointer's referenced address.
|
|
|
|
- The **kind** field is a pointer-sized integer that describes the kind of type
|
|
the metadata describes. This field is at **offset 0** from the metadata
|
|
pointer.
|
|
|
|
The current kind values are as follows:
|
|
|
|
* `Struct metadata`_ has a kind of **1**.
|
|
* `Enum metadata`_ has a kind of **2**.
|
|
* **Opaque metadata** has a kind of **8**. This is used for compiler
|
|
``Builtin`` primitives that have no additional runtime information.
|
|
* `Tuple metadata`_ has a kind of **9**.
|
|
* `Function metadata`_ has a kind of **10**.
|
|
* `Protocol metadata`_ has a kind of **12**. This is used for
|
|
protocol types, for protocol compositions, and for the "any" type
|
|
``protocol<>``.
|
|
* `Metatype metadata`_ has a kind of **13**.
|
|
* `Class metadata`_, instead of a kind, has an *isa pointer* in its kind slot,
|
|
pointing to the class's metaclass record. This isa pointer is guaranteed
|
|
to have an integer value larger than **4096** and so can be discriminated
|
|
from non-class kind values.
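Reading the common header can be sketched in C by treating a metadata pointer as a pointer into an array of pointer-sized words. The ``MetadataKind`` enumeration and the function names are hypothetical; ``KIND_CLASS`` and ``KIND_UNKNOWN`` are markers used only by this sketch and are not stored kind values.

```c
#include <stdint.h>

typedef enum {
    KIND_STRUCT = 1,
    KIND_ENUM = 2,
    KIND_OPAQUE = 8,
    KIND_TUPLE = 9,
    KIND_FUNCTION = 10,
    KIND_PROTOCOL = 12,
    KIND_METATYPE = 13,
    KIND_CLASS = -1,    /* sketch-only marker: the slot holds an isa pointer */
    KIND_UNKNOWN = -2   /* sketch-only marker: value not listed above */
} MetadataKind;

/* The value witness table pointer is the word just before the address point. */
uintptr_t value_witness_table_word(const uintptr_t *metadata) {
    return metadata[-1];
}

/* Classify the word at offset 0 as a kind value or an isa pointer. */
MetadataKind classify_metadata(const uintptr_t *metadata) {
    uintptr_t kind = metadata[0];
    if (kind > 4096) {
        return KIND_CLASS;  /* isa pointers are larger than 4096 */
    }
    switch (kind) {
    case 1:  return KIND_STRUCT;
    case 2:  return KIND_ENUM;
    case 8:  return KIND_OPAQUE;
    case 9:  return KIND_TUPLE;
    case 10: return KIND_FUNCTION;
    case 12: return KIND_PROTOCOL;
    case 13: return KIND_METATYPE;
    default: return KIND_UNKNOWN;
    }
}
```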
|
|
|
|
Struct Metadata
|
|
~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, struct metadata records
|
|
contain the following fields:
|
|
|
|
- The `nominal type descriptor`_ is referenced at **offset 1**.
|
|
|
|
- A reference to the **parent** metadata record is stored at **offset 2**. For
|
|
structs that are members of an enclosing nominal type, this is a reference
|
|
to the enclosing type's metadata. For top-level structs, this is null.
|
|
|
|
TODO: The parent pointer is currently always null.
|
|
|
|
- A vector of **field offsets** begins at **offset 3**. For each field of the
|
|
struct, in ``var`` declaration order, the field's offset in bytes from the
|
|
beginning of the struct is stored as a pointer-sized integer.
|
|
|
|
- If the struct is generic, then the
|
|
`generic parameter vector`_ begins at **offset 3+n**, where **n** is the
|
|
number of fields in the struct.
|
|
|
|
Enum Metadata
|
|
~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, enum metadata records
|
|
contain the following fields:
|
|
|
|
- The `nominal type descriptor`_ is referenced at **offset 1**.
|
|
|
|
- A reference to the **parent** metadata record is stored at **offset 2**. For
|
|
enums that are members of an enclosing nominal type, this is a reference to
|
|
the enclosing type's metadata. For top-level enums, this is null.
|
|
|
|
TODO: The parent pointer is currently always null.
|
|
|
|
- If the enum is generic, then the
|
|
`generic parameter vector`_ begins at **offset 3**.
|
|
|
|
Tuple Metadata
|
|
~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, tuple metadata records
|
|
contain the following fields:
|
|
|
|
- The **number of elements** in the tuple is a pointer-sized integer at
|
|
**offset 1**.
|
|
- The **labels string** is a pointer to a list of consecutive null-terminated
|
|
label names for the tuple at **offset 2**. Each label name is given as a
|
|
null-terminated, UTF-8-encoded string in sequence. If the tuple has no
|
|
labels, this is a null pointer.
|
|
|
|
TODO: The labels string pointer is currently always null, and labels are
|
|
not factored into tuple metadata uniquing.
|
|
|
|
- The **element vector** begins at **offset 3** and consists of a vector of
|
|
type-offset pairs. The metadata for the *n*\ th element's type is a pointer
|
|
at **offset 3+2*n**. The offset in bytes from the beginning of the tuple to
|
|
the beginning of the *n*\ th element is at **offset 3+2*n+1**.
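The index arithmetic for the element vector can be sketched as follows, again viewing the metadata record as an array of pointer-sized words. The function names are hypothetical, and the type metadata references are returned as raw words rather than typed pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Number of elements, stored at offset 1. */
uintptr_t tuple_element_count(const uintptr_t *metadata) {
    return metadata[1];
}

/* Type metadata reference of the n-th element, stored at offset 3+2*n. */
uintptr_t tuple_element_type_word(const uintptr_t *metadata, uintptr_t n) {
    assert(n < tuple_element_count(metadata));
    return metadata[3 + 2 * n];
}

/* Byte offset of the n-th element, stored at offset 3+2*n+1. */
uintptr_t tuple_element_offset(const uintptr_t *metadata, uintptr_t n) {
    assert(n < tuple_element_count(metadata));
    return metadata[3 + 2 * n + 1];
}
```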
|
|
|
|
Function Metadata
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, function metadata records
|
|
contain the following fields:
|
|
|
|
- The number of arguments to the function is stored at **offset 1**.
|
|
- A reference to the **result type** metadata record is stored at
|
|
**offset 2**. If the function has multiple returns, this references a
|
|
`tuple metadata`_ record.
|
|
- The **argument vector** begins at **offset 3** and consists of pointers to
|
|
metadata records of the function's arguments.
|
|
|
|
If the function takes any **inout** arguments, a pointer to each argument's
|
|
metadata record will be appended separately, the lowest bit being set if it is
|
|
**inout**. Because of pointer alignment, the lowest bit will always be free to
|
|
hold this tag.
|
|
|
|
If the function takes no **inout** arguments, there will be only one pointer in
|
|
the vector for the following cases:
|
|
|
|
* 0 arguments: a `tuple metadata`_ record for the empty tuple
|
|
* 1 argument: the first and only argument's metadata record
|
|
* >1 argument: a `tuple metadata`_ record containing the arguments
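The **inout** tagging described above relies on the low bit of an aligned pointer being zero. A minimal C sketch of setting and reading that tag, with hypothetical function names and the argument entries treated as raw words:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mark an argument's metadata address as inout by setting the low bit. */
uintptr_t tag_as_inout(uintptr_t metadata_address) {
    assert((metadata_address & 1) == 0);  /* pointer alignment keeps bit 0 free */
    return metadata_address | 1;
}

/* True if the argument vector entry describes an inout argument. */
bool argument_is_inout(uintptr_t entry) {
    return (entry & 1) != 0;
}

/* Strip the tag to recover the metadata address. */
uintptr_t argument_metadata_address(uintptr_t entry) {
    return entry & ~(uintptr_t)1;
}
```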
|
|
|
|
Protocol Metadata
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, protocol metadata records
|
|
contain the following fields:
|
|
|
|
- A **layout flags** word is stored at **offset 1**. The bits of this word
|
|
describe the `existential container layout`_ used to represent
|
|
values of the type. The word is laid out as follows:
|
|
|
|
* The **number of witness tables** is stored in the least significant 31 bits.
|
|
Values of the protocol type contain this number of witness table pointers
|
|
in their layout.
|
|
* The **class constraint** is stored at bit 31. This bit is set if the type
|
|
is **not** class-constrained, meaning that struct, enum, or class values
|
|
can be stored in the type. If not set, then only class values can be stored
|
|
in the type, and the type uses a more efficient layout.
|
|
|
|
Note that the field is pointer-sized, even though only the lowest 32 bits are
|
|
currently inhabited on all platforms. These values can be derived from the
|
|
`protocol descriptor`_ records, but are pre-calculated for convenience.
|
|
|
|
- The **number of protocols** that make up the protocol composition is stored at
|
|
**offset 2**. For the "any" types ``protocol<>`` or ``protocol<class>``, this
|
|
is zero. For a single-protocol type ``P``, this is one. For a protocol
|
|
composition type ``protocol<P, Q, ...>``, this is the number of protocols.
|
|
|
|
- The **protocol descriptor vector** begins at **offset 3**. This is an inline
|
|
array of pointers to the `protocol descriptor`_ for every protocol in the
|
|
composition, or the single protocol descriptor for a protocol type. For
|
|
an "any" type, there is no protocol descriptor vector.
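Decoding the layout flags word can be sketched in C as below. The function names are hypothetical. The container size in words is derived from the two C structs given under `existential container layout`_ and is an inference from this document, not a runtime API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WITNESS_TABLE_COUNT_MASK  UINT32_C(0x7FFFFFFF)  /* low 31 bits */
#define NOT_CLASS_CONSTRAINED_BIT UINT32_C(0x80000000)  /* bit 31 */

/* Number of witness table pointers in values of the protocol type. */
uint32_t witness_table_count(uintptr_t layout_flags) {
    return (uint32_t)layout_flags & WITNESS_TABLE_COUNT_MASK;
}

/* Bit 31 is set when the type is NOT class-constrained. */
bool is_class_constrained(uintptr_t layout_flags) {
    return ((uint32_t)layout_flags & NOT_CLASS_CONSTRAINED_BIT) == 0;
}

/* Size of the existential container in pointer-sized words. */
size_t existential_container_words(uintptr_t layout_flags) {
    size_t tables = witness_table_count(layout_flags);
    if (is_class_constrained(layout_flags)) {
        return 1 + tables;      /* heap object pointer + witness tables */
    }
    return 3 + 1 + tables;      /* buffer + type metadata + witness tables */
}
```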
|
|
|
|
Metatype Metadata
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
In addition to the `common metadata layout`_ fields, metatype metadata records
|
|
contain the following fields:
|
|
|
|
- A reference to the metadata record for the **instance type** that the metatype
|
|
represents is stored at **offset 1**.
|
|
|
|
Class Metadata
|
|
~~~~~~~~~~~~~~
|
|
|
|
Class metadata is designed to interoperate with Objective-C; all class metadata
|
|
records are also valid Objective-C ``Class`` objects. Class metadata pointers
|
|
are used as the values of class metatypes, so a derived class's metadata
|
|
record also serves as a valid class metatype value for all of its ancestor
|
|
classes.
|
|
|
|
- The **destructor pointer** is stored at **offset -2** from the metadata
|
|
pointer, behind the value witness table. This function is invoked by Swift's
|
|
deallocator when the class instance is destroyed.
|
|
- The **isa pointer** pointing to the class's Objective-C-compatible metaclass
|
|
record is stored at **offset 0**, in place of an integer kind discriminator.
|
|
- The **super pointer** pointing to the metadata record for the superclass is
|
|
stored at **offset 1**. If the class is a root class, it is null.
|
|
- Two words are reserved for use by the Objective-C runtime at **offset 2**
|
|
and **offset 3**.
|
|
- The **rodata pointer** is stored at **offset 4**; it points to an Objective-C
|
|
compatible rodata record for the class. This pointer value includes a tag.
|
|
The **low bit is always set to 1** for Swift classes and always set to 0 for
|
|
Objective-C classes.
|
|
- The **class flags** are a 32-bit field at **offset 5**.
|
|
- The **instance address point** is a 32-bit field following the class flags.
|
|
A pointer to an instance of this class points this number of bytes after the
|
|
beginning of the instance.
|
|
- The **instance size** is a 32-bit field following the instance address point.
|
|
This is the number of bytes of storage present in every object of this type.
|
|
- The **instance alignment mask** is a 16-bit field following the instance size.
|
|
This is a set of low bits which must not be set in a pointer to an instance
|
|
of this class.
|
|
- The **runtime-reserved field** is a 16-bit field following the instance
|
|
alignment mask. The compiler initializes this to zero.
|
|
- The **class object size** is a 32-bit field following the runtime-reserved
|
|
field. This is the total number of bytes of storage in the class metadata
|
|
object.
|
|
- The **class object address point** is a 32-bit field following the class
|
|
object size. This is the offset in bytes from the beginning of the class metadata
object to its address point, that is, to the address a metadata pointer refers to.
|
|
- The `nominal type descriptor`_ for the most-derived class type is referenced
|
|
at an offset immediately following the class object address point. This is
|
|
**offset 8** on a 64-bit platform or **offset 11** on a 32-bit platform.
|
|
- For each Swift class in the class's inheritance hierarchy, in order starting
|
|
from the root class and working down to the most derived class, the following
|
|
fields are present:
|
|
|
|
* First, a reference to the **parent** metadata record is stored.
|
|
For classes that are members of an enclosing nominal type, this is a
|
|
reference to the enclosing type's metadata. For top-level classes, this is
|
|
null.
|
|
|
|
TODO: The parent pointer is currently always null.
|
|
|
|
* If the class is generic, its `generic parameter vector`_ is stored inline.
|
|
* The **vtable** is stored inline and contains a function pointer to the
|
|
implementation of every method of the class in declaration order.
|
|
* If the layout of a class instance is dependent on its generic parameters,
|
|
then a **field offset vector** is stored inline, containing offsets in
|
|
bytes from an instance pointer to each field of the class in declaration
|
|
order. (For classes with fixed layout, the field offsets are accessible
|
|
statically from global variables, similar to Objective-C ivar offsets.)
|
|
|
|
Note that none of these fields are present for Objective-C base classes in
|
|
the inheritance hierarchy.
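The stated offsets of the nominal type descriptor follow from the sizes of the fixed-width fields that start at offset 5. The C sketch below checks that arithmetic; the struct and function names are hypothetical and only mirror the field list above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The fixed-width fields that follow the rodata pointer, in order. */
typedef struct {
    uint32_t class_flags;
    uint32_t instance_address_point;
    uint32_t instance_size;
    uint16_t instance_alignment_mask;
    uint16_t runtime_reserved;
    uint32_t class_object_size;
    uint32_t class_object_address_point;
} ClassFixedWidthFields;

_Static_assert(sizeof(ClassFixedWidthFields) == 24,
               "the fixed-width fields occupy 24 bytes");

enum { FIRST_FIXED_WIDTH_WORD = 5 };  /* class flags begin at offset 5 */

/* Offset, in pointer-sized words, of the nominal type descriptor. */
size_t nominal_type_descriptor_offset(size_t pointer_size) {
    return FIRST_FIXED_WIDTH_WORD + sizeof(ClassFixedWidthFields) / pointer_size;
}

/* The low bit of the rodata pointer is 1 for Swift classes. */
bool rodata_marks_swift_class(uintptr_t rodata_word) {
    return (rodata_word & 1) != 0;
}
```

With 8-byte pointers the 24 bytes span three words, giving offset 8; with 4-byte pointers they span six words, giving offset 11.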
|
|
|
|
Generic Parameter Vector
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Metadata records for instances of generic types contain information about their
|
|
generic parameters. For each parameter of the type, a reference to the metadata
|
|
record for the type argument is stored. After all of the type argument
|
|
metadata references, for each type parameter, if there are protocol
|
|
requirements on that type parameter, a reference to the witness table for each
|
|
protocol it is required to conform to is stored in declaration order.
|
|
|
|
For example, given a generic type with the parameters ``<T, U, V>``, its
|
|
generic parameter record will consist of references to the metadata records
|
|
for ``T``, ``U``, and ``V`` in succession, as if laid out in a C struct::
|
|
|
|
  struct GenericParameterVector {
    TypeMetadata *T, *U, *V;
  };
|
|
|
|
If we add protocol requirements to the parameters, for example,
|
|
``<T: Runcible, U: protocol<Fungible, Ansible>, V>``, then the type's generic
|
|
parameter vector contains witness tables for those protocols, as if laid out::
|
|
|
|
  struct GenericParameterVector {
    TypeMetadata *T, *U, *V;
    RuncibleWitnessTable *T_Runcible;
    FungibleWitnessTable *U_Fungible;
    AnsibleWitnessTable *U_Ansible;
  };
|
|
|
|
Nominal Type Descriptor
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
The metadata records for class, struct, and enum types contain a pointer to a
|
|
**nominal type descriptor**, which contains basic information about the nominal
|
|
type such as its name, members, and metadata layout. For a generic type, one
|
|
nominal type descriptor is shared for all instantiations of the type. The
|
|
layout is as follows:
|
|
|
|
- The **kind** of type is stored at **offset 0**, which is as follows:
|
|
|
|
* **0** for a class,
|
|
* **1** for a struct, or
|
|
* **2** for an enum.
|
|
|
|
- The mangled **name** is referenced as a null-terminated C string at
|
|
**offset 1**. This name includes no bound generic parameters.
|
|
- The following four fields depend on the kind of nominal type.
|
|
|
|
* For a struct or class:
|
|
|
|
+ The **number of fields** is stored at **offset 2**. This is the length
|
|
of the field offset vector in the metadata record, if any.
|
|
+ The **offset to the field offset vector** is stored at **offset 3**.
|
|
This is the offset in pointer-sized words of the field offset vector for
|
|
the type in the metadata record. If no field offset vector is stored
|
|
in the metadata record, this is zero.
|
|
+ The **field names** are referenced as a doubly-null-terminated list of
|
|
C strings at **offset 4**. The order of names corresponds to the order
|
|
of fields in the field offset vector.
|
|
+ The **field type accessor** is a function pointer at **offset 5**. If
|
|
non-null, the function takes a pointer to an instance of type metadata
|
|
for the nominal type, and returns a pointer to an array of type metadata
|
|
references for the types of the fields of that instance. The order matches
|
|
that of the field offset vector and field name list.
|
|
|
|
* For an enum:
|
|
|
|
+ The **number of payload cases** and **payload size offset** are stored
|
|
at **offset 2**. The least significant 24 bits are the number of payload
|
|
cases, and the most significant 8 bits are the offset of the payload
|
|
size in the type metadata, if present.
|
|
+ The **number of no-payload cases** is stored at **offset 3**.
|
|
+ The **case names** are referenced as a doubly-null-terminated list of
|
|
C strings at **offset 4**. The names are ordered such that payload cases
|
|
come first, followed by no-payload cases. Within each half of the list,
|
|
the order of names corresponds to the order of cases in the enum
|
|
declaration.
|
|
+ The **case type accessor** is a function pointer at **offset 5**. If
|
|
non-null, the function takes a pointer to an instance of type metadata
|
|
for the enum, and returns a pointer to an array of type metadata
|
|
references for the types of the cases of that instance. The order matches
|
|
that of the case name list. This function is similar to the field type
|
|
accessor for a struct, except that the least significant bit of each
element in the result is also set if the enum case is an **indirect case**.
|
|
|
|
- If the nominal type is generic, a pointer to the **metadata pattern** that
|
|
is used to form instances of the type is stored at **offset 6**. The pointer
|
|
is null if the type is not generic.
|
|
|
|
- The **generic parameter descriptor** begins at **offset 7**. This describes
|
|
the layout of the generic parameter vector in the metadata record:
|
|
|
|
* The **offset of the generic parameter vector** is stored at **offset 7**.
|
|
This is the offset in pointer-sized words of the generic parameter vector
|
|
inside the metadata record. If the type is not generic, this is zero.
|
|
* The **total number of generic parameters** is stored at **offset 8**. This count
includes associated types of type parameters with protocol constraints.
* The **number of primary type parameters** is stored at **offset 9**. This count
includes only the primary formal type parameters.
|
|
* For each type parameter **n**, the following fields are stored:
|
|
|
|
+ The **number of witnesses** for the type parameter is stored at
|
|
**offset 10+n**. This is the number of witness table pointers that are
|
|
stored for the type parameter in the generic parameter vector.
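The field name and case name lists above are doubly-null-terminated lists of C strings. A minimal C sketch of walking such a list follows; the function names are hypothetical, and the sketch assumes that no name in the list is itself empty.

```c
#include <stddef.h>
#include <string.h>

/* Count the strings in a doubly-null-terminated list. */
size_t count_names(const char *names) {
    size_t count = 0;
    while (*names != '\0') {
        names += strlen(names) + 1;  /* skip the string and its terminator */
        count++;
    }
    return count;
}

/* Return the index-th name, or NULL if the list is shorter than that. */
const char *name_at(const char *names, size_t index) {
    while (*names != '\0') {
        if (index == 0) {
            return names;
        }
        names += strlen(names) + 1;
        index--;
    }
    return NULL;
}
```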
|
|
|
|
Note that there is no nominal type descriptor for protocols or protocol types.
|
|
See the `protocol descriptor`_ description below.
|
|
|
|
Protocol Descriptor
|
|
~~~~~~~~~~~~~~~~~~~
|
|
|
|
`Protocol metadata`_ contains references to zero, one, or more **protocol
descriptors** that describe the protocols that values of the type are required to
conform to. The protocol descriptor is laid out to be compatible with
|
|
Objective-C ``Protocol`` objects. The layout is as follows:
|
|
|
|
- An **isa** placeholder is stored at **offset 0**. This field is populated by
  the Objective-C runtime.
- The mangled **name** is referenced as a null-terminated C string at
  **offset 1**.
- If the protocol inherits one or more other protocols, a pointer to the
  **inherited protocols list** is stored at **offset 2**. The list starts with
  the number of inherited protocols as a pointer-sized integer, and is followed
  by that many protocol descriptor pointers. If the protocol inherits no other
  protocols, this pointer is null.
- For an ObjC-compatible protocol, its **required instance methods** are stored
  at **offset 3** as an ObjC-compatible method list. This is null for native
  Swift protocols.
- For an ObjC-compatible protocol, its **required class methods** are stored
  at **offset 4** as an ObjC-compatible method list. This is null for native
  Swift protocols.
- For an ObjC-compatible protocol, its **optional instance methods** are stored
  at **offset 5** as an ObjC-compatible method list. This is null for native
  Swift protocols.
- For an ObjC-compatible protocol, its **optional class methods** are stored
  at **offset 6** as an ObjC-compatible method list. This is null for native
  Swift protocols.
- For an ObjC-compatible protocol, its **instance properties** are stored
  at **offset 7** as an ObjC-compatible property list. This is null for native
  Swift protocols.
- The **size** of the protocol descriptor record is stored as a 32-bit integer
  at **offset 8**. This is currently 72 on 64-bit platforms and 40 on 32-bit
  platforms.
- **Flags** are stored as a 32-bit integer after the size. The following bits
  are currently used (counting from least significant bit zero):

  * **Bit 0** is the **Swift bit**. It is set for all protocols defined in
    Swift and unset for protocols defined in Objective-C.
  * **Bit 1** is the **class constraint bit**. It is set if the protocol is
    **not** class-constrained, meaning that any struct, enum, or class type
    may conform to the protocol. It is unset if only classes can conform to
    the protocol. (The inverted meaning is for compatibility with Objective-C
    protocol records, in which the bit is never set. Objective-C protocols can
    only be conformed to by classes.)
  * **Bit 2** is the **witness table bit**. It is set if dispatch to the
    protocol's methods is done through a witness table, which is either passed
    as an extra parameter to generic functions or included in the `existential
    container layout`_ of protocol types. It is unset if dispatch is done
    through ``objc_msgSend`` and requires no additional information to accompany
    a value of conforming type.
  * **Bit 31** is set by the Objective-C runtime when it has done its
    initialization of the protocol record. It is unused by the Swift runtime.

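The flag bits and the record size described above can be checked with a small
Python sketch. It is illustrative only and is not part of the runtime: the
names ``ProtocolFlags``, ``decode_protocol_flags``, and ``descriptor_size`` are
hypothetical, and the sketch assumes that the offsets above count pointer-sized
words, so that the record consists of eight pointer fields followed by the two
32-bit integers.

```python
from typing import NamedTuple

# Assumption: offsets 0 through 7 are pointer-sized fields (isa, name,
# inherited protocols, four method lists, properties).
POINTER_FIELDS = 8


class ProtocolFlags(NamedTuple):
    is_swift: bool
    is_class_constrained: bool
    uses_witness_table: bool
    objc_runtime_initialized: bool


def decode_protocol_flags(flags: int) -> ProtocolFlags:
    """Decode the 32-bit flags word of a protocol descriptor."""
    return ProtocolFlags(
        is_swift=bool(flags & (1 << 0)),
        # Bit 1 has an inverted meaning: set means NOT class-constrained.
        is_class_constrained=not (flags & (1 << 1)),
        uses_witness_table=bool(flags & (1 << 2)),
        objc_runtime_initialized=bool(flags & (1 << 31)),
    )


def descriptor_size(pointer_size: int) -> int:
    """Size in bytes: the pointer fields plus the 32-bit size and flags."""
    return POINTER_FIELDS * pointer_size + 4 + 4
```

Under these assumptions ``descriptor_size(8)`` and ``descriptor_size(4)``
reproduce the 72-byte and 40-byte sizes quoted above, and a flags word of zero
decodes as an Objective-C, class-constrained protocol dispatched through
``objc_msgSend``.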

Heap Objects
------------

Heap Metadata
~~~~~~~~~~~~~

Heap Object Header
~~~~~~~~~~~~~~~~~~

Mangling
--------
::

  mangled-name ::= '_T' global

All Swift-mangled names begin with this prefix.

Globals
~~~~~~~

::

  global ::= 't' type // standalone type (for DWARF)
  global ::= 'M' type // type metadata (address point)
                      // -- type starts with [BCOSTV]
  global ::= 'Mf' type // 'full' type metadata (start of object)
  global ::= 'MP' type // type metadata pattern
  global ::= 'Ma' type // type metadata access function
  global ::= 'ML' type // type metadata lazy cache variable
  global ::= 'Mm' type // class metaclass
  global ::= 'Mn' nominal-type // nominal type descriptor
  global ::= 'Mp' protocol // protocol descriptor
  global ::= 'PA' .* // partial application forwarder
  global ::= 'PAo' .* // ObjC partial application forwarder
  global ::= 'w' value-witness-kind type // value witness
  global ::= 'Wa' protocol-conformance // protocol witness table accessor
  global ::= 'WG' protocol-conformance // generic protocol witness table
  global ::= 'WI' protocol-conformance // generic protocol witness table instantiation function
  global ::= 'Wl' type protocol-conformance // lazy protocol witness table accessor
  global ::= 'WL' protocol-conformance // lazy protocol witness table cache variable
  global ::= 'Wo' entity // witness table offset
  global ::= 'WP' protocol-conformance // protocol witness table
  global ::= 'Wt' protocol-conformance identifier // associated type metadata accessor
  global ::= 'WT' protocol-conformance identifier nominal-type // associated type witness table accessor
  global ::= 'Wv' directness entity // field offset
  global ::= 'WV' type // value witness table
  global ::= entity // some identifiable thing
  global ::= 'TO' global // ObjC-as-swift thunk
  global ::= 'To' global // swift-as-ObjC thunk
  global ::= 'TD' global // dynamic dispatch thunk
  global ::= 'Td' global // direct method reference thunk
  global ::= 'TR' reabstract-signature // reabstraction thunk helper function
  global ::= 'Tr' reabstract-signature // reabstraction thunk

  global ::= 'TS' specializationinfo '_' mangled-name
  specializationinfo ::= 'g' passid (type protocol-conformance* '_')+ // Generic specialization info.
  specializationinfo ::= 'f' passid (funcsigspecializationarginfo '_')+ // Function signature specialization kind
  passid ::= integer // The id of the pass that generated this specialization.
  funcsigspecializationarginfo ::= 'cl' closurename type* // Closure specialized with closed over types in argument order.
  funcsigspecializationarginfo ::= 'n' // Unmodified argument
  funcsigspecializationarginfo ::= 'cp' funcsigspecializationconstantpropinfo // Constant propagated argument
  funcsigspecializationarginfo ::= 'd' // Dead argument
  funcsigspecializationarginfo ::= 'g' 's'? // Owned => Guaranteed and Exploded if 's' present.
  funcsigspecializationarginfo ::= 's' // Exploded
  funcsigspecializationarginfo ::= 'k' // Exploded
  funcsigspecializationconstantpropinfo ::= 'fr' mangled-name
  funcsigspecializationconstantpropinfo ::= 'g' mangled-name
  funcsigspecializationconstantpropinfo ::= 'i' 64-bit-integer
  funcsigspecializationconstantpropinfo ::= 'fl' float-as-64-bit-integer
  funcsigspecializationconstantpropinfo ::= 'se' stringencoding 'v' md5hash

  global ::= 'TV' global // vtable override thunk
  global ::= 'TW' protocol-conformance entity // protocol witness thunk
  entity ::= nominal-type // named type declaration
  entity ::= static? entity-kind context entity-name
  entity-kind ::= 'F' // function (ctor, accessor, etc.)
  entity-kind ::= 'v' // variable (let/var)
  entity-kind ::= 'i' // subscript ('i'ndex) itself (not the individual accessors)
  entity-kind ::= 'I' // initializer
  entity-name ::= decl-name type // named declaration
  entity-name ::= 'A' index // default argument generator
  entity-name ::= 'a' addressor-kind decl-name type // mutable addressor
  entity-name ::= 'C' type // allocating constructor
  entity-name ::= 'c' type // non-allocating constructor
  entity-name ::= 'D' // deallocating destructor; untyped
  entity-name ::= 'd' // non-deallocating destructor; untyped
  entity-name ::= 'g' decl-name type // getter
  entity-name ::= 'i' // non-local variable initializer
  entity-name ::= 'l' addressor-kind decl-name type // non-mutable addressor
  entity-name ::= 'm' decl-name type // materializeForSet
  entity-name ::= 's' decl-name type // setter
  entity-name ::= 'U' index type // explicit anonymous closure expression
  entity-name ::= 'u' index type // implicit anonymous closure
  entity-name ::= 'w' decl-name type // willSet
  entity-name ::= 'W' decl-name type // didSet
  static ::= 'Z' // entity is a static member of a type
  decl-name ::= identifier
  decl-name ::= local-decl-name
  decl-name ::= private-decl-name
  local-decl-name ::= 'L' index identifier // locally-discriminated declaration
  private-decl-name ::= 'P' identifier identifier // file-discriminated declaration
  reabstract-signature ::= ('G' generic-signature)? type type
  addressor-kind ::= 'u' // unsafe addressor (no owner)
  addressor-kind ::= 'O' // owning addressor (non-native owner)
  addressor-kind ::= 'o' // owning addressor (native owner)
  addressor-kind ::= 'p' // pinning addressor (native owner)

An ``entity`` starts with a ``nominal-type-kind`` (``[COPV]``), a
substitution (``[Ss]``) of a nominal type, or an ``entity-kind``
(``[FIiv]``).

An ``entity-name`` starts with ``[AaCcDdgilmsUuWw]`` or a ``decl-name``.
A ``decl-name`` starts with ``[LP]`` or an ``identifier`` (``[0-9oX]``).

A ``context`` starts with either an ``entity``, an ``extension`` (which starts
with ``[Ee]``), or a ``module``, which might be an ``identifier`` (``[0-9oX]``)
or a substitution of a module (``[Ss]``).

A global mangling starts with an ``entity`` or ``[MTWw]``.

If a partial application forwarder is for a static symbol, its name will
start with the sequence ``_TPA_`` followed by the mangled symbol name of the
forwarder's destination.

A generic specialization mangling consists of a header, specifying the types
and conformances used to specialize the generic function, followed by the
full mangled name of the original unspecialized generic symbol.

The first identifier in a ``<private-decl-name>`` is a string that represents
the file the original declaration came from. It should be considered unique
within the enclosing module. The second identifier is the name of the entity.

Not all declarations marked ``private`` will use the
``<private-decl-name>`` mangling; if the entity's context is enough to uniquely
identify the entity, the simple ``identifier`` form is preferred.

The types in a ``<reabstract-signature>`` are always non-polymorphic
``<impl-function-type>`` types.

Direct and Indirect Symbols
~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

  directness ::= 'd' // direct
  directness ::= 'i' // indirect

A direct symbol resolves directly to the address of an object. An
indirect symbol resolves to the address of a pointer to the object.
They are distinct manglings to make a certain class of bugs
immediately obvious.

The terminology is slightly overloaded when discussing offsets. A
direct offset resolves to a variable holding the true offset. An
indirect offset resolves to a variable holding an offset to be applied
to type metadata to get the address of the true offset. (Offset
variables are required when the object being accessed lies within a
resilient structure. When the layout of the object may depend on
generic arguments, these offsets must be kept in metadata. Indirect
field offsets are therefore required when accessing fields in generic
types where the metadata itself has unknown layout.)

Declaration Contexts
~~~~~~~~~~~~~~~~~~~~

::

  context ::= module
  context ::= extension
  context ::= entity
  module ::= substitution // other substitution
  module ::= identifier // module name
  module ::= known-module // abbreviation
  extension ::= 'E' module entity
  extension ::= 'e' module generic-signature entity

These manglings identify the enclosing context in which an entity was declared,
such as its enclosing module, function, or nominal type.

An ``extension`` mangling is used whenever an entity's declaration context is
an extension *and* the entity being extended is in a different module. In this
case the extension's module is mangled first, followed by the entity being
extended. If the extension and the extended entity are in the same module, the
plain ``entity`` mangling is preferred. If the extension is constrained, the
constraints on the extension are mangled in its generic signature.

When mangling the context of a local entity within a constructor or
destructor, the non-allocating or non-deallocating variant is used.

Types
~~~~~

::

  type ::= 'Bb' // Builtin.BridgeObject
  type ::= 'BB' // Builtin.UnsafeValueBuffer
  type ::= 'Bf' natural '_' // Builtin.Float<n>
  type ::= 'Bi' natural '_' // Builtin.Int<n>
  type ::= 'BO' // Builtin.ObjCPointer
  type ::= 'Bo' // Builtin.ObjectPointer
  type ::= 'Bp' // Builtin.RawPointer
  type ::= 'Bv' natural type // Builtin.Vec<n>x<type>
  type ::= 'Bw' // Builtin.Word
  type ::= nominal-type
  type ::= associated-type
  type ::= 'a' context identifier // Type alias (DWARF only)
  type ::= 'b' type type // objc block function type
  type ::= 'c' type type // C function pointer type
  type ::= 'F' throws-annotation? type type // function type
  type ::= 'f' throws-annotation? type type // uncurried function type
  type ::= 'G' type <type>+ '_' // generic type application
  type ::= 'K' type type // @auto_closure function type
  type ::= 'M' type // metatype without representation
  type ::= 'XM' metatype-repr type // metatype with representation
  type ::= 'P' protocol-list '_' // protocol type
  type ::= 'PM' type // existential metatype without representation
  type ::= 'XPM' metatype-repr type // existential metatype with representation
  type ::= archetype
  type ::= 'R' type // inout
  type ::= 'T' tuple-element* '_' // tuple
  type ::= 't' tuple-element* '_' // variadic tuple
  type ::= 'Xo' type // @unowned type
  type ::= 'Xu' type // @unowned(unsafe) type
  type ::= 'Xw' type // @weak type
  type ::= 'XF' impl-function-type // function implementation type
  type ::= 'Xf' type type // @thin function type
  nominal-type ::= known-nominal-type
  nominal-type ::= substitution
  nominal-type ::= nominal-type-kind declaration-name
  nominal-type-kind ::= 'C' // class
  nominal-type-kind ::= 'O' // enum
  nominal-type-kind ::= 'V' // struct
  declaration-name ::= context decl-name
  archetype ::= 'Q' index // archetype with depth=0, idx=N
  archetype ::= 'Qd' index index // archetype with depth=M+1, idx=N
  archetype ::= associated-type
  archetype ::= qualified-archetype
  associated-type ::= substitution
  associated-type ::= 'Q' protocol-context // self type of protocol
  associated-type ::= 'Q' archetype identifier // associated type
  qualified-archetype ::= 'Qq' index context // archetype+context (DWARF only)
  protocol-context ::= 'P' protocol
  tuple-element ::= identifier? type
  metatype-repr ::= 't' // Thin metatype representation
  metatype-repr ::= 'T' // Thick metatype representation
  metatype-repr ::= 'o' // ObjC metatype representation
  throws-annotation ::= 'z' // 'throws' annotation on function types


  type ::= 'u' generic-signature type // generic type
  type ::= 'x' // generic param, depth=0, idx=0
  type ::= 'q' generic-param-index // dependent generic parameter
  type ::= 'q' type assoc-type-name // associated type of non-generic param
  type ::= 'w' generic-param-index assoc-type-name // associated type
  type ::= 'W' generic-param-index assoc-type-name+ '_' // associated type at depth

  generic-param-index ::= 'x' // depth = 0, idx = 0
  generic-param-index ::= index // depth = 0, idx = N+1
  generic-param-index ::= 'd' index index // depth = M+1, idx = N

``<type>`` never begins or ends with a number.
``<type>`` never begins with an underscore.
``<type>`` never begins with ``d``.
``<type>`` never begins with ``z``.

Note that protocols mangle differently as types and as contexts. A protocol
context always consists of a single protocol name and so mangles without a
trailing underscore. A protocol type can have zero, one, or many protocol bounds
which are juxtaposed and terminated with a trailing underscore.

::

  assoc-type-name ::= ('P' protocol-name)? identifier
  assoc-type-name ::= substitution

Associated types use an abbreviated mangling when the base generic parameter
or associated type is constrained by a single protocol requirement. The
associated type in this case can be referenced unambiguously by name alone.
If the base has multiple conformance constraints, then the protocol name is
mangled in to disambiguate.

::

  impl-function-type ::=
    impl-callee-convention impl-function-attribute* generic-signature? '_'
    impl-parameter* '_' impl-result* '_'
  impl-callee-convention ::= 't' // thin
  impl-callee-convention ::= impl-convention // thick, callee transferred with given convention
  impl-convention ::= 'a' // direct, autoreleased
  impl-convention ::= 'd' // direct, no ownership transfer
  impl-convention ::= 'D' // direct, no ownership transfer,
                          // dependent on 'self' parameter
  impl-convention ::= 'g' // direct, guaranteed
  impl-convention ::= 'e' // direct, deallocating
  impl-convention ::= 'i' // indirect, ownership transfer
  impl-convention ::= 'l' // indirect, inout
  impl-convention ::= 'G' // indirect, guaranteed
  impl-convention ::= 'o' // direct, ownership transfer
  impl-convention ::= 'z' impl-convention // error result
  impl-function-attribute ::= 'Cb' // compatible with C block invocation function
  impl-function-attribute ::= 'Cc' // compatible with C global function
  impl-function-attribute ::= 'Cm' // compatible with Swift method
  impl-function-attribute ::= 'CO' // compatible with ObjC method
  impl-function-attribute ::= 'Cw' // compatible with protocol witness
  impl-function-attribute ::= 'N' // noreturn
  impl-function-attribute ::= 'G' // generic
  impl-parameter ::= impl-convention type
  impl-result ::= impl-convention type

For the most part, manglings follow the structure of formal language
types. However, in some cases it is more useful to encode the exact
implementation details of a function type.

Any ``<impl-function-attribute>`` productions must appear in the order
in which they are specified above: e.g. a noreturn C function is
mangled with ``CcN``.

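The ordering rule for ``<impl-function-attribute>`` can be illustrated with a
short Python sketch. It is not the compiler's implementation; the names
``ATTRIBUTE_ORDER`` and ``mangle_impl_function_attributes`` are hypothetical,
and the sketch assumes a caller simply supplies the set of attribute codes that
apply to a function type.

```python
# Attribute codes in the order the grammar above lists them.
ATTRIBUTE_ORDER = ["Cb", "Cc", "Cm", "CO", "Cw", "N", "G"]


def mangle_impl_function_attributes(attributes):
    """Concatenate the given attribute codes in grammar order."""
    unknown = set(attributes) - set(ATTRIBUTE_ORDER)
    if unknown:
        raise ValueError(f"unknown attribute codes: {sorted(unknown)}")
    return "".join(code for code in ATTRIBUTE_ORDER if code in attributes)
```

Passing ``{"N", "Cc"}`` yields ``CcN`` regardless of the order in which the
caller listed the attributes, which matches the noreturn C function example.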

Note that the convention and function-attribute productions do not
need to be disambiguated from the start of a ``<type>``.

Generics
~~~~~~~~

::

  protocol-conformance ::= ('u' generic-signature)? type protocol module

``<protocol-conformance>`` refers to a type's conformance to a protocol. The
named module is the one containing the extension or type declaration that
declared the conformance.

::

  generic-signature ::= (generic-param-count+)? ('R' requirement*)? 'r'
  generic-param-count ::= 'z' // zero parameters
  generic-param-count ::= index // N+1 parameters
  requirement ::= type-param protocol-name // protocol requirement
  requirement ::= type-param type // base class requirement
                                  // type starts with [CS]
  requirement ::= type-param 'z' type // 'z'ame-type requirement

  // Special type mangling for type params that saves the initial 'q' on
  // generic params
  type-param ::= generic-param-index // generic parameter
  type-param ::= 'w' generic-param-index assoc-type-name // associated type
  type-param ::= 'W' generic-param-index assoc-type-name+ '_'

A generic signature begins by describing the number of generic parameters at
each depth of the signature, followed by the requirements. As a special case,
the absence of any ``generic-param-count`` values indicates a single generic
parameter at the outermost depth::

  urFq_q_ // <T_0_0> T_0_0 -> T_0_0
  u_0_rFq_qd_0_ // <T_0_0><T_1_0, T_1_1> T_0_0 -> T_1_1

Value Witnesses
~~~~~~~~~~~~~~~

TODO: document these

::

  value-witness-kind ::= 'al' // allocateBuffer
  value-witness-kind ::= 'ca' // assignWithCopy
  value-witness-kind ::= 'ta' // assignWithTake
  value-witness-kind ::= 'de' // deallocateBuffer
  value-witness-kind ::= 'xx' // destroy
  value-witness-kind ::= 'XX' // destroyBuffer
  value-witness-kind ::= 'Xx' // destroyArray
  value-witness-kind ::= 'CP' // initializeBufferWithCopyOfBuffer
  value-witness-kind ::= 'Cp' // initializeBufferWithCopy
  value-witness-kind ::= 'cp' // initializeWithCopy
  value-witness-kind ::= 'TK' // initializeBufferWithTakeOfBuffer
  value-witness-kind ::= 'Tk' // initializeBufferWithTake
  value-witness-kind ::= 'tk' // initializeWithTake
  value-witness-kind ::= 'pr' // projectBuffer
  value-witness-kind ::= 'xs' // storeExtraInhabitant
  value-witness-kind ::= 'xg' // getExtraInhabitantIndex
  value-witness-kind ::= 'Cc' // initializeArrayWithCopy
  value-witness-kind ::= 'Tt' // initializeArrayWithTakeFrontToBack
  value-witness-kind ::= 'tT' // initializeArrayWithTakeBackToFront
  value-witness-kind ::= 'ug' // getEnumTag
  value-witness-kind ::= 'up' // destructiveProjectEnumData
  value-witness-kind ::= 'ui' // destructiveInjectEnumTag

``<value-witness-kind>`` differentiates the kinds of value
witness functions for a type.

Identifiers
~~~~~~~~~~~

::

  identifier ::= natural identifier-start-char identifier-char*
  identifier ::= 'o' operator-fixity natural operator-char+

  operator-fixity ::= 'p' // prefix operator
  operator-fixity ::= 'P' // postfix operator
  operator-fixity ::= 'i' // infix operator

  operator-char ::= 'a' // & 'and'
  operator-char ::= 'c' // @ 'commercial at'
  operator-char ::= 'd' // / 'divide'
  operator-char ::= 'e' // = 'equals'
  operator-char ::= 'g' // > 'greater'
  operator-char ::= 'l' // < 'less'
  operator-char ::= 'm' // * 'multiply'
  operator-char ::= 'n' // ! 'not'
  operator-char ::= 'o' // | 'or'
  operator-char ::= 'p' // + 'plus'
  operator-char ::= 'q' // ? 'question'
  operator-char ::= 'r' // % 'remainder'
  operator-char ::= 's' // - 'subtract'
  operator-char ::= 't' // ~ 'tilde'
  operator-char ::= 'x' // ^ 'xor'
  operator-char ::= 'z' // . 'zperiod'

``<identifier>`` is run-length encoded: the natural indicates how many
characters follow. Operator characters are mapped to letter characters as
given. In neither case can an identifier start with a digit, so
there's no ambiguity with the run-length.

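The run-length encoding of ASCII identifiers and operators can be sketched in
Python as follows. This is an illustration of the grammar above rather than
the compiler's mangler; the names ``OPERATOR_CHARS``, ``FIXITY``,
``mangle_identifier``, and ``mangle_operator`` are hypothetical.

```python
# Operator character to letter mapping, copied from the grammar above.
OPERATOR_CHARS = {
    "&": "a", "@": "c", "/": "d", "=": "e", ">": "g", "<": "l",
    "*": "m", "!": "n", "|": "o", "+": "p", "?": "q", "%": "r",
    "-": "s", "~": "t", "^": "x", ".": "z",
}

FIXITY = {"prefix": "p", "postfix": "P", "infix": "i"}


def mangle_identifier(name: str) -> str:
    """Run-length encode an ASCII identifier: natural, then the characters."""
    if not name or not name.isascii() or name[0].isdigit():
        raise ValueError("expected an ASCII identifier not starting with a digit")
    return f"{len(name)}{name}"


def mangle_operator(operator: str, fixity: str) -> str:
    """Mangle an ASCII operator: 'o', fixity, run length, mapped letters."""
    letters = "".join(OPERATOR_CHARS[ch] for ch in operator)
    return f"o{FIXITY[fixity]}{len(letters)}{letters}"
```

For instance, ``mangle_identifier("zippity")`` gives ``7zippity`` and
``mangle_operator("==", "infix")`` gives ``oi2ee``. Because the mapped
operator letters never include a digit, the run length can always be separated
from the characters that follow it.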

::

  identifier ::= 'X' natural identifier-start-char identifier-char*
  identifier ::= 'X' 'o' operator-fixity natural identifier-char*

Identifiers that contain non-ASCII characters are encoded using the Punycode
algorithm specified in RFC 3492, with the modifications that ``_`` is used
as the encoding delimiter, and uppercase letters A through J are used in place
of digits 0 through 9 in the encoding character set. The mangling then
consists of an ``X`` followed by the run length of the encoded string and the
encoded string itself. For example, the identifier ``vergüenza`` is mangled
to ``X12vergenza_JFa``. (The encoding in standard Punycode would be
``vergenza-95a``.)

Operators that contain non-ASCII characters are mangled by first mapping the
ASCII operator characters to letters as for pure ASCII operator names, then
Punycode-encoding the substituted string. The mangling then consists of
``Xo`` followed by the fixity, run length of the encoded string, and the encoded
string itself. For example, the infix operator ``«+»`` is mangled to
``Xoi7p_qcaDc`` (``p_qcaDc`` being the encoding of the substituted
string ``«p»``).

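The two modifications to Punycode can be reproduced on top of the standard
``punycode`` codec in Python. The sketch below is an approximation under the
assumption that the delimiter and the digit alphabet are the only differences
from RFC 3492, as stated above; the function names and the ``OPERATOR_CHARS``
table are hypothetical and are not the compiler's API.

```python
# Operator character to letter mapping, as listed in the identifier grammar.
OPERATOR_CHARS = {
    "&": "a", "@": "c", "/": "d", "=": "e", ">": "g", "<": "l",
    "*": "m", "!": "n", "|": "o", "+": "p", "?": "q", "%": "r",
    "-": "s", "~": "t", "^": "x", ".": "z",
}


def swift_punycode(text: str) -> str:
    """Punycode-encode text, then apply the '_' and A-J modifications."""
    standard = text.encode("punycode").decode("ascii")
    # Everything after the last '-' is the encoded (non-basic) part.
    basic, delimiter, encoded = standard.rpartition("-")
    encoded = "".join(
        chr(ord("A") + int(ch)) if ch.isdigit() else ch for ch in encoded
    )
    return (basic + "_" if delimiter else "") + encoded


def mangle_non_ascii_identifier(name: str) -> str:
    encoded = swift_punycode(name)
    return f"X{len(encoded)}{encoded}"


def mangle_non_ascii_operator(operator: str, fixity_code: str) -> str:
    # Map ASCII operator characters to letters first, then encode.
    substituted = "".join(OPERATOR_CHARS.get(ch, ch) for ch in operator)
    encoded = swift_punycode(substituted)
    return f"Xo{fixity_code}{len(encoded)}{encoded}"
```

With these definitions, ``mangle_non_ascii_identifier("vergüenza")`` yields
``X12vergenza_JFa`` and ``mangle_non_ascii_operator("«+»", "i")`` yields
``Xoi7p_qcaDc``, matching the two examples in the text.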

Substitutions
~~~~~~~~~~~~~

::

  substitution ::= 'S' index

``<substitution>`` is a back-reference to a previously mangled entity. The mangling
algorithm maintains a mapping of entities to substitution indices as it runs.
When an entity that can be represented by a substitution (a module, nominal
type, or protocol) is mangled, the substitution map is consulted first; if the
entity is present, it is mangled using the associated substitution index.
Otherwise, the entity is mangled normally, and it is then added to the
substitution map and associated with the next available substitution index.

For example, in mangling a function type
``(zim.zang.zung, zim.zang.zung, zim.zippity) -> zim.zang.zoo`` (with module
``zim`` and class ``zim.zang``),
the recurring contexts ``zim``, ``zim.zang``, and ``zim.zang.zung``
will be mangled using substitutions after being mangled
for the first time. The first argument type will mangle in long form,
``CC3zim4zang4zung``, and in doing so, ``zim`` will acquire substitution ``S_``,
``zim.zang`` will acquire substitution ``S0_``, and ``zim.zang.zung`` will
acquire ``S1_``. The second argument is the same as the first and will mangle
using its substitution, ``S1_``. The
third argument type will mangle using the substitution for ``zim``,
``CS_7zippity``. (It also acquires substitution ``S2_`` which would be used
if it mangled again.) The result type will mangle using the substitution for
``zim.zang``, ``CS0_3zoo`` (and acquire substitution ``S3_``). The full
function type thus mangles as ``fTCC3zim4zang4zungS1_CS_7zippity_CS0_3zoo``.

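The worked example can be replayed with a toy mangler in Python. This is a
simplified sketch and not the compiler's algorithm: it models only modules and
classes, it represents an entity as a tuple of names whose first element is
the module, and the class ``ToyMangler`` and its methods are hypothetical.

```python
def encode_index(value: int) -> str:
    """index ::= '_' for 0, natural '_' for N+1."""
    return "_" if value == 0 else f"{value - 1}_"


class ToyMangler:
    def __init__(self):
        # Maps an entity path to its substitution index, in order of first use.
        self.substitutions = {}

    def _lookup(self, path):
        if path in self.substitutions:
            return "S" + encode_index(self.substitutions[path])
        return None

    def _remember(self, path):
        self.substitutions[path] = len(self.substitutions)

    def mangle_context(self, path):
        found = self._lookup(path)
        if found is not None:
            return found
        if len(path) == 1:  # a module: plain run-length encoded identifier
            self._remember(path)
            return f"{len(path[0])}{path[0]}"
        return self.mangle_class(path)

    def mangle_class(self, path):
        found = self._lookup(path)
        if found is not None:
            return found
        name = path[-1]
        text = "C" + self.mangle_context(path[:-1]) + f"{len(name)}{name}"
        self._remember(path)  # added only after it has been mangled in full
        return text

    def mangle_function(self, parameters, result):
        arguments = "".join(self.mangle_class(p) for p in parameters)
        return "fT" + arguments + "_" + self.mangle_class(result)
```

Calling ``mangle_function`` with the parameter paths
``("zim", "zang", "zung")`` twice and ``("zim", "zippity")`` once, and the
result path ``("zim", "zang", "zoo")``, reproduces the mangling
``fTCC3zim4zang4zungS1_CS_7zippity_CS0_3zoo`` given above. The indices are
assigned in the order the entities finish mangling, which is why ``zim``
receives ``S_`` before ``zim.zang`` receives ``S0_``.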

::

  substitution ::= 's'

The special substitution ``s`` is used for the ``Swift`` standard library
module.

Predefined Substitutions
~~~~~~~~~~~~~~~~~~~~~~~~

::

  known-module ::= 's' // Swift
  known-module ::= 'SC' // C
  known-module ::= 'So' // Objective-C
  known-nominal-type ::= 'Sa' // Swift.Array
  known-nominal-type ::= 'Sb' // Swift.Bool
  known-nominal-type ::= 'Sc' // Swift.UnicodeScalar
  known-nominal-type ::= 'Sd' // Swift.Float64
  known-nominal-type ::= 'Sf' // Swift.Float32
  known-nominal-type ::= 'Si' // Swift.Int
  known-nominal-type ::= 'SP' // Swift.UnsafePointer
  known-nominal-type ::= 'Sp' // Swift.UnsafeMutablePointer
  known-nominal-type ::= 'SQ' // Swift.ImplicitlyUnwrappedOptional
  known-nominal-type ::= 'Sq' // Swift.Optional
  known-nominal-type ::= 'SR' // Swift.UnsafeBufferPointer
  known-nominal-type ::= 'Sr' // Swift.UnsafeMutableBufferPointer
  known-nominal-type ::= 'SS' // Swift.String
  known-nominal-type ::= 'Su' // Swift.UInt

``<known-module>`` and ``<known-nominal-type>`` are built-in substitutions for
certain common entities. With the exception of ``s`` for the ``Swift`` module,
they all start with 'S', like any other substitution.

The Objective-C module is used as the context for mangling Objective-C
classes as ``<type>``\ s.

Indexes
~~~~~~~

::

  index ::= '_' // 0
  index ::= natural '_' // N+1
  natural ::= [0-9]+

``<index>`` is a production for encoding numbers in contexts that can't
end in a digit; it's optimized for encoding smaller numbers.
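
The ``<index>`` production can be exercised with a pair of helper functions in
Python. These are an illustrative sketch of the grammar above; the names
``encode_index`` and ``decode_index`` are hypothetical.

```python
def encode_index(value: int) -> str:
    """Encode 0 as '_' and any N+1 as the natural N followed by '_'."""
    if value < 0:
        raise ValueError("an index cannot be negative")
    return "_" if value == 0 else f"{value - 1}_"


def decode_index(text: str) -> int:
    """Invert encode_index for a string holding exactly one index."""
    if not text.endswith("_"):
        raise ValueError("an index always ends with '_'")
    digits = text[:-1]
    if digits == "":
        return 0
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected decimal digits before '_'")
    return int(digits) + 1
```

The smallest value costs a single character (``_``), and the trailing
underscore guarantees that the encoding never ends in a digit, so it can be
followed directly by a run-length encoded identifier.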