:orphan:

Hacking on Swift
================

.. contents::

Abstract
--------

The point of this document is to collect folk wisdom about Swift
development.  It's folk wisdom if:

- You really need to understand it if you're hacking on a particular
  subsystem.

- It doesn't really fit in any of the specification documents:

  - the Swift language reference
  - the SIL language reference
  - the ABI specification

- It feels too broad for just a comment in the code somewhere.

By implication, anything that *does* fit into one of those places
should just go there with, at most, a cross-reference here.

This document is broken down into different sections for different
subsystems.

The AST
-------

Types
~~~~~

The Type hierarchy serves a number of masters:

- the Swift system of *formal types*,

- the SIL system of *lowered types*, and

- intermediate scratch types required by Sema's constraint systems.

A *formal type* is the type of something in Swift, like an expression
or declaration.

Type Variables
``````````````

A *type variable* stands for an unknown type.  There are several
different kinds of type variables:

- ``TypeVariableType`` is used for the temporary type variables
  introduced as part of generating and solving constraints in the type
  checker.  In a well-formed program, the solver should eventually
  resolve all of these variables, leaving every AST node totally free
  of ``TypeVariableType``\ s after type-checking.

- ``ArchetypeType`` is used within the implementation of generic
  declarations to represent the rigid type parameters of the context
  and their associated types.  For example::

    func swap<T>(a : @inout T, b : @inout T) {
      var tmp = a
      a = b
      b = tmp
    }

  Within this context, ``T`` is an archetype; the ``VarDecl``\ s for
  ``a``, ``b``, and ``tmp`` (and expressions referring to them) will
  have that type.  Similarly, in the SIL body of this function, the
  instructions will use an archetype type.
  For now, archetypes are also used in generic signatures as seen
  externally: that is, ``swap->getType()`` will yield a
  ``PolymorphicFunctionType`` whose input type is a tuple of l-value
  types of archetypes.  But this is being actively changed, and these
  archetypes will eventually become ``GenericTypeParamType``\ s and
  ``AssociatedTypeType``\ s.

- ``GenericTypeParamType`` and ``AssociatedTypeType`` are used within
  the signature of a generic declaration to represent its rigid type
  parameters and their associated types.

Substituted Types
`````````````````

A type can be *substituted* or *unsubstituted*.  A substituted type is
expressed using only types meaningful in the current context.  An
unsubstituted type may be expressed using types meaningful only in a
different context.

An expression or local declaration will always have a substituted
type.  A declaration from a different context will have an
unsubstituted type, although this might be trivially identical to a
correctly-substituted type.

For example::

  struct Dictionary<T, U> { ... }

  func lookupIterative<X>(value: X, map: Dictionary<X, X>) -> X {
    while var newValue = map[value] {
      value = newValue
    }
    return value
  }

``Dictionary``\'s subscript getter will have formal unsubstituted type
``<T,U> (@inout Dictionary<T,U>) -> (T) -> U?``.  The substituted use
of it has type ``(@inout Dictionary<X,X>) -> (X) -> X?``.  The index
expression has type ``X``, and the unsubstituted parameter type
corresponding to that is ``T``.

Declarations
~~~~~~~~~~~~

A declaration's *interface type* is its type "as seen from outside".

A declaration's *r-value type* is the formal type of an r-value
reference to it.  This is just the type it was declared with.

A variable's *storage type* is the formal type of its storage.  It
differs from the r-value type only for declarations with ownership:

- A ``@weak`` variable of type ``T?`` has storage type
  ``WeakStorageType(T)``, i.e. ``@sil_weak T``.

- An ``@unowned`` variable of type ``T`` has storage type
  ``UnownedStorageType(T)``, i.e. ``@sil_unowned T``.

``VD->getType()`` yields an unsubstituted storage type.

Parsing
-------

Semantic Analysis
-----------------

Serialization
-------------

SIL
---

Type Lowering
~~~~~~~~~~~~~

An *uncurried type* is a formal function type which has had two or
more of its formal input clauses combined into a single tuple input
clause.  The exact transformation is ``A -> B -> C -> D`` to
``(C, B, A) -> D``.  Note that doing this in parts is not equivalent:
``(C, (B, A)) -> D`` is a different type.

A *bridged type* has had native Swift representations of types turned
into foreign equivalents.  For example, ``String`` might turn into
``NSString``.  Type bridging only affects function types with a
foreign abstract calling convention (CC).

An uncurried, bridged function type is essentially one step away from
being a lowered SIL function type.  However, the lowered type of a
declaration is not necessarily the same as the lowering of its
uncurried, bridged function type.  Type lowerings always use the
standard conventions for their abstract CCs, but specific functions
may use non-standard conventions.

SIL type lowering does the following manipulations:

- tuples are element-wise lowered and then reconstructed

- function parameter types are deeply exploded ("de-tupled") and
  element-wise lowered

- function result types are lowered (and turned into a parameter if
  address-only)

Or at least, that's how it effectively plays out when there isn't an
abstraction pattern.  What actually happens is that we simultaneously
walk into the abstraction pattern, decide how *that* would be
represented, and then expand stuff according to it.  When the
abstraction pattern is totally opaque, we just throw up our hands and
do the worst thing we can possibly imagine.
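
The uncurrying rule stated earlier in this section can be sketched with
a small C++ toy model.  This is only an illustration under loud
assumptions: ``CurriedFnType``, ``printCurried``, ``uncurryAll``, and
``uncurryInParts`` are hypothetical names that do not exist in the
compiler, and types are modeled as plain strings rather than real
``Type`` objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model (not compiler code): a curried function type is a
// list of input clauses, outermost first, plus a result type.
struct CurriedFnType {
  std::vector<std::string> clauses; // {"A", "B", "C"} models A -> B -> C -> D
  std::string result;               // "D"
};

// Spell the type in its curried form, e.g. "A -> B -> C -> D".
std::string printCurried(const CurriedFnType &fn) {
  std::string out;
  for (const std::string &clause : fn.clauses)
    out += clause + " -> ";
  return out + fn.result;
}

// Uncurry all clauses in one step: they become the elements of a single
// flat tuple, listed in reverse order.
std::string uncurryAll(const CurriedFnType &fn) {
  assert(fn.clauses.size() >= 2 && "uncurrying combines two or more clauses");
  std::string out = "(";
  for (auto it = fn.clauses.rbegin(); it != fn.clauses.rend(); ++it) {
    if (it != fn.clauses.rbegin())
      out += ", ";
    out += *it;
  }
  return out + ") -> " + fn.result;
}

// Uncurry "in parts": combine the first two clauses, then fold each further
// clause around the previous tuple.  This nests the tuples.
std::string uncurryInParts(const CurriedFnType &fn) {
  assert(fn.clauses.size() >= 2 && "uncurrying combines two or more clauses");
  std::string tuple = "(" + fn.clauses[1] + ", " + fn.clauses[0] + ")";
  for (std::size_t i = 2; i < fn.clauses.size(); ++i)
    tuple = "(" + fn.clauses[i] + ", " + tuple + ")";
  return tuple + " -> " + fn.result;
}
```

Under these assumptions, ``uncurryAll`` applied to the model of
``A -> B -> C -> D`` spells ``(C, B, A) -> D``, while ``uncurryInParts``
spells the nested, and therefore different, ``(C, (B, A)) -> D``.
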

Abstraction Difference
~~~~~~~~~~~~~~~~~~~~~~

The basic principle for the soundness of abstraction difference in SIL
is that, as long as type substitution is done respecting abstraction
difference, it should be okay to work with a single value at any level
of substitution.

This principle means that, any time you're working with a value whose
type in the local context has been derived from substituting the type
of a generic entity (e.g. a generic function or a member variable of a
generic type), you need to derive its lowered type in one of two ways:

- Lower the substituted formal type using the original formal type as
  an abstraction pattern.  The substituted type needs to actually be a
  substitution of the original type, differing only in things like
  top-level polymorphism.  This is generally the easiest thing, and
  you'll want these types anyway if you need to do conversions.

- Use SIL type substitution (not Swift type substitution!) on the
  lowered original formal type.

SIL Generation
--------------

``emitOrigToSubstValue`` transforms a value that's abstracted
according to the original abstraction conventions into a value that's
abstracted according to the substituted abstraction conventions.  That
is, it turns ``@callee_owned (@out (), @in ()) -> ()`` into
``@callee_owned () -> ()``.

``emitSubstToOrigValue`` is just the reverse of that.

Both require you to give the original and subst formal types,
uncurried where applicable.  It's theoretically possible to do
re-abstraction based on lowered types, but what I've found is that, if
you try, you will pretty quickly get stuck dealing with endless
problems involving empty tuple types.  Having the formal types around
makes this brain-dead simple.
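
As a loose analogy for the two directions, and not a picture of how
SILGen is implemented, the sketch below models an ``Int -> Float``
function at two abstraction levels in plain C++.  The names ``OrigFn``,
``SubstFn``, ``origToSubst``, and ``substToOrig`` are hypothetical, and
the assumption that "original" conventions pass everything indirectly
through pointers stands in for ``@out`` and ``@in``.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the original (maximally abstracted) conventions:
// the result is written through `out` and the argument is read through `in`,
// loosely like `(@out Float, @in Int) -> ()`.
using OrigFn = std::function<void(float *out, const int *in)>;

// Hypothetical stand-in for the substituted conventions: the argument and
// result are passed directly, like `Int -> Float`.
using SubstFn = std::function<float(int)>;

// Orig-to-subst: wrap an indirect-convention function in a thunk that
// exposes the direct convention.
SubstFn origToSubst(OrigFn orig) {
  return [orig](int arg) {
    float result = 0.0f;
    orig(&result, &arg); // materialize the argument and the result slot
    return result;       // turn the indirect return into a direct one
  };
}

// Subst-to-orig: the reverse thunk, exposing the indirect convention.
OrigFn substToOrig(SubstFn subst) {
  return [subst](float *out, const int *in) {
    *out = subst(*in); // load the argument, store the result
  };
}
```

In this toy, applying ``substToOrig`` and then ``origToSubst`` to a
function gives back something that behaves like the function you
started with, which is the property a re-abstraction thunk relies on.
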

When you're making a re-abstraction thunk, you need to be able to
reverse a transformation; for example, if you orig-to-subst
``(T -> U)`` into ``((Int -> Float) -> Float)``, the ensuing thunk
actually does a subst-to-orig on its ``Int -> Float`` parameter to
turn it into a ``@callee_owned (@out Float, @in Int) -> ()``, then
does an orig-to-subst on the result to turn the indirect return into a
direct one.

IR Generation
-------------

*Explosion levels* are used to make IR-generation more explicit about
when it can get away with passing values around directly that might be
resilient in a different resilience domain.  It's probably not the
right abstraction for this, though.

An *explosion* is a linear collection of LLVM IR scalar values.  Swift
has a number of types (primitive and otherwise) that comprise multiple
IR values; ``Explosion`` makes it much easier to pass them around
without artificially turning them into first-class aggregates.

An explosion is first filled, then drained.  Extra elements cannot be
added after the first element is claimed.  As a sanity check,
explosions must be drained to completion.

Any given explosion may contain several concatenated values at once,
so a type-specific operation that consumes its inputs out of an
explosion should take care to only consume the values that belong to
that type.  For example, an operation on a type whose explosion schema
is three pointer values should always claim exactly three values.

``claimAll()`` and ``reset()`` should be used only very carefully.
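
The fill-then-drain discipline described above can be sketched with a
toy C++ class.  This is a minimal illustration, not the real
``Explosion``: ``ToyExplosion``, ``ThreeWords``, and
``claimThreeWords`` are hypothetical names, the member functions are
assumptions made for this sketch, and ``int`` stands in for an LLVM IR
scalar value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy, not the real IRGen class: a linear collection of
// scalars that is first filled, then drained from the front.
class ToyExplosion {
  std::vector<int> values;   // `int` stands in for an IR scalar value
  std::size_t nextValue = 0; // index of the first unclaimed value

public:
  // Filling is only legal before the first value has been claimed.
  void add(int value) {
    assert(nextValue == 0 && "cannot add after the first element is claimed");
    values.push_back(value);
  }

  // Drain one value from the front.
  int claimNext() {
    assert(nextValue < values.size() && "claiming from a drained explosion");
    return values[nextValue++];
  }

  // Number of values that have not been claimed yet.
  std::size_t size() const { return values.size() - nextValue; }

  // True once the explosion has been drained to completion.
  bool empty() const { return size() == 0; }
};

// A type whose (assumed) explosion schema is three pointer-sized values.
struct ThreeWords {
  int a, b, c;
};

// A type-specific consumer claims exactly the values that belong to its
// type and leaves any values concatenated after them untouched.
ThreeWords claimThreeWords(ToyExplosion &explosion) {
  ThreeWords words;
  words.a = explosion.claimNext();
  words.b = explosion.claimNext();
  words.c = explosion.claimNext();
  return words;
}
```

The toy leaves the drained-to-completion check to the caller through
``empty()`` rather than enforcing it automatically; a caller that adds
four values and runs ``claimThreeWords`` still has one value left to
claim before the explosion counts as drained.
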