Swift Language Reference

Introduction

In addition to the main spec, there are lots of open-ended questions, justifications, and ideas about what best practices should be. That random discussion is placed in boxes to the right side of the main text (like this one) to separate what is normative from what is discussion.

This is the language reference manual for the Swift language, which is highly volatile and constantly under development. As the prototype evolves, this document should be kept up to date with what is actually implemented.

The grammar and structure of the language are defined in BNF form in yellow boxes. Examples are shown in gray boxes, and assume that the standard library is in use (unless otherwise specified).

Basic Goals

A non-goal of the Swift project in general is to become some amazing research project. We really want to focus on delivering a real product, and having the design and spec co-evolve.

In no particular order, and not explained well:

Support building great frameworks and applications, with a specific focus on permitting rich and powerful APIs.
Get the defaults right: this reduces the barrier to entry and increases the odds that the right thing happens.
Through our support for building great APIs, we aim to provide an expressive and productive language that is fun to program in.
Support low-level system programming. We should want to write compilers, operating system kernels, and media codecs in Swift. This means that being able to obtain high performance is really quite important.
Provide really great tools, like an IDE, debugger, profiling, etc.
Where possible, steal great ideas instead of innovating new things that will work out in unpredictable ways. It turns out that there are a lot of good ideas already out there.
Memory safe by default: array overrun errors, uninitialized values, and other problems endemic to C should not occur in Swift, even if it means some amount of runtime overhead. Eventually these checks will be disablable for people who want ultimate performance in production builds.
Efficiently implementable with a static compiler: runtime compilation is great technology and Swift may eventually get a runtime optimizer, but it is a strong goal to be able to implement Swift with just a static compiler.
Interoperate as transparently as possible with C, Objective-C, and C++ without having to write an equivalent of "extern C" for every referenced definition.
Great support for efficient by-value types.
Elegant and natural syntax, aiming to be familiar and easy to transition to for "C" people. Differences from the C family should be introduced only when they provide a significant win (e.g. eliminating declarator syntax).
Lots of other stuff too.

A smaller wishlist goal is to support embedded sub-languages in swift, so that we don't get the OpenCL-is-like-C-but-very-different-in-many-details problem.

Basic Approach

Pushing as much of the language as realistic out of the compiler and into the library is generally good for a few reasons: 1) we end up with a smaller core language. 2) we force the language that is left to be highly expressive and extensible. 3) this highly expressive language core can then be used to build a lot of other great libraries, hopefully many we can't even anticipate at this point.

The basic approach in designing and implementing the Swift prototype was to start at the very bottom of the stack (simple expressions and the trivial bits of the type system) and incrementally build things up one brick at a time. There is a big focus on making things as simple as possible and having a clean internal core. Where it makes sense, sugar is added on top to make the core more expressive for common situations.

One major aspect that dovetails with expressivity, learnability, and focus on API development is that much of the language is implemented in a standard library (inspired in part by the Haskell Standard Prelude). This means that things like 'Int' and 'Void' are not part of the language itself, but are instead part of the standard library.

Phases of Translation

Because Swift doesn't rely on a C-style "lexer hack" to know what is a type and what is a value, it is possible to fully parse a file without resolving import declarations.

Swift has a strict separation between its phases of translation, and the compiler follows a conceptually simple design. The phases of translation are:

Lexing: A translation unit is broken into tokens according to a nearly regular grammar (the exception being that /**/ comments can be nested).
Parsing and AST Building: The tokens are parsed according to the grammar set out below. The grammar is context free and does not require any "type feedback" from the lexer or later stages. During parsing, references to local variables and other declarations that are not at translation unit (and eventually namespace) scope are bound.
Name Binding: At this phase, references to non-local types and values are bound, and import directives are both validated and searched. Name binding can cause recursive compilation of modules that are referenced but not yet built.
Type Checking: During this phase all types are resolved within value definitions, function application and binary expressions are found and formed, and overloaded functions are resolved.
Code Generation: The AST is converted to LLVM IR, optimizations are performed, and machine code is generated.
Linking: Runtime libraries and referenced modules are linked in.

FIXME: "import swift" implicitly added as the last import in translation unit.

Lexical Structure

Not all characters are "taken" in the language, because it is still growing. As reasons arise to assign characters to the identifier or punctuation bucket, we will do so as Swift evolves.

The lexical structure of a Swift file is very simple: the files are tokenized according to the following productions and categories. As is usual with most languages, tokenization uses the maximal munch rule and whitespace separates tokens. This means that "a b" and "ab" lex into different token streams and are therefore different in the grammar.

Whitespace and Comments

Nested block comments are important because we don't have the nestable "#if 0" hack from C to rely on.

    whitespace ::= ' '
    whitespace ::= '\n'
    whitespace ::= '\r'
    whitespace ::= '\t'
    whitespace ::= '\0'
    comment    ::= //.*[\n\r]
    comment    ::= /* .... */

Space, newline, carriage return, tab, and the nul byte are all considered whitespace and are discarded, with one exception: a '(' or '[' which does not follow a non-whitespace character is a different kind of token (called spaced) from one which does (called unspaced). A '(' or '[' at the beginning of a file is spaced.
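
As a rough illustration of the spaced/unspaced distinction, the sketch below classifies an opening '(' or '[' by looking at the character before it: the opener is spaced at the start of the file or after whitespace, and unspaced otherwise. It is written in released Swift 5 syntax rather than the prototype syntax described in this document. The names ParenKind, lexerWhitespace, and classifyOpener are hypothetical. It also ignores comments (which count as whitespace), so it is a simplification and not the compiler's actual lexer.

```swift
// Hypothetical token kinds for '(' and '['.
enum ParenKind { case spaced, unspaced }

// The whitespace characters listed in the grammar above.
let lexerWhitespace: Set<Unicode.Scalar> = [" ", "\n", "\r", "\t", "\0"]

/// Classifies the '(' or '[' found at `index`, assuming the caller has
/// already checked that `source[index]` is one of those two characters.
func classifyOpener(in source: [Unicode.Scalar], at index: Int) -> ParenKind {
    // An opener at the very beginning of the file is spaced.
    guard index > 0 else { return .spaced }
    // Otherwise it is spaced only when the previous character is whitespace.
    return lexerWhitespace.contains(source[index - 1]) ? .spaced : .unspaced
}
```

With this reading, the '(' in "f(x)" is unspaced while the '(' in "f (x)" is spaced.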

Comments may follow the BCPL style, starting with a "//" and running to the end of the line, or may be recursively nested /**/ style comments. Comments are ignored and treated as whitespace.
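
Nesting is what makes block comments non-regular: the lexer has to count how many comments are open. A minimal sketch of that counting, in released Swift 5 syntax rather than the prototype syntax of this document, might look like the following. The function name skipBlockComment is hypothetical and this is not the compiler's implementation.

```swift
/// Returns the index just past the block comment that starts at `start`,
/// or nil if the comment is never closed.
/// Assumes `chars[start]` and `chars[start + 1]` are "/" and "*".
func skipBlockComment(_ chars: [Character], from start: Int) -> Int? {
    var depth = 0
    var i = start
    while i + 1 < chars.count {
        if chars[i] == "/" && chars[i + 1] == "*" {
            depth += 1              // entering a (possibly nested) comment
            i += 2
        } else if chars[i] == "*" && chars[i + 1] == "/" {
            depth -= 1              // leaving the innermost open comment
            i += 2
            if depth == 0 { return i }
        } else {
            i += 1
        }
    }
    return nil                      // ran out of input with a comment still open
}
```

A C-style lexer would stop at the first "*/"; here the inner comment only lowers the depth, so the outer comment continues until its own closing marker.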

Reserved Punctuation Tokens

The difference between reserved punctuation and identifiers is that you can't "overload an operator" with one of these names.

Note that -> is used for function types "() -> Int", not pointer dereferencing.

    punctuation ::= '('
    punctuation ::= ')'
    punctuation ::= '{'
    punctuation ::= '}'
    punctuation ::= '['
    punctuation ::= ']'
    punctuation ::= '.'
    punctuation ::= ','
    punctuation ::= ';'
    punctuation ::= ':'
    punctuation ::= '='
    punctuation ::= '->'
    punctuation ::= '...'
    punctuation ::= '&' // unary prefix operator

These are all reserved punctuation that are lexed into tokens. Most other punctuation is matched as identifiers.

Reserved Keywords

The number of keywords is reduced by pushing most functionality into the library (e.g. "builtin" datatypes like 'Int' and 'Bool'). This allows us to add new stuff to the library in the future without worrying about conflicting with the user's namespace.

    // Declarations and Type Keywords
    keyword ::= 'class'
    keyword ::= 'destructor'
    keyword ::= 'extension'
    keyword ::= 'import'
    keyword ::= 'init'
    keyword ::= 'func'
    keyword ::= 'metatype'
    keyword ::= 'enum'
    keyword ::= 'protocol'
    keyword ::= 'static'
    keyword ::= 'struct'
    keyword ::= 'subscript'
    keyword ::= 'typealias'
    keyword ::= 'var'
    keyword ::= 'where'

    // Statements
    keyword ::= 'break'
    keyword ::= 'case'
    keyword ::= 'continue'
    keyword ::= 'default'
    keyword ::= 'do'
    keyword ::= 'else'
    keyword ::= 'if'
    keyword ::= 'in'
    keyword ::= 'for'
    keyword ::= 'return'
    keyword ::= 'switch'
    keyword ::= 'then'
    keyword ::= 'while'

    // Expressions
    keyword ::= 'as'
    keyword ::= 'is'
    keyword ::= 'new'
    keyword ::= 'super'
    keyword ::= 'self'
    keyword ::= 'Self'
    keyword ::= '__COLUMN__'
    keyword ::= '__FILE__'
    keyword ::= '__LINE__'

These are the builtin keywords.

Integer Literals

    integer_literal ::= [0-9][0-9_]*
    integer_literal ::= 0x[0-9a-fA-F][0-9a-fA-F_]*
    integer_literal ::= 0o[0-7][0-7_]*
    integer_literal ::= 0b[01][01_]*

Integer literal tokens represent simple integer values of unspecified precision. They may be expressed in decimal, binary with the '0b' prefix, octal with the '0o' prefix, or hexadecimal with the '0x' prefix. Unlike C, a leading zero does not affect the base of the literal.

Integer literals may contain underscores at arbitrary positions after the first digit. These underscores may be used for human readability and do not affect the value of the literal.

    789
    0789

    1000000
    1_000_000

    0b111_101_101
    0o755

    0b1111_1011
    0xFB
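
To make the rules concrete, here is a small sketch that computes the value of an integer literal token. It uses released Swift 5 syntax rather than the prototype syntax of this document. The function name integerLiteralValue is hypothetical, and returning a fixed-width Int is a simplification, since the literal itself has unspecified precision.

```swift
/// Returns the value of an integer literal token, or nil when `text` does not
/// fit the integer_literal grammar. Using Int (and failing on overflow) is a
/// simplification: the grammar leaves the precision unspecified.
func integerLiteralValue(_ text: String) -> Int? {
    var digits = Substring(text)
    var radix = 10
    if text.hasPrefix("0x") {
        radix = 16
    } else if text.hasPrefix("0o") {
        radix = 8
    } else if text.hasPrefix("0b") {
        radix = 2
    }
    if radix != 10 { digits = digits.dropFirst(2) }

    // The first digit must be a real digit; underscores may only follow it.
    guard let first = digits.first, first != "_" else { return nil }
    // Reject signs and other stray characters before handing off to Int.
    guard digits.allSatisfy({ $0 == "_" || $0.isHexDigit }) else { return nil }
    // Underscores are purely visual, so strip them before converting.
    return Int(String(digits.filter { $0 != "_" }), radix: radix)
}
```

Under this sketch each pair in the examples above produces the same value, and a leading zero on a decimal literal leaves the base unchanged.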

Floating Point Literals

We require a digit on both sides of the dot to allow lexing "4.km" as "4 . km" instead of "4. km" and for a series of dots to be an operator (for ranges). The regex for decimal literals is the same as Java's, and the one for hex literals is the same as C99's, except that we do not allow a trailing suffix that specifies a precision.

    floating_literal ::= [0-9][0-9_]*\.[0-9][0-9_]*
    floating_literal ::= [0-9][0-9_]*\.[0-9][0-9_]*[eE][+-]?[0-9][0-9_]*
    floating_literal ::= [0-9][0-9_]*[eE][+-]?[0-9][0-9_]*
    floating_literal ::= 0x[0-9A-Fa-f][0-9A-Fa-f_]*
                           (\.[0-9A-Fa-f][0-9A-Fa-f_]*)?[pP][+-]?[0-9][0-9_]*

Floating point literal tokens represent floating point values of unspecified precision. Decimal and hexadecimal floating-point literals are supported.

The integer, fraction, and exponent of a floating point literal may each contain underscores at arbitrary positions after their first digits. These underscores may be used for human readability and do not affect the value of the literal. Each part of the floating point literal must however start with a digit; 1._0 would be a reference to the _0 member of 1.

    1.0
    1000000.75
    1_000_000.75

    0x1.FFFFFFFFFFFFFp1022
    0x1.FFFF_FFFF_FFFF_Fp1_022

Character Literals

    character_literal ::= '[^'\\\n\r]|character_escape'
    character_escape  ::= [\]0 | [\][\] | [\]t | [\]n | [\]r | [\]" | [\]'
    character_escape  ::= [\]x hex hex
    character_escape  ::= [\]u hex hex hex hex
    character_escape  ::= [\]U hex hex hex hex hex hex hex hex
    hex               ::= [0-9a-fA-F]

character_literal tokens represent a single character, and are surrounded by single quotes.

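The ASCII and Unicode character escapes:

    \0 == nul
    \\ == backslash
    \t == horizontal tab
    \n == new line
    \r == carriage return
    \" == double quote
    \' == single quote
    \x == raw ASCII byte (less than 0x80), written as two hex digits
    \u == small Unicode code points, written as four hex digits
    \U == large Unicode code points, written as eight hex digits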

String Literals

FIXME: Forcing + to concatenate strings is somewhat gross, a proper protocol would be better.

    string_literal   ::= ["]([^"\\\n\r]|character_escape|escape_expr)*["]
    escape_expr      ::= [\]escape_expr_body
    escape_expr_body ::= [(]escape_expr_body[)]
    escape_expr_body ::= [^\n\r"()]

string_literal tokens represent a string, and are surrounded by double quotes. String literals cannot span multiple lines.

String literals may contain embedded expressions (known as "interpolated expressions") subject to some specific lexical constraints: the expression may not contain a double quote ["], newline [\n], or carriage return [\r]. All parentheses must be balanced.

In addition to these lexical rules, an interpolated expression must satisfy the expr production of the general swift grammar. This expression is evaluated, and passed to the constructor for the inferred type of the string literal. It is concatenated onto any fixed portions of the string literal with a global "+" operator that is found through normal name lookup.
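
The lexical constraints on an interpolated expression (no double quote, no newline or carriage return, balanced parentheses) can be checked with a simple scan. The sketch below is in released Swift 5 syntax rather than the prototype syntax of this document. The function name endOfInterpolation is hypothetical, and it checks only the lexical shape, not whether the text satisfies the expr production.

```swift
/// Given the characters of a string literal body and the index of the "("
/// that follows a backslash, returns the index of the matching ")",
/// or nil when the lexical constraints are violated.
func endOfInterpolation(_ chars: [Character], openParen: Int) -> Int? {
    var depth = 0
    var i = openParen
    while i < chars.count {
        switch chars[i] {
        case "\"", "\n", "\r", "\r\n":
            return nil              // forbidden inside an interpolated expression
        case "(":
            depth += 1
        case ")":
            depth -= 1
            if depth == 0 { return i }
        default:
            break
        }
        i += 1
    }
    return nil                      // parentheses never balanced
}
```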

    // Simple string literal.
    "Hello world!"
    
    // Interpolated expressions.
    "\(min)..\(max)" + "Result is \((4+i)*j)"

Identifier Tokens

    identifier ::= id-start id-continue*
    
    // An identifier can start with an ASCII letter or underscore...
    id-start ::= [A-Za-z_]

    // or a Unicode alphanumeric character in the Basic Multilingual Plane...
    // (excluding combining characters, which can't appear initially)
    id-start ::= [\u00A8\u00AA\u00AD\u00AF\u00B2-\u00B5\u00B7-\u00BA]
    id-start ::= [\u00BC-\u00BE\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]
    id-start ::= [\u0100-\u02FF\u0370-\u167F\u1681-\u180D\u180F-\u1DBF]
    id-start ::= [\u1E00-\u1FFF]
    id-start ::= [\u200B-\u200D\u202A-\u202E\u203F-\u2040\u2054\u2060-\u206F]
    id-start ::= [\u2070-\u20CF\u2100-\u218F\u2460-\u24FF\u2776-\u2793]
    id-start ::= [\u2C00-\u2DFF\u2E80-\u2FFF]
    id-start ::= [\u3004-\u3007\u3021-\u302F\u3031-\u303F\u3040-\uD7FF]
    id-start ::= [\uF900-\uFD3D\uFD40-\uFDCF\uFDF0-\uFE1F\uFE30-\uFE44]
    id-start ::= [\uFE47-\uFFFD]

    // or a non-private-use, valid code point outside of the BMP.
    id-start ::= [\u10000-\u1FFFD\u20000-\u2FFFD\u30000-\u3FFFD\u40000-\u4FFFD]
    id-start ::= [\u50000-\u5FFFD\u60000-\u6FFFD\u70000-\u7FFFD\u80000-\u8FFFD]
    id-start ::= [\u90000-\u9FFFD\uA0000-\uAFFFD\uB0000-\uBFFFD\uC0000-\uCFFFD]
    id-start ::= [\uD0000-\uDFFFD\uE0000-\uEFFFD]

    // After the first code point, an identifier can contain ASCII digits...
    id-continue ::= [0-9]

    // and/or combining characters...
    id-continue ::= [\u0300-\u036F\u1DC0-\u1DFF\u20D0-\u20FF\uFE20-\uFE2F]

    // in addition to the starting character set.
    id-continue ::= id-start

    identifier-or-any ::= identifier
    identifier-or-any ::= '_'

The set of valid identifier characters is consistent with WG14 N1518, "Recommendations for extended identifier characters for C and C++". This roughly corresponds to the alphanumeric characters in the Basic Multilingual Plane and all non-private-use code points outside of the BMP. It excludes mathematical symbols, arrows, line and box drawing characters, and private-use and invalid code points. An identifier cannot begin with one of the ASCII digits '0' through '9' or with a combining character.

The Swift compiler does not normalize Unicode source code, and matches identifiers by code points only. Source code must be normalized to a consistent normalization form before being submitted to the compiler.
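
Because matching is by code point, an identifier check reduces to range lookups on each Unicode scalar. The sketch below is in released Swift 5 syntax rather than the prototype syntax of this document. It reproduces only a handful of the id-start ranges from the grammar above, so it is deliberately incomplete, and the names idStartRanges, combiningRanges, isIdStart, isIdContinue, and isIdentifier are hypothetical.

```swift
// A handful of the id-start ranges from the grammar; the full table is omitted.
let idStartRanges: [ClosedRange<UInt32>] = [
    0x41...0x5A, 0x61...0x7A, 0x5F...0x5F,      // ASCII letters and underscore
    0xC0...0xD6, 0xD8...0xF6, 0xF8...0xFF,
    0x100...0x2FF, 0x370...0x167F,
    0x3040...0xD7FF,
    0x10000...0x1FFFD
]

// The combining-character ranges allowed only after the first code point.
let combiningRanges: [ClosedRange<UInt32>] = [
    0x300...0x36F, 0x1DC0...0x1DFF, 0x20D0...0x20FF, 0xFE20...0xFE2F
]

let asciiDigits: ClosedRange<UInt32> = 0x30...0x39

func isIdStart(_ s: Unicode.Scalar) -> Bool {
    return idStartRanges.contains(where: { $0.contains(s.value) })
}

func isIdContinue(_ s: Unicode.Scalar) -> Bool {
    if isIdStart(s) || asciiDigits.contains(s.value) { return true }
    return combiningRanges.contains(where: { $0.contains(s.value) })
}

/// Checks `text` scalar by scalar, with no normalization.
func isIdentifier(_ text: String) -> Bool {
    let scalars = Array(text.unicodeScalars)
    guard let first = scalars.first, isIdStart(first) else { return false }
    return scalars.dropFirst().allSatisfy(isIdContinue)
}
```

Since no normalization is applied, a precomposed character and its decomposed spelling are different code point sequences and would name different identifiers.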

    // Valid identifiers
    foo
    _0
    swift
    vernissé
    闪亮
    מבריק
    😄

    // Invalid identifiers
    ☃     // Is a symbol
    0cool // Starts with an ASCII digit
     ́foo  // Starts with a combining character
         // Is a private-use character

Operators

[1] The '=' token is explicitly handled in the grammar elsewhere, and in general users cannot provide custom definitions for the '=' operator. This distinctly differs from C++, which allows '=' to be overloaded.

[2] The '->' token is reserved punctuation, and cannot be used as an operator identifier.

[3] The unary prefix '&' token is reserved punctuation, and cannot be used as an operator identifier.

[4] The '//', '/*', and '*/' tokens are reserved for comments, and cannot be used as operator identifiers.

    operator ::= [@/=-+*%<>!&|^~]+
    operator ::= \.\.

      Note: excludes '=', see [1]
            excludes '->', see [2]
            excludes unary '&', see [3]
            excludes '//', '/*', and '*/', see [4]
            '..' is an operator, not two '.'s.

    operator-binary ::= operator
    operator-prefix ::= operator
    operator-postfix ::= operator

    left-binder  ::= [ \r\n\t\(\[\{,;:]
    right-binder ::= [ \r\n\t\)\]\},;:]

    any-identifier ::= identifier | operator

operator-binary, operator-prefix, and operator-postfix are distinguished by immediate lexical context. An operator token is called left-bound if it is not immediately preceded by a character matching left-binder, that is, if it is attached to whatever is on its left. An operator token is called right-bound if it is not immediately followed by a character matching right-binder. An operator token is an operator-prefix if it is right-bound but not left-bound, an operator-postfix if it is left-bound but not right-bound, and an operator-binary in either of the other two cases.

As an exception to the above rule, an operator immediately followed by a dot ('.') is only considered right-bound if not already left-bound. This allows a@.prop to be parsed as (a@).prop rather than as a @ .prop. Similarly, because the '!' operator is defined by the standard library to destructure optional types and is thus expected to be used ubiquitously, it is also only considered right-bound if not already left-bound.
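
The classification can be sketched as follows, in released Swift 5 syntax rather than the prototype syntax of this document. The sketch takes an operator to be left-bound when the character before it is not a left-binder (so it is attached to an operand on that side), and symmetrically for right-bound; this is the reading under which a@.prop comes out as (a@).prop. The names Fixity and classifyOperator are hypothetical, and treating the edges of the file like whitespace is an assumption.

```swift
enum Fixity { case prefix, postfix, binary }

let leftBinders: Set<Unicode.Scalar> = [" ", "\r", "\n", "\t", "(", "[", "{", ",", ";", ":"]
let rightBinders: Set<Unicode.Scalar> = [" ", "\r", "\n", "\t", ")", "]", "}", ",", ";", ":"]
let dot: Unicode.Scalar = "."

/// Classifies an operator token from its neighbouring characters.
/// `before` and `after` are nil at the edges of the file, which this
/// sketch treats like whitespace (an assumption).
func classifyOperator(_ op: String, before: Unicode.Scalar?, after: Unicode.Scalar?) -> Fixity {
    // Bound on a side means attached to an operand on that side.
    let leftBound = before.map { !leftBinders.contains($0) } ?? false
    var rightBound = after.map { !rightBinders.contains($0) } ?? false
    // Exception: before a '.', and for the '!' operator, the token only
    // counts as right-bound when it is not already left-bound.
    if leftBound && (after == dot || op == "!") {
        rightBound = false
    }
    switch (leftBound, rightBound) {
    case (false, true): return .prefix
    case (true, false): return .postfix
    default: return .binary
    }
}
```

For example, the '-' in "(-x" is a prefix operator, the '+' in both "a + b" and "a+b" is binary, and the '@' in "a@.prop" is postfix.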

When parsing certain grammatical constructs that involve '<' and '>' (such as protocol composition types), an operator with a leading '<' or '>' may be split into two or more tokens: the leading '<' or '>' and the remainder of the token, which may be an operator or punctuation token that may itself be further split. This rule allows us to parse nested constructs such as A<B<C>> without requiring spaces between the closing '>'s.
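
The splitting step itself is small. A sketch in released Swift 5 syntax (not the prototype syntax of this document), with a hypothetical helper name splitLeadingAngle, could be:

```swift
/// Splits the leading '<' or '>' off an operator token.
/// Returns nil when the token does not start with '<' or '>'.
func splitLeadingAngle(_ token: String) -> (angle: Character, rest: String)? {
    guard let first = token.first, first == "<" || first == ">" else { return nil }
    // The remainder may be empty, or may itself be split again by the caller.
    return (first, String(token.dropFirst()))
}
```

Applied to the ">>" token at the end of A<B<C>>, the first call yields '>' plus a remaining ">" token, which closes the outer list.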

Implementation Identifier Token

    dollarident ::= '$' id-continue+

Tokens that start with a $ are a separate class of identifier: fixed-purpose names that are defined by the implementation.

Declarations

    decl ::= decl-class
    decl ::= decl-constructor
    decl ::= decl-destructor
    decl ::= decl-extension
    decl ::= decl-func
    decl ::= decl-import
    decl ::= decl-enum
    decl ::= decl-enum-element
    decl ::= decl-protocol
    decl ::= decl-struct
    decl ::= decl-typealias
    decl ::= decl-var
    decl ::= decl-subscript

Translation Unit

    translation-unit ::= brace-item*

The top level of a swift source file is grammatically identical to the contents of a func decl. Some declarations have semantic restrictions that only allow them within a translation unit though.

Brace Enclosed Items

    brace-item-list ::= '{' brace-item* '}'
    
    brace-item      ::= decl
    brace-item      ::= expr
    brace-item      ::= stmt

The brace item list provides a sequencing operation which evaluates the members of its body in order. Function bodies and the bodies of control flow statements use braces. Also, the translation unit itself is effectively a brace item list, but without the braces.

import Declarations

    decl-import ::= 'import' attribute-list import-kind? import-path
    
    import-kind ::= 'typealias'
    import-kind ::= 'struct'
    import-kind ::= 'class'
    import-kind ::= 'enum'
    import-kind ::= 'protocol'
    import-kind ::= 'var'
    import-kind ::= 'func'
    
    import-path ::= any-identifier ('.' any-identifier)*

'import' declarations allow named values and types to be accessed with local names, even when they are defined in other modules and namespaces. See the section on name binding for more information on how these work. import declarations are only allowed at translation unit scope.

'import' directives only impact a single translation unit: imports in one swift file do not affect name lookup in another file. import directives can only occur at the top level of a file, not within a function or namespace.

An import without an explicit import-kind names a module; all of the module's members are imported into the current scope. The module's name is also imported into the current scope in order to allow qualified access to the module's members, which can be useful for disambiguation.

If an import-kind is provided, the last element of the import path is taken to be the name of a decl within the module named by the rest of the path. Only that name is introduced into the current scope; the name of the module itself is not accessible, nor any other decls within the module.

Different import-kinds perform different filters on the decls within a module:

typealias can be used to import any concrete type (struct, class, enum, or another typealias). It cannot be used to import protocols, which are often used for more than just their existential type.
struct, class, enum can be used to import any type whose canonical type is a struct, class, or enum, respectively. (This allows "Int" to be imported as a struct, for example, even though its definition in the standard library may be a typealias for another struct type.)
protocol is used to import a protocol
var is used to import a module-scoped variable
func will import all overloads of a function

    // Import all of the top level symbols and types in a module.
    import swift

    // Import all of the symbols within a submodule.
    import swift.io

    // Import a single variable, function, type, etc.
    import typealias swift.io.BufferedStream

    // Import all addition overloads.
    import func swift.+

extension Declarations

    decl-extension ::= 'extension' type-identifier inheritance? '{' decl* '}'

'extension' declarations allow adding member declarations to existing types, even in other translation units and modules. There are different semantic rules for each type that is extended.

enum, struct, and class declaration extensions

FIXME: Write this section.

var Declarations

    decl-var        ::= 'var' attribute-list pattern initializer?  (',' pattern initializer?)*

    decl-var        ::= 'var' attribute-list identifier ':' type-annotation brace-item-list

    decl-var        ::= 'var' attribute-list identifier ':' type-annotation '{' get-set '}'
  
    initializer     ::= '=' expr

    get-set         ::= get set?
    get-set         ::= set get

    get             ::= 'get:' brace-item*

    set             ::= 'set' set-name? ':' brace-item*

    set-name        ::= '(' identifier ')'

'var' declarations form the backbone of value declarations in Swift. A var declaration takes a pattern and an optional initializer, and declares all the pattern-identifiers in the pattern as variables. If there is an initializer and the pattern is fully-typed, the initializer is converted to the type of the pattern. If there is an initializer and the pattern is not fully-typed, the type of the initializer is computed independently of the pattern, and the type of the pattern is derived from the initializer. If no initializer is specified, the pattern must be fully-typed, and the values are default-initialized.

If there is more than one pattern in a 'var' declaration, they are each considered independently, as if there were multiple declarations. The initial attribute-list is shared between all the declared variables.

A var declaration may contain a getter and (optionally) a setter, which will be used when reading or writing the variable, respectively. Such a variable does not have any associated storage. A var declaration with a getter or setter must have a type (call it T). The getter function, whose body is provided as part of the get clause, has type () -> T. Similarly, the setter function, whose body is part of the set clause (if provided), has type (T) -> (). If the set clause contains a set-name clause, the identifier of that clause is used as the name of the parameter to the setter. Otherwise, the parameter name is "value".

FIXME: Should the type of a pattern which isn't fully typed affect the type-checking of the expression (i.e. should we compute a structured dependent type)?

Like all other declarations, var declarations can optionally have a list of attributes applied to them.

The type of a variable must be materializable. A variable is an lvalue unless it has a get clause but no set clause.

Here are some examples of var declarations:

    // Simple examples.
    var a = 4
    var b : Int
    var c : Int = 42
    
    // This decodes the tuple return value into independently named parts
    // and both 'val' and 'err' are in scope after this line.
    var (val, err) = foo()

    // Variable getter/setter
    var _x : Int = 0
    var x_modify_count : Int = 0
    var x1 : Int {
      return _x
    }
    var x2 : Int {
    get:
      return _x
    set:
      x_modify_count = x_modify_count + 1
      _x = value
    }

Note that both 'get' and 'set' are context-sensitive keywords.
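
One way to picture a variable with a getter and setter is as a pair of functions with no storage of their own. The sketch below models x2 from the example above that way, in released Swift 5 syntax rather than the prototype syntax of this document. The names backing, modifyCount, getX, and setX are hypothetical, and this is only a mental model, not how the compiler represents such variables.

```swift
var backing = 0        // plays the role of _x
var modifyCount = 0    // plays the role of x_modify_count

// The getter has type () -> T and the setter has type (T) -> (), with T == Int.
let getX: () -> Int = { backing }
let setX: (Int) -> () = { value in
    modifyCount += 1
    backing = value
}

// Writing the variable corresponds to calling the setter,
// and reading it corresponds to calling the getter.
setX(7)
setX(9)
let observed = getX()
```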

func Declarations

    decl-func        ::= 'static'? 'func' attribute-list any-identifier generic-params? func-signature brace-item-list?

'func' is a declaration for a function. The argument list and optional return value are specified by the type production of the function, and the body is either a brace expression or elided. Like all other declarations, functions can have attributes.

If the type is not syntactically a function type (i.e., has no -> in it at top-level), then the return value is implicitly inferred to be "()". All of the argument and return value names are injected into the scope of the function body.

A function in an extension of some type (or in other places that are semantically equivalent to an extension) implicitly gets a 'self' argument with these rules ... [todo]

'static' functions are only allowed in an extension of some type (or in other places that are semantically equivalent to an extension). They indicate that the function is actually defined on the metatype for the type, not on the type itself. Thus it does not implicitly get a first 'self' argument, and can be used with dot syntax on the metatype.

TODO: Func should be an immutable name binding, it should implicitly add an attribute immutable when it exists.

TODO: Incoming arguments should be readonly, result should be implicitly writeonly when we have these attributes.

Function signatures

    func-signature ::= func-arguments func-signature-result?
    func-arguments ::= pattern-tuple+
    func-arguments ::= selector-tuple
    selector-tuple ::= '(' pattern-tuple-element ')' (identifier-or-any '(' pattern-tuple-element ')')+
    func-signature-result ::= '->' type

A function signature specifies one or more sets of parameter patterns, plus an optional result type.

When a result type is not written, it is implicitly the empty tuple type, ().

In the body of the function described by a particular signature, all the variables bound by all of the parameter patterns are in scope, and the function must return a value of the result type.

An outermost pattern in a function signature must be fully-typed and irrefutable. If a result type is given, it must also be fully-typed.

The type of a function with signature (P₀)(P₁)..(P_n) -> R is T₀ -> T₁ -> .. -> T_n -> R, where T_i is the bottom-up type of the pattern P_i. This is called "currying". The behavior of all the intermediate functions (those which do not return R) is to capture their arguments, plus any arguments from prior patterns, and return a function which takes the next set of arguments. When the "uncurried" function is called (the one taking T_n and returning R), all of the arguments are then available and the function body is finally evaluated as normal.
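
The capturing behavior can be spelled out by hand with a function that returns a closure. The sketch below is in released Swift 5 syntax rather than the prototype's multi-pattern signature syntax, and the function name add is hypothetical.

```swift
// Hand-written equivalent of a curried function of type (Int) -> (Int) -> Int.
func add(_ a: Int) -> (Int) -> Int {
    // The intermediate function only captures `a`...
    return { b in
        // ...and the body runs once the final argument arrives.
        a + b
    }
}

let addTen = add(10)       // supplies the first argument only
let total = addTen(5)      // supplies the last argument; the body now runs
let direct = add(1)(2)     // both argument sets at once
```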

A function declared with a selector-style signature func(a₀:T₀) name₁(a₁:T₁) .. name_n(a_n:T_n) -> R has the type (_:T₀, name₁:T₁, .. name_n:T_n) -> R, that is, the names of the fields in the argument tuple are the name_n identifiers preceding each argument pattern. However, in the body of a function described by a signature, those arguments will be bound using the corresponding a_n patterns inside the arguments. This allows for Cocoa-style keyword function names such as doThing(x, withThing:y) to be defined without requiring that an awkward keyword name be the same as the variable name.

Here are some examples of func definitions:

    // Implicitly returns (), aka Void
    func a() {}

    // Same as 'a'
    func a1() -> Void {}
    
    // Function pointers to a function expression.
    var a2 = func ()->() {}
    var a3 = func () {}
    var a4 = func {}

    // Really simple function
    func c(arg : Int) -> Int { return arg+4 }

    // Simple operators.
    func [infix_left=190] +  (lhs : Int, rhs : Int) -> Int
    func [infix_left=160] == (lhs : Int, rhs : Int) -> Bool

    // Curried function with multiple return values:
    func d(a : Int) (b : Int) -> (res1 : Int, res2 : Int) {
      return (a,b)
    }
    
    // A more realistic example on a trivial type.
    struct bankaccount { 
      var amount : Int
    
      static func bankaccount() -> bankaccount {
        // Custom 'constructor' logic goes here.
      }
      func deposit(arg : Int) {
        amount = amount + arg
      }
    
      static func someMetaTypeMethod() {}
    }
    
    // Dot syntax on metatype.
    bankaccount.someMetaTypeMethod()

    // A function with selector-style signature.

    enum PersonOfInterest {
      case ColonelMustard
      case MissScarlet
    }
    enum Room {
      case Conservatory
      case Ballroom
    }
    enum Weapon {
      case Candlestick
      case LeadPipe
    }

    func accuseSuspect(suspect:PersonOfInterest)
        inRoom(room:Room)
        withWeapon(weapon:Weapon) {
      println("It was \(suspect) in the \(room) with the \(weapon)")
    }

    // Calling a selector-style function.
    accuseSuspect(.ColonelMustard, inRoom:.Ballroom, withWeapon:.LeadPipe)

typealias Declarations

We use the keyword "typealias" instead of "typedef" because it really is an alias for an existing type, not a "definition" of a new type.

    decl-typealias ::= typealias-head '=' type
    typealias-head ::= 'typealias' identifier inheritance?

'typealias' makes a named alias of a type, like a typedef in C. From that point on, the alias may be used in all situations where the aliased type may be used. If an inheritance clause is provided, it specifies protocols to which the aliased type shall conform.

Here are some examples of type aliases:

    // location is an alias for a tuple of ints.
    typealias location = (x : Int, y : Int)
      
    // pair_fn is the type of a curried function that takes two ints and returns a tuple.
    typealias pair_fn = (Int) -> (Int) -> (first : Int, second : Int)

enum Declarations

    decl-enum ::= 'enum' attribute-list identifier generic-params? inheritance? enum-body
    enum-body ::= '{' decl* '}'
    
    decl-enum-element ::= 'case' enum-case (',' enum-case)*
    enum-case ::= identifier type-tuple? ('->' type)?

An enum declaration creates an enum type. Here are some examples of enum declarations:

    // Declares an enum with three elements.
    enum DataSearchFlags {
      case None
      case Backward
      case Anchored
    }

    // Shorthand for the above.
    enum DataSearchFlags {
      case None, Backward, Anchored
    }
    
    // Declare discriminated union with enum decl.
    enum SomeInts {
      case None
      case One(Int)
      case Two(Int, Int)
    }
    
    func f1(searchpolicy : DataSearchFlags)  // DataSearchFlags is a valid type name
    func test1() {
      f1(DataSearchFlags.None)  // Use of constructor with qualified identifier
      f1(.None)                 // Use of constructor with context sensitive type inference
    
      // "None" has no type argument, so the constructor's type is "DataSearchFlags". 
      var a : DataSearchFlags = .None
    }
    
    enum SomeMoreInts {
      case None           // Doesn't conflict with previous "None".
      case One(Int)
      case Two(Int, Int)
    }
    
    func f2(a : SomeMoreInts)
    
    func test2() {
      // Constructors for enum element can be used in the obvious way.
      f2(.None)
      f2(.One(4))
      f2(.Two(1, 2))
    
      // Constructor for None has type "SomeMoreInts".
      var a : SomeMoreInts = SomeMoreInts.None
    
      // Constructor for One has type "(Int) -> SomeMoreInts".
      var b : (Int) -> SomeMoreInts = SomeMoreInts.One
    
      // Constructor for Two has type "(Int,Int) -> SomeMoreInts".
      var c : (Int,Int) -> SomeMoreInts = SomeMoreInts.Two
    }

struct Declarations

    decl-struct ::= 'struct' attribute-list identifier generic-params? inheritance? '{' decl-struct-body '}'
    decl-struct-body ::= decl*

A struct declares a simple value type that can contain data members and have methods.

The body of a 'struct' is a list of decls. Non-property 'var' decls declare members with storage in the struct. Other declarations act like they would in an extension of the struct type.

Here are a few simple examples:

    struct S1 {
      var a : Int, b : Int
    }
    
    struct S2 {
      var a : Int
      func f() -> Int { return b }
      var b : Int
    }

Here are some more realistic examples of structs:

    struct Point { var x : Int, y : Int }
    struct Size { var width : Int, height : Int }
    struct Rect {
      var origin : Point
      var size : Size

      typealias CoordinateType = Int
    
      func area() -> Int { return size.width*size.height }
    }
    
    func test4() {
      var a : Point
      var b = Point.Point(1, 2)    // Silly but fine.
      var c = Point(y = 1, x = 2)  // Using metatype.
    
      var x1 = Rect(a, Size(42, 123))
      var x2 = Rect(size = Size(width = 42, height=123), origin = a)
    
      var x1_area = x1.size.width*x1.size.height
      var x1_area2 = x1.area()
    }

Structs do not support inheritance due to undesirable ripple effects across the design of the language. For example, method dispatch would arguably need to become virtual, not static. The storage of the type would arguably need to become indirected so that an array of T could be implemented sanely (because we don't know if T is actually a T, or a subclass of T). We'd need to store the "isa"/vtable in the struct so that virtual method dispatch could be implemented, and this has additional storage costs. None of these tradeoffs make sense for the intended use cases we have in mind for structs (Ints, Floats, Points, Rects, UUIDs, IP addresses, C struct interop, etc, etc). Said differently: we're trying to force a hard wall between types that need indirect access by their nature and those types that need direct access by their nature. The former are called classes. The latter are called structs.

class Declarations

    decl-class ::= 'class' attribute-list identifier generic-params? inheritance? '{' decl-class-body '}'
    decl-class-body ::= decl*

A class declares a reference type referring to an object which can contain data members and have methods. Classes support single inheritance; a parent class should be listed as the first type in the inheritance list.

The body of a 'class' is a list of decls. Non-property 'var' decls declare members with storage in the class. Non-static 'var' and 'func' decls declare instance members; static 'var' and 'func' decls declare members of the class itself. Both class and instance members can be overridden by a subclass.

Type declarations inside a class act essentially the same way as type declarations outside a class.

FIXME: For the moment, see classes.rst for more details on the class system.

FIXME: Add a reference to the section on generics.

The only way to create a new instance of a class is with a new expression.

Here is a simple example:

    class C1 {
      var a : Int
      var b : Int
    }
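The difference between a struct (a value type) and a class (a reference type) can be sketched as follows. This is only an illustration, written in the syntax of released Swift, which differs from the prototype syntax specified in this document (for example, instances are created without a new expression); the names PointValue and PointObject are hypothetical.

```swift
// Hypothetical types, for illustration only.
struct PointValue {
    var x: Int
    var y: Int
}

final class PointObject {
    var x: Int
    var y: Int
    init(x: Int, y: Int) {
        self.x = x
        self.y = y
    }
}

// Assigning a struct copies its storage.
let s1 = PointValue(x: 1, y: 2)
var s2 = s1
s2.x = 10   // s1.x is unaffected

// Assigning a class instance copies the reference, not the object.
let c1 = PointObject(x: 1, y: 2)
let c2 = c1
c2.x = 10   // c1.x changes too: both names refer to one object
```

The struct copy is independent of the original, whereas both class variables observe the same mutation.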

Protocol Declarations

    decl-protocol ::= 'protocol' attribute-list identifier inheritance? '{' protocol-member* '}'

A protocol declaration describes an abstract interface implemented by another type. It consists of a set of declarations, which may be instance methods or properties. A type conforms to a protocol if it provides declarations that correspond to each of the declarations in a protocol.

Here are some examples of protocols:

    protocol Document {
      var title : String
    }

'func' protocol elements

    protocol-member ::= decl-func

'func' members of a protocol define a value of function type that may be accessed with dot syntax on a value of the protocol's type. The function gets an implicit "self" argument of the protocol type and shall not be static.

'var' protocol elements

    protocol-member ::= decl-var

'var' members of a protocol define "property" values that may be accessed with dot syntax on a value of the protocol's type. The actual variables have no storage, and will instead be accessed by a getter and setter. Thus, the variables shall have neither an initializer nor a getter/setter clause.

'subscript' protocol elements

    protocol-member ::= subscript-head

'subscript' members of a protocol define subscripting operations that may be accessed with the subscript operator ('[]') applied to a value of the protocol's type.

TODO: There is currently no way to express a requirement for a read-only or write-only subscript operation or variable. We may end up doing this with some kind of 'const' or 'immutable' attribute.

'typealias' protocol elements (associated types)

    protocol-member ::= typealias-head

'typealias' members of a protocol define associated types, which are types used within the description of a protocol (typically in the inputs and outputs of 'func' members) that vary from one conforming type to another. When an associated type has an inheritance clause, any type meant to satisfy the associated type requirement must conform to each of the protocols specified within that inheritance clause.

    protocol Enumerable {
      typealias EnumeratorType : Enumerator
      func getElements() -> EnumeratorType
    }
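As a rough sketch of how an associated type varies from one conforming type to another, consider the following. It is written in released Swift syntax, where associated types are introduced with 'associatedtype' rather than the 'typealias' form given in the grammar above; the protocol Container and the types IntBox and NameBox are hypothetical.

```swift
// Hypothetical protocol with an associated type that must be Equatable.
protocol Container {
    associatedtype Element: Equatable
    var items: [Element] { get }
    func firstItem() -> Element?
}

extension Container {
    // Available to every conforming type, whatever its Element is.
    func has(_ x: Element) -> Bool { return items.contains(x) }
}

struct IntBox: Container {
    var items: [Int]                       // Element is inferred as Int
    func firstItem() -> Int? { return items.first }
}

struct NameBox: Container {
    var items: [String]                    // Element is inferred as String
    func firstItem() -> String? { return items.first }
}

let ints = IntBox(items: [3, 1, 4])
let names = NameBox(items: ["a", "b"])
```

Each conforming type satisfies the same requirement with a different concrete element type, and both element types must conform to the protocol named in the inheritance clause (here Equatable).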

subscript Declarations

    decl-subscript        ::= subscript-head '{' get-set '}'
    subscript-head        ::= 'subscript' attribute-list pattern-tuple '->' type

A subscript declaration provides support for subscripting an object of a particular type via a getter and (optional) setter. Therefore, subscript declarations can only appear within a type definition or extension.

The pattern-tuple of a subscript declaration provides the indices that will be used in the subscript expression, e.g., the i in a[i]. This pattern must be fully-typed. The type following the arrow provides the type of element being accessed, which must be materializable. Subscript declarations can be overloaded, so long as either the pattern-tuple or type differs from other declarations.

The get-set clause specifies the getter and setter used for subscripting. The getter is a function whose input is the type of the pattern-tuple and whose result is the element type. Similarly, the setter is a function whose result type is () and whose input is the type of the pattern-tuple with a parameter of the element type added to the end of the tuple; the name of the parameter is the set-name, if provided, or value otherwise.

    // Simple bit vector with storage for 64 boolean values
    struct BitVector64 {
      var bits : Int64

      // Allow subscripting with integer subscripts and a boolean result.
      subscript (bit : Int) -> Bool {
        // Getter tests the given bit
        get {
          if (bits & (1 << bit)) != 0 {
            return true
          }
          return false
        }

        // Setter sets the given bit to the provided value
        // ("value" is the default name of the setter's parameter)
        set {
          var mask = 1 << bit
          if value {
            bits = bits | mask
          } else {
            bits = bits & ~mask
          }
        }
      }
    }

    var vec : BitVector64
    vec[2] = true
    if vec[3] {
      print("bit 3 is set\n")
    }

constructor Declarations

    decl-constructor ::= 'init' attribute-list generic-params? constructor-signature brace-item-list

    constructor-signature ::= pattern-tuple
    constructor-signature ::= identifier-or-any selector-tuple

'init' declares a constructor for a class, struct, or enum. Such a declaration is used whenever an object is constructed. Specifically, for classes, it is used when a new expression is written, and for structs and enums, it is used for function application when the "function" is a metatype.

FIXME: We haven't decided the precise rules for when constructors are implicitly declared. Default construction doesn't work right for structs or enums. We haven't decided what the restrictions are if a member isn't default-constructible.

A simple example:

    struct X {
      var member : Int
      init(x : Int) {
        member = x
      }
    }
    var a = X(10)

If a class is derived from a superclass, each of its constructors must explicitly invoke a superclass constructor using the super.init syntax. super.init may only be used in a subclass constructor; it is not valid in a struct, enum, or root class constructor. Additionally, super.init must be referenced exactly once per derived constructor. An example:

    class View {
      var bounds : Rect
      init(bounds:Rect) {
        self.bounds = bounds
      }
    }

    class Button : View {
      var onClick : () -> ()
      init(bounds:Rect, onClick:() -> ()) {
        super.init(bounds)
        self.onClick = onClick
      }
    }

destructor Declarations

    decl-destructor ::= 'destructor' attribute-list brace-item-list

'destructor' declares a destructor for a class. This function is called when there are no longer any references to a class object, just before it is destroyed. Note that destructors can only be declared for classes, and cannot be declared in extensions. Subclass destructors implicitly invoke their superclass destructors after executing.

FIXME: We haven't really decided the precise rules here, but it's probably a fatal error to either throw an exception or stash a reference to 'self' in a destructor. Not sure what happens when we cause the reference count of another object to reach zero inside a destructor. We might eventually allow destructors in extensions once we have ivars in extensions.

A simple example:

    class X {
      var fd : Int
      destructor {
        close(fd)
      }
    }

Attribute Lists

    attribute-list        ::= /*empty*/
    attribute-list        ::= attribute-list-clause attribute-list
    attribute-list-clause ::= '[' ']'
    attribute-list-clause ::= '[' attribute (',' attribute)* ']'
    
    attribute      ::= attribute-infix
    attribute      ::= attribute-resilience
    attribute      ::= attribute-byref
    attribute      ::= attribute-auto_closure
    attribute      ::= attribute-noreturn

An attribute list is written as a sequence of clauses delimited by square brackets, each of which contains a (possibly empty) comma-separated list of attributes. Neither the ordering of attributes nor the grouping of attributes into separate clauses has any semantic effect. Attributes may not be repeated within a list.

Infix Attributes

    attribute-infix ::= 'infix_left'  '=' integer_literal
    attribute-infix ::= 'infix_right' '=' integer_literal
    attribute-infix ::= 'infix'       '=' integer_literal

The infix attributes may only be applied to the declaration of a function of binary operator type whose name is an operator. The name indicates the associativity of the operator, either left associative, right associative, or non-associative.

FIXME: Implement these restrictions.

Resilience Attribute

    attribute-resilience ::= 'resilient'
    attribute-resilience ::= 'fragile'
    attribute-resilience ::= 'born_fragile'

See the resilience design.

By-Reference Attribute

    attribute-byref ::= 'byref'

byref is only valid in a type-annotation that appears within either a pattern of a function-signature or the input type of a function type.

byref indicates that the argument will be passed "by reference": the bound variable will be an l-value.

The type being annotated must be materializable. The type after annotation is never materializable.

FIXME: we probably need a const-like variant, which permits r-values (and avoids writeback when the l-value is not physical). We may also need some way of representing this will be consumed by the nth curry.

auto_closure Attribute

    attribute-auto_closure ::= 'auto_closure'

The auto_closure attribute modifies a function type, changing the behavior of any assignment into (or initialization of) a value with the function type. Instead of requiring that the rvalue and lvalue have the same function type, an "auto closing" function type requires its initializer expression to have the same type as the function's result type, and it implicitly binds a closure over this expression. This is typically useful for function arguments that want to capture computation that can be run lazily.

auto_closure is only valid in a type-annotation of a syntactic function type that is defined to take a syntactic empty tuple.

  // An auto closure value.  This captures an implicit closure over the
  // specified expression, instead of the expression itself.
  var a : [auto_closure] () -> Int = 4
  
  // Definition of an 'assert' function.  Assertions and logging routines
  // often want to conditionally evaluate their argument.
  func assert(condition : [auto_closure] () -> Bool)
    
  // Definition of the || operator - it captures its right hand side as
  // an autoclosure so it can short-circuit evaluate it.
  func [infix_left=110] || (lhs: Bool, rhs: [auto_closure] ()->Bool) -> Bool
    
  // Example uses of these functions:
  assert(i < j)
  if (a == 0 || b == 42) { ... }
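The lazy evaluation that auto closures enable can be sketched as follows. This illustration uses released Swift syntax, where the attribute is spelled @autoclosure instead of [auto_closure]; the names Counter, expensiveCheck, and lazyOr are hypothetical helpers introduced only to make the evaluation count observable.

```swift
// Hypothetical counter used to observe how often the argument is evaluated.
final class Counter { var count = 0 }

func expensiveCheck(_ counter: Counter) -> Bool {
    counter.count += 1
    return true
}

// The right-hand side is implicitly wrapped in a closure and is only
// run if the left-hand side does not already decide the result.
func lazyOr(_ lhs: Bool, _ rhs: @autoclosure () -> Bool) -> Bool {
    if lhs { return true }
    return rhs()
}

let counter = Counter()
let r1 = lazyOr(true, expensiveCheck(counter))   // rhs is never evaluated
let countAfterFirst = counter.count
let r2 = lazyOr(false, expensiveCheck(counter))  // rhs is evaluated once
```

The caller writes an ordinary expression at the call site; whether it runs is decided by the callee.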

No Return Attribute

    attribute-noreturn ::= 'noreturn'

Attribute noreturn is only valid in the attribute list of a function declaration or in the attribute list of a type-annotation that describes a syntactic function type.

noreturn indicates to the compiler that the function will not return to the caller. This attribute can be used to suppress uninitialized-variable and missing-return warnings and errors. The compiler is also allowed to optimize the code more aggressively in the presence of this attribute.

If a function with a noreturn attribute contains a return statement, an error will be raised.

Types

    type ::= type-function
    type ::= type-array
    
    type-simple ::= type-identifier
    type-simple ::= type-tuple
    type-simple ::= type-composition
    type-simple ::= type-metatype
    type-simple ::= type-optional

    type-annotation ::= attribute-list type

Swift has a small collection of core datatypes that are built into the compiler. Most user-facing datatypes are defined by the standard library or declared as user-defined types.

Metatypes

Each type has a corresponding metatype, with the same name as the type, that is injected into the standard name lookup scope when a type is declared. This allows access to 'static functions' through dot syntax. For example:

    // Declares a type 'foo' as well as its metatype.
    struct foo {
      static func bar() {}
    }
    
    // Declares x to be of type foo.  A reference to a name in type context
    // refers to the type itself.
    var x : foo
    
    // Accesses a static function on the foo metatype.  In a value context, the
    // name of its type refers to its metatype.
    foo.bar()

Fully-Typed Types

A type may be fully-typed. A type is fully-typed unless one of the following conditions holds:

It is a function type whose result or input type is not fully-typed.
It is a tuple type with an element that is not fully-typed. A tuple element is fully-typed unless it has no explicit type (which is permitted for defaultable elements) or its explicit type is not fully-typed. In other words, a type is fully-typed unless it syntactically contains a tuple element with no explicit type annotation.

A type being 'fully-typed' informally means that the type is specified directly from its type annotation without needing contextual or other information to resolve its type.

Materializable Types

A type may be materializable. A type is materializable unless it is 1) annotated with a byref attribute or 2) a tuple with a non-materializable element type. In general, variables must have materializable type.

Named Types

    type-identifier ::= type-identifier-component ('.' type-identifier-component)*
    type-identifier-component ::= identifier generic-args?

Named types may be used simply by using their name. Named types are introduced by typealias declarations or through type declarations that expand to one.

    typealias location = (x : Int, y : Int)
    var x : location      // use of a named type.

Type names may use dot syntax to refer to named types declared in other modules or types nested within other types.

    // Direct reference to a member of another module.
    var x : swift.Int

Each component of a named type may be followed by a list of generic arguments for that component, enclosed in angle brackets <>.

    // A generic class definition.
    class Dict<K, V> { }

    // A variable of a generic instance type.
    var map : Dict<String, Int>

Tuple Types

Tuples are everywhere in Swift: even the argument list of a function is a tuple of those arguments.

    type-tuple ::= '(' type-tuple-body? ')'
    type-tuple-body ::= type-tuple-element (',' type-tuple-element)* '...'?
    type-tuple-element ::= identifier ':' type-annotation
    type-tuple-element ::= type-annotation

Syntactically, tuple types are simply a (possibly empty) list of elements enclosed in parentheses. A tuple type with a single, anonymous element is exactly that type: the parentheses are treated as grouping parentheses.

Tuples are the low-level form of data aggregation in Swift, and are used as the building block of function argument lists, multiple return values, enum bodies, etc. Because tuples are widely accessible and available everywhere in the language, aggregate data access and transformation is uniform and powerful.

Each element of a tuple contains an optional name followed by a type.

If the tuple body ends with '...', the tuple is a varargs tuple. The type of the last element is changed from T to T[], and there are special rules for converting an expression to varargs tuple type.

  // Variable definitions.
  var a : ()
  var b : (Int, Int)
  var c : (x : (), y : Int)

  // Tuple types inferred from initializers:
  var m = ()                     // Type = ()
  var n = (x = 1, y = 2)         // Type = (x : Int, y : Int)
  var o = (1, 2, 3)              // Type = (Int, Int, Int)

  // Function argument and result is a tuple type.
  func foo(x : Int, y : Int) -> (val : Int, err : Int)

  // enum and struct declarations with tuple values.
  struct S {
    var (a : Int, b : Int)
  }
  enum Vertex {
    case Point2(x : Int, y : Int)
    case Point3(x : Int, y : Int, z : Int)
    case Point4(w : Int, x : Int, y : Int, z : Int)
  }

Function Types

    type-function ::= type-tuple '->' type

Function types have a single input and single result type, separated by an arrow. Because each of the types is allowed to be a tuple, we trivially support multiple arguments and multiple results. "Function" types are more properly known as "closure" types, because they can embody any context captured when the function value was formed.

The result type of a function type must be materializable. The argument type of a function is always required to be parenthesized (a tuple). The behavior of function types may be modified with the auto_closure attribute.

Because of the grammar structure, a nested function type like "(a) -> (b) -> c" is parsed as "(a) -> ((b) -> c)". This means that a value of this type can be passed one argument to get a function that "takes b and returns c", or passed two arguments in succession to "get a c". This is known as currying. For example:

    // A simple function that takes a tuple and returns Int:
    var a : (a : Int, b : Int) -> Int

    // A simple function that returns multiple values:
    var a : (a : Int, b : Int) -> (val: Int, err: Int)

    // Declare a function that returns a function:
    var x : (Int) -> (Int) -> Int
    
    // y has type (Int) -> Int
    var y = x(1)

    // z1 and z2 both have type Int, and both have the same value (assuming
    // the function had no side effects).
    var z1 = x(1)(2)
    var z2 = y(2)
    
    // An auto closure value.  This captures an implicit closure over the
    // specified expression, instead of the expression itself.
    var a : [auto_closure] () -> Int = 4
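A minimal runnable sketch of the currying described above, written in released Swift syntax; the function name add is hypothetical.

```swift
// A function that returns a function; its type is (Int) -> (Int) -> Int,
// which parses as (Int) -> ((Int) -> Int).
func add(_ a: Int) -> (Int) -> Int {
    return { b in a + b }
}

// Applying one argument yields a function of type (Int) -> Int.
let addOne = add(1)

// Applying both arguments in succession yields an Int.
let z1 = add(1)(2)
let z2 = addOne(2)
```

Both z1 and z2 hold the same value, because the partially applied function captures the first argument.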

Enum Types

'enum' types are known as algebraic data types (ADTs) by the broader programming language community. We name them 'enum' after C enums, because ADTs fulfill many of the same roles as enums in the C tradition.

An enum type is a simple discriminated union: the runtime representation of a value of enum type only has one of the specified elements at a time.

All of the element types of an enum type must be materializable.

An enum type is defined by an enum decl.

Values of enum type may not be default initialized unless the user provides a no-argument constructor.

The enum metatype has a member corresponding to each declared element. For elements with a declared type, this member is a function which can construct an enum containing that element. For elements without a declared type, the member is simply an enum value for that element. An enum value has no accessible members except those explicitly defined by the user.

A reference to a member of the enum metatype can be shortened using delayed identifier resolution with context sensitive type inference.

The enum's value can be tested and accessed by pattern-matching the enum against an enum element pattern.
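The following sketch shows element construction and extraction by pattern matching. It uses released Swift syntax, which may differ from what this prototype eventually specifies (see the TODO below); the enum mirrors the SomeInts example from the enum declarations section, and the function sum is hypothetical.

```swift
// Hypothetical enum mirroring the earlier SomeInts example.
enum SomeInts {
    case none
    case one(Int)
    case two(Int, Int)
}

// Test which element is present and extract its payload.
func sum(_ v: SomeInts) -> Int {
    switch v {
    case .none:
        return 0
    case .one(let a):
        return a
    case .two(let a, let b):
        return a + b
    }
}

// An element with a declared type is a function on the metatype.
let maker: (Int, Int) -> SomeInts = SomeInts.two
```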

TODO: Should attributes be allowed on enum elements?

TODO: Eventually, with generics we'll have equality and inequality operators. Enum decls should be able to implicitly define these for their types.

TODO: Need pattern matching and element extraction.

Array Types

Note that array types are parsed inside-out, with the first bounds clause being the outermost one. This little oddity is required for the bounds of nested arrays to correspond in sequence to subscript indexes. That is, given an array "x : Int[5][7][11][13]" and a chained subscript expression of the form "x[i][j][k][l]", we really want "i" to be bounded by 5, "j" by 7, and so on. This is probably the only case where C's rule of "declaration follows use" really makes sense. There's precedent for this in many languages, including Java and C#.

    type-array ::= type-simple
    type-array ::= type-array '[' ']'
    type-array ::= type-array '[' expr ']'

Array types include a base type and an optional size. Array types indicate a linear sequence of elements stored consecutively in memory. Array elements may be efficiently indexed in constant time. All array indexes are bounds checked and out of bound accesses are diagnosed with either a compile time or runtime failure (TODO: runtime failure mode not specified).

While they look syntactically very similar, an array type with a size has very different semantics than an array without. In the former case, the type indicates a declaration of actual storage space. In the latter case, the type indicates a reference to storage space allocated elsewhere of runtime-specified size.

FIXME: We should separate out "Arrays" from "Slices". An array should always require a size and be by-value; a slice should be by-ref and never have a (statically specified) size.

For an array with a size, the size must be more than zero (otherwise no indices would be valid). For now, the array size must be a literal integer. TODO: Define a notion like C's integer-constant-expression for how constant folding works.

The element type of an array type must be materializable.

FIXME: Int[][] not valid because the element type isn't sized. We need some constraint to reject this, or do we?

Some example array types:

    // A simple array declaration:
    var a : Int[4]
    
    // A reference to another array:
    var b : Int[] = a
        
    // Declare a two dimensional array:
    var c : Int[4][4]
    
    // Declare a reference to another array, two dimensional:
    var d : Int[4][]

    // Declare an array of function pointers:
    var array_fn_ptrs : ((Int) -> Int)[42]
    var g = array_fn_ptrs[12](4)

    // Without parens, this is a function that returns a fixed size array:
    var fn_returning_array : (Int) -> Int[42]
    var h : Int[42] = fn_returning_array(4)
    
    // You can even have arrays of tuples and other things, these work right
    // through composition:
    var array_of_tuples : (a : Int, b : Int)[42]
    var tuple_of_arrays : (a : Int[42], b : Int[42])
    
    array_of_tuples[12].a = array_of_tuples[13].b
    tuple_of_arrays.a[12] = tuple_of_arrays.b[13]

Metatype Types

    type-metatype ::= type-simple '.' 'metatype'

Every type has an associated metatype. A value of the metatype type is a reference to a global object which describes the type. Most metatype types are singleton and therefore require no storage, but metatypes associated with class types follow the same subtyping rules as their associated class types and therefore are not singleton.

Optional Types

Similar constructs exist in Haskell (Maybe), the Boost library (Optional), and C++17 (std::optional).

      type-optional ::= type-simple '?'

An optional type is syntactic sugar for the library type Optional<T>. This is an enum with two cases: None and Some, used to represent a value that may or may not be present.

FIXME: The current implementation of Optional is a hack and None and Some are currently both globals rather than members.

Optional types are different from other enums in that any value x is implicitly convertible to Some(x). This is part of the language, not the library; therefore, the library type Optional is required to have a constructor that accepts a single argument.

Since optional types are part of the type-simple grammar, it is not possible to write T[]? for an optional array. Use (T[])? instead.

Some example optional types:

      // A simple optional declaration:
      var a : Int? // equivalent to Optional<Int>
    
      // An empty optional:
      var b : Int? = .None
        
      // Declare an array of optionals:
      var c : Int?[] = new Int?[4]
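The implicit conversion from x to Some(x) can be sketched in released Swift as follows. Note the assumptions: in released Swift the cases are spelled .some and .none and are members of Optional rather than globals; the function describe is hypothetical.

```swift
// 5 is implicitly wrapped as .some(5).
let a: Int? = 5

// An empty optional.
let b: Int? = .none

// Int? is sugar for Optional<Int>, so the two types are interchangeable.
let c: Optional<Int> = a

// Test for presence and extract the payload by pattern matching.
func describe(_ v: Int?) -> String {
    switch v {
    case .some(let x):
        return "some(\(x))"
    case .none:
        return "none"
    }
}
```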

Protocol Composition Types

   type-composition ::= 'protocol' '<' type-composition-list? '>'

   type-composition-list ::= type-identifier (',' type-identifier)*

A protocol composition type composes together a number of protocols to describe a type that meets the requirements of each of those protocols. A protocol composition type protocol<A, B> is similar to an explicitly-defined protocol that inherits both A and B

    protocol C : A, B { }

but without the need to introduce a new name.

If we drop implicit conformance to protocols, protocol composition types become much more important, because they allow you to give a name to a composition without requiring types to explicitly conform to that name.

Each of the types named in the type-composition-list shall refer to either a protocol or to a protocol composition. The list may be empty, in which case every type conforms to the empty protocol composition. This is how the Any type is defined in the standard library.

    // A value that represents any type
    var any : protocol<> = 17

    // A value that conforms to both the Document and Enumerator protocols
    var doc : protocol<Document,Enumerator>
    doc.isEmpty()       // uses Enumerator.isEmpty()
    doc.title = "Hello" // uses Document.title
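A runnable sketch of the same idea in released Swift, where a composition is written A & B rather than protocol<A, B>, and Any plays the role of the empty composition. The protocols Titled and Countable, the struct Report, and the function summary are all hypothetical.

```swift
// Hypothetical protocols and a type that conforms to both.
protocol Titled { var title: String { get } }
protocol Countable { func isEmpty() -> Bool }

struct Report: Titled, Countable {
    var title: String
    var pages: [String]
    func isEmpty() -> Bool { return pages.isEmpty }
}

// Accepts any value that conforms to both protocols, without
// introducing a new named protocol that inherits from them.
func summary(_ doc: Titled & Countable) -> String {
    return doc.isEmpty() ? "\(doc.title) (empty)" : doc.title
}

// A value of any type at all.
let anything: Any = 17
```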

Type Inheritance

    inheritance ::= ':' type-identifier (',' type-identifier)*

A named type (e.g., a class, struct, enum, or protocol) can "inherit" some set of protocols, which implies that any object of that type conforms to each of those protocols. When a protocol inherits other protocols, the set of requirements from all of those protocols is effectively aggregated into the protocol, and a type that conforms to the current protocol shall conform to each of the protocols that it inherits.

When a non-protocol type inherits a protocol, it is specifying explicitly that it conforms to that protocol. The program is ill-formed if the type does not conform to the protocol.

    protocol VersionedDocument : Document { // every VersionedDocument is a Document
      func bumpVersion()
    }

    func print(doc : Document) { /* ... */ }

    var myDocument : VersionedDocument
    print(myDocument) // okay: a VersionedDocument is a Document

    class StoredHTML : VersionedDocument { // okay: StoredHTML conforms to VersionedDocument
      var title : String
      func bumpVersion() { /* ... */ }
    }

Patterns

The pattern grammar mirrors the expression grammar, or to be more specific, the grammar of literals. This is because the conceptual algorithm for matching a value against a pattern is to try to find an assignment of values to variables which makes the pattern equal the value. So every expression form which can be used to build a value directly should generally have a corresponding pattern form.

    pattern-atom ::= pattern-var
    pattern-atom ::= pattern-any
    pattern-atom ::= pattern-tuple
    pattern-atom ::= pattern-is
    pattern-atom ::= pattern-enum-element
    pattern-atom ::= expr

    pattern      ::= pattern-atom
    pattern      ::= pattern-typed

A pattern represents the structure of a composite value. Parts of a value can be extracted and bound to variables or compared against other values by pattern matching. Among other places, pattern matching occurs on the left-hand side of var bindings, in the arguments of func declarations, and in the case labels of switch statements. Some examples:

    var point = (1, 0, 0)

    // Extract the elements of the "point" tuple and bind them to
    // variables x, y, and z.
    var (x, y, z) = point
    println("x=\(x) y=\(y) z=\(z)")

    // Dispatch on the elements of a tuple in a "switch" statement.
    switch point {
    case (0, 0, 0):
      println("origin")
    // The pattern "_" matches any value.
    case (_, 0, 0):
      println("on the x axis")
    case (0, _, 0):
      println("on the y axis")
    case (0, 0, _):
      println("on the z axis")
    case (var x, var y, var z):
      println("x=\(x) y=\(y) z=\(z)")
    }

A pattern may be "irrefutable", meaning informally that it matches all values of its type. Patterns in declarations, such as var and func, are required to be irrefutable. Patterns in the case labels of switch statements, however, are not.
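The distinction can be sketched in released Swift syntax as follows; the function classify is hypothetical. The tuple pattern in the declaration matches every value of its type, while the first two case patterns can fail to match.

```swift
let point = (1, 0, 0)

// Irrefutable: this pattern matches every (Int, Int, Int) value,
// so it is allowed in a declaration.
let (x, y, z) = point

func classify(_ p: (Int, Int, Int)) -> String {
    switch p {
    // Refutable: matches only the one value (0, 0, 0).
    case (0, 0, 0):
        return "origin"
    // Refutable: matches only when the last two elements are 0.
    case (_, 0, 0):
        return "on the x axis"
    // Irrefutable: matches anything that reaches it and binds its elements.
    case (let a, let b, let c):
        return "x=\(a) y=\(b) z=\(c)"
    }
}
```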

The basic pattern grammar is a literal "atom" followed by an optional type annotation. Type annotations are useful for documentation, as well as for coercing a matched expression to a particular kind. They are also required when patterns are used in a function signature. Type annotations are currently not allowed in switch statements.

A pattern has a type. A pattern may be "fully-typed", meaning informally that its type is fully determined by the type annotations it contains. Some patterns may also derive a type from their context, be it an enclosing pattern or the way it is used; this set of situations is not yet fully determined.

Typed Patterns

    pattern-typed ::= pattern-atom ':' type-annotation

A type annotation constrains a pattern to have a specific type. An annotated pattern is fully-typed if its annotation type is fully-typed. It is irrefutable if and only if its subpattern is irrefutable.

Type annotations are currently not allowed in the case labels of switch statements; case patterns always get their type from the subject of the switch.

Any Patterns

    pattern-any ::= '_'

The symbol _ in a pattern matches and ignores any value. It is irrefutable.

'var' Patterns

    pattern-var ::= 'var' pattern

The keyword var within a pattern introduces variable bindings. Any identifiers within the subpattern bind new named variables to their matching values.

    var point = (0, 0, 0)
    switch point {
    // Bind x, y, z to the elements of point.
    case (var x, var y, var z):
      println("x=\(x) y=\(y) z=\(z)")
    }

    switch point {
    // Same. 'var' distributes to the identifiers in its subpattern.
    case var (x, y, z):
      println("x=\(x) y=\(y) z=\(z)")
    }

Outside of a var pattern, an identifier behaves as an expression pattern referencing an existing definition.

    var zero = 0
    switch point {
    // x and z are bound as new variables.
    // zero is a reference to the existing 'zero' variable.
    case (var x, zero, var z):
      println("point off the y axis: x=\(x) z=\(z)")
    default:
      println("on the y axis")
    }

The left-hand pattern of a var declaration and the argument pattern of a func declaration are implicitly inside a var pattern; identifiers in their patterns always bind variables. Variable bindings are irrefutable.

The type of a bound variable must be materializable unless it appears in a function-signature and is directly of a byref-annotated type.

Tuple Patterns

    pattern-tuple ::= '(' pattern-tuple-body? ')'
    pattern-tuple-body ::= pattern-tuple-element (',' pattern-tuple-element)* '...'?
    pattern-tuple-element ::= pattern
    pattern-tuple-element ::= pattern '=' expr

A tuple pattern is a list of zero or more patterns. Within a function signature, patterns may also be given a default-value expression.

A tuple pattern is irrefutable if all its sub-patterns are irrefutable.

A tuple pattern is fully-typed if all its sub-patterns are fully-typed, in which case its type is the corresponding tuple type, where each type-tuple-element has the type, label, and default value of the corresponding pattern-tuple-element. A pattern-tuple-element has a label if it is a named pattern or a type annotation of a named pattern.

A tuple pattern whose body ends in '...' is a varargs tuple. The last element of such a tuple must be a typed pattern, and the type of that pattern is changed from T to T[]. The corresponding tuple type for a varargs tuple is a varargs tuple type.

As a special case, a tuple pattern with one element that has no label, has no default value, and is not varargs is treated as a grouping parenthesis: it has the type of its constituent pattern, not a tuple type.

'is' Patterns

    pattern-is ::= 'is' type

is patterns perform a type check equivalent to the x is T cast operator. The pattern matches if the runtime type of a value is of the given type. is patterns are refutable and thus cannot appear in declarations.

    class B {}
    class D1 : B {}
    class D2 : B {}

    var bs : B[] = [B(), D1(), D2()]

    for b in bs {
      switch b {
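      // Cases are tested in source order, so the derived classes come first;
      // a leading 'case is B' would match every element.
      case is D1:
        println("D1")
      case is D2:
        println("D2")
      case is B:
        println("B")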
      }
    }

Enum Element Patterns

    pattern-enum-element ::= type-identifier? '.' identifier pattern-tuple?

Enum element patterns match a value of enum type if the value matches the referenced case of the enum. If the case has a type, the value of that type can be matched against an optional subpattern.

    enum HTMLTag {
      case A(href:String)
      case IMG(src:String, alt:String)
      case BR
    }

    switch tag {
    case .BR:
      println("<br>")
    case .IMG(var src, var alt):
      println("<img src=\"\(escape(src))\" alt=\"\(escape(alt))\">")
    case .A(var href):
      println("<a href=\"\(escape(href))\">")
    }

Enum element patterns are refutable and thus cannot appear in declarations. (They are currently considered refutable even if the enum contains only a single case.)

Expressions in Patterns

Patterns may include arbitrary expressions as subpatterns. Expression patterns are refutable and thus cannot appear in declarations. An expression pattern is compared to its corresponding value using the ~= operator. The match succeeds if expr ~= value evaluates to true. The standard library provides a default implementation of ~= using == equality; additionally, range objects may be matched against integer and floating-point values. The ~= operator may be overloaded like any function.

    var point = (0, 0, 0)
    switch point {
    // Equality comparison.
    case (0, 0, 0):
      println("origin")
    // Range comparison.
    case (-10..10, -10..10, -10..10):
      println("close to the origin")
    default:
      println("too far away")
    }

    // Define pattern matching of an integer value to a string expression.
    func ~=(pattern:String, value:Int) -> Bool {
      return pattern == "\(value)"
    }

    // Now we can pattern-match strings to integers:
    switch point {
    case ("0", "0", "0"):
      println("origin")
    default:
      println("not the origin")
    }

The order of evaluation of expressions in patterns, including whether an expression is evaluated at all, is unspecified. The compiler is free to reorder or elide expression evaluation in patterns to improve dispatch efficiency. Expressions in patterns therefore cannot be relied on for side effects.

Expressions

Support for user-defined operators causes some amount of parsing to be delayed until after name resolution has occurred. Other restrictions and disambiguations in the grammar permit the parser to decide all other aspects of parsing, such as where statements must be divided.

Semicolons in C are generally just clutter. Swift generally tries to define away the need for them.

    expr          ::= expr-basic
    expr          ::= expr-trailing-closure expr-cast?

    expr-basic    ::= expr-sequence expr-cast?

    expr-sequence ::= expr-unary expr-binary*
    
    expr-primary  ::= expr-literal
    expr-primary  ::= expr-identifier
    expr-primary  ::= expr-super
    expr-primary  ::= expr-closure
    expr-primary  ::= expr-anon-closure-arg
    expr-primary  ::= expr-paren
    expr-primary  ::= expr-delayed-identifier

    expr-postfix  ::= expr-primary
    expr-postfix  ::= expr-postfix operator-postfix
    expr-postfix  ::= expr-new
    expr-postfix  ::= expr-dot
    expr-postfix  ::= expr-metatype
    expr-postfix  ::= expr-subscript
    expr-postfix  ::= expr-call

At the top level of the expression grammar, expressions are a sequence of unary expressions joined by binary operators. When parsing an expr, a binary operator immediately following an expr-unary continues the expression, and the program is ill-formed if it is not then followed by another expr-unary. This resolves an ambiguity which could otherwise arise in statement contexts due to semicolon elision.

    5 !- +~123 -+- ~+6
    (foo)(())
    bar(49+1)
    baz()

A unary or binary expression may optionally be followed by a cast operator.

Binary Operators

Should this use the expr-identifier production to allow qualified identifiers? This would allow "foo swift.+ bar". Is ADL or something like it enough?

    expr-binary ::= op-binary-or-ternary expr-unary expr-cast?
    
    op-binary-or-ternary ::= operator-binary
    op-binary-or-ternary ::= '='
    op-binary-or-ternary ::= '?' expr-sequence ':'

    expr-cast ::= 'is' type
    expr-cast ::= 'as' '!' type

Infix binary expressions are not formed during parsing. Instead, they are formed after name resolution by building a tree from an operator-delimited sequence of unary expressions. Precedence and associativity are determined by the infix attribute on the resolved names, which must fully agree.

If an operator is used as a binary operator, but name resolution does not find at least one function of binary operator type, the expression is ill-formed.

A simple example is:

    4 + 5 * 123

Builtin Binary Operators

In addition to user-defined operators, a handful of builtin operators are defined that parse inside binary expressions with predefined precedence and associativity.

Assignment operator

The assignment operator a = b updates the value of a with the value of b. Its precedence is hardcoded as if declared as follows:

    // Not valid Swift code
    operator infix = {
      precedence 90
      associativity right
    }

The left-hand operand must be an lvalue, or a tuple of lvalues. Assigning to a tuple of lvalues performs destructuring reassignment.

    var (a, b) = (1, 2)

    // Swap two values.
    (a, b) = (b, a)

    // Reassign two values.
    (a, b) = (11, 22)

    // Reassign two values by destructuring a tuple.
    var tuple = (111, 222)
    (a, b) = tuple

An assignment expression evaluates to void. Unlike C, productions such as these are invalid:

    var x, y, z : Int

    // Error: x = y doesn't return Bool
    if x = y { }

    // Error: (y = z) doesn't return Int
    x = y = z

Ternary operator

The ternary operator a ? b : c conditionally evaluates its middle or right operand based on the value of its left operand. Its precedence is hardcoded as if the middle ? b : subexpression were a binary operator declared as follows:

    // Not valid Swift code
    operator infix ?...: {
      precedence 100
      associativity right
    }

The subexpression to the left of the '?' is evaluated, and is converted to 'Bool' using the result's 'getLogicValue' method if it is not already 'Bool'. If the condition is true, the subexpression to the right of '?' is evaluated, and its result becomes the result of the expression. If the condition is false, the subexpression to the right of ':' is evaluated, and its result becomes the result of the expression. Only one of the '?' or ':' subexpressions will be evaluated. The results of the '?' and ':' subexpressions must be implicitly convertible to a common type, which becomes the type of the ternary expression.

    x += b ? y : z
    x += a ? b ? y : z : w

    for i in 1..101 {
      println(i % 15 == 0 ? "fizzbuzz"
            : i %  3 == 0 ? "fizz"
            : i %  5 == 0 ? "buzz"
            : "\(i)")
    }

Cast operators

Cast expressions influence the types of their subexpressions. They can appear at the end of a binary operator sequence; their left operand is parsed as if the cast operators were declared as follows:

    // Not valid Swift code
    operator infix as {
      precedence 95
      associativity none
    }

The right operand of every cast operator is parsed as a type.

x as! T will try to cast the value of the expression x to a subtype of its compile-time type. The type of the value is checked at runtime, and if the cast cannot succeed, the program terminates. T must be a subtype of the compile-time type of x. An example:

    var b:B = D()
    var d = b as! D

x is T will query the type of the value of x at runtime. T must be a subtype of the compile-time type of x. If the runtime value of x is of type T, the is expression evaluates to true; otherwise, it evaluates to false.

    if b is D {
      var d = b as! D
    }

as! and is both parse a type for their right-hand argument. They must be parenthesized if followed by subsequent operators:

    (b as! D).derivedMethod()
    ((b as! D) as! D2)
    (b is D) ? (b as! D) : D()

Unary Operators

    expr-unary   ::= operator-prefix* expr-postfix

If an operator is used as a unary operator, but name resolution does not find at least one function that takes a single argument, the expression is ill-formed.

Simple examples:

    i = -j

Literals

The type of a literal is inferred from its context, to allow things like "4" to be compatible with any width integer type without 'promotion' rules or casting. In ambiguous cases like "var x = 4", the literals are forced to a default type specified by the standard library.

    expr-literal ::= integer_literal
    expr-literal ::= floating_literal
    expr-literal ::= character_literal
    expr-literal ::= string_literal
    expr-literal ::= '__FILE__'
    expr-literal ::= '__LINE__'
    expr-literal ::= '__COLUMN__'

Literals are integer, floating-point, character, or string literals, depending on their lexical form. The type of a literal is inferred from its context. If an expression provides no contextual type information, unresolved literal types are inferred to be 'IntegerLiteralType', 'FloatLiteralType', 'CharacterLiteralType', and 'StringLiteralType', respectively. If a literal is used and the corresponding type is not defined, the code is ill-formed.
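
The following sketch illustrates contextual literal typing. It assumes the standard library provides Int8 and Double and that both are literal-compatible; the variable names are hypothetical.

```swift
// No contextual type: the literal takes the library's default integer literal type.
var a = 4

// The annotation supplies the context, so the same literal becomes an Int8.
var b : Int8 = 4

// An integer literal used where a Double is expected; no promotion or cast is written.
var c : Double = 4

// No contextual type: the literal takes the default floating-point literal type.
var d = 4.5
```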

A literal is compatible with its inferred type if that type implements an informal protocol required by literals. This informal protocol requires that the type have an unambiguous "static" function defined whose result type is the same as the inferred type, and that takes a single argument that is either itself literal compatible, or is a builtin integer type.

The '__FILE__', '__LINE__', and '__COLUMN__' magic identifiers expand to a literal representation of their position in the source code. '__FILE__' expands to a string literal; '__LINE__' and '__COLUMN__' each expand to an integer literal.

    // File foo.swift

    var file = __FILE__  // file : String = "foo.swift"
    var line = __LINE__  // line : Int = 4
    var col = __COLUMN__ // col : Int = 11

If '__FILE__', '__LINE__', and/or '__COLUMN__' are used as default argument values in a function declaration, they instead expand to the source location of each function call that instantiates the default argument.

    func log(message:String,
             file:String = __FILE__,
             line:Int = __LINE__) {
      println("\(file):\(line): \(message)")
    }

    log("Orders received")
    doIt()
    log("Job's finished")

Identifiers

    expr-identifier ::= identifier generic-args?

A raw identifier refers to a value found via unqualified value lookup, and has the type of the declaration returned by name lookup and overload resolution. Value declarations are installed with var and the syntactic sugar forms like func declarations.

If an identifier refers to a generic type, an instance of that generic may be referenced by following the identifier with a list of type parameters enclosed in angle brackets <>:

    // A generic struct.
    struct Dict<K,V> {
      init() {}
      static func fromKeysAndValues(keys:K[], values:V[]) -> Dict<K,V> {}
    }

    // Construct an instance of the generic struct.
    var foo = Dict<String, Int>()
    // Invoke a static method of an instance of the generic struct.
    var bar = Dict<String, Int>.fromKeysAndValues(
      ["zim", "zang", "zung"],
      [ 123,    456,    789 ])

Generic disambiguation

Note that < and > are used as both angle brackets in generic identifiers and as characters in binary operator names. Because of this, there are potential parsing ambiguities. Swift uses a context-free heuristic to determine whether to parse an expression involving < and > as a generic parameter list or a binary operator:

When an identifier is followed by <, Swift attempts to parse starting from the < as a generic parameter list.
If it succeeds in parsing a generic parameter list, it looks at the token after the closing >. If it sees one of the following tokens:
( [ { } ] ) . , ;
then the expression is parsed as a generic parameter list.
If Swift cannot parse a generic parameter list after the <, or the matching > is not followed by one of the above tokens, the < is parsed as an operator character.

These rules assume that, in most cases, generic type names will be used in constructor expressions as in Foo<T>(x) or to access static members as in Foo<T>.bar(). Referring to a generic metatype as a value in an expression may require parentheses around the type name.

    // An operator that operates on metatypes.
    func [infix] +-+ <T, U>(t:T.metatype, u:U.metatype) -> Foo { }
    
    var foo = (Dict<String, Int>) +-+ (Slice<Char>)
    println(foo)

On the other hand, some expressions involving < and > operators may misparse as generic arguments as well. These can also be corrected by adding or removing parentheses.

    func foo(x:Bool, y:Bool)
    var a,b,c,d,e : Int

    foo(a < b, c > (d + e)) // ERROR: Misparses as (a<b,c>)(d + e)
    foo((a < b), c > (d + e)) // Force parsing as (a < b), (c > (d + e))
    foo(a < b, c > d + e) // Also parses as (a < b), (c > (d + e))

Super

    expr-super ::= expr-super-method
    expr-super ::= expr-super-subscript
    expr-super ::= expr-super-constructor

    expr-super-method ::= 'super' '.' expr-identifier
    expr-super-subscript ::= 'super' '[' expr ']'
    expr-super-constructor ::= 'super' '.' 'init'

The keyword super is used to refer to superclass members from a subclass method. This can be used to access members of a superclass overridden by the subclass. The following forms are allowed:

A superclass property or method can be accessed with the form super.name.
A superclass subscript accessor can be accessed with the form super[index].
Within a constructor, a superclass constructor can be accessed with the form super.init.

super expressions are invalid outside of a subclass method. super.init is invalid outside of a subclass constructor. super.init furthermore may only be called once per derived constructor, and must be called before the derived constructor accesses self or any instance variables.
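
A minimal sketch of the super.init rules, using hypothetical classes Base and Derived. The derived constructor calls super.init exactly once, and only afterwards touches an instance variable:

```swift
class Base {
  var id : Int
  init() { id = 1 }
}

class Derived : Base {
  init(start : Int) {
    // Called once, before 'id' or 'self' is accessed.
    super.init()
    id = start
  }
}

var d = Derived(start : 5)
```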

Closure Expression

    expr-closure ::= '{' closure-signature? brace-item-list '}'

    closure-signature ::= pattern-tuple func-signature-result? 'in'
    closure-signature ::= identifier (',' identifier)* func-signature-result? 'in'

A closure defines an anonymous function as an expression. Like a func declaration, a closure has parameters, a return type, and some number of statements that are executed when the closure is called. Like local functions, closures can capture values from their enclosing function and closure scopes. Closures are often used in lieu of local functions when the function name would only be used once, to be called by some other function. As a syntactic convenience, when the closure contains only a single expression, its value is used as the result of the closure. Thus, the closure { 5 } is equivalent to { return 5 }.

Unlike func declarations, the return type, parameter types, and even the names of parameters can be omitted from the definition of the closure, making it a concise syntax for small closures. In such cases, the context in which the closure is used must provide information about the parameter and return types. In the special case where the closure consists of only a single expression, that expression participates in the type checking of its context.

    // Takes a closure that it calls to determine an ordering relation.
    func magic(val : Int, predicate : (a : Int, b : Int) -> Bool)
    
    func f() {
      // Compare one way.  Closure is inferred to return Bool and take two ints
      // from the argument context.  This same information infers that $0 and $1
      // both have type 'Int'.
      magic(42, { $0 < $1 })
    
      // Compare the other way.
      magic(42, { $1 < $0 })

      // Provide parameter names, but infer the types.
      magic(42, { x, y in y < x })

      // Provide parameter names and types.
      magic(42, { (x : Int, y : Int) in y < x })

      // Provide parameter names and types, and return type, with multiple statements.
      magic(42, { (x : Int, y : Int) -> Bool in
        print("Comparing \(x) to \(y).\n")
        return y < x
      })

      // Error, not enough context to infer the type of $0.
      var x = { $0 } 
    }

Anonymous Closure Arguments

    expr-anon-closure-arg ::= dollarident

A use of an identifier whose name fits the "$[0-9]+" regular expression is a reference to an anonymous closure argument that is formed when the containing expression is coerced into a closure context. All other dollar identifiers are invalid.

This can only be used in the body of a closure (expr-closure) that does not have explicitly-specified parameters.
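
As a small sketch (the names are hypothetical), a type annotation on a variable is enough context to give $0 and $1 their types:

```swift
// The declared type supplies the context: two Int parameters and a Bool result.
var lessThan : (Int, Int) -> Bool = { $0 < $1 }

var ascending = lessThan(1, 2)    // true
var descending = lessThan(2, 1)   // false
```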

Delayed Identifier Resolution

The ".bar" syntax was picked because it is related to the syntax of a fully qualified "foo.bar" reference.

    expr-delayed-identifier ::= '.' identifier

A delayed identifier expression refers to a constructor of an enum type, without knowing which type it is referring to. The expression is resolved to a constructor of a concrete type through context-sensitive type inference.

    enum Direction { case Up, Down }
    func search(val : Int, direction : Direction)
    
    func f() {
      search(42, .Up)
      search(17, .Down)
    }

Parenthesized Expressions

    expr-paren      ::= '(' ')'
    expr-paren      ::= '(' expr-paren-element (',' expr-paren-element)* ')'
    expr-paren-element ::= (identifier ':')? expr

Parentheses expressions contain an (optionally empty) list of optionally named values. Parentheses in an expression context denote one of two things: 1) grouping parentheses, or 2) a tuple literal.

Grouping parentheses occur when there is exactly one value in the list and that value does not have a name. In this case, the type of the parenthesis expression is the type of the single value.

All other cases are tuple literals. The type of the expression is a tuple type whose elements and order match that of the initializer. If there are any named elements, those elements become names for the tuple type. A parenthesis expression with no value has a type of the empty tuple.

Some examples:

    // Simple grouping parenthesis.
    var a = (4)             // Type = Int
    var b = (4+a)           // Type = Int
    
    // Tuple literals.
    var c = ()               // Type = ()
    var d = (4, 5)           // Type = (Int, Int)
    var e = (c, d)           // Type = ((), (Int, Int))
    
    var f = (x : 4, y : 5)   // Type = (x : Int, y : Int)
    var g = (4, y : 5, 6)    // Type = (Int, y : Int, Int)
    
    // Named arguments to functions.
    func foo(a : Int, b : Int)
    foo(b : 4, a : 1)

Dot Expressions

    expr-dot ::= expr-postfix '.' dollarident

If the base expression has tuple type, then the magic identifier "$[0-9]+" accesses the specified anonymous member of the tuple. Otherwise, this form is invalid.

    expr-dot ::= expr-postfix '.' expr-identifier

If the base expression has tuple type and if the identifier is the name of a field in the tuple, then this is a reference to the specified field.

Otherwise, dot name lookup is performed, and this expression is treated as function application. This allows looking up members in modules, metatypes, etc.
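
A brief sketch of the named-field case, with hypothetical names. It covers only access by field name, not the positional form:

```swift
// A tuple literal with named elements.
var p = (x : 4, y : 5)

// Each dot expression refers to the tuple field of that name.
var sum = p.x + p.y
```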

Subscript Expressions

There is no "built-in" semantics for subscripting. Rather, all subscripting semantics is implemented via subscript declarations in the library.
We require that the '[' not be the first token on a line, so that a statement can begin with an array expression.

    expr-subscript ::= expr-postfix '[' expr ']'

A subscript expression invokes a subscript getter or setter on the type of the expr-postfix. The expr is used as the subscript argument, which will be provided to either the getter or setter depending on whether the subscript expression is used as an rvalue (reading) or lvalue (writing), respectively. A subscript expression that resolves to a subscript declaration with no setter cannot be modified.

New Expressions

It's not really clear what the behavior of multiple bounds should be.

We should probably allow an initializer. The semantics would be to evaluate that constructor for each element constructed.

    expr-new        ::= 'new' type-identifier expr-new-bounds
  
    expr-new-bounds ::= expr-new-bound
    expr-new-bounds ::= expr-new-bounds expr-new-bound
    expr-new-bound  ::= '[' expr? ']'

Allocates and initializes a new array of objects. The first bound must contain an expression; subsequent bounds, if present, must be constant under the usual rules for array types. The opening square bracket must be on the same line as the type name.

Function Application

    expr-call ::= expr-postfix expr-paren

The leading '(' of the expr-paren must not be the first token on a line. This greatly reduces the likelihood of confusion from semicolon elision, without requiring feedback from the typechecker or more aggressive whitespace sensitivity.

If the expr-postfix refers to a (possibly parenthesized) name of a type, the expr-paren is first coerced to the type named by expr-postfix. If that coercion fails, then the expr-postfix refers to the set of constructors for that type.

Simple examples:

    // Application of an empty tuple to the function f.
    f()
    // Application of 4 to the function g.
    g(4)
    
    // Application of 4 to the function returned by h().
    var h : () -> (Int) -> Int
    ...
    h()(4)

    // Two separate statements
    i()
    (j <+ 2)()

Trailing Closures

It is possible to model trailing closures as simply another way to perform a function call, forgoing the syntactic transformation for expr-call, if functions meant to be used with trailing closures are written as curried functions, e.g.,

    func map(array : T[])(fn : (T) -> U) -> U[] { ... }

There are two problems with this (admittedly simpler) design. First, functions imported from C, C++, and Objective-C won't ever be written in this curried syntax, so we would have to implement redundant entry points to enable this syntax. Second, this design forces the idea of currying front and center for Swift programmers who otherwise wouldn't care, for mostly theoretical reasons.

    expr-trailing-closure ::= expr-postfix expr-closure+

A postfix expression followed by a closure will be invoked with the closure as its argument. This syntax is referred to as a "trailing" closure, because the closure itself is outside the parentheses used to call the expression. Trailing closures are syntactic sugar that eliminates the awkwardness of closing a function call with "})", where the "}" ends the closure and the ")" ends the call.

Trailing closures use a simple syntactic translation, making them purely syntactic sugar. If the postfix expression preceding the trailing closure is an expr-call, the closure is added to the end of the expr-paren of that call. Otherwise, the postfix expression is (implicitly) called with the trailing closure as its only argument.

    dispatch_async(q) {
      print("Whenever you get around to it\n")
    }
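
A self-contained sketch of the second translation rule, where the postfix expression is not already a call. The function twice and the variable count are hypothetical:

```swift
var count = 0

// Calls its closure argument two times.
func twice(body : () -> ()) {
  body()
  body()
}

// No parentheses: 'twice' is called with the trailing closure as its only
// argument, as if written twice({ count += 1 }).
twice { count += 1 }
```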

Statements

Statements can only exist in contexts that are themselves a stmt. Statements have no type; they only induce control flow changes. We choose to use constructs that will be familiar to a broad range of C/Java programmers.

    stmt ::= stmt-semicolon
    stmt ::= stmt-if
    stmt ::= stmt-while
    stmt ::= stmt-for-c-style
    stmt ::= stmt-for-each
    stmt ::= stmt-switch
    stmt ::= stmt-control-transfer
    
    stmt-control-transfer ::= stmt-return
    stmt-control-transfer ::= stmt-break
    stmt-control-transfer ::= stmt-continue
    stmt-control-transfer ::= stmt-fallthrough

Statements provide the control flow constructs of function bodies and top-level code.

    // A function with some statements. 
    func fib(v : Int) -> Int {
      if v < 2 {
        return v
      }
      return fib(v-1)+fib(v-2)
    }

Semicolon Statement

Allowing semicolons as statements causes us to allow semicolons as statement separators as well. This, in turn, means that we don't reject code that has semicolons after each statement, which will be common when people first start getting used to Swift.

    stmt-semicolon ::= ';'

The semicolon statement has no effect.

'return' Statement

    stmt-return ::= 'return' expr
    stmt-return ::= 'return'

The return statement sets the return value of the current func declaration or closure expression and transfers control out of the function. It sets the return value by converting the specified expression result (or '()' if none is specified) to the return type of the 'func'.

The stmt-return grammar is ambiguous: "{ return 4 }" could be parsed as two statements, "return" followed by "4", or as a single statement that returns 4. The ambiguity is resolved toward the first production ('return' expr), because control could never reach an expression that follows a bare return.

'break' Statement

    stmt-break ::= 'break'

The 'break' statement transfers control out of the enclosing 'for' loop or 'while' loop.

'continue' Statement

    stmt-continue ::= 'continue'

The 'continue' statement transfers control back to the start of the enclosing 'for' loop or 'while' loop.
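
A short sketch of both statements in a 'while' loop, with hypothetical variable names. The loop sums the odd numbers below ten:

```swift
var i = 0
var oddSum = 0
while true {
  i += 1
  if i > 9 {
    break       // leave the loop entirely
  }
  if i % 2 == 0 {
    continue    // skip even numbers; go back to the loop condition
  }
  oddSum += i
}
// oddSum is 1 + 3 + 5 + 7 + 9 = 25
```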

'if' Statement

We require braces around the body of an 'if' for two reasons: first, it eliminates the need for parentheses around the condition by making them visually distinctive. Second, it will eliminate all the dithering about whether and when people should, or should not, use braces for if bodies.

    stmt-if      ::= 'if' expr-basic brace-item-list stmt-if-else?
    stmt-if-else ::= 'else' brace-item-list
    stmt-if-else ::= 'else' stmt-if

'if' statements provide a simple control transfer operation that evaluates the condition, invokes the 'getLogicValue' member of the result if the result is not a 'Bool', then determines the direction of the branch based on the result. (Internally, the standard library type 'Bool' has a getLogicValue member that returns a 'Builtin.Int1'.) It is an error if the type of the expression is context-dependent or some non-Bool type.

Some examples include:

    if true {
      /*...*/
    }
    
    if X == 4 {
    } else {
    }

    if X == 4 {
    } else if X == 5 {
    } else {
    }

'while' Statement

    stmt-while ::= 'while' expr-basic brace-item-list

'while' statements provide a simple loop construct which (on each iteration of the loop) evaluates the condition, invokes the 'getLogicValue' member of the result if the result is not a 'Bool', then determines whether to keep looping. (Internally, the standard library type 'Bool' has a getLogicValue member that returns a 'Builtin.Int1'.) It is an error if the type of the expression is context-dependent or some non-Bool type.

Some examples include:

    while true {
      /*...*/
    }
    
    while X == 4 {
      X = 3
    }

'do-while' Statement

    stmt-do-while ::= 'do' brace-item-list 'while' expr

'do-while' statements provide a simple loop construct which (on each iteration of the loop) evaluates the body, then evaluates the condition, invoking the 'getLogicValue' member of the result if the result is not a 'Bool', then determines whether to keep looping. (Internally, the standard library type 'Bool' has a getLogicValue member that returns a 'Builtin.Int1'.) It is an error if the type of the expression is context-dependent or some non-Bool type.

Some examples include:

    do {
      /*...*/
    } while true
    
    do {
      X = 3
    } while X == 4

C-Style 'for' Statement

    stmt-for-c-style    ::= 'for'     stmt-for-c-style-init? ';' expr? ';' expr-basic?     brace-item-list
    stmt-for-c-style    ::= 'for' '(' stmt-for-c-style-init? ';' expr? ';' expr-basic? ')' brace-item-list
    stmt-for-c-style-init ::= decl-var
    stmt-for-c-style-init ::= expr

C-style 'for' statements provide a simple loop construct which evaluates the first part (the initializer) before entering the loop, then evaluates the second part (the condition) as a logic value to determine whether to keep looping. The third part is evaluated at the end of each iteration. All three are evaluated in a new scope that surrounds the for statement.

Some examples include:

    for i = 0; i != 10; ++i {
      /*...*/
    }

    for (i = 0; i != 10; ++i) {
      /*...*/
    }

    for var (i,j) = (0,1); i != 10; ++i {
      /*...*/
    }

'for-each' Statement

    stmt-for-each ::= 'for' pattern 'in' expr-basic brace-item-list

Enumerator-based 'for' statements provide enumeration over the values in a container. The expr is either a container or an enumerator; and respectively, it either conforms to the formal Enumeration or formal Enumerator protocol.

Note that each iteration of the loop declares a distinct variable for each variable in the pattern. For example, in a loop like "for i in 0..10", if i is captured inside the loop, each iteration captures a different "i", so there would be a total of ten versions generated each time the loop runs.

Some examples include:

    for i in 0..100 {
      println(String(i))
    }

'switch' Statement

    stmt-switch ::= 'switch' expr-basic '{' stmt-switch-case* '}'
    stmt-switch-case ::= (case-label+ | default-label) brace-item*

    case-label ::= 'case' pattern (',' pattern)* ('where' expr)? ':'
    default-label ::= 'default' ':'

'switch' statements branch on the value of an expression by pattern matching. The subject expression of the switch is evaluated and tested against the patterns in its case labels in source order. When a pattern is found that matches the value, control is transferred into the matching case block. case labels may declare multiple patterns separated by commas, and multiple case labels may cover a case block. Case labels may optionally specify a guard expression, introduced by the where keyword; if present, control is transferred to the case only if the subject value both matches one of its patterns and the guard expression evaluates to true. Patterns are tested "as if" in source order; if multiple cases can match a value, control is transferred only to the first matching case. The actual execution order of pattern matching operations, and in particular the evaluation order of expression patterns, is unspecified.

A switch may also contain a default block. If present, it receives control if no cases match the subject value. The default block must appear at the end of the switch and must be the only label for its block. (default is equivalent to a final case _ pattern.) Switches are required to be exhaustive; either the contained case patterns must cover every possible value of the subject's type, or else an explicit default block must be specified to handle uncovered cases.

Every case and default block has its own scope. Declarations within a case or default block are only visible within that block. Case patterns may bind variables using the var keyword; those variables are also scoped into the corresponding case block, and may be referenced in the where guard for the case label. However, if a case block matches multiple patterns, none of those patterns may contain variable bindings.

Control does not implicitly 'fall through' from one case block to the next. fallthrough statements may explicitly transfer control among case blocks. break and continue within a switch will break or continue out of an enclosing 'while' or 'for' loop, not out of the 'switch' itself.

    func classifyPoint(point:(Int, Int)) {
      switch point {
      case (0, 0):
        println("origin")

      case (_, 0):
        println("on the x axis")

      case (0, _):
        println("on the y axis")

      case (var x, var y) where x == y:
        println("on the y = x diagonal")

      case (var x, var y) where -x == y:
        println("on the y = -x diagonal")

      case (var x, var y):
        println("length \(sqrt(x*x + y*y))")
      }
    }

'fallthrough' Statement

    stmt-fallthrough ::= 'fallthrough'

fallthrough transfers control from a case block of a switch statement to the next case or default block within the switch. It may only appear inside a switch. fallthrough cannot be used in the final block of a switch. It also cannot transfer control into a case block whose pattern contains var bindings.
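
A minimal sketch of this behavior, written in present-day Swift syntax; the function describe and its labels are illustrative only.

```swift
// Collects the labels of every case block that control passes through.
func describe(_ n: Int) -> [String] {
    var notes: [String] = []
    switch n {
    case 0:
        notes.append("zero")
        fallthrough              // explicitly continue into the next block
    case 1:
        notes.append("small")    // no fallthrough here: control leaves the switch
    default:
        notes.append("other")
    }
    return notes
}
```

describe(0) yields ["zero", "small"], describe(1) yields ["small"], and describe(5) yields ["other"]. Note that fallthrough enters the next block without testing that block's pattern.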

Protocols

Objects

Generics

Generic Parameters

    generic-params ::= '<' generic-param (',' generic-param)* where-clause? '>'

    generic-param ::= identifier
    generic-param ::= identifier ':' type-identifier
    generic-param ::= identifier ':' type-composition

    where-clause ::= 'where' requirement (',' requirement)*

    requirement ::= conformance-requirement
                ::= same-type-requirement

    conformance-requirement ::= type-identifier ':' type-identifier
    conformance-requirement ::= type-identifier ':' type-composition

    same-type-requirement ::= type-identifier '==' type-identifier

A generic function or type is parameterized by a given set of generic parameters. Each generic parameter has a name as well as some set of requirements that specify the capabilities that any corresponding generic argument must have. For example, the generic parameter T : Printable requires that any generic argument substituted for the generic parameter T conform to the protocol Printable. Similarly, a generic parameter U : SomeClass requires that any generic argument substituted for the generic parameter U inherit from the class SomeClass.

Additional requirements on generic parameters and associated types of generic parameters can be introduced via the "where" clause, which can include additional protocol-conformance requirements (e.g., the generic parameter list <T where T : Printable>, which is equivalent to <T : Printable>), as well as same-type requirements that require two types to be identical (e.g., <T : Collection, U : Collection where T.Element == U.Element>).
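
As a sketch of a same-type requirement in use, the function below accepts two unrelated collection types whose elements must match. It is written in present-day Swift, where the where clause follows the signature rather than sitting inside the angle brackets as in the grammar above; sameElements is a hypothetical name, and the extra Equatable requirement is an assumption needed for the comparison.

```swift
// Compares two collections of possibly different types element by element.
// The same-type requirement T.Element == U.Element is what makes the
// comparison well-typed.
func sameElements<T: Collection, U: Collection>(_ a: T, _ b: U) -> Bool
    where T.Element == U.Element, T.Element: Equatable
{
    return a.elementsEqual(b)
}

let arrayVersusRange = sameElements([1, 2, 3], 1...3)      // Array<Int> against ClosedRange<Int>
let arrayVersusArray = sameElements([1, 2, 3], [1, 2, 4])
```

The first call compares equal and the second does not. A call such as sameElements([1, 2], ["a"]) would be rejected at compile time because the element types differ.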

Generic Arguments

    generic-args ::= '<' generic-arg (',' generic-arg)* '>'

    generic-arg ::= type

Generic argument lists specify the generic arguments to be provided to a generic type or function, which replace the generic parameters of that type or function to produce a specialized version of that type or function. For example, given a generic class:

    class Dictionary<Key : Hashable, Value> { /* ... */ }

The type Dictionary<String, Int> replaces the generic parameter Key with String and the generic parameter Value with Int. Each generic argument must satisfy all of the requirements of its corresponding generic parameter (e.g., String must conform to the Hashable protocol), and all generic arguments, when taken together, must satisfy the additional requirements specified in the where clause.
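
The substitution can be sketched with a small generic type that has the same parameter list. The type Table and its members are hypothetical and written in present-day Swift syntax; it is not the standard library's Dictionary.

```swift
// A hypothetical generic type parameterized like the Dictionary
// declaration above: Key must be Hashable, Value is unconstrained.
struct Table<Key: Hashable, Value> {
    var storage: [Key: Value] = [:]

    mutating func set(_ value: Value, for key: Key) {
        storage[key] = value
    }

    func get(_ key: Key) -> Value? {
        return storage[key]
    }
}

// Table<String, Int> substitutes String for Key and Int for Value.
var ages = Table<String, Int>()
ages.set(30, for: "Ada")
let found = ages.get("Ada")
let missing = ages.get("Bob")
```

Here found holds 30 and missing is nil. Supplying a Key argument that does not conform to Hashable would fail to type check, since the argument must satisfy the requirement of its parameter.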

Name Binding

Name binding in Swift is performed in different ways depending on what language entity is being considered:

Value names (for var and func declarations) and type names (for typealias, enum, and struct declarations) follow the same scope and name lookup rules as described below.

tuple element names

scope within enum decls

Context sensitive member references are resolved during type checking.

Scopes for Type and Value Names

Name Lookup Unqualified Value Names

"dot" Name Lookup Value Names

Name Lookup for Type and Value Names

Basic algo:

Search the current scope tree for a local name. Local names cannot be forward referenced.
Bind to names defined in the current component, including the current translation unit. TODO: is this a good thing? We could require explicit imports if we wanted to.
Bind to identifiers that are imported with an import directive. Imports are searched in order of introduction (top-down). The location of an import directive in a file (e.g. between func decls) does not affect name lookup, but the order of imports w.r.t. each other does.
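
The three steps above can be sketched as a toy model. Everything here is hypothetical: declarations are represented by plain strings, and the model ignores the forward-reference rule and shadowing. It is written in present-day Swift syntax.

```swift
// Toy model of the lookup order; not the compiler's data structures.
struct LookupContext {
    var localScopes: [[String: String]]   // scope tree, innermost scope first
    var component: [String: String]       // names defined in the current component
    var imports: [[String: String]]       // imported modules, in order of introduction
}

func lookup(_ name: String, in context: LookupContext) -> String? {
    // 1. Search the scope tree for a local name, innermost scope outward.
    for scope in context.localScopes {
        if let decl = scope[name] { return decl }
    }
    // 2. Bind to names defined in the current component.
    if let decl = context.component[name] { return decl }
    // 3. Search imports top-down; the first import that has the name wins.
    for module in context.imports {
        if let decl = module[name] { return decl }
    }
    return nil
}

let context = LookupContext(
    localScopes: [["x": "local x"]],
    component: ["x": "component x", "f": "component f"],
    imports: [["g": "first import g"], ["g": "second import g", "h": "second import h"]]
)
```

With this context, "x" resolves to the local declaration, "f" to the component, "g" to the first import, and "h" to the second import.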

Shadowing: Given a ValueDecl D1 in the current module and a ValueDecl D2 in an imported module, where D1 and D2 have the same name and (if relevant) are members of the same type: 1. If D1 is a TypeDecl, D2 is shadowed. 2. If neither D1 nor D2 is a TypeDecl, and they have the same type, D2 is shadowed. If a declaration in an imported module is shadowed by any declaration in the current module, it is not found by unqualified global lookup or lookup for members of a type.

Name Lookup for Dot Expressions

Dot expressions bind to the names of tuple elements.

Type Checking

Binary expressions, function application, etc.

Standard Conversions

Anonymous Argument Resolution

Context Sensitive Type Resolution

Standard Library

It would be really great to have literate Swift code someday; that way this section could be generated directly from the code. It would also be valuable for Swift library developers to be able to depend on such a facility being available and standardized.

This section describes some of the standard Swift code as it is being built up. Since Swift is designed to give power to library developers, much of what is normally considered the "language" is actually just implemented in the library.

All of this code is published by the 'swift' module, which is implicitly imported into each translation unit, unless some sort of pragma in the code (attribute on an import?) is used to change or disable this behavior.

Builtin Module

In the initial Swift implementation, a module named Builtin is imported into every file. Its declarations can only be found by dot syntax. It provides access to a small number of primitive representation types and operations defined over them that map directly to LLVM IR.

The existence and details of this module are a private implementation detail used by our implementation of the standard library. Swift code outside the standard library should not be aware of this module, and an independent implementation of the Swift standard library should be free to do without the Builtin module if it so desires.

For reference below, the description of the standard library uses the "Builtin." namespace to refer to this module, but independent implementations could use another implementation if they so desire.

Simple Types

Void

    // Void is just a type alias for the empty tuple.
    typealias Void = ()

Having a single standardized integer type that can be used by default everywhere is important. One advantage Swift has is that by the time it is in widespread use, 64-bit architectures will be pervasive, and the LLVM optimizer should grow to be good at shrinking 64-bit integers to 32-bit in many cases for those 32-bit architectures that persist.

Int, Int8, Int16, Int32, Int64

    // Fixed size types are simple structs of the right size.
    struct Int8  { value : Builtin.Int8 }
    struct Int16 { value : Builtin.Int16 }
    struct Int32 { value : Builtin.Int32 }
    struct Int64 { value : Builtin.Int64 }
    struct Int128 { value : Builtin.Int128 }

    // Int is just an alias for the 64-bit integer type.
    typealias Int = Int64

Float, Double

    struct Float  { value : Builtin.FPIEEE32 }
    struct Double { value : Builtin.FPIEEE64 }

Bool, true, false

    // Bool is a simple enum.
    enum Bool {
      true, false
    }
    
    // Allow true and false to be used unqualified.
    var true = Bool.true
    var false = Bool.false

Arithmetic and Logical Operations

This is all eagerly awaiting the day when we have generics and overloading. For now, Int is the only arithmetic type :)

Arithmetic Operators

    // Simple binary operators, following the same precedence as C.
    func [infix_left=200] * (lhs: Int, rhs: Int) -> Int
    func [infix_left=200] / (lhs: Int, rhs: Int) -> Int
    func [infix_left=200] % (lhs: Int, rhs: Int) -> Int
    func [infix_left=190] + (lhs: Int, rhs: Int) -> Int
    func [infix_left=190] - (lhs: Int, rhs: Int) -> Int
    // In C, <<, >> is 180.

Relational and Equality Operators

    func [infix_left=170] <  (lhs: Int, rhs: Int) -> Bool
    func [infix_left=170] >  (lhs: Int, rhs: Int) -> Bool
    func [infix_left=170] <= (lhs: Int, rhs: Int) -> Bool
    func [infix_left=170] >= (lhs: Int, rhs: Int) -> Bool
    func [infix_left=160] == (lhs: Int, rhs: Int) -> Bool
    func [infix_left=160] != (lhs: Int, rhs: Int) -> Bool
    // In C, bitwise logical operators are 130,140,150.

Short Circuiting Logical Operators

    func [infix_left=120] && (lhs: Bool, rhs: ()->Bool) -> Bool
    func [infix_left=110] || (lhs: Bool, rhs: ()->Bool) -> Bool
    // In C, 100 is ?:
    // In C, 90 is =, *=, += etc.
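
Taking the right-hand side as a function of type ()->Bool is what allows these operators to short-circuit: the operand is only evaluated if the function is called. A minimal sketch in present-day Swift follows; shortCircuitAnd and demonstrateShortCircuit are hypothetical helpers, not the library's implementation.

```swift
// Mirrors the shape of the && declaration above: the right-hand side
// is a closure, so it runs only when the left-hand side is true.
func shortCircuitAnd(_ lhs: Bool, _ rhs: () -> Bool) -> Bool {
    return lhs ? rhs() : false
}

// Counts how often the right-hand side is actually evaluated.
func demonstrateShortCircuit() -> (skipped: Bool, evaluated: Bool, rhsCalls: Int) {
    var rhsCalls = 0
    let noisyTrue: () -> Bool = {
        rhsCalls += 1
        return true
    }
    let skipped = shortCircuitAnd(false, noisyTrue)    // rhs is never called
    let evaluated = shortCircuitAnd(true, noisyTrue)   // rhs is called once
    return (skipped, evaluated, rhsCalls)
}
```

demonstrateShortCircuit() returns (false, true, 1): across the two calls, the right-hand side ran only once.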

Chris Lattner