Swift Language Reference

Introduction

In addition to the main spec, there are lots of open ended questions, justification, and ideas of what best practices should be. That random discussion is placed in boxes to the right side of the main text (like this one) to clarify what is normative and what is discussion.

This is the language reference manual for the Swift language, which is highly volatile and constantly under development. It is my (Chris') intention to keep this up to date as the prototype evolves.

The grammar and structure of the language are defined in BNF form in yellow boxes. Examples are shown in gray boxes, and assume that the standard library is in use (unless otherwise specified).

Basic Goals

A non-goal of the Swift project in general is to become some amazing research project. We really want to focus on delivering a real product, and having the design and spec co-evolve.

Support building great frameworks and applications, with a specific focus on permitting rich and powerful APIs.
Get the defaults right: this reduces the barrier to entry and increases the odds that the right thing happens.
Through our support for building great APIs, we aim to provide an expressive and productive language that is fun to program in.
Support low-level system programming. We should want to write compilers, operating system kernels, and media codecs in Swift. This means that being able to obtain high performance is really quite important.
Provide really great tools.
Where possible, steal great ideas instead of innovating new things that will work out in unpredictable ways. It turns out that there are a lot of good ideas already out there.
Lots of other stuff too.

Basic Approach

The basic approach in designing and implementing the Swift prototype was to start at the very bottom of the stack (simple expressions and the trivial bits of the type system) and incrementally build things up one brick at a time. There is a big focus on making things as simple as possible and having a clean internal core. Where it makes sense some sugar (e.g. "func" and "struct") is added on top to make the core more expressive for common situations.

One major aspect that dovetails with expressivity, learnability, and focus on API development is that much of the language is implemented in a standard library (inspired by the Haskell Standard Prelude). By pushing much of the boring parts of the language out of the compiler into the library, we end up with a smaller core language and we force the language that is left to be highly expressive and extensible, which we hope will allow for great libraries to be built on top of it.

More later.

Phases of Translation

The careful factoring of the grammar allows forward referencing of types and values and allows parsing without resolution of import declarations.

Swift has a strict separation between its phases of translation, and the grammar is carefully factored so that there is never ambiguity about whether an identifier refers to a type or a value. The phases of translation are:

Lexing: A translation unit is broken into tokens according to a nearly regular grammar (the exception is that /**/ comments can be nested).
Parsing and AST Building: The tokens are parsed according to the grammar set out below. The grammar is context free and does not require any "type feedback" from the lexer or later stages. During parsing, references to local variables and other declarations that are not at translation unit (and eventually namespace) scope are bound.
Name Binding: At this phase, references to non-local types and values are bound, and import directives are both validated and searched. Name binding can cause recursive compilation of modules that are referenced but not yet built.
Type Checking: During this phase all types are resolved within value definitions, function application and binary expressions are found and formed, and overloaded functions are resolved.
Code Generation
Linking

TODO: "import swift" implicitly added as the last import in translation unit.

Lexical Structure

Not all characters are "taken" in the language, because it is still growing. As reasons arise to assign characters to the identifier or punctuation bucket, we will do so as Swift evolves.

The lexical structure of a Swift file is very simple: the files are tokenized according to the following productions and categories. As is usual with most languages, tokenization uses the maximal munch rule and whitespace separates tokens. This means that "a b" and "ab" lex into different token streams and are therefore different in the grammar.

Whitespace and Comments

TODO: /**/ comments will be supported when I get around to it.

    whitespace ::= ' '
    whitespace ::= '\n'
    whitespace ::= '\r'
    whitespace ::= '\t'
    whitespace ::= '\0'
    comment    ::= //.*[\n\r]

Space, newline, carriage return, tab, and the nul byte are all considered whitespace and are discarded.

Comments follow the BCPL style, starting with a "//" and running to the end of the line. Comments are ignored as whitespace.

Reserved Punctuation Tokens

The difference between reserved punctuation and identifiers is that you can't "overload an operator" with one of these names.

Note that -> is used for function types "() -> int", not pointer dereferencing.

    punctuation ::= '('
    punctuation ::= ')'
    punctuation ::= '{'
    punctuation ::= '}'
    punctuation ::= '['
    punctuation ::= ']'
    punctuation ::= '.'
    punctuation ::= ','
    punctuation ::= ';'
    punctuation ::= ':'
    punctuation ::= '::'
    punctuation ::= '='
    punctuation ::= '->'

These are all reserved punctuation that are lexed into tokens. Most other punctuation is matched as identifiers.

Reserved Keywords

The number of keywords is reduced by pushing most functionality into the library. This allows us to add new stuff to the library in the future without worrying about conflicting with the user's namespace.

Idea: Should _foo be "private" and "foo" be public, how about "Foo" vs "foo"? In the code you immediately know whether something is exported, and don't need to sprinkle attributes everywhere, OTOH, it makes it harder to remember (like not having a naming convention).

Enumerating the integer types is expedient, but silly and unneeded. There isn't a good reason to expose an i13 LLVM IR type, but i128 would be reasonable.

    keyword ::= '__builtin_int1_type'
    keyword ::= '__builtin_int8_type'
    keyword ::= '__builtin_int16_type'
    keyword ::= '__builtin_int32_type'
    keyword ::= '__builtin_int64_type'
    keyword ::= 'import'
    keyword ::= 'oneof'
    keyword ::= 'struct'
    keyword ::= 'var'
    keyword ::= 'func'
    keyword ::= 'meth'
    keyword ::= 'typealias'

    keyword ::= 'if'
    keyword ::= 'else'

These are the builtin keywords. Swift intentionally tries to reduce the number of keywords where possible.

FIXME: __ should be considered the compiler/language's reserved namespace; any use of an unknown identifier here should be an error.

Numeric Constant

    numeric_constant ::= [0-9]+

Numeric constant tokens represent simple integer values.

TODO: Obviously need a floating point constant when we have a fp type.

Identifier Tokens

    identifier ::= [a-zA-Z_][a-zA-Z_$0-9]*
    identifier ::= [/=-+*%<>!&|^]+

There are two different regular expressions for identifiers, one for normal identifiers and one for "punctuation identifiers". This ensures that something like "foo+bar" gets lexed into three identifiers, not one. Aside from the regex that controls lexing behavior, there is no other difference between these two forms of identifiers.

Implementation Identifier Token

    dollarident ::= $[0-9a-zA-Z_$]*

Tokens that start with a $ are a separate class of identifier: fixed-purpose names that are defined by the implementation.
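
The lexical rules above can be sketched as a small tokenizer. This is an illustrative Python model, not the prototype's lexer: the names `tokenize`, `TOKEN_RE`, and `RESERVED_OPERATORS` are hypothetical, and it assumes that '=' and '->' count as reserved punctuation only when they make up an entire run of operator characters (the text does not spell out how they are separated from punctuation identifiers such as "==").

```python
import re

# Reserved keywords, copied from the keyword productions above.
KEYWORDS = {
    "__builtin_int1_type", "__builtin_int8_type", "__builtin_int16_type",
    "__builtin_int32_type", "__builtin_int64_type",
    "import", "oneof", "struct", "var", "func", "meth", "typealias",
    "if", "else",
}

# Assumption: '=' and '->' are reserved only when they are a whole operator run.
RESERVED_OPERATORS = {"=", "->"}

# One alternative per token class; order matters because "//" must win over "/".
TOKEN_RE = re.compile(r"""
    (?P<ws>[ \n\r\t\0]+)
  | (?P<comment>//[^\n\r]*)
  | (?P<number>[0-9]+)
  | (?P<ident>[a-zA-Z_][a-zA-Z_$0-9]*)
  | (?P<dollar>\$[0-9a-zA-Z_$]*)
  | (?P<oper>[/=\-+*%<>!&|^]+)
  | (?P<punct>::|[(){}\[\].,;:])
""", re.VERBOSE)


def tokenize(source):
    """Return a list of (kind, text) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("ws", "comment"):
            continue  # discarded, as described above
        if kind == "ident" and text in KEYWORDS:
            kind = "keyword"
        elif kind == "oper":
            # Greedy matching of the operator run gives maximal munch.
            kind = "punct" if text in RESERVED_OPERATORS else "ident"
        tokens.append((kind, text))
    return tokens
```

With this sketch, `tokenize("foo+bar")` yields three identifier tokens, while `tokenize("a b")` and `tokenize("ab")` yield different token streams.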

Declarations

    translation-unit ::= top-level-item*
    top-level-item   ::= ';'
    top-level-item   ::= decl-import
    top-level-item   ::= decl-var
    top-level-item   ::= decl-func
    top-level-item   ::= decl-meth
    top-level-item   ::= decl-typealias
    top-level-item   ::= decl-oneof
    top-level-item   ::= decl-struct
    top-level-item   ::= expr

A source file in Swift is parsed as a list of top level declarations. Extraneous semicolons are allowed at top level, as are a number of other declarations.

TODO: Need to define the module system more.

Expressions at the top level of a file are executed at program startup time in bottom-up order in the module dependence graph. This becomes a static constructor for a library and is the "main" for an executable. This allows 'print "hello world"' to work as a complete app.

import

    decl-import ::= 'import' attribute-list? identifier ('.' identifier)*

'import' declarations allow named values and types to be accessed with local names, even when they are defined in other modules and namespaces. See the section on name binding for more information on how these work.

'import' directives only impact a single translation unit: imports in one swift file do not affect name lookup in another file. import directives can only occur at the top level of a file, not within a function or namespace.

If a single identifier is specified for the import declaration, then the entire module is imported into the current scope. If a scope (such as a namespace) is named, then all entities in the namespace are imported. If a specific type or variable is named (e.g. "import swift.int") then only the one type and/or value is imported.

    // Import all of the top level symbols and types in a package.
    import swift

    // Import all of the symbols within a namespace.
    import swift.io

    // Import a single variable, function, type, etc.
    import swift.io.bufferedstream

var

Eventually need to add support for "var x,y : int".

    decl-var ::= 'var' attribute-list? var-name value-specifier
    
    value-specifier ::= ':' type
    value-specifier ::= ':' type '=' expr
    value-specifier ::= '=' expr

'var' declarations form the backbone of value declarations in Swift and are the core semantic model for these values. The func declaration is just syntactic sugar for a var declaration.

Syntactically, var declarations come in three forms from the value-specifier production. In the first form, a type is specified and the value is default initialized. In the second form, the value's type is as specified and the initializer is converted to that type if required. In the third form, the type is elided but a value is specified; the declaration gets the specified value and has the same type as its initializer.

Var declarations can optionally have a list of attributes applied to them.

    var-name ::= identifier
    var-name ::= '(' ')'
    var-name ::= '(' var-name (',' var-name)* ')'

The name given to the var can have structure that allows fields of the returned value to be directly named and accessed in later code. The structure of the name is required to match up with the type being matched. A single identifier is always valid to capture the entire value (potentially as an aggregate). Parenthesized values match up against tuple types and against single element oneof types.

Here are some examples of var declarations:

    // Simple examples.
    var a = 4
    var b : int
    var c : int = 42
    
    // Declaring a function like value with 'var', using an attribute.
    // The name here is "==", the type is a function that takes a tuple
    // and returns an int.
    var [infix=120] == : (lhs : int, rhs : int) -> int
    
    // This decodes the tuple return value into independently named parts
    // and both 'val' and 'err' are in scope after this line.
    var (val, err) = foo();
    
    // This binds elements of a struct into local variables.
    struct point { x : int, y : int }
    var (a, b) = foo()  // when foo returns a 'point'.
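
The way a structured var-name matches up against a value can be modelled in a few lines. The following Python sketch is only an illustration under simplifying assumptions: Python tuples stand in for Swift tuple values, the helper name `bind` is hypothetical, and matching against single element oneof types is not modelled.

```python
def bind(pattern, value):
    """Bind a var-name pattern to a value and return the resulting names.

    pattern: a str (a single identifier) or a tuple of nested patterns.
    value:   any Python value; tuples stand in for Swift tuple values.
    """
    if isinstance(pattern, str):
        # A single identifier always captures the entire value.
        return {pattern: value}
    if not isinstance(value, tuple) or len(pattern) != len(value):
        raise TypeError(f"pattern {pattern!r} does not match value {value!r}")
    bindings = {}
    for sub_pattern, sub_value in zip(pattern, value):
        bindings.update(bind(sub_pattern, sub_value))
    return bindings
```

For example, `bind(("val", "err"), (7, 0))` introduces both `val` and `err`, whereas `bind("a", (1, 2))` captures the whole aggregate under one name.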

func

    decl-func ::= 'func' attribute-list? identifier type '=' expr
    decl-func ::= 'func' attribute-list? identifier type brace-expr
    decl-func ::= 'func' attribute-list? identifier type

A 'func' declaration is just shorthand syntax for the (extremely) common case of declaring a function value. The argument list and optional return value are specified by the type production of the function, and the body is either not specified or is an arbitrary expression. If the specified type is not a function type, then the return value is implicitly inferred to be "()". All of the argument and return value names are injected into the scope of the function body.

TODO: Func should be an immutable name binding, it should implicitly add an attribute immutable when it exists.

TODO: Incoming arguments should be readonly, result should be implicitly writeonly when we have these attributes.

Here are some examples of func definitions:

    // Implicitly returns (), aka void
    func a() {}

    // Same as 'a'
    func b() -> void {}

    // Really simple function
    func c(arg : int) -> int = arg+4

    // Simple operator.
    func [infix=120] + (lhs: int, rhs: int) -> int;

    // Curried function with multiple return values:
    func d(a : int) -> (b : int) -> (res1 : int, res2 : int);

meth

'meth' syntax is shorthand for defining a function that can be used with dot syntax on the receiver type.

    decl-meth ::= 'meth' attribute-list? type-identifier '::' identifier type brace-expr?

A 'meth' declaration is shorthand syntax for declaring a 'func' with a compound function type. The argument list and optional return value are specified by the type production of the function, and the body is either not specified or is an arbitrary expression. If the specified type is not a function type, then the return value is implicitly inferred to be "()". All of the argument and return value names are injected into the scope of the function body. The first argument is of the type specified before the '::'.

'meth' declarations may not be declared infix.

Here are some examples of meth definitions:

    // Implicitly returns (), aka void.
    // This declares a function 'a' of type "t1 -> () -> ()".
    meth t1::a() {}

    // Same as 'a'
    meth t1::b() -> void {}

    // A simple method on a trivial type.
    struct bankaccount { amount : int }
    meth bankaccount::deposit(arg : int) {
      amount = amount + arg
    }
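
The "compound function type" can be pictured as a curried function whose first argument is the receiver. Below is a rough Python analogy; the names `BankAccount` and `deposit` are stand-ins, and modelling the receiver as a mutable object is an assumption, since the text does not say whether the receiver is passed by value or by reference.

```python
from dataclasses import dataclass


@dataclass
class BankAccount:
    amount: int


def deposit(receiver):
    """Models a method of type "bankaccount -> (arg : int) -> ()"."""
    def apply(arg):
        # Assumption: the method body updates the receiver in place.
        receiver.amount = receiver.amount + arg
    return apply


account = BankAccount(amount=10)
deposit(account)(5)   # roughly what "account.deposit(5)" would mean
```

Applying `deposit` to the receiver alone yields a function that still expects the remaining argument list, which mirrors the "t1 -> () -> ()" type mentioned in the examples.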

typealias

    decl-typealias ::= 'typealias' identifier ':' type

'typealias' makes a named alias of a type, like a typedef in C. From that point on, the alias may be used in all situations where the specified type could be. It is named "typealias" because it really is an alias, not a "new" type.

Here are some examples of type aliases:

    // location is an alias for a tuple of ints.
    typealias location : (x : int, y : int)
      
    // pair_fn is a function that takes two ints and returns a tuple.
    typealias pair_fn : int -> int -> (first : int, second : int)

'typealias' is the core semantic model for all named types. For example, a oneof decl is just syntactic sugar for a oneof type and a 'typealias'.

oneof

In actual practice, we expect oneof to be commonly used for "enums" and "struct" below to be used for data declarations. The use of "oneof" for discriminated unions will be much less common (but is still very important) than its use for "enums".

    decl-oneof ::= 'oneof' attribute-list? identifier oneof-body

A oneof declaration is a convenient way to declare a type and name it at the same time, as it is simple syntactic sugar for a oneof type and a typealias declaration. Please see oneof types for more information about their capabilities. Here are two exactly equivalent declarations for intuition:

    // Declare discriminated union with typealias + oneof type.
    typealias SomeInts : oneof {
      None,
      One int,
      Two (:int, :int)
    }
    // Declare discriminated union with oneof decl (preferred).
    oneof SomeInts {
      None,
      One int,
      Two (:int, :int)
    }

Here are some more examples of oneof declarations:

    // Declares three "enums".
    oneof DataSearchFlags {
      None, Backward, Anchored
    }
    
    func f1(searchpolicy : DataSearchFlags);  // DataSearchFlags is a valid type name
    func test1() {
      f1(DataSearchFlags::None);  // Use of constructor with qualified identifier
      f1(:None);                  // Use of constructor with context sensitive type inference 
    
      // "None" has no type argument, so the constructor's type is "DataSearchFlags". 
      var a : DataSearchFlags = :None;
    }
    
    oneof SomeInts {
      None,             // Doesn't conflict with previous "None".
      One int,          // Argument type is simple int.
      Two (:int, :int)  // Argument type is simple tuple.
    }
    
    func f2(a : SomeInts);
    
    func test2() {
      // Constructors for oneof elements can be used in the obvious way.
      f2(:None);
      f2(:One 4);
      f2(:Two(1, 2));
    
      // Constructor for None has type "SomeInts".
      var a : SomeInts = SomeInts::None;
    
      // Constructor for One has type "int -> SomeInts".
      var b : int -> SomeInts = SomeInts::One;
    
      // Constructor for Two has type "(:int,:int) -> SomeInts".
      var c : (:int,:int) -> SomeInts = SomeInts::Two;
    }
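
To make the constructor types concrete, here is a small Python model of a discriminated union. It is only a sketch: `OneofValue` and `make_oneof` are hypothetical names, and the runtime representation shown (a tag plus an optional payload) is an assumption rather than the prototype's actual layout.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class OneofValue:
    type_name: str       # which oneof type the value belongs to
    element: str         # which element is currently held
    payload: Any = None  # the associated value, if the element has a type


def _payload_constructor(type_name, element):
    # An element with a type T gets a constructor of type "T -> oneof".
    def construct(payload):
        return OneofValue(type_name, element, payload)
    return construct


def make_oneof(type_name, elements):
    """elements maps each element name to True if it carries a payload."""
    constructors = {}
    for element, has_payload in elements.items():
        if has_payload:
            constructors[element] = _payload_constructor(type_name, element)
        else:
            # An element without a type is already a value of the oneof type.
            constructors[element] = OneofValue(type_name, element)
    return constructors


SomeInts = make_oneof("SomeInts", {"None": False, "One": True, "Two": True})
a = SomeInts["None"]          # already a SomeInts value
b = SomeInts["One"]           # a function from int to SomeInts
c = SomeInts["Two"]((1, 2))   # applying the constructor to a tuple
```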

struct

    decl-struct ::= 'struct' attribute-list? identifier '{' type-tuple-body? '}'

A struct declaration is syntactic sugar for a oneof declaration of a single element, and a 'typealias'. It requires that a tuple type be specified (whose body is specified in curly braces), and declares a oneof with a single constructor value of the same name as the struct.

Note that unlike oneof, 'struct' does inject its single constructor value into the containing scope. This means that you don't need to use qualified lookup to get access to the constructor for a struct.

These two declarations are equivalent (other than their names):

    struct S1 { a : int, b : int }
    
    oneof S2 {
      S2 (a : int, b : int)
    }
    var S2 = S2::S2;   // Constructor injected into global scope.

Here are some examples of structs:

    struct Point { x : int, y : int }
    struct Size { width : int, height : int }
    struct Rect { origin : Point, size : Size }
    
    func test4() {
      var a : Point;
      var b = Point::Point(1, 2);     // Silly but fine.
      var c = Point(.y = 1, .x = 2);  // Using injected name.
    
      var x1 = Rect(a, Size(42, 123));
      var x2 = Rect(.size = Size(.width = 42, .height=123), .origin = a);
    
      var x1_area = x1.size.width*x1.size.height;
    }

Attribute Lists

    attribute-list ::= '[' ']'
    attribute-list ::= '[' attribute (',' attribute)* ']'
    
    attribute      ::= attribute-infix

An attribute list is a (possibly empty) comma separated list of attributes.

Infix Attribute

    attribute-infix ::= 'infix' '=' numeric_constant

The only attribute supported so far is the 'infix' attribute. FIXME: Describe requirements on function/var it is applied to, must be binary etc.

TODO: Add support for a bunch more attributes.

Types

    type ::= type-simple
    type ::= type-function
    type ::= type-array
    
    type-simple ::= type-builtin
    type-simple ::= type-identifier
    type-simple ::= type-tuple
    type-simple ::= type-oneof

Swift has a small collection of core datatypes that are built into the compiler. Most datatypes that the user is exposed to are defined by the standard library or declared as user-defined types.

FIXME: Why is array a type instead of type-simple?

Builtin Primitive Types

    type-builtin ::= '__builtin_int1_type'
    type-builtin ::= '__builtin_int8_type'
    type-builtin ::= '__builtin_int16_type'
    type-builtin ::= '__builtin_int32_type'
    type-builtin ::= '__builtin_int64_type'

'__builtin_int*_type's are the names of the integer types of the corresponding size. These types specify a storage size, but they have no semantics assigned to them (e.g. signed vs unsigned arithmetic) because there are no operations on them built into the language.

TODO: Support builtin float and double.

Named Types

    type-identifier ::= identifier

Named types may be used simply by using their name. Named types are introduced by typealias declarations or through syntactic sugar that expands to one.

    typealias location : (x : int, y : int)
    var x : location      // use of a named type.

Tuple Types

    type-tuple ::= '(' type-tuple-body? ')'
    type-tuple-body ::= identifier? value-specifier (',' identifier? value-specifier)*

Syntactically, tuple types are simply a (possibly empty) list of elements.

Tuples are the primary form of data aggregation in Swift, and are used as the building block of function argument lists, multiple return values, struct and oneof bodies, etc. Because tuples are widely accessible and available everywhere in the language, aggregate data access and transformation is uniform and powerful.

Each element of a tuple contains an optional name followed by a type and/or a default value expression, whose type conversion rules work like those in a var declaration. The name affects swizzling of elements in the tuple when tuple conversions are performed.

    // Variable definitions.
    var a : ()
    var b : (:int, :int)
    var c : (x : (), y : int)
    var d : (a : int, b = 4)       // Value is initialized to (0,4)
    var e : (a : int, b = 4) = (1) // Value is initialized to (1,4)

    // Tuple type inferred from an initializer:
    var m = ()                     // Type = ()
    var n = (.x = 1, .y = 2)       // Type = (x : int, y : int)
    var o = (1, 2, 3)              // Type = (:int, :int, :int)

    // Function argument and result is a tuple type.
    func foo(x : int, y : int) -> (val : int, err : int);

    // oneof and struct declarations with tuple values.
    struct S { a : int, b : int }
    oneof Vertex {
      Point2 (x : int, y : int),
      Point3 (x : int, y : int, z : int),
      Point4 (w : int, x : int, y : int, z : int)
    }
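
The interplay of element names, default values, and swizzling can be sketched as follows. This Python model is a guess at one plausible rule, not the specified conversion algorithm: it assumes named initializer elements are matched by name first, remaining positional elements fill the other fields in order, and any field left over takes its default. The name `convert_to_tuple` is hypothetical, and the default of 0 for a plain int field is taken from the "(0,4)" example above.

```python
def convert_to_tuple(fields, positional=(), named=None):
    """Convert an initializer to a tuple type.

    fields:     list of (name_or_None, default_value) describing the tuple type.
    positional: unnamed initializer elements, in order.
    named:      dict of ".name = value" initializer elements.
    """
    remaining_positional = list(positional)
    remaining_named = dict(named or {})
    result = []
    for name, default in fields:
        if name is not None and name in remaining_named:
            result.append(remaining_named.pop(name))     # swizzled by name
        elif remaining_positional:
            result.append(remaining_positional.pop(0))   # filled in order
        else:
            result.append(default)                       # default value
    if remaining_positional or remaining_named:
        raise TypeError("initializer has elements that match no tuple field")
    return tuple(result)


# (a : int, b = 4), with int assumed to default-initialize to 0.
fields_ab = [("a", 0), ("b", 4)]
d = convert_to_tuple(fields_ab)             # like "var d : (a : int, b = 4)"
e = convert_to_tuple(fields_ab, (1,))       # like "... = (1)"
n = convert_to_tuple([("x", 0), ("y", 0)], named={"y": 1, "x": 2})
```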

Function Types

    type-function ::= type-simple '->' type

Function types have a single input and single result type, separated by an arrow. Because each of the types is allowed to be a tuple, we trivially support multiple arguments and multiple results. "Function" types are more properly known as "closure" types, because they can embody any context captured when the function value was formed.

Because of the grammar structure, a nested function type like "a -> b -> c" is parsed as "a -> (b -> c)". This means that if you declare a value of this type, you can pass it one argument to get a function that "takes b and returns c", or you can pass two arguments to "get a c". For example:

    // A simple function that takes a tuple and returns int:
    var a : (a : int, b : int) -> int

    // A simple function that returns multiple values:
    var b : (a : int, b : int) -> (val: int, err: int)

    // Declare a function that returns a function:
    var x : int -> int -> int;
    
    // y has type int -> int
    var y = x 1;

    // z1 and z2 both have type int, and both have the same value (assuming
    // the function had no side effects).
    var z1 = x 1 2;
    var z2 = y 2;
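
The right-nesting of arrows falls directly out of the production "type-function ::= type-simple '->' type". The Python sketch below parses a pre-tokenized type under the simplifying assumption that every type-simple is a single identifier, and then mimics the curried calls from the example. The function names and the body of `x` (addition) are hypothetical.

```python
def parse_type(tokens, pos=0):
    """Parse type ::= type-simple ('->' type)?, returning (tree, next_pos)."""
    left = tokens[pos]  # simplification: a type-simple is one identifier
    pos += 1
    if pos < len(tokens) and tokens[pos] == "->":
        # The right-hand side is a full 'type', so arrows nest to the right.
        right, pos = parse_type(tokens, pos + 1)
        return ("->", left, right), pos
    return left, pos


tree, _ = parse_type(["a", "->", "b", "->", "c"])


def x(first):
    """Stands in for a value of type int -> int -> int."""
    def takes_second(second):
        return first + second
    return takes_second


y = x(1)       # like "var y = x 1", a function from int to int
z1 = x(1)(2)   # like "var z1 = x 1 2"
z2 = y(2)      # like "var z2 = y 2"
```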

Array Types

Array types are currently a hack, and only partially implemented in the compiler. Arrays don't make sense to fully define until we have generics, because array syntax should just be sugar for a standard library type. "int[4]" should just be sugar for array<int, 4> or whatever.

    type-array ::= type '[' ']'
    type-array ::= type '[' expr ']'

Array types include a base type and an optional size. Array types indicate a linear sequence of elements stored consecutively in memory. Array elements may be efficiently indexed in constant time. All array indexes are bounds checked and out of bound accesses are diagnosed with either a compile time or runtime failure (TODO: runtime failure mode not specified).

While they look syntactically very similar, an array type with a size has very different semantics than an array without. In the former case, the type indicates a declaration of actual storage space. In the latter case, the type indicates a reference to storage space allocated elsewhere of runtime-specified size.

FIXME: We should separate out "Arrays" from "Slices". An array should always require a size and be by-value; a slice should be by-ref and never have a (statically specified) size.

For an array with a size, the size must be more than zero (otherwise no indices would be valid). For now, the array size must be a literal integer. TODO: Define a notion like C's integer-constant-expression for how constant folding works.

FIXME: int[][] not valid because the element type isn't sized. We need some constraint to reject this, or do we?

Some example array types:

    // A simple array declaration:
    var a : int[4];
    
    // A reference to another array:
    var b : int[] = a;
        
    // Declare a two dimensional array:
    var c : int[4][4];
    
    // Declare a reference to another array, two dimensional:
    var d : int[4][];

    // Declare an array of function pointers:
    var array_fn_ptrs : (: int -> int)[42];
    var g = array_fn_ptrs[12](4);

    // Without parens, this is a function that returns a fixed size array:
    var fn_returning_array : int -> int[42];
    var h : int[42] = fn_returning_array(4);
    
    // You can even have arrays of tuples and other things, these work right
    // through composition:
    var array_of_tuples : (a : int, b : int)[42];
    var tuple_of_arrays : (a : int[42], b : int[42]);
    
    array_of_tuples[12].a = array_of_tuples[13].b;
    tuple_of_arrays.a[12] = tuple_of_arrays.b[13];
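
The difference between a sized array (actual storage) and an unsized one (a reference to storage allocated elsewhere) can be illustrated with a small Python model. The class names `FixedArray` and `ArrayRef` are hypothetical, and since the runtime failure mode is not specified above, raising `IndexError` is purely an assumption of this sketch.

```python
class FixedArray:
    """Models "T[N]": owns its storage and has a fixed, positive size."""

    def __init__(self, size, fill=0):
        if size <= 0:
            raise ValueError("array size must be more than zero")
        self._items = [fill] * size

    def _check(self, index):
        # Every access is bounds checked; negative indexes are rejected too.
        if not 0 <= index < len(self._items):
            raise IndexError(f"index {index} out of bounds")

    def __getitem__(self, index):
        self._check(index)
        return self._items[index]

    def __setitem__(self, index, value):
        self._check(index)
        self._items[index] = value

    def __len__(self):
        return len(self._items)


class ArrayRef:
    """Models "T[]": refers to storage that lives in another array."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, index):
        return self._target[index]

    def __setitem__(self, index, value):
        self._target[index] = value

    def __len__(self):
        return len(self._target)


a = FixedArray(4)   # like "var a : int[4]"
b = ArrayRef(a)     # like "var b : int[] = a"
b[1] = 7            # writes through to the storage owned by 'a'
```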

oneof Types

In actual practice, we expect explicit oneof types to be rare, used mainly as a replacement for "bool" arguments. It will be much more common to use oneof decls for "enums" and "struct" decls for data declarations. The use of "oneof" for discriminated unions will be much less common (but is still very important) than its use for "enums". In any case, the decl form of this is important syntactic sugar for the type version of oneof.

'oneof' types are known as algebraic data types by the broader programming language community. The name 'oneof' comes from CLU.

    type-oneof         ::= 'oneof' attribute-list? oneof-body
    oneof-body         ::= '{' oneof-element (',' oneof-element)* '}'
    
    oneof-element      ::= identifier
    oneof-element      ::= identifier ':' type

A oneof type consists of a comma-separated list of elements, which are each either an identifier or an identifier with a type. The runtime representation of a value of oneof type only has one of the specified oneof elements at a time: a oneof value is a simple discriminated union.

A oneof type declares two things: 1) the type itself as an anonymous type, and 2) each of the oneof elements declares a constructor value which creates a value of the oneof type with the specified element kind. The constructor values are defined in a nested scope within the oneof descriptor type, so they must be accessed with either a qualified identifier (if the type itself is named) or through delayed identifier resolution with context sensitive type inference.

If the oneof element has no type specified with it, then the type of the constructor is the oneof type. If a oneof element has a type "T" associated with it, then the type of the constructor is a function that takes "T" and returns the oneof type.

A default initialized value of oneof type is initialized to the first element in the list, with the default value for that element's type.

Here are some examples of oneof types:

    // Declares three "enums" using a oneof type.  Better and easier with a
    // oneof declaration though.
    typealias DataSearchFlags : oneof {
      None, Backward, Anchored
    }
    
    func f1(searchpolicy : DataSearchFlags);
    func test1() {
      f1(DataSearchFlags::None);  // Use of constructor with qualified identifier
      f1(:None);                  // Use of constructor with context sensitive type inference 
    
      // "None" has no type argument, so the constructor's type is "DataSearchFlags". 
      var a : DataSearchFlags = :None;
    }
    
    // A more typical use case, anonymous argument for flags, death to
    // content-free bools in function arguments. 
    func search(val : int, direction : oneof { Up, Down });
    
    func test2() {
      search(42, :Up);
      search(17, :Down);
    }

TODO: Should attributes be allowed on oneof elements?

TODO: Eventually, with generics we'll have equality and inequality operators. Oneof decls should implicitly define these for their types.

TODO: Need pattern matching and element extraction.

Expressions

Support for user defined operators and separation of parsing from type analysis and name binding leads to a somewhat unusual system where expression parsing just doesn't do very much. This does lead to interesting phases of translation though, because we do "parsing" of expressions late (in type checking) instead of right after lexing.

Semicolons in C are generally just clutter. Swift generally defines away their use through the type system (and careful grammar structure).

Brace expressions don't fit naturally into the grammar anymore, because we don't want them in some places (e.g. the condition of an 'if'). It would be nice if they were not allowed in general as an expression anymore, but that would detract from their use as closures etc.

    expr                   ::= expr-primary+
    expr-non-brace         ::= expr-primary-non-brace+
    
    expr-primary           ::= expr-non-brace | expr-brace
    expr-primary-non-brace ::= expr-literal
    expr-primary-non-brace ::= expr-identifier
    expr-primary-non-brace ::= expr-paren
    expr-primary-non-brace ::= expr-delayed-identifier
    expr-primary-non-brace ::= expr-dot
    expr-primary-non-brace ::= expr-subscript

    expr-primary-non-brace ::= expr-if

At the top level of the expression grammar, expressions are a sequence of "primary" expressions.

    // A silly, but valid, example of a single 'expr':
    4 4 (4+5) 4 4
    
    // A more reasonable example:
    foo()
    x = 12
    bar(49+1)
    baz()
    
    
    // An example that is parsed as one list of three primary expressions,
    // but is resolved in type checking to a var decl followed by a separate
    // apply expression.
    var x = 4 foo()
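
One way to picture the late "parsing" of a flat sequence of primaries is a precedence-climbing pass driven by the 'infix' attribute values. The Python sketch below is a guess at the general shape, not the prototype's algorithm: the function name `fold_infix` is hypothetical, operators are assumed to be left associative, the precedence given to "*" is invented for the example (only "infix=120" appears in this document), and juxtaposition (function application) is deliberately not modelled.

```python
def fold_infix(sequence, infix):
    """Fold a flat list of operands and operators into a nested tree.

    sequence: list of strings, e.g. ["a", "+", "b", "*", "c"].
    infix:    dict mapping operator names to their infix precedence.
    """
    pos = 0

    def parse_operand():
        nonlocal pos
        if pos >= len(sequence):
            raise ValueError("expected an operand")
        operand = sequence[pos]
        pos += 1
        return operand

    def parse_expr(min_precedence):
        nonlocal pos
        lhs = parse_operand()
        while (pos < len(sequence)
               and sequence[pos] in infix
               and infix[sequence[pos]] >= min_precedence):
            operator = sequence[pos]
            pos += 1
            # Requiring a strictly higher precedence on the right gives
            # left associativity (an assumption of this sketch).
            rhs = parse_expr(infix[operator] + 1)
            lhs = (operator, lhs, rhs)
        return lhs

    tree = parse_expr(0)
    if pos != len(sequence):
        raise ValueError("juxtaposed operands (application) are not modelled here")
    return tree


# Hypothetical precedences: only the value 120 comes from the text.
precedences = {"+": 120, "*": 200}
tree = fold_infix(["a", "+", "b", "*", "c"], precedences)
```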

Simple Literals

Consider treating literal numbers like Go's arbitrary precision integers that are resolved to a type later. It would be really great for x+4.0 to work if x is a float, not a double. Likewise when we have multiple different width integers floating around. This is somewhat easy (just give numbers dependent type) but we would need some magic to make "var x = 4" work, since we want it to get a default type, not be ambiguous.

    expr-literal ::= numeric_constant

The only literals currently supported are integer constants. These are given 'integer_literal_type' type, which must be defined by the library. If an integer literal is used and integer_literal_type is not defined, then the code is malformed.

Identifiers

    expr-identifier ::= identifier

A raw identifier refers to a value found via unqualified value lookup, and has the type of the declaration returned by name lookup and overload resolution. Value declarations are installed with var and the syntactic sugar forms like func declarations.

    expr-identifier ::= dollarident

A use of an identifier whose name fits the "$[0-9]+" regular expression is a reference to an anonymous closure argument that is formed when the containing expression is coerced into a closure context. All other dollar identifiers are invalid.

    expr-identifier ::= type-identifier '::' identifier

Qualified identifiers look up a constructor member of a oneof type (and eventually into namespaces). The first identifier is looked up as a type name.

Delayed Identifier Resolution

The ":bar" syntax was picked because it is "half" of a fully qualified "foo::bar" reference.

    expr-delayed-identifier ::= ':' identifier

A delayed identifier expression refers to a constructor of a oneof type, without knowing which type it is referring to. The expression is resolved to a constructor of a concrete type through context sensitive type inference. Delayed identifier resolution is actually more powerful than qualified identifier resolution, because it can find constructors of anonymous types.

    // A function with an argument that has an anonymous type. 
    func search(val : int, direction : oneof { Up, Down });
    
    func f() {
      search(42, :Up);
      search(17, :Down);
    }
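
How the context drives resolution can be modeled roughly as follows. This is a sketch in Python (a modeling language only), assuming a oneof type can be represented by the set of its constructor names; `resolve_delayed_identifier` and `direction` are hypothetical names.

```python
def resolve_delayed_identifier(name, expected_oneof):
    """Resolve ':name' against the oneof type expected by the context.

    expected_oneof is the set of constructor names of the contextual type,
    or None when the context supplies no oneof type.
    """
    if expected_oneof is None:
        raise TypeError(f"cannot infer a type for ':{name}' without context")
    if name not in expected_oneof:
        raise TypeError(f"':{name}' is not a constructor of the expected type")
    return name

# The anonymous oneof from the 'search' example above.
direction = frozenset({"Up", "Down"})
```

Because the type is supplied by the context rather than by name, the same code path handles anonymous types, which a qualified "foo::bar" reference cannot name.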

Dot Expressions

Instead of using "foo.$1" to get to fields of a tuple (which is ugly), we could use "foo[1]". This would then make people want to use variable indexes into tuples though, which is a bad idea.

    expr-dot ::= expr-primary '.' dollarident

If the base expression has tuple type, then the magic identifier "$[0-9]+" accesses the specified anonymous member of the tuple. Otherwise, this form is invalid.

    expr-dot ::= expr-primary '.' identifier

If the base expression has tuple type and if the identifier is the name of a field in the tuple, then this is a reference to the specified field.

FIXME: Remove by injecting field members as overloaded functions. If the base expression is a oneof record with one element, which has an associated type of tuple (which is true of all struct declarations that have tuple body type), then the field access directly accesses the underlying tuple.

Otherwise, dot name lookup is performed, and this expression is treated as function application. "x.foo" is treated as a synonym for "foo x" (other than the differing name lookup rules).

TODO: If dot name lookup fails, do ADL into the namespace of the base expression's type.

No other field accesses are currently allowed.
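
The order in which these rules are tried can be sketched as below. This is a Python model only, with several assumptions: a tuple type is a list of (name, type) pairs, "$N" is a zero-based positional index, dot name lookup is reduced to membership in a set of function names, and the single-element oneof case from the FIXME is left out. `resolve_dot` is a hypothetical name.

```python
import re

def resolve_dot(base_fields, member, functions):
    """Classify 'base.member' following the order described above.

    base_fields: list of (name_or_None, type) pairs if the base has tuple
                 type, or None if it does not.
    functions:   names visible to dot name lookup (a simplification).
    """
    dollar = re.fullmatch(r"\$([0-9]+)", member)
    if dollar is not None:
        # "$N" is only valid on a tuple base (zero-based is an assumption).
        index = int(dollar.group(1))
        if base_fields is None or index >= len(base_fields):
            raise TypeError(f"invalid tuple element reference .{member}")
        return ("tuple-element", index)
    if base_fields is not None:
        for index, (name, _type) in enumerate(base_fields):
            if name == member:
                return ("tuple-field", index)
    if member in functions:
        # "x.foo" is treated as the application "foo x".
        return ("apply", member)
    raise TypeError(f"no member named {member!r}")
```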

Subscript Expressions

As with arrays, this is just hacked in for now. Revisit this after we have generics.

    expr-subscript ::= expr-primary '[' expr-single ']'

FIXME: Array subscript should just be an overloaded operator like any other. a[x] should just call a function subscript(a, x). This will allow natural support for arrays and dictionaries, whose implementation of subscript comes from the stdlib.

FIXME2: Two problems with this: 1) we need support for lvalue function calls e.g. f(a, x) = 4, otherwise we can't use an array subscript as an lvalue. 2) we want overloading in the long term to get non-array types.

Parenthesized Expressions

    expr-paren      ::= '(' ')'
    expr-paren      ::= '(' expr-paren-element (',' expr-paren-element)* ')'
    expr-paren-element ::= ('.' identifier '=')? expr

Parenthesized expressions contain a (possibly empty) list of optionally named values. Parentheses in an expression context denote one of two things: 1) grouping parentheses, or 2) a tuple literal.

Grouping parentheses occur when there is exactly one value in the list and that value does not have a name. In this case, the type of the parenthesis expression is the type of the single value.

All other cases are tuple literals. The type of the expression is a tuple type whose elements and order match that of the initializer. If there are any named elements, those elements become names for the tuple type. A parenthesis expression with no value has a type of the empty tuple.

Some examples:

    // Simple grouping parenthesis.
    var a = (4);             // Type = int
    var b = (4+a);           // Type = int
    
    // Tuple literals.
    var c = ()               // Type = ()
    var d = (4, 5)           // Type = (:int,:int)
    var e = (c, d)           // Type = ((), (:int, :int))
    
    var f = (.x = 4, .y = 5) // Type = (x : int, y : int)
    var g = (4, .y = 5, 6)   // Type = (:int, y : int, :int)
    
    // Named arguments to functions.
    func foo(a : int, b : int);
    foo(.b = 4, .a = 1)
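
The grouping-versus-tuple rule can also be stated as a small function. This is a sketch in Python (a modeling language only) that represents types as strings in the notation of the comments above; `paren_expr_type` is a hypothetical name.

```python
def paren_expr_type(elements):
    """Compute the type of a parenthesized expression.

    elements: list of (name_or_None, type_string) pairs, in source order.
    """
    if len(elements) == 1 and elements[0][0] is None:
        # Grouping parentheses: the type of the single unnamed value.
        return elements[0][1]
    # Everything else is a tuple literal, including the empty list.
    parts = [f"{name} : {ty}" if name is not None else f":{ty}"
             for name, ty in elements]
    return "(" + ", ".join(parts) + ")"
```

Note that a single named element, such as "(.x = 4)", falls through to the tuple case, since grouping requires the value to be unnamed.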

Brace Literals

    expr-brace      ::= '{' expr-brace-item* '}'
    expr-brace-item ::= ';'
    expr-brace-item ::= expr
    expr-brace-item ::= decl-var
    expr-brace-item ::= decl-func
    expr-brace-item ::= decl-typealias
    expr-brace-item ::= decl-oneof
    expr-brace-item ::= decl-struct

FIXME: Need to kill off "the result of {} is the result of the last expr if it doesn't have a ; at the end". Add return statement instead.

Function Application

Type conversions/casts are just normal function calls: int(4.0) just runs the (overloaded) 'int' function on its argument. Separation of the value and type namespace makes this simple.

    expr-apply ::= expr expr

Juxtaposition of two expressions, when the former is of function type, is application of a function to its argument, and the argument is converted to the expected argument type. For example, "foo()" is a juxtaposition of the foo expression with an empty tuple.

This expression is not formed by the parser; it is formed by the type checker when it is processing consecutive sequences of primary expressions and finds something of function type.

A simple example:

    // Application of an empty tuple to the function f.
    f ()
    // Application of 4 to the function g; both forms are equivalent.
    g 4
    g (4)
    
    // Application of 4 to the function returned by h().
    var h : int -> int -> int;
    ...
    h () 4
    
    // Application of "{}" to the function returned by the result of
    // applying "(x)" to "if".
    if (x) {}

Infix Binary Expressions

    expr-infix ::= expr identifier expr

Infix binary expressions are not formed during parsing; they are formed during type checking, when a sequence of primary expressions is being type checked and a value with the infix attribute is found. The order of evaluation of the various subexpressions is defined by the precedence of the infix attribute.

TODO: Should this use the expr-identifier production to allow qualified identifiers? This would allow "foo swift::min bar". Oneof members cannot be declared infix, but future module/namespace scopes seem worth accessing, though ADL is probably enough.

A simple example is:

    4 + 5 * 123 min 42
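
How precedence groups such a sequence can be illustrated with a small precedence-climbing sketch. It is written in Python as a model only. The precedences for the arithmetic operators come from the Standard Library section; the value for 'min' and the choice of left associativity are assumptions made for this example, and `fold_infix` is a hypothetical name.

```python
# Precedences taken from the operator declarations in the Standard Library
# section; the value for 'min' is an assumption made for this example only.
PRECEDENCE = {"*": 200, "/": 200, "%": 200, "+": 190, "-": 190, "min": 150}

def fold_infix(seq):
    """Fold a flat [operand, op, operand, ...] list into a nested tree.

    Higher precedence binds tighter; equal precedence is grouped to the
    left (left associativity is an assumption of this sketch).
    """
    pos = 0

    def parse(min_prec):
        nonlocal pos
        lhs = seq[pos]
        pos += 1
        while pos < len(seq) and PRECEDENCE[seq[pos]] >= min_prec:
            op = seq[pos]
            pos += 1
            rhs = parse(PRECEDENCE[op] + 1)
            lhs = (op, lhs, rhs)
        return lhs

    return parse(0)
```

With these assumed precedences, the sequence above folds to `("min", ("+", 4, ("*", 5, 123)), 42)`.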

'if' Expression

    expr-if      ::= 'if' expr-non-brace brace-expr expr-if-else?
    expr-if-else ::= 'else' brace-expr
    expr-if-else ::= 'else' expr-if

'if' expressions provide a simple control transfer operation that evaluates the condition, then invokes the 'convertToLogicValue' function on the result, and determines the direction of the branch based on the result of convertToLogicValue. This structure allows any type that can be converted to a logic value to be used in an 'if' expression. The result type of an 'if' expression is always '()', and the brace-expr operands are always forced to '()' type.

If the condition has type 'T' and there is no convertToLogicValue function that accepts a 'T', or if the resultant function does not produce a value of '__builtin_int1_type' type, then the code is malformed.

Some examples include:

    if true { /*...*/ }
    
    if X == 4 { } else { }

    if X == 4 {
    } else if X == 5 {
    } else {
    }
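
The evaluation steps for 'if' can be modeled as follows. This is a sketch in Python (a modeling language only) with hypothetical stand-ins: Python's `bool` plays the role of '__builtin_int1_type', the `Bool` class plays the role of the library's 'bool', and a dictionary keyed by argument type plays the role of overload resolution.

```python
# Hypothetical stand-ins: Python's bool models __builtin_int1_type, and the
# Bool class models the library's 'bool' oneof.
class Bool:
    def __init__(self, is_true):
        self.is_true = is_true

# One entry per convertToLogicValue overload, keyed by argument type.
CONVERT_TO_LOGIC_VALUE = {
    bool: lambda v: v,
    Bool: lambda v: v.is_true,
}

def eval_if(condition, then_body, else_body=None):
    """Evaluate an 'if' expression; the result is always () (None here)."""
    convert = CONVERT_TO_LOGIC_VALUE.get(type(condition))
    if convert is None:
        raise TypeError("no convertToLogicValue accepts this condition type")
    logic_value = convert(condition)
    if not isinstance(logic_value, bool):
        raise TypeError("convertToLogicValue must produce a logic value")
    if logic_value:
        then_body()
    elif else_body is not None:
        else_body()
    return None
```

A condition of any type with a registered conversion is accepted; anything else is rejected before either branch runs.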

Protocols

Objects

Generics

Name Binding

Name binding in swift is performed in different ways depending on what language entity is being considered:

Value names (for var and func declarations) and type names (for typealias, oneof, and struct declarations) follow the same scope and name lookup rules as described below.

tuple element names

scope within oneof decls

Context sensitive member references are resolved during type checking.

Scopes for Type and Value Names

Name Lookup Unqualified Value Names

"dot" Name Lookup Value Names

Name Lookup for Type and Value Names

Basic algo:

Search the current scope tree for a local name. Local names cannot be forward referenced.
Bind to names defined in the current component, including the current translation unit. TODO: is this a good thing? We could require explicit imports if we wanted to.
Bind to identifiers that are imported with an import directive. Imports are searched in order of introduction (top-down). The location of an import directive in a file (e.g. between func decls) does not affect name lookup, but the order of imports w.r.t. each other does.
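
The three steps above can be sketched as a single lookup function. This is a Python model only, assuming that scopes, the component, and each import can be represented as dictionaries from names to declarations; `lookup` and its parameters are hypothetical names.

```python
def lookup(name, local_scopes, component_names, imports):
    """Resolve a name following the three steps listed above.

    local_scopes:    innermost-first list of dicts holding the names already
                     declared at the point of use (no forward references).
    component_names: dict of names defined in the current component.
    imports:         list of dicts, one per import, in source order.
    """
    # Step 1: local names in the current scope tree.
    for scope in local_scopes:
        if name in scope:
            return scope[name]
    # Step 2: names defined in the current component.
    if name in component_names:
        return component_names[name]
    # Step 3: imports, searched in order of introduction.
    for module in imports:
        if name in module:
            return module[name]
    raise NameError(f"use of unresolved name {name!r}")
```

When two imports define the same name, the earlier import wins, which is why the relative order of imports matters even though their position in the file does not.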

Name Lookup for Dot Expressions

Dot expressions bind to the names of tuple elements.

Type Checking

Binary expressions, function application, etc.

Standard Conversions

Anonymous Argument Resolution

Context Sensitive Type Resolution

Standard Library

It would be really great to have literate Swift code someday; that way this section could be generated directly from the code. It would also be powerful for Swift library developers to be able to depend on such a facility being available and standardized.

This describes some of the standard swift code as it is being built up. Since Swift is designed to give power to the library developers, much of what is normally considered the "language" is actually just implemented in the library.

All of this code is published by the 'swift' module, which is implicitly imported into each translation unit, unless some sort of pragma in the code (attribute on an import?) is used to change or disable this behavior.

Simple Types

void

    // void is just a type alias for the empty tuple.
    typealias void : ()

int

    // int is just a wrapper around the 64-bit integer type.
    struct int { value : __builtin_int64_type }

bool, true, false

    // bool is a simple discriminated union.
    oneof bool {
      true, false
    }
    
    // Allow true and false to be used unqualified.
    var true = bool::true
    var false = bool::false

Logic Value Evaluation For Control Flow

logic_value

    // logic_value is the standard type to be used for values that can be
    // used in control flow conditionals.
    typealias logic_value : __builtin_int1_type

    func convertToLogicValue(v : logic_value) -> logic_value
    func convertToLogicValue(v : bool) -> logic_value

Arithmetic and Logical Operations

This is all eagerly awaiting the day when we have generics and overloading. For now, int is the only arithmetic type :)

Arithmetic Operators

    // Simple binary operators, following the same precedence as C.
    func [infix=200] * (lhs: int, rhs: int) -> int
    func [infix=200] / (lhs: int, rhs: int) -> int
    func [infix=200] % (lhs: int, rhs: int) -> int
    func [infix=190] + (lhs: int, rhs: int) -> int
    func [infix=190] - (lhs: int, rhs: int) -> int
    // In C, <<, >> is 180.

Relational and Equality Operators

    var [infix=170] <  : (lhs : int, rhs : int) -> bool
    var [infix=170] >  : (lhs : int, rhs : int) -> bool
    var [infix=170] <= : (lhs : int, rhs : int) -> bool
    var [infix=170] >= : (lhs : int, rhs : int) -> bool
    var [infix=160] == : (lhs : int, rhs : int) -> bool
    var [infix=160] != : (lhs : int, rhs : int) -> bool
    // In C, bitwise logical operators are 130,140,150.

Short Circuiting Logical Operators

    func [infix=120] && (lhs: bool, rhs: ()->bool) -> bool
    func [infix=110] || (lhs: bool, rhs: ()->bool) -> bool
    // In C, 100 is ?:
    // In C, 90 is =, *=, += etc.
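
The right-hand operand of these operators has function type '()->bool', which suggests that it is wrapped in a closure and evaluated only when the left-hand side does not already decide the result. A minimal Python model of that behavior, assuming this reading is correct (`logical_and` and `logical_or` are hypothetical names):

```python
def logical_and(lhs, rhs):
    """Model of '&&': rhs is a zero-argument callable, run only if needed."""
    return rhs() if lhs else False

def logical_or(lhs, rhs):
    """Model of '||': rhs is a zero-argument callable, run only if needed."""
    return True if lhs else rhs()
```

Passing the right-hand side as a callable lets short circuiting be expressed as an ordinary library function, with no special support in the language core.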

Chris Lattner