This is the language reference manual for the Swift language, which is highly volatile and constantly under development. As the prototype evolves, this document should be kept up to date with what is actually implemented.
The grammar and structure of the language are defined in BNF form in yellow boxes. Examples are shown in gray boxes, and assume that the standard library is in use (unless otherwise specified).
In no particular order, and not explained well:
A smaller wishlist goal is to support embedded sub-languages in Swift, so that we don't get the OpenCL-is-like-C-but-very-different-in-many-details problem.
The basic approach in designing and implementing the Swift prototype was to start at the very bottom of the stack (simple expressions and the trivial bits of the type system) and incrementally build things up one brick at a time. There is a big focus on making things as simple as possible and having a clean internal core. Where it makes sense, sugar is added on top to make the core more expressive for common situations.
One major aspect that dovetails with expressivity, learnability, and focus on API development is that much of the language is implemented in a standard library (inspired in part by the Haskell Standard Prelude). This means that things like 'int' and 'void' are not part of the language itself, but are instead part of the standard library.
Swift has a strict separation between its phases of translation, and the compiler follows a conceptually simple design. The phases of translation are:
FIXME: "import swift" implicitly added as the last import in translation unit.
The lexical structure of a Swift file is very simple: the files are tokenized according to the following productions and categories. As is usual with most languages, tokenization uses the maximal munch rule and whitespace separates tokens. This means that "a b" and "ab" lex into different token streams and are therefore different in the grammar.
whitespace ::= ' '
whitespace ::= '\n'
whitespace ::= '\r'
whitespace ::= '\t'
whitespace ::= '\0'
comment ::= //.*[\n\r]
comment ::= /* .... */
Space, newline, carriage return, tab, and the nul byte are all considered whitespace and are discarded, with one exception: a '(' which does not follow a non-whitespace character is a different kind of token (called spaced) from one which does (called unspaced). A '(' at the beginning of a file is spaced.
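The spaced/unspaced distinction can be illustrated with a small Python sketch. It is not the prototype's lexer: the name classify_lparens is hypothetical, and the sketch looks only at the preceding character, so it ignores comments and parentheses inside string literals.

```python
# The five whitespace characters listed in the grammar above.
WHITESPACE = {" ", "\n", "\r", "\t", "\0"}

def classify_lparens(source):
    """Return 'spaced' or 'unspaced' for each '(' in source, in order."""
    kinds = []
    for i, ch in enumerate(source):
        if ch != "(":
            continue
        # Spaced: at the start of the file, or directly after whitespace.
        if i == 0 or source[i - 1] in WHITESPACE:
            kinds.append("spaced")
        else:
            kinds.append("unspaced")
    return kinds
```

Under this sketch, "f (x)" contains one spaced '(' and "f(x)" contains one unspaced '('.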
Comments may follow the BCPL style, starting with a "//" and running to the end of the line, or may be recursively nested /**/ style comments. Comments are ignored and treated as whitespace.
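One way to picture the recursive nesting of /**/ comments is a depth counter. The following Python sketch is an assumption about how such a scanner could work, not a description of the actual implementation; strip_comments is a hypothetical name, and string literals that happen to contain comment markers are not handled.

```python
def strip_comments(source):
    """Replace // line comments and nested /* */ comments with one space each."""
    out = []
    i, n = 0, len(source)
    while i < n:
        if source.startswith("//", i):
            # Line comment: skip up to (not including) the line terminator,
            # which is itself whitespace.
            while i < n and source[i] not in "\n\r":
                i += 1
            out.append(" ")
        elif source.startswith("/*", i):
            depth = 0
            while i < n:
                if source.startswith("/*", i):
                    depth += 1          # entering a (possibly nested) comment
                    i += 2
                elif source.startswith("*/", i):
                    depth -= 1          # leaving one nesting level
                    i += 2
                    if depth == 0:
                        break
                else:
                    i += 1
            if depth != 0:
                raise ValueError("unterminated block comment")
            out.append(" ")
        else:
            out.append(source[i])
            i += 1
    return "".join(out)
```

Because each comment is replaced by whitespace, "a/**/b" yields two tokens rather than one.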
lparen-spaced ::= '(' // preceded by space
lparen-unspaced ::= '(' // not preceded by space
lparen-any ::= lparen-spaced
lparen-any ::= lparen-unspaced
punctuation ::= lparen-spaced
punctuation ::= lparen-unspaced
punctuation ::= ')'
punctuation ::= '{'
punctuation ::= '}'
punctuation ::= '['
punctuation ::= ']'
punctuation ::= '.'
punctuation ::= ','
punctuation ::= ';'
punctuation ::= ':'
punctuation ::= '='
punctuation ::= '->'
These are all reserved punctuation that are lexed into tokens. Most other punctuation is matched as identifiers.
keyword ::= 'else'
keyword ::= 'extension'
keyword ::= 'if'
keyword ::= 'import'
keyword ::= 'func'
keyword ::= 'oneof'
keyword ::= 'static'
keyword ::= 'protocol'
keyword ::= 'return'
keyword ::= 'struct'
keyword ::= 'typealias'
keyword ::= 'var'
keyword ::= 'while'
These are the builtin keywords.
integer_literal ::= [0-9]+
integer_literal ::= 0x[0-9a-fA-F]+
integer_literal ::= 0o[0-7]+
integer_literal ::= 0b[01]+
integer_literal tokens represent simple integer values of unspecified precision.
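The four integer_literal productions can be checked mechanically. The Python sketch below is illustrative only; the names INTEGER_FORMS and integer_literal_value are hypothetical. Python integers are arbitrary precision, which matches the "unspecified precision" wording above.

```python
import re

# One (pattern, base, prefix length) entry per integer_literal production.
INTEGER_FORMS = [
    (re.compile(r"0x[0-9a-fA-F]+"), 16, 2),
    (re.compile(r"0o[0-7]+"), 8, 2),
    (re.compile(r"0b[01]+"), 2, 2),
    (re.compile(r"[0-9]+"), 10, 0),
]

def integer_literal_value(text):
    """Return the value of an integer_literal, or None if text is not one."""
    for pattern, base, skip in INTEGER_FORMS:
        if pattern.fullmatch(text):
            return int(text[skip:], base)
    return None
```

Note that, following the productions as written, only lowercase prefixes are accepted: "0X2A" is not an integer_literal.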
floating_literal ::= [0-9]+\.[0-9]+
floating_literal ::= [0-9]+(\.[0-9]*)?[eE][+-][0-9]+
floating_literal ::= \.[0-9]+([eE][+-][0-9]+)?
floating_literal tokens represent floating point values of unspecified precision.
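To see which spellings the three floating_literal productions accept, they can be transcribed directly as regular expressions. This Python sketch is a literal reading of the productions as written above; FLOATING_FORMS and is_floating_literal are hypothetical names.

```python
import re

# One pattern per floating_literal production, transcribed as written.
FLOATING_FORMS = [
    re.compile(r"[0-9]+\.[0-9]+"),
    re.compile(r"[0-9]+(\.[0-9]*)?[eE][+-][0-9]+"),
    re.compile(r"\.[0-9]+([eE][+-][0-9]+)?"),
]

def is_floating_literal(text):
    """True if the whole of text matches one of the productions."""
    return any(p.fullmatch(text) for p in FLOATING_FORMS)
```

Read this way, the exponent sign is mandatory ("1e+5" matches, "1e5" does not), and "1." is not a floating_literal because the first production requires digits after the point.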
string_literal ::= ["]([^"\\\n\r]|string_escape)*["]
string_escape ::= [\][\] | [\]t | [\]n | [\]r | [\]" | [\]'
string_escape ::= [\]x hex hex
string_escape ::= [\]u hex hex hex hex
string_escape ::= [\]U hex hex hex hex hex hex hex hex
hex ::= [0-9a-fA-F]
string_literal tokens represent a string, and are surrounded by double quotes. String literals cannot span multiple lines.
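The escape productions can be sketched as a small decoder in Python. This is an illustration under assumptions, not the prototype's behavior: the name decode_string_literal is hypothetical, and the sketch assumes every hex escape (including \x) denotes a single code point, which the text above does not specify.

```python
SIMPLE_ESCAPES = {"\\": "\\", "t": "\t", "n": "\n", "r": "\r", '"': '"', "'": "'"}
HEX_ESCAPE_DIGITS = {"x": 2, "u": 4, "U": 8}   # digits required per escape kind
HEX_DIGITS = set("0123456789abcdefABCDEF")

def decode_string_literal(literal):
    """Decode a string_literal token (including its quotes) into its value."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("string literal must be enclosed in double quotes")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch in '"\n\r':
            raise ValueError("unescaped quote or line break in string literal")
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        kind = body[i + 1]
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind in HEX_ESCAPE_DIGITS:
            count = HEX_ESCAPE_DIGITS[kind]
            digits = body[i + 2 : i + 2 + count]
            if len(digits) != count or not set(digits) <= HEX_DIGITS:
                raise ValueError("bad hex escape")
            # Assumption: the hex value is interpreted as a code point.
            out.append(chr(int(digits, 16)))
            i += 2 + count
        else:
            raise ValueError("unknown escape")
    return "".join(out)
```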
identifier ::= [a-zA-Z_][a-zA-Z_$0-9]*
operator ::= [/=-+*%<>!&|^~]+
Note: excludes '=', see [1]
excludes '->', see [2]
any-identifier ::= identifier | operator
dollarident ::= $[0-9a-zA-Z_$]*
Tokens that start with a $ are a separate class of identifier: fixed-purpose names that are defined by the implementation.
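The token classes above (keywords, identifiers, operators, and dollar identifiers) can be told apart with a few regular expressions. The Python sketch below is a hypothetical illustration; classify_word is not a name from the implementation, and the operator character class is escaped so that '-' is taken literally rather than as a range.

```python
import re

KEYWORDS = {"else", "extension", "if", "import", "func", "oneof", "static",
            "protocol", "return", "struct", "typealias", "var", "while"}
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z_$0-9]*")
OPERATOR = re.compile(r"[/=\-+*%<>!&|^~]+")
DOLLARIDENT = re.compile(r"\$[0-9a-zA-Z_$]*")

def classify_word(text):
    """Classify one already-delimited token; return None if it fits no class."""
    if text in KEYWORDS:
        return "keyword"
    if text in ("=", "->"):
        # Reserved punctuation, explicitly excluded from the operator class.
        return "punctuation"
    if IDENTIFIER.fullmatch(text):
        return "identifier"
    if OPERATOR.fullmatch(text):
        return "operator"
    if DOLLARIDENT.fullmatch(text):
        return "dollarident"
    return None
```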
decl ::= decl-extension
decl ::= decl-func
decl ::= decl-import
decl ::= decl-oneof
decl ::= decl-protocol
decl ::= decl-struct
decl ::= decl-typealias
decl ::= decl-var
translation-unit ::= stmt-brace-item*
The top level of a Swift source file is grammatically identical to the contents of a brace statement. However, some declarations have semantic restrictions that allow them only within a translation unit.
decl-import ::= 'import' attribute-list any-identifier ('.' any-identifier)*
'import' declarations allow named values and types to be accessed with local names, even when they are defined in other modules and namespaces. See the section on name binding for more information on how these work. import declarations are only allowed at translation unit scope.
'import' directives only impact a single translation unit: imports in one swift file do not affect name lookup in another file. import directives can only occur at the top level of a file, not within a function or namespace.
If a single identifier is specified for the import declaration, then the entire module is imported into the current scope. If a scope (such as a namespace) is named, then all entities in the namespace are imported. If a specific type or variable is named (e.g. "import swift.int") then only the one type and/or value is imported. If the named value is overloaded, then the entire overload set is imported.
// Import all of the top level symbols and types in a package.
import swift
// Import all of the symbols within a namespace.
import swift.io
// Import a single variable, function, type, etc.
import swift.io.bufferedstream
// Import all multiplication overloads.
import swift.*
decl-extension ::= 'extension' type-identifier '{' decl* '}'
'extension' declarations allow adding member declarations to existing types, even in other translation units and modules. There are different semantic rules for each type that is extended.
'var' decls that do not have a getter or setter are not allowed in a oneof or struct; data members should be used instead.
decl-var ::= 'var' attribute-list pattern initializer?
decl-var ::= 'var' attribute-list identifier ':' type-annotation '{' var-get-set '}'
decl-var-simple ::= 'var' identifier ':' type-annotation
initializer ::= '=' expr
var-get-set ::= var-get var-set?
var-get-set ::= var-set var-get
var-get ::= 'get' stmt-brace
var-set ::= 'set' var-set-name? stmt-brace
var-set-name ::= '(' identifier ')'
'var' declarations form the backbone of value declarations in Swift. A var declaration takes a pattern and an optional initializer, and declares all the pattern-identifiers in the pattern as variables. If there is an initializer and the pattern is fully-typed, the initializer is converted to the type of the pattern. If there is an initializer and the pattern is not fully-typed, the type of the initializer is computed independently of the pattern, and the type of the pattern is derived from the initializer. If no initializer is specified, the pattern must be fully-typed, and the values are default-initialized.
A var declaration may contain a getter and (optionally) a setter, which will be used when reading or writing the variable, respectively. Such a variable does not have any associated storage. A var declaration with a getter or setter must have a type (call it T). The getter function, whose body is provided as part of the var-get clause, has type () -> T. Similarly, the setter function, whose body is part of the var-set clause (if provided), has type (T) -> (). If the var-set clause contains a var-set-name clause, the identifier of that clause is used as the name of the parameter to the setter. Otherwise, the parameter name is "value".
FIXME: Should the type of a pattern which isn't fully typed affect the type-checking of the expression (i.e. should we compute a structured dependent type)?
Like all other declarations, var's can optionally have a list of attributes applied to them.
The type of a variable must be materializable. A variable is an lvalue unless it has a var-get clause but no var-set clause.
Here are some examples of var declarations:
// Simple examples.
var a = 4
var b : int
var c : int = 42
// This decodes the tuple return value into independently named parts
// and both 'val' and 'err' are in scope after this line.
var (val, err) = foo()
// Variable getter/setter
var _x : int = 0
var x_modify_count : int = 0
var x : int {
  get { return _x }
  set {
    x_modify_count = x_modify_count + 1
    _x = value
  }
}
Note that both 'get' and 'set' are context-sensitive keywords, which means that at both global and local scope, there is a syntactic ambiguity between a variable with a var-get-set clause and a variable followed by a stmt-brace. This ambiguity is resolved in favor of a variable with a var-get-set clause if the token following the opening '{' is either 'set' or 'get'.
decl-func ::= 'static'? 'func' attribute-list any-identifier func-signature stmt-brace?
'func' is a declaration for a function. The argument list and optional return value are specified by the type production of the function, and the body is either a brace expression or elided. Like all other declarations, functions can have attributes.
If the type is not syntactically a function type (i.e., has no -> in it at top-level), then the return value is implicitly inferred to be "()". All of the argument and return value names are injected into the scope of the function body.
A function in an extension of some type (or in other places that are semantically equivalent to an extension) implicitly gets a 'this' argument with these rules ... [todo]
'static' functions are only allowed in an extension of some type (or in other places that are semantically equivalent to an extension). They indicate that the function is actually defined on the metatype for the type, not on the type itself. Thus it does not implicitly get a first 'this' argument, and can be used with dot syntax on the metatype.
TODO: Func should be an immutable name binding, it should implicitly add an attribute immutable when it exists.
TODO: Incoming arguments should be readonly, result should be implicitly writeonly when we have these attributes.
func-signature ::= pattern-tuple+ func-signature-result?
func-signature-result ::= '->' type
A function signature specifies one or more sets of parameter patterns, plus an optional result type.
When a result type is not written, it is implicitly the empty tuple type, ().
In the body of the function described by a particular signature, all the variables bound by all of the parameter patterns are in scope, and the function must return a value of the result type.
An outermost pattern in a function signature must be fully-typed and irrefutable. If a result type is given, it must also be fully-typed.
The type of a function with signature (P0)(P1)..(Pn) -> R is T0 -> T1 -> .. -> Tn -> R, where Ti is the bottom-up type of the pattern Pi. This is called "currying". The behavior of each intermediate function (those which do not return R) is to capture its arguments, plus any arguments from prior patterns, and return a function which takes the next set of arguments. When the "uncurried" function is called (the one taking Tn and returning R), all of the arguments are then available and the function body is finally evaluated as normal.
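The capture-and-return behavior of the intermediate functions can be modelled with closures. The following Python sketch is only an analogy for the semantics described above; the helper curry and the lambda standing in for the function body are hypothetical, and each call receives one parameter tuple.

```python
def curry(body, arity):
    """Turn body(args_0, ..., args_n) into a chain of closures, one per parameter tuple."""
    def step(captured):
        def take(*args):
            # Capture this argument tuple along with all earlier ones.
            collected = captured + (args,)
            if len(collected) == arity:
                # Final ("uncurried") call: every parameter tuple is available,
                # so the body is evaluated.
                return body(*collected)
            return step(collected)
        return take
    return step(())

# Models: func d(a : int) (b : int) -> (res1 : int, res2 : int) { return (a,b) }
d = curry(lambda first, second: (first[0], second[0]), 2)
```

Here d(1) returns a function that remembers a, and d(1)(2) evaluates the body with both arguments available.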
Here are some examples of func definitions:
// Implicitly returns (), aka void
func a() {}
// Same as 'a'
func a1() -> void {}
// Function pointers to a function expression.
var a2 = func ()->() {}
var a3 = func () {}
var a4 = func {}
// Really simple function
func c(arg : int) -> int { return arg+4 }
// Simple operators.
func [infix_left=190] + (lhs : int, rhs : int) -> int
func [infix_left=160] == (lhs : int, rhs : int) -> bool
// Curried function with multiple return values:
func d(a : int) (b : int) -> (res1 : int, res2 : int) {
  return (a,b)
}
// A more realistic example on a trivial type.
struct bankaccount {
  amount : int
  static func bankaccount() -> bankaccount {
    // Custom 'constructor' logic goes here.
  }
  func deposit(arg : int) {
    amount = amount + arg
  }
  static func someMetaTypeMethod() {}
}
// Dot syntax on metatype.
bankaccount.someMetaTypeMethod()
decl-typealias ::= 'typealias' identifier ':' type
'typealias' makes a named alias of a type, like a typedef in C. From that point on, the alias may be used anywhere the aliased type may be used.
Here are some examples of type aliases:
// location is an alias for a tuple of ints.
typealias location : (x : int, y : int)
// pair_fn is a function that takes two ints and returns a tuple.
typealias pair_fn : (int) -> (int) -> (first : int, second : int)
'typealias' is the core semantic model for all named types. For example, a oneof decl uses an implicit 'typealias' to represent the name for the type.
decl-oneof ::= 'oneof' attribute-list identifier oneof-body
oneof-body ::= '{' (oneof-element (',' oneof-element)*)? decl* '}'
oneof-element ::= identifier
oneof-element ::= identifier ':' type-annotation
A oneof declaration provides direct access to oneof types with a typealias declaration specifying a name. Please see oneof types for more information about their capabilities.
A 'oneof' may include a list of decls after its member types, which is syntactic sugar for defining an extension of the type. The limitations of oneof extensions apply here as well.
Here are some examples of oneof declarations:
// Declare discriminated union with oneof decl.
oneof SomeInts {
  None,
  One : int,
  Two : (int, int)
}
// Declares three "enums".
oneof DataSearchFlags {
  None, Backward, Anchored
}
func f1(searchpolicy : DataSearchFlags) // DataSearchFlags is a valid type name
func test1() {
  f1(DataSearchFlags.None) // Use of constructor with qualified identifier
  f1(.None) // Use of constructor with context sensitive type inference
  // "None" has no type argument, so the constructor's type is "DataSearchFlags".
  var a : DataSearchFlags = .None
}
oneof SomeMoreInts {
  None, // Doesn't conflict with previous "None".
  One : int,
  Two : (int, int)
}
func f2(a : SomeMoreInts)
func test2() {
  // Constructors for oneof element can be used in the obvious way.
  f2(.None)
  f2(.One(4))
  f2(.Two(1, 2))
  // Constructor for None has type "SomeMoreInts".
  var a : SomeMoreInts = SomeMoreInts.None
  // Constructor for One has type "(int) -> SomeMoreInts".
  var b : (int) -> SomeMoreInts = SomeMoreInts.One
  // Constructor for Two has type "(int,int) -> SomeMoreInts".
  var c : (int,int) -> SomeMoreInts = SomeMoreInts.Two
}
decl-struct ::= 'struct' attribute-list identifier '{' decl-struct-body '}'
decl-struct-body ::= type-tuple-body? decl*
A struct declares a simple value type that can contain data members and have methods. A struct is syntactic sugar for more primitive forms: a oneof declaration of a single element, a 'typealias', a constructor, and potentially an extension with functions in it. Unlike tuple elements, all members of the struct must be named.
Since the element types of a oneof must be materializable, it follows that all the struct field types must be as well.
A 'struct' may include a list of decls after its member types, which is syntactic sugar for defining an extension of the type. The limitations of struct extensions apply here as well.
These two declarations are equivalent (other than their names):
struct S1 { a : int, b : int }
oneof S2 {
  S2 : (a : int, b : int)
}
Likewise, these are two equivalent ways of writing the same thing:
struct S3a { x = 4, y : int }
struct S3b {
  x = 4
  y : int
}
Likewise, a func declared in a struct is equivalent to declaring an extension containing the func.
struct S4a {
  x : int
  func doStuff() { ... }
}
struct S4b {
  x : int
}
extension S4b {
  func doStuff() { ... }
}
Here are some more realistic examples of structs:
struct Point { x : int, y : int }
struct Size { width : int, height : int }
struct Rect {
  origin : Point,
  size : Size
  typealias CoordinateType : int
  func area() -> int { return size.width*size.height }
}
func test4() {
  var a : Point
  var b = Point.Point(1, 2) // Silly but fine.
  var c = Point(y = 1, x = 2) // Using metatype.
  var x1 = Rect(a, Size(42, 123))
  var x2 = Rect(size = Size(width = 42, height=123), origin = a)
  // width and height are members of Size, so they are reached through 'size'.
  var x1_area = x1.size.width*x1.size.height
  var x1_area2 = x1.area()
}
decl-protocol ::= 'protocol' attribute-list identifier protocol-body
protocol-body ::= '{' protocol-member* '}'
A protocol declaration describes an abstract interface implemented by another type. It consists of a list of members, each of which instances of the protocol are guaranteed to have. Internally, protocol declarations are modelled as a typealias that wraps a protocol type.
Here are some examples of protocols:
protocol Document {
  func title() -> string
}
protocol VersionedDocument /*: Document*/ {
  func version() -> int
}
protocol-member ::= decl-func
'func' members of a protocol define a value of function type that may be accessed with dot syntax on a value of the protocol's type. The function gets an implicit "this" argument of the protocol type.
TODO: var members and typealias members
protocol-member ::= decl-var-simple
'var' members of a protocol define "property" values that may be accessed with dot syntax on a value of the protocol's type. The declaration itself has function type, where the argument is an implicit "this" argument of the protocol type.
TODO: typealias members
attribute-list ::= /*empty*/
attribute-list ::= '[' ']'
attribute-list ::= '[' attribute (',' attribute)* ']'
attribute ::= attribute-infix
attribute ::= attribute-resilience
attribute ::= attribute-byref
attribute ::= attribute-auto_closure
An attribute list is a (possibly empty) comma-separated list of attributes.
attribute-infix ::= 'infix_left' '=' integer_literal
attribute-infix ::= 'infix_right' '=' integer_literal
attribute-infix ::= 'infix' '=' integer_literal
The infix attributes may only be applied to the declaration of a function of binary operator type whose name is an operator. The name indicates the associativity of the operator, either left associative, right associative, or non-associative.
FIXME: Implement these restrictions.
attribute-resilience ::= 'resilient'
attribute-resilience ::= 'fragile'
attribute-resilience ::= 'born_fragile'
See the resilience design.
attribute-byref ::= 'byref'
byref is only valid in a type-annotation that appears within either a pattern of a function-signature or the input type of a function type.
byref indicates that the argument will be passed "by reference": the bound variable will be an l-value.
The type being annotated must be materializable. The type after annotation is never materializable.
FIXME: we probably need a const-like variant, which permits r-values (and avoids writeback when the l-value is not physical). We may also need some way of representing "this will be consumed by the nth curry".
attribute-auto_closure ::= 'auto_closure'
The auto_closure attribute modifies a function type, changing the behavior of any assignment into (or initialization of) a value with the function type. Instead of requiring that the rvalue and lvalue have the same function type, an "auto closing" function type requires its initializer expression to have the same type as the function's result type, and it implicitly binds a closure over this expression. This is typically useful for function arguments that want to capture computation that can be run lazily.
auto_closure is only valid in a type-annotation of a syntactic function type that is defined to take a syntactic empty tuple.
// An auto closure value. This captures an implicit closure over the
// specified expression, instead of the expression itself.
var a : [auto_closure] () -> int = 4
// Definition of an 'assert' function. Assertions and logging routines
// often want to conditionally evaluate their argument.
func assert(condition : [auto_closure] () -> bool)
// Definition of the || operator - it captures its right hand side as
// an autoclosure so it can short-circuit evaluate it.
func [infix_left=110] || (lhs: bool, rhs: [auto_closure] ()->bool) -> bool
// Example uses of these functions:
assert(i < j)
if (a == 0 || b == 42) { ... }
type ::= type-simple
type ::= type-function
type ::= type-array
type-simple ::= type-identifier
type-simple ::= type-tuple
type-annotation ::= attribute-list type
Swift has a small collection of core datatypes that are built into the compiler. Most datatypes that the user is exposed to are defined by the standard library or declared as user-defined types.
FIXME: Why is array a type instead of type-simple?
Each type has a corresponding metatype, with the same name as the type, that is injected into the standard name lookup scope when a type is declared. This allows access to 'static functions' through dot syntax. For example:
// Declares a type 'foo' as well as its metatype.
struct foo {
  static func bar() {}
}
// Declares x to be of type foo. A reference to a name in type context
// refers to the type itself.
var x : foo
// Accesses a static function on the foo metatype. In a value context, the
// name of a type refers to its metatype.
foo.bar()
A type may be fully-typed. A type is fully-typed unless it is 1) a function type whose result or input type is not fully-typed or 2) a tuple type with an element that is not fully-typed. A tuple element is fully-typed unless it has no explicit type (which is permitted for defaultable elements) or its explicit type is not fully-typed. In other words, a type is fully-typed unless it syntactically contains a tuple element with no explicit type annotation.
A type being 'fully-typed' informally means that the type is specified directly from its type annotation without needing contextual or other information to resolve its type.
A type may be materializable. A type is materializable unless it is 1) annotated with a byref attribute or 2) a tuple with a non-materializable element type. In general, variables must have materializable type.
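The "fully-typed" and "materializable" definitions are both structural recursions over a type, which the Python sketch below tries to make concrete. The data model (Named, Element, TupleType, FunctionType) and the two predicates are hypothetical and exist only for illustration; an Element whose type is None stands for a tuple element that has a default value but no explicit type, and treating such an element as materializable is an assumption of the sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Named:
    name: str
    byref: bool = False        # True models a [byref] annotation

@dataclass(frozen=True)
class Element:
    type: object = None        # None: element has only a default value
    label: str = ""

@dataclass(frozen=True)
class TupleType:
    elements: tuple
    byref: bool = False

@dataclass(frozen=True)
class FunctionType:
    input: object
    result: object
    byref: bool = False

def is_fully_typed(t):
    if isinstance(t, FunctionType):
        return is_fully_typed(t.input) and is_fully_typed(t.result)
    if isinstance(t, TupleType):
        return all(e.type is not None and is_fully_typed(e.type)
                   for e in t.elements)
    return True

def is_materializable(t):
    if t.byref:
        return False
    if isinstance(t, TupleType):
        # Assumption: elements without an explicit type are skipped.
        return all(e.type is None or is_materializable(e.type)
                   for e in t.elements)
    return True
```

For example, the type (a : int, b = 4) is modelled as TupleType((Element(Named("int"), "a"), Element(None, "b"))) and is not fully-typed under this sketch.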
type-identifier ::= identifier ('.' identifier)*
Named types may be used simply by using their name. Named types are introduced by typealias declarations or through type declarations that expand to one.
typealias location : (x : int, y : int)
var x : location // use of a named type.
Type names may use dot syntax to refer to named types declared in other modules or to types nested within other types.
// Direct reference to a member of another module.
var x : swift.int
type-tuple ::= lparen-any type-tuple-body? ')'
type-tuple-body ::= type-tuple-element (',' type-tuple-element)*
type-tuple-element ::= identifier value-specifier
type-tuple-element ::= type-annotation
Syntactically, tuple types are simply a (possibly empty) list of elements enclosed in parentheses. A tuple type with a single, anonymous, undefaulted element is exactly that type: the parentheses are treated as grouping parentheses.
Tuples are the low-level form of data aggregation in Swift, and are used as the building block of function argument lists, multiple return values, struct and oneof bodies, etc. Because tuples are widely accessible and available everywhere in the language, aggregate data access and transformation is uniform and powerful.
Each element of a tuple contains an optional name followed by a type and/or a default value expression, whose type conversion rules work like those in a var declaration. The name affects swizzling of elements in the tuple when tuple conversions are performed.
// Variable definitions.
var a : ()
var b : (int, int)
var c : (x : (), y : int)
var d : (a : int, b = 4) // Value is initialized to (0,4)
var e : (a : int, b = 4) = (1) // Value is initialized to (1,4)
// Tuple type inferred from an initializer:
var m = () // Type = ()
var n = (x = 1, y = 2) // Type = (x : int, y : int)
var o = (1, 2, 3) // Type = (int, int, int)
// Function argument and result are tuple types.
func foo(x : int, y : int) -> (val : int, err : int)
// oneof and struct declarations with tuple values.
struct S { a : int, b : int }
oneof Vertex {
  Point2 : (x : int, y : int),
  Point3 : (x : int, y : int, z : int),
  Point4 : (w : int, x : int, y : int, z : int)
}
type-function ::= type-tuple '->' type
Function types have a single input and single result type, separated by an arrow. Because each of the types is allowed to be a tuple, we trivially support multiple arguments and multiple results. "Function" types are more properly known as "closure" types, because they can embody any context captured when the function value was formed.
The result type of a function type must be materializable. The argument type of a function is always required to be parenthesized (a tuple). The behavior of function types may be modified with the auto_closure attribute.
Because of the grammar structure, a nested function type like "(a) -> (b) -> c" is parsed as "(a) -> ((b) -> c)". This means that if you declare a value of this type, you can pass it one argument to get a function that "takes b and returns c", or you can pass two arguments to "get a c". This is known as currying. For example:
// A simple function that takes a tuple and returns int:
var a : (a : int, b : int) -> int
// A simple function that returns multiple values:
var a : (a : int, b : int) -> (val: int, err: int)
// Declare a function that returns a function:
var x : (int) -> (int) -> int
// y has type (int) -> int
var y = x(1)
// z1 and z2 both have type int, and both have the same value (assuming
// the function had no side effects).
var z1 = x(1)(2)
var z2 = y(2)
// An auto closure value. This captures an implicit closure over the
// specified expression, instead of the expression itself.
var a : [auto_closure] () -> int = 4
A oneof type is a simple discriminated union: the runtime representation of a value of oneof type only has one of the specified elements at a time.
A oneof type declares two things: 1) the type itself as an anonymous type, and 2) each of the oneof elements declares a constructor value which creates a value of the oneof type with the specified element kind. The constructor values are defined in a nested scope within the oneof descriptor type, so they must be accessed with either a qualified identifier (if the type itself is named) or through delayed identifier resolution with context sensitive type inference.
If the oneof element has no type specified with it, then the type of the constructor is the oneof type. If a oneof element has a type "T" associated with it, then the type of the constructor is a function that takes "T" and returns the oneof type.
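The two kinds of constructor can be illustrated with a rough Python model in which a oneof value is a (type name, element name, payload) triple. This is only an analogy; make_oneof and the triple representation are hypothetical and say nothing about the real runtime layout.

```python
def make_oneof(type_name, elements):
    """elements maps element name -> payload type name, or None for no payload."""
    constructors = {}
    for name, payload in elements.items():
        if payload is None:
            # No associated type: the constructor is itself a value of the oneof type.
            constructors[name] = (type_name, name, None)
        else:
            # Associated type T: the constructor is a function from T to the oneof type.
            constructors[name] = lambda value, name=name: (type_name, name, value)
    return constructors

# Models: oneof SomeMoreInts { None, One : int, Two : (int, int) }
SomeMoreInts = make_oneof(
    "SomeMoreInts", {"None": None, "One": "int", "Two": "(int, int)"})
```

In this model SomeMoreInts["None"] is already a value, while SomeMoreInts["One"] must be called with a payload to produce one.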
A default initialized value of oneof type is initialized to the first element type in the list, with the default value for its element type.
All of the element types of a oneof type must be materializable.
Oneof types are implicitly created when declaring a oneof or struct, which produces a typealias and an underlying oneof type. There is no way to create an anonymous oneof type in the type production of the grammar.
TODO: Should attributes be allowed on oneof elements?
TODO: Eventually, with generics we'll have equality and inequality operators. Oneof decls should implicitly define these for their types.
TODO: Need pattern matching and element extraction.
type-array ::= type '[' ']'
type-array ::= type '[' expr ']'
Array types include a base type and an optional size. Array types indicate a linear sequence of elements stored consecutively in memory. Array elements may be efficiently indexed in constant time. All array indexes are bounds checked, and out-of-bounds accesses are diagnosed with either a compile time or runtime failure (TODO: runtime failure mode not specified).
While they look syntactically very similar, an array type with a size has very different semantics than an array without. In the former case, the type indicates a declaration of actual storage space. In the latter case, the type indicates a reference to storage space of runtime-specified size that is allocated elsewhere.
FIXME: We should separate out "Arrays" from "Slices". An array should always require a size and be by-value; a slice should be by-ref and never have a (statically specified) size.
For an array with a size, the size must be greater than zero (otherwise no indices would be valid). For now, the array size must be a literal integer. TODO: Define a notion like C's integer-constant-expression for how constant folding works.
The element type of an array type must be materializable.
FIXME: int[][] not valid because the element type isn't sized. We need some constraint to reject this, or do we?
Some example array types:
// A simple array declaration:
var a : int[4]
// A reference to another array:
var b : int[] = a
// Declare a two dimensional array:
var c : int[4][4]
// Declare a reference to another array, two dimensional:
var d : int[4][]
// Declare an array of function pointers:
var array_fn_ptrs : (: (int) -> int)[42]
var g = array_fn_ptrs[12](4)
// Without parens, this is a function that returns a fixed size array:
var fn_returning_array : (int) -> int[42]
var h : int[42] = fn_returning_array(4)
// You can even have arrays of tuples and other things, these work right
// through composition:
var array_of_tuples : (a : int, b : int)[42]
var tuple_of_arrays : (a : int[42], b : int[42])
array_of_tuples[12].a = array_of_tuples[13].b
tuple_of_arrays.a[12] = tuple_of_arrays.b[13]
pattern-atom ::= pattern-identifier
pattern-atom ::= pattern-tuple
pattern ::= pattern-atom
pattern ::= pattern-typed
The basic pattern grammar is a literal "atom" followed by an optional type annotation. Type annotations are useful for documentation, as well as for coercing a matched expression to a particular kind. They are also required when patterns are used in a function signature.
A pattern has a type. A pattern may be "fully-typed", meaning informally that its type is fully determined by the type annotations it contains. Some patterns may also derive a type from their context, be it an enclosing pattern or the way it is used; this set of situations is not yet fully determined.
A pattern may be "irrefutable", meaning informally that it matches all values of its type. Patterns in some contexts are required to be irrefutable.
pattern-typed ::= pattern-atom ':' type-annotation
A type annotation constrains a pattern to have a specific type. An annotated pattern is fully-typed if its annotation type is fully-typed. It is irrefutable if and only if its subpattern is irrefutable.
pattern-identifier ::= identifier
An identifier pattern binds a value to a particular name, which is then a legal variable of the pattern's type within its scope. It is irrefutable. It is not fully-typed; the type must be inferred from context.
As a special case, if the identifier is _ then no variable comes into scope, and the value matched is lost. Such an identifier pattern is called an "ignore pattern". An identifier pattern which is not an ignore pattern is called a "named pattern".
The type of a named pattern must be materializable unless it appears in a function-signature and is directly a byref-annotated type.
pattern-tuple ::= '(' pattern-tuple-body? ')'
pattern-tuple-body ::= pattern-tuple-element
pattern-tuple-body ::= pattern-tuple-element ',' pattern-tuple-body
pattern-tuple-element ::= pattern
pattern-tuple-element ::= pattern '=' expr
A tuple pattern is a list of zero or more patterns. Within a function signature, patterns may also be given a default-value expression.
A tuple pattern is irrefutable if all its sub-patterns are irrefutable.
A tuple pattern is fully-typed if all its sub-patterns are fully-typed, in which case its type is the corresponding tuple type, where each type-tuple-element has the type, label, and default value of the corresponding pattern-tuple-element. A pattern-tuple-element has a label if it is a named pattern or a type annotation of a named pattern.
As a special case, a tuple pattern with one element that lacks both a label and default value is treated as a grouping parenthesis: it has the type of its constituent pattern, not a tuple type.
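The rules above for irrefutability, full typing, labels, and grouping parentheses can be modeled compactly. The following is a minimal sketch in Python rather than Swift, and is not the compiler's representation: the class and function names (Identifier, Typed, Element, TuplePattern, is_irrefutable, is_fully_typed, label_of, is_grouping) are hypothetical, and every annotation type is assumed to be fully-typed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Identifier:
    name: str  # "_" models the ignore pattern


@dataclass(frozen=True)
class Typed:
    sub: "Pattern"
    annotation: str  # assumption: the annotation type is itself fully-typed


@dataclass(frozen=True)
class Element:
    pattern: "Pattern"
    default: Optional[str] = None  # default-value expression, kept as text


@dataclass(frozen=True)
class TuplePattern:
    elements: Tuple[Element, ...]


Pattern = Union[Identifier, Typed, TuplePattern]


def is_irrefutable(p):
    if isinstance(p, Identifier):
        return True  # identifier patterns always match
    if isinstance(p, Typed):
        return is_irrefutable(p.sub)  # same as the subpattern
    return all(is_irrefutable(e.pattern) for e in p.elements)


def is_fully_typed(p):
    if isinstance(p, Identifier):
        return False  # type must come from context
    if isinstance(p, Typed):
        return True  # under the assumption stated above
    return all(is_fully_typed(e.pattern) for e in p.elements)


def label_of(p):
    # A label comes from a named pattern, or an annotation of a named pattern.
    if isinstance(p, Typed):
        p = p.sub
    if isinstance(p, Identifier) and p.name != "_":
        return p.name
    return None


def is_grouping(p):
    # One element, no label, no default value: grouping parenthesis.
    if not isinstance(p, TuplePattern) or len(p.elements) != 1:
        return False
    element = p.elements[0]
    return element.default is None and label_of(element.pattern) is None
```

Under this model, (x : int) is a one-element tuple pattern because x supplies a label, while (_ : int) is only a grouping parenthesis.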
expr ::= expr-unary expr-binary*
expr-primary ::= expr-literal
expr-primary ::= expr-identifier
expr-primary ::= expr-explicit-closure
expr-primary ::= expr-anon-closure-arg
expr-primary ::= expr-paren
expr-primary ::= expr-delayed-identifier
expr-primary ::= expr-func
expr-postfix ::= expr-primary
expr-postfix ::= expr-dot
expr-postfix ::= expr-subscript
expr-postfix ::= expr-call
At the top level of the expression grammar, expressions are a sequence of unary expressions joined by operators. When parsing an expr, any operator immediately following an expr-unary continues the expression, and the program is ill-formed if it is not then followed by another expr-unary. This resolves an ambiguity which could otherwise arise in statement contexts due to semicolon elision.
5 !- +~123 -+- ~+6
(foo)(())
bar(49+1)
baz()
expr-binary ::= operator expr-unary
Infix binary expressions are not formed during parsing. Instead, they are formed after name resolution by building a tree from an operator-delimited sequence of unary expressions. Precedence and associativity are determined by the infix attribute on the resolved names, which must fully agree.
If an operator is used as a binary operator, but name resolution does not find at least one function of binary operator type, the expression is ill-formed.
A simple example is:
4 + 5 * 123
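To illustrate how a flat, operator-delimited sequence might be folded into a tree, here is a minimal precedence-climbing sketch in Python. It is one possible technique, not necessarily the one the prototype uses. The function name fold and the tuple-based tree shape are hypothetical; the precedence values are the infix_left values from the standard library listing in this document, so every operator is treated as left-associative.

```python
# Operator -> precedence. All of these are declared infix_left in the
# standard library listing, so left associativity is assumed throughout.
PRECEDENCE = {
    "*": 200, "/": 200, "%": 200,
    "+": 190, "-": 190,
    "<": 170, ">": 170, "<=": 170, ">=": 170,
    "==": 160, "!=": 160,
    "&&": 120,
    "||": 110,
}


def fold(sequence):
    """Fold [operand, op, operand, op, ...] into nested (op, lhs, rhs) tuples."""
    position = 0

    def parse(min_precedence):
        nonlocal position
        lhs = sequence[position]
        position += 1
        while position < len(sequence):
            op = sequence[position]
            precedence = PRECEDENCE[op]
            if precedence < min_precedence:
                break
            position += 1
            # Left-associative: the right operand may only absorb operators
            # that bind strictly tighter than this one.
            rhs = parse(precedence + 1)
            lhs = (op, lhs, rhs)
        return lhs

    return parse(0)


print(fold([4, "+", 5, "*", 123]))  # ('+', 4, ('*', 5, 123))
```

Because the fold runs on an already-parsed sequence, the parser itself never needs to know any operator's precedence.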
expr-unary ::= operator* expr-postfix
If an operator is used as a unary operator, but name resolution does not find at least one function that takes a single argument, the expression is ill-formed.
Simple examples:
i = -j
expr-literal ::= integer_literal
expr-literal ::= floating_literal
expr-literal ::= string_literal
Literals are integer, floating point, or string literals depending on their lexical form. The type of a literal is inferred from its context. If there is no contextual type information for an expression, unresolved literal types are inferred to be 'IntegerLiteralType', 'FloatLiteralType', and 'StringLiteralType', respectively. If a literal is used and these types are not defined, then the code is malformed.
A literal is compatible with its inferred type if that type implements an informal protocol required by literals. This informal protocol requires that the type have an unambiguous "static" function defined whose result type is the same as the inferred type, and that takes a single argument that is either itself literal compatible, or is a builtin integer type.
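The defaulting rule for literals can be sketched as follows. This is a Python model, not compiler code: the function name infer_literal_type, its parameters, and the use of strings for type names are all hypothetical, and the check for the informal literal protocol is left out.

```python
# Fallback types named in the text, keyed by the literal's lexical form.
DEFAULT_LITERAL_TYPES = {
    "integer": "IntegerLiteralType",
    "floating": "FloatLiteralType",
    "string": "StringLiteralType",
}


def infer_literal_type(kind, contextual_type=None, defined_types=None):
    """Return the type a literal takes, given optional context."""
    if defined_types is None:
        # Assumption for the sketch: the library defines all three defaults.
        defined_types = set(DEFAULT_LITERAL_TYPES.values())
    if contextual_type is not None:
        return contextual_type  # context wins when it is available
    default = DEFAULT_LITERAL_TYPES[kind]
    if default not in defined_types:
        raise TypeError(f"malformed: {default} is not defined")
    return default
```

For example, infer_literal_type("integer", "int8") yields "int8", while infer_literal_type("integer") falls back to "IntegerLiteralType".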
expr-identifier ::= identifier
A raw identifier refers to a value found via unqualified value lookup, and has the type of the declaration returned by name lookup and overload resolution. Value declarations are installed with var and the syntactic sugar forms like func declarations.
expr-explicit-closure ::= '{' expr '}'
A closure expression is a super-concise version of expr-func for cases where very simple predicates and other small closures are needed (e.g. sorting and searching predicates). It uses Swift's aggressive type system to infer both the argument and return value types for the closure from the context it is used in, and allows access to the formal arguments of the closure through anonymous closure argument expressions.
It is illegal to use these expressions when there is insufficient context to infer the argument and return types of the closure.
Note that expr-explicit-closure is ambiguous with stmt-brace when used in another stmt-brace or in translation-unit scope. This ambiguity is resolved towards stmt-brace, because these contexts never have enough contextual information to infer the type of the closure, so parsing them as closures would always produce a semantic error.
// Takes a closure that it calls to determine an ordering relation.
func magic(val : int, predicate : (a : int, b : int) -> bool)
func f() {
// Compare one way. Closure is inferred to return bool and take two ints
// from the argument context. This same information infers that $0 and $1
// both have type 'int'.
magic(42, { $0 < $1 })
// Compare the other way.
magic(42, { $1 < $0 })
// Error, not enough context to infer the type of $0.
var x = { $0 }
}
expr-anon-closure-arg ::= dollarident
A use of an identifier whose name fits the "$[0-9]+" regular expression is a reference to an anonymous closure argument that is formed when the containing expression is coerced into a closure context. All other dollar identifiers are invalid.
This can only be used in the body of an expr-explicit-closure.
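The "$[0-9]+" rule is easy to check mechanically. The sketch below is in Python and uses a hypothetical helper name, anon_closure_arg_index; it validates a dollar identifier and returns the index of the closure argument it names.

```python
import re

# The pattern quoted in the text, with "$" escaped for the regex engine.
ANON_CLOSURE_ARG = re.compile(r"\$[0-9]+")


def anon_closure_arg_index(name):
    """Return the argument index for "$N", or reject the identifier."""
    if ANON_CLOSURE_ARG.fullmatch(name) is None:
        raise ValueError(f"invalid dollar identifier: {name!r}")
    return int(name[1:])


print(anon_closure_arg_index("$0"))   # 0
print(anon_closure_arg_index("$12"))  # 12
```

An identifier such as "$x" or a bare "$" fails the match and is rejected, matching the rule that all other dollar identifiers are invalid.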
expr-delayed-identifier ::= '.' identifier
A delayed identifier expression refers to a constructor of a oneof type, without knowing which type it is referring to. The expression is resolved to a constructor of a concrete type through context sensitive type inference.
oneof Direction { Up, Down }
func search(val : int, direction : Direction)
func f() {
search(42, .Up)
search(17, .Down)
}
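Resolution of a delayed identifier amounts to a lookup keyed by the contextual type. The following Python sketch models that idea only; the ONEOFS table, the resolve_delayed name, and the string result are hypothetical stand-ins for what the type checker would actually track.

```python
# Known oneof types and their constructors (a stand-in for real declarations).
ONEOFS = {
    "Direction": ("Up", "Down"),
    "bool": ("true", "false"),
}


def resolve_delayed(name, contextual_type):
    """Resolve '.name' against the oneof type supplied by context."""
    if contextual_type is None:
        raise TypeError(f"no context to resolve .{name}")
    if name not in ONEOFS.get(contextual_type, ()):
        raise NameError(f"{contextual_type} has no constructor {name}")
    return f"{contextual_type}.{name}"


# In search(42, .Up) the second parameter has type Direction.
print(resolve_delayed("Up", "Direction"))  # Direction.Up
```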
expr-paren ::= lparen-any ')'
expr-paren ::= lparen-any expr-paren-element (',' expr-paren-element)* ')'
expr-paren-element ::= (identifier '=')? expr
Parentheses expressions contain an (optionally empty) list of optionally named values. Parentheses in an expression context denote one of two things: 1) grouping parentheses, or 2) a tuple literal.
Grouping parentheses occur when there is exactly one value in the list and that value does not have a name. In this case, the type of the parenthesis expression is the type of the single value.
All other cases are tuple literals. The type of the expression is a tuple type whose elements and order match that of the initializer. If there are any named elements, those elements become names for the tuple type. A parenthesis expression with no value has a type of the empty tuple.
Note that some enclosing productions restrict the lparen-any to a lparen-unspaced.
Some examples:
// Simple grouping parenthesis.
var a = (4) // Type = int
var b = (4+a) // Type = int
// Tuple literals.
var c = () // Type = ()
var d = (4, 5) // Type = (:int,:int)
var e = (c, d) // Type = ((), (:int, :int))
var f = (x = 4, y = 5) // Type = (x : int, y : int)
var g = (4, y = 5, 6) // Type = (:int, y : int, :int)
// Named arguments to functions.
func foo(a : int, b : int)
foo(b = 4, a = 1)
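The distinction between grouping parentheses and tuple literals depends only on the number of elements and whether they are named. A minimal Python sketch of that classification follows; the function name paren_type and the string rendering of types are hypothetical, chosen to resemble the comments in the examples above.

```python
def paren_type(elements):
    """Compute the type of a parenthesis expression.

    elements is a list of (name, type) pairs, where name is None for an
    unnamed value and type is the already-known type of that value.
    """
    # Exactly one unnamed value: grouping parentheses, type passes through.
    if len(elements) == 1 and elements[0][0] is None:
        return elements[0][1]
    # Everything else, including the empty list, is a tuple literal.
    parts = [f"{name} : {ty}" if name else f":{ty}" for name, ty in elements]
    return "(" + ", ".join(parts) + ")"


print(paren_type([(None, "int")]))                # int
print(paren_type([]))                             # ()
print(paren_type([("x", "int"), ("y", "int")]))   # (x : int, y : int)
```

Note that a single named element, such as (x = 4), is still a tuple literal under this rule.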
expr-func ::= 'func' func-signature? stmt-brace
A func expression is an anonymous (unnamed) function literal definition, which can define named arguments (and whose names are in scope for its body) and that can refer to values defined in parent scopes.
A func expression captures a reference to any values in parent scopes that are used.
If the function signature is omitted, it is implicitly () -> ().
TODO: Allow attributes on funcs when useful.
// A simple func expression.
var a = func(val : int) { print(val+1) }
// A recursive func expression.
var fib = func(n : int) -> int {
if (n < 2) { return n; }
return fib(n-1)+fib(n-2)
}
expr-dot ::= expr-postfix '.' dollarident
If the base expression has tuple type, then the magic identifier "$[0-9]+" accesses the specified anonymous member of the tuple. Otherwise, this form is invalid.
expr-dot ::= expr-postfix '.' identifier
If the base expression has tuple type and if the identifier is the name of a field in the tuple, then this is a reference to the specified field.
Otherwise, dot name lookup is performed, and this expression is treated as function application. This allows looking up members in modules, metatypes, etc.
expr-subscript ::= expr-postfix '[' expr ']'
FIXME: Array subscript should just be an overloaded operator like any other. a[x] should just call a function subscript(a, x). This will allow natural support for arrays and dictionaries, whose implementation of subscript comes from the stdlib.
FIXME2: Two problems with this: 1) we need support for lvalue function calls e.g. f(a, x) = 4, otherwise we can't use an array subscript as an lvalue. 2) we want overloading in the long term to get non-array types.
expr-call ::= expr-postfix expr-paren
The leading '(' of the expr-paren must be a lparen-unspaced. This greatly reduces the likelihood of confusion from semicolon elision, without requiring feedback from the typechecker or more aggressive whitespace sensitivity.
Simple examples:
// Application of an empty tuple to the function f.
f()
// Application of 4 to the function f.
g(4)
// Application of 4 to the function returned by h().
var h : () -> (int) -> int
...
h()(4)
// Two separate statements
i()
(j <+ 2)()
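The last example relies on the lexer distinguishing the two kinds of left parenthesis. The Python sketch below models that distinction under a stated assumption: a '(' counts as lparen-unspaced when the character immediately before it is not whitespace, and a '(' at the very start of the input is treated as spaced. The function name classify_lparens is hypothetical, and comments and string literals are ignored for brevity.

```python
def classify_lparens(source):
    """Label each '(' in source as lparen-spaced or lparen-unspaced."""
    kinds = []
    for index, char in enumerate(source):
        if char != "(":
            continue
        # Assumption: start of input behaves like preceding whitespace.
        spaced = index == 0 or source[index - 1].isspace()
        kinds.append("lparen-spaced" if spaced else "lparen-unspaced")
    return kinds


print(classify_lparens("i()\n(j <+ 2)()"))
# ['lparen-unspaced', 'lparen-spaced', 'lparen-unspaced']
```

Only the unspaced parentheses can begin an expr-call, so the '(' that opens the second line starts a new statement instead of applying the result of i().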
stmt ::= stmt-semicolon
stmt ::= stmt-assign
stmt ::= stmt-brace
stmt ::= stmt-return
stmt ::= stmt-if
stmt ::= stmt-while
Statements provide the control flow constructs of function bodies and top-level code.
// A function with some statements.
func fib(v : int) -> int {
if v < 2 {
return v
}
return fib(v-1)+fib(v-2)
}
stmt-semicolon ::= ';'
The semicolon statement has no effect.
stmt-assign ::= expr '=' expr
The assignment statement evaluates its left hand side as some sort of lvalue, then evaluates the right hand side, then assigns the right-hand value to the left-hand lvalue. FIXME: The requirements for lvalues should be described, and tied into a description of lvalue types.
Because assignment is a statement rather than an expression, it cannot appear where an expression is required, such as in a condition:
if (x = 1)
It also implies that nested assignments are illegal:
x = y = z
stmt-brace ::= '{' stmt-brace-item* '}'
stmt-brace-item ::= decl
stmt-brace-item ::= expr
stmt-brace-item ::= stmt
The brace statement provides a sequencing operation which evaluates the members of its body in order. Function bodies and the bodies of control flow statements use braces. Also, the translation unit itself is effectively a brace statement without the braces.
stmt-return ::= 'return' expr
stmt-return ::= 'return'
The return statement sets the return value of the current func declaration or func expression and transfers control out of the function. It sets the return value by converting the specified expression result (or '()' if none is specified) to the return type of the 'func'.
The stmt-return grammar is ambiguous: "{ return 4 }" could be parsed as two brace items (a bare "return" followed by the expression "4") or as a single return statement with an operand. Ambiguity here is resolved toward the first production ('return' expr), because control flow can't transfer to a subexpression.
stmt-if ::= 'if' expr stmt-brace stmt-if-else?
stmt-if-else ::= 'else' stmt-brace
stmt-if-else ::= 'else' stmt-if
'if' statements provide a simple control transfer operation that evaluates the condition, invokes the 'getLogicValue' member of the result if the result is not a 'bool', then determines the direction of the branch based on the result. (Internally, the standard library type 'bool' has a getLogicValue member that returns a 'Builtin.Int1'.) It is an error if the type of the expression is context-dependent or some non-bool type.
Some examples include:
if true {
/*...*/
}
if X == 4 {
} else {
}
if X == 4 {
} else if X == 5 {
} else {
}
stmt-while ::= 'while' expr stmt-brace
'while' statements provide a simple loop construct which (on each iteration of the loop) evaluates the condition, invokes the 'getLogicValue' member of the result if the result is not a 'bool', then determines whether to keep looping. (Internally, the standard library type 'bool' has a getLogicValue member that returns a 'Builtin.Int1'.) It is an error if the type of the expression is context-dependent or some non-bool type.
Some examples include:
while true {
/*...*/
}
while X == 4 {
X = 3
}
Name binding in swift is performed in different ways depending on what language entity is being considered:
Value names (for var and func declarations) and type names (for typealias, oneof, and struct declarations) follow the same scope and name lookup rules as described below.
tuple element names
scope within oneof decls
Context sensitive member references are resolved during type checking.
Basic algo:
Dot Expressions bind to name of tuple elements.
Binary expressions, function application, etc.
This describes some of the standard swift code as it is being built up. Since Swift is designed to give power to the library developers, much of what is normally considered the "language" is actually just implemented in the library.
All of this code is published by the 'swift' module, which is implicitly imported into each translation unit, unless some sort of pragma in the code (attribute on an import?) is used to change or disable this behavior.
In the initial Swift implementation, a module named Builtin is imported into every file. Its declarations can only be found by dot syntax. It provides access to a small number of primitive representation types and operations defined over them that map directly to LLVM IR.
The existence and details of this module are a private implementation detail used by our implementation of the standard library. Swift code outside the standard library should not be aware of this module, and an independent implementation of the Swift standard library should be free to do without the builtin module if it so desires.
For reference below, the description of the standard library uses the "Builtin." namespace to refer to this module, but independent implementations could use another implementation if they so desire.
// void is just a type alias for the empty tuple.
typealias void : ()
// Fixed size types are simple structs of the right size.
struct int8 { value : Builtin.Int8 }
struct int16 { value : Builtin.Int16 }
struct int32 { value : Builtin.Int32 }
struct int64 { value : Builtin.Int64 }
struct int128 { value : Builtin.Int128 }
// int is just an alias for the 64-bit integer type.
typealias int : int64
struct float { value : Builtin.FPIEEE32 }
struct double { value : Builtin.FPIEEE64 }
// bool is a simple discriminated union.
oneof bool {
true, false
}
// Allow true and false to be used unqualified.
var true = bool.true
var false = bool.false
// Simple binary operators, following the same precedence as C.
func [infix_left=200] * (lhs: int, rhs: int) -> int
func [infix_left=200] / (lhs: int, rhs: int) -> int
func [infix_left=200] % (lhs: int, rhs: int) -> int
func [infix_left=190] + (lhs: int, rhs: int) -> int
func [infix_left=190] - (lhs: int, rhs: int) -> int
// In C, <<, >> is 180.
func [infix_left=170] < (lhs : int, rhs : int) -> bool
func [infix_left=170] > (lhs : int, rhs : int) -> bool
func [infix_left=170] <= (lhs : int, rhs : int) -> bool
func [infix_left=170] >= (lhs : int, rhs : int) -> bool
func [infix_left=160] == (lhs : int, rhs : int) -> bool
func [infix_left=160] != (lhs : int, rhs : int) -> bool
// In C, bitwise logical operators are 130,140,150.
func [infix_left=120] && (lhs: bool, rhs: ()->bool) -> bool
func [infix_left=110] || (lhs: bool, rhs: ()->bool) -> bool
// In C, 100 is ?:
// In C, 90 is =, *=, += etc.