{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350 {\fonttbl\f0\fswiss\fcharset0 Helvetica;\f1\fmodern\fcharset0 CourierNewPSMT;} {\colortbl;\red255\green255\blue255;} {\*\listtable{\list\listtemplateid1\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid1\'01\uc0\u8226 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listname ;}\listid1} {\list\listtemplateid2\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid101\'01\uc0\u8226 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listname ;}\listid2} {\list\listtemplateid3\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid201\'01\uc0\u8226 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{hyphen\}}{\leveltext\leveltemplateid202\'01\uc0\u8259 ;}{\levelnumbers;}\fi-360\li1440\lin1440 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{hyphen\}}{\leveltext\leveltemplateid203\'01\uc0\u8259 ;}{\levelnumbers;}\fi-360\li2160\lin2160 }{\listname ;}\listid3}} {\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}{\listoverride\listid2\listoverridecount0\ls2}{\listoverride\listid3\listoverridecount0\ls3}} \margl1440\margr1440\vieww23120\viewh15360\viewkind0 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \f0\b\fs24 \cf0 Elimination rules. 
\b0 \ \ When type theorists consider a programming language, we break it down like this:\ \pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \ls1\ilvl0\cf0 {\listtext \'95 }What are the kinds of fundamental and derived types in the language?\ {\listtext \'95 }For each type, what are its \i introduction rules \i0 , i.e. how do you get values of that type?\ {\listtext \'95 }For each type, what are its \i elimination rules \i0 , i.e. how do you use values of that type?\ \pard\tx560\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \cf0 \ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \cf0 Swift has a pretty small set of types right now:\ \pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \ls2\ilvl0\cf0 {\listtext \'95 }Fundamental types: currently i1, i8, i16, i32, and i64; eventually float and double; maybe others.\ {\listtext \'95 }Functions.\ {\listtext \'95 }Tuples. Heterogeneous fixed-length aggregates. Swift's system provides two basic kinds: positional and labelled.\ {\listtext \'95 }Arrays. Homogeneous fixed-length aggregates.\ {\listtext \'95 }Algebraic data types (ADTs), introduced by \i oneof \i0 . Closed disjoint unions of heterogeneous fixed-length aggregates.\ \pard\tx560\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \cf0 Adding generics won't affect this, because "unapplied" generic types aren't first-class, and "applied" generic types are always one of the above (probably always ADTs, but it doesn't matter here). 
But adding any other kind of type (vectors seem likely) means we need to consider its intro/elim rules.\ \ For most of these, intro rules are just a question of picking syntax, and we don't really need a document for that. So let's talk elimination. Generally, an elimination rule is a way of getting back to the information the intro rule(s) wrote into the value. So what are the specific elimination rules for these types? How do we use them, other than in type-generic ways like passing them as arguments to calls?\ \ \pard\tx560\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \b \cf0 Functions \b0 are used by calling them. This is something of a special case: although some values of function type may carry data, there isn't really a useful model for directly accessing it. Values of function type are basically completely opaque.\ \b Scalars \b0 are used by feeding them to primitive binary operators. This is also something of a special case, because there's no useful way in which scalars can be decomposed into separate values.\ \b Tuples \b0 are used by projecting out their elements.\ \b Arrays \b0 are used by projecting out slices and elements.\ \b ADTs \b0 are used by projecting out elements of the current alternative, but how do we determine the current alternative?\ \ \b Alternatives for alternatives. \b0 \ \ I know of three basic designs for determining the current alternative of an ADT:\ \pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \ls3\ilvl0\cf0 {\listtext \'95 }Visitor pattern: there's some way of declaring a method on the full ADT and then implementing it for each individual alternative. 
You do this in OO languages mostly because there's no direct language support for \i closed \i0 disjoint unions (as opposed to \i open \i0 disjoint unions, which is essentially just subclassing).\ \pard\tx940\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1440\fi-1440\ql\qnatural\pardirnatural \ls3\ilvl1\cf0 {\listtext \uc0\u8259 }plus: doesn't require language support\ {\listtext \uc0\u8259 }plus: easy to "overload" and provide different kinds of pattern matching on the same type\ {\listtext \uc0\u8259 }plus: straightforward to add interesting ADT-specific logic, like matching a CallExpr instead of each of its N syntactic forms\ {\listtext \uc0\u8259 }plus: simple form of exhaustiveness checking\ {\listtext \uc0\u8259 }minus: cases are separate functions, so data and control flow is awkward\ {\listtext \uc0\u8259 }minus: lots of boilerplate to enable\ {\listtext \uc0\u8259 }minus: lots of boilerplate to use\ {\listtext \uc0\u8259 }minus: nested pattern matching is awful\ \pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \ls3\ilvl0\cf0 {\listtext \'95 }Query functions: dynamic_cast, dyn_cast, isa, instanceof\ \pard\tx940\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1440\fi-1440\ql\qnatural\pardirnatural \ls3\ilvl1\cf0 {\listtext \uc0\u8259 }plus: easy to order and mix with other custom conditions\ {\listtext \uc0\u8259 }plus: low syntactic overhead for testing the alternative if you don't need to actually decompose\ {\listtext \uc0\u8259 }minus: higher syntactic overhead for decomposition\ \pard\tx1660\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li2160\fi-2160\ql\qnatural\pardirnatural \ls3\ilvl2\cf0 {\listtext \uc0\u8259 }isa/instanceof pattern requires either a separate cast or unsafe operations later\ {\listtext \uc0\u8259 }dyn_cast pattern needs a fresh variable declaration, which is very 
awkward in complex conditions\ \pard\tx940\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1440\fi-1440\ql\qnatural\pardirnatural \ls3\ilvl1\cf0 {\listtext \uc0\u8259 }minus: exhaustiveness checking is basically out the window\ {\listtext \uc0\u8259 }minus: some amount of boilerplate to enable\ \pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \ls3\ilvl0\cf0 {\listtext \'95 }Pattern matching\ \pard\tx940\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1440\fi-1440\ql\qnatural\pardirnatural \ls3\ilvl1\cf0 {\listtext \uc0\u8259 }plus: no boilerplate to enable\ {\listtext \uc0\u8259 }plus: hugely reduced syntax to use if you want a full decomposition\ {\listtext \uc0\u8259 }plus: compiler-supported exhaustiveness checking\ {\listtext \uc0\u8259 }plus: nested matching is natural\ {\listtext \uc0\u8259 }plus: with pattern guards, natural mixing of custom conditions\ {\listtext \uc0\u8259 }minus: syntactic overkill to just test for a specific alternative (e.g. to filter it out)\ {\listtext \uc0\u8259 }minus: needs boilerplate to project out a common member across multiple/all alternatives\ {\listtext \uc0\u8259 }minus: awkward to group alternatives (fallthrough is a simple option but has issues)\ {\listtext \uc0\u8259 }minus: traditionally completely autogenerated by compiler and thus not very flexible\ {\listtext \uc0\u8259 }minus: usually a new grammar production that's very ambiguous with the expression grammar\ {\listtext \uc0\u8259 }minus: somewhat fragile against adding extra data to an alternative\ \pard\tx560\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \cf0 \ I feel that this strongly points towards using pattern matching as the basic way of consuming ADTs, maybe with special dispensations for querying the alternative and projecting out common members. 
I'll ignore that \ \ Pattern matching was probably a foregone conclusion, but I wanted to spell out that having ADTs in the language is what really forces our hand because the alternatives are so bad. Once we need pattern-matching, it makes sense to provide patterns for the other kinds of types as well.\ \ \b Selection statement. \b0 \ \ This is the main way we expect users to employ non-obvious pattern-matching. We obviously need something with statement children, so this has to be a statement. That's also fine because this kind of full pattern match is very syntactically heavyweight, and nobody would want to embed it in the middle of an expression. We also want a low-weight matching expression, though, for relatively simple ADTs.\ \ statement ::= 'match' expr '\{' case-group-list '\}'\ case-group-list ::= case-group\ case-group-list ::= case-group case-group-list\ case-group ::= case-pattern+ stmt-brace-item+\ case-pattern ::= 'case' expr ':'\ case-pattern ::= 'pattern' pattern pattern-guard? ':'\ pattern-guard ::= 'where' expr\ \ We're intentionally not using "switch" here because the syntax is too similar to C but the semantics are a bit different. This is a bit subtle, but honestly there should be enough other clues. The keywords here are all up for debate.\ \ Despite the lack of grouping braces, the semantics are that the statements in each case-group form their own scope, and falling off the end causes control to resume at the end of the match statement \'97 i.e. "implicit break", not "implicit fallthrough". Chris seems motivated to eventually add an explicit 'fallthrough' statement. If we did this, my preference would be to generalize it by allowing the match to be reperformed with a new value. But I also think having local functions removes a lot of the impetus.\ \ Syntactically, braces and the choice of case/pattern keywords are all bound together. The thinking goes as follows. In Swift, statement scopes are always grouped by braces. 
It's natural to group the cases with braces as well. Doing both lets us avoid a 'case' keyword, but otherwise it leads to ugly style, because either the last case ends in two braces on the same line or cases have to be further indented. Okay, it's easy enough to not require braces on the match, with the grammar saying that cases are just greedily consumed \'97 there's no ambiguity here because the match statement is necessarily within braces. But that leaves the code without a definitive end to the cases, and the closing braces end up causing a lot of unnecessary vertical whitespace, like so:\ match (x)\ case :foo \{\ \'85\ \}\ case :bar \{\ \'85\ \}\ So instead, let's require the match statement to have braces, and we'll allow the cases to be written without them:\ match (x) \{\ case :foo:\ \'85\ case :bar:\ \'85\ \}\ That's really a lot prettier, except it breaks the rule about always grouping scopes with braces (we *definitely* want different cases to establish different scopes). Something has to give, though.\ \ Also, ":foo:" is pretty unfortunate. We want to separate the case from its body \'97 it's a huge cue \'97 and as mentioned we'd prefer to do it without requiring open/close punctuation. Colon seems obvious, since there's precedent in C and it's even roughly the right grammatical function; it just looks a little silly after ":name".\ \ The semantics of a match-statement are to first evaluate the value operand, then proceed down the list of case-patterns and execute the statements for the first case-pattern that is satisfied by the value. It is an error if a case-pattern can never trigger because the earlier case-patterns are exhaustive.\ \ A 'case' is satisfied if the value satisfies the evaluated case operand. The basic behavior will be an equality test, but there will be some point of extension to allow library "patterns" like "4..8". 
The case operand does not need to be a "constant expression" \'97 the expression can even have side-effects, although that's obviously poor style. A 'case' never binds variables.\ \ A 'pattern' is satisfied if the pattern is satisfied and the pattern-guard expression (if present) evaluates to true. The pattern-guard result must be usable as a logic operand. The guard expression is not evaluated if the pattern is not fully satisfied. Variables in the pattern are bound before the guard is evaluated.\ \ All of the case-patterns in a case-group must bind exactly the same variables with exactly the same types.\ \ Since falling out of the match is not unreasonable, there's a colorable argument that non-exhaustive matches should be okay, but I'm inclined to say that they should be errors and people who want non-exhaustive matches can put in catch-all patterns. The only complication with checking exhaustiveness is pattern guards. The obvious conservatively-safe rule is to say "ignore guarded cases during exhaustiveness checking", but some people really want to write "where x < 10" and "where x >= 10", and I can see their point. At the same time, we really don't want to go down that road.\ \ Patterns come up (or potentially come up) in a few other places in the grammar:\ \ \b Var bindings. \b0 \ \ \pard\tx220\tx720\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \cf0 Variable bindings only have a single pattern, which has to be exhaustive, which also means there's no point in supporting guards here. I think we just get this:\ decl-var ::= 'var' attribute-list? 
pattern value-specifier\ \ \b Function parameters \b0 .\ \ The functional languages all permit you to directly pattern-match in the function declaration, like this example from SML:\ \f1 fun length nil = 0\ | length (a::b) = 1 + length b\ \f0 \ \pard\tx220\tx720\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural \cf0 This is really convenient, but there's probably no reasonable analogue in Swift. One specific reason: we want functions to be callable with keyword arguments, but if you don't give all the parameters their own names, that won't work.\ \ The current Swift approximation is:\ \f1 func length(list : List) : Int \{\ match list \{\ pattern :nil: return 0\ pattern :cons(_,tail): return 1 + length(tail)\ \}\ \}\ \f0 \ That's quite a bit more syntax, but it's mostly the extra braces from the function body. We could remove those with something like this:\ \pard\tx220\tx720\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \f1 \cf0 func length(list : List) : Int = match list \{\ pattern :nil: return 0\ pattern :cons(_,tail): return 1 + length(tail)\ \}\ \f0 Anyway, that's easy to add later if we see the need.\ \pard\tx220\tx720\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \cf0 \ \b Assignment. \b0 \ \ This is a bit iffy. It's a lot like var bindings, but it doesn't have a keyword, so it's really kind of ambiguous given the pattern grammar.\ \ Also, l-value patterns are weird. I can come up with semantics for this, but I don't know what the neighbors will think:\ var perimeter : double\ :feet(x) += yard.dimensions.height // returns Feet, which has one constructor, :feet.\ :feet(x) += yard.dimensions.width\ \ It's probably better to just have l-value tuple expressions and not work in arbitrary patterns.\ \ \b Pattern-match expression. 
\b0 \ \ This is an attempt to provide that dispensation for query functions we were talking about.\ \ I think this should bind more loosely than any binary operator except assignment; effectively we should have\ expr-binary ::= # most of the current expr grammar\ \ expr ::= expr-binary\ expr ::= expr-binary 'matches' pattern pattern-guard?\ \ \pard\tx220\tx720\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \cf0 The semantics are that this evaluates to true if the pattern and pattern-guard are satisfied. If the pattern binds variables, it's an error if the expression isn't immediately used as a condition; otherwise, the variables are in scope in any code dominated by the 'true' edge. I've intentionally written this in a way that suggests it holds even within complex expressions, but at the very least this should work:\ if rect.dimensions matches (.height = h, .width = w) where h >= w \{\ \'85\ \}\ \pard\tx220\tx720\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\ql\qnatural\pardirnatural \cf0 \ The keyword 'matches' is not set in stone. It's hardly even set in sand. Clearly we should use =~. :)\ \ \b Pattern grammar. \b0 \ \ \pard\tx560\tx1120\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural \cf0 The usual syntax rule is that the pattern grammar mirrors the introduction-rule expression grammar, but with 'pattern' instead of 'expr'. This means that, for example, if we add array literal expressions, we would also need a corresponding array literal pattern. I think that principle is worth keeping, but leaf expressions can (and probably should) just be handled with 'case' instead of 'pattern'.\ \ The leaf pattern is a simple variable name. It matches everything and binds it to a new variable. It's an error to bind the same variable twice in a single pattern; I don't think we want to bite off contextually constrained patterns. 
:) It is useful to have a special "ignore this" pattern; I suggest adopting the common convention of just assigning special semantics to the identifier '_', where it doesn't actually bind anything.\ pattern ::= identifier\ \ This pattern is useful for binding an entire aggregate to a name while also matching on specific children. I'm just using Haskell syntax here, which I think is nice enough.\ pattern ::= identifier '@' pattern\ \ There's usually a pattern for annotating an arbitrary sub-pattern with a type. I think that's less important for us because type inference is so much more constrained, but if we need it, it would be:\ pattern ::= pattern ':' type\ This would then affect the parsing of vars and funcs because the (frequently mandatory) type annotations there would be parsed as part of the pattern.\ \ We almost certainly want a pattern to test a dynamic type (including protocol satisfaction), maybe something like this:\ pattern ::= pattern 'instanceof' type\ This also improves the type information in effect for checking the sub-pattern and any variables bound directly to it.\ \ We probably do not need the following patterns for matching leaf literals; users should just use 'case'. This is obviously something we can provide excellent recovery for. But non-leaf literals, like array literals, should still have patterns.\ pattern ::= numeric_constant\ pattern ::= string_constant // when we add it\ \ Tuples are interesting because of the labelled / non-labelled distinction. Especially with labelled elements, it is really nice to be able to ignore all the elements you don't care about. This grammar permits some prefix or set of labels to be matched and the rest to be ignored.\ pattern ::= pattern-tuple\ pattern-tuple ::= '(' pattern-tuple-element-list? '...'? 
')'\ pattern-tuple-element-list ::= pattern-tuple-element\ pattern-tuple-element-list ::= pattern-tuple-element ',' pattern-tuple-element-list\ pattern-tuple-element ::= pattern\ pattern-tuple-element ::= '.' identifier '=' pattern\ \ The final cases are for ADT alternatives:\ pattern ::= pattern-ctor-name\ pattern ::= pattern-ctor-name pattern-tuple\ pattern-ctor-name ::= type-identifier '::' identifier\ pattern-ctor-name ::= ':' identifier\ \ I am inclined to say it should be okay to match an ADT value against the bare name of a constructor for that ADT even if that constructor normally requires an argument; i.e. this should work:\ match (x) \{\ case :none: \'85\ case :some: \'85\ \}\ This would be particularly convenient in pattern-match expressions.\ \ \b Miscellaneous. \b0 \ \ It would be interesting to allow overloading / customization of pattern-matching. We may find ourselves needing to do something like this to support non-fragile pattern matching anyway (if there's some set of restrictions that makes it reasonable to permit that). The obvious idea of compiling into the visitor pattern is a bit compelling, although control flow would be tricky \'97 we'd probably need the generated code to throw an exception. Alternatively, we could let the non-fragile type convert itself into a fragile type for purposes of pattern matching.\ \ If we ever allow infix ADT constructors, we'll need to allow them in patterns as well.\ \ Eventually, we will build regular expressions into the language, and we will allow them directly as patterns and even bind grouping expressions into user variables.\ \ John.}