mirror of
https://github.com/apple/swift.git
synced 2025-12-21 12:14:44 +01:00
{\rtf1\ansi\ansicpg1252\cocoartf1138
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
{\*\listtable{\list\listtemplateid1\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid1\'01\uc0\u8226 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listname ;}\listid1}
{\list\listtemplateid2\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid101\'01\uc0\u8226 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listname ;}\listid2}
{\list\listtemplateid3\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid201\'01\uc0\u8226 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listname ;}\listid3}}
{\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}{\listoverride\listid2\listoverridecount0\ls2}{\listoverride\listid3\listoverridecount0\ls3}}
\margl1440\margr1440\vieww10800\viewh25220\viewkind0
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
\f0\b\fs52 \cf0 Generics in Swift
\b0\fs24 \
\

\b\fs40 Motivation
\b0\fs24 \
\
Most types and functions in code are expressed in terms of a single, concrete set of types. Generics generalize this notion by allowing one to express types and functions in terms of an abstraction over a (typically unbounded) set of types, allowing improved code reuse. A typical example of a generic type is a linked list of values, which can be used with any type of value. In C++, this might be expressed as:\
\
template<typename T>\
class List \{\
public:\
  struct Node \{\
    T value;\
    Node *next;\
  \};\
\
  Node *first;\
\};\
\
where List<int>, List<string>, and List<DataRecord> are all distinct types that provide a linked list storing integers, strings, and DataRecords, respectively. Given such a data structure, one also needs to be able to implement generic functions that can operate on a list of any kind of elements, such as a simple, linear search algorithm:\
\
template<typename T>\
typename List<T>::Node *find(const List<T> &list, const T &value) \{\
  for (typename List<T>::Node *result = list.first; result; result = result->next)\
    if (result->value == value)\
      return result;\
\
  return 0;\
\}\
\
Generics are important for the construction of useful libraries, because they allow the library to adapt to application-specific data types without losing type safety. This is especially important for foundational libraries containing common data structures and algorithms, since these libraries are used across nearly every interesting application. \
\
The alternatives to generics tend to lead to poor solutions:\
\
\pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\pardirnatural
\ls1\ilvl0\cf0 {\listtext \'95 }Object-oriented languages tend to use "top" types (id in Objective-C, java.lang.Object in pre-generics Java, etc.) for their containers and algorithms, which gives up static type safety. Pre-generics Java forced the user to introduce run-time-checked type casts when interacting with containers (which is overly verbose), while Objective-C relies on id's unsound implicit conversion behavior to eliminate the need for casts.\
{\listtext \'95 }Many languages bake common data structures (arrays, dictionaries, tables) into the language itself. This is unfortunate both because it significantly increases the size of the core language and because users then tend to use this limited set of data structures for *every* problem, even when another (not-baked-in) data structure would be a better fit.\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
\cf0 \
Swift is intended to be a small, expressive language with great support for building libraries. We'll need generics to be able to build those libraries well.\
\

\b\fs40 Goals
\b0\fs24 \
\
\pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\pardirnatural
\ls2\ilvl0\cf0 {\listtext \'95 }Generics should enable the development of rich generic libraries that feel similar to first-class language features\
{\listtext \'95 }Generics should work on any type, whether it is a value type or some kind of object type\
{\listtext \'95 }Generic code should be almost as easy to write as non-generic code\
{\listtext \'95 }Generic code should be compiled such that it can be executed with any data type without requiring a separate "instantiation" step\
{\listtext \'95 }Generics should interoperate cleanly with run-time polymorphism\
{\listtext \'95 }Types should be able to be retroactively modified to meet the requirements of a generic algorithm or data structure\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
\cf0 \
As important as the goals of a feature are the explicit non-goals, which we don't want or don't need to support:\
\pard\tx220\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li720\fi-720\pardirnatural
\ls3\ilvl0\cf0 {\listtext \'95 }Compile-time "metaprogramming" in any form\
{\listtext \'95 }Expression-template tricks a la Boost.Spirit, POOMA\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
\cf0 \

\b\fs40 Polymorphism
\b0\fs24 \
\
Polymorphism allows one to use different data types with a uniform interface. Overloading already allows a form of polymorphism (
\i ad hoc
\i0 polymorphism) in Swift. For example, given\
\
func +(x : int, y : int) -> int \{ add\'85 \}\
func +(x : string, y : string) -> string \{ concat\'85 \}\
\
we can write the expression "x + y", which will work for both integers and strings. \
\
However, we want the ability to express an algorithm or data structure without naming any particular data type. To do so, we need a way to express the essential interface that the algorithm or data structure requires. For example, an accumulation algorithm would need to express that for any type T, one can write the expression "x + y" (where x and y are both of type T) and it will produce another T.\
\

\b\fs40 Protocols
\b0\fs24 \
\
Most languages that provide some form of polymorphism also have a way to describe abstract interfaces that cover a range of types: Java and C# interfaces, C++ abstract base classes, Objective-C protocols, Scala traits, Haskell type classes, C++ concepts (briefly), and many more. All allow one to describe functions or methods that are part of the interface, and provide some way to re-use or extend a previous interface by adding to it. We'll start with that core feature, and build onto it what we need.\
\
In Swift, I suggest that we use the term
\i protocol
\i0 for this feature, because I expect the end result to be similar enough to Objective-C protocols that our users will benefit, and (more importantly) different enough from Java/C# interfaces and C++ abstract base classes that those terms will be harmful. The term
\i trait
\i0 comes with the wrong connotation for C++ programmers, and none of our users know Scala.\
\
In its most basic form, a protocol is a collection of function signatures:\
\
protocol Document \{\
  func title() -> string\
\}\
\
Document describes types that have a title() operation that accepts no arguments and returns a string. Note that there is implicitly a 'self' or 'this' type, which is the type that conforms to the protocol itself. This follows how most object-oriented languages describe interfaces, but deviates from Haskell type classes and C++ concepts, which require explicit type parameters for all of the types. We'll revisit this decision later.\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b \cf0 Protocol Refinement
\b0 \
\
Composition of protocols is important to help programmers organize and understand a large number of protocols and the data types that conform to those protocols. For example, we could extend our Document protocol to cover documents that support versioning:\
\
protocol VersionedDocument : Document \{\
  func version() -> int\
\}\
\
Multiple refinement is permitted, allowing us to form a directed acyclic graph of protocols:\
\
protocol PersistentDocument : VersionedDocument, Serializable \{\
  func saveToFile(filename : path)\
\}\
\
Any type that conforms to PersistentDocument also conforms to VersionedDocument, Document, and Serializable, which gives us substitutability.\
\
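In present-day Swift syntax (which differs from the notation proposed in this document), refinement and the substitutability it provides can be sketched roughly as follows. The Draft type and the heading(of:) function are hypothetical names introduced only for this illustration.\
\
```swift
protocol Document {
    func title() -> String
}

// A refined protocol inherits every requirement of the protocol it refines.
protocol VersionedDocument: Document {
    func version() -> Int
}

// Hypothetical conforming type, used only for this sketch.
struct Draft: VersionedDocument {
    func title() -> String { return "Design notes" }
    func version() -> Int { return 3 }
}

// Written against the base protocol, yet usable with any refinement of it.
func heading<D: Document>(of document: D) -> String {
    return "Title: " + document.title()
}

let draft = Draft()
print(heading(of: draft))
```
\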

\b Self Types
\b0 \
\
Protocols thus far do not give us an easy way to express simple binary operations. For example, let's try to write a Comparable protocol that could be used to implement a generic find() operation:\
\
protocol Comparable \{\
  func ==(other : ???) -> bool\
\}\
\
Our options for filling in ??? are currently very poor. We could use the syntax for saying "any type" or "any type that is comparable", as one must do in most OO languages, including Java, C#, and Objective-C, but that's not expressing what we want: that the types of both arguments be the same. This is sometimes referred to as the
\i binary method problem
\i0 (http://www.cis.upenn.edu/~bcpierce/papers/binary.ps has a discussion of this problem, including the solution I'm proposing below).\
\
Neither C++ concepts nor Haskell type classes have this particular problem, because they don't have the notion of an implicit 'this' or 'self' type. Rather, they explicitly parameterize everything. In C++ concepts:\
\
concept Comparable<typename T> \{\
  bool operator==(T, T);\
\}\
\
Java and C# programmers work around this issue by parameterizing the interface, e.g. (in Java):\
\
abstract class Comparable<THIS extends Comparable<THIS>> \{\
  public abstract boolean equals(THIS other);\
\}\
\
and then a class X that wants to be Comparable will inherit from Comparable<X>. This is ugly and has a number of pitfalls; see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6479372 .\
\
Scala and Strongtalk have the notion of the 'Self' type, which effectively allows one to refer to the eventual type of 'self'. Self allows us to express the Comparable protocol in a natural way:\
\
protocol Comparable \{\
  func ==(other : Self) -> bool\
\}\
\
By expressing Comparable in this way, we know that if we have two objects of type T where T conforms to Comparable, comparison between those two objects with == is well-typed. However, if we have objects of different types T and U, we cannot compare those objects with == even if both T and U are Comparable.\
\
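As a rough illustration in present-day Swift syntax, the sketch below shows a Self-typed requirement accepting two values of the same type while leaving a mixed-type comparison ill-typed. The protocol name SameTypeComparable, the Meters and Seconds types, and areEqual are all hypothetical; a named method stands in for the == operator to keep the example short.\
\
```swift
// Hypothetical protocol name; chosen to avoid clashing with the standard library.
protocol SameTypeComparable {
    func isEqual(to other: Self) -> Bool
}

struct Meters: SameTypeComparable {
    var value: Int
    func isEqual(to other: Meters) -> Bool { return value == other.value }
}

struct Seconds: SameTypeComparable {
    var value: Int
    func isEqual(to other: Seconds) -> Bool { return value == other.value }
}

// Both arguments share the single type parameter T, so Self matches on each side.
func areEqual<T: SameTypeComparable>(_ a: T, _ b: T) -> Bool {
    return a.isEqual(to: b)
}

let sameLength = areEqual(Meters(value: 2), Meters(value: 2))
print(sameLength)
// areEqual(Meters(value: 2), Seconds(value: 2)) would be rejected at compile time.
```
\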
Self types are not without their costs, particularly in the case where Self is used as a parameter type of a class method that will be subclassed. Here, the parameter type ends up being (implicitly) covariant, which tightens up type-checking but may also force us into more dynamic type checks. We can explore this separately; within protocols, type-checking for Self is more direct.\
\

\b Associated Types
\b0 \
\
In addition to Self, a protocol's operations often need to refer to types that are related to the type of 'Self', such as a type of data stored in a collection, or the node and edge types of a graph. For example, this would allow us to cleanly describe a protocol for collections:\
\
protocol Collection \{\
  typealias Value\
  func forEach(callback : (value : Value) -> void)\
  func add(value : Value)\
\}\
\
It is important here that a generic function that refers to a given type T, which is known to be a collection, can access the associated types corresponding to T. For example, one could implement an "accumulate" operation for an arbitrary Collection, but doing so requires us to specify some constraints on the Value type of the collection. We'll return to this later.\
\
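A minimal sketch of the same idea in present-day Swift syntax follows. It assumes the associatedtype keyword in place of the typealias placeholder used above; SimpleCollection, IntBag, and contents(of:) are hypothetical names introduced for the illustration.\
\
```swift
// Hypothetical protocol name; the standard library already defines Collection.
protocol SimpleCollection {
    associatedtype Value
    func forEach(_ callback: (Value) -> Void)
    mutating func add(_ value: Value)
}

struct IntBag: SimpleCollection {
    var items: [Int] = []
    // Value is inferred to be Int from these two methods.
    func forEach(_ callback: (Int) -> Void) {
        for item in items { callback(item) }
    }
    mutating func add(_ value: Int) { items.append(value) }
}

// Generic code reaches the associated type through the type parameter: C.Value.
func contents<C: SimpleCollection>(of collection: C) -> [C.Value] {
    var result: [C.Value] = []
    collection.forEach { value in result.append(value) }
    return result
}

var bag = IntBag()
bag.add(3)
bag.add(5)
print(contents(of: bag))
```
\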

\b Note
\b0 : we think we want to replace "typealias" here with something else, such as "typename".\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b\fs40 \cf0 Subtype Polymorphism\

\b0\fs24 \
Subtype polymorphism is based on the notion of substitutability. If a type S is a subtype of a type T, then a value of type S can safely be used where a value of type T is expected. Object-oriented languages typically use subtype polymorphism, where the subtype relationship is based on inheritance: if the class Dog inherits from the class Animal, then Dog is a subtype of Animal. Subtype polymorphism is generally
\i dynamic
\i0 , in the sense that the substitution occurs at run-time, even if it is statically type-checked. \
\
In Swift, we consider protocols to be types. A value of protocol type means, essentially, a value of any (dynamically-determined) type T that conforms to the given protocol. Thus, a variable can be declared with type "Serializable", e.g.,\
\
var x : Serializable = // value of any Serializable type\
x.serialize(); // okay: serialize() is part of the Serializable protocol\
\
This gives rise to a natural "top" type, such that every type in the language is a subtype of "top". Java has java.lang.Object, C# has object, Objective-C has "id" (although "id" is weird, because it is also convertible to everything; it's best not to use it as a model). In Swift, the "top" type is simply a protocol with no requirements:\
\
protocol Any \{\}\
\
var value : Any = 17 // an any can hold an integer\
value = "hello" // or a string\
value = (42, "hello", Red) // or anything else\
\
Naturally, such polymorphism is dynamic, and will require boxing of value types to implement. We can now see how Self types interact with subtype polymorphism. For example, say we have two values of type Comparable, and we try to compare them:\
\
var x : Comparable = \'85\
var y : Comparable = \'85\
if x == y \{ // well-typed?\
\}\
\
Whether x == y is well-typed is not statically determinable, since the dynamic type of x may differ from the dynamic type of y, even if they are both comparable (e.g., one is an int and the other a string). I suggest that this be considered ill-formed for now; we can revisit later if it becomes a problem.\
\
To express types that meet the requirements of several protocols, one can just create a new protocol aggregating those protocols:\
\
protocol SerializableDocument : Document, Serializable \{ \}\
var doc : SerializableDocument\
print(doc.title()) // okay: title() is part of the Document protocol, so we can call it\
doc.serialize(stdout); // okay: serialize() is part of the Serializable protocol\
\
Since this syntax is fairly heavyweight, we introduce a shorthand syntax\
\
var doc : protocol<Document, Serializable>\
\
that allows us to declare an unnamed protocol that combines several other protocols. It's semantically identical to\
\
protocol UniqueName : Document, Serializable \{ \}\
var doc : UniqueName\
\
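For comparison, present-day Swift writes such an unnamed combination with the & operator rather than the protocol<...> spelling proposed here. The sketch below assumes a serialize() that returns a String (the text does not give its signature); Report and archive are hypothetical names.\
\
```swift
protocol Document {
    func title() -> String
}

protocol Serializable {
    func serialize() -> String
}

// Hypothetical type conforming to both protocols.
struct Report: Document, Serializable {
    func title() -> String { return "Q3" }
    func serialize() -> String { return "report-bytes" }
}

// An unnamed combination of two protocols, with no aggregating protocol declared.
func archive(_ doc: Document & Serializable) -> String {
    return doc.title() + ": " + doc.serialize()
}

print(archive(Report()))
```
\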

\b\fs40 Bounded Parametric Polymorphism
\b0\fs24 \
\
Parametric polymorphism is based on the idea of providing type parameters for a generic function or type. When using that function or type, one substitutes concrete types for the type parameters. Strictly speaking, parametric polymorphism allows *any* type to be substituted for a type parameter, but it's useless in practice because that means that generic functions or types cannot do anything to the type parameters: they must instead rely on first-class functions passed into the generic function or type to perform any meaningful work. \
\
Far more useful (and prevalent) is bounded parametric polymorphism, which allows the generic function or type to specify constraints (bounds) on the type parameters. By specifying these bounds, it becomes far easier to write and use these generic functions and types. Haskell type classes, Java and C# generics, C++ concepts, and many other language features support bounded parametric polymorphism. \
\
Protocols provide a natural way to express the constraints of a generic function in Swift. For example, one could define a generic linked list as:\
\
struct ListNode<T : Copyable> \{\
  var Value : T\
  oneof NextNode \{ Node (: ListNode<T>), End \}\
  var Next : NextNode\
\}\
\
struct List<T : Copyable> \{\
  var First : ListNode<T>::NextNode\
\}\
\
This list works on any type T, so long as that type T is Copyable (so that values can be copied into the list). One could then add a generic function that inserts at the beginning of the list:\
\
func insertAtBeginning<T : Copyable>(list : List<T>, value : T) \{\
  list.First = ListNode<T>(value, list.First)\
\}\
\
Note that this insertAtBeginning function can operate on any type that meets the Copyable requirements, including List<Copyable>, allowing us to freely mix dynamic polymorphism with bounded parametric polymorphism.\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b \cf0 Expressing Constraints\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b0 \cf0 \
Within the type parameter list of a generic type or function (e.g., the <T : Copyable> in ListNode<T : Copyable>), the 'T' introduces a new type parameter and the (optional) ": type" provides a constraint on that type parameter, which may be any type (but will typically be a protocol). Within the body of the generic type or function, any of the functions or types described by the constraints are available. For example, let's implement a find() operation on lists:\
\
func find<T : protocol<Copyable,Comparable>>(list : List<T>, value : T) -> int \{\
  var index = 0\
  var current = list.First\
  while current is Node \{ // now I'm just making stuff up\
    if current.Value == value \{ // okay: T is Comparable\
      return index\
    \}\
    current = current.Next\
    index = index + 1\
  \}\
  return -1\
\}\
\
In addition to providing constraints on the type parameters, we also need to be able to constrain associated types. To do so, we introduce the notion of a requires clause, which follows the signature of the generic type or function. For example, let's generalize our find algorithm to work on any ordered collection:\
\
protocol OrderedCollection : Collection \{\
  func size() -> int;\
  func getAt(index : int) -> Value // Value is an associated type\
\}\
\
func find<C : OrderedCollection>(collection : C, value : C::Value) -> int \
  requires C::Value : Comparable\
\{\
  foreach index : int in range(0, collection.size()) \{ // making up syntax again\
    if (collection.getAt(index) == value) \{ // okay: we know that C::Value is Comparable\
      return index;\
    \}\
  \}\
  return -1;\
\}\
\
The requires clause is actually the more general way of expressing constraints, and the constraints expressed in the angle brackets (e.g., <C : OrderedCollection>) are just sugar for a requires clause. For example, the above find() signature is equivalent to\
\
func find<C>(collection : C, value : C::Value) -> int \
  requires C : OrderedCollection, C::Value : Comparable\
\
Note that find<C> is shorthand for (and equivalent to) find<C : Any>, since every type conforms to the Any protocol.\
\
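The requires clause corresponds roughly to the where clause of present-day Swift. The sketch below is only an approximation of the find() above: it assumes the standard library's Collection and Equatable protocols in place of this document's OrderedCollection and Comparable, and walks the collection by iteration rather than through getAt().\
\
```swift
// The constraint on the associated type is stated in a trailing where clause.
func find<C: Collection>(_ collection: C, _ value: C.Element) -> Int
    where C.Element: Equatable
{
    var index = 0
    for element in collection {
        if element == value {   // allowed because C.Element is Equatable
            return index
        }
        index += 1
    }
    return -1
}

print(find([10, 20, 30], 20))
print(find(["a", "b"], "z"))
```
\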
There are two other important kinds of constraints that need to be expressible. Before we get to those, consider a simple "Range" protocol that lets us describe a range of values of some given value type:\
\
protocol Range \{\
  typealias Value\
  func empty() -> bool\
  func getFirst() -> Value\
  func getRest() -> Self\
\}\
\
Now, we want to express the notion of an enumerable collection, which provides a range. We do so using a requires clause inside the protocol:\
\
protocol EnumerableCollection : Collection \{\
  typealias RangeType\
  requires RangeType : Range, RangeType::Value == Value\
  func getRange() -> RangeType\
\}\
\
Here, we are specifying constraints on an associated type (RangeType must conform to the Range protocol), and also ensuring that the type of values produced by querying the range is the same as the type of values stored in the container. This is important, for example, for use with the Comparable protocol (and any protocol using Self types), because it maintains type identity within the generic function or type.\
\
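A same-type constraint of this kind can be sketched in present-day Swift as shown below. This is an approximation under stated assumptions: the standard library's IteratorProtocol (with its Element associated type) stands in for the Range protocol of this document, the refinement of Collection is dropped, and Numbers and firstValue(of:) are hypothetical names.\
\
```swift
protocol EnumerableCollection {
    associatedtype Value
    // RangeType must be an iterator whose elements are exactly Value.
    associatedtype RangeType: IteratorProtocol where RangeType.Element == Value
    func getRange() -> RangeType
}

struct Numbers: EnumerableCollection {
    typealias Value = Int
    var items: [Int]
    func getRange() -> IndexingIterator<[Int]> { return items.makeIterator() }
}

// The same-type constraint lets the iterator's element be returned as C.Value.
func firstValue<C: EnumerableCollection>(of collection: C) -> C.Value? {
    var range = collection.getRange()
    return range.next()
}

print(firstValue(of: Numbers(items: [7, 8, 9])) ?? -1)
```
\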
Note that protocol refinement is actually just syntactic sugar for a requires clause on the Self type, e.g.,\
\
protocol SerializableDocument : Document, Serializable \{ \}\
\
is equivalent to\
\
protocol SerializableDocument \{\
  requires Self : Document, Self : Serializable\
\}\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b \cf0 Constraint Inference
\b0 \
\
Note that the signatures of both the first 'find' and 'insertAtBeginning' have some redundancy of description: they both explicitly require that T be Copyable, but the use of the type List<T> already implies that T is Copyable. As such, we should infer any constraints that are directly implied by types that make up the signature of the function, so the signatures of these functions could be:\
\
func insertAtBeginning<T>(list : List<T>, value : T) \{ // List<T> implies that T : Copyable\
  // \'85\
\}\
\
func find<T : Comparable>(list : List<T>, value : T) -> int \{ // List<T> implies that T : Copyable \
  // ...\
\}\
\

\b Type Parameter Deduction\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b0 \cf0 \
As noted above, type arguments will be deduced from the call arguments to a generic function:\
\
var values : List<int>\
insertAtBeginning(values, 17) // deduces T = int\
\
Since Swift already has top-down type inference (as well as the C++-like bottom-up inference), we can also deduce type arguments from the result type:\
\
func cast<T, U>(value : T) -> U \{ \'85 \} \
var x : Any\
var y : int = cast(x) // deduces T = Any, U = int\
\
We require that all type parameters for a generic function be deducible. We introduce this restriction so that we can avoid introducing a syntax for explicitly specifying type arguments to a generic function, e.g.,\
\
var y : int = cast<int>(x) // not permitted: < is the less-than operator\
\
This syntax is horribly ambiguous in C++, and with good type argument deduction, should not be necessary in Swift.\
\
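Deduction from both the argument and the result type can be illustrated in present-day Swift roughly as follows. The convert function is hypothetical, and the BinaryInteger constraint from the standard library is an assumption added so that the body has something it can actually do.\
\
```swift
// Hypothetical conversion function; both type parameters are deduced at the call.
func convert<T: BinaryInteger, U: BinaryInteger>(_ value: T) -> U {
    return U(value)
}

let small: Int8 = 42
// T = Int8 comes from the argument; U = Int64 comes from the declared result type.
let wide: Int64 = convert(small)
print(wide)
```
\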

\b\fs40 Conforming to a Protocol
\fs24 \
\

\b0 Thus far, we have not actually shown how a type can meet the requirements of a protocol. The most direct way this can occur is implicitly. This is essentially duck typing, where a type is assumed to conform to a protocol if it meets the syntactic requirements of the protocol. For example, given\
\
protocol Shape \{\
  func draw();\
\}\
\
one could implement a function that draws a sequence of shapes, e.g.,\
\
func drawAll<C : EnumerableCollection>(collection : C) requires C::Value : Shape \{\
  // details uninteresting\
\}\
\
This would make the following code well-formed (assuming that List<T> is an EnumerableCollection):\
\
struct Circle \{\
  var center : Point;\
  var radius : int;\
\
  func draw() \{\
    // draw it\
  \}\
\}\
\
var circles : List<Circle>\
drawAll(circles)\
\
Note that there are two levels of inference going on here: the first is inference of the argument for the type parameter C based on the function argument (this is called template argument deduction in C++; see the section on type parameter deduction, above). The second is implicitly checking whether the type Circle conforms to the protocol Shape. Since Circle provides a draw() method with the appropriate type, implicit protocol conformance concludes that Circle conforms to Shape and type-checking succeeds.\
\
Implicit protocol conformance is convenient, but it can run into trouble when an entity that syntactically matches a protocol doesn't provide the required semantics. For example, Cowboys also know how to "draw!":\
\
struct Cowboy \{\
  var gun : SixShooter;\
\
  func draw() \{\
    // draw!\
  \}\
\}\
\
var cowboys : List<Cowboy>\
drawAll(cowboys) // dangerous, and probably not what the user intended\
\
Random collisions between types are fairly rare. However, when one is using protocol refinement with fine-grained (semantic or mostly-semantic) differences between protocols in the hierarchy, they become more common. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1798.html for examples of this problem as it surfaced with C++ concepts. Our hope is that, with implicit protocol conformance being the only option in Swift, library developers will avoid building refinement hierarchies that run into this trouble.\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b \cf0 Explicit Protocol Conformance\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b0 \cf0 Type authors often implement types that are intended to conform to a particular protocol. For example, if we want our generic List type to be a Collection, we can specify that it is by adding a protocol conformance annotation to the type:\
\
struct List<T : Copyable> : Collection \{ // List<T> is a collection\
  typealias Value = T\
  func forEach(callback : (value : Value) -> void) \{ /* Implement this */ \}\
  func add(value : Value) \{ /* Implement this */ \}\
\}\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b \cf0 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b0 \cf0 This explicit protocol conformance declaration forces the compiler to check that List<T> actually does meet the requirements of the Collection protocol. If we were missing an operation (say, forEach) or had the wrong signature, the definition of 'List' would be ill-formed.
\b \
\
Retroactive Modeling\

\b0 \
When using a set of libraries, it's fairly common that one library defines a protocol (and useful generic entities requiring that protocol) while another library provides a data type that provides similar functionality to that protocol, but under a different name.
\i Retroactive modeling
\i0 is the process by which the type is retrofitted (without changing the type) to meet the requirements of the protocol. \
\
In Swift, one could provide support for retroactive modeling by allowing class extensions, e.g.,\
\
extension string : Collection \{\
  typealias Value = char\
  func forEach(callback : (value : Value) -> void) \{ /* use existing string routines to enumerate characters */ \}\
  func add(value : Value) \{ self += value /* append character */ \}\
\}\
\
Once the extension is defined, the extended type (here, string) conforms to the Collection protocol, e.g.,\
\
var s : string;\
var collection : Collection = s // okay; string conforms to Collection now\
\
One can also use string as if it had Collection's members:\
\
var s : string\
s.add('\\n') // okay: uses the extension\
\
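In present-day Swift, retroactive modeling can be sketched with an extension that adds a conformance to a type defined elsewhere. Describable and announce are hypothetical names; Int is used simply because it is a type this example does not own.\
\
```swift
// Hypothetical protocol, imagined as coming from one library.
protocol Describable {
    func describe() -> String
}

// Int is defined in the standard library; the extension adds the conformance
// without modifying the original type definition.
extension Int: Describable {
    func describe() -> String { return "the integer " + String(self) }
}

func announce<T: Describable>(_ value: T) -> String {
    return "Got " + value.describe()
}

print(announce(7))
```
\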
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b \cf0 Default Implementations\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b0 \cf0 The functions declared within a protocol are requirements that any type must meet if it wants to conform to the protocol. There is a natural tension here, then, between larger protocols that make it easier to write generic algorithms, and smaller protocols that make it easier to write conforming types. For example, should a numeric protocol require all operations, e.g.,\
\
protocol Numeric \{\
  func +(other : Self) -> Self\
  func -(other : Self) -> Self\
  func +() -> Self\
  func -() -> Self\
\}
\b \
\

\b0 which would make it easy to write general numeric algorithms, but requires the author of some BigNum type to implement a lot of functionality, or should the numeric protocol require just the core operations:
\b \
\

\b0 protocol Numeric \{\
  func +(other : Self) -> Self\
  func -() -> Self\
\}
\b \
\

\b0 to make it easier to adopt the protocol (but harder to write numeric algorithms)? Both of the protocols express the same thing (semantically), because one can use the core operations (binary +, unary -) to implement the other operations. However, it's far easier to allow the protocol itself to provide default implementations:\
\
protocol Numeric \{\
  func +(other : Self) -> Self\
  func -(other : Self) -> Self \{ return self + -other; \}\
  func +() -> Self \{ return self; \}\
  func -() -> Self\
\}
\b \

\b0 \
This makes it easier both to implement generic algorithms (which can use the most natural syntax) and to make a new type conform to the protocol. For example, if we were to define only the core operations in our BigNum type:\
\
struct BigNum : Numeric \{\
  func +(other : BigNum) -> BigNum \{ \'85 \}\
  func -() -> BigNum \{ \'85 \}\
\}\
\
the compiler will automatically synthesize the other operations needed for the protocol. Moreover, these operations will be available to uses of the BigNum type as if they had been written in the type itself (or in an extension of the type, if that feature is used), which means that protocol conformance actually makes it
\i easier
\i0 to define types that conform to protocols, rather than just providing additional checking.\
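\
A sketch of this pattern in present-day Swift follows. It assumes that defaults are supplied in a protocol extension rather than in the protocol body, and that operators are declared as static functions; SimpleNumeric and Fixed are hypothetical names standing in for Numeric and BigNum.\
\
```swift
// Only the core operations are requirements.
protocol SimpleNumeric {
    static func + (lhs: Self, rhs: Self) -> Self
    static prefix func - (operand: Self) -> Self
}

// The remaining operations are derived from the core ones.
extension SimpleNumeric {
    static func - (lhs: Self, rhs: Self) -> Self { return lhs + (-rhs) }
    static prefix func + (operand: Self) -> Self { return operand }
}

// Hypothetical conforming type that implements just the two core operations.
struct Fixed: SimpleNumeric, Equatable {
    var raw: Int
    static func + (lhs: Fixed, rhs: Fixed) -> Fixed {
        return Fixed(raw: lhs.raw + rhs.raw)
    }
    static prefix func - (operand: Fixed) -> Fixed {
        return Fixed(raw: -operand.raw)
    }
}

// Binary minus was never written for Fixed; it comes from the extension.
let difference = Fixed(raw: 10) - Fixed(raw: 3)
print(difference.raw)
```
\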
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\b \cf0 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural

\fs40 \cf0 Implementation Model
\fs24 \

\b0 \
Because generics are constrained, a well-typed generic function or type can be translated into object code that uses dynamic dispatch to perform each of its operations on type parameters. This is in stark contrast to the instantiation model of C++ templates, where each new set of template arguments requires the generic function or type to be compiled again. This model is important for scalability of builds, so that the time to perform type-checking and code generation scales with the amount of code written rather than the amount of code instantiated. Moreover, it can lead to smaller binaries and a more flexible language (generic functions can be "virtual").\
\
The translation model is fairly simple. Consider the generic find() we implemented for lists, above:\
\
func find<T : protocol<Copyable,Comparable>>(list : List<T>, value : T) -> int \{\
|
|
var index = 0\
|
|
var current = list.First\
|
|
while current is ListNode<T> \{ // now I'm just making stuff up\
|
|
if current.Value == value \{ // okay: T is Comparable\
|
|
return index\
|
|
\}\
|
|
current = current.Next\
|
|
index = index + 1\
|
|
\}\
|
|
return -1\
|
|
\}\
|
|
\
|
|
To translate this into executable code, we form a vtable for each of the constraints on the generic function. In this case, we'll have a vtable for Copyable T and Comparable T. Every operation within the body of this generic function type-checks to either an operation on some concrete type (e.g., the operations on int), to an operation within a protocol (which requires indirection through the corresponding vtable), or to an operation on a generic type definition, all of which can easily be emitted as object code.\
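\
The translation can be pictured by writing the vtable out by hand. This is only an illustrative sketch in present-day Swift: ComparableTable and loweredFind are hypothetical names, an array stands in for List<T>, and a real compiler would pass the table implicitly rather than as a visible parameter:\
\
```swift
// Hypothetical hand-written "vtable" for the Comparable constraint:
// one function value per operation the generic body needs.
struct ComparableTable<T> {
    let isEqual: (T, T) -> Bool
}

// Roughly what find() might look like after translation: the table is an
// extra parameter, and == on T becomes an indirect call through it.
func loweredFind<T>(_ list: [T], _ value: T, _ table: ComparableTable<T>) -> Int {
    var index = 0
    for element in list {
        if table.isEqual(element, value) {
            return index
        }
        index = index + 1
    }
    return -1
}

// The caller knows the concrete type, so it supplies the concrete table.
let intTable = ComparableTable<Int>(isEqual: { $0 == $1 })
let position = loweredFind([4, 8, 15], 15, intTable)
```
\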
|
|
\
|
|
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
|
|
|
|
\b \cf0 Specialization
|
|
\b0 \
|
|
|
|
\b \
|
|
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
|
|
|
|
\b0 \cf0 This implementation model lends itself to optimization when we know the specific argument types that will be used when invoking the generic function. In this case, some or all of the vtables provided for the constraints will effectively be constants. By specializing the generic function for these types (at compile time, link time, or, if we have a JIT, run time), we can eliminate the cost of the virtual dispatch, inline calls when appropriate, and eliminate the overhead of the generic system. Such optimizations can be performed based on heuristics, user direction, or profile-guided optimization.
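\
\
To illustrate what specialization buys, the sketch below writes the specialized function by hand in present-day Swift. The names countMatches and countMatchesInt are hypothetical, and whether a compiler would actually produce such a clone depends on its heuristics:\
\
```swift
// Generic version: each == on T goes through T's Equatable conformance.
func countMatches<T: Equatable>(_ values: [T], _ target: T) -> Int {
    var count = 0
    for value in values {
        if value == target {
            count += 1
        }
    }
    return count
}

// Hand-written equivalent of a clone specialized for T == Int: the
// comparison is a direct integer compare that is easy to inline.
func countMatchesInt(_ values: [Int], _ target: Int) -> Int {
    var count = 0
    for value in values {
        if value == target {
            count += 1
        }
    }
    return count
}

// Both versions compute the same answer; only the dispatch differs.
let samples = [3, 1, 3, 7, 3]
let genericCount = countMatches(samples, 3)
let specializedCount = countMatchesInt(samples, 3)
```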
|
|
\b \
|
|
\
|
|
|
|
\fs40 Overloading\
|
|
\
|
|
|
|
\b0\fs24 Generic functions can be overloaded based entirely on constraints. For example, consider a binary search algorithm:\
|
|
\
|
|
func binarySearch<C : EnumerableCollection>(collection : C, value : C::Value) \
|
|
-> C::RangeType \
|
|
requires C::Value : Ordered \{\
|
|
// We can perform log(N) comparisons, but EnumerableCollection only supports linear\
|
|
// walks, so this is linear time\
|
|
\}\
|
|
\
|
|
protocol RandomAccessRange : Range \{\
|
|
func split() -> (Range, Range) // splits a range in half, returning both halves\
|
|
\}\
|
|
\
|
|
func binarySearch<C : EnumerableCollection>(collection : C, value : C::Value) \
|
|
-> C::RangeType\
|
|
requires C::Value : Ordered, C::RangeType : RandomAccessRange \{\
|
|
// We can perform log(N) comparisons and log(N) range splits, so this is logarithmic time\
|
|
\}\
|
|
\
|
|
If binarySearch is called with a collection whose range type conforms to RandomAccessRange, both of the generic functions match. However, the second function is 
|
|
\i more specialized
|
|
\i0 , because its constraints are a superset of the constraints of the first function. In such a case, overloading should pick the more specialized function.\
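\
Present-day Swift supports this style of overloading, which makes the rule easy to try out. In the following sketch the function name search is hypothetical, the standard Collection and RandomAccessCollection protocols stand in for EnumerableCollection and RandomAccessRange, and each overload merely reports which one was chosen instead of searching:\
\
```swift
// Applies to any Collection of Comparable elements: linear walk only.
func search<C: Collection>(_ collection: C, for value: C.Element) -> String
    where C.Element: Comparable {
    return "linear"
}

// More constrained overload: additionally requires random access.
func search<C: RandomAccessCollection>(_ collection: C, for value: C.Element) -> String
    where C.Element: Comparable {
    return "logarithmic"
}

// Array is a RandomAccessCollection, so both overloads match and the
// more specialized one is preferred.
let onArray = search([1, 2, 3], for: 2)

// Set is a Collection but not a RandomAccessCollection, so only the
// first overload matches.
let onSet = search(Set([1, 2, 3]), for: 2)
```
\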
|
|
\
|
|
There is a question as to when this overloading occurs. For example, binarySearch might be called as a subroutine of another generic function with minimal requirements:\
|
|
\
|
|
func doSomethingWithSearch<C : EnumerableCollection>\
|
|
(collection : C, value : C::Value) -> C::RangeType \
|
|
requires C::Value : Ordered \{\
|
|
binarySearch(collection, value);\
|
|
\}\
|
|
\
|
|
At the time when the generic definition of doSomethingWithSearch is type-checked, only the first binarySearch() function applies, since we don't know that C::RangeType conforms to RandomAccessRange. However, when doSomethingWithSearch is actually invoked, C::RangeType might conform to RandomAccessRange, in which case we'd be better off picking the second binarySearch. This amounts to run-time overload resolution, which may be desirable, but also has downsides, such as the potential for run-time failures due to ambiguities and the cost of performing such an expensive operation at these call sites. Of course, that cost could be mitigated in hot generic functions via the specialization mentioned above.\
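\
The type-checking-time view described here can be observed in present-day Swift, which resolves overloads statically. In this sketch (strategy and viaWrapper are hypothetical names, and each overload only reports which one ran), the call inside the generic wrapper binds to the less constrained overload even when the argument is an array:\
\
```swift
// Two overloads that differ only in their constraints.
func strategy<C: Collection>(_ collection: C) -> String {
    return "linear"
}

func strategy<C: RandomAccessCollection>(_ collection: C) -> String {
    return "logarithmic"
}

// Inside this body C is only known to be a Collection, so the call is
// bound to the first overload when the wrapper itself is type-checked.
func viaWrapper<C: Collection>(_ collection: C) -> String {
    return strategy(collection)
}

// Called directly with an array, the more specialized overload wins.
let direct = strategy([1, 2, 3])

// Routed through the wrapper, the same array reaches the first overload.
let wrapped = viaWrapper([1, 2, 3])
```
\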
|
|
\
|
|
Our current proposal is to decide statically which function is called (using partial-ordering rules similar to those in C++) and to avoid run-time overload resolution. If this proves onerous, we can revisit the decision later.}
|