mirror of
https://github.com/apple/swift.git
synced 2025-12-14 20:36:38 +01:00
583 lines
20 KiB
ReStructuredText
583 lines
20 KiB
ReStructuredText
:orphan:
|
|
|
|
Writing High-Performance Swift Code
|
|
===================================
|
|
|
|
The following document is a gathering of various tips and tricks for writing
|
|
high-performance Swift code. The intended audience of this document is compiler
|
|
and standard library developers.
|
|
|
|
Some of the tips in this document can help improve the quality of your Swift
|
|
program and make your code less error prone and more readable. Explicitly
|
|
marking final-classes and class-protocols are two obvious examples. However some
|
|
of the tips described in this document are unprincipled, twisted and come to
|
|
solve a specific temporary limitation of the compiler or the language. Many of
|
|
the recommendations in this document come with trade offs for things like
|
|
program runtime, binary size, code readability, etc.
|
|
|
|
|
|
Enabling Optimizations
|
|
======================
|
|
|
|
The first thing one should always do is to enable optimization. Swift provides
|
|
three different optimization levels:
|
|
|
|
- ``-Onone``: This is meant for normal development. It performs minimal
|
|
optimizations and preserves all debug info.
|
|
- ``-O``: This is meant for most production code. The compiler performs
|
|
aggressive optimizations that can drastically change the type and amount of
|
|
emitted code. Debug information will be emitted but will be lossy.
|
|
- ``-Ounchecked``: This is a special optimization mode meant for specific
|
|
libraries or applications where one is willing to trade safety for
|
|
performance. The compiler will remove all overflow checks as well as some
|
|
implicit type checks. This is not intended to be used in general since it may
|
|
result in undetected memory safety issues and integer overflows. Only use this
|
|
if you have carefully reviewed that your code is safe with respect to integer
|
|
overflow and type casts.
|
|
|
|
In the Xcode UI, one can modify the current optimization level as follows:
|
|
|
|
...
|
|
|
|
|
|
Whole Module Optimizations
|
|
==========================
|
|
|
|
By default Swift compiles each file individually. This allows Xcode to
|
|
compile multiple files in parallel very quickly. However, compiling each file
|
|
separately prevents certain compiler optimizations. Swift can also compile
|
|
the entire program as if it were one file and optimize the program as if it
|
|
were a single compilation unit. This mode is enabled using the command
|
|
line flag ``-whole-module-optimization``. Programs that are compiled in
|
|
this mode will most likely take longer to compile, but may run faster.
|
|
|
|
This mode can be enabled using the Xcode build setting 'Whole Module Optimization'.
|
|
|
|
|
|
Reducing Dynamic Dispatch
|
|
=========================
|
|
|
|
Swift by default is a very dynamic language like Objective-C. Unlike Objective
|
|
C, Swift gives the programmer the ability to improve runtime performance when
|
|
necessary by removing or reducing this dynamicism. This section goes through
|
|
several examples of language constructs that can be used to perform such an
|
|
operation.
|
|
|
|
Dynamic Dispatch
|
|
----------------
|
|
|
|
Classes use dynamic dispatch for methods and property accesses by default. Thus
|
|
in the following code snippet, ``a.aProperty``, ``a.doSomething()`` and
|
|
``a.doSomethingElse()`` will all be invoked via dynamic dispatch:
|
|
|
|
::
|
|
|
|
class A {
|
|
var aProperty: [Int]
|
|
func doSomething() { ... }
|
|
dynamic doSomethingElse() { ... }
|
|
}
|
|
|
|
class B : A {
|
|
override var aProperty {
|
|
get { ... }
|
|
set { ... }
|
|
}
|
|
|
|
override func doSomething() { ... }
|
|
}
|
|
|
|
func usingAnA(a: A) {
|
|
a.doSomething()
|
|
a.aProperty = ...
|
|
}
|
|
|
|
In Swift, dynamic dispatch defaults to indirect invocation through a vtable
|
|
[#]_. If one attaches the ``dynamic`` keyword to the declaration, Swift will
|
|
emit calls via Objective-C message send instead. In both cases this is slower
|
|
than a direct function call because it prevents many compiler optimizations [#]_
|
|
in addition to the overhead of performing the indirect call itself. In
|
|
performance critical code, one often will want to restrict this dynamic
|
|
behavior.
|
|
|
|
Advice: Use 'final' when you know the declaration does not need to be overridden
|
|
--------------------------------------------------------------------------------
|
|
|
|
The ``final`` keyword is a restriction on a declaration of a class, a method, or
|
|
a property such that the declaration cannot be overridden. This implies that the
|
|
compiler can emit direct function calls instead of indirect calls. For instance
|
|
in the following ``C.array1`` and ``D.array1`` will be accessed directly
|
|
[#]_. In contrast, ``D.array2`` will be called via a vtable:
|
|
|
|
::
|
|
|
|
final class C {
|
|
// No declarations in class 'C' can be overridden.
|
|
var array1: [Int]
|
|
func doSomething() { ... }
|
|
}
|
|
|
|
class D {
|
|
final var array1 [Int] // 'array1' cannot be overridden by a computed property.
|
|
var array2: [Int] // 'array2' *can* be overridden by a computed property.
|
|
}
|
|
|
|
func usingC(c: C) {
|
|
c.array1[i] = ... // Can directly access C.array without going through dynamic dispatch.
|
|
c.doSomething() = ... // Can directly call C.doSomething without going through virtual dispatch.
|
|
}
|
|
|
|
func usingD(d: D) {
|
|
d.array1[i] = ... // Can directly access D.array1 without going through dynamic dispatch.
|
|
d.array2[i] = ... // Will access D.array2 through dynamic dispatch.
|
|
}
|
|
|
|
Advice: Use 'private' when declaration does not need to be accessed outside of file
|
|
-----------------------------------------------------------------------------------
|
|
|
|
Applying the ``private`` keyword to a declaration restricts the visibility of
|
|
the declaration to the file in which it is declared. This allows the compiler to
|
|
be able to ascertain all other potentially overriding declarations. Thus the
|
|
absence of any such declarations enables the compiler to infer the ``final``
|
|
keyword automatically and remove indirect calls for methods and field accesses
|
|
accordingly. For instance in the following, ``e.doSomething()`` and
|
|
``f.myPrivateVar``, will be able to be accessed directly assuming ``E``, ``F``
|
|
do not have any overriding declarations in the same file:
|
|
|
|
::
|
|
|
|
private class E {
|
|
func doSomething() { ... }
|
|
}
|
|
|
|
class F {
|
|
private var myPrivateVar : Int
|
|
}
|
|
|
|
func usingE(e: E) {
|
|
e.doSomething() // There is no sub class in the file that declares this class.
|
|
// The compiler can remove virtual calls to doSomething()
|
|
// and directly call A's doSomething method.
|
|
}
|
|
|
|
func usingF(f: F) -> Int {
|
|
return f.myPrivateVar
|
|
}
|
|
|
|
Using Container Types Efficiently
|
|
=================================
|
|
|
|
An important feature provided by the Swift standard library are the generic
|
|
containers Array and Dictionary. This section will explain how to use these
|
|
types in a performant manner.
|
|
|
|
Advice: Use value types in Array
|
|
--------------------------------
|
|
|
|
In Swift, types can be divided into two different categories: value types
|
|
(structs, enums, tuples) and reference types (classes). A key distinction is
|
|
that value types cannot be included inside an NSArray. Thus when using value
|
|
types, the optimizer can remove most of the overhead in Array that is necessary
|
|
to handle the possibility of the array being backed an NSArray.
|
|
|
|
Additionally, In contrast to reference types, value types only need reference
|
|
counting if they contain, recursively, a reference type. By using value types
|
|
without reference types, one can avoid additional retain, release traffic inside
|
|
Array.
|
|
|
|
::
|
|
|
|
// Don't use a class here.
|
|
struct PhonebookEntry {
|
|
var name : String
|
|
var number : [Int]
|
|
}
|
|
|
|
var a : [PhonebookEntry]
|
|
|
|
Keep in mind that there is a trade-off between using large value types and using
|
|
reference types. In certain cases, the overhead of copying and moving around
|
|
large value types will outweigh the cost of removing the bridging and
|
|
retain/release overhead.
|
|
|
|
Advice: Use ContiguousArray with reference types when NSArray bridging is unnecessary
|
|
-------------------------------------------------------------------------------------
|
|
|
|
If you need an array of reference types and the array does not need to be
|
|
bridged to NSArray, use ContiguousArray instead of Array:
|
|
|
|
::
|
|
|
|
class C { ... }
|
|
var a: ContiguousArray<C> = [C(...), C(...), ..., C(...)]
|
|
|
|
Advice: Use inplace mutation instead of object-reassignment
|
|
-----------------------------------------------------------
|
|
|
|
All standard library containers in Swift are value types that use COW
|
|
(copy-on-write) [#]_ to perform copies instead of explicit copies. In many cases
|
|
this allows the compiler to elide unnecessary copies by retaining the container
|
|
instead of performing a deep copy. This is done by only copying the underlying
|
|
container if the reference count of the container is greater than 1 and the
|
|
container is mutated. For instance in the following, no copying will occur when
|
|
``d`` is assigned to ``c``, but when ``d`` undergoes structural mutation by
|
|
appending ``2``, ``d`` will be copied and then ``2`` will be appended to ``d``:
|
|
|
|
::
|
|
|
|
var c: [Int] = [ ... ]
|
|
var d = c // No copy will occur here.
|
|
d.append(2) // A copy *does* occur here.
|
|
|
|
Sometimes COW can introduce additional unexpected copies if the user is not
|
|
careful. An example of this is attempting to perform mutation via
|
|
object-reassignment in functions. In Swift, all parameters are passed in at +1,
|
|
i.e. the parameters are retained before a callsite, and then are released at the
|
|
end of the callee. This means that if one writes a function like the following:
|
|
|
|
::
|
|
|
|
func append_one(a: [Int]) -> [Int] {
|
|
a.append(1)
|
|
return a
|
|
}
|
|
|
|
var a = [1, 2, 3]
|
|
a = append_one(a)
|
|
|
|
``a`` may be copied [#]_ despite the version of ``a`` without one appended to it
|
|
has no uses after ``append_one`` due to the assignment. This can be avoided
|
|
through the usage of ``inout`` parameters:
|
|
|
|
::
|
|
|
|
func append_one_in_place(inout a: [Int]) {
|
|
a.append(1)
|
|
}
|
|
|
|
var a = [1, 2, 3]
|
|
append_one_in_place(&a)
|
|
|
|
Unchecked operations
|
|
====================
|
|
|
|
Swift eliminates integer overflow bugs by checking for overflow when performing
|
|
normal arithmetic. These checks are not appropriate in high performance code
|
|
where one knows that no memory safety issues can result.
|
|
|
|
Advice: Use unchecked integer arithmetic when you can prove that overflow cannot occur
|
|
---------------------------------------------------------------------------------------
|
|
|
|
In performance-critical code you can elide overflow checks if you know it is
|
|
safe.
|
|
|
|
::
|
|
|
|
a : [Int]
|
|
b : [Int]
|
|
c : [Int]
|
|
|
|
// Precondition: for all a[i], b[i]: a[i] + b[i] does not overflow!
|
|
for i in 0 ... n {
|
|
c[i] = a[i] &+ b[i]
|
|
}
|
|
|
|
Generics
|
|
========
|
|
|
|
Swift provides a very powerful abstraction mechanism through the use of generic
|
|
types. The Swift compiler emits one block of concrete code that can perform
|
|
``MySwiftFunc<T>`` for any ``T``. The generated code takes a table of function
|
|
pointers and a box containing ``T`` as additional parameters. Any differences in
|
|
behavior between ``MySwiftFunc<Int>`` and ``MySwiftFunc<String>`` are accounted
|
|
for by passing a different table of function pointers and the size abstraction
|
|
provided by the box. An example of generics:
|
|
|
|
::
|
|
|
|
class MySwiftFunc<T> { ... }
|
|
|
|
MySwiftFunc<Int> X // Will emit code that works with Int...
|
|
MySwiftFunc<String> Y // ... as well as String.
|
|
|
|
When optimizations are enabled, the Swift compiler looks at each invocation of
|
|
such code and attempts to ascertain the concrete (i.e. non-generic type) used in
|
|
the invocation. If the generic function's definition is visible to the optimizer
|
|
and the concrete type is known, the Swift compiler will emit a version of the
|
|
generic function specialized to the specific type. This process, called
|
|
*specialization*, enables the removal of the overhead associated with
|
|
generics. Some more examples of generics:
|
|
|
|
::
|
|
|
|
class MyStack<T> {
|
|
func push(element: T) { ... }
|
|
func pop() -> T { ... }
|
|
}
|
|
|
|
func myAlgorithm(a: [T], length: Int) { ... }
|
|
|
|
// The compiler can specialize code of MyStack[Int]
|
|
var stackOfInts: MyStack[Int]
|
|
// Use stack of ints.
|
|
for i in ... {
|
|
stack.push(...)
|
|
stack.pop(...)
|
|
}
|
|
|
|
var arrayOfInts: [Int]
|
|
// The compiler can emit a specialized version of 'myAlgorithm' targeted for
|
|
// [Int]' types.
|
|
myAlgorithm(arrayOfInts, arrayOfInts.length)
|
|
|
|
Advice: Put generic declarations in the same file where they are used
|
|
---------------------------------------------------------------------
|
|
|
|
The optimizer can only perform specializations if the definition of the generic
|
|
declaration is visible in the current Module. This can only occur if the
|
|
declaration is in the same file as the invocation of the generic. *NOTE* The
|
|
standard library is a special case. Definitions in the standard library are
|
|
visible in all modules and available for specialization.
|
|
|
|
Advice: Allow the compiler to perform generic specialization
|
|
------------------------------------------------------------
|
|
|
|
The compiler can only specialize generic code if the call site and the callee
|
|
function are located in the same compilation unit. One trick that we can use to
|
|
allow compiler to optimize the callee function is to write code that performs a
|
|
type check in the same compilation unit as the callee function. The code behind
|
|
the type check then re-dispatches the call to the generic function - but this
|
|
time it has the type information. In the code sample below we've inserted a type
|
|
check into the function "play_a_game" and made the code run hundreds of times
|
|
faster.
|
|
|
|
::
|
|
|
|
//Framework.swift:
|
|
|
|
protocol Pingable { func ping() -> Self }
|
|
protocol Playable { func play() }
|
|
|
|
extension Int : Pingable {
|
|
func ping() -> Int { return self + 1 }
|
|
}
|
|
|
|
class Game<T : Pingable> : Playable {
|
|
var t : T
|
|
|
|
init (_ v : T) {t = v}
|
|
|
|
func play() {
|
|
for _ in 0...100_000_000 { t = t.ping() }
|
|
}
|
|
}
|
|
|
|
func play_a_game(game : Playable ) {
|
|
// This check allows the optimizer to specialize the
|
|
// generic call 'play'
|
|
if let z = game as? Game<Int> {
|
|
z.play()
|
|
} else {
|
|
game.play()
|
|
}
|
|
}
|
|
|
|
/// -------------- >8
|
|
|
|
// Application.swift:
|
|
|
|
play_a_game(Game(10))
|
|
|
|
|
|
The cost of large swift values
|
|
==============================
|
|
|
|
In Swift, values keep a unique copy of their data. There are several advantages
|
|
to using value-types, like ensuring that values have independent state. When we
|
|
copy values (the effect of assignment, initialization, and argument passing) the
|
|
program will create a new copy of the value. For some large values these copies
|
|
could be time consuming and hurt the performance of the program.
|
|
|
|
.. More on value types:
|
|
.. https://developer.apple.com/swift/blog/?id=10
|
|
|
|
Consider the example below that defines a tree using "value" nodes. The tree
|
|
nodes contain other nodes using a protocol. In computer graphics scenes are
|
|
often composed from different entities and transformations that can be
|
|
represented as values, so this example is somewhat realistic.
|
|
|
|
.. See Protocol-Oriented-Programming:
|
|
.. https://developer.apple.com/videos/play/wwdc2015-408/
|
|
|
|
::
|
|
|
|
protocol P {}
|
|
struct Node : P {
|
|
var left, right : P?
|
|
}
|
|
|
|
struct Tree {
|
|
var node : P?
|
|
init() { ... }
|
|
}
|
|
|
|
|
|
When a tree is copied (passed as an argument, initialized or assigned) the whole
|
|
tree needs to be copied. In the case of our tree this is an expensive operation
|
|
that requires many calls to malloc/free and a significant reference counting
|
|
overhead.
|
|
|
|
However, we don't really care if the value is copied in memory as long as the
|
|
semantics of the value remains.
|
|
|
|
Advice: Use copy-on-write semantics for large values
|
|
----------------------------------------------------
|
|
|
|
To eliminate the cost of copying large values adopt copy-on-write behavior. The
|
|
easiest way to implement copy-on-write is to compose existing copy-on-write data
|
|
structures, such as Array. Swift arrays are values, but the content of the array
|
|
is not copied around every time the array is passed as an argument because it
|
|
features copy-on-write traits.
|
|
|
|
In our Tree example we eliminate the cost of copying the content of the tree by
|
|
wrapping it in an array. This simple change has a major impact on the
|
|
performance of our tree data structure, and the cost of passing the array as an
|
|
argument drops from being O(n), depending on the size of the tree to O(1).
|
|
|
|
::
|
|
|
|
struct Tree : P {
|
|
var node : [P?]
|
|
init() {
|
|
node = [ thing ]
|
|
}
|
|
}
|
|
|
|
|
|
There are two obvious disadvantages of using Array for COW semantics. The first
|
|
problem is that Array exposes methods like "append" and "count" that don't make
|
|
any sense in the context of a value wrapper. These methods can make the use of
|
|
the reference wrapper awkward. It is possible to work around this problem by
|
|
creating a wrapper struct that will hide the unused APIs and the optimizer will
|
|
remove this overhead, but this wrapper will not solve the second problem. The
|
|
Second problem is that Array has code for ensuring program safety and
|
|
interaction with Objective-C. Swift checks if indexed accesses fall within the
|
|
array bounds and when storing a value if the array storage needs to be extended.
|
|
These runtime checks can slow things down.
|
|
|
|
An alternative to using Array is to implement a dedicated copy-on-write data
|
|
structure to replace Array as the value wrapper. The example below shows how to
|
|
construct such a data structure:
|
|
|
|
.. Note: This solution is suboptimal for nested structs, and an addressor based
|
|
.. COW data structure would be more efficient. However at this point it's not
|
|
.. possible to implement addressors out of the standard library.
|
|
|
|
.. More details in this blog post by Mike Ash:
|
|
.. https://www.mikeash.com/pyblog/friday-qa-2015-04-17-lets-build-swiftarray.html
|
|
|
|
::
|
|
|
|
final class Ref<T> {
|
|
var val : T
|
|
init(_ v : T) {val = v}
|
|
}
|
|
|
|
struct Box<T> {
|
|
var ref : Ref<T>
|
|
init(_ x : T) { ref = Ref(x) }
|
|
|
|
var value: T {
|
|
get { return ref.val }
|
|
set {
|
|
if (!isUniquelyReferencedNonObjC(&ref)) {
|
|
ref = Ref(newValue)
|
|
return
|
|
}
|
|
ref.val = newValue
|
|
}
|
|
}
|
|
}
|
|
|
|
The type ``Box`` can replace the array in the code sample above.
|
|
|
|
Unsafe code
|
|
===========
|
|
|
|
Swift classes are always reference counted. The swift compiler inserts code
|
|
that increments the reference count every time the object is accessed.
|
|
For example, consider the problem of scanning a linked list that's
|
|
implemented using classes. Scanning the list is done by moving a
|
|
reference from one node to the next: ``elem = elem.next``. Every time we move
|
|
the reference swift will increment the reference count of the ``next`` object
|
|
and decrement the reference count of the previous object. These reference
|
|
count operations are expensive and unavoidable when using Swift classes.
|
|
|
|
::
|
|
|
|
final class Node {
|
|
var next: Node?
|
|
var data: Int
|
|
...
|
|
}
|
|
|
|
|
|
Advice: Use unmanaged references to avoid reference counting overhead
|
|
---------------------------------------------------------------------
|
|
|
|
In performance-critical code you can use choose to use unmanaged
|
|
references. The ``Unmanaged<T>`` structure allows developers to disable
|
|
automatic reference counting for a specific reference.
|
|
|
|
::
|
|
|
|
var Ref : Unmanaged<Node> = Unmanaged.passUnretained(Head)
|
|
|
|
while let Next = Ref.takeUnretainedValue().next {
|
|
...
|
|
Ref = Unmanaged.passUnretained(Next)
|
|
}
|
|
|
|
|
|
Protocols
|
|
=========
|
|
|
|
Advice: Mark protocols that are only satisfied by classes as class-protocols
|
|
----------------------------------------------------------------------------
|
|
|
|
Swift can limit protocols adoption to classes only. One advantage of marking
|
|
protocols as class-only is that the compiler can optimize the program based on
|
|
the knowledge that only classes satisfy a protocol. For example, the ARC memory
|
|
management system can easily retain (increase the reference count of an object)
|
|
if it knows that it is dealing with a class. Without this knowledge the compiler
|
|
has to assume that a struct may satisfy the protocol and it needs to be prepared
|
|
to retain or release non-trivial structures, which can be expensive.
|
|
|
|
If it makes sense to limit the adoption of protocols to classes then mark
|
|
protocols as class-only protocols to get better runtime performance.
|
|
|
|
::
|
|
|
|
protocol Pingable : class { func ping() -> Int }
|
|
|
|
.. https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/Protocols.html
|
|
|
|
|
|
|
|
Footnotes
|
|
=========
|
|
|
|
.. [#] A virtual method table or 'vtable' is a type specific table referenced by
|
|
instances that contains the addresses of the type's methods. Dynamic
|
|
dispatch proceeds by first looking up the table from the object and then
|
|
looking up the method in the table.
|
|
|
|
.. [#] This is due to the compiler not knowing the exact function being called.
|
|
|
|
.. [#] i.e. a direct load of a class's field or a direct call to a function.
|
|
|
|
.. [#] Explain what COW is here.
|
|
|
|
.. [#] In certain cases the optimizer is able to via inlining and ARC
|
|
optimization remove the retain, release causing no copy to occur.
|