mirror of
https://github.com/apple/swift.git
synced 2025-12-14 20:36:38 +01:00
:orphan:
|
|
|
|
.. _CallingConvention:
|
|
|
|
The Swift Calling Convention
|
|
****************************
|
|
|
|
.. contents::
|
|
|
|
This whitepaper discusses the Swift calling convention, at least as we
|
|
want it to be.
|
|
|
|
It's a basic assumption in this paper that Swift shouldn't make an
|
|
implicit promise to exactly match the default platform calling
|
|
convention. That is, if a C or Objective-C programmer manages to derive the
|
|
address of a Swift function, we don't have to promise that an obvious
|
|
translation of the type of that function will be correctly callable
|
|
from C. For example, this wouldn't be guaranteed to work::
|
|
|
|
    // In Swift:
    func foo(_ x: Int, y: Double) -> MyClass { ... }

    // In Objective-C:
    extern id _TF4main3fooFTSiSd_CS_7MyClass(intptr_t x, double y);
|
|
|
|
We do sometimes need to be able to match C conventions, both to use
|
|
them and to generate implementations of them, but that level of
|
|
compatibility should be opt-in and site-specific. If Swift would
|
|
benefit from internally using a better convention than C/Objective-C uses,
|
|
and switching to that convention doesn't damage the dynamic abilities
|
|
of our target platforms (debugging, dtrace, stack traces, unwinding,
|
|
etc.), there should be nothing preventing us from doing so. (If we
|
|
did want to guarantee compatibility on this level, this paper would be
|
|
a lot shorter!)
|
|
|
|
Function call rules in high-level languages have three major
|
|
components, each operating on a different abstraction level:
|
|
|
|
* the high-level semantics of the call (pass-by-reference
  vs. pass-by-value),

* the ownership and validity conventions about argument and result
  values ("+0" vs. "+1", etc.), and

* the "physical" representation conventions of how values are actually
  communicated between functions (in registers, on the stack, etc.).
|
|
|
|
We'll tackle each of these in turn, then conclude with a detailed
|
|
discussion of function signature lowering.
|
|
|
|
High-level semantic conventions
|
|
===============================
|
|
|
|
The major division in argument passing conventions between languages
|
|
is between pass-by-reference and pass-by-value languages. It's a
|
|
distinction that only really makes sense in languages with the concept
|
|
of an l-value, but Swift does, so it's pertinent.
|
|
|
|
In general, the terms "pass-by-X" and "call-by-X" are used
|
|
interchangeably. It's unfortunate, because these conventions are
|
|
argument specific, and functions can be passed multiple arguments
|
|
that are each handled in a different way. As such, we'll prefer
|
|
"pass-by-X" for consistency and to emphasize that these conventions
|
|
are argument-specific.
|
|
|
|
Pass-by-reference
|
|
-----------------
|
|
|
|
In pass-by-reference (also called pass-by-address), if
|
|
`A` is an l-value expression, `foo(A)` is passed some sort of opaque
|
|
reference through which the original l-value can be modified. If `A`
|
|
is not an l-value, the language may prohibit this, or (if
|
|
pass-by-reference is the default convention) it may pass a temporary
|
|
variable containing the result of `A`.
|
|
|
|
Don't confuse pass-by-reference with the concept of a *reference
|
|
type*. A reference type is a type whose value is a reference to a
|
|
different object; for example, a pointer type in C, or a class type in
|
|
Java or Swift. A variable of reference type can be passed by value
|
|
(copying the reference itself) or by reference (passing the variable
|
|
itself, allowing it to be changed to refer to a different object).
|
|
Note that references in C++ are a generalization of pass-by-reference,
|
|
not really a reference type; in C++, a variable of reference type
|
|
behaves completely unlike any other variable in the language.
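
To make the distinction concrete, here is a small sketch in Swift. The
`Node` class and the two functions are hypothetical names invented for this
illustration; nothing here is drawn from a library.

```swift
// Hypothetical reference type, used only for illustration.
final class Node {
    var label: String
    init(label: String) { self.label = label }
}

// Reference type passed by value: the callee gets a copy of the reference.
// It can mutate the shared object but cannot rebind the caller's variable.
func relabel(_ node: Node) {
    node.label = "relabeled"
}

// Reference type passed by reference: the callee gets the caller's variable
// itself and can make it refer to a different object.
func replace(_ node: inout Node) {
    node = Node(label: "replacement")
}

var current = Node(label: "original")
let original = current

relabel(current)
let sameObjectAfterRelabel = (current === original)

replace(&current)
let sameObjectAfterReplace = (current === original)
```

After `relabel`, both variables still name one object whose label has
changed; after `replace`, `current` names a new object and `original` is
untouched.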
|
|
|
|
Also, don't confuse pass-by-reference with the physical convention of
|
|
passing an argument value indirectly. In pass-by-reference, what's
|
|
logically being passed is a reference to a tangible, user-accessible
|
|
object; changes to the original object will be visible in the
|
|
reference, and changes to the reference will be reflected in the
|
|
original object. In an indirect physical convention, the argument is
|
|
still logically an independent value, no longer associated with the
|
|
original object (if there was one).
|
|
|
|
If every object in the language is stored in addressable memory,
|
|
pass-by-reference can be easily implemented by simply passing the
|
|
address of the object. If an l-value can have more structure than
|
|
just a single, independently-addressable object, more information may
|
|
be required from the caller. For example, an array argument in
|
|
FORTRAN can be a row or column vector from a matrix, and so arrays are
|
|
generally passed as both an address and a stride. C and C++ do have
|
|
unaddressable l-values because of bitfields, but they forbid passing
|
|
bitfields by reference (in C++) or taking their address (in either
|
|
language), which greatly simplifies pointer and reference types in
|
|
those languages.
|
|
|
|
FORTRAN is the best-known remaining example of a language that defaults to
|
|
pass-by-reference. Early FORTRAN implementations famously passed
|
|
constants by passing the address of mutable global memory initialized
|
|
to the constant; if the callee modified its parameter (illegal under
|
|
the standard, but...), it literally changed the constant for future
|
|
uses. FORTRAN now allows procedures to explicitly take arguments by
|
|
value and explicitly declare that arguments must be l-values.
|
|
|
|
However, many languages do allow parameters to be explicitly marked as
|
|
pass-by-reference. As mentioned for C++, sometimes only certain kinds
|
|
of l-values are allowed.
|
|
|
|
Swift allows parameters to be marked as pass-by-reference with
|
|
`inout`. Arbitrary l-values can be passed. The Swift convention is
|
|
to always pass an address; if the parameter is not addressable, it
|
|
must be materialized into a temporary and then written back. See the
|
|
accessors proposal for more details about the high-level semantics of
|
|
`inout` arguments.
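
The materialize-and-write-back behavior can be observed from source code by
passing a computed property as an `inout` argument. The following is a
minimal sketch; `AccessLog`, `Box`, and `addTwo` are hypothetical names, and
the log exists only so that the accessor calls can be inspected.

```swift
// Hypothetical helper that records accessor calls.
final class AccessLog {
    var events: [String] = []
}

struct Box {
    let log = AccessLog()
    var raw = 10

    // A computed property has no storage of its own, so it is not
    // addressable and must go through a temporary.
    var doubled: Int {
        get {
            log.events.append("get")
            return raw * 2
        }
        set {
            log.events.append("set")
            raw = newValue / 2
        }
    }
}

func addTwo(_ value: inout Int) {
    value += 2
}

var box = Box()
// Read into a temporary (get), pass its address, then write back (set).
addTwo(&box.doubled)
```

The callee only ever sees an address; the getter and setter each run exactly
once, on the caller's side of the call.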
|
|
|
|
Pass-by-value
|
|
-------------
|
|
|
|
In pass-by-value, if `A` is an l-value expression, `foo(A)` copies the
|
|
current value there. Any modifications `foo` makes to its parameter
|
|
are made to this copy, not to the original l-value.
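
As a brief illustration (with the hypothetical names `Point` and `shifted`),
note that Swift parameters are immutable bindings, so a callee that wants to
modify its argument makes the local copy explicit:

```swift
struct Point {
    var x: Int
    var y: Int
}

// The parameter is an independent value; changing a copy of it has no
// effect on the caller's variable.
func shifted(_ point: Point) -> Point {
    var copy = point
    copy.x += 10
    return copy
}

let start = Point(x: 1, y: 2)
let moved = shifted(start)
```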
|
|
|
|
Most modern languages are pass-by-value, with specific functions able
|
|
to opt in to pass-by-reference semantics. This is exactly what Swift
|
|
does.
|
|
|
|
There's not much room for variation in the high-level semantics of
|
|
passing arguments by value; all the variation is in the ownership and
|
|
physical conventions.
|
|
|
|
Ownership transfer conventions
|
|
==============================
|
|
|
|
Arguments and results that require cleanup, like an Objective-C object
|
|
reference or a non-POD C++ object, raise two questions about
|
|
responsibility: who is responsible for cleaning it up, and when?
|
|
|
|
These questions arise even when the cleanup is explicit in code. C's
|
|
`strdup` function returns newly-allocated memory which the caller is
|
|
responsible for freeing, but `strtok` does not. Objective-C has
|
|
standard naming conventions that describe which functions return
|
|
objects that the caller is responsible for releasing, and outside of
|
|
ARC these must be followed manually. Of course, conventions designed
|
|
to be implemented by programmers are often designed around the
|
|
simplicity of that implementation, rather than necessarily being more
|
|
efficient.
|
|
|
|
Pass-by-reference arguments
|
|
---------------------------
|
|
|
|
Pass-by-reference arguments generally don't involve a *transfer* of
|
|
ownership. It's assumed that the caller will ensure that the referent
|
|
is valid at the time of the call, and that the callee will ensure that
|
|
the referent is still valid at the time of return.
|
|
|
|
FORTRAN does actually allow parameters to be tagged as out-parameters,
|
|
where the caller doesn't guarantee the validity of the argument before
|
|
the call. Objective-C has something similar, where an indirect method
|
|
argument can be marked `out`; ARC takes advantage of this with
|
|
autoreleasing parameters to avoid a copy into the writeback temporary.
|
|
Neither of these is something we semantically care about supporting
|
|
in Swift.
|
|
|
|
There is one other theoretically interesting convention question here:
|
|
the argument has to be valid before the call and after the call, but
|
|
does it have to be valid during the call? Swift's answer to this is
|
|
generally "yes". Swift does have `inout` aliasing rules that allow a
|
|
certain amount of optimization, but the compiler is forbidden from
|
|
exploiting these rules in any way that could cause memory corruption
|
|
(at least in the absence of race conditions). So Swift has to ensure
|
|
that an `inout` argument is valid whenever it does something
|
|
(including calling an opaque function) that could potentially access
|
|
the original l-value.
|
|
|
|
If Swift allowed local variables to be captured through `inout`
|
|
parameters, and therefore needed to pass an implicit owner parameter
|
|
along with an address, this owner parameter would behave like a
|
|
pass-by-value argument and could use any of the conventions listed
|
|
below. However, the optimal convention for this is obvious: it should
|
|
be `guaranteed`, since captures are very unlikely and callers are
|
|
almost always expected to use the value of an `inout` variable
|
|
afterwards.
|
|
|
|
Pass-by-value arguments
|
|
-----------------------
|
|
|
|
All conventions for this have performance trade-offs.
|
|
|
|
We're only going to discuss *static* conventions, where the transfer
|
|
is picked at compile time. It's possible to have a *dynamic*
|
|
convention, where the caller passes a flag indicating whether it's
|
|
okay to directly take responsibility for the value, and the callee can
|
|
(conceptually) return a flag indicating whether it actually did take
|
|
responsibility for it. If copying is extremely expensive, that can be
|
|
worthwhile; otherwise, the code cost may overwhelm any other benefits.
|
|
|
|
This discussion will ignore one particular impact of these conventions
|
|
on code size. If a function has many callers, conventions that
|
|
require more code in the caller are worse, all else aside. If a
|
|
single call site has many possible targets, conventions that require
|
|
more code in the callee are worse, all else aside. It's not really
|
|
reasonable to decide this in advance for unknown code; we could maybe
|
|
make rules about code calling system APIs, except that system APIs are
|
|
by definition locked down, and we can't change them. It's a
|
|
reasonable thing to consider changing with PGO, though.
|
|
|
|
Responsibility
|
|
~~~~~~~~~~~~~~
|
|
|
|
A common refrain in this performance analysis will be whether a
|
|
function has responsibility for a value. A function has to get a
|
|
value from *somewhere*:
|
|
|
|
* A caller is usually responsible for the return values it receives:
|
|
the callee generated the value and the caller is responsible for
|
|
destroying it. Any other convention has to rely on heavily
|
|
restricting what kind of value can be returned. (If you're thinking
|
|
about Objective-C autoreleased results, just accept this for now;
|
|
we'll talk about that later.)
|
|
|
|
* A function isn't necessarily responsible for a value it loads from
|
|
memory. Ignoring race conditions, the function may be able to
|
|
immediately use the value without taking any specific action to keep
|
|
it valid.
|
|
|
|
* A callee may or may not be responsible for a value passed as a
|
|
parameter, depending on the convention it was passed with.
|
|
|
|
* A value might come from a source that doesn't necessarily make
|
|
the function responsible, but if the function takes an action which
|
|
invalidates the source before using the value, the function has to
|
|
take action to keep the value valid. At that point, the function
|
|
has responsibility for the value despite its original source.
|
|
|
|
For example, a function `foo()` might load a reference `r` from a
|
|
global variable `x`, call an unknown function `bar()`, and then use
|
|
`r` in some way. If `bar()` can't possibly overwrite `x`, `foo()`
|
|
doesn't have to do anything to keep `r` alive across the call;
|
|
otherwise it does (e.g. by retaining it in a refcounted
|
|
environment). This is a situation where humans are often much
|
|
smarter than compilers. Of course, it's also a situation where
|
|
humans are sometimes insufficiently conservative.
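
The `foo()`/`bar()` scenario can be sketched in Swift as follows. This is
only an illustration under some assumptions: a hypothetical `Registry` object
stands in for the global variable `x`, and this particular `bar` does
overwrite it.

```swift
final class Resource {
    let name: String
    init(name: String) { self.name = name }
}

// Stands in for the global variable `x` in the text.
final class Registry {
    var x: Resource?
}

// Stands in for the unknown function `bar()`; this one overwrites `x`.
func bar(_ registry: Registry) {
    registry.x = Resource(name: "second")
}

func foo(_ registry: Registry) -> String {
    guard let r = registry.x else { return "empty" }
    bar(registry)    // invalidates the source that `r` was loaded from
    return r.name    // still valid: `r` was kept alive across the call
}

let registry = Registry()
registry.x = Resource(name: "first")
let observed = foo(registry)
```

Because the compiler cannot rule out the overwrite, `foo` has to take
responsibility for `r` before the call, even though merely loading it did
not require that.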
|
|
|
|
A function may also require responsibility for a value as part of its
|
|
operation:
|
|
|
|
* Since a variable is always responsible for the current value it
  stores, a function which stores a value into memory must first gain
  responsibility for that value.

* A callee normally transfers responsibility for its return value to
  its caller; therefore it must gain responsibility for its return
  value before returning it.

* A caller may need to gain responsibility for a value before passing
  it as an argument, depending on the parameter's ownership-transfer
  convention.
|
|
|
|
Known conventions
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
There are three static parameter conventions for ownership worth
|
|
considering here:
|
|
|
|
* The caller may transfer responsibility for the value to the callee.
|
|
In SIL, we call this an **owned** parameter.
|
|
|
|
This is optimal if the caller has responsibility for the value and
|
|
doesn't need it after the call. This is an extremely common
|
|
situation; for example, it comes up whenever a call result is
|
|
immediately used as an argument. By giving the callee responsibility
|
|
for the value, this convention allows the callee to use the value at
|
|
a later point without taking any extra action to keep it alive.
|
|
|
|
The flip side is that this convention requires a lot of extra work
|
|
when a single value is used multiple times in the caller. For
|
|
example, a value passed in every iteration of a loop will need to be
|
|
copied/retained/whatever each time.
|
|
|
|
* The caller may provide the value without any responsibility on
|
|
either side. In SIL, we call this an **unowned** parameter. The
|
|
value is guaranteed to be valid at the moment of the call, and in
|
|
the absence of race conditions, that guarantee can be assumed to
|
|
continue unless the callee does something that might invalidate it.
|
|
As discussed above, humans are often much smarter than computers
|
|
about knowing when that's possible.
|
|
|
|
This is optimal if the caller can acquire the value without
|
|
responsibility and the callee doesn't require responsibility for it.
|
|
In very simple code --- e.g., loading values from an array and
|
|
passing them to a comparator function which just reads a few fields
|
|
from each and returns --- this can be extremely efficient.
|
|
|
|
Unfortunately, this convention is completely undermined if either
|
|
side has to do anything that forces it to take action to keep the
|
|
value alive. Also, if that happens on the caller side, the
|
|
convention can keep values alive longer than is necessary. It's
|
|
very easy for both sides of the convention to end up doing extra
|
|
work because of this.
|
|
|
|
* The caller may assert responsibility for the value. In SIL, we call
|
|
this a **guaranteed** parameter. The callee can rely on the value
|
|
staying valid for the duration of the call.
|
|
|
|
This is optimal if the caller needs to use the value after the call
|
|
and either has responsibility for it or has a guarantee like this
|
|
for it. Therefore, this convention is particularly nice when a
|
|
value is likely to be forwarded by value a great deal.
|
|
|
|
However, this convention does generally keep values alive longer
|
|
than is necessary, since the outermost function which passed it as
|
|
an argument will generally be forced to hold a reference for the
|
|
duration. By the same mechanism, in refcounted systems, this
|
|
convention tends to cause values to have multiple retains active at
|
|
once; for example, if a copy-on-write array is created in one
|
|
function, passed to another, stored in a mutable variable, and then
|
|
modified, the callee will see a reference count of 2 and be forced
|
|
to do a structural copy. This can occur even if the caller
|
|
literally constructed the array for the sole and immediate purpose
|
|
of passing it to the callee.
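
Two of these conventions can be requested per parameter in source code. As
we understand it, Swift 5.9 and later accept `consuming` and `borrowing`
parameter modifiers that roughly correspond to **owned** and **guaranteed**
respectively. The sketch below assumes such a toolchain; `total` and
`archive` are hypothetical names.

```swift
// Guaranteed-style parameter: the caller keeps responsibility and the
// callee only reads the value for the duration of the call.
func total(_ values: borrowing [Int]) -> Int {
    var sum = 0
    for index in values.indices {
        sum += values[index]
    }
    return sum
}

// Owned-style parameter: responsibility is transferred to the callee,
// which can store the value without an extra copy.
func archive(_ values: consuming [Int], into shelf: inout [[Int]]) {
    shelf.append(values)
}

let data = [1, 2, 3]
var shelf: [[Int]] = []
let sum = total(data)
archive(data, into: &shelf)
```

There is no source-level spelling for the **unowned** convention here; the
sketch only covers the other two.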
|
|
|
|
Analysis
|
|
~~~~~~~~
|
|
|
|
Objective-C generally uses the unowned convention for object-pointer
|
|
parameters. It is possible to mark a parameter as being consumed,
|
|
which is basically the owned convention. As a special case, in ARC we
|
|
assume that callers are responsible for keeping `self` values alive
|
|
(including in blocks), which is effectively the `guaranteed`
|
|
convention.
|
|
|
|
`unowned` causes a lot of problems without really solving any, in my
|
|
experience looking at ARC-generated code and optimizer output. A
|
|
human can take advantage of it, but the compiler is so frequently
|
|
blocked. Many common idioms (like chains of functions that
|
|
just add default arguments at each step) have really awful performance
|
|
because the compiler is adding retains and releases at every single
|
|
level. It's just not a good convention to adopt by default. However,
|
|
we might want to consider allowing specific function parameters to opt
|
|
into it; sort comparators are a particularly interesting candidate
|
|
for this. `unowned` is very similar to C++'s `const &` for things
|
|
like that.
|
|
|
|
`guaranteed` is good for some things, but it causes a lot of silly
|
|
code bloat when values are really only used in one place, which is
|
|
quite common. The liveness / refcounting issues are also pretty
|
|
problematic. But there is one example that's very nice for
|
|
`guaranteed`: `self`. It's quite common for clients of a type to call
|
|
multiple methods on a single value, or for methods to dispatch to
|
|
multiple other methods, which are exactly the situations where
|
|
`guaranteed` excels. And it's relatively uncommon (but not
|
|
unimaginable) for a non-mutating method on a copy-on-write struct to
|
|
suddenly store `self` aside and start mutating that copy.
|
|
|
|
`owned` is a good default for other parameters. It has some minor
|
|
performance disadvantages (unnecessary retains if you have an
|
|
unoptimizable call in a loop) and some minor code size benefits (in
|
|
common straight-line code), but frankly, both of those points pale in
|
|
importance to the ability to transfer copy-on-write structures around
|
|
without spuriously increasing reference counts. It doesn't take too
|
|
many unnecessary structural copies before any amount of
|
|
reference-counting traffic (especially the Swift-native
|
|
reference-counting used in copy-on-write structures) is basically
|
|
irrelevant in comparison.
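
The cost of a spurious extra reference is easy to see in a hand-written
copy-on-write type. The following is a minimal sketch rather than how the
standard library implements its collections; `Storage` and `CowList` are
hypothetical, and the uniqueness test uses the standard library function
`isKnownUniquelyReferenced`.

```swift
final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct CowList {
    private var storage = Storage([])
    private(set) var structuralCopies = 0

    var values: [Int] { storage.values }

    mutating func append(_ value: Int) {
        // Mutate in place only if nobody else holds a reference.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.values)
            structuralCopies += 1
        }
        storage.values.append(value)
    }
}

// Uniquely referenced: every append mutates in place.
var solo = CowList()
solo.append(1)
solo.append(2)

// A second holder of the same storage forces a structural copy.
let kept = solo
var shared = kept
shared.append(3)
```

A convention that leaves the caller holding a reference puts the callee in
the position of `shared` above, even when the caller never looks at the
value again.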
|
|
|
|
Result values
|
|
-------------
|
|
|
|
There's no major semantic split in result conventions like that
|
|
between pass-by-reference and pass-by-value. In most languages, a
|
|
function has to return a value (or nothing). There are languages like
|
|
C++ where functions can return references, but that's inherently
|
|
limited, because the reference has to refer to something that exists
|
|
outside the function. If Swift ever adds a similar language
|
|
mechanism, it'll have to be memory-safe and extremely opaque, and
|
|
it'll be easy to just think of that as a kind of weird value result.
|
|
So we'll just consider value results here.
|
|
|
|
Value results raise some of the same ownership-transfer questions as
|
|
value arguments. There's one major limitation: just like a
|
|
by-reference result, an actual `unowned` convention is inherently
|
|
limited, because something other than the result value must be
|
|
keeping it valid. So that's off the table for Swift.
|
|
|
|
What Objective-C does is something more dynamic. Most APIs in
|
|
Objective-C give you a very ephemeral guarantee about the validity of
|
|
the result: it's valid now, but you shouldn't count on it being valid
|
|
indefinitely later. This might be because the result is actually
|
|
owned by some other object somewhere, or it might be because the
|
|
result has been placed in the autorelease pool, a thread-local data
|
|
structure which will (when explicitly drained by something up the call
|
|
chain) eventually release everything that's been put into it. This autorelease
|
|
pool can be a major source of spurious memory growth, and in classic
|
|
manual reference-counting it was important to drain it fairly
|
|
frequently. ARC's response to this convention was to add an
|
|
optimization which attempts to prevent things from ending up in the
|
|
autorelease pool; the net effect of this optimization is that ARC ends
|
|
up with an owned reference regardless of whether the value was
|
|
autoreleased. So in effect, from ARC's perspective, these APIs still
|
|
return an owned reference, mediated through some extra runtime calls
|
|
to undo the damage of the convention.
|
|
|
|
So there's really no compelling alternative to an owned return
|
|
convention as the default in Swift.
|
|
|
|
Physical conventions
|
|
====================
|
|
|
|
The lowest abstraction level for a calling convention is the actual
|
|
"physical" rules for the call:
|
|
|
|
* where the caller should place argument values in registers and
  memory before the call,

* how the callee should pass back the return values in registers
  and/or memory after the call, and

* what invariants hold about registers and memory over the call.
|
|
|
|
In theory, all of these could be changed in the Swift ABI. In
|
|
practice, it's best to avoid changes to the invariant rules, because
|
|
such changes could complicate Swift-to-C interoperation:
|
|
|
|
* Assuming a higher stack alignment would require dynamic realignment
  whenever Swift code is called from C.

* Assuming a different set of callee-saved registers would require
  additional saves and restores when either Swift code calls C or is
  called from C, depending on the exact change. That would then
  inhibit some kinds of tail call.
|
|
|
|
So we will limit ourselves to considering the rules for allocating
|
|
parameters and results to registers. Our platform C ABIs are usually
|
|
quite good at this, and it's fair to ask why Swift shouldn't just use
|
|
C's rules. There are three general answers:
|
|
|
|
* Platform C ABIs are specified in terms of the C type system, and the
  Swift type system allows things to be expressed which don't have
  direct analogues in C (for example, enums with payloads).

* The layout of structures in Swift does not necessarily match their
  layout in C, which means that the C rules don't necessarily cover
  all the cases in Swift.

* Swift places a larger emphasis on first-class structs than C does.
  C ABIs often fail to allocate even small structs to registers, or
  use inefficient registers for them, and we would like to be somewhat
  more aggressive than that.
|
|
|
|
Accordingly, the Swift ABI is defined largely in terms of lowering: a
|
|
Swift function signature is translated to a C function signature with
|
|
all the aggregate arguments and results eliminated (possibly by
|
|
deciding to pass them indirectly). This lowering will be described in
|
|
detail in the final section of this whitepaper.
|
|
|
|
However, there are some specific circumstances where we'd like to
|
|
deviate from the platform ABI:
|
|
|
|
Aggregate results
|
|
-----------------
|
|
|
|
As mentioned above, Swift puts a lot of focus on first-class value
|
|
types. As part of this, it's very valuable to be able to return
|
|
common value types fully in registers instead of indirectly. The
|
|
magic number here is three: it's very common for copy-on-write value
|
|
types to want about three pointers' worth of data, because that's just
|
|
enough for some sort of owner pointer plus a begin/end pair.
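
The "three pointers" shape can be checked with `MemoryLayout`. The type
below is a hypothetical stand-in for such a value type, not a real standard
library type:

```swift
final class Owner {}

// Hypothetical copy-on-write view: an owner reference plus a begin/end pair.
struct BufferView {
    var owner: Owner
    var begin: UnsafePointer<Int>
    var end: UnsafePointer<Int>
}

let pointerSize = MemoryLayout<UnsafeRawPointer>.size
let viewSize = MemoryLayout<BufferView>.size
let isThreePointers = (viewSize == 3 * pointerSize)
```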
|
|
|
|
Unfortunately, many common C ABIs fall slightly short of that. Even
|
|
those ABIs that do allow small structs to be returned in registers
|
|
tend to only allow two pointers' worth. So in general, Swift would
|
|
benefit from a very slightly-tweaked calling convention that allocates
|
|
one or two more registers to the result.
|
|
|
|
Implicit parameters
|
|
-------------------
|
|
|
|
There are several language features in Swift which require implicit
|
|
parameters:
|
|
|
|
Closures
|
|
~~~~~~~~
|
|
|
|
Swift's function types are "thick" by default, meaning that a function
|
|
value carries an optional context object which is implicitly passed to
|
|
the function when it is called. This context object is
|
|
reference-counted, and it should be passed `guaranteed` for
|
|
straightforward reasons:
|
|
|
|
* It's not uncommon for closures to be called many times, in which
|
|
case an `owned` convention would be unnecessarily expensive.
|
|
|
|
* While it's easy to imagine a closure which would want to take
|
|
responsibility for its captured values, giving it responsibility for
|
|
a retain of the context object doesn't generally allow that. The
|
|
closure would only be able to take ownership of the captured values
|
|
if it had responsibility for a *unique* reference to the context.
|
|
So the closure would have to be written to do different things based
|
|
on the uniqueness of the reference, and it would have to be able to
|
|
tear down and deallocate the context object after stealing values
|
|
from it. The optimization just isn't worth it.
|
|
|
|
* It's usually straightforward for the caller to guarantee the
|
|
validity of the context reference; worst case, a single extra
|
|
Swift-native retain/release is pretty cheap. Meanwhile, not having
|
|
that guarantee would force many closure functions to retain their
|
|
contexts, since many closures do multiple things with values from
|
|
the context object. So `unowned` would not be a good convention.
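
The extra context word is visible in the size of a function value. The
sketch below compares a default (thick) Swift function type with a
`@convention(c)` function pointer type. The sizes are an implementation
detail of the toolchains we're aware of, not a guarantee made by this paper.

```swift
typealias ThickFunction = () -> Void
typealias CFunctionPointer = @convention(c) () -> Void

let pointerSize = MemoryLayout<UnsafeRawPointer>.size

// A thick function value: function pointer plus context reference.
let thickSize = MemoryLayout<ThickFunction>.size

// A C function pointer: just the function pointer.
let cSize = MemoryLayout<CFunctionPointer>.size
```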
|
|
|
|
Many functions don't actually need a context, however; they are
|
|
naturally "thin". It would be best if it were possible to construct a
|
|
thick function directly from a thin function without having to
|
|
introduce a thunk just to move parameters around the missing context
|
|
parameter. In the worst case, a thunk would actually require the
|
|
allocation of a context object just to store the original function
|
|
pointer; but that's only necessary when converting from a completely
|
|
opaque function value. When the source function is known statically,
|
|
which is far more likely, the thunk can just be a global function
|
|
which immediately calls the target with the correctly shuffled
|
|
arguments. Still, it'd be better to be able to avoid creating such
|
|
thunks entirely.
|
|
|
|
In order to reliably avoid creating thunks, it must be possible for
|
|
code invoking an opaque thick function to pass the context pointer in
|
|
a way that can be safely and implicitly ignored if the function
|
|
happens to actually be thin. There are two ways to achieve this:
|
|
|
|
* The context can be passed as the final parameter. In most C calling
  conventions, extra arguments can be safely ignored; this is because
  most C calling conventions support variadic arguments, and such
  conventions inherently can't rely on the callee knowing the extent
  of the arguments.

  However, this is sub-optimal because the context is often used
  repeatedly in a closure, especially at the beginning, and putting it
  at the end of the argument list makes it more likely to be passed on
  the stack.

* The context can be passed in a register outside of the normal
  argument sequence. Some ABIs actually even reserve a register for
  this purpose; for example, on x86-64 it's `%r10`. Neither of the
  ARM ABIs does, however.
|
|
|
|
Having an out-of-band register would be the best solution.
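
As a purely conceptual model (not the actual ABI), a thick function value
can be pictured as an entry point paired with an optional context. All names
below are hypothetical. Swift source has no way to express "ignore an extra
argument", so in this model the thin function declares the ignored context
explicitly.

```swift
final class Context {
    let captured: Int
    init(captured: Int) { self.captured = captured }
}

// Model of a thick function value: entry point plus optional context.
// Here the context is passed as the final argument.
struct ThickFunction {
    let entry: (Int, Context?) -> Int
    let context: Context?

    func callAsFunction(_ argument: Int) -> Int {
        entry(argument, context)
    }
}

// A "thin" function needs no context and ignores whatever it is given.
func increment(_ x: Int, _ ignored: Context?) -> Int {
    x + 1
}

// A closure-like function reads its captured state out of the context.
func addCaptured(_ x: Int, _ context: Context?) -> Int {
    x + (context?.captured ?? 0)
}

let thin = ThickFunction(entry: increment, context: nil)
let thick = ThickFunction(entry: addCaptured, context: Context(captured: 40))
```

The caller invokes both values in exactly the same way, which is the
property that lets a thin function be used as a thick one without a thunk.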
|
|
|
|
(Surprisingly, the ownership transfer convention for the context
|
|
doesn't actually matter here. You might think that an `owned`
|
|
convention would be prohibited, since the callee would fail to release
|
|
the context and would therefore leak it. However, a thin function
|
|
should always have a `nil` context, so this would be harmless.)
|
|
|
|
Either solution works acceptably with curried partial application,
|
|
since the inner parameters can be left in place while transforming the
|
|
context into the outer parameters. However, an `owned` convention
|
|
would either prevent the uncurrying forwarder from tail-calling the
|
|
main function or force all the arguments to be spilled. Neither is
|
|
really acceptable; one more argument against an `owned` convention.
|
|
(This is another example where `guaranteed` works quite nicely, since
|
|
the guarantees are straightforward to extend to the main function.)
|
|
|
|
`self`
|
|
~~~~~~
|
|
|
|
Methods (both static and instance) require a `self` parameter. In all
|
|
of these cases, it's reasonable to expect that `self` will be used
|
|
frequently, so it's best to pass it in a register. Also, many methods
|
|
call other methods on the same object, so it's also best if the
|
|
register storing `self` is stable across different method signatures.
|
|
|
|
In static methods on value types, `self` doesn't require any dynamic
|
|
information: there's only one value of the metatype, and there's
|
|
usually no point in passing it.
|
|
|
|
In static methods on class types, `self` is a reference to the class
|
|
metadata, a single pointer. This is necessary because it could
|
|
actually be the class object of a subclass.
|
|
|
|
In instance methods on class types, `self` is a reference to the
|
|
instance, again a single pointer.
|
|
|
|
In mutating instance methods on value types, `self` is the address of
|
|
an object.
|
|
|
|
In non-mutating instance methods on value types, `self` is a value; it
|
|
may require multiple registers, or none, or it may need to be passed
|
|
indirectly.
|
|
|
|
All of these cases except mutating instance methods on value types can
|
|
be partially applied to create a function closure whose type is the
|
|
formal type of the method. That is, if class `A` has a method
|
|
declared `func foo(_ x: Int) -> Double`, then `A.foo` yields a function
|
|
of type `(Int) -> Double`. Assuming that we continue to feel that
|
|
this is a useful language feature, it's worth considering how we could
|
|
support it efficiently. The expenses associated with a partial
|
|
application are (1) the allocation of a context object and (2) needing
|
|
to introduce a thunk to forward to the original function. All else
|
|
aside, we can avoid the allocation if the representation of `self` is
|
|
compatible with the representation of a context object reference; this
|
|
is essentially true only if `self` is a class instance using Swift
|
|
reference counting. Avoiding the thunk is possible only if we
|
|
successfully avoided the allocation (since otherwise a thunk is
|
|
required in order to extract the correct `self` value from the
|
|
allocated context object) and `self` is passed in exactly the same
|
|
manner as a closure context would be.
|
|
|
|
It's unclear whether making this more efficient would really be
|
|
worthwhile on its own, but if we do support an out-of-band context
|
|
parameter, taking advantage of it for methods is essentially trivial.
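
For reference, here is how partial application of a method looks in current
Swift source. The body of `foo` and the `scale` property are invented for
the example. Note that in shipping Swift the bound form is written on an
instance, while naming the method through the type yields a curried function
that takes `self` first.

```swift
final class A {
    let scale: Double
    init(scale: Double) { self.scale = scale }

    func foo(_ x: Int) -> Double {
        Double(x) * scale
    }
}

let a = A(scale: 1.5)

// Partially applied to an instance: `self` is captured in the closure.
let bound: (Int) -> Double = a.foo

// Referenced through the type: a curried function taking `self` first.
let curried: (A) -> (Int) -> Double = A.foo
```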
|
|
|
|
Error handling
|
|
--------------
|
|
|
|
The calling convention implications of Swift's error handling design
|
|
aren't yet settled. It may involve extra parameters; it may involve
|
|
extra return values. Considerations:
|
|
|
|
* Callers will generally need to immediately check for an error.
|
|
Being able to quickly check a register would be extremely
|
|
convenient.
|
|
|
|
* If the error is returned as a component of the result value, it
|
|
shouldn't be physically combined with the normal result. If the
|
|
normal result is returned in registers, it would be unfortunate to
|
|
have to do complicated logic to test for error. If the normal
|
|
result is returned indirectly, contorting the indirect result with
|
|
the error would likely prevent the caller from evaluating the call
|
|
in-place.
|
|
|
|
* It would be very convenient to be able to trivially turn a function
|
|
which can't produce an error into a function which can. This is an
|
|
operation that we expect higher-order code to have to do frequently, if
|
|
it isn't completely inlined away. For example::
|
|
|
|
// foo() expects its argument to follow the conventions of a
|
|
// function that's capable of throwing.
|
|
func foo(_ fn: () throws -> ()) throwsIf(fn)
|
|
|
|
// Here we're passing foo() a function that can't throw; this is
|
|
// allowed by the subtyping rules of the language. We'd like to be
|
|
// able to do this without having to introduce a thunk that maps
|
|
// between the conventions.
|
|
func bar(_ fn: () -> ()) {
|
|
foo(fn)
|
|
}
|
|
|
|
We'll consider two ways to satisfy this.
|
|
|
|
The first is to pass a pointer argument that doesn't interfere with
|
|
the normal argument sequence. The caller would initialize the memory
|
|
to a zero value. If the callee is a throwing function, it would be
|
|
expected to write the error value into this argument; otherwise, it
|
|
would naturally ignore it. Of course, the caller then has to load
|
|
from memory to see whether there's an error. This would also either
|
|
consume yet another register not in the normal argument sequence or
|
|
have to be placed at the end of the argument list, making it more
|
|
likely to be passed on the stack.
|
|
|
|
The second is basically the same idea, but using a register that's
|
|
otherwise callee-save. The caller would initialize the register to a
|
|
zero value. A throwing function would write the error into it; a
|
|
non-throwing function would consider it callee-save and naturally
|
|
preserve it. It would then be extremely easy to check it for an
|
|
error. Of course, this would take away a callee-save register in the
|
|
caller when calling throwing functions. Also, if the caller itself
|
|
isn't throwing, it would have to save and restore that register.
|
|
|
|
Both solutions would allow tail calls, and the zero store could be
|
|
eliminated for direct calls to known functions that can throw. The
|
|
second is the clearly superior solution, but definitely requires more
|
|
work in the backend.
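For comparison with the `foo`/`bar` example above: released versions of Swift spell this kind of conditional throwing with the `rethrows` keyword rather than the placeholder `throwsIf(fn)`. The following sketch shows the same conversion from a non-throwing function to a throwing function type. The function bodies and the `calls` counter are assumptions for illustration, and nothing here demonstrates which physical convention is used underneath.

```swift
var calls = 0

// foo() accepts a function that is allowed to throw, and itself throws
// only if `fn` does.
func foo(_ fn: () throws -> ()) rethrows {
    try fn()
}

// bar() passes a function that can't throw where a throwing function
// type is expected; the subtyping rules of the language allow this.
// No `try` is needed because the argument is known not to throw.
func bar(_ fn: () -> ()) {
    foo(fn)
}

bar { calls += 1 }
bar { calls += 1 }
```

Under either proposed convention, the non-throwing closure would simply ignore (or preserve) the error slot, which is why no conversion thunk would be required.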
|
|
|
|
Default argument generators
|
|
---------------------------
|
|
|
|
By default, Swift is resilient about default arguments and treats them
|
|
as essentially one part of the implementation of the function. This
|
|
means that, in general, a caller using a default argument must call a
|
|
function to emit the argument, instead of simply inlining that
|
|
emission directly into the call.
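The observable consequence can be sketched at the source level. In this hypothetical example (the names `nextDefault`, `configure`, and `generatorCalls` are assumptions), the default expression is evaluated once for each call that omits the argument. The sketch says nothing about how the generator function itself is lowered.

```swift
var generatorCalls = 0

// Stands in for the expression that a default argument generator
// would evaluate on behalf of the caller.
func nextDefault() -> Int {
    generatorCalls += 1
    return 42
}

func configure(retries: Int = nextDefault()) -> Int {
    return retries
}

let first = configure()            // default used: expression evaluated
let second = configure(retries: 3) // explicit argument: not evaluated
let third = configure()            // default used again: evaluated again
```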
|
|
|
|
These default argument generation functions are unlike any other
|
|
because they have very precise information about how their result will
|
|
be used: it will be placed into a specific position in a specific
|
|
argument list. The only reason the caller would ever want to do
|
|
anything else with the result is if it needs to spill the value before
|
|
emitting the call.
|
|
|
|
Therefore, in principle, it would be really nice if it were possible
|
|
to tell these functions to return in a very specific way, e.g. to
|
|
return two values in the second and third argument registers, or to
|
|
return a value at a specific location relative to the stack pointer
|
|
(although this might be excessively constraining; it would be
|
|
reasonable to simply opt into an indirect return instead). The
|
|
function should also preserve earlier argument registers (although
|
|
this could be tricky if the default argument generator is in a generic
|
|
context and therefore needs to be passed type-argument information).
|
|
|
|
This enhancement is very easy to postpone because it doesn't affect
|
|
any basic language mechanics. The generators are always called
|
|
directly, and they're inherently attached to a declaration, so it's
|
|
quite easy to take any particular generator and compatibly enhance it
|
|
with a better convention.
|
|
|
|
ARM32
|
|
-----
|
|
|
|
Most of the platforms we support have pretty good C calling
|
|
conventions. The exceptions are i386 (for the iOS simulator) and
|
|
ARM32 (for iOS). We really, really don't care about i386, but iOS on
|
|
ARM32 is still an important platform. Switching to a better physical
|
|
calling convention (only for calls from Swift to Swift, of course)
|
|
would be a major improvement.
|
|
|
|
It would be great if this were as simple as flipping a switch, but
|
|
unfortunately the obvious convention to switch to (AAPCS-VFP) has a
|
|
slightly different set of callee-save registers: iOS treats `r9` as a
|
|
scratch register. So we'd really want a variant of AAPCS-VFP that did
|
|
the same. We'd also need to make sure that SJ/LJ exceptions weren't
|
|
disturbed by this calling convention; we aren't really *supporting*
|
|
exception propagation through Swift frames, but completely breaking
|
|
propagation would be unfortunate, and we may need to be able to
|
|
*catch* exceptions.
|
|
|
|
So this would also require some amount of additional support from the
|
|
backend.
|
|
|
|
Function signature lowering
|
|
===========================
|
|
|
|
Function signatures in Swift are lowered in two phases.
|
|
|
|
Semantic lowering
|
|
-----------------
|
|
|
|
The first phase is a high-level semantic lowering, which does a number
|
|
of things:
|
|
|
|
* It determines a high-level calling convention: specifically, whether
|
|
the function must match the C calling convention or the Swift
|
|
calling convention.
|
|
|
|
* It decides the types of the parameters:
|
|
|
|
* Functions exported for the purposes of C or Objective-C may need
|
|
to use bridged types rather than Swift's native types. For
|
|
example, a function that formally returns Swift's `String` type
|
|
may be bridged to return an `NSString` reference instead.
|
|
|
|
* Functions which are values, not simply immediately called, may
|
|
need their types lowered to match a specific generic
|
|
abstraction pattern. This applies to functions that are
|
|
parameters or results of the outer function signature.
|
|
|
|
* It identifies specific arguments and results which *must* be passed
|
|
indirectly:
|
|
|
|
* Some types are inherently address-only:
|
|
|
|
* The address of a weak reference must be registered with the
|
|
runtime at all times; therefore, any `struct` with a weak field
|
|
must always be passed indirectly.
|
|
|
|
* An existential type (if not class-bounded) may contain an
|
|
inherently address-only value, or its layout may be sensitive to
|
|
its current address.
|
|
|
|
* A value type containing an inherently address-only type as a
|
|
field or case payload becomes itself inherently address-only.
|
|
|
|
* Some types must be treated as address-only because their layout is
|
|
not known statically:
|
|
|
|
* The layout of a resilient value type may change in a later
|
|
release; the type may even become inherently address-only by
|
|
adding a weak reference.
|
|
|
|
* In a generic context, the layout of a type may be dependent on a
|
|
type parameter. The type parameter might even be inherently
|
|
address-only at runtime.
|
|
|
|
* A value type containing a type whose layout isn't known
|
|
statically itself generally will not have a layout that can be
|
|
known statically.
|
|
|
|
* Other types must be passed or returned indirectly because the
|
|
function type uses an abstraction pattern that requires it. For
|
|
example, a generic `map` function expects a function that takes a
|
|
`T` and returns a `U`; the generic implementation of `map` will
|
|
expect these values to be passed indirectly because their layout
|
|
isn't statically known. Therefore, the signature of a function
|
|
intended to be passed as this argument must pass them indirectly,
|
|
even if they are actually known statically to be non-address-only
|
|
types like (e.g.) `Int` and `Float`.
|
|
|
|
* It expands tuples in the parameter and result types. This is done
|
|
at this level both because it is affected by abstraction patterns
|
|
and because different tuple elements may use different ownership
|
|
conventions. (This is most likely for imported APIs, where it's the
|
|
tuple elements that correspond to specific C or Objective-C parameters.)
|
|
|
|
This completely eliminates top-level tuple types from the function
|
|
signature except when they are a target of abstraction and thus are
|
|
passed indirectly. (A function with type `(Float, Int) -> Float`
|
|
can be abstracted as `(T) -> U`, where `T == (Float, Int)`.)
|
|
|
|
* It determines ownership conventions for all parameters and results.
|
|
|
|
After this phase, a function type consists of an abstract calling
|
|
convention, a list of parameters, and a list of results. A parameter
|
|
is a type, a flag for indirectness, and an ownership convention. A
|
|
result is a type, a flag for indirectness, and an ownership
|
|
convention. (Results need ownership conventions only for non-Swift
|
|
calling conventions.) Types will not be tuples unless they are
|
|
indirect.
|
|
|
|
Semantic lowering may also need to mark certain parameters and results
|
|
as special, for the purposes of the special-case physical treatments
|
|
of `self`, closure contexts, and error results.
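The output of semantic lowering can be modeled with a small data structure. This is only a sketch of the description above, not the compiler's `SILFunctionType`: every type and property name is hypothetical, ownership is reduced to two cases, and results carry an ownership convention unconditionally for simplicity.

```swift
enum AbstractConvention { case swift, c }

// "+1" and "+0" respectively, in the terminology used earlier.
enum Ownership { case owned, borrowed }

enum LoweredType {
    case named(String)
    case tuple([LoweredType])
}

struct LoweredValue {
    var type: LoweredType
    var isIndirect: Bool
    var ownership: Ownership
}

struct LoweredFunctionType {
    var convention: AbstractConvention
    var parameters: [LoweredValue]
    var results: [LoweredValue]

    // Checks the invariant that a tuple type only survives lowering
    // when the value is passed or returned indirectly.
    var isWellFormed: Bool {
        return (parameters + results).allSatisfy { value in
            if case .tuple = value.type { return value.isIndirect }
            return true
        }
    }
}

func direct(_ name: String) -> LoweredValue {
    return LoweredValue(type: .named(name), isIndirect: false, ownership: .owned)
}

// (Float, Int) -> Float with its parameter tuple expanded.
let expanded = LoweredFunctionType(
    convention: .swift,
    parameters: [direct("Float"), direct("Int")],
    results: [direct("Float")])

// The same function abstracted as (T) -> U, where T == (Float, Int).
let abstracted = LoweredFunctionType(
    convention: .swift,
    parameters: [LoweredValue(type: .tuple([.named("Float"), .named("Int")]),
                              isIndirect: true, ownership: .borrowed)],
    results: [LoweredValue(type: .named("Float"), isIndirect: true, ownership: .owned)])

// A direct tuple parameter violates the invariant.
let malformed = LoweredFunctionType(
    convention: .swift,
    parameters: [LoweredValue(type: .tuple([.named("Float"), .named("Int")]),
                              isIndirect: false, ownership: .owned)],
    results: [])
```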
|
|
|
|
Physical lowering
|
|
-----------------
|
|
|
|
The second phase of lowering translates a function type produced by
|
|
semantic lowering into a C function signature. If the function
|
|
involves a parameter or result with special physical treatment,
|
|
physical lowering initially ignores this value, then adds in the
|
|
special treatment as agreed upon with the backend.
|
|
|
|
General expansion algorithm
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Central to the operation of the physical-lowering algorithm is the
|
|
**generic expansion algorithm**. This algorithm turns any
|
|
non-address-only Swift type into a sequence of zero or more **legal
|
|
types**, where a legal type is either:
|
|
|
|
* an integer type, with a power-of-two size no larger than the maximum
|
|
integer size supported by C on the target,
|
|
|
|
* a floating-point type supported by the target, or
|
|
|
|
* a vector type supported by the target.
|
|
|
|
Obviously, this is target-specific. The target also specifies a
|
|
maximum voluntary integer size. The legal type sequence only contains
|
|
vector types or integer types larger than the maximum voluntary size
|
|
when the type was explicit in the input.
|
|
|
|
Pointers are represented as integers in the legal type sequence. We
|
|
assume there's never a reason to differentiate them in the ABI as long
|
|
as the effect of address spaces on pointer size is taken into account.
|
|
If that's not true, this algorithm should be adjusted.
|
|
|
|
The result of the algorithm also associates each legal type with an
|
|
offset. This information is sufficient to reconstruct an object in
|
|
memory from a series of values and vice-versa.
|
|
|
|
The algorithm proceeds in two steps.
|
|
|
|
Typed layouts
|
|
^^^^^^^^^^^^^
|
|
|
|
First, the type is recursively analyzed to produce a **typed layout**.
|
|
A typed layout associates ranges of bytes with either (1) a legal type
|
|
(whose storage size must match the size of the associated byte
|
|
range), (2) the special type **opaque**, or (3) the special type
|
|
**empty**. Adjacent ranges mapped to **opaque** or **empty** can be
|
|
combined.
|
|
|
|
For most of the types in Swift, this process is obvious: they either
|
|
correspond to an obvious legal type (e.g. thick metatypes are
|
|
pointer-sized integers), or to an obvious sequence of scalars
|
|
(e.g. class existentials are a sequence of pointer-sized integers).
|
|
Only a few cases remain:
|
|
|
|
* Integer types that are not legal types should be mapped as opaque.
|
|
|
|
* Vector types that are not legal types should be broken into smaller
|
|
vectors, if their size is an exact multiple of the size of a legal vector type,
|
|
or else broken into their components. (This rule may need some
|
|
tinkering.)
|
|
|
|
* Tuples and structs are mapped by merging the typed layouts of the
|
|
fields, as padded out to the extents of the aggregate with
|
|
empty-mapped ranges. Note that, if fields do not overlap, this is
|
|
equivalent to concatenating the typed layouts of the fields, in
|
|
address order, mapping internal padding to empty. Bit-fields should
|
|
map the bits they occupy to opaque.
|
|
|
|
For example, given the following struct type::
|
|
|
|
struct FlaggedPair {
|
|
var flag: Bool
|
|
var pair: (MyClass, Float)
|
|
}
|
|
|
|
If Swift performs naive, C-like layout of this structure, and this
|
|
is a 64-bit platform, typed layout is mapped as follows::
|
|
|
|
    FlaggedPair.flag := [0: i1,                        ]
    FlaggedPair.pair := [       8-15: i64, 16-19: float]
    FlaggedPair      := [0: i1, 8-15: i64, 16-19: float]
|
|
|
|
If Swift instead allocates `flag` into the spare (little-endian) low
|
|
bits of `pair.0`, the typed layout map would be::
|
|
|
|
    FlaggedPair.flag := [0: i1                   ]
    FlaggedPair.pair := [0-7: i64,    8-11: float]
    FlaggedPair      := [0-7: opaque, 8-11: float]
|
|
|
|
* Unions (imported from C) are mapped by merging the typed layouts of
|
|
the fields, as padded out to the extents of the aggregate with
|
|
empty-mapped ranges. This will often result in a fully-opaque
|
|
mapping.
|
|
|
|
* Enums are mapped by merging the typed layouts of the cases, as
|
|
padded out to the extents of the aggregate with empty-mapped ranges.
|
|
A case's typed layout consists of the typed layout of the case's
|
|
directly-stored payload (if any), merged with the typed layout for
|
|
its discriminator. We assume that checking for a discriminator
|
|
involves a series of comparisons of bits extracted from
|
|
non-overlapping ranges of the value; the typed layout of a
|
|
discriminator maps all these bits to opaque and the rest to empty.
|
|
|
|
For example, given the following enum type::
|
|
|
|
enum Sum {
|
|
case Yes(MyClass)
|
|
case No(Float)
|
|
case Maybe
|
|
}
|
|
|
|
If Swift, in its infinite wisdom, decided to lay this out
|
|
sequentially, and to use invalid pointer values for the class reference to
|
|
indicate that the other cases are present, the layout would look as
|
|
follows::
|
|
|
|
    Sum.Yes.payload       := [0-7: i64                ]
    Sum.Yes.discriminator := [0-7: opaque             ]
    Sum.Yes               := [0-7: opaque             ]
    Sum.No.payload        := [             8-11: float]
    Sum.No.discriminator  := [0-7: opaque             ]
    Sum.No                := [0-7: opaque, 8-11: float]
    Sum.Maybe             := [0-7: opaque             ]
    Sum                   := [0-7: opaque, 8-11: float]
|
|
|
|
If Swift instead chose to just use a discriminator byte, the layout
|
|
would look as follows::
|
|
|
|
    Sum.Yes.payload       := [0-7: i64             ]
    Sum.Yes.discriminator := [            8: opaque]
    Sum.Yes               := [0-7: i64,   8: opaque]
    Sum.No.payload        := [0-3: float           ]
    Sum.No.discriminator  := [            8: opaque]
    Sum.No                := [0-3: float, 8: opaque]
    Sum.Maybe             := [            8: opaque]
    Sum                   := [0-8: opaque          ]
|
|
|
|
If Swift chose to use spare low (little-endian) bits in the class
|
|
pointer, and to offset the float to make this possible, the layout
|
|
would look as follows::
|
|
|
|
    Sum.Yes.payload       := [0-7: i64             ]
    Sum.Yes.discriminator := [0: opaque            ]
    Sum.Yes               := [0-7: opaque          ]
    Sum.No.payload        := [           4-7: float]
    Sum.No.discriminator  := [0: opaque            ]
    Sum.No                := [0: opaque, 4-7: float]
    Sum.Maybe             := [0: opaque            ]
    Sum                   := [0-7: opaque          ]
|
|
|
|
The merge algorithm for typed layouts is as follows. Consider two
|
|
typed layouts `L` and `R`. A range from `L` is said to *conflict*
|
|
with a range from `R` if they intersect and they are mapped as
|
|
different non-empty types. If two ranges conflict, and either range
|
|
is mapped to a vector, replace it with mapped ranges for the vector
|
|
elements. If two ranges conflict, and neither range is mapped to a
|
|
vector, map them both to opaque, combining them with adjacent opaque
|
|
ranges as necessary. If a range is mapped to a non-empty type, and
|
|
the bytes in the range are all mapped as empty in the other map, add
|
|
that range-mapping to the other map. `L` and `R` should now match
|
|
perfectly; this is the result of the merge. Note that this algorithm
|
|
is both associative and commutative.
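The merge can be sketched as follows. This is a simplified model under stated assumptions, not the compiler's implementation: ranges within one layout are assumed not to overlap, empty ranges are represented by their absence, vector splitting is omitted, and two intersecting ranges are treated as conflicting unless they are identical. The names `Kind`, `Entry`, and `mergeLayouts` are hypothetical.

```swift
enum Kind: Equatable {
    case opaque
    case legal(String)
}

struct Entry: Equatable {
    var range: ClosedRange<Int>   // byte offsets, inclusive
    var kind: Kind
}

func mergeLayouts(_ l: [Entry], _ r: [Entry]) -> [Entry] {
    var legal: [Entry] = []
    var opaqueBytes = Set<Int>()

    // Keeps an entry as a legal type only if nothing different in the
    // other layout intersects it; otherwise its bytes become opaque.
    func absorb(_ mine: [Entry], against theirs: [Entry]) {
        for entry in mine {
            let conflicts = theirs.contains { other in
                other.range.overlaps(entry.range) && other != entry
            }
            if entry.kind == .opaque || conflicts {
                opaqueBytes.formUnion(entry.range)
            } else if !legal.contains(entry) {
                legal.append(entry)
            }
        }
    }
    absorb(l, against: r)
    absorb(r, against: l)

    // Combine adjacent opaque bytes into maximal opaque ranges.
    var result = legal
    var run: ClosedRange<Int>? = nil
    for byte in opaqueBytes.sorted() {
        if let current = run, current.upperBound + 1 == byte {
            run = current.lowerBound...byte
        } else {
            if let current = run {
                result.append(Entry(range: current, kind: .opaque))
            }
            run = byte...byte
        }
    }
    if let current = run {
        result.append(Entry(range: current, kind: .opaque))
    }
    return result.sorted { $0.range.lowerBound < $1.range.lowerBound }
}

// Sum.Yes and Sum.No from the discriminator-byte example.
let sumYes = [Entry(range: 0...7, kind: .legal("i64")),
              Entry(range: 8...8, kind: .opaque)]
let sumNo = [Entry(range: 0...3, kind: .legal("float")),
             Entry(range: 8...8, kind: .opaque)]
let sum = mergeLayouts(sumYes, sumNo)

// FlaggedPair with `flag` packed into the low bits of `pair.0`.
let flag = [Entry(range: 0...0, kind: .legal("i1"))]
let pair = [Entry(range: 0...7, kind: .legal("i64")),
            Entry(range: 8...11, kind: .legal("float"))]
let flaggedPair = mergeLayouts(flag, pair)
```

Applied to the discriminator-byte layouts of `Sum.Yes` and `Sum.No`, the sketch yields a single opaque range over bytes 0 through 8; for the packed `FlaggedPair`, it yields an opaque range over bytes 0 through 7 followed by the untouched `float`.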
|
|
|
|
Forming a legal type sequence
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Once the typed layout is constructed, it can be turned into a legal
|
|
type sequence.
|
|
|
|
Note that this transformation is sensitive to the offsets of ranges in
|
|
the complete type. It's possible that the simplifications described
|
|
here could be integrated directly into the construction of the typed
|
|
layout without changing the results, but that's not yet proven.
|
|
|
|
In all of these examples, the maximum voluntary integer size is 4
|
|
(`i32`) unless otherwise specified.
|
|
|
|
If any range is mapped as a non-empty, non-opaque type, but its start
|
|
offset is not a multiple of its natural alignment, remap it as opaque.
|
|
For these purposes, the natural alignment of an integer type is the
|
|
minimum of its size and the maximum voluntary integer size; the
|
|
natural alignment of any other type is its C ABI alignment. Combine
|
|
adjacent opaque ranges.
|
|
|
|
For example::
|
|
|
|
[1-2: i16, 4: i8, 6-7: i16] ==> [1-2: opaque, 4: i8, 6-7: i16]
|
|
|
|
If any range is mapped as an integer type that is not larger than the
|
|
maximum voluntary size, remap it as opaque. Combine adjacent opaque
|
|
ranges.
|
|
|
|
For example::
|
|
|
|
[1-2: opaque, 4: i8, 6-7: i16] ==> [1-2: opaque, 4: opaque, 6-7: opaque]
|
|
[0-3: i32, 4-11: i64, 12-13: i16] ==> [0-3: opaque, 4-11: i64, 12-13: opaque]
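The two remapping rules above can be sketched together in a single pass. This is an illustrative model only: the names `RangeKind`, `MappedRange`, and `remapToOpaque` are hypothetical, the C ABI alignment of non-integer types is supplied by the caller, and the sketch does not combine adjacent opaque ranges afterwards.

```swift
enum RangeKind: Equatable {
    case opaque
    case integer                 // size is the size of the range
    case other(alignment: Int)   // float or vector, with its C ABI alignment
}

struct MappedRange: Equatable {
    var start: Int
    var size: Int
    var kind: RangeKind
}

func remapToOpaque(_ ranges: [MappedRange], maxVoluntarySize: Int) -> [MappedRange] {
    return ranges.map { (original: MappedRange) -> MappedRange in
        var range = original
        switch range.kind {
        case .opaque:
            break
        case .integer:
            // Natural alignment of an integer: the minimum of its size
            // and the maximum voluntary integer size.
            let alignment = min(range.size, maxVoluntarySize)
            let misaligned = range.start % alignment != 0
            let tooSmall = range.size <= maxVoluntarySize
            if misaligned || tooSmall {
                range.kind = .opaque
            }
        case .other(let alignment):
            if range.start % alignment != 0 {
                range.kind = .opaque
            }
        }
        return range
    }
}

// [0-3: i32, 4-11: i64, 12-13: i16] with a maximum voluntary size of 4.
let remapped = remapToOpaque([
    MappedRange(start: 0, size: 4, kind: .integer),
    MappedRange(start: 4, size: 8, kind: .integer),
    MappedRange(start: 12, size: 2, kind: .integer),
], maxVoluntarySize: 4)
```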
|
|
|
|
An *aligned storage unit* is an N-byte-aligned range of N bytes, where
|
|
N is a power of 2 no greater than the maximum voluntary integer size.
|
|
A *maximal* aligned storage unit has a size equal to the maximum
|
|
voluntary integer size.
|
|
|
|
Note that any remaining ranges mapped as integers must fully occupy
|
|
multiple maximal aligned storage units.
|
|
|
|
Split all opaque ranges at the boundaries of maximal aligned storage
|
|
units. From this point on, never combine adjacent opaque ranges
|
|
across these boundaries.
|
|
|
|
For example::
|
|
|
|
[1-6: opaque] ==> [1-3: opaque, 4-6: opaque]
|
|
|
|
Within each maximal aligned storage unit, find the smallest aligned
|
|
storage unit which contains all the opaque ranges. Replace the first
|
|
opaque range in the maximal aligned storage unit with a mapping from
|
|
that aligned storage unit to an integer of the aligned storage unit's
|
|
size. Remove any other opaque ranges in the maximal aligned storage
|
|
unit. Note that this can create overlapping ranges in some cases.
|
|
For the purposes of this calculation, the last maximal aligned
|
|
storage unit should be considered "full", as if the type had an
|
|
infinite amount of empty tail-padding.
|
|
|
|
For example::
|
|
|
|
[1-2: opaque] ==> [0-3: i32]
|
|
[0-1: opaque] ==> [0-1: i16]
|
|
[0: opaque, 2: opaque] ==> [0-3: i32]
|
|
[0-9: fp80, 10: opaque] ==> [0-9: fp80, 10: i8]
|
|
|
|
// If maximum voluntary size is 8 (i64):
|
|
[0-9: fp80, 11: opaque, 13: opaque] ==> [0-9: fp80, 8-15: i64]
|
|
|
|
(This assumes that `fp80` is a legal type for illustrative purposes.
|
|
It would probably be a better policy for the actual x86-64 target to
|
|
consider it illegal and treat it as opaque from the start, at least
|
|
when lowering for the Swift calling convention; for C, it is important
|
|
to produce an `fp80` mapping for ABI interoperation with C functions
|
|
that take or return `long double` by value.)
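The treatment of opaque ranges (splitting at maximal aligned storage units, then choosing the smallest aligned storage unit that covers what remains in each) can be sketched as below. The names `LegalInteger` and `integers(coveringOpaque:maxVoluntarySize:)` are hypothetical, the maximum voluntary size is assumed to be a power of two, and the sketch handles only the opaque ranges, leaving other legal types aside.

```swift
struct LegalInteger: Equatable {
    var offset: Int
    var size: Int   // in bytes
}

func integers(coveringOpaque ranges: [ClosedRange<Int>],
              maxVoluntarySize: Int) -> [LegalInteger] {
    // Record the lowest and highest opaque byte in each maximal
    // aligned storage unit; this implicitly splits ranges at unit
    // boundaries.
    var extents: [Int: (low: Int, high: Int)] = [:]
    for range in ranges {
        for byte in range {
            let unit = byte / maxVoluntarySize
            if let extent = extents[unit] {
                extents[unit] = (min(extent.low, byte), max(extent.high, byte))
            } else {
                extents[unit] = (byte, byte)
            }
        }
    }

    var result: [LegalInteger] = []
    for unit in extents.keys.sorted() {
        let (low, high) = extents[unit]!
        // Grow the candidate unit until both ends fall inside the same
        // size-aligned block; this stops at maxVoluntarySize at the latest.
        var size = 1
        while low / size != high / size {
            size *= 2
        }
        result.append(LegalInteger(offset: low / size * size, size: size))
    }
    return result
}

let splitExample = integers(coveringOpaque: [1...6], maxVoluntarySize: 4)
let wideExample = integers(coveringOpaque: [11...11, 13...13], maxVoluntarySize: 8)
```

Because the sketch never consults the size of the enclosing type, it matches the rule that the last maximal aligned storage unit is treated as full.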
|
|
|
|
The final legal type sequence is the sequence of types for the
|
|
non-empty ranges in the map. The associated offset for each type is
|
|
the offset of the start of the corresponding range.
|
|
|
|
Only the final step can introduce overlapping ranges, and this is only
|
|
possible if there's a non-integer legal type which:
|
|
|
|
* has a natural alignment less than half of the size of the maximum
|
|
voluntary integer size or
|
|
|
|
* has a store size that is not a multiple of half the size of the maximum
|
|
voluntary integer size.
|
|
|
|
On our supported platforms, these conditions are only true on x86-64,
|
|
and only of `long double`.
|
|
|
|
Deconstruction and Reconstruction
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Given the address of an object and a legal type sequence for its type,
|
|
it's straightforward to load a valid sequence or store the sequence
|
|
back into memory. For the most part, it's sufficient to simply load
|
|
or store each value at its appropriate offset. There are two
|
|
subtleties:
|
|
|
|
* If the legal type sequence had any overlapping ranges, the integer
|
|
values should be stored first to prevent overwriting parts of the
|
|
other values they overlap.
|
|
|
|
* Care must be taken with the final values in the sequence; integer
|
|
values may extend slightly beyond the ordinary storage size of the
|
|
argument type. This is usually easy to compensate for.
|
|
|
|
The value sequence essentially has the same semantics that the value
|
|
in memory would have: any bits that aren't part of the actual
|
|
representation of the original type have a completely unspecified
|
|
value.
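Both subtleties can be illustrated with a byte-level sketch of reconstruction. The names `LegalValue` and `store(_:into:)` are hypothetical, values are modeled as raw byte arrays rather than typed registers, and clamping to the buffer size is just one simple way to compensate for an integer that extends past the end of the object.

```swift
struct LegalValue {
    var offset: Int
    var bytes: [UInt8]
    var isInteger: Bool
}

func store(_ values: [LegalValue], into memory: inout [UInt8]) {
    // Integers go first, so that any non-integer value they overlap
    // is written afterwards and is not clobbered.
    let ordered = values.filter { $0.isInteger } + values.filter { !$0.isInteger }
    for value in ordered {
        for (index, byte) in value.bytes.enumerated() {
            let address = value.offset + index
            // Don't write beyond the storage size of the type.
            if address < memory.count {
                memory[address] = byte
            }
        }
    }
}

// A 14-byte object lowered as [0-9: fp80, 8-15: i64].
var object = [UInt8](repeating: 0, count: 14)
store([
    LegalValue(offset: 0, bytes: [UInt8](repeating: 0xAA, count: 10), isInteger: false),
    LegalValue(offset: 8, bytes: [UInt8](repeating: 0x11, count: 8), isInteger: true),
], into: &object)
```

After the call, bytes 0 through 9 hold the `fp80` value intact and bytes 10 through 13 hold the tail of the integer; the two integer bytes past the end of the object are dropped.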
|
|
|
|
Forming a C function signature
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
As mentioned before, in principle the process of physical lowering
|
|
turns a semantically-lowered Swift function type (in implementation
|
|
terms, a SILFunctionType) into a C function signature, which can then
|
|
be lowered according to the usual rules for the ABI. This is, in
|
|
fact, what we do when trying to match a C calling convention.
|
|
However, for the native Swift calling convention, because we actively
|
|
want to use more aggressive rules for results, we instead build an
|
|
LLVM function type directly. We first construct a direct result type
|
|
that we're certain the backend knows how to interpret according to our
|
|
more aggressive desired rules, and then we use the expansion algorithm
|
|
to construct a parameter sequence consisting solely of types with
|
|
obvious ABI lowering that the backend can reliably handle. This
|
|
bypasses the need to consult Clang for our own native calling
|
|
convention.
|
|
|
|
We have this generic expansion algorithm, but it's important to
|
|
understand that the physical lowering process does not just naively
|
|
use the results of this algorithm. The expansion algorithm will
|
|
happily expand an arbitrary structure; if that structure is very
|
|
large, the algorithm might turn it into hundreds of values. It would
|
|
be foolish to pass it as an argument that way; it would use up all the
|
|
argument registers and basically turn into a very inefficient memcpy,
|
|
and if the caller wanted it all in one place, they'd have to very
|
|
painstakingly reassemble. It's much better to pass large structures
|
|
indirectly. And with result values, we really just don't have a
|
|
choice; there's only so many registers you can use before you have to
|
|
give up and return indirectly. Therefore, even in the Swift native
|
|
convention, the expansion algorithm is basically used as a first pass.
|
|
A second pass then decides whether the expanded sequence is actually
|
|
reasonable to pass directly.
|
|
|
|
Recall that one aspect of the semantically-lowered Swift function type
|
|
is whether we should be matching the C calling convention or not. The
|
|
following algorithm here assumes that the importer and semantic
|
|
lowering have conspired in a very particular way to make that
|
|
possible. Specifically, we assume that an imported C function
|
|
type, lowered semantically by Swift, will follow some simple
|
|
structural rules:
|
|
|
|
* If there was a by-value `struct` or `union` parameter or result in
|
|
the imported C type, it will correspond to a by-value direct
|
|
parameter or return type in Swift, and the Swift type will be a
|
|
nominal type whose declaration links back to the original C
|
|
declaration.
|
|
|
|
* Any other parameter or result will be transformed by the importer
|
|
and semantic lowering to a type that the generic expansion algorithm
|
|
will expand to a single legal type whose representation is
|
|
ABI-compatible with the original parameter. For example, an
|
|
imported pointer type will eventually expand to an integer of
|
|
pointer size.
|
|
|
|
* There will be at most one result in the lowered Swift type, and it
|
|
will be direct.
|
|
|
|
Given this, we go about lowering the function type as follows. Recall
|
|
that, when matching the C calling convention, we're building a C
|
|
function type; but that when matching the Swift native calling
|
|
convention, we're building an LLVM function type directly.
|
|
|
|
Results
|
|
^^^^^^^
|
|
|
|
The first step is to consider the results of the function.
|
|
|
|
There's a different set of rules here when we're matching the C
|
|
calling convention. If there's a single direct result type, and it's
|
|
a nominal type imported from Clang, then the result type of the C
|
|
function type is that imported Clang type. Otherwise, concatenate the
|
|
legal type sequences from the direct results. If this yields an empty
|
|
sequence, the result type is `void`. If it yields a single legal
|
|
type, the result type is the corresponding Clang type. No other case could
|
|
actually have come from an imported C declaration, so we don't have
|
|
any real compatibility requirements; for the convenience of
|
|
interoperation, this is handled by constructing a new C struct which
|
|
contains the corresponding Clang types for the legal type sequence as
|
|
its fields.
|
|
|
|
Otherwise, we are matching the Swift calling convention. Concatenate
|
|
the legal type sequences from all the direct results. If
|
|
target-specific logic decides that this is an acceptable collection to
|
|
return directly, construct the appropriate IR result type to convince
|
|
the backend to handle it. Otherwise, use the `void` IR result type
|
|
and return the "direct" results indirectly by passing the address of a
|
|
tuple combining the original direct results (*not* the types from the
|
|
legal type sequence).
|
|
|
|
Finally, any indirect results from the semantically-lowered function
|
|
type are simply added as pointer parameters.
|
|
|
|
Parameters
|
|
^^^^^^^^^^
|
|
|
|
After all the results are collected, it's time to collect the
|
|
parameters. This is done one at a time, from left to right, adding
|
|
parameters to our physically-lowered type.
|
|
|
|
If semantic lowering has decided that we have to pass the parameter
|
|
indirectly, we simply add a pointer to the type. This covers both
|
|
mandatory-indirect pass-by-value parameters and pass-by-reference
|
|
parameters. The latter can arise even in C and Objective-C.
|
|
|
|
Otherwise, the rules are somewhat different if we're matching the C
|
|
calling convention. If the parameter is a nominal type imported from
|
|
Clang, then we just add the imported Clang type to the Clang function
|
|
type as a parameter. Otherwise, we derive the legal type sequence for
|
|
the parameter type. Again, we should only have compatibility
|
|
requirements if the legal type sequence has a single element, but for
|
|
the convenience of interoperation, we collect the corresponding Clang
|
|
types for all of the elements of the sequence.
|
|
|
|
Finally, if we're matching the Swift calling convention, derive the
|
|
legal type sequence. If the result appears to be a reasonably small
|
|
and efficient set of parameters, add their corresponding IR types to
|
|
the function type we're building; otherwise, ignore the legal type
|
|
sequence and pass the address of the original type indirectly.
|
|
|
|
Considerations for whether a legal type sequence is reasonable to pass
|
|
directly:
|
|
|
|
* There probably ought to be a maximum size. Unless it's a single
|
|
256-bit vector, it's hard to imagine wanting to pass more than, say,
|
|
32 bytes of data as individual values. The callee may decide that
|
|
it needs to reconstruct the value for some reason, and the larger
|
|
the type gets, the more expensive this is. It may also be
|
|
reasonable for this cap to be lower on 32-bit targets, but that
|
|
might be dealt with better by the next restriction.
|
|
|
|
* There should also be a cap on the number of values. A 32-byte limit
|
|
might be reasonable for passing 4 doubles. It's probably not
|
|
reasonable for passing 8 pointers. That many values will exhaust
|
|
all the parameter registers for just a single value. 4 is probably
|
|
a reasonable cap here.
|
|
|
|
* There's no reason to require the data to be homogeneous. If a
|
|
struct contains three floats and a pointer, why force it to be
|
|
passed in memory?
|
|
|
|
When all of the parameters have been processed in this manner,
|
|
the function type is complete.
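The considerations above can be turned into a simple predicate. This is a sketch only: the limits of 32 bytes and 4 values are the tentative figures suggested in the text rather than settled policy, and the names `LegalType` and `isReasonableToPassDirectly` are hypothetical.

```swift
struct LegalType {
    var name: String
    var size: Int   // in bytes
}

func isReasonableToPassDirectly(_ sequence: [LegalType],
                                maxTotalSize: Int = 32,
                                maxValueCount: Int = 4) -> Bool {
    let totalSize = sequence.reduce(0) { $0 + $1.size }
    // Homogeneity is deliberately not required.
    return totalSize <= maxTotalSize && sequence.count <= maxValueCount
}

// Four doubles: 32 bytes in 4 values.
let fourDoubles = Array(repeating: LegalType(name: "double", size: 8), count: 4)

// Eight pointers on a 32-bit target: 32 bytes, but 8 values.
let eightPointers = Array(repeating: LegalType(name: "i32", size: 4), count: 8)

// Three floats and a pointer on a 64-bit target: heterogeneous, 20 bytes.
let mixed = Array(repeating: LegalType(name: "float", size: 4), count: 3)
    + [LegalType(name: "i64", size: 8)]
```

A sequence that fails the predicate would be ignored in favor of passing the address of the original value indirectly.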
|