* [Executors][Distributed] custom executors for distributed actor
* harden ordering guarantees of synthesised fields
* the issue was that a non-default actor must implement the is remote check differently
* NonDefaultDistributedActor to complete support and remote flag handling
* invoke nonDefaultDistributedActorInitialize when necessary in SILGen
* refactor inline assertion into method
* cleanup
* [Executors][Distributed] Update module version for NonDefaultDistributedActor
* Minor docs cleanup
* we solved those fixme's
* add mangling test for non-def-dist-actor
`getValue` -> `value`
`getValueOr` -> `value_or`
`hasValue` -> `has_value`
`map` -> `transform`
The old API will be deprecated in the rebranch.
To avoid merge conflicts, use the new API already in the main branch.
rdar://102362022
The variants are produced by SILGen when opaque values are enabled.
They are necessary because otherwise SILGen would produce
address_to_pointer of values.
They will be lowered by AddressLowering.
This is a dedicated instruction for incrementing a
profiler counter, which lowers to the
`llvm.instrprof.increment` intrinsic. This
replaces the builtin instruction that was
previously used, and ensures that its arguments
are statically known. This ensures that SIL
optimization passes do not invalidate the
instruction, fixing some code coverage cases in
`-O`.
rdar://39146527
By using the keyword instead of the function, we actually get a much simpler
implementation since we avoid all of the machinery of SILGenApply. Given that we
are going down that path, I am removing the old builtin implementation since it
is dead code.
The reason why I am removing this now is that in a subsequent commit, I want to
move all of the ownership checking passes to run /before/ mandatory inlining. I
originally placed the passes after mandatory inlining since the function version
of the move keyword was transparent and needing to be inlined before we could
process it. Since we use the keyword now, that is no longer an issue.
The new intrinsic, exposed via static functions on Task<T, Never> and
Task<T, Error> (rethrowing), begins an asynchronous context within a
synchronous caller's context. This is only available for use under the
task-to-thread concurrency model, and even then only under SPI.
The key thing is that the move checker will not consider the explicit copy value
to be a copy_value that can be rewritten, ensuring that any uses of the result
of the explicit copy_value (consuming or other wise) are not checked.
Similar to the _move operator I recently introduced, this is a transparent
function so we can perform one level of specialization and thus at least be
generic over all concrete types.
This patch introduces a new stdlib function called _move:
```Swift
@_alwaysEmitIntoClient
@_transparent
@_semantics("lifetimemanagement.move")
public func _move<T>(_ value: __owned T) -> T {
#if $ExperimentalMoveOnly
Builtin.move(value)
#else
value
#endif
}
```
It is a first attempt at creating a "move" function for Swift, albeit a skleton
one since we do not yet perform the "no use after move" analysis. But this at
leasts gets the skeleton into place so we can built the analysis on top of it
and churn tree in a manageable way. Thus in its current incarnation, all it does
is take in an __owned +1 parameter and returns it after moving it through
Builtin.move.
Given that we want to use an OSSA based analysis for our "no use after move"
analysis and we do not have opaque values yet, we can not supporting moving
generic values since they are address only. This has stymied us in the past from
creating this function. With the implementation in this PR via a bit of
cleverness, we are now able to support this as a generic function over all
concrete types by being a little clever.
The trick is that when we transparent inline _move (to get the builtin), we
perform one level of specialization causing the inlined Builtin.move to be of a
loadable type. If after transparent inlining, we inline builtin "move" into a
context where it is still address only, we emit a diagnostic telling the user
that they applied move to a generic or existential and that this is not yet
supported.
The reason why we are taking this approach is that we wish to use this to
implement a new (as yet unwritten) diagnostic pass that verifies that _move
(even for non-trivial copyable values) ends the lifetime of the value. This will
ensure that one can write the following code to reliably end the lifetime of a
let binding in Swift:
```Swift
let x = Klass()
let _ = _move(x)
// hypotheticalUse(x)
```
Without the diagnostic pass, if one were to write another hypothetical use of x
after the _move, the compiler would copy x to at least hypotheticalUse(x)
meaning the lifetime of x would not end at the _move, =><=.
So to implement this diagnostic pass, we want to use the OSSA infrastructure and
that only works on objects! So how do we square this circle: by taking advantage
of the mandatory SIL optimzier pipeline! Specifically we take advantage of the
following:
1. Mandatory Inlining and Predictable Dead Allocation Elimination run before any
of the move only diagnostic passes that we run.
2. Mandatory Inlining is able to specialize a callee a single level when it
inlines code. One can take advantage of this to even at -Onone to
monomorphosize code.
and then note that _move is such a simple function that predictable dead
allocation elimination is able to without issue eliminate the extra alloc_stack
that appear in the caller after inlining without issue. So we (as the tests
show) get SIL that for concrete types looks exactly like we just had run a
move_value for that specific type as an object since we promote away the
stores/loads in favor of object operations when we eliminate the allocation.
In order to prevent any issue with this being used in a context where multiple
specializations may occur, I made the inliner emit a diagnostic if it inlines
_move into a function that applies it to an address only value. The diagnostic
is emitted at the source location where the function call occurs so it is easy
to find, e.x.:
```
func addressOnlyMove<T>(t: T) -> T {
_move(t) // expected-error {{move() used on a generic or existential value}}
}
moveonly_builtin_generic_failure.swift:12:5: error: move() used on a generic or existential value
_move(t)
^
```
To eliminate any potential ABI impact, if someone calls _move in a way that
causes it to be used in a context where the transparent inliner will not inline
it, I taught IRGen that Builtin.move is equivalent to a take from src -> dst and
marked _move as always emit into client (AEIC). I also took advantage of the
feature flag I added in the previous commit in order to prevent any cond_fails
from exposing Builtin.move in the stdlib. If one does not pass in the flag
-enable-experimental-move-only then the function just returns the value without
calling Builtin.move, so we are safe.
rdar://83957028
Adds two new IRGen-level builtins (one for allocating, the other for deallocating), a stdlib shim function for enhanced stack-promotion heuristics, and the proposed public stdlib functions.
Change the code generation patterns for `async let` bindings to use an ABI based on the following
functions:
- `swift_asyncLet_begin`, which starts an `async let` child task, but which additionally
now associates the `async let` with a caller-owned buffer to receive the result of the task.
This is intended to allow the task to emplace its result in caller-owned memory, allowing the
child task to be deallocated after completion without invalidating the result buffer.
- `swift_asyncLet_get[_throwing]`, which replaces `swift_asyncLet_wait[_throwing]`. Instead of
returning a copy of the value, this entry point concerns itself with populating the local buffer.
If the buffer hasn't been populated, then it awaits completion of the task and emplaces the
result in the buffer; otherwise, it simply returns. The caller can then read the result out of
its owned memory. These entry points are intended to be used before every read from the
`async let` binding, after which point the local buffer is guaranteed to contain an initialized
value.
- `swift_asyncLet_finish`, which replaces `swift_asyncLet_end`. Unlike `_end`, this variant
is async and will suspend the parent task after cancelling the child to ensure it finishes
before cleaning up. The local buffer will also be deinitialized if necessary. This is intended
to be used on exit from an `async let` scope, to handle cleaning up the local buffer if necessary
as well as cancelling, awaiting, and deallocating the child task.
- `swift_asyncLet_consume[_throwing]`, which combines `get` and `finish`. This will await completion
of the task, leaving the result value in the result buffer (or propagating the error, if it
throws), while destroying and deallocating the child task. This is intended as an optimization
for reading `async let` variables that are read exactly once by their parent task.
To avoid an epoch break with existing swiftinterfaces and ABI clients, the old builtins and entry
points are kept intact for now, but SILGen now only generates code using the new interface.
This new interface fixes several issues with the old async let codegen, including use-after-free
crashes if the `async let` was never awaited, and the inability to read from an `async let` variable
more than once.
rdar://77855176
Rather than using group task options constructed from the Swift parts
of the _Concurrency library and passed through `createAsyncTask`'s
options, introduce a separate builtin that always takes a group. Move
the responsibility for creating the options structure into IRGen, so
we don't need to expose the TaskGroupTaskOptionRecord type in Swift.
Introduce a builtin `createAsyncTask` that maps to `swift_task_create`,
and use that for the non-group task creation operations based on the
task-creation flags. `swift_task_create` and the thin function version
`swift_task_create_f` go through the dynamically-replaceable
`swift_task_create_common`, where all of the task creation logic is
present.
While here, move copying of task locals and the initial scheduling of
the task into `swift_task_create_common`, enabling by separate flags.
* [Distributed] Initial distributed checking
* [Distributed] initial types shapes and conform to DistributedActor
* [Distributed] Require Codable params and return types
* [Distributed] initial synthesis of fields and constructors
* [Distributed] Field and initializer synthesis
* [Distributed] Codable requirement on distributed funcs; also handle <T: Codable>
* [Distributed] handle generic type params which are Codable in dist func
[Distributed] conformsToProtocol after all
* [Distributed] Implement remote flag on actors
* Implement remote flag on actors
* add test
* actor initializer that sets remote flag
[Distributed] conformances getting there
* [Distributed] dont require async throws; cleanup compile tests
* [Distributed] do not synthesize default implicit init, only our special ones
* [Distributed] properly synth inits and properties; mark actorTransport as _distributedActorIndependent
Also:
- do not synthesize default init() initializer for dist actor
* [Distributed] init(transport:) designated and typechecking
* [Distributed] dist actor initializers MUST delegate to local-init
* [Distributed] check if any ctors in delegation call init(transport:)
* [Distributed] check init(transport:) delegation through many inits; ban invoking init(resolve:using:) explicitly
* [Distributed] disable IRGen test for now
* [Distributed] Rebase cleanups
* [Concurrent] transport and address are concurrent value
* [Distributed] introduce -enable-experimental-distributed flag
* rebase adjustments again
* rebase again...
* [Distributed] distributed functions are implicitly async+throws outside the actor
* [Distributed] implicitly throwing and async distributed funcs
* remove printlns
* add more checks to implicit function test
* [Distributed] resolve initializer now marks the isRemote actor flag
* [Distributed] distributedActor_destroy invoked instead, rather than before normal
* [Distributed] Generate distributed thunk for actors
* [distributed] typechecking for _remote_ functions existing, add tests for remote funcs
* adding one XFAIL'ed task & actor lifetime test
The `executor_deinit1` test fails 100% of the time
(from what I've seen) so I thought we could track
and see when/if someone happens to fix this bug.
Also, added extra coverage for #36298 via `executor_deinit2`
* Fix a memory issue with actors in the runtime system, by @phausler
* add new test that now passes because of patch by @phausler
See previous commit in this PR.
Test is based on one from rdar://74281361
* fix all tests that require the _remote_ function stubs
* Do not infer @actorIndependent onto `let` decls
* REVERT_ME: remove some tests that hacky workarounds will fail
* another flaky test, help build toolchain
* [Distributed] experimental distributed implies experimental concurrency
* [Distributed] Allow distributed function that are not marked async or throws
* [Distributed] make attrs SIMPLE to get serialization generated
* [Distributed] ActorAddress must be Hashable
* [Distributed] Implement transport.actorReady call in local init
* cleanup after rebase
* [Distributed] add availability attributes to all distributed actor code
* cleanup - this fixed some things
* fixing up
* fixing up
* [Distributed] introduce new Distributed module
* [Distributed] diagnose when missing 'import _Distributed'
* [Distributed] make all tests import the module
* more docs on address
* [Distributed] fixup merge issues
* cleanup: remove unnecessary code for now SIMPLE attribute
* fix: fix getActorIsolationOfContext
* [Distributed] cmake: depend on _concurrency module
* fixing tests...
* Revert "another flaky test, help build toolchain"
This reverts commit 83ae6654dd.
* remove xfail
* clenup some IR and SIL tests
* cleanup
* [Distributed] fix cmake test and ScanDependencies/can_import_with_map.swift
* [Distributed] fix flags/build tests
* cleanup: use isDistributed wherever possible
* [Distributed] don't import Dispatch in tests
* dont link distributed in stdlib unittest
* trying always append distributed module
* cleanups
* [Distributed] move all tests to Distributed/ directory
* [lit] try to fix lit test discovery
* [Distributed] update tests after diagnostics for implicit async changed
* [Distributed] Disable remote func tests on Windows for now
* Review cleanups
* [Distributed] fix typo, fixes Concurrency/actor_isolation_objc.swift
* [Distributed] attributes are DistributedOnly (only)
* cleanup
* [Distributed] cleanup: rely on DistributedOnly for guarding the keyword
* Update include/swift/AST/ActorIsolation.h
Co-authored-by: Doug Gregor <dgregor@apple.com>
* introduce isAnyThunk, minor cleanup
* wip
* [Distributed] move some type checking to TypeCheckDistributed.cpp
* [TypeCheckAttr] remove extra debug info
* [Distributed/AutoDiff] fix SILDeclRef creation which caused AutoDiff issue
* cleanups
* [lit] remove json import from lit test suite, not needed after all
* [Distributed] distributed functions only in DistributedActor protocols
* [Distributed] fix flag overlap & build setting
* [Distributed] Simplify noteIsolatedActorMember to not take bool distributed param
* [Distributed] make __isRemote not public
* [Distributed] Fix availability and remove actor class tests
* [actorIndependent] do not apply actorIndependent implicitly to values where it would be illegal to apply
* [Distributed] disable tests until issue fixed
Co-authored-by: Dario Rexin <drexin@apple.com>
Co-authored-by: Kavon Farvardin <kfarvardin@apple.com>
Co-authored-by: Doug Gregor <dgregor@apple.com>
* Revert "[Distributed] disable tests until issue fixed"
This reverts commit 0a04278920.
* Revert "[Distributed] Initial `distributed` actors and functions and new module (#37109)"
This reverts commit 814ede0cf3.
* [Distributed] Initial distributed checking
* [Distributed] initial types shapes and conform to DistributedActor
* [Distributed] Require Codable params and return types
* [Distributed] initial synthesis of fields and constructors
* [Distributed] Field and initializer synthesis
* [Distributed] Codable requirement on distributed funcs; also handle <T: Codable>
* [Distributed] handle generic type params which are Codable in dist func
[Distributed] conformsToProtocol after all
* [Distributed] Implement remote flag on actors
* Implement remote flag on actors
* add test
* actor initializer that sets remote flag
[Distributed] conformances getting there
* [Distributed] dont require async throws; cleanup compile tests
* [Distributed] do not synthesize default implicit init, only our special ones
* [Distributed] properly synth inits and properties; mark actorTransport as _distributedActorIndependent
Also:
- do not synthesize default init() initializer for dist actor
* [Distributed] init(transport:) designated and typechecking
* [Distributed] dist actor initializers MUST delegate to local-init
* [Distributed] check if any ctors in delegation call init(transport:)
* [Distributed] check init(transport:) delegation through many inits; ban invoking init(resolve:using:) explicitly
* [Distributed] disable IRGen test for now
* [Distributed] Rebase cleanups
* [Concurrent] transport and address are concurrent value
* [Distributed] introduce -enable-experimental-distributed flag
* rebase adjustments again
* rebase again...
* [Distributed] distributed functions are implicitly async+throws outside the actor
* [Distributed] implicitly throwing and async distributed funcs
* remove printlns
* add more checks to implicit function test
* [Distributed] resolve initializer now marks the isRemote actor flag
* [Distributed] distributedActor_destroy invoked instead, rather than before normal
* [Distributed] Generate distributed thunk for actors
* [distributed] typechecking for _remote_ functions existing, add tests for remote funcs
* adding one XFAIL'ed task & actor lifetime test
The `executor_deinit1` test fails 100% of the time
(from what I've seen) so I thought we could track
and see when/if someone happens to fix this bug.
Also, added extra coverage for #36298 via `executor_deinit2`
* Fix a memory issue with actors in the runtime system, by @phausler
* add new test that now passes because of patch by @phausler
See previous commit in this PR.
Test is based on one from rdar://74281361
* fix all tests that require the _remote_ function stubs
* Do not infer @actorIndependent onto `let` decls
* REVERT_ME: remove some tests that hacky workarounds will fail
* another flaky test, help build toolchain
* [Distributed] experimental distributed implies experimental concurrency
* [Distributed] Allow distributed function that are not marked async or throws
* [Distributed] make attrs SIMPLE to get serialization generated
* [Distributed] ActorAddress must be Hashable
* [Distributed] Implement transport.actorReady call in local init
* cleanup after rebase
* [Distributed] add availability attributes to all distributed actor code
* cleanup - this fixed some things
* fixing up
* fixing up
* [Distributed] introduce new Distributed module
* [Distributed] diagnose when missing 'import _Distributed'
* [Distributed] make all tests import the module
* more docs on address
* [Distributed] fixup merge issues
* cleanup: remove unnecessary code for now SIMPLE attribute
* fix: fix getActorIsolationOfContext
* [Distributed] cmake: depend on _concurrency module
* fixing tests...
* Revert "another flaky test, help build toolchain"
This reverts commit 83ae6654dd.
* remove xfail
* clenup some IR and SIL tests
* cleanup
* [Distributed] fix cmake test and ScanDependencies/can_import_with_map.swift
* [Distributed] fix flags/build tests
* cleanup: use isDistributed wherever possible
* [Distributed] don't import Dispatch in tests
* dont link distributed in stdlib unittest
* trying always append distributed module
* cleanups
* [Distributed] move all tests to Distributed/ directory
* [lit] try to fix lit test discovery
* [Distributed] update tests after diagnostics for implicit async changed
* [Distributed] Disable remote func tests on Windows for now
* Review cleanups
* [Distributed] fix typo, fixes Concurrency/actor_isolation_objc.swift
* [Distributed] attributes are DistributedOnly (only)
* cleanup
* [Distributed] cleanup: rely on DistributedOnly for guarding the keyword
* Update include/swift/AST/ActorIsolation.h
Co-authored-by: Doug Gregor <dgregor@apple.com>
* introduce isAnyThunk, minor cleanup
* wip
* [Distributed] move some type checking to TypeCheckDistributed.cpp
* [TypeCheckAttr] remove extra debug info
* [Distributed/AutoDiff] fix SILDeclRef creation which caused AutoDiff issue
* cleanups
* [lit] remove json import from lit test suite, not needed after all
* [Distributed] distributed functions only in DistributedActor protocols
* [Distributed] fix flag overlap & build setting
* [Distributed] Simplify noteIsolatedActorMember to not take bool distributed param
* [Distributed] make __isRemote not public
Co-authored-by: Dario Rexin <drexin@apple.com>
Co-authored-by: Kavon Farvardin <kfarvardin@apple.com>
Co-authored-by: Doug Gregor <dgregor@apple.com>
- Introduce an UnownedSerialExecutor type into the concurrency library.
- Create a SerialExecutor protocol which allows an executor type to
change how it executes jobs.
- Add an unownedExecutor requirement to the Actor protocol.
- Change the ABI for ExecutorRef so that it stores a SerialExecutor
witness table pointer in the implementation field. This effectively
makes ExecutorRef an `unowned(unsafe) SerialExecutor`, except that
default actors are represented without a witness table pointer (just
a bit-pattern).
- Synthesize the unownedExecutor method for default actors (i.e. actors
that don't provide an unownedExecutor property).
- Make synthesized unownedExecutor properties `final`, and give them
a semantics attribute specifying that they're for default actors.
- Split `Builtin.buildSerialExecutorRef` into a few more precise
builtins. We're not using the main-actor one yet, though.
Pitch thread:
https://forums.swift.org/t/support-custom-executors-in-swift-concurrency/44425
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
This isn't _terribly_ useful as-is, because the only constant mask you can get at from Swift at present is the zeroinitializer, but even that is quite useful for optimizing the repeating: intializer on SIMD. At some future point we should wire up generating constant masks for the .even, .odd, .high and .low properties (and also eventually make shufflevector take non-constant masks in LLVM). But this is enough to be useful, so let's get it in.
The comment in LowerHopToActor explains the design here.
We want SILGen to emit hops to actors, ignoring executors,
because it's easier to fully optimize in a world where deriving
an executor is a non-trivial operation. But we also want something
prior to IRGen to lower the executor derivation because there are
useful static optimizations we can do, such as doing the derivation
exactly once on a dominance path and strength-reducing the derivation
(e.g. exploiting static knowledge that an actor is a default actor).
There are probably phase-ordering problems with doing this so late,
but hopefully they're restricted to situations like actors that
share an executor. We'll want to optimize that eventually, but
in the meantime, this unblocks the executor work.
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
Tasks shouldn't normally hog the actor context indefinitely after making a call that's bound to
that actor, since that prevents the actor from potentially taking on other jobs it needs to
be able to address. Set up SILGen so that it saves the current executor (using a new runtime
entry point) and hops back to it after every actor call, not only ones where the caller context
is also actor-bound.
The added executor hopping here also exposed a bug in the runtime implementation while processing
DefaultActor jobs, where if an actor job returned to the processing loop having already yielded
the thread back to a generic executor, we would still attempt to make the actor give up the thread
again, corrupting its state.
rdar://71905765
In their previous form, the non-`_f` variants of these entry points were unused, and IRGen
lowered the `createAsyncTask` builtins to use the `_f` variants with a large amount of caller-side
codegen to manually unpack closure values. Amid all this, it also failed to make anyone responsible
for releasing the closure context after the task completed, causing every task creation to leak.
Redo the `swift_task_create_*` entry points to accept the two words of an async closure value
directly, and unpack the closure to get its invocation entry point and initial context size
inside the runtime. (Also get rid of the non-future `swift_task_create` variant, since it's unused
and it's subtly different in a lot of hairy ways from the future forms. Better to add it later
when it's needed than to have a broken unexercised version now.)
It would be more abstractly correct if this got DI support so
that we destroy the member if the constructor terminates
abnormally, but we can get to that later.
In derivatives of loops, no longer allocate boxes for indirect case payloads. Instead, use a custom pullback context in the runtime which contains a bump-pointer allocator.
When a function contains a differentiated loop, the closure context is a `Builtin.NativeObject`, which contains a `swift::AutoDiffLinearMapContext` and a tail-allocated top-level linear map struct (which represents the linear map struct that was previously directly partial-applied into the pullback). In branching trace enums, the payloads of previously indirect cases will be allocated by `swift::AutoDiffLinearMapContext::allocate` and stored as a `Builtin.RawPointer`.
`Builtin.createAsyncTask` takes flags, an optional parent task, and an
async/throwing function to execute, and passes it along to the
`swift_task_create_f` entry point to create a new (potentially child)
task, returning the new task and its initial context.
Implement a new builtin, `cancelAsyncTask()`, to cancel the given
asynchronous task. This lowers down to a call into the runtime
operation `swift_task_cancel()`.
Use this builtin to implement Task.Handle.cancel().
Rather than produce an "unowned" result from `getCurrentAsyncTask()`,
take advantage of the fact that the task is effectively guaranteed in
the scope. Do so be returning it as "unowned", and push an
end_lifetime cleanup to end the lifetime. This eliminates unnecessary
ref-count traffic as well as introducing another use of unowned.
Approach is thanks to Michael Gottesman, bugs are mine.
TLDR: This patch introduces a new kind of builtin, "a polymorphic builtin". One
calls it like any other builtin, e.x.:
```
Builtin.generic_add(x, y)
```
but it has a contract: it must be specialized to a concrete builtin by the time
we hit Lowered SIL. In this commit, I add support for the following generic
operations:
Type | Op
------------------------
FloatOrVector |FAdd
FloatOrVector |FDiv
FloatOrVector |FMul
FloatOrVector |FRem
FloatOrVector |FSub
IntegerOrVector|AShr
IntegerOrVector|Add
IntegerOrVector|And
IntegerOrVector|ExactSDiv
IntegerOrVector|ExactUDiv
IntegerOrVector|LShr
IntegerOrVector|Mul
IntegerOrVector|Or
IntegerOrVector|SDiv
IntegerOrVector|SRem
IntegerOrVector|Shl
IntegerOrVector|Sub
IntegerOrVector|UDiv
IntegerOrVector|Xor
Integer |URem
NOTE: I only implemented support for the builtins in SIL and in SILGen. I am
going to implement the optimizer parts of this in a separate series of commits.
DISCUSSION
----------
Today there are polymorphic like instructions in LLVM-IR. Yet, at the
swift and SIL level we represent these operations instead as Builtins whose
names are resolved by splatting the builtin into the name. For example, adding
two things in LLVM:
```
%2 = add i64 %0, %1
%2 = add <2 x i64> %0, %1
%2 = add <4 x i64> %0, %1
%2 = add <8 x i64> %0, %1
```
Each of the add operations are done by the same polymorphic instruction. In
constrast, we splat out these Builtins in swift today, i.e.:
```
let x, y: Builtin.Int32
Builtin.add_Int32(x, y)
let x, y: Builtin.Vec4xInt32
Builtin.add_Vec4xInt32(x, y)
...
```
In SIL, we translate these verbatim and then IRGen just lowers them to the
appropriate polymorphic instruction. Beyond being verbose, these prevent these
Builtins (which need static types) from being used in polymorphic contexts where
we can guarantee that eventually a static type will be provided.
In contrast, the polymorphic builtins introduced in this commit can be passed
any type, with the proviso that the expert user using this feature can guarantee
that before we reach Lowered SIL, the generic_add has been eliminated. This is
enforced by IRGen asserting if passed such a builtin and by the SILVerifier
checking that the underlying builtin is never called once the module is in
Lowered SIL.
In forthcoming commits, I am going to add two optimizations that give the stdlib
tool writer the tools needed to use this builtin:
1. I am going to add an optimization to constant propagation that changes a
"generic_*" op to the type of its argument if the argument is a type that is
valid for the builtin (i.e. integer or vector).
2. I am going to teach the SILCloner how to specialize these as it inlines. This
ensures that when we transparent inline, we specialize the builtin automatically
and can then form SSA at -Onone using predictable memory access operations.
The main implication around these polymorphic builtins are that if an author is
not able to specialize the builtin, they need to ensure that after constant
propagation, the generic builtin has been DCEed. The general rules are that the
-Onone optimizer will constant fold branches with constant integer operands. So
if one can use a bool of some sort to trigger the operation, one can be
guaranteed that the code will not codegen. I am considering putting in some sort
of diagnostic to ensure that the stdlib writer has a good experience (e.x. get
an error instead of crashing the compiler).
Returns `true` if `T.Type` is known to refer to a concrete type. The
implementation allows for the optimizer to specialize this at -O and
eliminate conditional code.
Includes `Swift._isConcrete<T>(T.Type) -> Bool` wrapper function.