This change moves us toward the official feature, and eliminates the
extra level of "thunk" that was implied by `@_cdecl`. Amusingly, this
trips up the LLVM-level ARC optimizations, because we are trying to
perform ARC optimizations within the retain/release runtime functions.
Teach those optimization passes to leave swift_retainN et al alone.
This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and use of the library can be enabled by passing -Xfrontend -enable-client-retain-release.
To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.
IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force use of them on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.
The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed an insignificant savings in code size, so preserve_most seems to be a good middle ground.)
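For a sense of what this means at the source level, here is a minimal C sketch using Clang's preserve_most attribute (the names and signatures are illustrative; IRGen emits the convention directly rather than going through C declarations like these):
```
// Hypothetical declarations; the real entry points are declared by the
// compiler and runtime, not by user code.
__attribute__((preserve_most)) void *swift_retain_sketch(void *object);
__attribute__((preserve_most)) void swift_release_sketch(void *object);

long caller(void *object, long a, long b) {
  // Because the callee preserves most general-purpose registers, a and b can
  // stay in registers across this call; under the standard C convention they
  // would typically have to be spilled and reloaded around it.
  swift_retain_sketch(object);
  return a + b;
}
```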
Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).
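A minimal C sketch of that check (the object layout, field name, and memory ordering here are illustrative; only the mask symbol name comes from the description above):
```
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

// Absolute symbol: its *address* is the mask value, so reading it needs no
// load from memory.
extern char _swift_retainRelease_slowpath_mask_v1;

// Illustrative object layout: a metadata pointer followed by the refcount
// field.
typedef struct {
  void *metadata;
  _Atomic uint64_t refcountField;
} SketchHeapObject;

static inline bool mustTakeSlowPath(SketchHeapObject *object) {
  uint64_t mask = (uint64_t)(uintptr_t)&_swift_retainRelease_slowpath_mask_v1;
  uint64_t bits =
      atomic_load_explicit(&object->refcountField, memory_order_relaxed);
  // Any masked bit set means the embedded fast path must defer to the
  // runtime's swift_retain/swift_release instead of proceeding inline.
  return (bits & mask) != 0;
}
```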
As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail-calling swift_retain), which is problematic since bridgeObjectRetain is supposed to return its parameter, or it required pushing a stack frame, which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (just a quick mask and an extra register to save the original value) and makes bridgeObjectRetain quite a bit more efficient while still returning the exact value it was passed.
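Roughly, the new division of labor looks like this in C (the spare-bit mask and function bodies are illustrative; the real fast path is assembly):
```
#include <stdint.h>

// Illustrative spare-bit mask; the real masks are per-platform.
#define SKETCH_SPARE_BITS ((uintptr_t)0xF000000000000007ULL)

// swift_retain now strips the spare bits itself and returns its argument
// unchanged, spare bits and all.
void *swift_retain_sketch(void *value) {
  void *object = (void *)((uintptr_t)value & ~SKETCH_SPARE_BITS);
  /* ... increment object's refcount on the fast path ... */
  (void)object;
  return value;  // the exact value that was passed in
}

// bridgeObjectRetain no longer needs to mask or push a frame; it can simply
// tail-call swift_retain and forward the return value.
void *swift_bridgeObjectRetain_sketch(void *bridgeObject) {
  return swift_retain_sketch(bridgeObject);
}
```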
The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.
To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.
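Conceptually the fallback shims look like this in C, with signatures simplified to void * (only the symbol names come from the text above; the real shims live in the client library):
```
// Standard entry points, present on every runtime.
extern void *swift_retain(void *object);
extern void swift_release(void *object);

// Weak definitions: overridden when a newer runtime exports strong
// swift_retain_preservemost/swift_release_preservemost symbols, used as a
// fallback on older runtimes. Because these are preserve_most functions
// calling standard-convention callees, the compiler saves and restores the
// additional registers around the call, which is the small penalty mentioned
// above.
__attribute__((weak, preserve_most))
void *swift_retain_preservemost(void *object) {
  return swift_retain(object);
}

__attribute__((weak, preserve_most))
void swift_release_preservemost(void *object) {
  swift_release(object);
}
```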
Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.
rdar://122595871
We used to represent these just as normal LLVM functions, e.g.:
declare objc_object* @objc_retain(objc_object*)
declare void @objc_release(objc_object*)
Recently, special objc intrinsics were added to LLVM. This patch updates these
small (old) passes to use the new intrinsics.
This turned out to not be too difficult since we never create these
instructions. We only analyze them, move them, and delete them.
rdar://47852297
* Remove RegisterPreservingCC. It was unused.
* Remove DefaultCC from the runtime. The distinction between C_CC and DefaultCC
was unused and inconsistently applied. Separate C_CC and DefaultCC are
still present in the compiler.
* Remove function pointer indirection from runtime functions except those
that are used by Instruments. The remaining Instruments interface is
expected to change later due to function pointer liability.
* Remove swift_rt_ wrappers. Function pointers are an ABI liability that we
don't want, and there are better ways to get nonlazy binding if we need it.
The fully custom wrappers were only needed for RegisterPreservingCC and
for optimizing the Instruments function pointers.
For semantic ARC I need to add an endBorrow entrypoint that will be removed by
ARCContract.cpp. In the process I am doing a little bit of cleanup.
In this commit, I only use this to generate the enum RT_Kind in LLVMARCOpts.h. I
verified that the generated enum is identical using a diff tool. Further updates
will follow in subsequent commits so that each diff stays easy to review.
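Presumably this follows the usual X-macro pattern; a minimal sketch with made-up macro and entry names (not the real contents of the generator):
```
// Hypothetical X-macro listing the runtime calls the ARC passes know about.
#define SKETCH_RUNTIME_KINDS(KIND)                                             \
  KIND(Retain)                                                                 \
  KIND(Release)                                                                \
  KIND(FixLifetime)

// Expanding the list yields the enum that used to be written by hand, so the
// result can be checked against the old header with a plain diff.
enum RT_Kind {
#define KIND(Name) RT_##Name,
  SKETCH_RUNTIME_KINDS(KIND)
#undef KIND
  RT_Unknown
};
```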
Properly lower reference-counting SIL instructions that carry the nonatomic attribute into invocations of the corresponding non-atomic reference-counting runtime functions.
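For reference, each atomic entry point has a non-atomic counterpart; a C sketch of the pairing with simplified signatures (the swift_nonatomic_* names exist in the runtime, but the exact signatures and the lowering code here are illustrative):
```
// Representative subset, with signatures simplified to void *.
extern void *swift_retain(void *object);
extern void *swift_nonatomic_retain(void *object);
extern void swift_release(void *object);
extern void swift_nonatomic_release(void *object);

// Sketch of the lowering decision: a retain instruction carrying the
// nonatomic attribute selects the non-atomic entry point.
static inline void *lowerRetain(void *object, int isNonatomic) {
  return isNonatomic ? swift_nonatomic_retain(object) : swift_retain(object);
}
```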
Correct format:
```
//===--- Name of file - Description ----------------------------*- Lang -*-===//
```
Notes:
* The comment line should be exactly 80 characters.
* Padding: pad with dashes after "Description" to reach 80 characters.
* "Name of file", "Description" and "Lang" are all optional.
* If "Lang" is missing: drop the "-*-" markers.
* If there isn't enough space: drop one, two or three of the dashes before "Name of file".
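For example, a header that follows these rules for a hypothetical Foo.c would be:
```
//===--- Foo.c - Implementation of Foo ----------------------------*- C -*-===//
```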
This lets us remove `swift_fixLifetime` as a real runtime entry point. Also, avoid generating the marker at all if the LLVM ARC optimizer won't be run, as in -Onone or -disable-llvm-arc-optimizer mode.
Remove reference forwarding for some of the ARC entry points. rdar://22724641.
After this commit, swift_retain_noresult will be completely replaced by
swift_retain. The LLVMARCOpts pass is modified NOT to rewrite swift_retain to
swift_retain_noresult, since swift_retain now forwards no reference.
Swift SVN r32082
I asked that the patches be split up so I could do post-commit review.
This reverts commit r32059.
This reverts commit r32058.
This reverts commit r32056.
This reverts commit r32055.
Swift SVN r32060
Remove reference forwarding for some of the ARC entry points. rdar://22724641. After this
commit, swift_retain_noresult will be completely replaced by swift_retain, and LLVMARCOpts.cpp
will no longer canonicalize swift_retain to swift_retain_noresult, since swift_retain now
returns no reference.
Swift SVN r32058
We were never generating it and are going to transition all of the stdlib data
structures to single pointer representations.
rdar://21665665
Swift SVN r30174
This is a type that has ownership of a reference while allowing access to the
spare bits inside the pointer, but which can also safely hold an ObjC tagged pointer
reference (with no spare bits of course). It additionally blesses one
Foundation-coordinated bit with the meaning of "has swift refcounting" in order
to get a faster short-circuit to native refcounting. It supports the following
builtin operations:
- Builtin.castToBridgeObject<T>(ref: T, bits: Builtin.Word) -> Builtin.BridgeObject
Creates a BridgeObject that contains the bitwise-OR of the bit patterns of
"ref" and "bits". It is the user's responsibility to ensure "bits" doesn't
interfere with the reference identity of the resulting value. In other words,
it is undefined behavior unless:
castReferenceFromBridgeObject(castToBridgeObject(ref, bits)) === ref
This means "bits" must be zero if "ref" is a tagged pointer. If "ref" is a real
object pointer, "bits" must not have any non-spare bits set (unless they're
already set in the pointer value). The native discriminator bit may only be set
if the object is Swift-refcounted.
- Builtin.castReferenceFromBridgeObject<T>(bo: Builtin.BridgeObject) -> T
Extracts the reference from a BridgeObject.
- Builtin.castBitPatternFromBridgeObject(bo: Builtin.BridgeObject) -> Builtin.Word
Presents the bit pattern of a BridgeObject as a Word.
BridgeObject's bits are set up as follows on the various platforms:
i386, armv7:
  No ObjC tagged pointers
  Swift native refcounting flag bit: 0x0000_0001
  Other available spare bits:        0x0000_0002
x86_64:
  Reserved for ObjC tagged pointers: 0x8000_0000_0000_0001
  Swift native refcounting flag bit: 0x0000_0000_0000_0002
  Other available spare bits:        0x7F00_0000_0000_0004
arm64:
  Reserved for ObjC tagged pointers: 0x8000_0000_0000_0000
  Swift native refcounting flag bit: 0x4000_0000_0000_0000
  Other available spare bits:        0x3F00_0000_0000_0007
TODO: BridgeObject doesn't present any extra inhabitants. It ought to at least provide null as an extra inhabitant for Optional.
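As a sketch of how those bits might be consumed on x86_64 (the mask constants are copied from the table above; the helper names and extraction logic are illustrative, not the builtins' actual implementation):
```
#include <stdbool.h>
#include <stdint.h>

// x86_64 masks from the layout above.
#define BO_OBJC_TAGGED_BITS ((uintptr_t)0x8000000000000001ULL)
#define BO_NATIVE_FLAG_BIT  ((uintptr_t)0x0000000000000002ULL)
#define BO_SPARE_BITS       ((uintptr_t)0x7F00000000000004ULL)

// castToBridgeObject: bitwise-OR the reference with the caller's bits; the
// caller must uphold the round-trip invariant quoted above.
static inline uintptr_t toBridgeObject(void *ref, uintptr_t bits) {
  return (uintptr_t)ref | bits;
}

// castReferenceFromBridgeObject: a tagged pointer is stored with no spare
// bits, so it comes back unchanged; otherwise strip the spare and flag bits.
static inline void *refFromBridgeObject(uintptr_t bo) {
  if (bo & BO_OBJC_TAGGED_BITS)
    return (void *)bo;
  return (void *)(bo & ~(BO_SPARE_BITS | BO_NATIVE_FLAG_BIT));
}

// The flag bit gives a fast short-circuit to native Swift refcounting.
static inline bool usesNativeSwiftRefcounting(uintptr_t bo) {
  return (bo & BO_NATIVE_FLAG_BIT) != 0;
}
```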
Swift SVN r22880
This includes:
1. Teaching SwiftAA that swift_fixLifetime does not affect loads or stores of pointers.
2. Teaching SwiftARCExpand to remove swift_fixLifetime.
3. Teaching SwiftARCOpts that it can move retains, but not releases, over
swift_fixLifetime (see the sketch below).
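A tiny sketch of why the asymmetry in item 3 is sound (use() is a made-up helper; only the ordering matters):
```
void use(void *object);
void swift_fixLifetime(void *object);
void swift_release(void *object);

void sketch(void *object) {
  use(object);
  swift_fixLifetime(object);  // guarantees: object is still alive here
  swift_release(object);
  // Hoisting the release above swift_fixLifetime could drop the last
  // reference before the point where the marker requires the object to be
  // alive, so releases must not move over it. Moving a retain across the
  // marker never causes a premature deallocation, so it is safe.
}
```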
At -O I get the following speedups of more than 10%:
InsertionSort:  29.73%
ImageProc:      23.97%
RC4:            19.46%
PrimeNum:       18.10%
Ary:            14.09%
Ary2:           13.16%
<rdar://problem/18685662>
Swift SVN r22814
We can remove objc_retain/release calls just like Swift ones.
Results at -O (10 samples); the first five columns are before, the next five are after, and SU = min(before) / min(after):
TEST                  MIN   MAX   MEAN  SD    MED   MIN   MAX   MEAN  SD    MED   SU
ClassArrayGetter      1931  1943  1938  4     1940  1143  1158  1150  5     1153  1.68
DeltaBlue             3434  3450  3439  4     3441  2906  2932  2913  7     2911  1.18
Dictionary            5616  5933  5755  112   5748  5291  5713  5515  140   5563  1.06
PopFrontArray         103   109   104   1     104   96    99    96    1     96    1.07
PopFrontArrayGeneric  102   106   103   1     103   95    98    96    1     96    1.07
PrimeNum              4098  7552  5953  1164  6319  4496  8653  6336  1210  6639  0.91
QuickSort             6380  6415  6396  11    6398  6071  6115  6086  13    6082  1.05
Rectangles            2277  2377  2322  31    2320  1636  1682  1657  12    1655  1.39
StrCat                2811  2849  2825  12    2822  2487  2516  2501  12    2505  1.13
StringWalk            5784  5797  5790  4     5790  6124  6137  6129  4     6127  0.94
Swift SVN r22695
Teach LLVMARCOpts about swift_unknownRetain and that we can move retains across
objc_retains.
This recovers the performance we lost (see rdar below) due to making retain
sinking more aggressive.
ClassArrayGetter  581.00  488.00  92.00  18.9%
Rectangles        618.00  552.00  71.00  13.0%
rdar://18337069
Swift SVN r21999