[IRGen][Runtime] Add emit-into-client retain/release calls for Darwin ARM64.

This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and using the library can be enabled with the flags -Xfrontend -enable-client-retain-release.

To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.

IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force use of them on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.
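The selection rule described above can be modeled in isolation. The following is a minimal sketch, not the IRGen code itself: `RefCountConfig` and `chooseRetainEntryPoint` are hypothetical names introduced for illustration, standing in for the `EnableClientRetainRelease` option and the `HasSwiftClientRRLibrary` target flag that this change adds. The returned strings are the runtime entry point names.

```cpp
#include <string>

// Hypothetical stand-ins for IRGenOptions::EnableClientRetainRelease and
// the target's HasSwiftClientRRLibrary flag.
struct RefCountConfig {
  bool enableClientRetainRelease = false;
  bool targetHasClientRRLibrary = false;
};

enum class Atomicity { Atomic, NonAtomic };

// Returns the name of the entry point a native strong retain would call.
// The client entry point is used only for atomic retains, and only when the
// option is on AND the target has the static library.
inline std::string chooseRetainEntryPoint(Atomicity atomicity,
                                          const RefCountConfig &config) {
  if (atomicity == Atomicity::Atomic && config.targetHasClientRRLibrary &&
      config.enableClientRetainRelease)
    return "swift_retainClient";
  if (atomicity == Atomicity::Atomic)
    return "swift_retain";
  return "swift_nonatomic_retain";
}
```

Non-atomic retains never use the client entry point in this model, which matches the three-way branch in the diff: the client library only provides atomic fast paths.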

The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed insignificant savings in code size, so preserve_most seems to be a good middle ground.)
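For readers unfamiliar with the attribute, here is a small sketch of how a function can opt into preserve_most from C++. It assumes a Clang-compatible compiler; `SKETCH_REFCOUNT_CC`, `SketchObject`, and `sketchRetain` are hypothetical names, loosely modeled on the `SWIFT_REFCOUNT_CC` macro in this change. Where the attribute is unknown, the macro expands to nothing and the standard calling convention applies.

```cpp
#include <cstdint>

// Use preserve_most only where the compiler recognizes the attribute.
#if defined(__has_attribute)
#if __has_attribute(preserve_most)
#define SKETCH_REFCOUNT_CC __attribute__((preserve_most))
#endif
#endif
#ifndef SKETCH_REFCOUNT_CC
#define SKETCH_REFCOUNT_CC
#endif

// Hypothetical object with a plain (non-atomic) counter, for illustration only.
struct SketchObject {
  uint64_t strongCount = 0;
};

// With preserve_most, the callee is responsible for preserving most
// registers, so a caller can keep more live values in registers across
// the call instead of spilling them.
SKETCH_REFCOUNT_CC
SketchObject *sketchRetain(SketchObject *object) {
  if (object)
    ++object->strongCount;
  return object;
}
```

The benefit is on the caller's side: the function body is unchanged, but code around each call site has fewer registers to save and restore.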

Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).
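The mask check can be sketched in portable C++. This is a simplified model under stated assumptions, not the shipped assembly: `slowPathMask`, `tryFastRetain`, and `tryFastRelease` are hypothetical names, the refcount field is modeled as a bare 64-bit atomic, and the constants `1 << 33` and `0x8000000000000000` are taken from RetainRelease.s in this change. As in the assembly, the retain model tests the incremented value against the mask, and the release model also bails out when the strong count is already zero.

```cpp
#include <atomic>
#include <cstdint>

// One strong reference in the refcount field (STRONG_RC_ONE in the assembly).
constexpr uint64_t kStrongRCOne = uint64_t(1) << 33;
// Addend the client library applies to the mask symbol's address.
constexpr uint64_t kMaskAddend = 0x8000000000000000ull;

// The mask is the symbol's address plus the addend (modulo 2^64). An
// unresolved weak reference has address 0, which yields the default mask.
constexpr uint64_t slowPathMask(uint64_t symbolAddress) {
  return symbolAddress + kMaskAddend;
}

// Model of the retain fast path. Returns false when the caller must fall
// back to the runtime's slow path; the field is left untouched in that case.
inline bool tryFastRetain(std::atomic<uint64_t> &field, uint64_t mask) {
  uint64_t old = field.load(std::memory_order_relaxed);
  for (;;) {
    uint64_t updated = old + kStrongRCOne;
    if (updated & mask)
      return false;
    // On failure, `old` is refreshed with the current value and we retry.
    if (field.compare_exchange_weak(old, updated, std::memory_order_relaxed))
      return true;
  }
}

// Model of the release fast path. A strong count of zero means this release
// would start deallocation, so that case also goes to the slow path.
inline bool tryFastRelease(std::atomic<uint64_t> &field, uint64_t mask) {
  uint64_t old = field.load(std::memory_order_relaxed);
  for (;;) {
    if ((old & mask) ||
        static_cast<int64_t>(old) < static_cast<int64_t>(kStrongRCOne))
      return false;
    uint64_t updated = old - kStrongRCOne;
    // Release ordering on success so a later dealloc sees prior stores.
    if (field.compare_exchange_weak(old, updated, std::memory_order_release,
                                    std::memory_order_relaxed))
      return true;
  }
}
```

In this model, a symbol address of 0 gives the default mask, and an address of `0x7fffffffffffffff` gives a mask of all 1s, which sends every retain and release to the slow path.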

As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail calling swift_retain), which is a problem because it is supposed to return its parameter, or required pushing a stack frame, which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (a quick mask and an extra register to hold the original value), and it lets bridgeObjectRetain return the exact value it was passed without the extra stack frame.
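The "mask for the operation, return the original" contract can be illustrated with a short sketch. It assumes 64-bit pointers and a plain counter in place of the real refcount field; `TaggedTarget` and `retainPreservingSpareBits` are hypothetical names, and the pointer-bits literal is the one used by the ARM64 assembly in this change.

```cpp
#include <cstdint>

static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit pointers");

// Pointer bits of a BridgeObject value on ARM64, as used in RetainRelease.s.
// Everything outside this mask is a spare bit.
constexpr uintptr_t kPointerBits = 0x0ffffffffffffff8ull;

// Hypothetical object with a plain counter, for illustration only.
struct alignas(8) TaggedTarget {
  uint64_t strongCount = 0;
};

// Retains the object that `value` refers to once spare bits are masked off,
// and returns `value` exactly as it was passed, spare bits included.
inline uintptr_t retainPreservingSpareBits(uintptr_t value) {
  // Null and "negative" values are a no-op, as in the real entry point.
  if (static_cast<intptr_t>(value) <= 0)
    return value;
  auto *object = reinterpret_cast<TaggedTarget *>(value & kPointerBits);
  ++object->strongCount;
  return value;
}
```

Because the unmasked value is returned, a bridgeObjectRetain built on top of this can simply tail call it and still hand the caller back its original parameter.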

The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.

To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.

Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.

rdar://122595871
Mike Ash
2025-09-04 16:05:46 -04:00
parent cb7ddbe9ea
commit 93fae78e04
26 changed files with 706 additions and 44 deletions


@@ -271,6 +271,19 @@ option(SWIFT_BUILD_STDLIB_CXX_MODULE
"If not building stdlib, controls whether to build the Cxx module"
TRUE)
# The swiftClientRetainRelease library is currently only available for Darwin
# platforms.
if(SWIFT_HOST_VARIANT_SDK IN_LIST SWIFT_DARWIN_PLATFORMS)
# Off by default everywhere for now.
option(SWIFT_BUILD_CLIENT_RETAIN_RELEASE
"Build the swiftClientRetainRelease library"
FALSE)
else()
option(SWIFT_BUILD_CLIENT_RETAIN_RELEASE
"Build the swiftClientRetainRelease library"
FALSE)
endif()
# In many cases, the CMake build system needs to determine whether to include
# a directory, or perform other actions, based on whether the stdlib or SDK is
# being built at all -- statically or dynamically. Please note that these


@@ -548,6 +548,9 @@ public:
// Whether to emit mergeable or non-mergeable traps.
unsigned MergeableTraps : 1;
/// Enable the use of swift_retain/releaseClient functions.
unsigned EnableClientRetainRelease : 1;
/// The number of threads for multi-threaded code generation.
unsigned NumThreads = 0;
@@ -675,6 +678,7 @@ public:
EmitAsyncFramePushPopMetadata(true), EmitTypeMallocForCoroFrame(true),
AsyncFramePointerAll(false), UseProfilingMarkerThunks(false),
UseCoroCCX8664(false), UseCoroCCArm64(false), MergeableTraps(false),
EnableClientRetainRelease(false),
DebugInfoForProfiling(false), CmdArgs(),
SanitizeCoverage(llvm::SanitizerCoverageOptions()),
TypeInfoFilter(TypeInfoDumpFilter::All),


@@ -1477,6 +1477,13 @@ def mergeable_traps :
Flag<["-"], "mergeable-traps">,
HelpText<"Emit mergeable traps even in optimized builds">;
def enable_client_retain_release :
Flag<["-"], "enable-client-retain-release">,
HelpText<"Enable use of swift_retain/releaseClient functions">;
def disable_client_retain_release :
Flag<["-"], "disable-client-retain-release">,
HelpText<"Disable use of swift_retain/releaseClient functions">;
def enable_new_llvm_pass_manager :
Flag<["-"], "enable-new-llvm-pass-manager">,
HelpText<"Enable the new llvm pass manager">;


@@ -248,6 +248,17 @@ extern uintptr_t __COMPATIBILITY_LIBRARIES_CANNOT_CHECK_THE_IS_SWIFT_BIT_DIRECTL
// so changing this value is not sufficient.
#define SWIFT_DEFAULT_LLVM_CC llvm::CallingConv::C
// Define the calling convention for refcounting functions for targets where it
// differs from the standard calling convention. Currently this is only used for
// swift_retain, swift_release, and some internal helper functions that they
// call.
#if defined(__aarch64__)
#define SWIFT_REFCOUNT_CC SWIFT_CC_PreserveMost
#define SWIFT_REFCOUNT_CC_PRESERVEMOST 1
#else
#define SWIFT_REFCOUNT_CC
#endif
/// Should we use absolute function pointers instead of relative ones?
/// WebAssembly target uses it by default.
#ifndef SWIFT_COMPACT_ABSOLUTE_FUNCTION_POINTER


@@ -59,6 +59,9 @@ namespace swift {
template <typename Ret, typename Param>
Param returnTypeHelper(Ret (*)(Param)) {}
template <typename Ret, typename Param>
Param returnTypeHelper(SWIFT_REFCOUNT_CC Ret (*)(Param)) {}
#if defined(__LP64__) || defined(_LP64)
#define REGISTER_SUBSTITUTION_PREFIX ""
#define REGISTER_PREFIX "x"


@@ -137,6 +137,7 @@ HeapObject* swift_allocEmptyBox();
/// It may also prove worthwhile to have this use a custom CC
/// which preserves a larger set of registers.
SWIFT_RUNTIME_EXPORT
SWIFT_REFCOUNT_CC
HeapObject *swift_retain(HeapObject *object);
SWIFT_RUNTIME_EXPORT
@@ -173,6 +174,7 @@ bool swift_isDeallocating(HeapObject *object);
/// - maybe a variant that can assume a non-null object
/// It's unlikely that a custom CC would be beneficial here.
SWIFT_RUNTIME_EXPORT
SWIFT_REFCOUNT_CC
void swift_release(HeapObject *object);
SWIFT_RUNTIME_EXPORT


@@ -220,6 +220,22 @@ FUNCTION(NativeStrongRelease, Swift, swift_release, C_CC, AlwaysAvailable,
EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
UNKNOWN_MEMEFFECTS)
// void *swift_retainClient(void *ptr);
FUNCTION(NativeStrongRetainClient, Swift, swift_retainClient, SwiftClientRR_CC, AlwaysAvailable,
RETURNS(RefCountedPtrTy),
ARGS(RefCountedPtrTy),
ATTRS(NoUnwind, FirstParamReturned, WillReturn),
EFFECT(RuntimeEffect::RefCounting),
UNKNOWN_MEMEFFECTS)
// void swift_releaseClient(void *ptr);
FUNCTION(NativeStrongReleaseClient, Swift, swift_releaseClient, SwiftClientRR_CC, AlwaysAvailable,
RETURNS(VoidTy),
ARGS(RefCountedPtrTy),
ATTRS(NoUnwind),
EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
UNKNOWN_MEMEFFECTS)
// void *swift_retain_n(void *ptr, int32_t n);
FUNCTION(NativeStrongRetainN, Swift, swift_retain_n, C_CC, AlwaysAvailable,
RETURNS(RefCountedPtrTy),
@@ -420,6 +436,24 @@ FUNCTION(BridgeObjectStrongRelease, Swift, swift_bridgeObjectRelease,
EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
UNKNOWN_MEMEFFECTS)
// void *swift_bridgeObjectRetainClient(void *ptr);
FUNCTION(BridgeObjectStrongRetainClient, Swift, swift_bridgeObjectRetainClient,
SwiftClientRR_CC, AlwaysAvailable,
RETURNS(BridgeObjectPtrTy),
ARGS(BridgeObjectPtrTy),
ATTRS(NoUnwind, FirstParamReturned),
EFFECT(RuntimeEffect::RefCounting),
UNKNOWN_MEMEFFECTS)
// void swift_bridgeObjectReleaseClient(void *ptr);
FUNCTION(BridgeObjectStrongReleaseClient, Swift, swift_bridgeObjectReleaseClient,
SwiftClientRR_CC, AlwaysAvailable,
RETURNS(VoidTy),
ARGS(BridgeObjectPtrTy),
ATTRS(NoUnwind),
EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
UNKNOWN_MEMEFFECTS)
// void *swift_nonatomic_bridgeObjectRetain(void *ptr);
FUNCTION(NonAtomicBridgeObjectStrongRetain, Swift, swift_nonatomic_bridgeObjectRetain,
C_CC, AlwaysAvailable,


@@ -3983,6 +3983,11 @@ static bool ParseIRGenArgs(IRGenOptions &Opts, ArgList &Args,
Opts.MergeableTraps = Args.hasArg(OPT_mergeable_traps);
Opts.EnableClientRetainRelease =
Args.hasFlag(OPT_enable_client_retain_release,
OPT_disable_client_retain_release,
Opts.EnableClientRetainRelease);
Opts.EnableObjectiveCProtocolSymbolicReferences =
Args.hasFlag(OPT_enable_objective_c_protocol_symbolic_references,
OPT_disable_objective_c_protocol_symbolic_references,


@@ -997,11 +997,16 @@ void IRGenFunction::emitNativeStrongRetain(llvm::Value *value,
value = Builder.CreateBitCast(value, IGM.RefCountedPtrTy);
// Emit the call.
llvm::CallInst *call = Builder.CreateCall(
(atomicity == Atomicity::Atomic)
? IGM.getNativeStrongRetainFunctionPointer()
: IGM.getNativeNonAtomicStrongRetainFunctionPointer(),
value);
FunctionPointer function;
if (atomicity == Atomicity::Atomic &&
IGM.TargetInfo.HasSwiftClientRRLibrary &&
getOptions().EnableClientRetainRelease)
function = IGM.getNativeStrongRetainClientFunctionPointer();
else if (atomicity == Atomicity::Atomic)
function = IGM.getNativeStrongRetainFunctionPointer();
else
function = IGM.getNativeNonAtomicStrongRetainFunctionPointer();
llvm::CallInst *call = Builder.CreateCall(function, value);
call->setDoesNotThrow();
call->addParamAttr(0, llvm::Attribute::Returned);
}
@@ -1257,10 +1262,16 @@ void IRGenFunction::emitNativeStrongRelease(llvm::Value *value,
Atomicity atomicity) {
if (doesNotRequireRefCounting(value))
return;
emitUnaryRefCountCall(*this, (atomicity == Atomicity::Atomic)
? IGM.getNativeStrongReleaseFn()
: IGM.getNativeNonAtomicStrongReleaseFn(),
value);
llvm::Constant *function;
if (atomicity == Atomicity::Atomic &&
IGM.TargetInfo.HasSwiftClientRRLibrary &&
getOptions().EnableClientRetainRelease)
function = IGM.getNativeStrongReleaseClientFn();
else if (atomicity == Atomicity::Atomic)
function = IGM.getNativeStrongReleaseFn();
else
function = IGM.getNativeNonAtomicStrongReleaseFn();
emitUnaryRefCountCall(*this, function, value);
}
void IRGenFunction::emitNativeSetDeallocating(llvm::Value *value) {
@@ -1353,20 +1364,30 @@ void IRGenFunction::emitUnknownStrongRelease(llvm::Value *value,
void IRGenFunction::emitBridgeStrongRetain(llvm::Value *value,
Atomicity atomicity) {
emitUnaryRefCountCall(*this,
(atomicity == Atomicity::Atomic)
? IGM.getBridgeObjectStrongRetainFn()
: IGM.getNonAtomicBridgeObjectStrongRetainFn(),
value);
llvm::Constant *function;
if (atomicity == Atomicity::Atomic &&
IGM.TargetInfo.HasSwiftClientRRLibrary &&
getOptions().EnableClientRetainRelease)
function = IGM.getBridgeObjectStrongRetainClientFn();
else if (atomicity == Atomicity::Atomic)
function = IGM.getBridgeObjectStrongRetainFn();
else
function = IGM.getNonAtomicBridgeObjectStrongRetainFn();
emitUnaryRefCountCall(*this, function, value);
}
void IRGenFunction::emitBridgeStrongRelease(llvm::Value *value,
Atomicity atomicity) {
emitUnaryRefCountCall(*this,
(atomicity == Atomicity::Atomic)
? IGM.getBridgeObjectStrongReleaseFn()
: IGM.getNonAtomicBridgeObjectStrongReleaseFn(),
value);
llvm::Constant *function;
if (atomicity == Atomicity::Atomic &&
IGM.TargetInfo.HasSwiftClientRRLibrary &&
getOptions().EnableClientRetainRelease)
function = IGM.getBridgeObjectStrongReleaseClientFn();
else if (atomicity == Atomicity::Atomic)
function = IGM.getBridgeObjectStrongReleaseFn();
else
function = IGM.getNonAtomicBridgeObjectStrongReleaseFn();
emitUnaryRefCountCall(*this, function, value);
}
void IRGenFunction::emitErrorStrongRetain(llvm::Value *value) {


@@ -564,8 +564,9 @@ IRGenModule::IRGenModule(IRGenerator &irgen,
InvariantMetadataID = getLLVMContext().getMDKindID("invariant.load");
InvariantNode = llvm::MDNode::get(getLLVMContext(), {});
DereferenceableID = getLLVMContext().getMDKindID("dereferenceable");
C_CC = getOptions().PlatformCCallingConvention;
SwiftClientRR_CC = llvm::CallingConv::PreserveMost;
// TODO: use "tinycc" on platforms that support it
DefaultCC = SWIFT_DEFAULT_LLVM_CC;
@@ -1730,6 +1731,11 @@ void IRGenModule::addLinkLibraries() {
registerLinkLibrary(
LinkLibrary{"objc", LibraryKind::Library, /*static=*/false});
if (TargetInfo.HasSwiftClientRRLibrary &&
getOptions().EnableClientRetainRelease)
registerLinkLibrary(LinkLibrary{"swiftClientRetainRelease",
LibraryKind::Library, /*static=*/true});
// If C++ interop is enabled, add -lc++ on Darwin and -lstdc++ on linux.
// Also link with C++ bridging utility module (Cxx) and C++ stdlib overlay
// (std) if available.


@@ -921,6 +921,7 @@ public:
llvm::CallingConv::ID SwiftCC; /// swift calling convention
llvm::CallingConv::ID SwiftAsyncCC; /// swift calling convention for async
llvm::CallingConv::ID SwiftCoroCC; /// swift calling convention for callee-allocated coroutines
llvm::CallingConv::ID SwiftClientRR_CC; /// swift client retain/release calling convention
/// What kind of tail call should be used for async->async calls.
llvm::CallInst::TailCallKind AsyncTailCallKind;


@@ -71,6 +71,14 @@ static void configureARM64(IRGenModule &IGM, const llvm::Triple &triple,
// half for the kernel.
target.SwiftRetainIgnoresNegativeValues = true;
// ARM64 Darwin has swiftClientRetainRelease, but not in Embedded mode. JIT
// mode can't load the static library, so disable it there as well.
if (triple.isOSDarwin() &&
!IGM.getSwiftModule()->getASTContext().LangOpts.hasFeature(
Feature::Embedded) &&
!IGM.getOptions().UseJIT)
target.HasSwiftClientRRLibrary = true;
target.UsableSwiftAsyncContextAddrIntrinsic = true;
}


@@ -114,6 +114,9 @@ public:
/// "negative" pointer values.
bool SwiftRetainIgnoresNegativeValues = false;
/// True if the swiftClientRetainRelease static library is available.
bool HasSwiftClientRRLibrary = false;
bool UsableSwiftAsyncContextAddrIntrinsic = false;
};


@@ -64,6 +64,13 @@ inline RT_Kind classifyInstruction(const llvm::Instruction &I) {
.Case("__swift_" #TextualName, RT_ ## Name)
#include "LLVMSwift.def"
// Identify "Client" versions of reference counting entry points.
#define SWIFT_FUNC(Name, MemBehavior, TextualName) \
.Case("swift_" #TextualName "Client", RT_ ## Name)
#define SWIFT_INTERNAL_FUNC_NEVER_NONATOMIC(Name, MemBehavior, TextualName) \
.Case("__swift_" #TextualName "Client", RT_ ## Name)
#include "LLVMSwift.def"
// Support non-atomic versions of reference counting entry points.
#define SWIFT_FUNC(Name, MemBehavior, TextualName) \
.Case("swift_nonatomic_" #TextualName, RT_ ## Name)


@@ -320,6 +320,10 @@ endif()
if(SWIFT_BUILD_STDLIB)
# These must be kept in dependency order so that any referenced targets
# exist at the time we look for them in add_swift_*.
if(SWIFT_BUILD_CLIENT_RETAIN_RELEASE)
add_subdirectory(ClientRetainRelease)
endif()
add_subdirectory(runtime)
add_subdirectory(stubs)
add_subdirectory(core)
@@ -389,4 +393,3 @@ endif()
if(SWIFT_BUILD_LIBEXEC)
add_subdirectory(libexec)
endif()


@@ -0,0 +1,18 @@
add_swift_target_library(swiftClientRetainRelease
STATIC DONT_EMBED_BITCODE NOSWIFTRT
RetainRelease.s
C_COMPILE_FLAGS ${SWIFT_RUNTIME_CXX_FLAGS}
$<$<BOOL:${SWIFT_STDLIB_ENABLE_OBJC_INTEROP}>:-DSWIFT_OBJC_INTEROP=1>
LINK_FLAGS ${SWIFT_RUNTIME_LINK_FLAGS}
SWIFT_COMPILE_FLAGS ${SWIFT_STANDARD_LIBRARY_SWIFT_FLAGS}
DEPLOYMENT_VERSION_OSX ${COMPATIBILITY_MINIMUM_DEPLOYMENT_VERSION_OSX}
DEPLOYMENT_VERSION_IOS ${COMPATIBILITY_MINIMUM_DEPLOYMENT_VERSION_IOS}
DEPLOYMENT_VERSION_TVOS ${COMPATIBILITY_MINIMUM_DEPLOYMENT_VERSION_TVOS}
DEPLOYMENT_VERSION_WATCHOS ${COMPATIBILITY_MINIMUM_DEPLOYMENT_VERSION_WATCHOS}
DEPLOYMENT_VERSION_XROS ${COMPATIBILITY_MINIMUM_DEPLOYMENT_VERSION_XROS}
MACCATALYST_BUILD_FLAVOR "zippered"
INSTALL_IN_COMPONENT compiler
INSTALL_WITH_SHARED)


@@ -0,0 +1,365 @@
// We currently have an implementation for ARM64 Mach-O.
#if __arm64__ && __LP64__ && defined(__APPLE__) && defined(__MACH__)
#include "swift/ABI/System.h"
// Use the CAS instructions where available.
#if __ARM_FEATURE_ATOMICS
#define USE_CAS 1
#else
#define USE_CAS 0
#endif
// If CAS is not available, we use load/store exclusive.
#define USE_LDX_STX !USE_CAS
// Use ptrauth instructions where needed.
#if __has_feature(ptrauth_calls)
#define PTRAUTH 1
#else
#define PTRAUTH 0
#endif
// The value of 1 strong refcount in the overall refcount field.
#define STRONG_RC_ONE (1 << 33)
// The mask to apply to BridgeObject values to extract the pointer they contain.
#define BRIDGEOBJECT_POINTER_BITS (~SWIFT_ABI_ARM64_SWIFT_SPARE_BITS_MASK)
.subsections_via_symbols
.data
// The slowpath mask is in the runtime. Its "address" is the mask, with an
// offset so that it's still correct when the weak reference resolves to zero.
.weak_reference __swift_retainRelease_slowpath_mask_v1
// Grab our own copy of the slowpath mask. This mask is a value which indicates
// when we must call into the runtime slowpath. If the object's refcount field
// has any bits set that are in the mask, then we must take the slow path. The
// offset is the value it will have when the variable isn't present at runtime,
// and needs to be the correct mask for older runtimes.
//
// This variable goes into a special section so it can be located at runtime
// to override the value. Instrumentation can set it to all 1s to ensure the
// slow path is always used.
.section __DATA,__swift5_rr_mask
.align 3
LretainRelease_slowpath_mask:
.quad __swift_retainRelease_slowpath_mask_v1 + 0x8000000000000000
.text
.align 2
// Macro for conditionally emitting instructions. When `condition` is true, the
// rest of the line is emitted. When false, nothing is emitted. More readable
// shorthand for #if blocks when there's only one instruction to conditionalize.
.macro CONDITIONAL condition line:vararg
.ifb \line
.err CONDITIONAL used with no instruction
.endif
.if \condition
\line
.endif
.endmacro
// Helper macros for conditionally supporting ptrauth.
.macro maybe_pacibsp
.if PTRAUTH
pacibsp
.endif
.endmacro
.macro ret_maybe_ab
.if PTRAUTH
retab
.else
ret
.endif
.endmacro
// NOTE: we're using the preserve_most calling convention, so x9-x15 are off
// limits, in addition to the usual x19 and up. Any calls to functions that use
// the standard calling convention need to save/restore x9-x15.
.globl _swift_retain_preservemost
.weak_definition _swift_retain_preservemost
_swift_retain_preservemost:
maybe_pacibsp
stp x0, x9, [sp, #-0x50]!
stp x10, x11, [sp, #0x10]
stp x12, x13, [sp, #0x20]
stp x14, x15, [sp, #0x30]
stp fp, lr, [sp, #0x40];
add fp, sp, #0x40
// Clear the unused bits from the pointer
and x0, x0, #BRIDGEOBJECT_POINTER_BITS
bl _swift_retain
ldp fp, lr, [sp, #0x40]
ldp x14, x15, [sp, #0x30]
ldp x12, x13, [sp, #0x20]
ldp x10, x11, [sp, #0x10]
ldp x0, x9, [sp], #0x50
ret_maybe_ab
.globl _swift_release_preservemost
.weak_definition _swift_release_preservemost
_swift_release_preservemost:
maybe_pacibsp
str x9, [sp, #-0x50]!
stp x10, x11, [sp, #0x10]
stp x12, x13, [sp, #0x20]
stp x14, x15, [sp, #0x30]
stp fp, lr, [sp, #0x40];
add fp, sp, #0x40
// Clear the unused bits from the pointer
and x0, x0, #BRIDGEOBJECT_POINTER_BITS
bl _swift_release
ldp fp, lr, [sp, #0x40]
ldp x14, x15, [sp, #0x30]
ldp x12, x13, [sp, #0x20]
ldp x10, x11, [sp, #0x10]
ldr x9, [sp], #0x50
ret_maybe_ab
.private_extern _swift_bridgeObjectReleaseClient
#if SWIFT_OBJC_INTEROP
_swift_bridgeObjectReleaseClient:
tbz x0, #63, LbridgeObjectReleaseNotTagged
ret
LbridgeObjectReleaseNotTagged:
tbnz x0, #62, _bridgeObjectReleaseClientObjC
and x0, x0, 0x0ffffffffffffff8
#else
_swift_bridgeObjectReleaseClient:
and x0, x0, 0x0ffffffffffffff8
#endif
.alt_entry _swift_releaseClient
.private_extern _swift_releaseClient
_swift_releaseClient:
// RR of NULL or values with high bit set is a no-op.
cmp x0, #0
b.le Lrelease_ret
// We'll operate on the address of the refcount field, which is 8 bytes into
// the object.
add x1, x0, #8
// Load the current value in the refcount field when using CAS.
CONDITIONAL USE_CAS, \
ldr x16, [x1]
// The compare-and-swap goes back to here when it needs to retry.
Lrelease_retry:
// Get the slow path mask and see if the refcount field has any of those bits
// set.
adrp x17, LretainRelease_slowpath_mask@PAGE
ldr x17, [x17, LretainRelease_slowpath_mask@PAGEOFF]
// Load-exclusive of the current value in the refcount field when using LLSC.
// stxr does not update x16 like cas does, so this load must be inside the loop.
// ldxr/stxr are not guaranteed to make forward progress if there are memory
// accesses between them, so we need to do this after getting the mask above.
CONDITIONAL USE_LDX_STX, \
ldxr x16, [x1]
tst x16, x17
// Also check if we're releasing with a refcount of 0. That will initiate
// dealloc and requires calling the slow path. We don't try to decrement and
// then call dealloc in that case. We'll just immediately go to the slow path
// and let it take care of the entire operation.
mov x17, #STRONG_RC_ONE
ccmp x16, x17, #0x8, eq
// If the refcount value matches the slow path mask, or the strong refcount is
// zero, then go to the slow path.
b.lt Lslowpath_release
// We're good to proceed with the fast path. Compute the new value of the
// refcount field.
sub x17, x16, x17
#if USE_CAS
// Save a copy of the old value so we can determine if the CAS succeeded.
mov x2, x16
// Compare and swap the new value into the refcount field. Perform the operation
// with release memory ordering so that dealloc on another thread will see all
// stores performed on this thread prior to calling release.
casl x16, x17, [x1]
// The previous value of the refcount field is now in x16. We succeeded if that
// value is the same as the old value we had before. If we failed, retry.
cmp x2, x16
b.ne Lrelease_retry
#elif USE_LDX_STX
// Try to store the updated value.
stlxr w16, x17, [x1]
// On failure, retry.
cbnz w16, Lrelease_retry
#else
#error Either USE_CAS or USE_LDX_STX must be set.
#endif
// On success, return.
Lrelease_ret:
ret
Lslowpath_release:
CONDITIONAL USE_LDX_STX, \
clrex
b _swift_release_preservemost
.alt_entry _bridgeObjectReleaseClientObjC
_bridgeObjectReleaseClientObjC:
maybe_pacibsp
stp x0, x9, [sp, #-0x50]!
stp x10, x11, [sp, #0x10]
stp x12, x13, [sp, #0x20]
stp x14, x15, [sp, #0x30]
stp fp, lr, [sp, #0x40];
add fp, sp, #0x40
// Clear the unused bits from the pointer
and x0, x0, #BRIDGEOBJECT_POINTER_BITS
bl _objc_release
ldp fp, lr, [sp, #0x40]
ldp x14, x15, [sp, #0x30]
ldp x12, x13, [sp, #0x20]
ldp x10, x11, [sp, #0x10]
ldp x0, x9, [sp], #0x50
LbridgeObjectReleaseObjCRet:
ret_maybe_ab
.private_extern _swift_bridgeObjectRetainClient
#if SWIFT_OBJC_INTEROP
_swift_bridgeObjectRetainClient:
tbz x0, #63, LbridgeObjectRetainNotTagged
ret
LbridgeObjectRetainNotTagged:
tbnz x0, #62, _swift_bridgeObjectRetainClientObjC
.alt_entry _swift_retainClient
#else
.set _swift_bridgeObjectRetainClient, _swift_retainClient
#endif
.private_extern _swift_retainClient
_swift_retainClient:
// RR of NULL or values with high bit set is a no-op.
cmp x0, #0
b.le Lretain_ret
// Mask off spare bits that may have come in from bridgeObjectRetain. Keep the
// original value in x0 since we have to return it.
and x1, x0, 0xffffffffffffff8
// We'll operate on the address of the refcount field, which is 8 bytes into
// the object.
add x1, x1, #8
// Load the current value of the refcount field when using CAS.
CONDITIONAL USE_CAS, \
ldr x16, [x1]
Lretain_retry:
// Get the slow path mask and see if the refcount field has any of those bits
// set.
adrp x17, LretainRelease_slowpath_mask@PAGE
ldr x17, [x17, LretainRelease_slowpath_mask@PAGEOFF]
// Load-exclusive of the current value in the refcount field when using LLSC.
// stxr does not update x16 like cas does, so this load must be inside the loop.
// ldxr/stxr are not guaranteed to make forward progress if there are memory
// accesses between them, so we need to do this after getting the mask above.
CONDITIONAL USE_LDX_STX, \
ldxr x16, [x1]
// Compute a refcount field with the strong refcount incremented.
mov x3, #STRONG_RC_ONE
add x3, x16, x3
// Test the incremented value against the slowpath mask. This checks for both
// the side table case and the overflow case, as the overflow sets the high
// bit. This can't have a false negative, as clearing the bit with an overflow
// would require the refcount field to contain a side table pointer with its top
// bits set to 0x7fff, which wouldn't be a valid pointer.
tst x3, x17
b.ne Lslowpath_retain
#if USE_CAS
// Save the old value so we can check if the CAS succeeded.
mov x2, x16
// Compare and swap the new value into the refcount field. Retain can use
// relaxed memory ordering.
cas x16, x3, [x1]
// The previous value of the refcount field is now in x16. We succeeded if that
// value is the same as the old value we had before. If we failed, retry.
cmp x2, x16
b.ne Lretain_retry
#elif USE_LDX_STX
// Try to store the updated value.
stxr w16, x3, [x1]
// On failure, retry.
cbnz w16, Lretain_retry
#else
#error Either USE_CAS or USE_LDX_STX must be set.
#endif
// If we succeeded, return. Retain returns the object pointer being retained,
// which is still in x0 at this point.
Lretain_ret:
ret
Lslowpath_retain:
CONDITIONAL USE_LDX_STX, \
clrex
b _swift_retain_preservemost
.alt_entry _swift_bridgeObjectRetainClientObjC
_swift_bridgeObjectRetainClientObjC:
maybe_pacibsp
stp x0, x9, [sp, #-0x50]!
stp x10, x11, [sp, #0x10]
stp x12, x13, [sp, #0x20]
stp x14, x15, [sp, #0x30]
stp fp, lr, [sp, #0x40];
add fp, sp, #0x40
// Clear the unused bits from the pointer
and x0, x0, #BRIDGEOBJECT_POINTER_BITS
bl _objc_retain
ldp fp, lr, [sp, #0x40]
ldp x14, x15, [sp, #0x30]
ldp x12, x13, [sp, #0x20]
ldp x10, x11, [sp, #0x10]
ldp x0, x9, [sp], #0x50
ret_maybe_ab
#else
.private_extern _placeholderSymbol
.set _placeholderSymbol, 0
#endif


@@ -182,7 +182,8 @@ namespace swift {
}
// FIXME: HACK: copied from HeapObject.cpp
extern "C" SWIFT_LIBRARY_VISIBILITY SWIFT_NOINLINE SWIFT_USED void
extern "C" SWIFT_LIBRARY_VISIBILITY SWIFT_NOINLINE SWIFT_USED SWIFT_REFCOUNT_CC
void
_swift_release_dealloc(swift::HeapObject *object);
namespace swift {
@@ -714,6 +715,7 @@ class RefCounts {
// Out-of-line slow paths.
SWIFT_NOINLINE
SWIFT_REFCOUNT_CC
HeapObject *incrementSlow(RefCountBits oldbits, uint32_t inc);
SWIFT_NOINLINE
@@ -815,7 +817,7 @@ class RefCounts {
// can be directly returned from swift_retain. This makes the call to
// incrementSlow() a tail call.
SWIFT_ALWAYS_INLINE
HeapObject *increment(uint32_t inc = 1) {
HeapObject *increment(HeapObject *returning, uint32_t inc = 1) {
auto oldbits = refCounts.load(SWIFT_MEMORY_ORDER_CONSUME);
// Constant propagation will remove this in swift_retain, it should only
@@ -835,7 +837,7 @@ class RefCounts {
}
} while (!refCounts.compare_exchange_weak(oldbits, newbits,
std::memory_order_relaxed));
return getHeapObject();
return returning;
}
SWIFT_ALWAYS_INLINE
@@ -1010,6 +1012,7 @@ class RefCounts {
// First slow path of doDecrement, where the object may need to be deinited.
// Side table is handled in the second slow path, doDecrementSideTable().
template <PerformDeinit performDeinit>
SWIFT_REFCOUNT_CC
bool doDecrementSlow(RefCountBits oldbits, uint32_t dec) {
RefCountBits newbits;
@@ -1379,7 +1382,7 @@ class HeapObjectSideTableEntry {
// STRONG
void incrementStrong(uint32_t inc) {
refCounts.increment(inc);
refCounts.increment(nullptr, inc);
}
template <PerformDeinit performDeinit>


@@ -109,7 +109,9 @@
#define SWIFT_WEAK_IMPORT
#endif
#if __has_attribute(musttail)
// WASM says yes to __has_attribute(musttail) but doesn't support using it, so
// exclude WASM from SWIFT_MUSTTAIL.
#if __has_attribute(musttail) && !defined(__wasm__)
#define SWIFT_MUSTTAIL [[clang::musttail]]
#else
#define SWIFT_MUSTTAIL


@@ -447,6 +447,11 @@ if(TARGET libSwiftScan)
list(APPEND tooling_stdlib_deps libSwiftScan)
endif()
set(stdlib_link_libraries)
if(SWIFT_BUILD_CLIENT_RETAIN_RELEASE)
list(APPEND stdlib_link_libraries swiftClientRetainRelease)
endif()
add_swift_target_library(swiftCore
${SWIFT_STDLIB_LIBRARY_BUILD_TYPES}
${swiftCore_common_options}


@@ -61,6 +61,43 @@ using namespace swift;
#error "The runtime must be built with a compiler that supports swiftcall."
#endif
#if SWIFT_REFCOUNT_CC_PRESERVEMOST
// These assembly definitions support the swiftClientRetainRelease library which
// is currently implemented for ARM64 Mach-O.
#if __arm64__ && __LP64__ && defined(__APPLE__) && defined(__MACH__)
asm(R"(
// Define a mask used by ClientRetainRelease to determine when it must call into
// the runtime. The symbol's address is used as the mask, rather than its
// contents, to eliminate one load instruction when using it. This is imported
// weakly, which makes its address zero when running against older runtimes.
// ClientRetainRelease references it using an addend of 0x8000000000000000,
// which produces the appropriate mask in that case. Since the mask is still
// unchanged in this version of the runtime, we export this symbol as zero. If a
// different mask is ever needed, the address of this symbol needs to be set to
// 0x8000000000000000 less than that value so that it comes out right in
// ClientRetainRelease.
.globl __swift_retainRelease_slowpath_mask_v1
.set __swift_retainRelease_slowpath_mask_v1, 0
// Define aliases for swift_retain/release that indicate they use preservemost.
// ClientRetainRelease will reference these so that it can fall back to a
// register-preserving wrapper on older runtimes.
.globl _swift_retain_preservemost
.set _swift_retain_preservemost, _swift_retain
.globl _swift_release_preservemost
.set _swift_release_preservemost, _swift_release
// A weak definition can only be overridden by a strong definition if the
// library with the strong definition contains at least one weak definition.
// Create a placeholder weak definition here to allow that to work.
.weak_definition _swift_release_preservemost_weak_placeholder
.globl _swift_release_preservemost_weak_placeholder
_swift_release_preservemost_weak_placeholder:
.byte 0
)");
#endif
#endif
/// Returns true if the pointer passed to a native retain or release is valid.
/// If false, the operation should immediately return.
SWIFT_ALWAYS_INLINE
@@ -116,6 +153,20 @@ static HeapObject *_swift_tryRetain_(HeapObject *object)
return _ ## name ## _ args; \
} while(0)
// SWIFT_REFCOUNT_CC functions make the call to the "might be swizzled" path
// through an adapter marked noinline and with the refcount CC. This allows the
// fast path to avoid pushing a stack frame. Without this adapter, clang emits
// code that pushes a stack frame right away, then does the fast path or slow
// path.
#define CALL_IMPL_SWIFT_REFCOUNT_CC(name, args) \
do { \
if (SWIFT_UNLIKELY( \
_swift_enableSwizzlingOfAllocationAndRefCountingFunctions_forInstrumentsOnly \
.load(std::memory_order_relaxed))) \
SWIFT_MUSTTAIL return _##name##_adapter args; \
return _##name##_ args; \
} while (0)
#define CALL_IMPL_CHECK(name, args) do { \
void *fptr; \
memcpy(&fptr, (void *)&_ ## name, sizeof(fptr)); \
@@ -421,26 +472,48 @@ HeapObject *swift::swift_allocEmptyBox() {
}
// Forward-declare this, but define it after swift_release.
extern "C" SWIFT_LIBRARY_VISIBILITY SWIFT_NOINLINE SWIFT_USED void
extern "C" SWIFT_LIBRARY_VISIBILITY SWIFT_NOINLINE SWIFT_USED SWIFT_REFCOUNT_CC
void
_swift_release_dealloc(HeapObject *object);
SWIFT_ALWAYS_INLINE
static HeapObject *_swift_retain_(HeapObject *object) {
SWIFT_ALWAYS_INLINE static HeapObject *_swift_retain_(HeapObject *object) {
SWIFT_RT_TRACK_INVOCATION(object, swift_retain);
if (isValidPointerForNativeRetain(object)) {
// swift_bridgeObjectRetain might call us with a pointer that has spare bits
// set, and expects us to return that unmasked value. Mask off those bits
// for the actual increment operation.
HeapObject *masked = (HeapObject *)((uintptr_t)object &
~heap_object_abi::SwiftSpareBitsMask);
// Return the result of increment() to make the eventual call to
// incrementSlow a tail call, which avoids pushing a stack frame on the fast
// path on ARM64.
return object->refCounts.increment(1);
return masked->refCounts.increment(object, 1);
}
return object;
}
#ifdef SWIFT_STDLIB_OVERRIDABLE_RETAIN_RELEASE
SWIFT_REFCOUNT_CC
static HeapObject *_swift_retain_adapterImpl(HeapObject *object) {
HeapObject *masked =
(HeapObject *)((uintptr_t)object & ~heap_object_abi::SwiftSpareBitsMask);
_swift_retain(masked);
return object;
}
// This strange construct prevents the compiler from creating an unnecessary
// stack frame in swift_retain. A direct tail call to _swift_retain_adapterImpl
// somehow causes clang to emit a stack frame.
static HeapObject *(*SWIFT_REFCOUNT_CC volatile _swift_retain_adapter)(
HeapObject *object) = _swift_retain_adapterImpl;
#endif
HeapObject *swift::swift_retain(HeapObject *object) {
#ifdef SWIFT_THREADING_NONE
return swift_nonatomic_retain(object);
#else
CALL_IMPL(swift_retain, (object));
CALL_IMPL_SWIFT_REFCOUNT_CC(swift_retain, (object));
#endif
}
@@ -457,7 +530,7 @@ SWIFT_ALWAYS_INLINE
static HeapObject *_swift_retain_n_(HeapObject *object, uint32_t n) {
SWIFT_RT_TRACK_INVOCATION(object, swift_retain_n);
if (isValidPointerForNativeRetain(object))
object->refCounts.increment(n);
return object->refCounts.increment(object, n);
return object;
}
@@ -483,11 +556,18 @@ static void _swift_release_(HeapObject *object) {
object->refCounts.decrementAndMaybeDeinit(1);
}
#ifdef SWIFT_STDLIB_OVERRIDABLE_RETAIN_RELEASE
SWIFT_REFCOUNT_CC SWIFT_NOINLINE
static void _swift_release_adapter(HeapObject *object) {
_swift_release(object);
}
#endif
void swift::swift_release(HeapObject *object) {
#ifdef SWIFT_THREADING_NONE
swift_nonatomic_release(object);
#else
CALL_IMPL(swift_release, (object));
CALL_IMPL_SWIFT_REFCOUNT_CC(swift_release, (object));
#endif
}
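
The dispatch shape used by CALL_IMPL_SWIFT_REFCOUNT_CC can be sketched in portable C++ as follows. This is a simplified stand-in, not the runtime's code: `Object`, `overridesEnabled`, `retainOverride`, `retainAdapter`, and the call counters are hypothetical names, and the sketch omits the custom calling convention and the `musttail` attribute that the real macro relies on.

```cpp
#include <atomic>
#include <cassert>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

// Hypothetical stand-ins for runtime state.
struct Object { int refCount = 1; };
static std::atomic<bool> overridesEnabled{false};
static int fastPathCalls = 0;
static int overridableCalls = 0;

// Inline fast path: no indirect call, so the entry point can stay frameless.
static inline Object *retainFast(Object *o) {
  ++fastPathCalls;
  ++o->refCount;
  return o;
}

// Overridable implementation reached through a function pointer.
static Object *retainDefault(Object *o) {
  ++overridableCalls;
  ++o->refCount;
  return o;
}
static Object *(*retainOverride)(Object *) = retainDefault;

// Out-of-line adapter: keeps the rarely taken indirect call out of the
// entry point's body.
SKETCH_NOINLINE static Object *retainAdapter(Object *o) {
  return retainOverride(o);
}

// Entry point: a relaxed load of the flag picks the adapter or the fast path.
Object *retain(Object *o) {
  if (overridesEnabled.load(std::memory_order_relaxed))
    return retainAdapter(o);
  return retainFast(o);
}
```

Whether the entry point actually avoids a stack frame depends on the compiler and target; the sketch only shows the control flow, not the code generation guarantee.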

@@ -695,13 +695,9 @@ void *swift::swift_bridgeObjectRetain(void *object) {
#if SWIFT_OBJC_INTEROP
if (isObjCTaggedPointer(object) || isBridgeObjectTaggedPointer(object))
return object;
#endif
auto const objectRef = toPlainObject_unTagged_bridgeObject(object);
#if SWIFT_OBJC_INTEROP
if (!isNonNative_unTagged_bridgeObject(object)) {
return swift_retain(static_cast<HeapObject *>(objectRef));
return swift_retain(static_cast<HeapObject *>(object));
}
// Put the call to objc_retain in a separate function, tail-called here. This
@@ -712,10 +708,9 @@ void *swift::swift_bridgeObjectRetain(void *object) {
// bit set.
SWIFT_MUSTTAIL return objcRetainAndReturn(object);
#else
// No tail call here. When !SWIFT_OBJC_INTEROP, the value of objectRef may be
// different from that of object, e.g. on Linux ARM64.
swift_retain(static_cast<HeapObject *>(objectRef));
return object;
// swift_retain will mask off any extra bits in object, and return the
// original value, so we can tail call it here.
return swift_retain(static_cast<HeapObject *>(object));
#endif
}
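
The reason swift_bridgeObjectRetain can now tail call swift_retain is that the retain itself masks off spare bits and returns the original pointer. A minimal sketch of that contract, assuming a made-up spare-bit mask (`kSpareBitsMask`) and a toy `Object` type rather than the runtime's HeapObject:

```cpp
#include <cassert>
#include <cstdint>

// Assumed mask for illustration only; the real value is target-specific.
constexpr uintptr_t kSpareBitsMask = 0x7;

// Toy object with a plain counter in place of the runtime's refcount field.
struct Object { uint32_t refCount = 1; };

// Increments the count on the untagged object, but returns the caller's
// original (possibly tagged) pointer unchanged.
void *retainPossiblyTagged(void *tagged) {
  auto bits = reinterpret_cast<uintptr_t>(tagged);
  auto *masked = reinterpret_cast<Object *>(bits & ~kSpareBitsMask);
  ++masked->refCount;
  return tagged;
}
```

Because the return value is the untouched input, a caller holding a tagged pointer can return the callee's result directly instead of calling and then returning its own copy.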

@@ -26,8 +26,8 @@ entry(%x : $@convention(thin) () -> ()):
// CHECK-LABEL: define{{( dllexport)?}}{{( protected)?}} swiftcc { ptr, ptr } @thick_func_value(ptr %0, ptr %1) {{.*}} {
// CHECK-NEXT: entry:
// CHECK-NEXT: call ptr @swift_retain(ptr returned %1) {{#[0-9]+}}
// CHECK-NEXT: call void @swift_release(ptr %1) {{#[0-9]+}}
// CHECK-NEXT: call{{( preserve_mostcc)?}} ptr @swift_retain{{(Client)?}}(ptr returned %1) {{#[0-9]+}}
// CHECK-NEXT: call{{( preserve_mostcc)?}} void @swift_release{{(Client)?}}(ptr %1) {{#[0-9]+}}
// CHECK-NEXT: %3 = insertvalue { ptr, ptr } undef, ptr %0, 0
// CHECK-NEXT: %4 = insertvalue { ptr, ptr } %3, ptr %1, 1
// CHECK-NEXT: ret { ptr, ptr } %4

@@ -10,6 +10,9 @@ void swift_unownedRelease_n(void *, uint32_t);
void *swift_nonatomic_unownedRetain_n(void *, uint32_t);
void swift_nonatomic_unownedRelease_n(void *, uint32_t);
void *swift_bridgeObjectRetain(void *);
void swift_bridgeObjectRelease(void *);
// Wrappers so we can call these from Swift without upsetting the ARC optimizer.
void *wrapper_swift_retain_n(void *obj, uint32_t n) {
return swift_retain_n(obj, n);
@@ -43,4 +46,10 @@ void wrapper_swift_nonatomic_unownedRelease_n(void *obj, uint32_t n) {
swift_nonatomic_unownedRelease_n(obj, n);
}
void *wrapper_swift_bridgeObjectRetain(void *obj) {
return swift_bridgeObjectRetain(obj);
}
void wrapper_swift_bridgeObjectRelease(void *obj) {
swift_bridgeObjectRelease(obj);
}

@@ -0,0 +1,48 @@
// RUN: %empty-directory(%t)
//
// RUN: %target-clang -x c %S/Inputs/retain_release_wrappers.c -c -o %t/retain_release_wrappers.o
// RUN: %target-build-swift -enable-experimental-feature Extern %t/retain_release_wrappers.o %s -o %t/refcount_bridgeobject
// RUN: %target-codesign %t/refcount_bridgeobject
// RUN: %target-run %t/refcount_bridgeobject
// REQUIRES: executable_test
// REQUIRES: swift_feature_Extern
// UNSUPPORTED: use_os_stdlib
// UNSUPPORTED: back_deployment_runtime
import StdlibUnittest
// Declarations of runtime ABI refcounting functions.
@_extern(c)
func wrapper_swift_bridgeObjectRetain(_ obj: UnsafeMutableRawPointer?) -> UnsafeMutableRawPointer?
@_extern(c)
func wrapper_swift_bridgeObjectRelease(_ obj: UnsafeMutableRawPointer?)
let RefcountBridgeObjectTests = TestSuite("RefcountBridgeObject")
var didDeinit = false
class C {
deinit {
didDeinit = true
}
}
RefcountBridgeObjectTests.test("retain/release") {
do {
let obj = C()
let asInt = unsafeBitCast(obj, to: UInt.self)
// 2 is a spare bit available to BridgeObject on all current targets.
let asBridgeObject = UnsafeMutableRawPointer(bitPattern: asInt | 2)
let result = wrapper_swift_bridgeObjectRetain(asBridgeObject)
expectEqual(asBridgeObject, result)
wrapper_swift_bridgeObjectRelease(asBridgeObject)
}
expectTrue(didDeinit)
didDeinit = false
}
runAllTests()

@@ -224,4 +224,13 @@ RefcountOverflowTests.test("nonatomic_unownedRetain moderate increment") {
didDeinit = false
}
RefcountOverflowTests.test("swift_retain overflow") {
let obj = C()
let u = Unmanaged.passRetained(obj)
expectCrashLater(withMessage: "Fatal error: Object was retained too many times")
while true {
_ = u.retain()
}
}
runAllTests()