This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and using the library can be enabled with the flags -Xfrontend -enable-client-retain-release.
To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.
IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force use of them on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.
The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed insignificant savings in code size, so preserve_most seems to be a good middle ground.)
Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).
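The mask check described above can be sketched as follows. This is only an illustration: the mask value and the helper name are hypothetical, and the real fast path is written in assembly and reads the mask from the address of _swift_retainRelease_slowpath_mask_v1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime-provided mask. The bit pattern is
// invented for illustration and does not reflect the real refcount layout.
constexpr uint64_t kExampleSlowPathMask = 0x8000'0000'0000'0001ULL;

// Returns true when the embedded fast path may handle this refcount field,
// i.e. when no bit selected by the mask is set.
constexpr bool canUseFastPath(uint64_t refcountField, uint64_t slowPathMask) {
  return (refcountField & slowPathMask) == 0;
}

inline void slowPathMaskExample() {
  // No masked bit is set: the fast path may proceed.
  assert(canUseFastPath(0x10, kExampleSlowPathMask));
  // A masked bit is set: fall back to the runtime.
  assert(!canUseFastPath(0x8000'0000'0000'0010ULL, kExampleSlowPathMask));
  // An all-ones mask sends every non-zero field to the slow path ...
  assert(!canUseFastPath(0x10, ~0ULL));
  // ... but a field of exactly 0 would still pass, hence the caveat about
  // never using 0 as a valid refcount field value.
  assert(canUseFastPath(0, ~0ULL));
}
```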
As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail calling swift_retain) which is problematic since it's supposed to return its parameter, or it required pushing a stack frame which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (just a quick mask and an extra register for saving the original value) and makes bridgeObjectRetain quite a bit more efficient when implemented correctly to return the exact value it was passed.
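A minimal sketch of the new division of labor, assuming a made-up spare-bit mask and a toy object with a plain counter (neither matches the real runtime): the retain function strips the spare bits to find the object, but returns its argument unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical spare-bit mask; the real mask is platform-specific.
constexpr uintptr_t kExampleSpareBitsMask = 0x7;

// Toy object with a plain counter standing in for the refcount field.
// The alignment guarantees the low three address bits are zero.
struct alignas(8) ExampleObject {
  uint32_t strongCount = 1;
};

// Sketch of a retain that tolerates spare bits in its argument.
inline uintptr_t exampleRetain(uintptr_t value) {
  // Strip the spare bits to recover the real object address.
  auto *object =
      reinterpret_cast<ExampleObject *>(value & ~kExampleSpareBitsMask);
  object->strongCount += 1;
  // Return the untouched argument, spare bits included.
  return value;
}

inline void bridgeRetainExample() {
  ExampleObject object;
  uintptr_t tagged = reinterpret_cast<uintptr_t>(&object) | 0x2;
  assert(exampleRetain(tagged) == tagged);  // spare bits survive
  assert(object.strongCount == 2);          // the real object was retained
}
```

With this shape, a bridge-object retain of a native pointer can simply tail call the retain function and still return exactly the value it was passed.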
The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.
To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.
Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.
rdar://122595871
rdar://157795547
When types contain stored properties of resilient types, we instantiate their metadata at runtime. If those types are non-copyable, they won't have layout strings, so we must not set the flag.
rdar://151176697
While generic types generally have layout strings (when enabled), non-copyable types don't, so
we have to make sure the flag does not get set.
rdar://138487964
On platforms that don't have reserved bits in objc (including unknown) pointers, we use the spare bits for Swift enums, so they have to be masked out. Blocks don't have reserved bits on any platform.
rdar://138085348
Even though errors are ObjC boxes, they can't be tagged pointers and in fact may use that bit to store enum tags, so treating them like regular ObjC references here can cause ref count issues.
rdar://137066879
An unmanaged property does not map to an operation in CVW, instead it will be copied like primitive values. When instantiating the layout string, we correctly do not emit an operation, but we compute the offset to the next field as if we did. This is causing the offset to be incorrect and subsequent operations to be executed on the wrong address, causing crashes or other misbehavior.
10.50 was once greater than any real macOS version, but now it compares
less than real released versions, which makes these tests depend on the
deployment target unnecessarily. Update these tests to use even larger
numbers to hopefully keep them independent a little longer.
rdar://132501359
PowerOf2Ceil is not the correct function to use, because we end up with an empty mask if there is only one value stored in the extra tag bits.
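The empty-mask case can be reproduced with a small sketch. It assumes the mask is derived as "power of two minus one" from the number of stored values, which is a guess about the shape of the original code. The helpers are local reimplementations, and the strictly-greater variant is shown only as one possible remedy, not necessarily the one the fix uses.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two >= n, for small n >= 1 (local helper mirroring the
// contract of llvm::PowerOf2Ceil for such inputs).
constexpr uint64_t powerOf2Ceil(uint64_t n) {
  uint64_t p = 1;
  while (p < n) p <<= 1;
  return p;
}

// Smallest power of two strictly greater than n, for small n.
constexpr uint64_t nextPowerOf2(uint64_t n) {
  uint64_t p = 1;
  while (p <= n) p <<= 1;
  return p;
}

// Assumed mask derivation: power of two minus one.
constexpr uint64_t maskFromCeil(uint64_t numValues) {
  return powerOf2Ceil(numValues) - 1;
}
constexpr uint64_t maskFromNext(uint64_t numValues) {
  return nextPowerOf2(numValues) - 1;
}

// With a single value, the "ceil" variant yields an empty mask.
static_assert(maskFromCeil(1) == 0, "empty mask for one value");
// The strictly-greater variant still selects one bit.
static_assert(maskFromNext(1) == 1, "non-empty mask for one value");
```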
rdar://129627898
When casting the projectedBits to Int8, we accidentally passed the base addr instead of the projectedBits. This was causing the wrong bits to be read.
rdar://129627898
LLVM expects integer types of fractional byte sizes to have been written as those types as well, so it expects all unused bytes to be 0.
Since we are using the unused extra tag bits to store tags of outer enums, that assumption does not hold here. In regular witnesses,
the outer enum would mask out those bytes before checking the tag of the inner enum. In CVW we can't do that, so we have to apply the
mask ourselves. To guarantee the mask does not get optimized out, we have to use full bytes instead of fractionals.
rdar://127279770
When an imported C type is over or under aligned, we did not use the alignment of the type, but computed the maximum alignment of its components, causing alignment issues in compact value witnesses.
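The over-aligned case can be illustrated with a small sketch. The struct and helper names are hypothetical; the point is only that the declared alignment of a type can exceed the maximum alignment of its fields, so the two give different field offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// An over-aligned type: its alignment is declared, not derived from fields.
struct alignas(16) OverAligned {
  int32_t a;
  int8_t b;
};

// What the old computation would effectively arrive at: the maximum
// alignment of the components.
constexpr std::size_t kMaxComponentAlignment =
    std::max(alignof(int32_t), alignof(int8_t));

// Rounds offset up to the next multiple of a power-of-two alignment.
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

static_assert(alignof(OverAligned) == 16, "declared alignment wins");
static_assert(kMaxComponentAlignment < alignof(OverAligned),
              "component alignment underestimates the type's alignment");

// Offset of an OverAligned field placed after a 1-byte field:
constexpr std::size_t kCorrectOffset = alignUp(1, alignof(OverAligned));
constexpr std::size_t kWrongOffset = alignUp(1, kMaxComponentAlignment);
static_assert(kCorrectOffset != kWrongOffset, "the two layouts disagree");
```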
When an @objc @implementation class requires the use of `ClassMetadataStrategy::Update` because some of its stored properties do not have fixed sizes, we adjust the direct field offsets during class realization by emitting a custom metadata update function which calls a new entry point in the Swift runtime. That entry point adjusts field offsets like `swift_updateClassMetadata2()`, but it only assumes that the class has Objective-C metadata, not Swift metadata.
This commit introduces an alternative mechanism which does the same thing without using any Swift-only metadata. It’s a rough implementation with important limitations:
• We’re currently using the field offset vector, which means that field offsets are being emitted into @objc @implementation classes; these will be removed.
• The new Swift runtime entry point duplicates a lot of `swift_updateClassMetadata2()`’s implementation; it will be refactored into something much smaller and more compact.
• Availability bounds for this feature have not yet been implemented.
Future commits in this PR will correct these issues.
rdar://126954341
C types don't have separate size and stride, but in type layouts we always computed the size as if they did. This could cause wrong offsets in compact value witnesses.
* [Runtime] Fix CVW for generic single payload enums with no extra inhabitants
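To make the size/stride distinction concrete, here is a sketch with a hypothetical C struct. In C, sizeof already includes tail padding, so a following field starts at sizeof, not at the end of the last member. On typical ABIs the two values below are 8 and 5, but the code only relies on them being different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A C-style struct with tail padding after `b` on typical ABIs.
struct CPair {
  int32_t a;
  int8_t b;
};

// Bytes occupied by the fields alone, ignoring tail padding. This is what a
// "size separate from stride" computation would produce.
constexpr std::size_t kUnpaddedSize = offsetof(CPair, b) + sizeof(int8_t);

// Offset of a 1-byte field placed directly after a CPair:
constexpr std::size_t kOffsetUsingCSize = sizeof(CPair);        // C layout
constexpr std::size_t kOffsetUsingUnpadded = kUnpaddedSize;     // too small

static_assert(sizeof(CPair) % alignof(CPair) == 0,
              "C sizes are always a multiple of the alignment");
static_assert(kOffsetUsingUnpadded < kOffsetUsingCSize,
              "ignoring tail padding places the next field too early");
```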
rdar://126728925
When the payload of a generic SPE did not have any extra inhabitants, we erroneously always treated it as the no payload case.
Additionally the offset and skip values were improperly computed.
* Fixed FileCheck string
This change introduces a new compilation target platform to the Swift compiler - visionOS.
- Changes to the compiler build infrastructure to support building compiler-adjacent artifacts and test suites for the new target.
- Addition of the new platform kind definition.
- Support for the new platform in language constructs such as compile-time availability annotations or runtime OS version queries.
- Utilities to read out Darwin platform SDK info containing platform mapping data.
- Utilities to support re-mapping availability annotations from iOS to visionOS (e.g. 'updateIntroducedPlatformForFallback', 'updateDeprecatedPlatformForFallback', 'updateObsoletedPlatformForFallback').
- Additional tests exercising platform-specific availability handling and availability re-mapping fallback code-path.
- Changes to existing test suite to accommodate the new platform.
Fix overflow detection on unowned refcounts so that we create a side table when incrementing from 126. Implement strong refcount overflow to the side table.
The unowned refcount is never supposed to be 127, because that (sometimes) represents the immortal refcount. We attempt to detect that by checking newValue == Offsets::UnownedRefCountMask, but the mask is shifted so that condition is never true. We managed to hit the side table case when incrementing from 127, because it looks like the immortal case. But that broke when we fixed immortal side table initialization in b41079a8f54ae2d61c68cdda46c74232084af020. With that change, we now create an immortal side table when overflowing the unowned refcount, then try to increment the unowned refcount in that immortal side table, which traps.
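The broken comparison can be sketched as below. The shift and bit count are illustrative values chosen to match the 127 limit mentioned above, not the runtime's actual constants, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative field layout: a 7-bit count stored above one low flag bit.
constexpr uint32_t kUnownedShift = 1;
constexpr uint32_t kUnownedBitCount = 7;
// The mask is expressed in field position, i.e. already shifted.
constexpr uint32_t kUnownedMask = ((1u << kUnownedBitCount) - 1)
                                  << kUnownedShift;  // 0xFE

// Buggy check: compares an unshifted count against the shifted mask.
constexpr bool buggyIsOverflow(uint32_t newValue) {
  return newValue == kUnownedMask;
}

// Corrected check: compares against the maximum count value (127 here).
constexpr bool fixedIsOverflow(uint32_t newValue) {
  return newValue == (kUnownedMask >> kUnownedShift);
}

// Incrementing from 126 yields 127, which must take the side-table path.
static_assert(!buggyIsOverflow(127), "the buggy check misses the overflow");
static_assert(fixedIsOverflow(127), "the fixed check catches it");
static_assert(!fixedIsOverflow(126), "126 is still a valid inline count");
```

Since a 7-bit count never exceeds 127 and the shifted mask is 254 in this layout, the buggy condition cannot be true for any representable count.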
rdar://123788910
rdar://121868127
In compact value witnesses we need to mask the extra tag bits in case they are used to store tag bits of outer enums, so we only read the ones we are interested in.
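A minimal sketch of the masking, assuming a hypothetical layout in which the inner enum's tag occupies the low bits of a byte and an outer enum stores its own tag in the bits above:

```cpp
#include <cassert>
#include <cstdint>

// Keeps only the low `innerTagBits` bits of the stored byte, discarding any
// bits an outer enum may have written above them. Valid for 1..8 bits.
constexpr uint8_t readInnerTag(uint8_t storedByte, unsigned innerTagBits) {
  return static_cast<uint8_t>(storedByte & ((1u << innerTagBits) - 1));
}

// Inner tag 0b10 in the low two bits, outer tag bit set at position 2.
constexpr uint8_t kStoredByte = 0b0000'0110;

static_assert(readInnerTag(kStoredByte, 2) == 0b10,
              "masking recovers the inner tag");
static_assert(kStoredByte != 0b10,
              "reading the whole byte would see the outer enum's bits too");
```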
rdar://119792426
There are a few issues with wrong assumptions around extra inhabitants that cause tags to not be identified properly in some cases. Until a proper fix is identified, we emit tag functions instead.
rdar://118606044
The initWithTakeTable accidentally referenced bridgeRetain instead of copyingInitWithTake, which caused a leak when an object containing a bridge reference was also not bitwise takable.
An immutable noncopyable capture borrows the captured value in-place and can't do anything
to modify it, and the may_assign_but_not_consume checking behaves badly with some code patterns
generated for resilient types when `self` is captured during a deinit. This change allows for
more accurate checking and fixes rdar://118427997.
When an address-only noncopyable value is dead-def'ed by an indirect return from a `try_apply`,
the cleanup should be inserted on the normal return successor block. Fixes rdar://118255228.
* [Runtime] Use threaded code in compact value witness runtime
These changes reduce branching and yield performance improvements of up to 10% for some cases.
* Fix offset in handleRefCountsInitWithTake
By using a specialized function, we only call through the witness table and fetch the layout string once for the whole buffer, instead of once per element.
rdar://115013153
For special enum cases, e.g. effectively optional references, the layout string will be the same as the payload, because we don't have to check for the particular case. For those cases we have to use the regular witnesses, which should be shared among all those cases.
`module.map` as a module map name has been discouraged since 2014, and
Clang will soon warn on its usage. This patch renames all instances of
`module.map` in the Swift tests to `module.modulemap` in preparation
for this change to Clang.
rdar://106123303