Instead of copying the data and the type and witnesses separately, use the size in the value witness table and copy everything at once.
copyTypeInto assumed the type was an ordinary existential. When it was actually an extended existential, it would do an incorrect cast and read part of a pointer as the number of witness tables to copy. This would typically result in a large buffer overflow and crash. At this point we already know the type's size, so we can use that info directly rather than essentially recomputing it.
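As a rough illustration of the fix, here is a minimal sketch with hypothetical type names standing in for the runtime's real existential machinery; the real code operates on actual containers and metadata:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the value witness table; only the size
// matters for this sketch.
struct HypotheticalValueWitnesses {
  size_t size;   // total size of the value, as recorded in the table
};

// One copy of the known size replaces the separate copies of data,
// type, and witness tables, which required (and, for extended
// existentials, miscomputed) a witness table count.
void copyValue(void *dest, const void *src,
               const HypotheticalValueWitnesses *valueWitnesses) {
  std::memcpy(dest, src, valueWitnesses->size);
}
```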
rdar://163980446
Add SWIFT_DEBUG_ENABLE_TASK_SLAB_ALLOCATOR, which is on by default. When turned off, async stack allocations call through to malloc/free. This allows memory debugging tools to be used on async stack allocations.
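A rough sketch of that gating, with hypothetical function names; the real runtime reads the variable once at startup and its slab allocator is considerably more involved:

```cpp
#include <cstdlib>
#include <cstddef>

// Check the environment variable once; the allocator is on by default.
static bool taskSlabAllocatorEnabled() {
  static const bool enabled = [] {
    const char *value = std::getenv("SWIFT_DEBUG_ENABLE_TASK_SLAB_ALLOCATOR");
    return value == nullptr || value[0] != '0';
  }();
  return enabled;
}

void *hypotheticalTaskAlloc(size_t size) {
  if (!taskSlabAllocatorEnabled())
    return std::malloc(size);   // visible to memory debugging tools
  return nullptr;               // placeholder: bump-allocate from the task's slab here
}

void hypotheticalTaskDealloc(void *ptr) {
  if (!taskSlabAllocatorEnabled()) {
    std::free(ptr);
    return;
  }
  // placeholder: return the allocation to the task's slab here
}
```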
This is breaking embedded due to its use of weak definitions for embedded versions of the runtime functions. The presence of a weak export in libswiftCore allows any strong symbol in libswiftCore to override a weak symbol elsewhere. Remove while we work out how to reconcile the two.
rdar://163578646
To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.
This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and using the library can be enabled with the flags -Xfrontend -enable-client-retain-release.
IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force their use on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.
The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed an insignificant savings in code size, so preserve_most seems to be a good middle ground.)
Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).
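A sketch of the versioned fast-path check, in C++ for readability; the real fast path is hand-written assembly, and everything here other than the mask symbol's name is a simplifying assumption:

```cpp
#include <cstdint>

// The mask is the *address* of this absolute symbol, not its contents.
extern "C" char _swift_retainRelease_slowpath_mask_v1;

// Hypothetical slow-path entry point in the runtime proper.
extern "C" void *swift_retain_slowpath(void *object);

struct HypotheticalHeapObject {
  void *metadata;
  uint64_t refCounts;   // the field the fast path inspects
};

__attribute__((preserve_most))
void *hypothetical_retain_fastpath(HypotheticalHeapObject *object) {
  if (object == nullptr)
    return object;
  uint64_t mask = (uint64_t)&_swift_retainRelease_slowpath_mask_v1;
  if (object->refCounts & mask)
    return swift_retain_slowpath(object);   // defer to the runtime
  // Fast path: the real code performs an atomic increment with the
  // correct bit layout; plain arithmetic stands in for it here.
  object->refCounts += 1;
  return object;
}
```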
As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail calling swift_retain) which is problematic since it's supposed to return its parameter, or it required pushing a stack frame which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (just a quick mask and an extra register for saving the original value) and makes bridgeObjectRetain quite a bit more efficient when implemented correctly to return the exact value it was passed.
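A sketch of the new division of labor, using the real entry-point names for readability; the spare-bit mask and the inner helper are made-up placeholders:

```cpp
#include <cstdint>

// Placeholder: the actual spare-bit mask is platform-specific.
static constexpr uintptr_t kHypotheticalSpareBitsMask = 0x7;

// Placeholder for the code that actually adjusts the refcount.
static void retainObject(void *object) { (void)object; }

// swift_retain now strips the spare bits itself and returns the exact
// value it was passed, spare bits included.
extern "C" void *swift_retain(void *value) {
  uintptr_t bits = (uintptr_t)value;
  retainObject((void *)(bits & ~kHypotheticalSpareBitsMask));
  return value;
}

// bridgeObjectRetain no longer needs its own frame to preserve the
// spare bits: it can simply tail call swift_retain.
extern "C" void *swift_bridgeObjectRetain(void *value) {
  return swift_retain(value);
}
```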
The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.
To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.
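A sketch of the back-deployment shim, with simplified signatures; the real definitions live in the client static library:

```cpp
extern "C" {

// The standard-calling-convention entry point that every runtime has.
void *swift_retain(void *object);

// Weak fallback: because this wrapper is preserve_most while
// swift_retain is not, the compiler saves and restores the extra
// registers around the call. A newer runtime's strong
// swift_retain_preservemost symbol overrides this definition.
__attribute__((weak, preserve_most))
void *swift_retain_preservemost(void *object) {
  return swift_retain(object);
}

}
```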
Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.
rdar://122595871
This PR replaces our use of `__dso_handle` and (indirectly) `dladdr()`
when figuring out the base address of loaded Swift images on ELF-based
platforms. Instead, we use `__ehdr_start` which refers to the start of
the image's ELF header (i.e. the start of the file) and is provided by
the compiler/linker for this purpose.
This pointer is consumed by `swift_addNewDSOImage()` and, later, by
Swift Testing when running tests, but is otherwise unused, so the less
work we do on it at runtime, the better.
A fallback path is present that, for images that do not contain
`__ehdr_start`, derives the address by calling `dladdr()` on the section
structure's address.
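A simplified sketch of the lookup, assuming this code is linked into the image being registered (the real change wires the `__ehdr_start` reference through the image's section metadata):

```cpp
#include <dlfcn.h>

// Provided by the linker on ELF targets; declared weak so images that
// don't have it can be detected and sent down the fallback path.
extern "C" const char __ehdr_start __attribute__((weak));

const void *imageBaseAddress(const void *sectionAddress) {
  if (&__ehdr_start != nullptr)
    return &__ehdr_start;          // start of this image's ELF header

  // Fallback: ask the dynamic linker which image contains the given
  // section address and use that image's base address.
  Dl_info info;
  if (dladdr(sectionAddress, &info) != 0)
    return info.dli_fbase;
  return nullptr;
}
```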
Resolves #84997.
We scan the target's initial allocation pool, and all 16kB heap allocations. We check each pointer-aligned offset within those areas and try to read it as Swift metadata and get a name from it. If that fails, we quietly move on. It's very unlikely for some random memory to look enough like Swift metadata for this to produce a name, so this works very well for printing the generic metadata instantiated in the remote process without requiring `SWIFT_DEBUG_ENABLE_METADATA_ALLOCATION_ITERATION`.
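A sketch of the scan loop; `readRemoteName` is a stand-in for "interpret this remote address as metadata and try to copy out a name", which the real code does through the remote-mirror machinery with much more validation:

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Placeholder for the remote-mirror call that validates the candidate
// metadata and copies its name out of the target process.
static std::optional<std::string> readRemoteName(uint64_t remoteAddress) {
  (void)remoteAddress;
  return std::nullopt;
}

static void scanRegionForMetadata(uint64_t regionStart, uint64_t regionSize) {
  const uint64_t pointerSize = 8;   // assuming a 64-bit target
  for (uint64_t offset = 0; offset + pointerSize <= regionSize;
       offset += pointerSize) {
    // Try every pointer-aligned offset; random bytes almost never parse
    // as metadata with a valid name, so false positives are rare.
    if (auto name = readRemoteName(regionStart + offset))
      std::printf("%#llx: %s\n",
                  (unsigned long long)(regionStart + offset), name->c_str());
  }
}
```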
rdar://161120936
Pretty sure this can't ever trigger because of operating system
limits on the size of environment strings, but since overflow
tests are cheap, there seems no reason not to add one.
rdar://160307933
We used to try to use `process_vm_readv()` if `CAP_SYS_PTRACE` was
enabled. This avoided using signal handlers to catch crashes when
we try to read through an invalid pointer, but it complicates the
code, and it turns out not to work on some Linux kernels where
the `process_vm_readv()` syscall is unavailable.
rdar://159930644
This change adds a new type of cache (cache by type descriptor) to the protocol conformance lookup system. This optimization is beneficial for generic types, where the
same conformance can be reused across different instantiations of the generic type.
Key changes:
- Add a `GetOrInsertManyScope` class to `ConcurrentReadableHashMap` for performing
multiple insertions under a single lock
- Add type descriptor-based caching for protocol conformances
- Add environment variables for controlling and debugging the conformance cache
- Add tests to verify the behavior of the conformance cache
- Fix for https://github.com/swiftlang/swift/issues/82889
The implementation is controlled by the `SWIFT_DEBUG_ENABLE_CACHE_PROTOCOL_CONFORMANCES_BY_TYPE_DESCRIPTOR`
environment variable, which is enabled by default.
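A heavily simplified sketch of the idea; the real cache is a lock-free ConcurrentReadableHashMap rather than a mutex-guarded map, and the descriptor types below are stand-ins:

```cpp
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_map>
#include <utility>

// Stand-ins for the runtime's real descriptor types.
struct TypeContextDescriptor;   // shared by all instantiations of a generic type
struct ProtocolDescriptor;
struct ConformanceDescriptor;

static bool descriptorCacheEnabled() {
  // Enabled by default; the environment variable can turn it off.
  const char *value = std::getenv(
      "SWIFT_DEBUG_ENABLE_CACHE_PROTOCOL_CONFORMANCES_BY_TYPE_DESCRIPTOR");
  return value == nullptr || value[0] != '0';
}

using DescriptorKey =
    std::pair<const TypeContextDescriptor *, const ProtocolDescriptor *>;

struct DescriptorKeyHash {
  size_t operator()(const DescriptorKey &key) const {
    return std::hash<const void *>()(key.first) ^
           (std::hash<const void *>()(key.second) << 1);
  }
};

static std::mutex cacheLock;
static std::unordered_map<DescriptorKey, const ConformanceDescriptor *,
                          DescriptorKeyHash> descriptorCache;

// Keying on the type descriptor lets every instantiation of a generic
// type hit the same entry for a given protocol.
const ConformanceDescriptor *
cachedConformance(const TypeContextDescriptor *type,
                  const ProtocolDescriptor *proto) {
  if (!descriptorCacheEnabled())
    return nullptr;
  std::lock_guard<std::mutex> guard(cacheLock);
  auto it = descriptorCache.find({type, proto});
  return it == descriptorCache.end() ? nullptr : it->second;
}
```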
This reapplies https://github.com/swiftlang/swift/pull/82818 after it was reverted in https://github.com/swiftlang/swift/pull/83770.
When the concurrency library's "is current global actor" hook is not
available, assume that we are already executing on the right global
actor. This mimics the behavior of earlier standard libraries when
faced with an isolated conformance, and also handles odd
configurations where the code might not have loaded the concurrency
library yet.
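A minimal sketch of that fallback; the hook name and signature here are placeholders for the runtime's real concurrency hook:

```cpp
// Placeholder hook pointer; the concurrency library installs the real
// one when it initializes.
static bool (*isCurrentGlobalActorHook)(const void *actorTypeMetadata) = nullptr;

static bool isOnExpectedGlobalActor(const void *actorTypeMetadata) {
  // If the concurrency library never installed the hook, assume we are
  // already on the right global actor, matching the behavior of older
  // standard libraries when faced with an isolated conformance.
  if (isCurrentGlobalActorHook == nullptr)
    return true;
  return isCurrentGlobalActorHook(actorTypeMetadata);
}
```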
At the moment, WebAssembly ends up in this configuration because we
don't run the initialization for the concurrency library. That makes
this also a workaround for issue #82682 / rdar://154762027.
Musl's `clone()` wrapper returns `EINVAL` if you try to use `CLONE_THREAD`,
which seems a bit wrong (certainly it is in this particular application,
since we *really* don't care whether the thread is a valid C library
thread at this point).
Also properly support ELF images that are built with a base address other
than zero (this typically isn't an issue for dynamically linked programs,
as they will be relocated at runtime anyway, but for statically linked
binaries it's very common to set the base address to a non-zero value).
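A sketch of the relocation arithmetic involved, assuming the common layout where the ELF header lives inside the lowest PT_LOAD segment:

```cpp
#include <link.h>
#include <cstdint>

// Translate a link-time virtual address into a runtime address. For a
// relocated PIE the bias is the load address; for a statically linked
// binary with a non-zero base address the bias is typically zero, but
// the link-time address itself is far from zero, so neither value can
// be ignored.
uintptr_t runtimeAddress(const ElfW(Ehdr) *ehdr, uintptr_t linkTimeVaddr) {
  auto *phdrs = (const ElfW(Phdr) *)((const char *)ehdr + ehdr->e_phoff);
  uintptr_t lowestVaddr = UINTPTR_MAX;
  for (unsigned i = 0; i < ehdr->e_phnum; ++i)
    if (phdrs[i].p_type == PT_LOAD && phdrs[i].p_vaddr < lowestVaddr)
      lowestVaddr = phdrs[i].p_vaddr;
  // Bias: where the image actually is minus where it was linked to be.
  uintptr_t bias = (uintptr_t)ehdr - lowestVaddr;
  return linkTimeVaddr + bias;
}
```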
rdar://154282813
When ObjC interop is not available, Error values are represented in ErrorObject boxes. These are full HeapObjects, but unowned refcounting ops asserted that the metadata was class metadata. This assert would be hit when destroying an ErrorObject that was weakly referenced. Expand the asserts to accept ErrorObject metadata as well.
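A sketch of the relaxed check, with stand-in types; the real assertion lives in the runtime's unowned refcounting code and uses the actual Metadata kinds:

```cpp
#include <cassert>

enum class HypotheticalMetadataKind { Class, ErrorObject /* ... */ };

struct HypotheticalMetadata {
  HypotheticalMetadataKind kind;
  bool isAnyKindOfClass() const {
    return kind == HypotheticalMetadataKind::Class;
  }
};

void verifyUnownedTarget(const HypotheticalMetadata *metadata) {
  // Error boxes are full heap objects too, so unowned operations must
  // accept ErrorObject metadata alongside class metadata.
  assert(metadata->isAnyKindOfClass() ||
         metadata->kind == HypotheticalMetadataKind::ErrorObject);
}
```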
rdar://150214921
We had fixed this bug in https://github.com/swiftlang/swift/pull/79381
but did not realize that the same problem existed for parameters as well.
This corrects the swift_func_getParameterTypeInfo implementation and also
removes the entire "unsafe" method, which we no longer use anywhere.
Resolves rdar://146679254
The "was already deallocated" message is incorrect, since the target of an unowned reference stays allocated even after being deinitialized. We could say "was already deinitialized" but that's a bit of a niche term. "Was already destroyed" conveys what happened without the reader needing to worry about deinitialization versus deallocation.
rdar://149237704
The new build system set `SWIFT_RUNTIME_CLOBBER_FREED_OBJECTS` to 0.
Unfortunately, the check in the Swift runtime used `#ifdef`, so even
though the feature was turned off, it was actually enabled in some cases.
This fixes the issue in the build system and switches the check so that
the value of `SWIFT_RUNTIME_CLOBBER_FREED_OBJECTS` is actually taken into
account in the sources. C/C++ implicitly defines a macro's value as 1
when it is set without a value, and an unset macro evaluates to 0 in
preprocessor conditionals.
Also make the hex value a bit more recognizable and grep'able by including
it as a comment.
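The distinction in miniature: built with -DSWIFT_RUNTIME_CLOBBER_FREED_OBJECTS=0, only the value-based check correctly stays disabled (the exact form of the runtime's check is paraphrased here):

```cpp
#include <cstdio>

int main() {
#ifdef SWIFT_RUNTIME_CLOBBER_FREED_OBJECTS
  // Old-style check: taken whenever the macro is defined at all,
  // including when it is explicitly defined to 0.
  std::puts("#ifdef check: clobbering enabled");
#endif
#if SWIFT_RUNTIME_CLOBBER_FREED_OBJECTS
  // Value-based check: taken only for a nonzero value; an undefined
  // macro evaluates to 0 here, and a plain -D with no value means 1.
  std::puts("#if check: clobbering enabled");
#endif
  return 0;
}
```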
Fixes: rdar://149210738
This memory is part of the conformance cache concurrent hash map, so
when we clear the conformance cache, record each of the allocated
pointers within the concurrent map's free list. This way, it'll be
freed with the rest of the concurrent map when it's safe to do so.
With relative witness tables, the low bit of a witness table pointer is
an indicator that we need to load from the given pointer. We were also
using the low bit of the witness table pointer in the conformance
cache entry as part of a pointer union. Hilarity ensues [*].
Switch to another low bit by exploding the conformance cache key
into separate fields and taking the low bit of one of those pointers
that isn't reserved.
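A sketch of the collision and the shape of the fix; the field names and the flag value are illustrative only:

```cpp
#include <cstdint>

// With relative witness tables, a set low bit on a witness table
// pointer means "load through this pointer to get the real table".
constexpr uintptr_t kIndirectWitnessTableFlag = 0x1;

bool witnessTableNeedsLoad(uintptr_t witnessTableBits) {
  return (witnessTableBits & kIndirectWitnessTableFlag) != 0;
}

// Old shape (problematic): the cache entry reused that same low bit as
// a pointer-union discriminator, so the two meanings collided.
struct OldCacheEntry {
  uintptr_t taggedWitnessTableOrOther;   // low bit doubled as the tag
};

// New shape (sketch): the key is split into separate fields, and the
// discriminator moves to the low bit of a pointer that has no other
// reserved meaning, leaving the witness table bits alone.
struct NewCacheEntry {
  uintptr_t taggedTypeKey;               // low bit used as the tag here
  uintptr_t witnessTable;                // low bit untouched
};
```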
Fixes the remainder of rdar://149326058, I hope.
[*] No, I am not laughing.