Rather than attempt to use standard C++ type names, create a namespace specifically for environment variable types and require vars to use a type from that namespace. This avoids the annoying collision between this `string` and `std::string`, and makes everything a bit more consistent. Rename the types a bit to be more appropriate for this context: boolean, uint8, and uint32, instead of bool, uint8_t, and uint32_t.
When an exception is thrown through swift_job_run, it leaves the Concurrency runtime in an inconsistent state, which can lead to misbehavior or crashes later on. It's very difficult to work out what the cause is when this happens. Since the program state is doomed once this happens, prevent exceptions from propagating through swift_job_run at all, and terminate immediately when throwing.
rdar://171909991
We were crashing when using this hook function because the hook points
at a Swift function, which expects `self` to be set, but we weren't
telling the C compiler and so when we called through the function pointer
from C++ code, the relevant register wasn't set and we'd crash.
Fixes#85134
rdar://163401438
* Have job_run include the actor, executor, and task name.
* Make task_status_changed only trace if one of the tracked values actually changed.
* Add a job_enqueue_executor that covers enqueueing jobs on serial executors (default actors, custom executors, main actor, etc.).
* Actually end the actor lifetime signpost interval.
* Include the actor metadata and context descriptor pointers in actor enqueue/dequeue so the type can be identified.
* Add the task pointer to all task-related signpost messages.
* Emit job_enqueue_main_executor from the swift_task_enqueueImpl path so main actor enqueues appear in traces.
* Add TracingExecutorKind enum and log executorKind in job_enqueue_executor and job_run to distinguish global, default actor, main actor, custom serial executor, and task executor contexts.
While we're here, finally add a test for Concurrency signposts. This is difficult because the signposts are disabled by default and there isn't a good way to turn them on in an automated fashion. Resolve this by adding an environment variable, SWIFT_CONCURRENCY_TRACING_SUBSYSTEM, which overrides the subsystem used by the Concurrency signpost code. The test can use this to set a subsystem that isn't disabled by default, which then allows `log stream` to capture the signposts.
rdar://158239258
This change adds logic in the compiler to compute malloc type ids and emit them together with typed allocation calls. It also adds the new runtime function swift_allocObjectTyped, which calls typed malloc.
Because we guaranteed that one cannot access a distribted "remote"
instance's state via type system, we can allocate the instance much
smaller because we never access those user initialized fields.
This way a remote distributed actor reference is always 128 bytes, this
is because actors have special storage 12*8 bytes of PrivateData in
addition to the 16 bytes object header.
With this optimization, regardless how many fields an actor has, the
remote ref always is the same size.
resolves rdar://81825648
Used claude to generate a bunch of tests to check the sizes of actors
etc, I think this is correct (and getting it wrong totally just crashes
immediately anyway). Could use a review though, thank you!
cc @mikeash @xedin
This is a follow up from the "async" `deinit` work, which will allow us
to guarantee cleanup code to run in deinitializers, even if they need to
call asynchronous code, and even if they may be run in a task that was
cancelled: by "shielding" it from cancellation.
This is incomplete, the child handling needs some more love.
SE proposal: https://github.com/swiftlang/swift-evolution/pull/3037/
Tweaked the comment in `Runtime/Config.h`.
Fixed a couple of incorrect ARM64 instruction mnemonics. This still needs
testing on ARM64 Windows.
Fixed an out-of-date comment in `swift-backtrace`.
Use a macro in `Backtrace.cpp` to guarantee we don't overrun the buffer,
and in the process simplify the code slightly.
rdar://101623384
Right now, the backtracer won't work on 32-bit Windows because we
aren't properly dealing with the `stdcall` calling convention on
the Win32 API calls.
rdar://101623384
This doesn't have a working symbolicator yet, but it does build and
it can obtain a basic backtrace.
It also doesn't include a working `swift-backtrace` program yet.
rdar://101623384
We should expose the demangle functionality; It's been widely used by
calling into internal _swift_demangle and instead we should offer a real
API. It's also already used in the Runtime module already when forming
backtraces.
[Previous
discussions](https://forums.swift.org/t/demangle-function/25416/15)
happened between 2019 and 2024, and just never materialized in a
complete implementation and proposal.
Right now, even more tools are in need of this API, as we are building
continious profiling solutions etc, so it is a good time to revisit this
topic.
This PR is roughly based off @Azoy's
https://github.com/swiftlang/swift/pull/25314/files#diff-fd379a721cc9a1c9ef6486eae713e945da842b42170d4d069029a57334371eba
from 2019, however it brings it over to the new Runtime module which is
a great place to put this functionality - even Backtrace had to recently
reinvent calling the demangling infra in this module.
Pending SE review, [proposal
here](https://github.com/swiftlang/swift-evolution/pull/2989).
cc @azoy @al45tair
---------
Co-authored-by: Alastair Houghton <alastair@alastairs-place.net>
In a discussion with @rjmccall, we agreed that these should really be
UNKNOWN_MEMEFFECTS so we are conservative.
Just slicing this off from a larger patch stream.
Add SWIFT_DEBUG_ENABLE_TASK_SLAB_ALLOCATOR, which is on by default. When turned off, async stack allocations call through to malloc/free. This allows memory debugging tools to be used on async stack allocations.
In embedded mode, some things call retain/release using the C++ declarations, but the implementations are in Swift. The Swift implementations don't use preservemost, so the C++ declarations must not declare preservemost in that context.
rdar://163940783
This is necessary because we need to model its stack-allocation
behavior, although I'm not yet doing that in this patch because
StackNesting first needs to be taught to not try to move the
deallocation.
I'm not convinced that `async let` *should* be doing a stack allocation,
but it undoubtedly *is* doing a stack allocation, and until we have an
alternative to that, we will need to model it properly.
functions.
These were introduced in an early draft implementation of async let, but
never used by a released compiler. They are not used as symbols by any
app binaries. There's no reason to keep carrying them.
While I'm at it, dramatically improve the documentation of the remaining
async let API functions.
This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and using the library can be enabled with the flags -Xfrontend -enable-client-retain-release.
To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.
IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force use of them on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.
The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed an insignificant savings in code size, so preserve_most seems to be a good middle ground.)
Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).
As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail calling swift_retain) which is problematic since it's supposed to return its parameter, or it required pushing a stack frame which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (just a quick mask and an extra register for saving the original value) and makes bridgeObjectRetain quite a bit more efficient when implemented correctly to return the exact value it was passed.
The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.
To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.
Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.
rdar://122595871
We scan the target's initial allocation pool, and all 16kB heap allocations. We check each pointer-aligned offset within those areas, and try to read it as Swift metadata and get a name from it. If that fails, quietly move on. It's very unlikely for some random memory to look enough like Swift metadata for this to produce a name, so this works very well to print the generic metadata instantiated in the remote process without requiring `SWIFT_DEBUG_ENABLE_METADATA_ALLOCATION_ITERATION`.
rdar://161120936