The previous algorithm was doing an iterative forward data flow analysis
followed by a reverse data flow analysis. I suspect the history here is that
it was a reverse analysis, and that didn't really work for infinite loops,
and so complexity accumulated.
The new algorithm is quite straightforward and relies on the allocations
being properly jointly post-dominated, just not nested. We simply walk
forward through the blocks in consistent-with-dominance order, maintaining
the stack of active allocations and deferring deallocations that are
improperly nested until we deallocate the allocations above it. The only
real subtlety is that we have to delay walking into dead-end regions until
we've seen all of the edges into them, so that we can know whether we have
a coherent stack state in them. If the state is incoherent, we need to
remove any deallocations of previous allocations because we cannot talk
correctly about what's on top of the stack.
The reason I'm doing this, besides it just being a simpler and hopefully
faster algorithm, is that modeling some of the uses of the async stack
allocator properly requires builtins that cannot just be semantically
reordered. That should be somewhat easier to handle with the new approach,
although really (1) we should not have runtime functions that need this and
(2) we're going to need a conservatively-correct solution that's different
from this anyway because hoisting allocations is *also* limited in its own
way.
I've attached a rather pedantic proof of the correctness of the algorithm.
The thing that concerns me most about the rewritten pass is that it isn't
actually validating joint post-dominance on input, so if you give it bad
input, it might be a little mystifying to debug the verifier failures.
`anyAppleOS` represents a meta-platform for availability checks that can be
used to check availability across all of Apple's operating systems. It
supports versions 26.0 and up since version 26.0 is the first OS version number
that is aligned accross macOS, iOS, watchOS, tvOS, and visionOS.
Apple platform-specific availability specification take precedence over
`anyAppleOS` availability specifications when specified simultaneously.
Resolves rdar://153834380.
There is another near-identical function in DebugOptUtils.h that can be used
everywhere this function is used, and offers more flexibility in its callback
interface.
Whenever we have a reference to a foreign function/variable in SIL, use
a mangled name at the SIL level with the C name in the asmname
attribute. The expands the use of asmname to three kinds of cases that
it hadn't been used in yet:
* Declarations imported from C headers/modules
* @_cdecl @implementation of C headers/modules
* @_cdecl functions in general
Some code within the SIL pipeline makes assumptions that the C names of
various runtime functions are reflected at the SIL level. For example,
the linking of Embedded Swift runtime functions is done by-name, and
some of those names refer to C functions (like `swift_retain`) and
others refer to Swift functions that use `@_silgen_name` (like
`swift_getDefaultExecutor`). Extend the serialized module format to
include a table that maps from the asmname of functions/variables over
to their mangled names, so we can look up functions by asmname if we
want. These tables could also be used for checking for declarations
that conflict on their asmname in the future. Right now, we leave it
up to LLVM or the linker to do the checking.
`@_silgen_name` is not affected by these changes, nor should it be:
that hidden feature is specifically meant to affect the name at the
SIL level.
The vast majority of test changes are SIL tests where we had expected
to see the C/C++/Objective-C names in the tests for references to
foreign entities, and now we see Swift mangled names (ending in To).
The SIL declarations themselves will have a corresponding asmname.
Notably, the IRGen tests have *not* changed, because we generally the
same IR as before. It's only the modeling at the SIL lever that has
changed.
Another part of rdar://137014448.
This brings this control in line with other diagnostic controls we have which operate on a per-group level.
'DefaultIgnoreWarnings' diagnostic group option applies to all warnings belonging to a certain diagnostic group.
The inheritance rules are:
- Marking a diagnostic group as 'DefaultIgnoreWarnings' means warnings belonging to this group will not be emitted by-default
- Warnings belonging to sub-groups of this group will also not be emitted by-default
- Enabling a 'DefaultIgnoreWarnings' group (with '-Werror','-Wwarning', etc.) means warnings belonging to this group will be emitted.
- Warnings belonging to sub-groups of this group will also be emitted.
- Warnings belonging to super-groups of this group will not be affected.
I missed upgrading this to an error for Swift 6 mode, let's upgrade it
to an error for a future language mode. It's important we reject these
cases since we're otherwise allowing subtyping to be a weaker constraint
than conversion.
When generating dead stub methods for vtable entries, we should not
attach COMDATs to their definitions. This is because such stubs may be
referenced only through aliases, and the presence of a COMDAT on the
stub definition would cause the aliased symbol, we want to keep, to be
discarded from the symbol table when other object files also have a
dead stub method.
Given the following two object files generated:
```mermaid
graph TD
subgraph O0[A.swift.o]
subgraph C0[COMDAT Group swift_dead_method_stub]
D0_0["[Def] swift_dead_method_stub"]
end
S0_0["[Symbol] swift_dead_method_stub"] --> D0_0
S1["[Symbol] C1.dead"] --alias--> D0_0
end
subgraph O1[B.swift.o]
subgraph C1[COMDAT Group swift_dead_method_stub]
D0_1["[Def] swift_dead_method_stub"]
end
S0_1["[Symbol] swift_dead_method_stub"] --> D0_1
S2["[Symbol] C2.beef"] --alias--> D0_1
end
```
When linking these two object files, the linker will pick one of the
COMDAT groups of `swift_dead_method_stub`. The other COMDAT group will be
discarded, along with all symbols aliased to the discarded definition,
even though those aliased symbols (`C1.dead` and `C2.beef`) are the
ones we want to keep to construct vtables.
This change stops attaching COMDATs to dead stub method definitions.
This effectively only affects WebAssembly targets because MachO's
`supportsCOMDAT` returns false, and COFF targets don't use
`linkonce_odr` for dead stub methods.
The COMDAT was added in 7e8f782457
Builtin.FixedArray was introduced as the first generic builtin type, with
special case handling in all the various recursive visitors. Introduce
a base class, and move the handling to that base class, so it is easier
to introduce other generic builtins in the future.
This change just stages in a few new platform kinds, without fully adding
support for them yet.
- The `Swift` platform represents availability of the Swift runtime across all
platforms that support an ABI stable Swift runtime (see the pitch at
https://forums.swift.org/t/pitch-swift-runtime-availability/82742).
- The `anyAppleOS` platform is an experimental platform that represents all of
Apple's operating systems. This is intended to simplify writing availability
for Apple's platforms by taking advantage of the new unified OS versioning
system announced at WWDC 2025.
- The `DriverKit` platform corresponds to Apple DriverKit which is already
supported by LLVM.
This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and using the library can be enabled with the flags -Xfrontend -enable-client-retain-release.
To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.
IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force use of them on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.
The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed an insignificant savings in code size, so preserve_most seems to be a good middle ground.)
Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).
As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail calling swift_retain) which is problematic since it's supposed to return its parameter, or it required pushing a stack frame which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (just a quick mask and an extra register for saving the original value) and makes bridgeObjectRetain quite a bit more efficient when implemented correctly to return the exact value it was passed.
The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.
To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.
Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.
rdar://122595871