When performing multi-threaded IRGen, all of the link libraries used
for autolinking went (only) into the first object file. If the object
files were then placed in a static archive, the autolinking would only
kick in if clients that link the static library reference something in
the first object file. This causes link failures.
Make sure that we put the autolink information into each of the object
files we create when doing multi-threaded IRGen.
Fixes the rest of rdar://162400654.
With multi-threaded IRGen, the global variables associated with "once"
initialization tokens were not getting colocated with their actual
global variables, which caused the initialization code to get split
across different files. This issue manifest as autolinking errors in
some projects.
Fixes rdar://162400654.
Implement the @export(implementation) and @export(interface) attributes
to replace @_alwaysEmitIntoClient and @_neverEmitIntoClient. Provide a
warning + Fix-It to start staging out the very-new
@_neverEmitIntoClient. We'll hold off on pushing folks toward
@_alwaysEmitIntoClient for a little longer.
This fixes a regression introduced by e3c8f423bc,
but the root cause was actually a subtle invariant violation
in IRGen.
FulfillmentMap's use of canonical types as keys assumes that
canonical type equality is sufficient for checking if two types
"are the same". However, this is not true when those types
contain type parameters, because two distinct type parameters
might belong to the same equivalence class.
Thus, when we take the type's context substitution map, and
apply it to each type parameter in our given list of
requirements, the substitution operation could output a
different but equivalent type parameter.
As a workaround for this problem, try to preserve the old
behavior of subst() in this specific case.
Fixes rdar://160649141.
It hoists `destroy_value` instructions for non-lexical values.
```
%1 = some_ownedValue
...
last_use(%1)
... // other instructions
destroy_value %1
```
->
```
%1 = some_ownedValue
...
last_use(%1)
destroy_value %1 // <- moved after the last use
... // other instructions
```
In contrast to non-mandatory optimization passes, this is the only pass which hoists destroys over deinit-barriers.
This ensures consistent behavior in -Onone and optimized builds.
Allow external declaration of global variables via `@_extern(c)`. Such
variables need to have types represented in C (of course), have only
storage (no accessors), and cannot have initializers. At the SIL
level, we use the SIL asmname attribute to get the appropriate C name.
While here, slightly shore up the `@_extern(c)` checking, which should
fix issue #70776 / rdar://153515764.
This is necessary because we need to model its stack-allocation
behavior, although I'm not yet doing that in this patch because
StackNesting first needs to be taught to not try to move the
deallocation.
I'm not convinced that `async let` *should* be doing a stack allocation,
but it undoubtedly *is* doing a stack allocation, and until we have an
alternative to that, we will need to model it properly.
When generating dead stub methods for vtable entries, we should not
attach COMDATs to their definitions. This is because such stubs may be
referenced only through aliases, and the presence of a COMDAT on the
stub definition would cause the aliased symbol, we want to keep, to be
discarded from the symbol table when other object files also have a
dead stub method.
Given the following two object files generated:
```mermaid
graph TD
subgraph O0[A.swift.o]
subgraph C0[COMDAT Group swift_dead_method_stub]
D0_0["[Def] swift_dead_method_stub"]
end
S0_0["[Symbol] swift_dead_method_stub"] --> D0_0
S1["[Symbol] C1.dead"] --alias--> D0_0
end
subgraph O1[B.swift.o]
subgraph C1[COMDAT Group swift_dead_method_stub]
D0_1["[Def] swift_dead_method_stub"]
end
S0_1["[Symbol] swift_dead_method_stub"] --> D0_1
S2["[Symbol] C2.beef"] --alias--> D0_1
end
```
When linking these two object files, the linker will pick one of the
COMDAT groups of `swift_dead_method_stub`. The other COMDAT group will be
discarded, along with all symbols aliased to the discarded definition,
even though those aliased symbols (`C1.dead` and `C2.beef`) are the
ones we want to keep to construct vtables.
This change stops attaching COMDATs to dead stub method definitions.
This effectively only affects WebAssembly targets because MachO's
`supportsCOMDAT` returns false, and COFF targets don't use
`linkonce_odr` for dead stub methods.
The COMDAT was added in 7e8f782457
The schemas for the coro frame de/alloc functions were incorrectly not
being assigned and instead the non-frame de/alloc functions were being
assigned twice. Fix that and add more tests.
rdar://163330882
This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and using the library can be enabled with the flags -Xfrontend -enable-client-retain-release.
To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.
IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force use of them on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.
The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed an insignificant savings in code size, so preserve_most seems to be a good middle ground.)
Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).
As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail calling swift_retain) which is problematic since it's supposed to return its parameter, or it required pushing a stack frame which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (just a quick mask and an extra register for saving the original value) and makes bridgeObjectRetain quite a bit more efficient when implemented correctly to return the exact value it was passed.
The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.
To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.
Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.
rdar://122595871
Pass the frame from the outer yield_once_2 coroutine into the
swift_coro_de/alloc functions and from there into all four de/allocation
witnesses. This enables ABIs which key off of information stored in the
frame.
To enable ABIs which store extra info in the frame, add two new slots to
the coroutine allocator function table. For example, a frame could have
a header containing a context pointer at a negative offset from the
address returned from `swift_coro_alloc_frame`. The frame deallocation
function would then know to deallocate more space correspondingly.
Removes the underscored prefixes from the @_section and @_used attributes, making them public as @section and @used respectively. The SymbolLinkageMarkers experimental feature has been removed as these attributes are now part of the standard language. Implemented expression syntactic checking rules per SE-0492.
Major parts:
- Renamed @_section to @section and @_used to @used
- Removed the SymbolLinkageMarkers experimental feature
- Added parsing support for the old underscored names with deprecation warnings
- Updated all tests and examples to use the new attribute names
- Added syntactic validation for @section to align with SE-0492 (reusing the legality checker by @artemcm)
- Changed @DebugDescription macro to explicitly use a tuple type instead of type inferring it, to comply with the expression syntax rules
- Added a testcase for the various allowed and disallowed syntactic forms, `test/ConstValues/SectionSyntactic.swift`.
Doing so enables allocators to contain additional context for use by
allocation functions. Because the allocator is already passed to
_swift_coro_alloc, on the fast path (no allocator, popless) no
allocation function is used, and the allocator is passed in the
swiftcoro register, this is cheap.
This experimental feature will be used to force the compiler to treat `Swift`
runtime availability as separate from platform availability when compiling for
targets that have the Swift runtime built-in.