When performing multi-threaded IRGen, all of the link libraries used
for autolinking went (only) into the first object file. If the object
files were then placed in a static archive, the autolinking would only
kick in if clients that link the static library reference something in
the first object file. This causes link failures.
Make sure that we put the autolink information into each of the object
files we create when doing multi-threaded IRGen.
Fixes the rest of rdar://162400654.
The schemas for the coro frame de/alloc functions were incorrectly not
being assigned and instead the non-frame de/alloc functions were being
assigned twice. Fix that and add more tests.
rdar://163330882
To enable ABIs which store extra info in the frame, add two new slots to
the coroutine allocator function table. For example, a frame could have
a header containing a context pointer at a negative offset from the
address returned from `swift_coro_alloc_frame`. The frame deallocation
function would then know to deallocate more space correspondingly.
Allocator structs are passed in to new ABI yield-once coroutines and
contain pointers to functions to de/allocate memory. Here, those
pointers are signed.
This PR introduces three new instrumentation flags and plumbs them
through to IRGen:
1. `-ir-profile-generate` - enable IR-level instrumentation.
2. `-cs-profile-generate` - enable context-sensitive IR-level
instrumentation.
3. `-ir-profile-use` - IR-level PGO input profdata file to enable
profile-guided optimization (both IRPGO and CSIRPGO)
**Context:**
https://forums.swift.org/t/ir-level-pgo-instrumentation-in-swift/82123
**Swift-driver PR:** https://github.com/swiftlang/swift-driver/pull/1992
**Tests and validation:**
This PR includes ir level verification tests, also checks few edge-cases
when `-ir-profile-use` supplied profile is either missing or is an
invalid IR profile.
However, for argument validation, linking, and generating IR profiles
that can later be consumed by -cs-profile-generate, I’ll need
corresponding swift-driver changes. Those changes are being tracked in
https://github.com/swiftlang/swift-driver/pull/1992
This commit adds -sil-output-path and -ir-output-path frontend options that
allow generating SIL and LLVM IR files as supplementary outputs during normal
compilation.
These options can be useful for debugging and analysis tools
workflows that need access to intermediate compilation artifacts
without requiring separate compiler invocations.
Expected behaviour:
Primary File mode:
- SIL: Generates one .sil file per source file
- IR: Generates one .ll file per source file
Single-threaded WMO mode:
- SIL: Generates one .sil file for the entire module
- IR: Generates one .ll file for the entire module
Multi-threaded WMO mode:
- SIL: Generates one .sil file for the entire module
- IR: Generates separate .ll files per source file
File Maps with WMO:
- Both SIL and IR outputs using first entry's naming, which is
consistent with the behaviour of other supplementary outputs.
rdar://160297898
`.swift_ast` section is now recognized as a "metadata" section by LLVM's
object emission. On WebAssembly, metadata symbols must not appear in the
object file symbol table. Therefore, change the linkage of the
`@__Swift_AST` global from InternalLinkage to PrivateLinkage to avoid
exposing the symbol.
555eaef3ab
Calling setupLLVMOptimizationRemarks overwrites the MainRemarkStreamer
in the LLVM context. This prevents LLVM from serializing the remark meta
information for the already emitted SIL remarks into the object file.
Without the meta information bitstream remarks don't work correctly.
Instead, emit SIL remarks and LLVM remarks to the same RemarkSerializer,
and keep the file stream alive until after CodeGen.
Android before API 29 and a few other platforms don't support native TLS, so
fall back to LLVM's emulated TLS there, just like clang does. Also, make sure
`-Xcc -f{no-,}emulated-tls` flags passed in are applied to control what the
Swift compiler does.
This achieves the same as clang's `-fdebug-info-for-profiling`, which
emits DWARF discriminators to aid in narrowing-down which basic block
corresponds to a particular instruction address. This is particularly
useful for sampling-based profiling.
rdar://135443278
Adds sections `__TEXT,__swift_as_entry`, and `__TEXT,__swift_as_ret` that
contain relative pointers to async functlets modelling async function entries,
and function returns, respectively.
Emission of the sections can be trigger with the frontend option
`-Xfrontend -enable-async-frame-push-pop-metadata`.
This is done by:
* IRGen adding a `async_entry` function attribute to async functions.
* LLVM's coroutine splitting identifying continuation funclets that
model the return from an async function call by adding the function
attribute `async_ret`. (see #llvm-project/pull/9204)
* An LLVM pass that keys off these two function attribute and emits the
metadata into the above mention sections.
rdar://134460666
Based on prior evaluation, this optimization always increases code
size. It splits out blocks and introduces calls, which always adds
extra instructions to shuffle values for calling conventions and
to make the actual call/return. Apple clang avoids the optimization
in `-Oz`, which I think is pretty close to what Swift's `-Osize`
really means.