Added SWIFT_RUNTIME_WEAK_IMPORT/CHECK/USE macros.
Everything supports fast dealloc except x86 iOS simulators, so we no longer need
to look up objc_has_weak_formation_callout.
Added direct references for
objc_setHook_lazyClassNamer
_objc_realizeClassFromSwift
objc_setHook_getClass
os_system_version_get_current_version
_dyld_is_objc_constant
This type can be put to good use outside of the concurrency runtime. Reexport it from libswift_Concurrency so it can still be found there as well.
rdar://79352061
* SwiftDtoa v2: Better, Smaller, Faster floating-point formatting
SwiftDtoa is the C/C++ code used in the Swift runtime to produce the textual representations used by the `description` and `debugDescription` properties of the standard Swift floating-point types.
This update includes a number of algorithmic improvements to SwiftDtoa to improve portability, reduce code size, and improve performance but does not change the actual output.
About SwiftDtoa
===============
In early versions of Swift, the `description` properties used the C library `sprintf` functionality with a fixed number of digits.
In 2018, that logic was replaced with the first version of SwiftDtoa which used used a fast, adaptive algorithm to automatically choose the correct number of digits for a particular value.
The resulting decimal output is always:
* Accurate. Parsing the decimal form will yield exactly the same binary floating-point value again. This guarantee holds for any parser that accurately implements IEEE 754. In particular, the Swift standard library can guarantee that for any Double `d` that is not a NaN, `Double(d.description) == d`.
* Short. Among all accurate forms, this form has the fewest significant digits. (Caution: Surprisingly, this is not the same as minimizing the number of characters. In some cases, minimizing the number of characters requires producing additional significant digits.)
* Close. If there are multiple accurate, short forms, this code chooses the decimal form that is closest to the exact binary value. If there are two exactly the same distance, the one with an even final digit will be used.
Algorithms that can produce this "optimal" output have been known since at least 1990, when Steele and White published their Dragon4 algorithm.
However, Dragon4 and other algorithms from that period relied on high-precision integer arithmetic, which made them slow.
More recently, a surge of interest in this problem has produced dramatically better algorithms that can produce the same results using only fast fixed-precision arithmetic.
This format is ideal for JSON and other textual interchange: accuracy ensures that the value will be correctly decoded, shortness minimizes network traffic, and the existence of high-performance algorithms allows this form to be generated more quickly than many `printf`-based implementations.
This format is also ideal for logging, debugging, and other general display. In particular, the shortness guarantee avoids the confusion of unnecessary additional digits, so that the result of `1.0 / 10.0` consistently displays as `0.1` instead of `0.100000000000000000001`.
About SwiftDtoa v2
==================
Compared to the original SwiftDtoa code, this update is:
**Better**:
The core logic is implemented using only C99 features with 64-bit and smaller integer arithmetic.
If available, 128-bit integers are used for better performance.
The core routines do not require any floating-point support from the C/C++ standard library and with only minor modifications should be usable on systems with no hardware or software floating-point support at all.
This version also has experimental support for IEEE 754 binary128 format, though this support is obviously not included when compiling for the Swift standard library.
**Smaller**:
Code size reduction compared to the earlier versions was a primary goal for this effort.
In particular, the new binary128 support shares essentially all of its code with the float80 implementation.
**Faster**:
Even with the code size reductions, all formats are noticeably faster.
The primary performance gains come from three major changes:
Text digits are now emitted directly in the core routines in a form that requires only minimal adjustment to produce the final text.
Digit generation produces 2, 4, or even 8 digits at a time, depending on the format.
The double logic optimistically produces 7 digits in the initial scaling with a Ryu-inspired backtracking when fewer digits suffice.
SwiftDtoa's algorithms
======================
SwiftDtoa started out as a variation of Florian Loitsch' Grisu2 that addressed the shortness failures of that algorithm.
Subsequent work has incorporated ideas from Errol3, Ryu, and other sources to yield a production-quality implementation that is performance- and size-competitive with current research code.
Those who wish to understand the details can read the extensive comments included in the code.
Note that float16 actually uses a different algorithm than the other formats, as the extremely limited range can be handled with much simpler techniques.
The float80/binary128 logic sacrifices some performance optimizations in order to minimize the code size for these less-used formats; the goal for SwiftDtoa v2 has been to match the float80 performance of earlier implementations while reducing code size and widening the arithmetic routines sufficiently to support binary128.
SwiftDtoa Testing
=================
A newly-developed test harness generates several large files of test data that include known-correct results computed with high-precision arithmetic routines.
The test files include:
* Critical values generated by the algorithm presented in the Errol paper (about 48 million cases for binary128)
* Values for which the optimal decimal form is exactly midway between two binary floating-point values.
* All exact powers of two representable in this format.
* Floating-point values that are close to exact powers of ten.
In addition, several billion random values for each format were compared to the results from other implementations.
For binary16 and binary32 this provided exhaustive validation of every possible input value.
Code Size and Performance
=========================
The tables below summarize the code size and performance for the SwiftDtoa C library module by itself on several different processor architectures.
When used from Swift, the `.description` and `.debugDescription` implementations incur additional overhead for creating and returning Swift strings that are not captured here.
The code size tables show the total size in bytes of the compiled `.o` object files for a particular version of that code.
The headings indicate the floating-point formats supported by that particular build (e.g., "16,32" for a version that supports binary16 and binary32 but no other formats).
The performance numbers below were obtained from a custom test harness that generates random bit patterns, interprets them as the corresponding floating-point value, and averages the overall time.
For float80, the random bit patterns were generated in a way that avoids generating invalid values.
All code was compiled with the system C/C++ compiler using `-O2` optimization.
A few notes about particular implementations:
* **SwiftDtoa v1** is the original SwiftDtoa implementation as committed to the Swift runtime in April 2018.
* **SwiftDtoa v1a** is the same as SwiftDtoa v1 with added binary16 support.
* **SwiftDtoa v2** can be configured with preprocessor macros to support any subset of the supported formats. I've provided sizes here for several different build configurations.
* **Ryu** (Ulf Anders) implements binary32 and binary64 as completely independent source files. The size here is the total size of the two .o object files.
* **Ryu(size)** is Ryu compiled with the `RYU_OPTIMIZE_SIZE` option.
* **Dragonbox** (Junekey Jeon). The size here is the compiled size of a simple `.cpp` file that instantiates the template for the specified formats, plus the size of the associated text output logic.
* **Dragonbox(size)** is Dragonbox compiled to minimize size by using a compressed power-of-10 table.
* **gdtoa** has a very large feature set. For this reason, I excluded it from the code size comparison since I didn't consider the numbers to be comparable to the others.
x86_64
----------------
These were built using Apple clang 12.0.5 on a 2019 16" MacBook Pro (2.4GHz 8-core Intel Core i9) running macOS 11.1.
**Code Size**
Bold numbers here indicate the configurations that have shipped as part of the Swift runtime.
| | 16,32,64,80 | 32,64,80 | 32,64 |
|---------------|------------:|------------:|------------:|
|SwiftDtoa v1 | | **15128** | |
|SwiftDtoa v1a | **16888** | | |
|SwiftDtoa v2 | **20220** | 18628 | 8248 |
|Ryu | | | 40408 |
|Ryu(size) | | | 23836 |
|Dragonbox | | | 23176 |
|Dragonbox(size)| | | 15132 |
**Performance**
| | binary16 | binary32 | binary64 | float80 | binary128 |
|--------------|---------:|---------:|---------:|--------:|----------:|
|SwiftDtoa v1 | | 25ns | 46ns | 82ns | |
|SwiftDtoa v1a | 37ns | 26ns | 47ns | 83ns | |
|SwiftDtoa v2 | 22ns | 19ns | 31ns | 72ns | 90ns |
|Ryu | | 19ns | 26ns | | |
|Ryu(size) | | 17ns | 24ns | | |
|Dragonbox | | 19ns | 24ns | | |
|Dragonbox(size) | | 19ns | 29ns | | |
|gdtoa | 220ns | 381ns | 1184ns | 16044ns | 22800ns |
ARM64
----------------
These were built using Apple clang 12.0.0 on a 2020 M1 Mac Mini running macOS 11.1.
**Code Size**
| | 16,32,64 | 32,64 |
|---------------|---------:|------:|
|SwiftDtoa v1 | | 7436 |
|SwiftDtoa v1a | 9124 | |
|SwiftDtoa v2 | 9964 | 8228 |
|Ryu | | 35764 |
|Ryu(size) | | 16708 |
|Dragonbox | | 27108 |
|Dragonbox(size)| | 19172 |
**Performance**
| | binary16 | binary32 | binary64 | float80 | binary128 |
|--------------|---------:|---------:|---------:|--------:|----------:|
|SwiftDtoa v1 | | 21ns | 39ns | | |
|SwiftDtoa v1a | 17ns | 21ns | 39ns | | |
|SwiftDtoa v2 | 15ns | 17ns | 29ns | 54ns | 71ns |
|Ryu | | 15ns | 19ns | | |
|Ryu(size) | | 29ns | 24ns | | |
|Dragonbox | | 16ns | 24ns | | |
|Dragonbox(size) | | 15ns | 34ns | | |
|gdtoa | 143ns | 242ns | 858ns | 25129ns | 36195ns |
ARM32
----------------
These were built using clang 8.0.1 on a BeagleBone Black (500MHz ARMv7) running FreeBSD 12.1-RELEASE.
**Code Size**
| | 16,32,64 | 32,64 |
|---------------|---------:|------:|
|SwiftDtoa v1 | | 8668 |
|SwiftDtoa v1a | 10356 | |
|SwiftDtoa v2 | 9796 | 8340 |
|Ryu | | 32292 |
|Ryu(size) | | 14592 |
|Dragonbox | | 29000 |
|Dragonbox(size)| | 21980 |
**Performance**
| | binary16 | binary32 | binary64 | float80 | binary128 |
|--------------|---------:|---------:|---------:|--------:|----------:|
|SwiftDtoa v1 | | 459ns | 1152ns | | |
|SwiftDtoa v1a | 383ns | 451ns | 1148ns | | |
|SwiftDtoa v2 | 202ns | 357ns | 715ns | 2720ns | 3379ns |
|Ryu | | 345ns | 5450ns | | |
|Ryu(size) | | 786ns | 5577ns | | |
|Dragonbox | | 300ns | 904ns | | |
|Dragonbox(size) | | 294ns | 1021ns | | |
|gdtoa | 2180ns | 4749ns | 18742ns |293000ns | 440000ns |
* This is fast enough now even for non-optimized test runs
* Fix float80 Nan/Inf parsing, comment more thoroughly
In debug configurations, fatal error messages include file & line number information. This update presents this information in a format matching the diagnostic conventions used by the compiler, which can be a slight productivity boost.
Code compiled with optimizations enabled (which is most production code) does not output this information, so it isn’t affected by this change.
Original format:
Fatal error: A suffusion of yellow: file calc.swift, line 5
New format:
calc.swift:5: Fatal error: A suffusion of yellow
Resolves rdar://68484891
Previously, overflow and underflow both caused this to return `nil`, which causes several problems:
* It does not distinguish between a large but valid input and a malformed input. `Float("3.402824e+38")` is perfectly well-formed but returns nil
* It differs from how the compiler handles literals. As a result, `Float(3.402824e+38)` is very different from `Float("3.402824e+38")`
* It's inconsistent with Foundation Scanner()
* It's inconsistent with other programming languages
This is exactly the same as #25313
Fixes rdar://problem/36990878
Most SwiftShims were put in the swift namespace in C++ mode which broke certain things when importing them in a swift file in C++ mode. This was OK when they were only imported as part of the swift runtime but, now they are used in C++ mode both in the swift runtime and when C++ interop is enabled.
This broke when C++ interop was enabled because the `Swift` module contains references to symbols in the SwiftShims headers which are built without C++ interop enabled (no "swift" namespace). But, when C++ interop is enabled, the SwiftShims headers would put everything in the swift namespace meaning the symbols couldn't be found in the global namespace. Then, the compiler would error when trying to deserialize the Swift module.
Currently, _swift_stdlib_strtoX_clocale_impl is present here twice: one
general definition for most platforms that wraps the standard strto*
functions, and another general definition with a slightly different
implementation for Cygwin, Haiku, and Windows, using stringstreams to
achieve a similar result. Furthermore, for Windows, the stringstream
implementation isn't even used but specialized away.
Firstly: the stringstream implementation is slightly broken, since it
causes some of the unit tests to fail; secondly, there is an awful lot
of repetition here that is ripe for simplification.
Instead of duplicating twice and using template specialization to induce
platform-specific behavior, we massage the stringstream implementation
for Cygwin and Haiku into something that looks like a standard strto*
function.
Now we have one definition (for each type) of swift_strto*_l. By
default, the _swift_stdlib_strto*_clocale stubs will refer to strto*_l,
and platforms withing to make use of the swift_strto*_l stubs can make
the necessary preprocessor definitions to utilize them.
Instead of putting the stubs alongside the redefinitions in each platform
preprocessor section, split these out, in anticipation for consolidation
in the next commit.
Here the template specializations can be adapted to a strto* wrapper, for
use with the general function-pointer implementation of
_swift_stdlib_strtoX_clocale_impl.
The template defined for Cygwin and friends does not handle failure cases
properly so the NumericParsing unit test fails. To get the correct test
behavior, we need to use an implementation such as like in the Windows
specializations or the non-Cygwin implementation that takes a function
pointer.
However, adding additional specializations would be too wordy, and
OpenBSD doesn't have locale-dependent definitions of the relevant strto*
functions. We could add a specialization for a two-argument function
pointer, but that becomes too repetitive.
Instead, implement a few stubs and use the preprocessor, a la the
implementation for Android.
There are a few environment variables used to enable debugging options in the
runtime, and we'll likely add more over time. These are implemented with
scattered getenv() calls at the point of use. This is inefficient, as most/all
OSes have to do a linear scan of the environment for each call. It's also not
discoverable, since the only way to find these variables is to inspect the
source.
This commit places all of these variables in a central location.
stdlib/public/runtime/EnvironmentVariables.def defines all of the debug
variables including their name, type, default value, and a help string. On OSes
which make an `environ` array available, the entire array is scanned in a single
pass the first time any debug variable is requested. By quickly rejecting
variables that do not start with `SWIFT_`, we optimize for the common case where
no debug variables are set. We also have a fallback to repeated `getenv()` calls
when a full scan is not possible.
Setting `SWIFT_HELP=YES` will print out all available debug variables along with
a brief description of what they do.
This adds a new copy of LLVMSupport into the runtime. This is the final
step before changing the inline namespace for the runtime support. This
will allow us to avoid the ODR violations from the header definitions of
LLVMSupport.
LLVMSupport forked at: 22492eead218ec91d349c8c50439880fbeacf2b7
Changes made to LLVMSupport from that revision:
process.inc forward declares `_beginthreadex` due to compilation issues due to custom flag handling
API changes required that we alter the `Deallocate` routine to account
for the alignment.
This is a temporary state, meant to simplify the process. We do not use
the entire LLVMSupport library and there is no value in keeping the
entire library. Subsequent commits will prune the library to the needs
for the runtime.
Rather than using the forward declaration for the LLVMSupport types,
expect to be able to use the full declaration. Because these are
references in the implementation, there is no reason to use a forward
declaration as the full types need to be declared for use. The LLVM
headers will provide the declaration and definition for the types. This
is motivated by the desire to ensure that the LLVMSupport symbols are
properly namespaced to avoid ODR violations in the runtime.
This reduces the dependency on `LLVMSupport`. This is the first step
towards helping move towards a local fork of the LLVM ADT to ensure that
static linking of the Swift runtime and core library does not result in
ODR violations.
This reverts commit 5fd6e98b2f, reversing
changes made to 3aee49d9d0.
Revert "XFAIL test/Interpreter/metadata_access.swift on arm64e"
This reverts commit 8fe216b004.
Revert "XFAIl test on os stdlib bots"
This reverts commit aea5fa4842.
Extend SwiftDtoa to provide optimal formatting for Float16 and use that for `Float16.description` and `Float16.debugDescription`.
Notes on signaling NaNs: LLVM's Float16 support passes Float16s on x86
by legalizing to Float32. This works well for most purposes but incidentally
loses the signaling marker from any NaN (because it's a conversion as far
as the hardware is concerned), with a side effect that the print code never
actually sees a true sNaN. This is similar to what happens with Float and
Double on i386 backends. The earlier code here tried to detect sNaN in a
different way, but that approach isn't guaranteed to work so we decided to
make this code use the correct detection logic -- sNaN printing will just be
broken until we can get a better argument passing convention.
Resolves rdar://61414101