Commit Graph

15 Commits

Author SHA1 Message Date
Yuta Saito
c32c7f55fe [wasm] Fix build failure due to lack of _Float16 support
WebAssembly does not support _Float16 type, so we need to guard the use
of the type. Unfortunately, Clang does not provide a good way to detect
the support of _Float16 type at compile time, so just disable for wasm
targets.
2024-04-07 16:34:11 +00:00
Stephen Canon
6ac852f3f4 Move Float16 print tests to their own file and test exhaustively (#72859)
We can easily test all 2**16 values, so let's do it. Also now _Float16 is properly supported in clang, so we can pass arguments to CPP that way, which lets us get snan right on more platforms.
2024-04-05 19:55:19 -04:00
Ikko Ashimine
6eee687cd6 Fix typo in SwiftDtoa.cpp
noticable -> noticeable
2022-06-10 23:11:44 +09:00
Josh Soref
5fab3d1f58 Spelling stdlib/public/runtime (#42439)
* spelling: access

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: bridgeable

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: canonical

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: clazz

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: compatibility

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: language

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: necessary

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: platform

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: recursive

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: related

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: repeated

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: satisfy

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: that

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: the

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: verification

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

Co-authored-by: Josh Soref <jsoref@users.noreply.github.com>
2022-04-19 14:03:03 -07:00
Thomas
9ea48da9cc Shouldn't this be labeled as a C++ file? 2021-09-13 20:45:24 +01:00
OneUpWallStreet
25df5d2c48 Fixed some common grammatical errors in the comments. (#38248) 2021-07-05 01:00:48 -07:00
tbkka
a32dacb131 SwiftDtoa v2: Better, Smaller, Faster floating-point formatting (#35299)
* SwiftDtoa v2: Better, Smaller, Faster floating-point formatting

SwiftDtoa is the C/C++ code used in the Swift runtime to produce the textual representations used by the `description` and `debugDescription` properties of the standard Swift floating-point types.
This update includes a number of algorithmic improvements to SwiftDtoa to improve portability, reduce code size, and improve performance but does not change the actual output.

About SwiftDtoa
===============

In early versions of Swift, the `description` properties used the C library `sprintf` functionality with a fixed number of digits.
In 2018, that logic was replaced with the first version of SwiftDtoa which used used a fast, adaptive algorithm to automatically choose the correct number of digits for a particular value.
The resulting decimal output is always:

* Accurate.  Parsing the decimal form will yield exactly the same binary floating-point value again. This guarantee holds for any parser that accurately implements IEEE 754. In particular, the Swift standard library can guarantee that for any Double `d` that is not a NaN, `Double(d.description) == d`.

* Short. Among all accurate forms, this form has the fewest significant digits. (Caution: Surprisingly, this is not the same as minimizing the number of characters. In some cases, minimizing the number of characters requires producing additional significant digits.)

* Close. If there are multiple accurate, short forms, this code chooses the decimal form that is closest to the exact binary value.  If there are two exactly the same distance, the one with an even final digit will be used.

Algorithms that can produce this "optimal" output have been known since at least 1990, when Steele and White published their Dragon4 algorithm.
However, Dragon4 and other algorithms from that period relied on high-precision integer arithmetic, which made them slow.
More recently, a surge of interest in this problem has produced dramatically better algorithms that can produce the same results using only fast fixed-precision arithmetic.

This format is ideal for JSON and other textual interchange: accuracy ensures that the value will be correctly decoded, shortness minimizes network traffic, and the existence of high-performance algorithms allows this form to be generated more quickly than many `printf`-based implementations.

This format is also ideal for logging, debugging, and other general display. In particular, the shortness guarantee avoids the confusion of unnecessary additional digits, so that the result of `1.0 / 10.0` consistently displays as `0.1` instead of `0.100000000000000000001`.

About SwiftDtoa v2
==================

Compared to the original SwiftDtoa code, this update is:

**Better**:
The core logic is implemented using only C99 features with 64-bit and smaller integer arithmetic.
If available, 128-bit integers are used for better performance.
The core routines do not require any floating-point support from the C/C++ standard library and with only minor modifications should be usable on systems with no hardware or software floating-point support at all.
This version also has experimental support for IEEE 754 binary128 format, though this support is obviously not included when compiling for the Swift standard library.

**Smaller**:
Code size reduction compared to the earlier versions was a primary goal for this effort.
In particular, the new binary128 support shares essentially all of its code with the float80 implementation.

**Faster**:
Even with the code size reductions, all formats are noticeably faster.
The primary performance gains come from three major changes:
Text digits are now emitted directly in the core routines in a form that requires only minimal adjustment to produce the final text.
Digit generation produces 2, 4, or even 8 digits at a time, depending on the format.
The double logic optimistically produces 7 digits in the initial scaling with a Ryu-inspired backtracking when fewer digits suffice.

SwiftDtoa's algorithms
======================

SwiftDtoa started out as a variation of Florian Loitsch' Grisu2 that addressed the shortness failures of that algorithm.
Subsequent work has incorporated ideas from Errol3, Ryu, and other sources to yield a production-quality implementation that is performance- and size-competitive with current research code.

Those who wish to understand the details can read the extensive comments included in the code.
Note that float16 actually uses a different algorithm than the other formats, as the extremely limited range can be handled with much simpler techniques.
The float80/binary128 logic sacrifices some performance optimizations in order to minimize the code size for these less-used formats; the goal for SwiftDtoa v2 has been to match the float80 performance of earlier implementations while reducing code size and widening the arithmetic routines sufficiently to support binary128.

SwiftDtoa Testing
=================

A newly-developed test harness generates several large files of test data that include known-correct results computed with high-precision arithmetic routines.
The test files include:
* Critical values generated by the algorithm presented in the Errol paper (about 48 million cases for binary128)
* Values for which the optimal decimal form is exactly midway between two binary floating-point values.
* All exact powers of two representable in this format.
* Floating-point values that are close to exact powers of ten.

In addition, several billion random values for each format were compared to the results from other implementations.
For binary16 and binary32 this provided exhaustive validation of every possible input value.

Code Size and Performance
=========================

The tables below summarize the code size and performance for the SwiftDtoa C library module by itself on several different processor architectures.
When used from Swift, the `.description` and `.debugDescription` implementations incur additional overhead for creating and returning Swift strings that are not captured here.

The code size tables show the total size in bytes of the compiled `.o` object files for a particular version of that code.
The headings indicate the floating-point formats supported by that particular build (e.g., "16,32" for a version that supports binary16 and binary32 but no other formats).

The performance numbers below were obtained from a custom test harness that generates random bit patterns, interprets them as the corresponding floating-point value, and averages the overall time.
For float80, the random bit patterns were generated in a way that avoids generating invalid values.

All code was compiled with the system C/C++ compiler using `-O2` optimization.

A few notes about particular implementations:
* **SwiftDtoa v1** is the original SwiftDtoa implementation as committed to the Swift runtime in April 2018.
* **SwiftDtoa v1a** is the same as SwiftDtoa v1 with added binary16 support.
* **SwiftDtoa v2** can be configured with preprocessor macros to support any subset of the supported formats.  I've provided sizes here for several different build configurations.
* **Ryu** (Ulf Anders) implements binary32 and binary64 as completely independent source files.  The size here is the total size of the two .o object files.
* **Ryu(size)** is Ryu compiled with the `RYU_OPTIMIZE_SIZE` option.
* **Dragonbox** (Junekey Jeon).  The size here is the compiled size of a simple `.cpp` file that instantiates the template for the specified formats, plus the size of the associated text output logic.
* **Dragonbox(size)** is Dragonbox compiled to minimize size by using a compressed power-of-10 table.
* **gdtoa** has a very large feature set.  For this reason, I excluded it from the code size comparison since I didn't consider the numbers to be comparable to the others.

x86_64
----------------

These were built using Apple clang 12.0.5 on a 2019 16" MacBook Pro (2.4GHz 8-core Intel Core i9) running macOS 11.1.

**Code Size**

Bold numbers here indicate the configurations that have shipped as part of the Swift runtime.

|               | 16,32,64,80 | 32,64,80    | 32,64       |
|---------------|------------:|------------:|------------:|
|SwiftDtoa v1   |             |   **15128** |             |
|SwiftDtoa v1a  |   **16888** |             |             |
|SwiftDtoa v2   |   **20220** |     18628   |        8248 |
|Ryu            |             |             |       40408 |
|Ryu(size)      |             |             |       23836 |
|Dragonbox      |             |             |       23176 |
|Dragonbox(size)|             |             |       15132 |

**Performance**

|              | binary16 | binary32 | binary64 | float80 | binary128 |
|--------------|---------:|---------:|---------:|--------:|----------:|
|SwiftDtoa v1  |          |     25ns |     46ns |    82ns |           |
|SwiftDtoa v1a |     37ns |     26ns |     47ns |    83ns |           |
|SwiftDtoa v2  |     22ns |     19ns |     31ns |    72ns |      90ns |
|Ryu           |          |     19ns |     26ns |         |           |
|Ryu(size)     |          |     17ns |     24ns |         |           |
|Dragonbox     |          |     19ns |     24ns |         |           |
|Dragonbox(size) |        |     19ns |     29ns |         |           |
|gdtoa         |    220ns |    381ns |   1184ns | 16044ns |   22800ns |

ARM64
----------------

These were built using Apple clang 12.0.0 on a 2020 M1 Mac Mini running macOS 11.1.

**Code Size**

|               | 16,32,64 | 32,64 |
|---------------|---------:|------:|
|SwiftDtoa v1   |          |  7436 |
|SwiftDtoa v1a  |     9124 |       |
|SwiftDtoa v2   |     9964 |  8228 |
|Ryu            |          | 35764 |
|Ryu(size)      |          | 16708 |
|Dragonbox      |          | 27108 |
|Dragonbox(size)|          | 19172 |

**Performance**

|              | binary16 | binary32 | binary64 | float80 | binary128 |
|--------------|---------:|---------:|---------:|--------:|----------:|
|SwiftDtoa v1  |          |     21ns |     39ns |         |           |
|SwiftDtoa v1a |     17ns |     21ns |     39ns |         |           |
|SwiftDtoa v2  |     15ns |     17ns |     29ns |    54ns |      71ns |
|Ryu           |          |     15ns |     19ns |         |           |
|Ryu(size)     |          |     29ns |     24ns |         |           |
|Dragonbox     |          |     16ns |     24ns |         |           |
|Dragonbox(size) |        |     15ns |     34ns |         |           |
|gdtoa         |    143ns |    242ns |    858ns | 25129ns |   36195ns |

ARM32
----------------

These were built using clang 8.0.1 on a BeagleBone Black (500MHz ARMv7) running FreeBSD 12.1-RELEASE.

**Code Size**

|               | 16,32,64 | 32,64 |
|---------------|---------:|------:|
|SwiftDtoa v1   |          |  8668 |
|SwiftDtoa v1a  |    10356 |       |
|SwiftDtoa v2   |     9796 |  8340 |
|Ryu            |          | 32292 |
|Ryu(size)      |          | 14592 |
|Dragonbox      |          | 29000 |
|Dragonbox(size)|          | 21980 |

**Performance**

|              | binary16 | binary32 | binary64 | float80 | binary128 |
|--------------|---------:|---------:|---------:|--------:|----------:|
|SwiftDtoa v1  |          |    459ns |   1152ns |         |           |
|SwiftDtoa v1a |    383ns |    451ns |   1148ns |         |           |
|SwiftDtoa v2  |    202ns |    357ns |    715ns |  2720ns |    3379ns |
|Ryu           |          |    345ns |   5450ns |         |           |
|Ryu(size)     |          |    786ns |   5577ns |         |           |
|Dragonbox     |          |    300ns |    904ns |         |           |
|Dragonbox(size) |        |    294ns |   1021ns |         |           |
|gdtoa         |   2180ns |   4749ns |  18742ns |293000ns |  440000ns |

* This is fast enough now even for non-optimized test runs

* Fix float80 Nan/Inf parsing, comment more thoroughly
2021-01-27 14:35:55 -08:00
Anthony Latsis
9fd1aa5d59 [NFC] Pre- increment and decrement where possible 2020-06-01 15:39:29 +03:00
tbkka
110e5136c1 Float16 optimal formatting (#30862)
Extend SwiftDtoa to provide optimal formatting for Float16 and use that for `Float16.description` and `Float16.debugDescription`.

Notes on signaling NaNs: LLVM's Float16 support passes Float16s on x86
by legalizing to Float32.  This works well for most purposes but incidentally
loses the signaling marker from any NaN (because it's a conversion as far
as the hardware is concerned), with a side effect that the print code never
actually sees a true sNaN.  This is similar to what happens with Float and
Double on i386 backends.  The earlier code here tried to detect sNaN in a
different way, but that approach isn't guaranteed to work so we decided to
make this code use the correct detection logic -- sNaN printing will just be
broken until we can get a better argument passing convention.

Resolves rdar://61414101
2020-04-09 09:37:38 -04:00
David Zarzycki
95473a10d7 [Misc] NFC: Fix random build warnings
Unused variables/methods, language extensions, extra semicolons, intentional
self assignment, platform specific quirks, etc.
2018-04-30 12:52:43 -04:00
tbkka
924b8e55d5 Updates to Floating-point printing code (SwiftDtoa.cpp) (#16178)
This collects a number of changes I've been testing over the
last month.

* Bug fix: The single-precision float formatter did not always
  round the last digit even in cases where there were two
  possible outputs that were otherwise equally good.

* Algorithm simplification: The condition for determining
  whether to widen or narrow the interval was more complex than
  necessary. I now simply widen the interval for all even
  significands.

* Code simplification: The single-precision float formatter now uses fewer
  64-bit features.  This eliminated some 32-bit vs. 64-bit conditionals in
  exchange for a minor loss of performance (~2%).

* Minor performance tweaks: Steve Canon pointed out a few places
  where I could avoid some extraneous arithmetic.

I've also rewritten a lot of comments to try to make the exposition
clearer.

The earlier testing regime focused on testing from first
principles.  For example, I verified accuracy by feeding the
result back into the C library `strtof`, `strtod`, etc. and
checking round-trip exactness.  Unfortunately, this approach
requires many checks for each value, limiting test performance.
It's also difficult to validate last-digit rounding.

For this round of updates, I've instead compared the digit
decompositions to other popular algorithms:
* David M. Gay's gdtoa library is a robust and well-tested
  implementation based on Dragon4.  It supports all formats, but
  is slow. (netlib.org/fp)
* Grisu3 supports Float and Double.  It is fast but incomplete,
  failing on about 1% of all inputs.
  (github.com/google/double-conversion)
* Errol4 is fast and complete but only supports Double.  The
  repository includes an implementation of the enumeration
  algorithm described in the Errol paper.
  (github.com/marcandrysco/errol)

The exact tests varied by format:

* Float: SwiftDtoa now generates the exact same digits as gdtoa
  for every single-precision Float.

* Double: Testing against Grisu3 (with fallback to Errol4 when
  Grisu3 failed) greatly improved test performance.  This
  allowed me to test 100 trillion (10^14) randomly-selected
  doubles in a reasonable amount of time.  I also checked all
  values generated by the Errol enumeration algorithm.

* Float80: I compared the Float80 output to the gdtoa library
  because neither Grisu3 nor Errol4 yet supports 80-bit extended
  precision.  All values generated by the Errol enumeration
  algorithm have been checked, as well as several billion
  randomly-selected values.
2018-04-26 16:09:49 -07:00
Saleem Abdulrasool
d3ab36fd46 Merge pull request #15745 from compnerd/dota
runtime: avoid UB on Windows x86_64 builds
2018-04-15 18:18:47 -07:00
tbkka
1cc1832b96 SR-3131: Adjust choice of decimal vs. exponential format (#15805)
Merge SR-3131 fix:

For each floating-point type, there is a range of integers which
can be exactly represented in that type.  Adjust the formatting
logic so that we use decimal format for integers within this
range, exponential format for numbers outside of this range.

For example, Double has a 53-bit significand so can exactly
represent every integer from `-(2^53)...(2^53)`.  With this
change, we now use decimal format for these integers and
exponential format for values outside of this range.  This is
a relatively small change from the previous logic -- we've
basically just moved the cutoff from 10^15 to 2^53 (about 10^17).

The decision for using exponential format for small numbers is
not changed.
2018-04-12 09:35:49 -07:00
Saleem Abdulrasool
ebb21f50ee runtime: avoid UB on Windows x86_64 builds
Use the `ull` extension rather than `l` for LLP64 environment support.
Fixes UB on the Windows x86_64 port.
2018-04-04 14:34:49 -07:00
tbkka
97a934c412 SR-106: New floating-point description implementation (#15474)
* SR-106: New floating-point `description` implementation

This replaces the current implementation of `description` and
`debugDescription` for the standard floating-point types with a new
formatting routine based on a variation of Florian Loitsch' Grisu2
algorithm with changes suggested by Andrysco, Jhala, and Lerner's 2016
paper describing Errol3.

Unlike the earlier code based on `sprintf` with a fixed number of
digits, this version always chooses the optimal number of digits.  As
such, we can now use the exact same output for both `description` and
`debugDescription` (except of course that `debugDescription` provides
full detail for NaNs).

The implementation has been extensively commented; people familiar with
Grisu-style algorithms should find the code easy to understand.

This implementation is:

* Fast.  It uses only fixed-width integer arithmetic and has constant
  memory and time requirements.

* Simple. It is only a little more complex than Loitsch' original
  implementation of Grisu2.  The digit decomposition logic for double is
  less than 300 lines of standard C (half of which is common arithmetic
  support routines).

* Always Accurate. Converting the decimal form back to binary (using an
  accurate algorithm such as Clinger's) will always yield exactly the
  original binary value.  For the IEEE 754 formats, the round-trip will
  produce exactly the same bit pattern in memory.  This is an essential
  requirement for JSON serialization, debugging, and logging.

* Always Short.  This always selects an accurate result with the minimum
  number of decimal digits.  (So that `1.0 / 10.0` will always print
  `0.1`.)

* Always Close.  Among all accurate, short results, this always chooses
  the result that is closest to the exact floating-point value. (In case
  of an exact tie, it rounds the last digit even.)

This resolves SR-106 and related issues that have complained
about the floating-point `description` properties being inexact.

* Remove duplicate infinity handling

* Use defined(__SIZEOF_INT128__) to detect uint128_t support

* Separate `extracting` the integer part from `clearing` the integer part

The previous code was unnecessarily obfuscated by the attempt to combine
these two operations.

* Use `UINT32_MAX` to mask off 32 bits of a larger integer

* Correct the expected NaN results for 32-bit i386

* Make the C++ exceptions here consistent

Adding a C source file somehow exposed an issue in an unrelated C++ file.
Thanks to Joe Groff for the fix.

* Rename SwiftDtoa to ".cpp"

Having a C file in stdlib/public/runtime causes strange
build failures on Linux in unrelated C++ files.

As a workaround, rename SwiftDtoa.c to .cpp to see
if that avoids the problems.

* Revert "Make the C++ exceptions here consistent"

This reverts commit 6cd5c20566.
2018-04-01 16:52:48 -07:00