Files
swift-mirror/stdlib/public/runtime/SwiftDtoa.cpp
tbkka 97a934c412 SR-106: New floating-point description implementation (#15474)
* SR-106: New floating-point `description` implementation

This replaces the current implementation of `description` and
`debugDescription` for the standard floating-point types with a new
formatting routine based on a variation of Florian Loitsch' Grisu2
algorithm with changes suggested by Andrysco, Jhala, and Lerner's 2016
paper describing Errol3.

Unlike the earlier code based on `sprintf` with a fixed number of
digits, this version always chooses the optimal number of digits.  As
such, we can now use the exact same output for both `description` and
`debugDescription` (except of course that `debugDescription` provides
full detail for NaNs).

The implementation has been extensively commented; people familiar with
Grisu-style algorithms should find the code easy to understand.

This implementation is:

* Fast.  It uses only fixed-width integer arithmetic and has constant
  memory and time requirements.

* Simple. It is only a little more complex than Loitsch' original
  implementation of Grisu2.  The digit decomposition logic for double is
  less than 300 lines of standard C (half of which is common arithmetic
  support routines).

* Always Accurate. Converting the decimal form back to binary (using an
  accurate algorithm such as Clinger's) will always yield exactly the
  original binary value.  For the IEEE 754 formats, the round-trip will
  produce exactly the same bit pattern in memory.  This is an essential
  requirement for JSON serialization, debugging, and logging.

* Always Short.  This always selects an accurate result with the minimum
  number of decimal digits.  (So that `1.0 / 10.0` will always print
  `0.1`.)

* Always Close.  Among all accurate, short results, this always chooses
  the result that is closest to the exact floating-point value. (In case
  of an exact tie, it rounds the last digit even.)

This resolves SR-106 and related issues that have complained
about the floating-point `description` properties being inexact.

* Remove duplicate infinity handling

* Use defined(__SIZEOF_INT128__) to detect uint128_t support

* Separate `extracting` the integer part from `clearing` the integer part

The previous code was unnecessarily obfuscated by the attempt to combine
these two operations.

* Use `UINT32_MAX` to mask off 32 bits of a larger integer

* Correct the expected NaN results for 32-bit i386

* Make the C++ exceptions here consistent

Adding a C source file somehow exposed an issue in an unrelated C++ file.
Thanks to Joe Groff for the fix.

* Rename SwiftDtoa to ".cpp"

Having a C file in stdlib/public/runtime causes strange
build failures on Linux in unrelated C++ files.

As a workaround, rename SwiftDtoa.c to .cpp to see
if that avoids the problems.

* Revert "Make the C++ exceptions here consistent"

This reverts commit 6cd5c20566.
2018-04-01 16:52:48 -07:00

2440 lines
95 KiB
C++

//===--- SwiftDtoa.c ---------------------------------------------*- c -*-===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2018 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===---------------------------------------------------------------------===//
//
// Note: This is really a C file, but Swift's build system for Linux is
// partially allergic to C, so it's being compiled as ".cpp" for now. Please
// don't infect it with C++-isms.
//
///
/// The core algorithm here (see `swift_decompose_double` below) is a
/// modified form of the Grisu2 algorithm from Florian Loitsch;
/// "Printing Floating-Point Numbers Quickly and Accurately with
/// Integers", 2010. https://dl.acm.org/citation.cfm?id=1806623
///
/// This includes some improvements suggested by the "Errol paper":
/// Marc Andrysco, Ranjit Jhala, Sorin Lerner; "Printing
/// Floating-Point Numbers: A Faster, Always Correct Method", 2016.
/// https://dl.acm.org/citation.cfm?id=2837654
///
/// The following summary assumes you're familiar with Grisu-style
/// algorithms in general:
///
/// Loitsch' original Grisu2 implementation guarantees round-trip
/// accuracy but only generates the shortest decimal expansion about 99%
/// of the time. Grisu3 addresses this by essentially running two
/// copies of Grisu2 in parallel: one copy biased to avoid inaccuracy,
/// the other biased to avoid generating excess digits. If they agree,
/// then the result must be both accurate and short. But if they
/// disagree, then Grisu3 gives up and must defer to another algorithm.
///
/// The Errol paper provides a deeper analysis of the cases where
/// Grisu2 fails to find the shortest decimal expansion. There
/// are two root causes of such failures:
///
/// * Insufficient precision leads to scattered failures across the
/// entire range. Theorem 7 in the Errol paper shows a way to
/// construct a superset of the numbers subject to such failures.
/// With this list, we can simply test whether we have sufficient
/// precision.
///
/// For Double, the Errol3 algorithm uses double-double arithmetic
/// with about 106 bits precision. This turns out to be not quite
/// sufficient, requiring Errol3 to pre-screen the input against a
/// list of exceptions culled from the larger list of possible
/// failures. Using high-precision integers, we've discovered that
/// 110 bit precision is sufficient to satisfy the Errol test cases
/// without requiring any pre-screening.
///
/// For Float and Float80, the same approach shows that we need 53
/// and 135 bits, respectively. It is an interesting coincidence
/// that for all three cases, an n-bit significand can be formatted
/// optimally with no more than 2n+7 bits of intermediate precision.
///
/// * For numbers with even significands and a specific range of
/// exponents, the shortest value might occur exactly at the midpoint
/// between two adjacent binary floating-point values. Theorem 6 of
/// the Errol paper characterizes this sufficiently to allow us to
/// vary our strategy. Errol3 uses a separate formatter for this
/// case. This implementation instead widens the interval (as in
/// Grisu3), which allows us to use the same fast digit generation in
/// all cases, providing uniform performance across the entire range.
///
/// In addition to addressing the shortness failures characterized in
/// the Errol paper, the implementation here also incorporates
/// final-digit corrections from Grisu3, allowing it to produce the
/// optimal decimal decomposition in all cases.
///
/// In summary, this implementation is:
///
/// * Fast. It uses only fixed-width integer arithmetic and has
/// constant memory requirements.
///
/// * Simple. It is only a little more complex than Loitsch' original
/// implementation of Grisu2. The full digit decomposition for double
/// is less than 300 lines of standard C.
///
/// * Always Accurate. Converting the decimal form back to binary
/// will always yield exactly the same value. For the IEEE 754
/// formats, the round-trip will produce exactly the same bit
/// pattern in memory.
///
/// * Always Short. This always selects an accurate result with the
/// minimum number of decimal digits.
///
/// * Always Close. Among all accurate, short results, this always
/// chooses the result that is closest to the exact floating-point
/// value. (In case of an exact tie, it rounds the last digit even.)
///
/// For single-precision Float, these claims have been exhaustively
/// tested -- all 2^32 values can be checked in under an hour on a
/// mid-range modern laptop. The Double and Float80 formatters rely on
/// results from the Errol paper to ensure correctness. In addition,
/// we have verified more than 10^13 randomly-chosen values by comparing
/// the results to the output of Errol4.
///
// ----------------------------------------------------------------------------
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "swift/Runtime/SwiftDtoa.h"
#if defined(__SIZEOF_INT128__)
// We get a significant speed boost if we can use the __uint128_t
// type that's present in GCC and Clang on 64-bit architectures. In
// particular, we can do 128-bit arithmetic directly and can
// represent 192-bit integers as a collection of 64-bit elements.
#define HAVE_UINT128_T 1
#else
// On 32-bit, we have to use slower code that manipulates 128-bit
// and 192-bit integers as collections of 32-bit elements.
#define HAVE_UINT128_T 0
#endif
// Try to verify that the system floating-point types really are what we
// expect. Note that the code below is specific to these exact
// floating-point representations.
#if (FLT_RADIX != 2)
// Either you're using hexadecimal float format on S390, or you have a
// really broken C environment.
#error "This platform claims to not use binary floating point."
#endif
#if SWIFT_DTOA_FLOAT_SUPPORT
#if (FLT_MANT_DIG != 24) || (FLT_MIN_EXP != -125) || (FLT_MAX_EXP != 128)
#error "Are you certain `float` on this platform is really IEEE 754 single-precision binary32 format?"
#endif
#endif
#if SWIFT_DTOA_DOUBLE_SUPPORT
#if (DBL_MANT_DIG != 53) || (DBL_MIN_EXP != -1021) || (DBL_MAX_EXP != 1024)
#error "Are you certain `double` on this platform is really IEEE 754 double-precision binary64 format?"
#endif
#endif
#if SWIFT_DTOA_FLOAT80_SUPPORT
#if (LDBL_MANT_DIG != 64) || (LDBL_MIN_EXP != -16381) || (LDBL_MAX_EXP != 16384)
#error "Are you certain `long double` on this platform is really Intel 80-bit extended precision?"
#endif
#endif
// See the implementations at the bottom of this file for detailed explanations
// of the purpose of each function.
//
// Helper functions used by float, double, and float80 machinery.
//
#if SWIFT_DTOA_FLOAT_SUPPORT || SWIFT_DTOA_DOUBLE_SUPPORT || SWIFT_DTOA_FLOAT80_SUPPORT
#if HAVE_UINT128_T
typedef __uint128_t swift_uint128_t;
#define initialize128WithHighLow64(dest, high64, low64) ((dest) = ((__uint128_t)(high64) << 64) | (low64))
#else
typedef struct {
uint32_t low, b, c, high;
} swift_uint128_t;
#define initialize128WithHighLow64(dest, high64, low64) \
((dest).low = (uint32_t)(low64), \
(dest).b = (uint32_t)((low64) >> 32), \
(dest).c = (uint32_t)(high64), \
(dest).high = (uint32_t)((high64) >> 32))
#endif
static int binaryExponentFor10ToThe(int p);
static int decimalExponentFor2ToThe(int e);
#endif
//
// Helper functions used only by the single-precision float formatter
//
#if SWIFT_DTOA_FLOAT_SUPPORT
static uint64_t multiply64x32RoundingDown(uint64_t lhs, uint32_t rhs);
static uint64_t multiply64x32RoundingUp(uint64_t lhs, uint32_t rhs);
static uint64_t multiply64x64RoundingDown(uint64_t lhs, uint64_t rhs);
static uint64_t multiply64x64RoundingUp(uint64_t lhs, uint64_t rhs);
static uint64_t shiftRightRoundingUp64(uint64_t lhs, int shift);
static void intervalContainingPowerOf10_Float(int p, uint64_t *lower, uint64_t *upper, int *exponent);
#endif
//
// Helper functions used only by the double-precision formatter
//
#if SWIFT_DTOA_DOUBLE_SUPPORT
#if HAVE_UINT128_T
#define increment128(dest) ((dest) += 1)
#define isLessThan128x128(lhs, rhs) ((lhs) < (rhs))
#define subtract128x128(lhs, rhs) (*(lhs) -= (rhs))
#define multiply128xi32(lhs, rhs) (*(lhs) *= (rhs))
#define initialize128WithHigh64(dest, value) ((dest) = (__uint128_t)(value) << 64)
#define extractHigh64From128(arg) ((uint64_t)((arg) >> 64))
static int extractIntegerPart128(__uint128_t *fixed128, int fractionBits) {
return (int)(*fixed128 >> fractionBits);
}
static void clearIntegerPart128(__uint128_t *fixed128, int fractionBits) {
const swift_uint128_t fixedPointMask = (((__uint128_t)1 << fractionBits) - 1);
*fixed128 &= fixedPointMask;
}
#else
#define increment128(dest) \
do { \
uint64_t t = (dest).low + 1; \
(dest).low = (uint32_t)t; \
t >>= 32; \
t += (dest).b; \
(dest).b = (uint32_t)t; \
t >>= 32; \
t += (dest).c; \
(dest).c = (uint32_t)t; \
t >>= 32; \
(dest).high += (uint32_t)t; \
} while (0)
static int isLessThan128x128(swift_uint128_t lhs, swift_uint128_t rhs);
static void subtract128x128(swift_uint128_t *lhs, swift_uint128_t rhs);
static void multiply128xi32(swift_uint128_t *lhs, uint32_t rhs);
#define initialize128WithHigh64(dest, value) \
((dest).low = (dest).b = 0, \
(dest).c = (uint32_t)(value), \
(dest).high = (uint32_t)((value) >> 32))
#define extractHigh64From128(arg) (((uint64_t)(arg).high << 32)|((arg).c))
static int extractIntegerPart128(swift_uint128_t *fixed128, int fractionBits) {
const int highFractionBits = fractionBits % 32;
return (int)(fixed128->high >> highFractionBits);
}
static void clearIntegerPart128(swift_uint128_t *fixed128, int fractionBits) {
const int highFractionBits = fractionBits % 32;
fixed128->high &= ((uint32_t)1 << highFractionBits) - 1;
}
#endif
static swift_uint128_t multiply128x64RoundingDown(swift_uint128_t lhs, uint64_t rhs);
static swift_uint128_t multiply128x64RoundingUp(swift_uint128_t lhs, uint64_t rhs);
static swift_uint128_t shiftRightRoundingDown128(swift_uint128_t lhs, int shift);
static swift_uint128_t shiftRightRoundingUp128(swift_uint128_t lhs, int shift);
static void intervalContainingPowerOf10_Double(int p, swift_uint128_t *lower, swift_uint128_t *upper, int *exponent);
#endif
//
// Helper functions used only by the extended-precision long double formatter
//
#if SWIFT_DTOA_FLOAT80_SUPPORT
#if HAVE_UINT128_T
// A 192-bit unsigned integer type stored as 3 64-bit words
typedef struct {uint64_t low, mid, high;} swift_uint192_t;
#define initialize192WithHighMidLow64(dest, high64, mid64, low64) \
((dest).low = (low64), \
(dest).mid = (mid64), \
(dest).high = (high64))
#else
// A 192-bit unsigned integer type stored as 6 32-bit words
typedef struct {uint32_t low, b, c, d, e, high;} swift_uint192_t;
#define initialize192WithHighMidLow64(dest, high64, mid64, low64) \
((dest).low = (uint64_t)(low64), \
(dest).b = (uint64_t)(low64) >> 32, \
(dest).c = (uint64_t)(mid64), \
(dest).d = (uint64_t)(mid64) >> 32, \
(dest).e = (uint64_t)(high64), \
(dest).high = (uint64_t)(high64) >> 32)
#endif
static void multiply192x64RoundingDown(swift_uint192_t *lhs, uint64_t rhs);
static void multiply192x64RoundingUp(swift_uint192_t *lhs, uint64_t rhs);
static void multiply192xi32(swift_uint192_t *lhs, uint32_t rhs);
static void multiply192x128RoundingDown(swift_uint192_t *lhs, swift_uint128_t rhs);
static void multiply192x128RoundingUp(swift_uint192_t *lhs, swift_uint128_t rhs);
static void subtract192x192(swift_uint192_t *lhs, swift_uint192_t rhs);
static int isLessThan192x192(swift_uint192_t lhs, swift_uint192_t rhs);
static void shiftRightRoundingDown192(swift_uint192_t *lhs, int shift);
static void shiftRightRoundingUp192(swift_uint192_t *lhs, int shift);
static void intervalContainingPowerOf10_Float80(int p, swift_uint192_t *lower, swift_uint192_t *upper, int *exponent);
#endif
//
// --------------- Digit generation ---------------------
//
// This is the interesting part.
// These routines take a floating-point value and efficiently compute
// everything necessary to write an optimal base-10 representation of
// that value. In particular, they compute the base-10 exponent and
// corresponding digits.
// swift_decompose_double is thoroughly commented; swift_decompose_float
// and swift_decompose_float80 are fundamentally the same algorithm, but
// adjusted to perform optimally for those types.
#if SWIFT_DTOA_DOUBLE_SUPPORT
// Return raw bits encoding the double
static uint64_t bitPatternForDouble(double d) {
union { double d; uint64_t u; } converter;
converter.d = d;
return converter.u;
}
int swift_decompose_double(double d,
int8_t *digits, size_t digits_length, int *decimalExponent)
{
// Bits in raw significand (not including hidden bit, if present)
static const int significandBitCount = DBL_MANT_DIG - 1;
static const uint64_t significandMask
= ((uint64_t)1 << significandBitCount) - 1;
// Bits in raw exponent
static const int exponentBitCount = 11;
static const int exponentMask = (1 << exponentBitCount) - 1;
// Note: IEEE 754 conventionally uses 1023 as the exponent
// bias. That's because they treat the significand as a
// fixed-point number with one bit (the hidden bit) integer
// portion. The logic here reconstructs the significand as a
// pure fraction, so we need to accomodate that when
// reconstructing the binary exponent.
static const int64_t exponentBias = (1 << (exponentBitCount - 1)) - 2; // 1022
// Step 0: Deconstruct the target number
// Note: this strongly assumes IEEE 754 binary64 format
uint64_t raw = bitPatternForDouble(d);
int exponentBitPattern = (raw >> significandBitCount) & exponentMask;
uint64_t significandBitPattern = raw & significandMask;
// Step 1: Handle the various input cases:
int binaryExponent;
uint64_t significand;
if (digits_length < 17) {
return 0;
} else if (exponentBitPattern == exponentMask) { // NaN or Infinity
// Return no digits
return 0;
} else if (exponentBitPattern == 0) {
if (significandBitPattern == 0) { // Zero
digits[0] = 0;
*decimalExponent = 0;
return 1;
} else { // subnormal
binaryExponent = 1 - exponentBias;
significand = significandBitPattern
<< (64 - significandBitCount - 1);
}
} else { // normal
binaryExponent = exponentBitPattern - exponentBias;
uint64_t hiddenBit = (uint64_t)1 << significandBitCount;
uint64_t fullSignificand = significandBitPattern + hiddenBit;
significand = fullSignificand << (64 - significandBitCount - 1);
}
// Step 2: Determine the exact unscaled target interval
// Grisu-style algorithms construct the shortest decimal digit
// sequence within a specific interval. To build the appropriate
// interval, we start by computing the exact midpoints between
// this floating-point value and the adjacent ones.
uint64_t halfUlp = (uint64_t)1 << (64 - significandBitCount - 2);
uint64_t quarterUlp = halfUlp >> 1;
uint64_t upperMidpointExact = significand + halfUlp;
int isBoundary = significandBitPattern == 0;
uint64_t lowerMidpointExact
= significand - (isBoundary ? quarterUlp : halfUlp);
// Step 3: Estimate the base 10 exponent
// Grisu algorithms are based in part on a simple technique for
// generating a base-10 form for a binary floating-point number.
// Start with a binary floating-point number `f * 2^e` and then
// estimate the decimal exponent `p`. You can then rewrite your
// original number as:
//
// ```
// f * 2^e * 10^-p * 10^p
// ```
//
// The last term is part of our output, and a good estimate for
// `p` will ensure that `2^e * 10^-p` is going to be close to 1.
// So multiplying the first three terms yields a fraction suitable
// for producing the decimal digits. So we need to estimate `p`:
int base10Exponent = decimalExponentFor2ToThe(binaryExponent);
// Step 4: Compute a power-of-10 scale factor
// Compute `10^-p` to 128-bit precision. We generate
// both over- and under-estimates to ensure we can exactly
// bound the later use of these values.
swift_uint128_t powerOfTenRoundedDown;
swift_uint128_t powerOfTenRoundedUp;
int powerOfTenExponent = 0;
intervalContainingPowerOf10_Double(-base10Exponent,
&powerOfTenRoundedDown,
&powerOfTenRoundedUp,
&powerOfTenExponent);
const int extraBits = binaryExponent + powerOfTenExponent;
// Step 5: Scale the interval (with rounding)
// As mentioned above, the final digit generation works
// with an interval, so we actually apply the scaling
// to the upper and lower midpoint values separately.
// As part of the scaling here, we'll switch from a pure
// fraction to a fixed-point form. Using 14 bit integer portion
// will allow us to compute four decimal digits at a time.
static const int integerBits = 14;
static const int fractionBits = 128 - integerBits;
// We scale the interval in one of two different ways...
// This value comes from the Errol paper, Theorem 6
static const int maxIntegerMidpointExponent = 131;
swift_uint128_t u, l;
if (((significandBitPattern & 1) != 0)
|| (binaryExponent < significandBitCount + 3)
|| (binaryExponent > maxIntegerMidpointExponent)) {
// Case A: Narrow the interval
// Loitsch' original Grisu2 always narrows the interval.
// Since our digit generation will select a value within
// the scaled interval, narrowing the interval guarantees
// that we will find a digit sequence that converts back
// to the original value.
// This ensures accuracy but, as explained in Loitsch' paper,
// this carries a risk that there will be a shorter digit
// sequence outside of our narrowed interval that we will
// miss. This risk obviously gets lower with increased
// precision, but it wasn't until the Errol paper that anyone
// had a good way to test whether a particular implementation
// had sufficient precision. Using Errol Theorem 7 and tinkering
// with the `fractionBits` parameter above, you can show that
// 110 fraction bits is sufficient for Double.
// Multiply out the upper midpoint, rounding down...
swift_uint128_t u1 = multiply128x64RoundingDown(powerOfTenRoundedDown,
upperMidpointExact);
// Account for residual binary exponent and adjust
// to the fixed-point format
u = shiftRightRoundingDown128(u1, integerBits - extraBits);
// Conversely for the lower midpoint...
swift_uint128_t l1 = multiply128x64RoundingUp(powerOfTenRoundedUp,
lowerMidpointExact);
l = shiftRightRoundingUp128(l1, integerBits - extraBits);
} else {
// Case B: Widen the interval
// As explained in Errol Theorem 6, in certain cases the true
// shortest decimal result is at one of the exact scaled
// midpoints. Any narrowing will exclude the exact midpoints,
// so we must widen here to ensure we consider those cases.
// Errol Theorem 6 explains that this can only happen if the
// exact midpoints are integers with even significands and
// exponents less than a particular bound. (See the `if`
// condition above.)
// Widening the interval in this case ensures that we find a
// short result but carries a risk of selecting a result
// outside of the exact scaled interval (which would be
// inaccurate). For Float and Float80, it's easy to see this
// never happens: they use more fraction bits here than the
// maximum exponent for this case so any number outside the
// exact interval will not be an integer. Since any
// non-integer will always have more digits than the adjacent
// integers, the digit generation will never select it. For
// double, we've cut things a bit finer (114 bit fraction here
// is not higher than the exponent limit of 131 above), so
// we've had to rely on Errol Theorem 7 to enumerate possible
// failures to test those cases.
// Note: Grisu3 converts every number twice: once with the
// narrowed interval to ensure accuracy and once with the
// wider interval to ensure shortness. If both agree, the
// result must meet both conditions. In essence, the Errol
// paper provides a way to select narrowing or widening
// appropriately so we can avoid Grisu3's double conversion.
swift_uint128_t u1 = multiply128x64RoundingUp(powerOfTenRoundedUp,
upperMidpointExact);
u = shiftRightRoundingUp128(u1, integerBits - extraBits);
swift_uint128_t l1 = multiply128x64RoundingDown(powerOfTenRoundedDown,
lowerMidpointExact);
l = shiftRightRoundingDown128(l1, integerBits - extraBits);
}
// Step 6: Align first digit, adjust exponent
// This preps for digit generation. It just multiplies repeatedly
// by 10 until we have exactly one decimal digit in the integer
// part, adjusting the exponent as we go.
// In particular, this prunes leading zeros from subnormals.
// Generate digits for `t` with interval width `delta`
swift_uint128_t t = u;
swift_uint128_t delta = u;
subtract128x128(&delta, l); // Explained below.
int exponent = base10Exponent + 1;
// Except for subnormals, this loop should never run more than once.
#if HAVE_UINT128_T
static const swift_uint128_t fixedPointOne = (__uint128_t)1 << fractionBits;
while (t < fixedPointOne)
#else
// Because 1.0 in fixed point has a lot of zeros, it suffices
// to only compare the high-order word here. This is a minor
// performance win.
while (t.high < ((uint32_t)1 << (fractionBits % 32)))
#endif
{
exponent -= 1;
multiply128xi32(&delta, 10);
multiply128xi32(&t, 10);
}
// Step 7: Generate digits
// This is a common part of Grisu-style algorithms. The
// underlying idea is to generate digits for the scaled upper and
// lower boundaries, and stop when we hit the first different
// digit (at which point, the digit for the upper midpoint is the
// candidate final digit). To understand this, just note that
// 0.1234 is the shortest decimal between u = 0.123456 and l =
// 0.123345.
// Grisu uses a slightly optimized technique: it generates digits
// for the upper bound (multiplying by 10 to isolate each digit)
// and multiplies the interval width `delta` at the same time.
// The `different digit` criteria above translates to a test for
// `delta` being larger than the remainder.
int8_t *digit_p = digits;
// Adjustment above already set up the first digit:
int nextDigit = extractIntegerPart128(&t, fractionBits);
clearIntegerPart128(&t, fractionBits);
// Further optimization: Generating four digits at a time reduces
// the total arithmetic required per digit. Note: The following
// block can be entirely removed with no effect on the result.
// If you're trying to understand this algorithm, just skip this
// block on first reading.
swift_uint128_t d0 = delta;
multiply128xi32(&d0, 10000);
swift_uint128_t t0 = t;
multiply128xi32(&t0, 10000);
int fourDigits = extractIntegerPart128(&t0, fractionBits); // 4 digits
clearIntegerPart128(&t0, fractionBits);
while (isLessThan128x128(d0, t0)) {
*digit_p++ = nextDigit;
int d = fourDigits / 100; // top 2 digits
*digit_p++ = d / 10;
*digit_p++ = d % 10;
d = fourDigits % 100; // bottom 2 digits
*digit_p++ = d / 10;
nextDigit = d % 10;
t = t0;
delta = d0;
multiply128xi32(&d0, 10000);
multiply128xi32(&t0, 10000);
fourDigits = extractIntegerPart128(&t0, fractionBits);
clearIntegerPart128(&t0, fractionBits);
}
// Finish by generating one digit at a time.
while (isLessThan128x128(delta, t)) {
*digit_p++ = nextDigit;
multiply128xi32(&delta, 10);
multiply128xi32(&t, 10);
nextDigit = extractIntegerPart128(&t, fractionBits);
clearIntegerPart128(&t, fractionBits);
}
// Adjust the final digit to be closer to the original value.
// This is basically the same as Grisu3. It accounts for the
// fact that sometimes there is more than one shortest digit
// sequence.
// For example, consider how the above would work if you had the
// value 0.1234 and computed u = 0.1257, l = 0.1211. The above
// digit generation works with `u`, so produces 0.125. But the
// values 0.122, 0.123, and 0.124 are just as short and 0.123 is
// the best choice, since it's closest to the original value.
// If `delta <= t + 1.0`, then the interval is narrower than
// one decimal digit, so there is no other option.
// Note: We've already consumed most of our available precision,
// so it's okay to just work in 64 bits here...
uint64_t deltaHigh64 = extractHigh64From128(delta);
uint64_t tHigh64 = extractHigh64From128(t);
if (deltaHigh64 > tHigh64 + ((uint64_t)1 << (fractionBits % 64))) {
// Note: 64-bit arithmetic is okay here
uint64_t skew;
if (isBoundary) {
// If we're at the boundary where the exponent shifts,
// then the original value is 1/3 of the way from
// the bottom of the interval ...
skew = deltaHigh64 - deltaHigh64 / 3 - tHigh64;
} else {
// ... otherwise it's exactly in the middle.
skew = deltaHigh64 / 2 - tHigh64;
}
// The `skew` above is the difference between our
// computed digits and the original exact value.
// Use that to offset the final digit:
uint64_t one = (uint64_t)(1) << (64 - integerBits);
uint64_t fractionMask = one - 1;
uint64_t oneHalf = one >> 1;
if ((skew & fractionMask) == oneHalf) {
int adjust = (int)(skew >> (64 - integerBits));
// If the skew is exactly integer + 1/2, round the
// last digit even after adjustment
nextDigit = (nextDigit - adjust) & ~1;
} else {
// Else round to nearest...
int adjust = (int)((skew + oneHalf) >> (64 - integerBits));
nextDigit = (nextDigit - adjust);
}
}
*digit_p++ = nextDigit; // Store the final digit.
*decimalExponent = exponent;
return digit_p - digits;
}
#endif
#if SWIFT_DTOA_FLOAT_SUPPORT
// Return raw bits encoding the float
static uint64_t bitPatternForFloat(float f) {
union { float f; uint32_t u; } converter;
converter.f = f;
return converter.u;
}
// Decompose an IEEE 754 binary32 single-precision float
// into decimal digits and a corresponding decimal exponent.
// See swift_decompose_double for detailed comments on the algorithm here
int swift_decompose_float(float f,
int8_t *digits, size_t digits_length, int *decimalExponent)
{
static const int significandBitCount = FLT_MANT_DIG - 1;
static const uint32_t significandMask
= ((uint32_t)1 << significandBitCount) - 1;
static const int exponentBitCount = 8;
static const int exponentMask = (1 << exponentBitCount) - 1;
// See comments in swift_decompose_double
static const int64_t exponentBias = (1 << (exponentBitCount - 1)) - 2; // 125
// Step 0: Deconstruct the target number
// Note: this strongly assumes IEEE 754 binary32 format
uint32_t raw = bitPatternForFloat(f);
int exponentBitPattern = (raw >> significandBitCount) & exponentMask;
uint32_t significandBitPattern = raw & significandMask;
// Step 1: Handle the various input cases:
int binaryExponent;
uint32_t significand;
if (digits_length < 9) {
// Ensure we have space for 9 digits
return 0;
} else if (exponentBitPattern == exponentMask) { // NaN or Infinity
// Return no digits
return 0;
} else if (exponentBitPattern == 0) {
if (significandBitPattern == 0) { // Zero
// Return one zero digit and decimalExponent = 0.
digits[0] = 0;
*decimalExponent = 0;
return 1;
} else { // Subnormal
binaryExponent = 1 - exponentBias;
significand = significandBitPattern << (32 - significandBitCount - 1);
}
} else { // normal
binaryExponent = exponentBitPattern - exponentBias;
uint32_t hiddenBit = (uint32_t)1 << (uint32_t)significandBitCount;
uint32_t fullSignificand = significandBitPattern + hiddenBit;
significand = fullSignificand << (32 - significandBitCount - 1);
}
// Step 2: Determine the exact unscaled target interval
uint32_t halfUlp = (uint32_t)1 << (32 - significandBitCount - 2);
uint32_t quarterUlp = halfUlp >> 1;
uint32_t upperMidpointExact = significand + halfUlp;
int isBoundary = significandBitPattern == 0;
uint32_t lowerMidpointExact
= significand - (isBoundary ? quarterUlp : halfUlp);
// Step 3: Estimate the base 10 exponent
int base10Exponent = decimalExponentFor2ToThe(binaryExponent);
// Step 4: Compute a power-of-10 scale factor
uint64_t powerOfTenRoundedDown = 0;
uint64_t powerOfTenRoundedUp = 0;
int powerOfTenExponent = 0;
intervalContainingPowerOf10_Float(-base10Exponent,
&powerOfTenRoundedDown,
&powerOfTenRoundedUp,
&powerOfTenExponent);
const int extraBits = binaryExponent + powerOfTenExponent;
// Step 5: Scale the interval (with rounding)
static const int integerBits = 5;
static const int fractionBits = 64 - integerBits;
// This value comes from the Errol paper, Theorem 6
static const int maxIntegerMidpointExponent = 57;
uint64_t u, l;
if (((significandBitPattern & 1) != 0)
|| (binaryExponent < significandBitCount + 3)
|| (binaryExponent > maxIntegerMidpointExponent)) {
// Narrow the interval
uint64_t u1 = multiply64x32RoundingDown(powerOfTenRoundedDown,
upperMidpointExact);
u = u1 >> (integerBits - extraBits); // Rounding down
uint64_t l1 = multiply64x32RoundingUp(powerOfTenRoundedUp,
lowerMidpointExact);
l = shiftRightRoundingUp64(l1, integerBits - extraBits);
} else {
// Widen the interval
uint64_t u1 = multiply64x32RoundingUp(powerOfTenRoundedUp,
upperMidpointExact);
u = shiftRightRoundingUp64(u1, integerBits - extraBits);
uint64_t l1 = multiply64x32RoundingDown(powerOfTenRoundedDown,
lowerMidpointExact);
l = l1 >> (integerBits - extraBits); // Rounding down
}
// Step 6: Align first digit, adjust exponent
// In particular, this prunes leading zeros from subnormals
static const uint64_t fixedPointOne = (uint64_t)1 << fractionBits;
static const uint64_t fixedPointMask = fixedPointOne - 1;
uint64_t t = u;
uint64_t delta = u - l;
int exponent = base10Exponent + 1;
while (t < fixedPointOne) {
exponent -= 1;
delta *= 10;
t *= 10;
}
// Step 7: Generate digits
int8_t *digit_p = digits;
// Adjustment above already set up the first digit:
int nextDigit = (int)(t >> fractionBits);
t &= fixedPointMask;
// Generate one digit at a time...
while (t > delta) {
*digit_p++ = nextDigit;
delta *= 10;
t *= 10;
nextDigit = (int)(t >> fractionBits);
t &= fixedPointMask;
}
// Adjust the final digit to be closer to the original value
if (delta > t + fixedPointOne) {
uint64_t skew;
if (isBoundary) {
skew = delta - delta / 3 - t;
} else {
skew = delta / 2 - t;
}
uint64_t one = (uint64_t)(1) << (64 - integerBits);
uint64_t fractionMask = one - 1;
uint64_t oneHalf = one >> 1;
if ((skew & fractionMask) == oneHalf) {
int adjust = (int)(skew >> (64 - integerBits));
// If the skew is integer + 1/2, round the last digit even
// after adjustment
nextDigit = (nextDigit - adjust) & ~1;
} else {
// Else round to nearest...
int adjust = (int)((skew + oneHalf) >> (64 - integerBits));
nextDigit = (nextDigit - adjust);
}
}
*digit_p++ = nextDigit;
*decimalExponent = exponent;
return digit_p - digits;
}
#endif
#if SWIFT_DTOA_FLOAT80_SUPPORT
// See `swift_decompose_double` for detailed comments on this implementatoin.
int swift_decompose_float80(long double d,
int8_t *digits, size_t digits_length, int *decimalExponent)
{
// Omit leading bit, as if we were an IEEE 754 format
static const int significandBitCount = LDBL_MANT_DIG - 1;
static const int exponentBitCount = 15;
static const int exponentMask = (1 << exponentBitCount) - 1;
// See comments in swift_decompose_double to understand
// why we use 16,382 instead of 16,383 here.
static const int64_t exponentBias = (1 << (exponentBitCount - 1)) - 2; // 16,382
// Step 0: Deconstruct the target number
// Note: this strongly assumes Intel 80-bit extended format in LSB
// byte order
const uint64_t *raw_p = (const uint64_t *)&d;
int exponentBitPattern = raw_p[1] & exponentMask;
uint64_t significandBitPattern = raw_p[0];
// Step 1: Handle the various input cases:
int64_t binaryExponent;
uint64_t significand;
if (digits_length < 21) {
return 0;
} else if (exponentBitPattern == exponentMask) { // NaN or Infinity
// Return no digits
return 0;
} else if (exponentBitPattern == 0) {
if (significandBitPattern == 0) { // Zero
digits[0] = 0;
*decimalExponent = 0;
return 1;
} else { // subnormal
binaryExponent = 1 - exponentBias;
significand = significandBitPattern;
}
} else if (significandBitPattern >> 63) { // Normal
binaryExponent = exponentBitPattern - exponentBias;
significand = significandBitPattern;
} else {
// "Unnormal" values are rejected as invalid by 80387 and later.
// Treat them the same as NaNs here.
return 0;
}
// Step 2: Determine the exact unscaled target interval
uint64_t halfUlp = (uint64_t)1 << 63;
uint64_t quarterUlp = halfUlp >> 1;
uint64_t threeQuarterUlp = halfUlp + quarterUlp;
swift_uint128_t upperMidpointExact, lowerMidpointExact;
initialize128WithHighLow64(upperMidpointExact, significand, halfUlp);
int isBoundary = (significandBitPattern & 0x7fffffffffffffff) == 0;
// Subtract 1/4 or 1/2 ULP by first subtracting 1 full ULP, then adding some back
initialize128WithHighLow64(lowerMidpointExact, significand - 1, isBoundary ? threeQuarterUlp : halfUlp);
// Step 3: Estimate the base 10 exponent
int base10Exponent = decimalExponentFor2ToThe(binaryExponent);
// Step 4: Compute a power-of-10 scale factor
swift_uint192_t powerOfTenRoundedDown;
swift_uint192_t powerOfTenRoundedUp;
int powerOfTenExponent = 0;
intervalContainingPowerOf10_Float80(-base10Exponent,
&powerOfTenRoundedDown,
&powerOfTenRoundedUp,
&powerOfTenExponent);
const int extraBits = binaryExponent + powerOfTenExponent;
// Step 5: Scale the interval (with rounding)
static const int integerBits = 14;
static const int fractionBits = 192 - integerBits;
#if HAVE_UINT128_T
static const int highFractionBits = fractionBits % 64;
#else
static const int highFractionBits = fractionBits % 32;
#endif
// This value comes from the Errol paper, Theorem 6
static const int maxIntegerMidpointExponent = 153;
swift_uint192_t u, l;
if (((significandBitPattern & 1) != 0)
|| (binaryExponent < significandBitCount + 3)
|| (binaryExponent > maxIntegerMidpointExponent)) {
// Narrow the interval
u = powerOfTenRoundedDown;
multiply192x128RoundingDown(&u, upperMidpointExact);
shiftRightRoundingDown192(&u, integerBits - extraBits);
l = powerOfTenRoundedUp;
multiply192x128RoundingUp(&l, lowerMidpointExact);
shiftRightRoundingUp192(&l, integerBits - extraBits);
} else {
// Widen the interval
u = powerOfTenRoundedUp;
multiply192x128RoundingUp(&u, upperMidpointExact);
shiftRightRoundingUp192(&u, integerBits - extraBits);
l = powerOfTenRoundedDown;
multiply192x128RoundingDown(&l, lowerMidpointExact);
shiftRightRoundingDown192(&l, integerBits - extraBits);
}
// Step 6: Align first digit, adjust exponent
// In particular, this prunes leading zeros from subnormals
static const uint64_t fixedPointOneHigh = (uint64_t)1 << highFractionBits;
static const uint64_t fixedPointMaskHigh = fixedPointOneHigh - 1;
swift_uint192_t t = u;
swift_uint192_t delta = u;
subtract192x192(&delta, l);
int exponent = base10Exponent + 1;
while (t.high < fixedPointOneHigh) {
exponent -= 1;
multiply192xi32(&delta, 10);
multiply192xi32(&t, 10);
}
// Step 7: Generate digits
int8_t *digit_p = digits;
// Adjustment above already set up the first digit:
int nextDigit = (int)(t.high >> highFractionBits);
t.high &= fixedPointMaskHigh;
// Generate four digits at a time ...
swift_uint192_t d0 = delta;
swift_uint192_t t0 = t;
multiply192xi32(&d0, 10000);
multiply192xi32(&t0, 10000);
int fourDigits = (int)(t0.high >> highFractionBits);
t0.high &= fixedPointMaskHigh;
while (isLessThan192x192(d0, t0)) {
*digit_p++ = nextDigit;
int d = fourDigits / 100;
*digit_p++ = d / 10;
*digit_p++ = d % 10;
d = fourDigits % 100;
*digit_p++ = d / 10;
nextDigit = d % 10;
t = t0;
delta = d0;
multiply192xi32(&d0, 10000);
multiply192xi32(&t0, 10000);
fourDigits = (int)(t0.high >> highFractionBits);
t0.high &= fixedPointMaskHigh;
}
// Generate one digit at a time...
while (isLessThan192x192(delta, t)) {
*digit_p++ = nextDigit;
multiply192xi32(&delta, 10);
multiply192xi32(&t, 10);
nextDigit = (int)(t.high >> highFractionBits);
t.high &= fixedPointMaskHigh;
}
// Adjust the final digit to be closer to the original value
// We've already consumed most of our available precision, so it's
// okay to just work in 64 bits here...
#if HAVE_UINT128_T
uint64_t deltaHigh64 = delta.high;
uint64_t tHigh64 = t.high;
#else
uint64_t deltaHigh64 = ((uint64_t)delta.high << 32) + delta.e;
uint64_t tHigh64 = ((uint64_t)t.high << 32) + t.e;
#endif
if (deltaHigh64 > tHigh64 + ((uint64_t)1 << (fractionBits % 64))) {
uint64_t skew;
if (isBoundary) {
skew = deltaHigh64 - deltaHigh64 / 3 - tHigh64;
} else {
skew = deltaHigh64 / 2 - tHigh64;
}
uint64_t one = (uint64_t)(1) << (64 - integerBits);
uint64_t fractionMask = one - 1;
uint64_t oneHalf = one >> 1;
if ((skew & fractionMask) == oneHalf) {
int adjust = (int)(skew >> (64 - integerBits));
// If the skew is integer + 1/2, round the last digit even
// after adjustment
nextDigit = (nextDigit - adjust) & ~1;
} else {
// Else round to nearest...
int adjust = (int)((skew + oneHalf) >> (64 - integerBits));
nextDigit = (nextDigit - adjust);
}
}
*digit_p++ = nextDigit;
*decimalExponent = exponent;
return digit_p - digits;
}
#endif
//
// ---------------- High-level API -----------------
//
// Functions that format a Float/Double/Float80
// directly into a buffer.
//
// These handle various exception cases (infinity, Nan, zero)
// before invoking the general base-10 conversion.
#if SWIFT_DTOA_FLOAT_SUPPORT || SWIFT_DTOA_DOUBLE_SUPPORT || SWIFT_DTOA_FLOAT80_SUPPORT
static size_t swift_format_constant(char *dest, size_t length, const char *s) {
const size_t l = strlen(s);
if (length <= l) {
return 0;
}
strcpy(dest, s);
return l;
}
#endif
#if SWIFT_DTOA_FLOAT_SUPPORT
size_t swift_format_float(float d, char *dest, size_t length)
{
if (!isfinite(d)) {
if (isinf(d)) {
// Infinity
if (signbit(d)) {
return swift_format_constant(dest, length, "-inf");
} else {
return swift_format_constant(dest, length, "inf");
}
} else {
// NaN
static const int significandBitCount = 23;
uint32_t raw = bitPatternForFloat(d);
const char *sign = signbit(d) ? "-" : "";
const char *signaling = ((raw >> (significandBitCount - 1)) & 1) ? "" : "s";
uint32_t payload = raw & ((1L << (significandBitCount - 2)) - 1);
char buff[32];
if (payload != 0) {
snprintf(buff, sizeof(buff), "%s%snan(0x%x)",
sign, signaling, payload);
} else {
snprintf(buff, sizeof(buff), "%s%snan",
sign, signaling);
}
return swift_format_constant(dest, length, buff);
}
}
// zero
if (d == 0.0) {
if (signbit(d)) {
return swift_format_constant(dest, length, "-0.0");
} else {
return swift_format_constant(dest, length, "0.0");
}
}
// Decimal numeric formatting
int decimalExponent;
int8_t digits[9];
int digitCount =
swift_decompose_float(d, digits, sizeof(digits), &decimalExponent);
if (decimalExponent < -3 || decimalExponent > 6) {
return swift_format_exponential(dest, length, signbit(d),
digits, digitCount, decimalExponent);
} else {
return swift_format_decimal(dest, length, signbit(d),
digits, digitCount, decimalExponent);
}
}
#endif
#if SWIFT_DTOA_DOUBLE_SUPPORT
size_t swift_format_double(double d, char *dest, size_t length)
{
if (!isfinite(d)) {
if (isinf(d)) {
// Infinity
if (signbit(d)) {
return swift_format_constant(dest, length, "-inf");
} else {
return swift_format_constant(dest, length, "inf");
}
} else {
// NaN
static const int significandBitCount = 52;
uint64_t raw = bitPatternForDouble(d);
const char *sign = signbit(d) ? "-" : "";
const char *signaling = ((raw >> (significandBitCount - 1)) & 1) ? "" : "s";
uint64_t payload = raw & ((1L << (significandBitCount - 2)) - 1);
char buff[32];
if (payload != 0) {
snprintf(buff, sizeof(buff), "%s%snan(0x%llx)",
sign, signaling, payload);
} else {
snprintf(buff, sizeof(buff), "%s%snan",
sign, signaling);
}
return swift_format_constant(dest, length, buff);
}
}
// zero
if (d == 0.0) {
if (signbit(d)) {
return swift_format_constant(dest, length, "-0.0");
} else {
return swift_format_constant(dest, length, "0.0");
}
}
// Decimal numeric formatting
int decimalExponent;
int8_t digits[17];
int digitCount =
swift_decompose_double(d, digits, sizeof(digits), &decimalExponent);
if (decimalExponent < -3 || decimalExponent > 15) {
return swift_format_exponential(dest, length, signbit(d),
digits, digitCount, decimalExponent);
} else {
return swift_format_decimal(dest, length, signbit(d),
digits, digitCount, decimalExponent);
}
}
#endif
#if SWIFT_DTOA_FLOAT80_SUPPORT
size_t swift_format_float80(long double d, char *dest, size_t length)
{
if (!isfinite(d)) {
if (isinf(d)) {
// Infinity
if (signbit(d)) {
return swift_format_constant(dest, length, "-inf");
} else {
return swift_format_constant(dest, length, "inf");
}
} else {
// NaN
// Assumes Intel 80-bit extended format in LSB byte order:
uint64_t significandBitPattern = *(const uint64_t *)&d;
const char *sign = signbit(d) ? "-" : "";
const char *signaling = ((significandBitPattern >> 62) & 1) ? "" : "s";
uint64_t payload = significandBitPattern & (((uint64_t)1 << 61) - 1);
char buff[32];
if (payload != 0) {
snprintf(buff, sizeof(buff), "%s%snan(0x%llx)",
sign, signaling, payload);
} else {
snprintf(buff, sizeof(buff), "%s%snan",
sign, signaling);
}
return swift_format_constant(dest, length, buff);
}
}
// zero
if (d == 0.0) {
if (signbit(d)) {
return swift_format_constant(dest, length, "-0.0");
} else {
return swift_format_constant(dest, length, "0.0");
}
}
// Decimal numeric formatting
// Decimal numeric formatting
int decimalExponent;
int8_t digits[21];
int digitCount =
swift_decompose_float80(d, digits, sizeof(digits), &decimalExponent);
if (decimalExponent < -3 || decimalExponent > 18) {
return swift_format_exponential(dest, length, signbit(d),
digits, digitCount, decimalExponent);
} else {
return swift_format_decimal(dest, length, signbit(d),
digits, digitCount, decimalExponent);
}
}
#endif
/**
* Routines to format a decomposed value into a standard string form.
*/
// Format into exponential format: "1.234e+56"
// Returns number of characters actually written to `dest`.
// Returns zero if buffer is too small.
size_t swift_format_exponential(char *dest, size_t length,
bool negative, const int8_t *digits, int digit_count, int exponent)
{
// Largest buffer we could possibly need:
size_t maximum_size = digit_count + 9;
if (length < maximum_size) {
// We only do the detailed check if the size is borderline.
size_t actual_size =
+ (negative ? 1 : 0) // Leading minus
+ digit_count // digits
+ (digit_count > 1 ? 1 : 0) // decimal
+ 1 // 'e'
+ 1 // sign
+ (exponent > 99 ? (exponent > 999 ? 4 : 3) : 2) // exponent
+ 1; // trailing zero byte
if (length < actual_size) {
if (length > 0) {
dest[0] = 0;
}
return 0;
}
}
char *p = dest;
if (negative) {
*p++ = '-';
}
*p++ = digits[0] + '0';
exponent -= 1;
if (digit_count > 1) {
*p++ = '.';
for (int i = 1; i < digit_count; i++) {
*p++ = digits[i] + '0';
}
}
*p++ = 'e';
if (exponent < 0) {
*p++ = '-';
exponent = -exponent;
} else {
*p++ = '+';
}
if (exponent > 99) {
if (exponent > 999) {
*p++ = (exponent / 1000 % 10) + '0';
}
*p++ = (exponent / 100 % 10) + '0';
exponent %= 100;
}
*p++ = (exponent / 10) + '0';
*p++ = (exponent % 10) + '0';
*p = '\0';
return p - dest;
}
// Format into decimal form: "123456789000.0", "1234.5678", "0.0000001234"
// Returns number of bytes of `dest` actually used, or zero if
// provided buffer is too small.
size_t swift_format_decimal(char *dest, size_t length,
bool negative, const int8_t *digits, int digit_count, int exponent)
{
// Largest buffer we could possibly need:
size_t maximum_size =
digit_count // All the digits
+ (exponent > 0 ? exponent : -exponent) // Max # of extra zeros
+ 4; // Max # of other items
if (length < maximum_size) {
// We only do the detailed check if the size is borderline.
if (exponent <= 0) { // "0.0000001234"
size_t actual_size =
(negative ? 1 : 0) // Leading minus
+ 2 // Leading "0."
+ (-exponent) // Leading zeros after decimal point
+ digit_count
+ 1; // Trailing zero byte
if (length < actual_size) {
if (length > 0) {
dest[0] = 0;
}
return 0;
}
} else if (exponent < digit_count) { // "123.45"
size_t actual_size =
(negative ? 1 : 0) // Leading minus
+ digit_count
+ 1 // embedded decimal point
+ 1; // Trailing zero byte
if (length < actual_size) {
if (length > 0) {
dest[0] = 0;
}
return 0;
}
} else { // "12345000.0"
size_t actual_size =
(negative ? 1 : 0) // Leading minus
+ digit_count
+ (exponent - digit_count) // trailing zeros
+ 2 // ".0" to mark this as floating point
+ 1; // Trailing zero byte
if (length < actual_size) {
if (length > 0) {
dest[0] = 0;
}
return 0;
}
}
}
char *p = dest;
if (negative) {
*p++ = '-';
}
if (exponent <= 0) {
*p++ = '0';
*p++ = '.';
while (exponent < 0) {
*p++ = '0';
exponent += 1;
}
for (int i = 0; i < digit_count; ++i) {
*p++ = digits[i] + '0';
}
} else if (exponent < digit_count) {
for (int i = 0; i < digit_count; i++) {
if (exponent == 0) {
*p++ = '.';
}
*p++ = digits[i] + '0';
exponent -= 1;
}
} else {
for (int i = 0; i < digit_count; i++) {
*p++ = digits[i] + '0';
exponent -= 1;
}
while (exponent > 0) {
*p++ = '0';
exponent -= 1;
}
*p++ = '.';
*p++ = '0';
}
*p = '\0';
return p - dest;
}
//
// ------------ Arithmetic helpers ----------------
//
// The core algorithm relies heavily on fraction and fixed-point
// arithmetic with 64-bit, 128-bit, and 192-bit integer values. (For
// float, double, and float80, respectively.) They also need precise
// control over all rounding.
//
// Note that most arithmetic operations are the same for integers and
// fractions, so we can just use the normal integer operations in most
// places. Multiplication however, is different for fixed-size
// fractions. Integer multiplication preserves the low-order part and
// discards the high-order part (ignoring overflow). Fraction
// multiplication preserves the high-order part and discards the
// low-order part (rounding). So most of the arithmetic helpers here
// are for multiplication.
// Note: With 64-bit GCC and Clang, we get a lot of performance gain
// and code simplification by using `__uint128_t`. Otherwise, we have
// to break things down into 32-bit chunks so we don't overflow 64-bit
// temporaries.
#if SWIFT_DTOA_FLOAT_SUPPORT
// Multiply a 64-bit fraction by a 32-bit fraction, rounding down.
static uint64_t multiply64x32RoundingDown(uint64_t lhs, uint32_t rhs) {
#if HAVE_UINT128_T
__uint128_t full = (__uint128_t)lhs * rhs;
return (uint64_t)(full >> 32);
#else
static const uint64_t mask32 = UINT32_MAX;
uint64_t t = ((lhs & mask32) * rhs) >> 32;
return t + (lhs >> 32) * rhs;
#endif
}
// Multiply a 64-bit fraction by a 32-bit fraction, rounding up.
static uint64_t multiply64x32RoundingUp(uint64_t lhs, uint32_t rhs) {
#if HAVE_UINT128_T
static const __uint128_t roundingFactor = UINT32_MAX;
__uint128_t full = (__uint128_t)lhs * rhs;
return (uint64_t)((full + roundingFactor) >> 32);
#else
static const uint64_t mask32 = UINT32_MAX;
uint64_t t = (((lhs & mask32) * rhs) + mask32) >> 32;
return t + (lhs >> 32) * rhs;
#endif
}
// Multiply a 64-bit fraction by a 64-bit fraction, rounding down.
static uint64_t multiply64x64RoundingDown(uint64_t lhs, uint64_t rhs) {
#if HAVE_UINT128_T
__uint128_t full = (__uint128_t)lhs * rhs;
return (uint64_t)(full >> 64);
#else
static const uint64_t mask32 = UINT32_MAX;
uint64_t t = (lhs & mask32) * (rhs & mask32);
t >>= 32;
uint64_t a = (lhs >> 32) * (rhs & mask32);
uint64_t b = (lhs & mask32) * (rhs >> 32);
t += (a & mask32) + (b & mask32);
t >>= 32;
t += (a >> 32) + (b >> 32);
return t + (lhs >> 32) * (rhs >> 32);
#endif
}
// Multiply a 64-bit fraction by a 64-bit fraction, rounding up.
static uint64_t multiply64x64RoundingUp(uint64_t lhs, uint64_t rhs) {
#if HAVE_UINT128_T
static const __uint128_t roundingFactor = ((__uint128_t)1 << 64) - 1;
__uint128_t full = (__uint128_t)lhs * rhs;
return (uint64_t)((full + roundingFactor) >> 64);
#else
static const uint64_t mask32 = UINT32_MAX;
uint64_t t = (lhs & mask32) * (rhs & mask32);
t = (t + mask32) >> 32;
uint64_t a = (lhs >> 32) * (rhs & mask32);
uint64_t b = (lhs & mask32) * (rhs >> 32);
t += (a & mask32) + (b & mask32);
t = (t + mask32) >> 32;
t += (a >> 32) + (b >> 32);
return t + (lhs >> 32) * (rhs >> 32);
#endif
}
// Shift a 64-bit integer right, rounding up.
static uint64_t shiftRightRoundingUp64(uint64_t lhs, int shift) {
uint64_t mask = ((uint64_t)1 << shift) - 1;
uint64_t round = ((lhs & mask) == 0) ? 0 : 1;
return (lhs >> shift) + round;
}
#endif
#if SWIFT_DTOA_DOUBLE_SUPPORT
// Multiply a 128-bit fraction by a 64-bit fraction, rounding down.
static swift_uint128_t multiply128x64RoundingDown(swift_uint128_t lhs, uint64_t rhs) {
#if HAVE_UINT128_T
uint64_t lhsl = (uint64_t)lhs;
uint64_t lhsh = (uint64_t)(lhs >> 64);
swift_uint128_t h = (swift_uint128_t)lhsh * rhs;
swift_uint128_t l = (swift_uint128_t)lhsl * rhs;
return h + (l >> 64);
#else
swift_uint128_t result;
static const uint64_t mask32 = UINT32_MAX;
uint64_t rhs0 = rhs & mask32;
uint64_t rhs1 = rhs >> 32;
uint64_t t = (lhs.low) * rhs0;
t >>= 32;
uint64_t a = (lhs.b) * rhs0;
uint64_t b = (lhs.low) * rhs1;
t += (a & mask32) + (b & mask32);
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs.c * rhs0;
b = lhs.b * rhs1;
t += (a & mask32) + (b & mask32);
result.low = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs.high * rhs0;
b = lhs.c * rhs1;
t += (a & mask32) + (b & mask32);
result.b = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
t += lhs.high * rhs1;
result.c = t;
result.high = t >> 32;
return result;
#endif
}
// Multiply a 128-bit fraction by a 64-bit fraction, rounding up.
static swift_uint128_t multiply128x64RoundingUp(swift_uint128_t lhs, uint64_t rhs) {
#if HAVE_UINT128_T
uint64_t lhsl = (uint64_t)lhs;
uint64_t lhsh = (uint64_t)(lhs >> 64);
swift_uint128_t h = (swift_uint128_t)lhsh * rhs;
swift_uint128_t l = (swift_uint128_t)lhsl * rhs;
uint64_t remainder = (uint64_t)l;
uint64_t round = (remainder != 0) ? 1 : 0;
return h + (l >> 64) + round;
#else
swift_uint128_t result;
static const uint64_t mask32 = UINT32_MAX;
uint64_t rhs0 = rhs & mask32;
uint64_t rhs1 = rhs >> 32;
uint64_t t = (lhs.low) * rhs0 + mask32;
t >>= 32;
uint64_t a = (lhs.b) * rhs0;
uint64_t b = (lhs.low) * rhs1;
t += (a & mask32) + (b & mask32) + mask32;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs.c * rhs0;
b = lhs.b * rhs1;
t += (a & mask32) + (b & mask32);
result.low = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs.high * rhs0;
b = lhs.c * rhs1;
t += (a & mask32) + (b & mask32);
result.b = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
t += lhs.high * rhs1;
result.c = t;
result.high = t >> 32;
return result;
#endif
}
#if !HAVE_UINT128_T
// Multiply a 128-bit fraction by a 32-bit integer in a 32-bit environment.
// (On 64-bit, we use a fast inline macro.)
static void multiply128xi32(swift_uint128_t *lhs, uint32_t rhs) {
uint64_t t = (uint64_t)(lhs->low) * rhs;
lhs->low = (uint32_t)t;
t = (t >> 32) + (uint64_t)(lhs->b) * rhs;
lhs->b = (uint32_t)t;
t = (t >> 32) + (uint64_t)(lhs->c) * rhs;
lhs->c = (uint32_t)t;
t = (t >> 32) + (uint64_t)(lhs->high) * rhs;
lhs->high = (uint32_t)t;
}
// Compare two 128-bit integers in a 32-bit environment
// (On 64-bit, we use a fast inline macro.)
static int isLessThan128x128(swift_uint128_t lhs, swift_uint128_t rhs) {
return ((lhs.high < rhs.high)
|| ((lhs.high == rhs.high)
&& ((lhs.c < rhs.c)
|| ((lhs.c == rhs.c)
&& ((lhs.b < rhs.b)
|| ((lhs.b == rhs.b)
&& (lhs.low < rhs.low)))))));
}
// Subtract 128-bit values in a 32-bit environment
static void subtract128x128(swift_uint128_t *lhs, swift_uint128_t rhs) {
uint64_t t = (uint64_t)lhs->low + (~rhs.low) + 1;
lhs->low = (uint32_t)t;
t = (t >> 32) + lhs->b + (~rhs.b);
lhs->b = (uint32_t)t;
t = (t >> 32) + lhs->c + (~rhs.c);
lhs->c = (uint32_t)t;
t = (t >> 32) + lhs->high + (~rhs.high);
lhs->high = (uint32_t)t;
}
#endif
// Shift a 128-bit integer right, rounding down.
static swift_uint128_t shiftRightRoundingDown128(swift_uint128_t lhs, int shift) {
#if HAVE_UINT128_T
return lhs >> shift;
#else
// Note: Shift is always less than 32
swift_uint128_t result;
uint64_t t = (uint64_t)lhs.low >> shift;
t += ((uint64_t)lhs.b << (32 - shift));
result.low = t;
t >>= 32;
t += ((uint64_t)lhs.c << (32 - shift));
result.b = t;
t >>= 32;
t += ((uint64_t)lhs.high << (32 - shift));
result.c = t;
t >>= 32;
result.high = t;
return result;
#endif
}
// Shift a 128-bit integer right, rounding up.
static swift_uint128_t shiftRightRoundingUp128(swift_uint128_t lhs, int shift) {
#if HAVE_UINT128_T
uint64_t bias = ((uint64_t)1 << shift) - 1;
return ((lhs + bias) >> shift);
#else
swift_uint128_t result;
const uint64_t bias = (1 << shift) - 1;
uint64_t t = ((uint64_t)lhs.low + bias) >> shift;
t += ((uint64_t)lhs.b << (32 - shift));
result.low = t;
t >>= 32;
t += ((uint64_t)lhs.c << (32 - shift));
result.b = t;
t >>= 32;
t += ((uint64_t)lhs.high << (32 - shift));
result.c = t;
t >>= 32;
result.high = t;
return result;
#endif
}
#endif
#if SWIFT_DTOA_FLOAT80_SUPPORT
// Multiply a 192-bit fraction by a 64-bit fraction, rounding down.
static void multiply192x64RoundingDown(swift_uint192_t *lhs, uint64_t rhs) {
#if HAVE_UINT128_T
// Compute the three 128-bit cross-products
__uint128_t cd = (__uint128_t)lhs->low * rhs;
__uint128_t bc = (__uint128_t)lhs->mid * rhs;
__uint128_t ab = (__uint128_t)lhs->high * rhs;
// Add up the three 64-bit outputs (including carries)
__uint128_t c = (cd >> 64) + (uint64_t)bc;
__uint128_t b = (bc >> 64) + (uint64_t)ab + (c >> 64);
__uint128_t a = (ab >> 64) + (b >> 64);
lhs->high = a;
lhs->mid = b;
lhs->low = c;
#else
static const uint64_t mask32 = UINT32_MAX;
uint64_t rhs0 = rhs & mask32;
uint64_t rhs1 = rhs >> 32;
uint64_t t = lhs->low * rhs0;
t >>= 32;
uint64_t a = lhs->low * rhs1;
uint64_t b = lhs->b * rhs0;
t += (a & mask32) + (b & mask32);
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->b * rhs1;
b = lhs->c * rhs0;
t += (a & mask32) + (b & mask32);
lhs->low = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->c * rhs1;
b = lhs->d * rhs0;
t += (a & mask32) + (b & mask32);
lhs->b = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->d * rhs1;
b = lhs->e * rhs0;
t += (a & mask32) + (b & mask32);
lhs->c = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->e * rhs1;
b = lhs->high * rhs0;
t += (a & mask32) + (b & mask32);
lhs->d = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
t += lhs->high * rhs1;
lhs->e = t;
lhs->high = t >> 32;
#endif
}
// Multiply a 192-bit fraction by a 64-bit fraction, rounding up.
static void multiply192x64RoundingUp(swift_uint192_t *lhs, uint64_t rhs) {
#if HAVE_UINT128_T
// Compute the three 128-bit cross-products
__uint128_t cd = (__uint128_t)lhs->low * rhs + UINT64_MAX;
__uint128_t bc = (__uint128_t)lhs->mid * rhs;
__uint128_t ab = (__uint128_t)lhs->high * rhs;
// Add up the three 64-bit outputs (including carries)
__uint128_t c = (cd >> 64) + (uint64_t)bc;
__uint128_t b = (bc >> 64) + (uint64_t)ab + (c >> 64);
__uint128_t a = (ab >> 64) + (b >> 64);
lhs->high = a;
lhs->mid = b;
lhs->low = c;
#else
static const uint64_t mask32 = UINT32_MAX;
static const uint64_t bias = mask32;
uint64_t rhs0 = rhs & mask32;
uint64_t rhs1 = rhs >> 32;
uint64_t t = lhs->low * rhs0 + bias;
t >>= 32;
uint64_t a = lhs->low * rhs1;
uint64_t b = lhs->b * rhs0;
t += (a & mask32) + (b & mask32) + bias;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->b * rhs1;
b = lhs->c * rhs0;
t += (a & mask32) + (b & mask32);
lhs->low = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->c * rhs1;
b = lhs->d * rhs0;
t += (a & mask32) + (b & mask32);
lhs->b = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->d * rhs1;
b = lhs->e * rhs0;
t += (a & mask32) + (b & mask32);
lhs->c = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = lhs->e * rhs1;
b = lhs->high * rhs0;
t += (a & mask32) + (b & mask32);
lhs->d = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
t += lhs->high * rhs1;
lhs->e = t;
lhs->high = t >> 32;
#endif
}
// Multiply a 192-bit fraction by a 64-bit integer.
// This is used in the digit generation to multiply by ten or
// 10,000. Note that rounding never appliles here.
// As used below, this will never overflow.
static void multiply192xi32(swift_uint192_t *lhs, uint32_t rhs) {
#if HAVE_UINT128_T
__uint128_t t = (__uint128_t)lhs->low * rhs;
lhs->low = (uint64_t)t;
t = (t >> 64) + (__uint128_t)lhs->mid * rhs;
lhs->mid = (uint64_t)t;
t = (t >> 64) + (__uint128_t)lhs->high * rhs;
lhs->high = (uint64_t)t;
#else
uint64_t t = (uint64_t)lhs->low * rhs;
lhs->low = t;
t = (t >> 32) + (uint64_t)lhs->b * rhs;
lhs->b = t;
t = (t >> 32) + (uint64_t)lhs->c * rhs;
lhs->c = t;
t = (t >> 32) + (uint64_t)lhs->d * rhs;
lhs->d = t;
t = (t >> 32) + (uint64_t)lhs->e * rhs;
lhs->e = t;
t = (t >> 32) + (uint64_t)lhs->high * rhs;
lhs->high = t;
#endif
}
// Multiply a 192-bit fraction by a 128-bit fraction, rounding down.
static void multiply192x128RoundingDown(swift_uint192_t *lhs, swift_uint128_t rhs) {
#if HAVE_UINT128_T
// A full multiply of three 64-bit values by two 64-bit values
// yields five such components. We discard the bottom two (except
// for carries) to get a rounded-down three-element result.
__uint128_t current = (__uint128_t)lhs->low * (uint64_t)rhs;
current = (current >> 64);
__uint128_t t = (__uint128_t)lhs->low * (rhs >> 64);
current += (uint64_t)t;
__uint128_t next = t >> 64;
t = (__uint128_t)lhs->mid * (uint64_t)rhs;
current += (uint64_t)t;
next += t >> 64;
current = next + (current >> 64);
t = (__uint128_t)lhs->mid * (rhs >> 64);
current += (uint64_t)t;
next = t >> 64;
t = (__uint128_t)lhs->high * (uint64_t)rhs;
current += (uint64_t)t;
next += t >> 64;
lhs->low = (uint64_t)current;
current = next + (current >> 64);
t = (__uint128_t)lhs->high * (rhs >> 64);
current += t;
lhs->mid = (uint64_t)current;
lhs->high = (uint64_t)(current >> 64);
#else
uint64_t a, b, c, d; // temporaries
// Six 32-bit values multiplied by 4 32-bit values. Oh my.
static const uint64_t mask32 = UINT32_MAX;
uint64_t t = lhs->low * rhs.low;
t >>= 32;
a = (uint64_t)lhs->low * rhs.b;
b = (uint64_t)lhs->b * rhs.low;
t += (a & mask32) + (b & mask32);
t >>= 32;
t += (a >> 32) + (b >> 32);
a = (uint64_t)lhs->low * rhs.c;
b = (uint64_t)lhs->b * rhs.b;
c = (uint64_t)lhs->c * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32);
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32);
a = (uint64_t)lhs->low * rhs.high;
b = (uint64_t)lhs->b * rhs.c;
c = (uint64_t)lhs->c * rhs.b;
d = (uint64_t)lhs->d * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32) + (d & mask32);
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32) + (d >> 32);
a = (uint64_t)lhs->b * rhs.high;
b = (uint64_t)lhs->c * rhs.c;
c = (uint64_t)lhs->d * rhs.b;
d = (uint64_t)lhs->e * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32) + (d & mask32);
lhs->low = t;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32) + (d >> 32);
a = (uint64_t)lhs->c * rhs.high;
b = (uint64_t)lhs->d * rhs.c;
c = (uint64_t)lhs->e * rhs.b;
d = (uint64_t)lhs->high * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32) + (d & mask32);
lhs->b = t;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32) + (d >> 32);
a = (uint64_t)lhs->d * rhs.high;
b = (uint64_t)lhs->e * rhs.c;
c = (uint64_t)lhs->high * rhs.b;
t += (a & mask32) + (b & mask32) + (c & mask32);
lhs->c = t;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32);
a = (uint64_t)lhs->e * rhs.high;
b = (uint64_t)lhs->high * rhs.c;
t += (a & mask32) + (b & mask32);
lhs->d = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
t += (uint64_t)lhs->high * rhs.high;
lhs->e = t;
lhs->high = t >> 32;
#endif
}
// Multiply a 192-bit fraction by a 128-bit fraction, rounding up.
static void multiply192x128RoundingUp(swift_uint192_t *lhs, swift_uint128_t rhs) {
#if HAVE_UINT128_T
// Same as the rounding-down version, but we add
// UINT128_MAX to the bottom two to force an extra
// carry if they are non-zero.
swift_uint128_t current = (swift_uint128_t)lhs->low * (uint64_t)rhs;
current += UINT64_MAX;
current = (current >> 64);
swift_uint128_t t = (swift_uint128_t)lhs->low * (rhs >> 64);
current += (uint64_t)t;
swift_uint128_t next = t >> 64;
t = (swift_uint128_t)lhs->mid * (uint64_t)rhs;
current += (uint64_t)t;
next += t >> 64;
// Round up by adding UINT128_MAX (upper half)
current += UINT64_MAX;
current = next + (current >> 64);
t = (swift_uint128_t)lhs->mid * (rhs >> 64);
current += (uint64_t)t;
next = t >> 64;
t = (swift_uint128_t)lhs->high * (uint64_t)rhs;
current += (uint64_t)t;
next += t >> 64;
lhs->low = (uint64_t)current;
current = next + (current >> 64);
t = (swift_uint128_t)lhs->high * (rhs >> 64);
current += t;
lhs->mid = (uint64_t)current;
lhs->high = (uint64_t)(current >> 64);
#else
uint64_t a, b, c, d; // temporaries
// Six 32-bit values multiplied by 4 32-bit values. Oh my.
static const uint64_t mask32 = UINT32_MAX;
uint64_t t = (uint64_t)lhs->low * rhs.low + mask32;
t >>= 32;
a = (uint64_t)lhs->low * rhs.b;
b = (uint64_t)lhs->b * rhs.low;
t += (a & mask32) + (b & mask32) + mask32;
t >>= 32;
t += (a >> 32) + (b >> 32);
a = (uint64_t)lhs->low * rhs.c;
b = (uint64_t)lhs->b * rhs.b;
c = (uint64_t)lhs->c * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32) + mask32;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32);
a = (uint64_t)lhs->low * rhs.high;
b = (uint64_t)lhs->b * rhs.c;
c = (uint64_t)lhs->c * rhs.b;
d = (uint64_t)lhs->d * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32) + (d & mask32) + mask32;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32) + (d >> 32);
a = (uint64_t)lhs->b * rhs.high;
b = (uint64_t)lhs->c * rhs.c;
c = (uint64_t)lhs->d * rhs.b;
d = (uint64_t)lhs->e * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32) + (d & mask32);
lhs->low = t;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32) + (d >> 32);
a = (uint64_t)lhs->c * rhs.high;
b = (uint64_t)lhs->d * rhs.c;
c = (uint64_t)lhs->e * rhs.b;
d = (uint64_t)lhs->high * rhs.low;
t += (a & mask32) + (b & mask32) + (c & mask32) + (d & mask32);
lhs->b = t;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32) + (d >> 32);
a = (uint64_t)lhs->d * rhs.high;
b = (uint64_t)lhs->e * rhs.c;
c = (uint64_t)lhs->high * rhs.b;
t += (a & mask32) + (b & mask32) + (c & mask32);
lhs->c = t;
t >>= 32;
t += (a >> 32) + (b >> 32) + (c >> 32);
a = (uint64_t)lhs->e * rhs.high;
b = (uint64_t)lhs->high * rhs.c;
t += (a & mask32) + (b & mask32);
lhs->d = t;
t >>= 32;
t += (a >> 32) + (b >> 32);
t += (uint64_t)lhs->high * rhs.high;
lhs->e = t;
lhs->high = t >> 32;
#endif
}
// Subtract two 192-bit integers or fractions.
static void subtract192x192(swift_uint192_t *lhs, swift_uint192_t rhs) {
#if HAVE_UINT128_T
swift_uint128_t t = (swift_uint128_t)lhs->low + (~rhs.low) + 1;
lhs->low = t;
t = (t >> 64) + lhs->mid + (~rhs.mid);
lhs->mid = t;
lhs->high += (t >> 64) + (~rhs.high);
#else
uint64_t t = (uint64_t)lhs->low + (~rhs.low) + 1;
lhs->low = t;
t = (t >> 32) + lhs->b + (~rhs.b);
lhs->b = t;
t = (t >> 32) + lhs->c + (~rhs.c);
lhs->c = t;
t = (t >> 32) + lhs->d + (~rhs.d);
lhs->d = t;
t = (t >> 32) + lhs->e + (~rhs.e);
lhs->e = t;
lhs->high += (t >> 32) + (~rhs.high);
#endif
}
// Compare two 192-bit integers or fractions.
static int isLessThan192x192(swift_uint192_t lhs, swift_uint192_t rhs) {
#if HAVE_UINT128_T
return (lhs.high < rhs.high)
|| (lhs.high == rhs.high
&& (lhs.mid < rhs.mid
|| (lhs.mid == rhs.mid
&& lhs.low < rhs.low)));
#else
return (lhs.high < rhs.high
|| (lhs.high == rhs.high
&& (lhs.e < rhs.e
|| (lhs.e == rhs.e
&& (lhs.d < rhs.d
|| (lhs.d == rhs.d
&& (lhs.c < rhs.c
|| (lhs.c == rhs.c
&& (lhs.b < rhs.b
|| (lhs.b == rhs.b
&& (lhs.low < rhs.low)))))))))));
#endif
}
// Shift a 192-bit integer right, rounding down.
static void shiftRightRoundingDown192(swift_uint192_t *lhs, int shift) {
#if HAVE_UINT128_T
__uint128_t t = (__uint128_t)lhs->low >> shift;
t += ((__uint128_t)lhs->mid << (64 - shift));
lhs->low = t;
t >>= 64;
t += ((__uint128_t)lhs->high << (64 - shift));
lhs->mid = t;
t >>= 64;
lhs->high = t;
#else
uint64_t t = (uint64_t)lhs->low >> shift;
t += ((uint64_t)lhs->b << (32 - shift));
lhs->low = t;
t >>= 32;
t += ((uint64_t)lhs->c << (32 - shift));
lhs->b = t;
t >>= 32;
t += ((uint64_t)lhs->d << (32 - shift));
lhs->c = t;
t >>= 32;
t += ((uint64_t)lhs->e << (32 - shift));
lhs->d = t;
t >>= 32;
t += ((uint64_t)lhs->high << (32 - shift));
lhs->e = t;
t >>= 32;
lhs->high = t;
#endif
}
// Shift a 192-bit integer right, rounding up.
// Note: The shift will always be less than 20. Someday, that
// might suggest a way to further optimize this.
static void shiftRightRoundingUp192(swift_uint192_t *lhs, int shift) {
#if HAVE_UINT128_T
const uint64_t bias = (1 << shift) - 1;
__uint128_t t = ((__uint128_t)lhs->low + bias) >> shift;
t += ((__uint128_t)lhs->mid << (64 - shift));
lhs->low = t;
t >>= 64;
t += ((__uint128_t)lhs->high << (64 - shift));
lhs->mid = t;
t >>= 64;
lhs->high = t;
#else
const uint64_t bias = (1 << shift) - 1;
uint64_t t = ((uint64_t)lhs->low + bias) >> shift;
t += ((uint64_t)lhs->b << (32 - shift));
lhs->low = t;
t >>= 32;
t += ((uint64_t)lhs->c << (32 - shift));
lhs->b = t;
t >>= 32;
t += ((uint64_t)lhs->d << (32 - shift));
lhs->c = t;
t >>= 32;
t += ((uint64_t)lhs->e << (32 - shift));
lhs->d = t;
t >>= 32;
t += ((uint64_t)lhs->high << (32 - shift));
lhs->e = t;
t >>= 32;
lhs->high = t;
#endif
}
#endif
//
// ------------ Power of 10 calculation ----------------
//
//
// ------------ Power-of-10 tables. --------------------------
//
// Grisu-style algorithms rely on being able to rapidly
// find a high-precision approximation of any power of 10.
// These values were computed by a simple script that
// relied on Python's excellent variable-length
// integer support.
#if SWIFT_DTOA_FLOAT_SUPPORT || SWIFT_DTOA_DOUBLE_SUPPORT || SWIFT_DTOA_FLOAT80_SUPPORT
// Float table
//
// The constant powers of 10 here represent pure fractions
// with a binary point at the far left. (Each number in
// this first table is implicitly divided by 2^64.)
//
// Table size: 320 bytes
//
// A 64-bit significand allows us to exactly represent
// powers of 10 up to 10^27. For larger powers, the
// value here is rounded DOWN from the exact value.
// For those powers, the value here is less than the
// exact power of 10; adding one gives a value greater
// than the exact power of 10.
//
// For single-precision Float, we use these directly
// for positive powers of 10. For negative powers of
// ten, we multiply a value here by 10^-40.
//
// For Double and Float80, we use the 28 exact values
// here to help reduce the size of those tables.
static const uint64_t powersOf10_Float[40] = {
0x8000000000000000, // x 2^1 == 10^0 exactly
0xa000000000000000, // x 2^4 == 10^1 exactly
0xc800000000000000, // x 2^7 == 10^2 exactly
0xfa00000000000000, // x 2^10 == 10^3 exactly
0x9c40000000000000, // x 2^14 == 10^4 exactly
0xc350000000000000, // x 2^17 == 10^5 exactly
0xf424000000000000, // x 2^20 == 10^6 exactly
0x9896800000000000, // x 2^24 == 10^7 exactly
0xbebc200000000000, // x 2^27 == 10^8 exactly
0xee6b280000000000, // x 2^30 == 10^9 exactly
0x9502f90000000000, // x 2^34 == 10^10 exactly
0xba43b74000000000, // x 2^37 == 10^11 exactly
0xe8d4a51000000000, // x 2^40 == 10^12 exactly
0x9184e72a00000000, // x 2^44 == 10^13 exactly
0xb5e620f480000000, // x 2^47 == 10^14 exactly
0xe35fa931a0000000, // x 2^50 == 10^15 exactly
0x8e1bc9bf04000000, // x 2^54 == 10^16 exactly
0xb1a2bc2ec5000000, // x 2^57 == 10^17 exactly
0xde0b6b3a76400000, // x 2^60 == 10^18 exactly
0x8ac7230489e80000, // x 2^64 == 10^19 exactly
0xad78ebc5ac620000, // x 2^67 == 10^20 exactly
0xd8d726b7177a8000, // x 2^70 == 10^21 exactly
0x878678326eac9000, // x 2^74 == 10^22 exactly
0xa968163f0a57b400, // x 2^77 == 10^23 exactly
0xd3c21bcecceda100, // x 2^80 == 10^24 exactly
0x84595161401484a0, // x 2^84 == 10^25 exactly
0xa56fa5b99019a5c8, // x 2^87 == 10^26 exactly
0xcecb8f27f4200f3a, // x 2^90 == 10^27 exactly
0x813f3978f8940984, // x 2^94 ~= 10^28
0xa18f07d736b90be5, // x 2^97 ~= 10^29
0xc9f2c9cd04674ede, // x 2^100 ~= 10^30
0xfc6f7c4045812296, // x 2^103 ~= 10^31
0x9dc5ada82b70b59d, // x 2^107 ~= 10^32
0xc5371912364ce305, // x 2^110 ~= 10^33
0xf684df56c3e01bc6, // x 2^113 ~= 10^34
0x9a130b963a6c115c, // x 2^117 ~= 10^35
0xc097ce7bc90715b3, // x 2^120 ~= 10^36
0xf0bdc21abb48db20, // x 2^123 ~= 10^37
0x96769950b50d88f4, // x 2^127 ~= 10^38
0xbc143fa4e250eb31, // x 2^130 ~= 10^39
};
#endif
#if SWIFT_DTOA_DOUBLE_SUPPORT
// As above, but with 128-bit fractions.
//
// Table size: 464 bytes
//
// We only store every 28th power of ten here.
// We can multiply by an exact 64-bit power of
// ten from the table above to reconstruct the
// significand for any power of 10.
static const uint64_t powersOf10_Double[] = {
// low-order half, high-order half
0x3931b850df08e738, 0x95fe7e07c91efafa, // x 2^-1328 ~= 10^-400
0xba954f8e758fecb3, 0x9774919ef68662a3, // x 2^-1235 ~= 10^-372
0x9028bed2939a635c, 0x98ee4a22ecf3188b, // x 2^-1142 ~= 10^-344
0x47b233c92125366e, 0x9a6bb0aa55653b2d, // x 2^-1049 ~= 10^-316
0x4ee367f9430aec32, 0x9becce62836ac577, // x 2^-956 ~= 10^-288
0x6f773fc3603db4a9, 0x9d71ac8fada6c9b5, // x 2^-863 ~= 10^-260
0xc47bc5014a1a6daf, 0x9efa548d26e5a6e1, // x 2^-770 ~= 10^-232
0x80e8a40eccd228a4, 0xa086cfcd97bf97f3, // x 2^-677 ~= 10^-204
0xb8ada00e5a506a7c, 0xa21727db38cb002f, // x 2^-584 ~= 10^-176
0xc13e60d0d2e0ebba, 0xa3ab66580d5fdaf5, // x 2^-491 ~= 10^-148
0xc2974eb4ee658828, 0xa54394fe1eedb8fe, // x 2^-398 ~= 10^-120
0xcb4ccd500f6bb952, 0xa6dfbd9fb8e5b88e, // x 2^-305 ~= 10^-92
0x3f2398d747b36224, 0xa87fea27a539e9a5, // x 2^-212 ~= 10^-64
0xdde50bd1d5d0b9e9, 0xaa242499697392d2, // x 2^-119 ~= 10^-36
0xfdc20d2b36ba7c3d, 0xabcc77118461cefc, // x 2^-26 ~= 10^-8
0x0000000000000000, 0xad78ebc5ac620000, // x 2^67 == 10^20 exactly
0x9670b12b7f410000, 0xaf298d050e4395d6, // x 2^160 == 10^48 exactly
0x3b25a55f43294bcb, 0xb0de65388cc8ada8, // x 2^253 ~= 10^76
0x58edec91ec2cb657, 0xb2977ee300c50fe7, // x 2^346 ~= 10^104
0x29babe4598c311fb, 0xb454e4a179dd1877, // x 2^439 ~= 10^132
0x577b986b314d6009, 0xb616a12b7fe617aa, // x 2^532 ~= 10^160
0x0c11ed6d538aeb2f, 0xb7dcbf5354e9bece, // x 2^625 ~= 10^188
0x6d953e2bd7173692, 0xb9a74a0637ce2ee1, // x 2^718 ~= 10^216
0x9d6d1ad41abe37f1, 0xbb764c4ca7a4440f, // x 2^811 ~= 10^244
0x4b2d8644d8a74e18, 0xbd49d14aa79dbc82, // x 2^904 ~= 10^272
0xe0470a63e6bd56c3, 0xbf21e44003acdd2c, // x 2^997 ~= 10^300
0x505f522e53053ff2, 0xc0fe908895cf3b44, // x 2^1090 ~= 10^328
0xcca845ab2beafa9a, 0xc2dfe19c8c055535, // x 2^1183 ~= 10^356
0x1027fff56784f444, 0xc4c5e310aef8aa17, // x 2^1276 ~= 10^384
};
#endif
#if SWIFT_DTOA_FLOAT80_SUPPORT
// Every 83rd power of 10 across the range of Float80
//
// Table size: 2,928 bytes
//
// Note: We could cut this in half at the cost of one additional
// 192-bit multiply by only storing the positive values and
// multiplying by 10^-5063 to obtain the negative ones, similar
// to how we handle Float above.
static const uint64_t powersOf10_Float80[] = {
// low 64 bits, middle 64 bits, high 64 bits
0x56825ada98468526, 0x0abc23fcfda07e29, 0x871db786569ca2dd, // x 2^-16818 ~= 10^-5063
0xe3885beff5ee930d, 0xd1e638f68c97f0e5, 0xde90d1b113564507, // x 2^-16543 ~= 10^-4980
0x5a7d22b5236bcad4, 0xbab3a28a6f0a489c, 0xb74ea21ab2946479, // x 2^-16267 ~= 10^-4897
0x7d1f78bf0f4e2878, 0xcf4aea39615ffe6e, 0x96f9351fcfdd2686, // x 2^-15991 ~= 10^-4814
0xad725c0d6c214314, 0x5bd5c19f18bd2857, 0xf8afafd6ef185238, // x 2^-15716 ~= 10^-4731
0xe418e9217ce83755, 0x801e38463183fc88, 0xccd1ffc6bba63e21, // x 2^-15440 ~= 10^-4648
0x4dcd52747e029d0c, 0x867b3096b1619df8, 0xa8b11ff4721d92fb, // x 2^-15164 ~= 10^-4565
0xed2903f7e1df2b78, 0xc846664fe1364ee8, 0x8aefaaae9060380f, // x 2^-14888 ~= 10^-4482
0xed7a7f4e1e171498, 0x7da1a627b88527f1, 0xe4dbb751faa311b0, // x 2^-14613 ~= 10^-4399
0x320796dc9b1a158c, 0x2a11a871597b8012, 0xbc7d620092481a7e, // x 2^-14337 ~= 10^-4316
0x796014ec6f4c0dcb, 0xcfa99f62903708d7, 0x9b3dee433c1311e9, // x 2^-14061 ~= 10^-4233
0x08920ae76bdb8282, 0x952b06c385a08ff6, 0xffb7a402531fd4c9, // x 2^-13786 ~= 10^-4150
0x18faa162f2c4d6b9, 0x050be8c5d21c6db6, 0xd29c7528965ae5bd, // x 2^-13510 ~= 10^-4067
0x576a6c1abab7f7e7, 0xc05fb0b5c550f28d, 0xad7617610634129e, // x 2^-13234 ~= 10^-3984
0x6cc3b2fb9cae4875, 0xe1bd5b09b7202157, 0x8edd4417ae3dc210, // x 2^-12958 ~= 10^-3901
0xfafc1dc8fd12a6f8, 0xc3f29d230036529b, 0xeb542860da9bc7d8, // x 2^-12683 ~= 10^-3818
0x9d198c56bc799f35, 0x42960adf9591f02a, 0xc1d1a4b6bc2eafc8, // x 2^-12407 ~= 10^-3735
0x1803d096e43ff4bc, 0xdef9759d6432dab7, 0x9fa18c5de65a16fb, // x 2^-12131 ~= 10^-3652
0x74872779576a577c, 0x9150140eb5101a96, 0x83793dfc83fd2f9b, // x 2^-11855 ~= 10^-3569
0x415d0667f0f88262, 0x4898d98d1314d99f, 0xd890d45a257f4644, // x 2^-11580 ~= 10^-3486
0x30a5256a610c1c72, 0x6ebca4d0365d504d, 0xb25d93f98145bdab, // x 2^-11304 ~= 10^-3403
0xfb149b1f86a46376, 0xb5143323a7f8e16c, 0x92e74be10524c389, // x 2^-11028 ~= 10^-3320
0x7b7532e25fead4c8, 0x0df6ab8ac6a0ec2f, 0xf1fb6e82e4df4a77, // x 2^-10753 ~= 10^-3237
0x3b738d6a3caae67a, 0x346b2dd31826cfed, 0xc74c79bce7fe949f, // x 2^-10477 ~= 10^-3154
0x22d3a12777cee527, 0xe185ac46f6ef1993, 0xa424ef0bb5ad3129, // x 2^-10201 ~= 10^-3071
0xb7b6bccb9f60adec, 0x30ad7df78fe30cc8, 0x8730d40821cd89f3, // x 2^-9925 ~= 10^-2988
0x21049149d72d44d5, 0x1e86debc54dd290d, 0xdeb04cb82aec22cb, // x 2^-9650 ~= 10^-2905
0xeb69d287f5e7f920, 0x8d7e84a13801d034, 0xb7688f9800d26b3a, // x 2^-9374 ~= 10^-2822
0x47b3da9aeff7df71, 0x82678774d6c0ac59, 0x970e8fd25ead01e3, // x 2^-9098 ~= 10^-2739
0x50d3e92e2e4f8210, 0x9492db3d978aaca8, 0xf8d2dcaf37504b51, // x 2^-8823 ~= 10^-2656
0xf4ce72058f777a4c, 0xb9eb2c585e924efe, 0xcceef83fdedcc506, // x 2^-8547 ~= 10^-2573
0xb0e624a35b791884, 0x7104a98dbbb38b94, 0xa8c8fc3b03c3d1ed, // x 2^-8271 ~= 10^-2490
0x6c94ecd8291e2ac9, 0x978b03821014b68c, 0x8b03518396007c08, // x 2^-7995 ~= 10^-2407
0x96475208daa03ee3, 0x10eaa1481b149e5a, 0xe4fc163319551441, // x 2^-7720 ~= 10^-2324
0xd709a820ff4ac847, 0x75aab7cb5cb15414, 0xbc980b270680156a, // x 2^-7444 ~= 10^-2241
0xf273bca6b4de9e24, 0xb5025d88d4b252e1, 0x9b53e384eccabcf7, // x 2^-7168 ~= 10^-2158
0x6c55a0603a928f40, 0x9e20db16dfb6b461, 0xffdbcf7251089796, // x 2^-6893 ~= 10^-2075
0x9006db4d43cffe42, 0x9b4bca4cd6cec2db, 0xd2ba3f510a3aa638, // x 2^-6617 ~= 10^-1992
0xa6b3c457fd0cd4d6, 0x28a4de91ba868fbf, 0xad8ea05a5f27642a, // x 2^-6341 ~= 10^-1909
0xe7b14ed140f8d98e, 0x7b2f7d61ce5d426c, 0x8ef179291b6f5424, // x 2^-6065 ~= 10^-1826
0x4a964d052fd03e10, 0x06897060bf491e6e, 0xeb75718d285cd8bf, // x 2^-5790 ~= 10^-1743
0x22b2270f0e8dd87c, 0xa8510fa2f5a9e4de, 0xc1ed0ed498f7c54c, // x 2^-5514 ~= 10^-1660
0x09102915726a9905, 0x5a0eb896edc89b54, 0x9fb8208d65ea5eda, // x 2^-5238 ~= 10^-1577
0xb80d5f481d01deb9, 0x673f2aa50486f5ba, 0x838bd699b7c539e6, // x 2^-4962 ~= 10^-1494
0x668b62b20ec2633b, 0x8682604c7123f859, 0xd8af761f94d2db2c, // x 2^-4687 ~= 10^-1411
0xb2adaed8559cc199, 0x712339ba54f12372, 0xb276ce87987995d5, // x 2^-4411 ~= 10^-1328
0x6beb873308685711, 0xac1ce34246ed56ad, 0x92fc133455668c02, // x 2^-4135 ~= 10^-1245
0x593293d68a2261bc, 0x3c368f9497ca075d, 0xf21da89a29fa1c61, // x 2^-3860 ~= 10^-1162
0x854051f9f0e4ca66, 0x8c5d5a234eda57f7, 0xc768aa46d6d1b675, // x 2^-3584 ~= 10^-1079
0x333d09b2299c5e6b, 0xcf1f49c33399c5ac, 0xa43c26a751d4f7e7, // x 2^-3308 ~= 10^-996
0x25a440d8b1620532, 0x274ebc67c3e21943, 0x8743f33df0feed29, // x 2^-3032 ~= 10^-913
0x3ca95e3deb5be648, 0x52d18ccca1c558c2, 0xdecfcc3329238dd8, // x 2^-2757 ~= 10^-830
0x59d1a7704af3acd7, 0xfae7722c6af19467, 0xb78280c024488353, // x 2^-2481 ~= 10^-747
0x78813f3e80148049, 0x73b2baf13aa1c233, 0x9723ed8a28baf5ac, // x 2^-2205 ~= 10^-664
0xf296a8198aa40fb8, 0x235532b08487fe6a, 0xf8f60e812de0cd7d, // x 2^-1930 ~= 10^-581
0xa7fbdcb40b4f648f, 0x4f20ba9a64a7f6e7, 0xcd0bf4d206072167, // x 2^-1654 ~= 10^-498
0x8dbf63ea468c724f, 0xa0e25c08b5c189d6, 0xa8e0dbe18ffb82cf, // x 2^-1378 ~= 10^-415
0x765995c6cfd406ce, 0x4c3bcb5021afcc31, 0x8b16fb203055ac76, // x 2^-1102 ~= 10^-332
0xe1d09ab6fb409872, 0x82b7e12780e7401a, 0xe51c79a85916f484, // x 2^-827 ~= 10^-249
0x89cbe2422f9e1df9, 0x7415d448f6b6f0e7, 0xbcb2b812db11a5de, // x 2^-551 ~= 10^-166
0x605fe83842e4d290, 0xc986afbe3ee11aba, 0x9b69dbe1b548ce7c, // x 2^-275 ~= 10^-83
0x0000000000000000, 0x0000000000000000, 0x8000000000000000, // x 2^1 == 10^0 exactly
0x0d6953169e1c7a1e, 0xf50a3fa490c30190, 0xd2d80db02aabd62b, // x 2^276 ~= 10^83
0x3720b80c7d8ee39d, 0xaf561aa79a10ae6a, 0xada72ccc20054ae9, // x 2^552 ~= 10^166
0x7cb3f026a212df74, 0x29cb4d87f2a7400e, 0x8f05b1163ba6832d, // x 2^828 ~= 10^249
0x7dda22f9451d28a4, 0xe41c5bd18c57e88f, 0xeb96bf6ebadf77d8, // x 2^1103 ~= 10^332
0xd5da00e6e2d05e5d, 0x5e510c5a752f0f8e, 0xc2087cd3215a16ad, // x 2^1379 ~= 10^415
0x5603ba353e0b2fac, 0x48bbddc4d7359e49, 0x9fceb7ee780436f0, // x 2^1655 ~= 10^498
0x15ceea8df15e47c7, 0x6a83c85cf158c652, 0x839e71d847c1779e, // x 2^1931 ~= 10^581
0x514478d1fcd48eea, 0x3a4181cdda0d6e24, 0xd8ce1c3a2fffaea7, // x 2^2206 ~= 10^664
0xe8b634620f1062be, 0x7304c7fb8a2f8a8a, 0xb2900ca735bdf121, // x 2^2482 ~= 10^747
0xc3ec2fd9302c9bda, 0x729a6a7e830e1cf2, 0x9310dd78089bd66f, // x 2^2758 ~= 10^830
0x1750ef5f751be079, 0x52ccabc96fc88a23, 0xf23fe788c763dffa, // x 2^3033 ~= 10^913
0xaa80925ec1c80b65, 0x97681c548ff6c12f, 0xc784decd820a6180, // x 2^3309 ~= 10^996
0xb4212d4b435a2317, 0x8df0a55abbb2c99a, 0xa453618b9dfd92db, // x 2^3585 ~= 10^1079
0x939c2fedd434642a, 0x7d7de34bf5aa96b4, 0x875715282612729b, // x 2^3861 ~= 10^1162
0xb7d9a0e46bbebb36, 0x3b2057dea52d686b, 0xdeef5022af37f1f6, // x 2^4136 ~= 10^1245
0x931a74148ea64e59, 0xfe7fe67bd1074d0c, 0xb79c7593a1c17df0, // x 2^4412 ~= 10^1328
0xabd20df0f1f1ad54, 0x0fff83cc7fa6b77b, 0x97394e479b6573b1, // x 2^4688 ~= 10^1411
0x25f2467421674b7a, 0x828ff55a248bc026, 0xf919454d86f16685, // x 2^4963 ~= 10^1494
0xe32dbd7131e6ab7d, 0xf674dd4821982084, 0xcd28f57dc585d094, // x 2^5239 ~= 10^1577
0x866ab816a532b07d, 0xdc567471f9639b4e, 0xa8f8bee890f905c7, // x 2^5515 ~= 10^1660
0xaca3a975993a2626, 0xc41cf207a71d87e4, 0x8b2aa784c405e2f1, // x 2^5791 ~= 10^1743
0x731f0b7d820918bd, 0x355fde18e8448607, 0xe53ce1b25fb31788, // x 2^6066 ~= 10^1826
0x0be7c29568db3f20, 0xc1328a3f1bf4d2b8, 0xbccd68c49888be61, // x 2^6342 ~= 10^1909
0x9c6e0b1b927b7d3f, 0xffc9b96619da642a, 0x9b7fd75a060350cd, // x 2^6618 ~= 10^1992
0x2a84c8fb4bd2edc9, 0xa679df45d339389b, 0x80121ad60ca2c518, // x 2^6894 ~= 10^2075
0x0d4648d0876cf1c3, 0x48c67661c087fb5a, 0xd2f5e0469040e0eb, // x 2^7169 ~= 10^2158
0xfe2d99a281a011ac, 0x65c13361e6b2c078, 0xadbfbcb6c676a69b, // x 2^7445 ~= 10^2241
0xf4ec157aa4147562, 0xac89bfa5e79484a6, 0x8f19ebdf7661e3e9, // x 2^7721 ~= 10^2324
0xe945f8c80090be1f, 0x9b8672e64aadbed2, 0xebb812063c9e01db, // x 2^7996 ~= 10^2407
0xca12d8ad3b36d2e4, 0xd252322ea50ad274, 0xc223eeb2e1bde452, // x 2^8272 ~= 10^2490
0xf554fb41e5b3e384, 0x977acb4d4af624fc, 0x9fe55281904ba38b, // x 2^8548 ~= 10^2573
0xf3f69093398e2573, 0x111ae5735ec0e878, 0x83b10fb893300cde, // x 2^8824 ~= 10^2656
0x30f65a8da0d10429, 0x1eecf4cf8a0b25f5, 0xd8ecc6aa93e876fc, // x 2^9099 ~= 10^2739
0xa577f5f0c9f1e5a7, 0x91a5430ed623abf0, 0xb2a94e58da4930c3, // x 2^9375 ~= 10^2822
0x202d2f87585ec0d7, 0x7a63589863efd480, 0x9325aaac89304b57, // x 2^9651 ~= 10^2905
0x049544fbba3c01a1, 0x53accb0f60ac6095, 0xf2622b4f6c68d6ce, // x 2^9926 ~= 10^2988
0x268a0d5f9d5ce861, 0xfe40703e1a91de57, 0xc7a117517a09153b, // x 2^10202 ~= 10^3071
0xb6cd470a2a3b1d63, 0x5f963916b20ea587, 0xa46a9fb9111003bc, // x 2^10478 ~= 10^3154
0xa3101c09fd8e6e96, 0xa2c6328011db5211, 0x876a39c722f798a7, // x 2^10754 ~= 10^3237
0x8507e5fdb0ec5d83, 0x7ce93cc7f8feeed4, 0xdf0ed8875e7b8914, // x 2^11029 ~= 10^3320
0xe19b7ebe4c7bfbca, 0x40930d1129943838, 0xb7b66e12fe1af499, // x 2^11305 ~= 10^3403
0xc90c4ec15c21a357, 0xd91c86512d147305, 0x974eb20b241a65f6, // x 2^11581 ~= 10^3486
0xb90341199c02a4eb, 0x69e684f53db6e8ce, 0xf93c8114f6c31f8a, // x 2^11856 ~= 10^3569
0xa873f1318cef91cb, 0xbf8718466b31a7ca, 0xcd45fa43b1ce4c8e, // x 2^12132 ~= 10^3652
0xacfe0dcc5262e273, 0x6e1bbb68662fd27a, 0xa910a550810203f8, // x 2^12408 ~= 10^3735
0x59e7c8921bbe3758, 0x834743c5eab7dcea, 0x8b3e56b1b5c57589, // x 2^12684 ~= 10^3818
0x145eed64fda2e6af, 0x1c605bdcc764238f, 0xe55d4e51d30b5592, // x 2^12959 ~= 10^3901
0xb747164b17268ea2, 0xd8aa19f1d85da07d, 0xbce81d3cc784a1ca, // x 2^13235 ~= 10^3984
0x3666af1cb2f0356b, 0xc8dd55687a68bb70, 0x9b95d5ee4f80366d, // x 2^13511 ~= 10^4067
0x70d67261b5bde1e9, 0x1d76f2d15166ec20, 0x8024383bab19730d, // x 2^13787 ~= 10^4150
0x084a3ba0b748546a, 0xc67f9026f83dca47, 0xd313b714d3a1c65e, // x 2^14062 ~= 10^4233
0x4411a8127eea085e, 0x441eb397ffcdab0d, 0xadd8501ad0361d15, // x 2^14338 ~= 10^4316
0x7b62a54ed6233032, 0x75458a1c8300e014, 0x8f2e2985332eae98, // x 2^14614 ~= 10^4399
0x162d5b51a1dd9594, 0x655bb1b7aa4e8196, 0xebd96954582af06f, // x 2^14889 ~= 10^4482
0x55e6c62f920d3682, 0x79fd57cf7c37941c, 0xc23f6474669f4abe, // x 2^15165 ~= 10^4565
0x19482fa0ac45669c, 0x803c1cd864033781, 0x9ffbf04722750449, // x 2^15441 ~= 10^4648
0xa412d1f95f4624cd, 0xc95abe9ce589e048, 0x83c3b03af95c9674, // x 2^15717 ~= 10^4731
0xc1207e487c57b4e1, 0xf93dd2c7669a8ed1, 0xd90b75715d861b38, // x 2^15992 ~= 10^4814
0xeb20d9a25e0372bd, 0xb5073df6adc221b4, 0xb2c2939d0763fcac, // x 2^16268 ~= 10^4897
0x1a648c339e28cc45, 0xbd14f0fa3e24b6ae, 0x933a7ad2419ea0b5, // x 2^16544 ~= 10^4980
};
#endif
#if SWIFT_DTOA_FLOAT_SUPPORT || SWIFT_DTOA_DOUBLE_SUPPORT || SWIFT_DTOA_FLOAT80_SUPPORT
// The power-of-10 tables do not directly store the associated binary
// exponent. That's because the binary exponent is a simple linear
// function of the decimal power (and vice versa), so it's just as
// fast (and uses much less memory) to compute it:
// The binary exponent corresponding to a particular power of 10.
// This matches the power-of-10 tables across the full range of Float80.
static int binaryExponentFor10ToThe(int p) {
return (int)(((((int64_t)p) * 55732705) >> 24) + 1);
}
// A decimal exponent that approximates a particular binary power.
static int decimalExponentFor2ToThe(int e) {
return (int)(((int64_t)e * 20201781) >> 26);
}
#endif
#if SWIFT_DTOA_FLOAT_SUPPORT
// Given a power `p`, this returns three values:
// * 64-bit fractions `lower` and `upper`
// * integer `exponent`
//
// The returned values satisty the following:
// ```
// lower * 2^exponent <= 10^p <= upper * 2^exponent
// ```
//
// In particular, if `10^p` can be exactly represented, this routine
// may return the same value for `lower` and `upper`.
//
static void intervalContainingPowerOf10_Float(int p, uint64_t *lower, uint64_t *upper, int *exponent) {
if (p < 0) {
uint64_t base = powersOf10_Float[p + 40];
int baseExponent = binaryExponentFor10ToThe(p + 40);
uint64_t tenToTheMinus40 = 0x8b61313bbabce2c6; // x 2^-132 ~= 10^-40
*lower = multiply64x64RoundingDown(base, tenToTheMinus40);
*upper = multiply64x64RoundingUp(base + 1, tenToTheMinus40 + 1);
*exponent = baseExponent - 132;
} else if (p <= 27) {
uint64_t exact = powersOf10_Float[p];
*upper = exact;
*lower = exact;
*exponent = binaryExponentFor10ToThe(p);
} else {
uint64_t exact = powersOf10_Float[p];
*upper = exact + 1;
*lower = exact;
*exponent = binaryExponentFor10ToThe(p);
}
}
#endif
#if SWIFT_DTOA_DOUBLE_SUPPORT
// As above, but returning 128-bit fractions suitable for
// converting doubles.
static void intervalContainingPowerOf10_Double(int p, swift_uint128_t *lower, swift_uint128_t *upper, int *exponent) {
if (p >= 0 && p <= 54) {
if (p <= 27) {
// Use one 64-bit exact value
swift_uint128_t exact;
initialize128WithHigh64(exact, powersOf10_Float[p]);
*upper = exact;
*lower = exact;
*exponent = binaryExponentFor10ToThe(p);
return;
} else {
// Multiply two 64-bit exact values to get a 128-bit exact value
swift_uint128_t base;
initialize128WithHigh64(base, powersOf10_Float[p - 27]);
int baseExponent = binaryExponentFor10ToThe(p - 27);
uint64_t extra = powersOf10_Float[27];
int extraExponent = binaryExponentFor10ToThe(27);
swift_uint128_t exact = multiply128x64RoundingDown(base, extra);
*upper = exact;
*lower = exact;
*exponent = baseExponent + extraExponent;
return;
}
}
// Multiply a 128-bit approximate value with a 64-bit exact value
int index = p + 400;
// Copy a pair of uint64_t into a swift_uint128_t
const uint64_t *base_p = powersOf10_Double + (index / 28) * 2;
swift_uint128_t base;
initialize128WithHighLow64(base, base_p[1], base_p[0]);
int extraPower = index % 28;
int baseExponent = binaryExponentFor10ToThe(p - extraPower);
int e = baseExponent;
if (extraPower > 0) {
int64_t extra = powersOf10_Float[extraPower];
e += binaryExponentFor10ToThe(extraPower);
*lower = multiply128x64RoundingDown(base, extra);
increment128(base);
*upper = multiply128x64RoundingUp(base, extra);
} else {
*lower = base;
increment128(base);
*upper = base;
}
*exponent = e;
}
#endif
#if SWIFT_DTOA_FLOAT80_SUPPORT
// As above, but returning 192-bit fractions suitable for
// converting float80.
static void intervalContainingPowerOf10_Float80(int p, swift_uint192_t *lower, swift_uint192_t *upper, int *exponent) {
if (p >= 0 && p <= 27) {
// We have an exact form, return a zero-width interval.
uint64_t exact = powersOf10_Float[p];
initialize192WithHighMidLow64(*upper, exact, 0, 0);
initialize192WithHighMidLow64(*lower, exact, 0, 0);
*exponent = binaryExponentFor10ToThe(p);
return;
}
int index = p + 5063;
const uint64_t *base_p = powersOf10_Float80 + (index / 83) * 3;
// Note: The low-order value in the Float80 table above
// is never UINT64_MAX, so there's never a carry from
// the increment here.
initialize192WithHighMidLow64(*upper, base_p[2], base_p[1], base_p[0] + 1);
initialize192WithHighMidLow64(*lower, base_p[2], base_p[1], base_p[0]);
int extraPower = index % 83;
int e = binaryExponentFor10ToThe(p - extraPower);
while (extraPower > 27) {
uint64_t power27 = powersOf10_Float[27];
multiply192x64RoundingDown(lower, power27);
multiply192x64RoundingUp(upper, power27);
e += binaryExponentFor10ToThe(27);
extraPower -= 27;
}
if (extraPower > 0) {
uint64_t extra = powersOf10_Float[extraPower];
multiply192x64RoundingDown(lower, extra);
multiply192x64RoundingUp(upper, extra);
e += binaryExponentFor10ToThe(extraPower);
}
*exponent = e;
}
#endif