_StringGuts shouldn't expose a subscript, because a subscript implies
efficient access. Switch to the explicit code unit fetch method. Update
tests accordingly, and switch off of deprecated typealiases.
- numericValue returns nil instead of .nan for non-numerics
- Remove small-string optimizations from _scalarName that failed on 32-bit archs
- Put case mappings back into Unicode.Scalar.Properties
- Add more sanity tests
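
A minimal sketch of the nil-for-non-numerics behavior, using the public
Unicode.Scalar.Properties API as it ships in Swift 5. It assumes the
released API matches what this change describes; the variable names are
illustrative only.

```swift
// Sketch only: exercises the public API from released Swift 5 toolchains,
// where numericValue is declared as Double?.
let five: Unicode.Scalar = "5"
let half: Unicode.Scalar = "\u{00BD}"   // VULGAR FRACTION ONE HALF
let letter: Unicode.Scalar = "A"

let fiveValue = five.properties.numericValue      // Optional(5.0)
let halfValue = half.properties.numericValue      // Optional(0.5)
let letterValue = letter.properties.numericValue  // nil, not .nan

// Case mappings are Strings on the same Properties type, because a
// mapping may expand one scalar into several.
let upper = ("a" as Unicode.Scalar).properties.uppercaseMapping

print(fiveValue as Any, halfValue as Any, letterValue as Any, upper)
```

Returning an optional lets callers use `if let` rather than checking
`isNaN` on every result.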
@effects is too low-level and not meant for general use outside the
standard library. It should therefore be underscored like other such
attributes.
Promote small-string to small-string comparison into the fast path for
equality and less-than.
Small ASCII strings that are not binary equal do not compare equal,
allowing us to exit early. Otherwise, small ASCII strings compare
lexicographically, and we can run that comparison directly before
jumping through a few intermediaries.
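
The idea can be sketched as follows. This is not the standard library's
implementation: `SmallASCII`, `fastEqual`, and `fastLess` are
hypothetical names, and a plain byte array stands in for the real
small-string representation.

```swift
// Hypothetical stand-in for a small ASCII string: just its raw bytes.
struct SmallASCII {
    var bytes: [UInt8]

    init(_ s: String) {
        precondition(s.utf8.allSatisfy { $0 < 0x80 }, "ASCII only")
        bytes = Array(s.utf8)
    }
}

// ASCII is unaffected by normalization, so binary inequality implies
// string inequality and we can answer without a slow path.
func fastEqual(_ a: SmallASCII, _ b: SmallASCII) -> Bool {
    return a.bytes == b.bytes
}

// For the same reason, ordering reduces to lexicographic byte order.
func fastLess(_ a: SmallASCII, _ b: SmallASCII) -> Bool {
    return a.bytes.lexicographicallyPrecedes(b.bytes)
}

print(fastEqual(SmallASCII("abc"), SmallASCII("abd")))
print(fastLess(SmallASCII("abc"), SmallASCII("abd")))
```

For all-ASCII inputs, the results should agree with `==` and `<` on the
corresponding String values.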
Unicode Kelvin sign normalizes to ASCII 'K', but our comparison logic
didn't handle this situation when the other side was single-byte all
ASCII. Fall back to the slow comparison path if the point of
difference between an all-ASCII string and a UTF-16 string falls on
such a non-ASCII-yet-normalizes-to-ASCII scalar (rare).
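
A small illustration of the case described above, using only public
String API. The variable names are illustrative.

```swift
let kelvin = "\u{212A}"   // KELVIN SIGN: non-ASCII, three UTF-8 code units
let asciiK = "K"          // LATIN CAPITAL LETTER K

// The scalars differ, so a raw code unit comparison would call these unequal.
let sameScalars = kelvin.unicodeScalars.elementsEqual(asciiK.unicodeScalars)

// String comparison honors canonical equivalence, so they compare equal.
let equalAsStrings = (kelvin == asciiK)

// Ordering must also see through the normalization: the point of
// difference falls on the Kelvin sign, which orders as "K" (before "L").
let ordersAsK = ("a\u{212A}" < "aL")

print(sameScalars, equalAsStrings, ordersAsK)
```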
This property is too specific in that it forces a particular normalization. Let's not expose it this way; instead, expose it in the future as part of a full normalization API.
Switch StringObject and StringGuts from opaquely storing tagged Cocoa
strings to storing small strings. Plumb small string support
throughout the standard library's routines.
In theory there could be a "fixed-layout" enum that's not exhaustive
but promises not to add any more cases with payloads, but we don't
need that distinction today.
(Note that @objc enums are still "fixed-layout" in the actual sense of
"having a compile-time known layout". There's just no special way to
spell that.)
* Add partial range subscripts to _UnmanagedOpaqueString
* Use SipHash13+_NormalizedCodeUnitIterator for String hashes on all platforms
* Remove unnecessary collation algorithm shims
* Pass the buffer to the SipHasher for ASCII
* Hash the ASCII parts of UTF-16 strings the same way we hash pure ASCII strings
* De-dupe some code that can be shared between _UnmanagedOpaqueString and _UnmanagedString<UInt16>
* ASCII strings now hash consistently in hashASCII() and hashUTF16()
* Fix zalgo comparison regression
* Use hasher
* Fix crash when appending to an empty _FixedArray
* Compact ASCII characters into a single UInt64 for hashing
* String: Switch to _hash(into:)-based hashing
This should speed up String hashing quite a bit, as doing it through hashValue involves two rounds of SipHash nested in each other.
* Remove obsolete workaround for ARC traffic
* Ditch _FixedArray<UInt8> in favor of _UIntBuffer<UInt64, UInt8>
* Bad rebase remnants
* Fix failing benchmarks
* Address Michael's feedback
* Clarify the comment about nul-terminated string hashes
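
A rough sketch of the "compact ASCII into a single UInt64" idea from the
list above, combined with feeding a Hasher directly. This is not the
standard library's code: `packASCII` and `hashASCIIBytes` are
hypothetical helpers, and the little-endian packing order and the
trailing length are assumptions made for illustration.

```swift
// Hypothetical helper: pack ASCII bytes eight at a time into UInt64
// words, with the first byte of each group in the low-order position.
func packASCII(_ bytes: [UInt8]) -> [UInt64] {
    var words: [UInt64] = []
    var current: UInt64 = 0
    var filled = 0
    for byte in bytes {
        precondition(byte < 0x80, "ASCII only")
        current |= UInt64(byte) << UInt64(8 * filled)
        filled += 1
        if filled == 8 {
            words.append(current)
            current = 0
            filled = 0
        }
    }
    if filled > 0 { words.append(current) }   // partial last word, zero padded
    return words
}

// Hypothetical helper: feed whole words to the hasher instead of single bytes.
func hashASCIIBytes(_ bytes: [UInt8], into hasher: inout Hasher) {
    for word in packASCII(bytes) { hasher.combine(word) }
    // Assumption: mix in the length so zero padding cannot cause collisions
    // between inputs that differ only by trailing NUL bytes.
    hasher.combine(bytes.count)
}

var hasher = Hasher()
hashASCIIBytes(Array("hello".utf8), into: &hasher)
print(hasher.finalize())
```

Feeding one Hasher through `hash(into:)`-style code avoids computing an
intermediate `hashValue` and hashing it a second time.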
Restore (un-revert) string comparison, with fixes
More exhaustive testing of opaque strings, which consistently reproduces prior sporadic failure. Shims fixups. Some test tweaking.