Commit Graph

23 Commits

Author SHA1 Message Date
Michael Ilseman
8294c0003a [string] Drop _StringGuts subscript; NFC
_StringGuts shouldn't expose a subscript, implying efficient
access. Switch to the explicit code unit fetch method. Update tests
accordingly, and switch off of deprecated typealiases.
2018-08-02 16:34:22 -07:00
Tony Allevato
d0e93acb00 Various fixes to Unicode.Scalar.Properties.
- numericValue returns nil instead of .nan for non-numerics
- Remove small-string optimizations from _scalarName that failed on 32-bit archs
- Put case mappings back into U.S.Properties
- Added more sanity tests
2018-07-05 20:42:56 -07:00
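
The behavior described in this commit can be illustrated with the public Unicode.Scalar.Properties API. This is a minimal sketch, assuming Swift 5 or later, where these properties are available; the variable names are illustrative only.

```swift
// Assumes Swift 5 or later, where Unicode.Scalar.Properties is public API.
let digitScalar: Unicode.Scalar = "4"
let halfScalar: Unicode.Scalar = "\u{00BD}"    // VULGAR FRACTION ONE HALF
let letterScalar: Unicode.Scalar = "A"

// Numeric scalars report their value as a Double.
let digitValue = digitScalar.properties.numericValue
let halfValue = halfScalar.properties.numericValue

// Non-numeric scalars report nil rather than .nan.
let letterValue = letterScalar.properties.numericValue

// Case mappings are Strings, because one scalar can map to several.
let dottedI: Unicode.Scalar = "\u{0130}"       // LATIN CAPITAL LETTER I WITH DOT ABOVE
let loweredDottedI = dottedI.properties.lowercaseMapping

print(digitValue as Any, halfValue as Any, letterValue as Any)
print(loweredDottedI.unicodeScalars.count)
```

Returning nil lets callers use optional binding instead of checking isNaN on every result.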
Tony Allevato
8eef50f6a9 Merge branch 'master' into unicode-properties 2018-07-04 08:42:35 -07:00
Slava Pestov
5d1f48e3ae stdlib: Update for stricter enforcement of @usableFromInline 2018-06-25 16:26:56 -07:00
Erik Little
863f3a19ff Rename @effects to @_effects
@effects is too low a level, and not meant for general usage outside
the standard library. Therefore it deserves to be underscored like
other such attributes.
2018-06-06 12:53:03 -04:00
Michael Ilseman
ebdd5e6d98 [string] Fast-path for small string comparison
Promote small-string to small-string comparison into the fast path for
equality and less-than.

Small ASCII strings that are not binary equal do not compare equal,
allowing us to early exit. Small ASCII strings otherwise compare
lexicographically, which we can call prior to jumping through a few
intermediaries.
2018-05-24 14:47:04 -07:00
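
The reasoning in this commit can be sketched on plain byte arrays. The helpers below (asciiEqual, asciiLess) are hypothetical and are not the standard library's internal code; they only show why all-ASCII content permits an early exit and a direct lexicographic comparison.

```swift
// Hypothetical helpers for illustration; not the standard library's internals.
// Both assume every byte is ASCII (less than 0x80).
func asciiEqual(_ lhs: [UInt8], _ rhs: [UInt8]) -> Bool {
    // ASCII is unaffected by normalization, so differing bytes mean unequal strings.
    return lhs == rhs
}

func asciiLess(_ lhs: [UInt8], _ rhs: [UInt8]) -> Bool {
    // For ASCII, byte-wise lexicographic order matches String ordering.
    return lhs.lexicographicallyPrecedes(rhs)
}

let appleBytes = Array("apple".utf8)
let applyBytes = Array("apply".utf8)

print(asciiEqual(appleBytes, applyBytes))
print(asciiLess(appleBytes, applyBytes))
```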
Michael Ilseman
ebbfd8c639 [string] Comparison bug fix: Kelvin
Unicode Kelvin sign normalizes to ASCII 'K', but our comparison logic
didn't handle this situation when the other side was single-byte all
ASCII. Fall back to the slow comparison path if the point of
difference between an all-ASCII string and a UTF-16 string falls on
such a non-ASCII-yet-normalizes-to-ASCII scalar (rare).
2018-04-23 17:45:04 -07:00
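
The situation behind this fix can be reproduced with the public String API. This short sketch shows only the observable behavior, not the internal fast and slow paths; the variable names are illustrative.

```swift
let kelvinSign = "\u{212A}"   // KELVIN SIGN
let asciiK = "K"              // LATIN CAPITAL LETTER K

// The two strings contain different scalars and different UTF-8 bytes...
let sameScalars = kelvinSign.unicodeScalars.elementsEqual(asciiK.unicodeScalars)
let kelvinBytes = Array(kelvinSign.utf8)

// ...yet String equality uses canonical equivalence, so they compare equal.
let comparesEqual = (kelvinSign == asciiK)

print(sameScalars, kelvinBytes, comparesEqual)
```

A comparison that stops at the first differing code unit would wrongly report these strings as unequal, which is why the fix falls back to the slow path at such a scalar.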
Tony Allevato
54f4c77ce7 [stdlib] Revert hasNormalizationBoundaryBefore
This property is too specific in that it forces a particular normalization; let's not expose it this way, but instead in the future with a full normalization API.
2018-04-22 12:01:03 -07:00
Slava Pestov
2e5aef9c8d stdlib: Remove redundant @usableFromInline attributes 2018-04-06 00:02:30 -07:00
Tony Allevato
fb9f7ecca1 Merge branch 'master' into unicode-properties 2018-03-31 09:54:18 -07:00
Slava Pestov
e1f50b2d36 SE-0193: Rename @_inlineable to @inlinable, @_versioned to @usableFromInline 2018-03-30 21:55:30 -07:00
Tony Allevato
5a50f27ae9 [stdlib] Migrate normalization usage to public properties 2018-03-28 06:55:53 -07:00
Michael Ilseman
93d6130066 [string] Integrate small strings.
Switch StringObject and StringGuts from opaquely storing tagged cocoa
strings into storing small strings. Plumb small string support
throughout the standard library's routines.
2018-03-27 14:00:59 -07:00
Jordan Rose
9034ba617b Ban @_fixed_layout on enums in favor of @_frozen
In theory there could be a "fixed-layout" enum that's not exhaustive
but promises not to add any more cases with payloads, but we don't
need that distinction today.

(Note that @objc enums are still "fixed-layout" in the actual sense of
"having a compile-time known layout". There's just no special way to
spell that.)
2018-03-20 14:49:10 -07:00
Lance Parker
cbf157f924 [stdlib] Unify String hashing implementation (#14921)
* Add partial range subscripts to _UnmanagedOpaqueString

* Use SipHash13+_NormalizedCodeUnitIterator for String hashes on all platforms

* Remove unnecessary collation algorithm shims

* Pass the buffer to the SipHasher for ASCII

* Hash the ASCII parts of UTF16 strings the same way we hash pure ASCII strings

* De-dupe some code that can be shared between _UnmanagedOpaqueString and _UnmanagedString<UInt16>

* ASCII strings now hash consistently in hashASCII() and hashUTF16()

* Fix zalgo comparison regression

* Use hasher

* Fix crash when appending to an empty _FixedArray

* Compact ASCII characters into a single UInt64 for hashing

* String: Switch to _hash(into:)-based hashing

This should speed up String hashing quite a bit, as doing it through hashValue involves two rounds of SipHash nested in each other.

* Remove obsolete workaround for ARC traffic

* Ditch _FixedArray<UInt8> in favor of _UIntBuffer<UInt64, UInt8>

* Bad rebase remnants

* Fix failing benchmarks

* Address Michael's feedback

* clarify the comment about nul-terminated string hashes
2018-03-17 22:13:37 -07:00
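
The effect of hasher-based, normalization-aware hashing can be observed from user code. This is a sketch assuming the public Hasher and hash(into:) API from Swift 4.2 and later, which may differ in detail from the underscored _hash(into:) named in this commit; the Label type is hypothetical.

```swift
// Hypothetical type for illustration. It feeds its fields into one Hasher
// instead of combining separately computed hashValue results.
struct Label: Hashable {
    var text: String
    var id: Int

    func hash(into hasher: inout Hasher) {
        hasher.combine(text)
        hasher.combine(id)
    }
}

let composed = "caf\u{E9}"      // precomposed e with acute accent
let decomposed = "cafe\u{301}"  // e followed by COMBINING ACUTE ACCENT

// Canonically equivalent strings compare equal, so they must also hash equally.
let stringsEqual = (composed == decomposed)
let hashesEqual = (composed.hashValue == decomposed.hashValue)
let labelsHashEqual =
    Label(text: composed, id: 1).hashValue == Label(text: decomposed, id: 1).hashValue

print(stringsEqual, hashesEqual, labelsHashEqual)
```

Hash values are seeded per process, so only equality of hashes within a single run is meaningful, not the numeric values themselves.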
Sho Ikeda
415ee8d703 [gardening] Change static internal to internal static for consistency
Before the changes:

- `git grep "internal static " | wc -l`: 161
- `git grep "static internal " | wc -l`: 10
2018-03-13 02:29:56 +09:00
Lance Parker
7b55cccf1e shorterPrefixesOther doesn't consume the longer segment, it only invalidates the shorter one. The pathological case needs to compare the entire segment, not skip the end of the longer one. 2018-02-28 11:27:40 -08:00
Lance Parker
4f9bd18c46 Gardening, hasNormalizationBoundary(after:) doesn't need a count parameter 2018-02-28 11:27:40 -08:00
Lance Parker
ecfb6c931e Fix indentation (#21)
* Ditched the simple/complex test distinction as they all pass now

* fixed indentation
2018-02-19 10:14:59 -08:00
Michael Ilseman
588e29633a WIP: Attempt at re-organizing a little bit of comparison code 2018-02-18 13:53:23 -08:00
Lance Parker
0661de22a2 [stdlib] Un-revert string comparison (#14694)
Restore (un-revert) string comparison, with fixes

More exhaustive testing of opaque strings, which consistently reproduces prior sporadic failure. Shims fixups. Some test tweaking.
2018-02-18 10:50:33 -08:00
Lance Parker
abe6a6d177 Revert string comparison (#14657) 2018-02-15 14:37:43 -08:00
Lance Parker
897963a6f8 Unified String comparison strategy for all platforms 2018-02-14 15:44:11 -08:00