[stdlib] Unify String hashing implementation (#14921)

* Add partial range subscripts to _UnmanagedOpaqueString

* Use SipHash13+_NormalizedCodeUnitIterator for String hashes on all platforms

* Remove unnecessary collation algorithm shims

* Pass the buffer to the SipHasher for ASCII

* Hash the ASCII parts of UTF-16 strings the same way we hash pure ASCII strings

* De-dupe some code that can be shared between _UnmanagedOpaqueString and _UnmanagedString<UInt16>

* ASCII strings now hash consistently in hashASCII() and hashUTF16()

* Fix zalgo comparison regression

* Use hasher

* Fix crash when appending to an empty _FixedArray

* Compact ASCII characters into a single UInt64 for hashing

* String: Switch to _hash(into:)-based hashing

This should speed up String hashing quite a bit, as doing it through hashValue involves two rounds of SipHash nested in each other.

* Remove obsolete workaround for ARC traffic

* Ditch _FixedArray<UInt8> in favor of _UIntBuffer<UInt64, UInt8>

* Remove bad rebase remnants

* Fix failing benchmarks

* Address Michael's feedback

* Clarify the comment about nul-terminated string hashes
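
The note above about hashValue costing two nested rounds of SipHash can be illustrated with a rough sketch. It uses the public `Hasher` and `hash(into:)` API from later Swift releases as a stand-in for the internal `_hash(into:)` entry point named in this commit; `Key` and `nestedHash` are hypothetical names made up for the illustration and are not part of the change.

```swift
// Hypothetical key type, used only to illustrate the two hashing styles.
struct Key: Hashable {
    var name: String
    var id: Int

    // Single pass: every component is fed into the caller's hasher,
    // which is finalized once by whoever owns it.
    func hash(into hasher: inout Hasher) {
        hasher.combine(name)
        hasher.combine(id)
    }
}

// Nested pattern, roughly what going through hashValue amounts to:
// an inner hasher is run to completion, and its result is then fed
// into a second hasher that is finalized again.
func nestedHash(_ key: Key) -> Int {
    var inner = Hasher()
    inner.combine(key.name)
    let innerValue = inner.finalize()   // first complete hashing round

    var outer = Hasher()
    outer.combine(innerValue)           // second round over the first result
    outer.combine(key.id)
    return outer.finalize()
}
```

Both styles give equal keys equal hashes within one process; the difference is only that the nested form pays for an extra finalization per component.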
Lance Parker
2018-03-17 22:13:37 -07:00
committed by GitHub
parent f5d43e2df9
commit cbf157f924
9 changed files with 228 additions and 401 deletions


@@ -127,8 +127,8 @@ extension _FixedArray${N} {
   @_versioned
   internal mutating func append(_ newElement: T) {
     _sanityCheck(count < capacity)
-    self[count] = newElement
     _count += 1
+    self[count-1] = newElement
   }
 }
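
The reordering in this hunk can be reproduced with a small stand-alone sketch. It assumes the subscript checks its index against `count` rather than `capacity`, which is what would make the old order trap on an empty array; the hunk itself does not show the subscript. `SmallBuffer` is a hypothetical type backed by a plain array, not the gyb-generated `_FixedArray`.

```swift
// Hypothetical fixed-capacity buffer; only mirrors the shape of the fix.
struct SmallBuffer {
    private var storage = [Int](repeating: 0, count: 4)
    private(set) var count = 0
    let capacity = 4

    init() {}

    subscript(i: Int) -> Int {
        get {
            precondition(i >= 0 && i < count, "index out of range")
            return storage[i]
        }
        set {
            // Assumed behavior: writes are validated against count,
            // so index `count` itself is rejected.
            precondition(i >= 0 && i < count, "index out of range")
            storage[i] = newValue
        }
    }

    mutating func append(_ newElement: Int) {
        precondition(count < capacity, "buffer is full")
        // Grow first, then write to the new last slot. Writing
        // self[count] before the increment would fail the bounds
        // check above when count is 0.
        count += 1
        self[count - 1] = newElement
    }
}
```

Under that assumption, appending to an empty buffer succeeds only when the count is bumped before the element is stored.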