AnyHashable has numerous edge cases where two AnyHashable values compare equal but produce different hashes. This breaks Set and Dictionary invariants and can cause unexpected behavior and/or traps. This change overhauls AnyHashable's implementation to fix these edge cases, hopefully without introducing new issues.
- Fix transitivity of ==. Previously, comparisons involving AnyHashable values with Objective-C provenance were handled specially, breaking Equatable:
let a = (42 as Int as AnyHashable)
let b = (42 as NSNumber as AnyHashable)
let c = (42 as Double as AnyHashable)
a == b // true
b == c // true
a == c // was false(!), now true
let d = ("foo" as AnyHashable)
let e = ("foo" as NSString as AnyHashable)
let f = ("foo" as NSString as NSAttributedStringKey as AnyHashable)
d == e // true
e == f // true
d == f // was false(!), now true
- Fix Hashable conformance for numeric types boxed into AnyHashable:
b == c // true
b.hashValue == c.hashValue // was false(!), now true
Fixing this required adding a custom AnyHashable box for all standard integer and floating point types. The custom box was needed to ensure that two AnyHashables containing the same number compare equal and hash the same way, no matter what their original type was. (This behavior is required to ensure consistency with NSNumber, which has not been preserving types since SE-0170.)
- Add custom AnyHashable representations for Arrays, Sets and Dictionaries, so that when they contain numeric types, they hash correctly under the new rules above.
- Remove AnyHashable._usedCustomRepresentation. The provenance of a value should not affect its behavior.
- Allow AnyHashable values to be downcast into compatible types more often.
- Forward _rawHashValue(seed:) to AnyHashable box. This fixes AnyHashable hashing for types that customize single-shot hashing.
https://bugs.swift.org/browse/SR-7496
rdar://problem/39648819
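The numeric part of the behavior described above can be checked roughly as follows. This is a minimal sketch that avoids Foundation, so it covers only the Int/Double cases and not the NSNumber or NSString ones; it assumes a standard library that includes this change. The names `fromInt`, `fromDouble` and `table` are made up for the example.

```swift
// Box the same number starting from two different static types.
let fromInt = 42 as Int as AnyHashable
let fromDouble = 42 as Double as AnyHashable

// With the custom numeric boxes, equality should not depend on the original type...
print(fromInt == fromDouble)
// ...and equal values must hash the same for Set and Dictionary to work.
print(fromInt.hashValue == fromDouble.hashValue)

// Practical consequence: a key stored as Int can be looked up as Double.
var table: [AnyHashable: String] = [:]
table[fromInt] = "forty-two"
print(table[fromDouble] ?? "missing")
```

On an older standard library the first comparison is expected to differ, which is the inconsistency this change is meant to remove.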
Drop append-related @inlinable annotations for String, StringGuts,
StringStorage, and the Views. Drop several for larger operations, such
as case conversion. Drop as many as we can from StringGuts for now.
With the number of tests Swift does, this had a relatively high chance to fail
regularly somewhere.
Also, rejecting the lower tail means rejecting things that are perfectly
uniform, which I don't think should be the purpose of this test.
I noticed this during testing, but it has nothing to do with the other changes
in this PR. This static violation has always been present as a warning and would
continue to be a warning after my changes.
When Set/Dictionary is nested in another Set, the boundaries of the nested collections weren’t correctly delineated in commutative hashing.
For example, these Sets all hashed the same:
[[1, 2], [3, 4]]
[[1, 3], [2, 4]]
[[1, 4], [2, 3]]
Hash collisions could thus be systematically generated.
To fix this, remove collection-level support for one-shot hashing and revert to the previous method of generating hash values. (Set is still able to support one-shot hashing for its members, though.)
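To make the collision concrete, here is a simplified model of the problem, not the standard library's actual algorithm. It assumes a commutative combiner based on XOR and a made-up per-element hash (`elementHash`); `flattenedHash` and `delineatedHash` are hypothetical names used only for this sketch.

```swift
// Toy per-element hash; the constant is arbitrary and only spreads small integers out.
func elementHash(_ value: Int) -> UInt64 {
    return UInt64(truncatingIfNeeded: value) &* 0x9E37_79B9_7F4A_7C15
}

// Buggy variant: each inner set contributes the raw XOR of its members, so the
// outer XOR cannot tell where one inner set ends and the next begins.
func flattenedHash(_ outer: Set<Set<Int>>) -> UInt64 {
    var result: UInt64 = 0
    for inner in outer {
        var innerHash: UInt64 = 0
        for member in inner { innerHash ^= elementHash(member) }
        result ^= innerHash
    }
    return result
}

// Delineated variant: each inner set is hashed to completion through Hasher
// before being merged, so the inner boundaries influence the result.
func delineatedHash(_ outer: Set<Set<Int>>) -> UInt64 {
    var result: UInt64 = 0
    for inner in outer {
        var hasher = Hasher()
        hasher.combine(inner)
        result ^= UInt64(truncatingIfNeeded: hasher.finalize())
    }
    return result
}

let x: Set<Set<Int>> = [[1, 2], [3, 4]]
let y: Set<Set<Int>> = [[1, 3], [2, 4]]
let z: Set<Set<Int>> = [[1, 4], [2, 3]]

// The flattened variant reduces all three to h(1) ^ h(2) ^ h(3) ^ h(4).
print(flattenedHash(x) == flattenedHash(y), flattenedHash(y) == flattenedHash(z))
// The delineated variant mixes each inner set separately before merging.
print(delineatedHash(x), delineatedHash(y), delineatedHash(z))
```

Because XOR is associative and commutative, the flattened variant collides for every regrouping of the same four members, which is why such collisions can be generated systematically.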
The new _rawHashValue(seed:) requirement allows stdlib types to specialize their hashing when they’re hashed on their own (i.e., not as a component of some composite type).
This makes it possible to get rid of discriminator/terminator values and to eliminate most of Hasher’s resiliency overhead, leading to a measurable speedup, especially for tiny keys.
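The dispatch idea can be sketched with stand-in types. Everything below (`ToyHasher`, `ToyHashable`, `rawHash(seed:)`, `Pair`, `Tiny`) is hypothetical and only mirrors the shape of the requirement; it is not the stdlib's Hasher or its real hash function.

```swift
// Hypothetical stand-in for an incremental hasher; the mixing is not a real hash function.
struct ToyHasher {
    private(set) var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func combine(_ word: UInt64) {
        state = (state ^ word) &* 0x0000_0100_0000_01B3
    }
    func finalize() -> UInt64 { return state ^ (state >> 29) }
}

protocol ToyHashable {
    // General-purpose path, used when the value is a component of a larger key.
    func hash(into hasher: inout ToyHasher)
    // One-shot path, used when the value is hashed on its own.
    func rawHash(seed: UInt64) -> UInt64
}

extension ToyHashable {
    // Default: fall back to the general-purpose path.
    func rawHash(seed: UInt64) -> UInt64 {
        var hasher = ToyHasher(seed: seed)
        hash(into: &hasher)
        return hasher.finalize()
    }
}

// A composite type keeps the default one-shot implementation.
struct Pair: ToyHashable {
    var first: UInt64
    var second: UInt64
    func hash(into hasher: inout ToyHasher) {
        hasher.combine(first)
        hasher.combine(second)
    }
}

// A tiny key specializes the one-shot path and skips the incremental state.
struct Tiny: ToyHashable {
    var value: UInt64
    func hash(into hasher: inout ToyHasher) { hasher.combine(value) }
    func rawHash(seed: UInt64) -> UInt64 {
        let mixed = (seed ^ value) &* 0x0000_0100_0000_01B3
        return mixed ^ (mixed >> 29)
    }
}

print(Tiny(value: 7).rawHash(seed: 1), Pair(first: 1, second: 2).rawHash(seed: 1))
```

In this toy the specialized path happens to produce the same value as the default one; the sketch only shows where a type gets the chance to take a cheaper route, not the actual savings described above.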
Newly internal declarations include Hasher._seed and the integer overloads of Hasher._combine(_:), as well as _SipHash13 and _SipHash24.
Unify the interfaces of these SipHash testing structs with Hasher. Update SipHash test to cover Hasher, too.
Add @usableFromInline to all newly internal stuff. In addition to its normal use, it also enables white box testing; compile tests that need to use these declarations with -disable-access-control.
This prototype is not fully implemented, and it relies on specific hash values to not trigger unhandled cases.
To keep its test working, define and use a custom hashing interface that emulates hashValue behavior prior to SE-0206.
This is safe to do with hash(into:), because random hash collisions can be eliminated with awesome certainty by trying a number of different hash seeds. (Unless there is a weakness in SipHash.)
In some cases, we intentionally want hashing to produce looser equivalency classes than equality — to let those cases keep working, add an optional hashEqualityOracle parameter.
Review usages of checkHashable and add hash oracles as needed.
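A rough sketch of what an optional hash oracle buys us. `checkHashableSketch` and `Point` are hypothetical; the real checkHashable lives in StdlibUnittest and has a different signature. This version also cannot reseed Hasher, so it only checks the "must hash alike" direction.

```swift
// Hypothetical helper: returns human-readable failures instead of asserting.
func checkHashableSketch<T: Hashable>(
    _ instances: [T],
    equalityOracle: (Int, Int) -> Bool,
    hashEqualityOracle: ((Int, Int) -> Bool)? = nil
) -> [String] {
    var failures: [String] = []
    for i in instances.indices {
        for j in instances.indices {
            let lhs = instances[i], rhs = instances[j]
            if (lhs == rhs) != equalityOracle(i, j) {
                failures.append("== mismatch at (\(i), \(j))")
            }
            // Without a hash oracle, values are expected to hash alike when equal.
            let expectSameHash: Bool
            if let oracle = hashEqualityOracle {
                expectSameHash = oracle(i, j)
            } else {
                expectSameHash = equalityOracle(i, j)
            }
            if expectSameHash && lhs.hashValue != rhs.hashValue {
                failures.append("hash mismatch at (\(i), \(j))")
            }
        }
    }
    return failures
}

struct Point: Hashable {
    var x: Int
    var y: Int
    // Deliberately loose: only x feeds the hasher, so points sharing x hash alike.
    func hash(into hasher: inout Hasher) { hasher.combine(x) }
}

let points = [Point(x: 1, y: 1), Point(x: 1, y: 2), Point(x: 2, y: 1)]
let failures = checkHashableSketch(
    points,
    equalityOracle: { $0 == $1 },
    hashEqualityOracle: { points[$0].x == points[$1].x })
print(failures)
```

The hash oracle states the looser equivalence classes explicitly, so a test can insist that `Point(x: 1, y: 1)` and `Point(x: 1, y: 2)` hash the same even though they are not equal.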
StdlibUnittest uses gyb to avoid duplicating many source-context
arguments. However, this means that any test that wishes to add new
expect helpers has to also be gybbed. Given that this structure hasn't
changed in years, and we should have real language support
eventually, de-gyb it.
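For context, a hand-written expect helper can carry its source location through default arguments. This is only a sketch of the general pattern; `expectEqualSketch` and `recordedFailures` are hypothetical names and not StdlibUnittest's API.

```swift
// Failures are collected here so the example can inspect them afterwards.
var recordedFailures: [String] = []

// The #file and #line defaults are evaluated at the call site, so the caller's
// location is captured without generating one overload per argument combination.
func expectEqualSketch<T: Equatable>(
    _ expected: T, _ actual: T,
    file: String = #file, line: UInt = #line
) {
    if expected != actual {
        recordedFailures.append("\(file):\(line): expected \(expected), got \(actual)")
    }
}

expectEqualSketch(4, 2 + 2)   // passes, records nothing
expectEqualSketch("a", "b")   // records a failure tagged with this call's location
print(recordedFailures)
```

A test file that wants its own helper can follow the same shape and forward `file` and `line` to the underlying check, with no gyb step involved.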
Since this code was first written, we've added more language features that introduce the opportunity for a materializeForSet protocol witness to have an incompatible polymorphic convention with its concrete implementation:
- In a conditional conformance, if a witness comes from a constrained extension with additional protocol requirements, then the witness will require those conformances as additional polymorphic arguments, making its materializeForSet uncallable from code using the protocol witness.
- Given a subscript requirement, the witness may be a generic subscript with a more general signature than the requirement, making the generic arguments to the concrete materializeForSet callback incompatible with those expected for the witness.
Longer term, representing materializeForSet patterns using accessor coroutines should obviate the need for this hack. For now, it's necessary for correctness, addressing rdar://problem/35760754.
Switch StringObject and StringGuts from opaquely storing tagged cocoa
strings into storing small strings. Plumb small string support
throughout the standard library's routines.
Streamline internal String creation. Previously, everything funneled
into a single generic function; however, every single call of the
generic funnel had relevant specific information that could be used
for a more efficient algorithm.
In preparation for efficiently forming small strings, refactor this
logic into a handful of more specialized subroutines to preserve more
specific information from the callers.
This assumes these will land in Swift 4.1; the attributes need to be adjusted if that turns out not to be the case.
It seems @available for protocol conformances is not yet functional. I added attributes for those anyway, marked with FIXME(conformance-availability).
Now that Array and Dictionary conform to Hashable, we need to make sure that their bridged counterparts provide the same hash values when converted to AnyHashable.
* Add conditional Hashable conformance to Optional, Dictionary, Array, ArraySlice and ContiguousArray
* Modified hashValue implementations
The hashValues are now calculated similarly to the automatically synthesized values when conforming to Hashable.
This entails using _combineHashValues as values of the collections are iterated - as well as calling _mixInt before returning the hash.
* Added FIXMEs as suggested by Max Moiseev
* Use checkHashable to check Hashable conformance
* Use 2 space indentation
* Hashing of Dictionary is now independent of traversal order
* Added a test to prove failure of the (previous) wrong implementation of Dictionary hashValue. Unfortunately it does not work.
* Removed '_mixInt' from 'hashValue' implementation of Optional and Array types based on recommendations from lorentey
* Another attempt at detecting bad hashing due to traversal order
* Dictionary Hashable validation tests now detect bad hashing due to dependence on traversal order
* Removed superfluous initial _mixInt call for Dictionary hashValue implementation.
* Add more elements to dictionary in test to increase the number of possible permutations - making it more likely to detect order-dependent hashes
* Added Hashable conformance to CollectionOfOne, EmptyCollection and Range types
* Fix indirect referral to the only member of CollectionOfOne
* Re-added Hashable conformance to Range after merge from master
* Change hashValue based on comment from @lorentey
* Remove tests for conditional Hashable conformance for Range types. This is left for a followup PR
* Added tests for CollectionOfOne and EmptyCollection
* Added conditional conformance to Equatable and Hashable for DictionaryLiteral. Added tests too.
* Added conditional Equatable and Hashable conformance to Slice
* Use 'elementsEqual' for Slice equality operator
* Fixed documentation comment and indentation
* Fix DictionaryLiteral equality implementation
* Revert "Fix DictionaryLiteral equality implementation"
This reverts commit 7fc1510bc3.
* Fix DictionaryLiteral equality implementation
* Use elementsEqual(_:by:) to compare DictionaryLiteral elements
* Added conditional conformance for Equatable and Hashable to AnyCollection
* Revert "Use 'elementsEqual' for Slice equality operator"
This reverts commit 0ba2278b96.
* Revert "Added conditional Equatable and Hashable conformance to Slice"
This reverts commit 84f9934bb4.
* Added conditional conformance for Equatable and Hashable for ClosedRange
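A short usage sketch of what these conformances enable, assuming a toolchain that ships them. All variable names are made up for the example; the order-independence check relies only on the Hashable contract (equal values hash equally).

```swift
// Arrays, optionals and closed ranges of Hashable elements are Hashable themselves,
// so they can be stored in a Set, which drops the duplicates.
let rawPaths: [[Int]] = [[1, 2], [2, 1], [1, 2]]
let paths = Set(rawPaths)
let rawOptionals: [Int?] = [nil, 1, 1]
let optionals = Set(rawOptionals)
let rawRanges: [ClosedRange<Int>] = [1...3, 1...3, 2...4]
let ranges = Set(rawRanges)
print(paths.count, optionals.count, ranges.count)

// Build two equal dictionaries with different insertion orders.
var forward: [String: Int] = [:]
var backward: [String: Int] = [:]
let entries = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
for (key, value) in entries { forward[key] = value }
for (key, value) in entries.reversed() { backward[key] = value }

// Equal dictionaries must hash equally, whatever order the entries are traversed in.
print(forward == backward, forward.hashValue == backward.hashValue)

// A dictionary can therefore serve as a key in another dictionary.
let labels: [[String: Int]: String] = [forward: "letters"]
print(labels[backward] ?? "missing")
```

Note that `[1, 2]` and `[2, 1]` stay distinct in the Set, since Array equality (and therefore hashing) is order-sensitive, unlike Dictionary.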