This should enable in-place modification of the Values view, which should greatly improve the performance of code like this:
dictionary.values[index] = newValue
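For illustration, a minimal sketch of the pattern using only public Dictionary API. The `scores` dictionary and its contents are made up for this example:

```swift
var scores = ["alice": 1, "bob": 2]

// Look up the index once, then mutate the value through the Values view.
if let index = scores.index(forKey: "bob") {
    scores.values[index] = 20   // plain assignment through the view
    scores.values[index] += 1   // read-modify-write through the same subscript
}
```

The key is hashed only once, in index(forKey:); both mutations then go through the already-resolved index.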
Add __consuming and __owned to Set and Dictionary members where applicable.
Ignore compiler intrinsics for casting for now — their ARC behavior is covered by unit tests that need to be updated.
- Don’t expose the raw execution seed to _rawHashValue.
- Change the type of _rawHashValue’s seed from (UInt64, UInt64) to a single Int. Working with a pair of UInt64s is unwieldy, and overkill in practice. Int as a seed also integrates nicely with Int as a hash value.
- Remove _HasherCore._generateSeed(). Instead, simply call finalize() on a copy of the hasher to get a seed suitable for _rawHashValue.
- Update Set and Dictionary to store a single Int as the seed value.
Note that this doesn’t affect the core hasher, which still mixes in the actual 128-bit execution seed during its initialization. To reduce the potential for confusion, use the name “rawSeed” to refer to an actual 128-bit seed value.
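A rough sketch of the "finalize a copy" idea, with the public Hasher standing in for the internal hasher core (that substitution is an assumption; the internal types are not shown here). The combined value 42 is a made-up input:

```swift
var hasher = Hasher()
hasher.combine(42)   // hypothetical state fed in before a seed is needed

// finalize() consumes the hasher it is called on, so derive the seed from a
// copy and leave the original free to keep hashing.
let seedSource = hasher
let seed: Int = seedSource.finalize()

// The original hasher is still usable afterwards.
hasher.combine(7)
let hashValue = hasher.finalize()
```

The concrete values of `seed` and `hashValue` vary between process launches because Hasher is randomly seeded, so none are listed here.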
Exclusively store small strings in little-endian byte order. This
inserts byte swaps when accessing small strings on big-endian
platforms, but these are usually extremely cheap.
This approach means that the layout of the code points and count
in memory is the same on both big- and little-endian machines,
simplifying future development. Prior to this change, this code
was broken on big-endian machines because the memory layout was
different (the count ended up in the middle of the string).
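To illustrate why a fixed little-endian storage order gives the same memory layout everywhere, here is a simplified sketch. It packs up to eight bytes into a single UInt64, which is not the actual small-string representation; `packSmall` and `bytesInMemory` are hypothetical helpers:

```swift
/// Packs up to 8 UTF-8 bytes into a word, first byte in the lowest bits.
func packSmall(_ utf8: [UInt8]) -> UInt64 {
    precondition(utf8.count <= 8)
    var word: UInt64 = 0
    for (i, byte) in utf8.enumerated() {
        word |= UInt64(byte) << UInt64(i * 8)
    }
    return word
}

/// Returns the bytes as they would sit in memory when the word is stored in
/// little-endian order. `.littleEndian` is a no-op on little-endian hosts and
/// a byte swap on big-endian ones.
func bytesInMemory(_ logical: UInt64) -> [UInt8] {
    let stored = logical.littleEndian
    return withUnsafeBytes(of: stored) { Array($0) }
}

let layout = bytesInMemory(packSmall(Array("abc".utf8)))
```

On either kind of host, the first three bytes of `layout` are the code units of "abc" in order, followed by zero padding.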
- Have the hash buffer include a reference to the original hash storage instance, along with a copy of its _HashTable, so that its lifetime can be independent from the deferred bridging object.
- Convert _BridgingHashBuffer to a ManagedBuffer so that we can easily put reference-counted properties in its header.
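As a sketch of the general technique (not the actual _BridgingHashBuffer code), a ManagedBuffer subclass can keep a strong reference in its header, which ties the referenced object's lifetime to the buffer. All `Sketch*` names below are hypothetical:

```swift
// Stand-in for the native hash storage instance.
final class SketchStorage {
    let label: String
    init(label: String) { self.label = label }
}

// Header with a reference-counted property; ManagedBuffer destroys the
// header (releasing `owner`) when the buffer is deallocated.
struct SketchHeader {
    var owner: SketchStorage
    var bucketCount: Int
}

final class SketchBridgingBuffer: ManagedBuffer<SketchHeader, Int> {
    static func make(owner: SketchStorage, bucketCount: Int) -> SketchBridgingBuffer {
        let raw = create(minimumCapacity: bucketCount) { _ in
            SketchHeader(owner: owner, bucketCount: bucketCount)
        }
        // create(...) allocates an instance of the class it is called on.
        let buffer = unsafeDowncast(raw, to: SketchBridgingBuffer.self)
        // Int elements are trivial, so they need initialization but no deinit.
        buffer.withUnsafeMutablePointerToElements {
            $0.initialize(repeating: 0, count: bucketCount)
        }
        return buffer
    }
}

let bridged = SketchBridgingBuffer.make(owner: SketchStorage(label: "native"),
                                        bucketCount: 4)
```

Here the SketchStorage instance stays alive for as long as `bridged` does, independent of whoever created it.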
Signed/unsigned integer conversions check for unrepresentable values; here the failure case wasn’t recognized as impossible, so a trap got compiled into the Dictionary lookup path.
Note to self: next time just use bitwise operations.
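A small sketch of the difference, using made-up values. The checked initializer UInt(_:) traps on negative input, while the bit-pattern initializers and plain masking have no failure path:

```swift
let hashValue = -3
let bucketMask = 0b0111   // hypothetical mask: bucket count of 8, minus 1

// UInt(hashValue) would trap here, because -3 is not representable.
// Bit-pattern conversion reinterprets the same bits and cannot fail:
let bits = UInt(bitPattern: hashValue)
let bucket = Int(bitPattern: bits & UInt(bitPattern: bucketMask))

// Or skip the conversion entirely and mask the signed value directly:
let sameBucket = hashValue & bucketMask
```

Both expressions yield bucket 5 for this input (the low three bits of -3 in two's complement are 101).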
There is no need to fetch the value corresponding to the given key.
Also implement the same for Values.subscript, although the impact there is marginal.
- Use _HashTable to unify low-level hashing operations across Set and Dictionary.
- Store the capacity directly inside the storage header. This allows the maximum load factor to be controlled by non-inlinable code.
- Introduce a dedicated class for the empty singleton.
- Add _BridgingHashBuffer, a standalone flat hash buffer for use in deferred bridging. Use it to eliminate the need to support non-hashable storage/wrapper variants and to improve memory use in cases where Key or Value are verbatim bridged.
- Eliminate the “TypedNative*Storage” class and _NativeSet/_NativeDictionary’s support for non-hashable keys.
- Rename storage classes as follows:
    _RawNativeSetStorage ⟹ _RawSetStorage
    _RawNativeDictionaryStorage ⟹ _RawDictionaryStorage
    _TypedNativeSetStorage ⟹ (removed)
    _TypedNativeDictionaryStorage ⟹ (removed)
    _HashableTypedNativeSetStorage ⟹ _SetStorage
    _HashableTypedNativeDictionaryStorage ⟹ _DictionaryStorage
The new names make it obvious which ivar layout is in use.
This allows us to move the empty NSSet/NSDictionary overrides out of the root storage class; they don’t really belong there. More importantly, it makes empty sets/dictionaries super obvious to lldb and other runtime tools.
_HashTable implements low-level hash table operations that are common between Set and Dictionary. It is a non-generic struct, dealing with maintaining hash table metadata rather than any actual elements.
_HashTable is intended as a mostly transparent abstraction. Most operations are inlinable, except for the ones related to the maximum load factor.
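To make the "metadata only" idea concrete, here is a toy sketch of a non-generic table that tracks bucket occupancy and probing but never touches elements. It is not the real _HashTable API: the names, the Bool array (a real implementation would more likely use a bitmap), linear probing, and the 3/4 maximum load factor are all assumptions made for this example:

```swift
struct SketchHashTable {
    private(set) var occupied: [Bool]
    let bucketMask: Int

    init(scale: Int) {
        let bucketCount = 1 << scale          // always a power of two
        occupied = Array(repeating: false, count: bucketCount)
        bucketMask = bucketCount - 1
    }

    var bucketCount: Int { bucketMask + 1 }

    // Assumed maximum load factor of 3/4. Callers must keep the number of
    // occupied buckets at or below this, so probing always terminates.
    var capacity: Int { bucketCount * 3 / 4 }

    func idealBucket(forHashValue hashValue: Int) -> Int {
        hashValue & bucketMask
    }

    func nextBucket(after bucket: Int) -> Int {
        (bucket + 1) & bucketMask             // wrap around at the end
    }

    /// Linear probing: first unoccupied bucket at or after the ideal one.
    func firstHole(forHashValue hashValue: Int) -> Int {
        var bucket = idealBucket(forHashValue: hashValue)
        while occupied[bucket] { bucket = nextBucket(after: bucket) }
        return bucket
    }

    mutating func markOccupied(_ bucket: Int) { occupied[bucket] = true }
}

var table = SketchHashTable(scale: 3)
let first = table.firstHole(forHashValue: 10)    // 10 & 7 == 2
table.markOccupied(first)
let second = table.firstHole(forHashValue: 18)   // 18 & 7 == 2, taken, so probe on
```

A generic Set or Dictionary built on top of such a struct would store its elements in separate arrays indexed by the bucket numbers the table hands out.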
Bitsets implement sorted sets over nonnegative integers up to a predetermined maximum value.
These are intended to replace _UnsafeBitMap. The latter will be removed once its usages are eliminated.
- _UnsafeBitset.Word is the underlying abstraction, implementing a bitset using a single UInt value.
- _UnsafeBitset is a view over a contiguous range of words.
- _Bitset is a COW value type implementing the same construct.
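A simplified sketch of a word-based bitset along these lines. `SketchBitset` is a hypothetical type, not the stdlib implementation; it leans on Array for storage, which also provides copy-on-write behavior:

```swift
struct SketchBitset {
    private var words: [UInt]
    let capacity: Int   // members must lie in 0 ..< capacity

    init(capacity: Int) {
        precondition(capacity >= 0)
        self.capacity = capacity
        let wordCount = (capacity + UInt.bitWidth - 1) / UInt.bitWidth
        words = Array(repeating: 0, count: wordCount)
    }

    // Splits an element into its word index and a single-bit mask.
    private func split(_ element: Int) -> (word: Int, bit: UInt) {
        precondition(element >= 0 && element < capacity)
        return (element / UInt.bitWidth, (1 as UInt) << (element % UInt.bitWidth))
    }

    func contains(_ element: Int) -> Bool {
        let (word, bit) = split(element)
        return (words[word] & bit) != 0
    }

    @discardableResult
    mutating func insert(_ element: Int) -> Bool {
        let (word, bit) = split(element)
        let isNew = (words[word] & bit) == 0
        words[word] |= bit
        return isNew
    }

    /// Members in increasing order, regardless of insertion order.
    var elements: [Int] {
        var result: [Int] = []
        for (index, original) in words.enumerated() {
            var word = original
            while word != 0 {
                result.append(index * UInt.bitWidth + word.trailingZeroBitCount)
                word &= word - 1   // clear the lowest set bit
            }
        }
        return result
    }
}

var bitset = SketchBitset(capacity: 100)
bitset.insert(70)
bitset.insert(3)
bitset.insert(64)
```

Iterating `bitset.elements` visits 3, 64, 70 in that order, which is the "sorted set" property: ordering falls out of the bit positions rather than from any explicit sort.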