Use a temporary bitset to avoid hashing elements more than once, and to prevent rehashings during the creation of the result set.
This leads to a speedup of up to about 4x, depending on the number of elements removed.
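The two-pass shape described above can be sketched roughly as follows. This is only an illustration written against the public Set API: `removing(_:from:)` is a hypothetical helper, and a `[Bool]` array stands in for the temporary bitset. Unlike the change described here, the sketch cannot avoid hashing on insertion; it only shows how recording the outcome of each test lets the result be sized exactly, so it never has to grow.

```swift
// Illustrative sketch only, not the standard library's implementation.
// `removing(_:from:)` is a hypothetical name; [Bool] stands in for a bitset.
func removing<Element: Hashable>(
    _ unwanted: Set<Element>,
    from source: Set<Element>
) -> Set<Element> {
    let elements = Array(source)
    var keep = [Bool](repeating: true, count: elements.count)
    var keptCount = elements.count

    // Pass 1: one membership test per element; record the outcome.
    for (position, element) in elements.enumerated()
    where unwanted.contains(element) {
        keep[position] = false
        keptCount -= 1
    }

    // Pass 2: the final count is known, so the result is allocated once
    // with enough capacity and is never resized while it is filled.
    var result = Set<Element>(minimumCapacity: keptCount)
    for (position, element) in elements.enumerated() where keep[position] {
        result.insert(element)
    }
    return result
}
```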
The performance of operations merging two similar-sized sets or dictionaries is currently quadratic. Enabling per-instance seeds makes most of these linear, which is what most people expect.
This affects Set.union, Set.symmetricDifference and similar operations, including the for-loop based forms that people write manually.
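For reference, the forms in question look like the following. The set contents and sizes are arbitrary choices for illustration; the snippet only shows which call shapes are meant, not a benchmark.

```swift
// Two similar-sized sets; the ranges are arbitrary example data.
let a = Set(0..<1_000)
let b = Set(1_000..<2_000)

// Library operations that merge two sets.
let unioned = a.union(b)
let symmetric = a.symmetricDifference(b)

// The hand-written for-loop form of the same merge.
var merged = a
for element in b {
    merged.insert(element)
}
```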
The only remaining quadratic cases arise when merging two sets or dictionaries that are mutated copy-on-write copies of the same original instance. (We can’t change the hash seed without a full rehash,
As before, allow Cocoa indices to be used to access native sets/dictionaries, but apply an approximation of the mutation-count-based check that we perform for native indices.
- Ensure that native collections that were converted from Cocoa have an age generated from the address of the original Cocoa object.
- To get the age of a Cocoa index, regenerate one from the object embedded in it.
- Compare self.age to index.age and trap if it doesn’t match.
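The steps above might be modeled like this. Everything here is a toy: `makeAge(from:)`, `FakeCocoaSet`, `ToyCocoaIndex`, and `ToyNativeSet` are hypothetical names, and hashing the object's identity with `Hasher` is an assumed stand-in for whatever scrambling the real code applies to the address. The check returns a `Bool` so it can be exercised; the real code would trap instead.

```swift
// Toy model only; all names below are hypothetical.

// Stand-in for an NSSet/NSDictionary instance.
final class FakeCocoaSet {}

// Derive an age from the identity (address) of an object. The same live
// object always yields the same age within one process.
func makeAge(from object: AnyObject) -> Int32 {
    var hasher = Hasher()
    hasher.combine(ObjectIdentifier(object))
    return Int32(truncatingIfNeeded: hasher.finalize())
}

// A Cocoa index embeds the object it came from, so its age can be
// regenerated on demand rather than stored.
struct ToyCocoaIndex {
    let base: FakeCocoaSet
    let position: Int
    var age: Int32 { makeAge(from: base) }
}

// A native collection converted from Cocoa takes its age from the
// address of the original Cocoa object.
struct ToyNativeSet {
    let age: Int32

    init(convertedFrom cocoa: FakeCocoaSet) {
        age = makeAge(from: cocoa)
    }

    // The real check would trap on a mismatch.
    func accepts(_ index: ToyCocoaIndex) -> Bool {
        return age == index.age
    }
}
```

Because the age is a 32-bit value, an index from an unrelated object could in principle produce the same age, so the check is an approximation rather than a guarantee.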
The mutation count will allow us to recognize and trap on invalid indices more reliably. (However, it won’t be foolproof — some invalid indices may pass through the checks.)
- Change _scale to Int8 to make space for an extra Int32 without enlarging the storage header.
- Add an _age field inside the new gap.
- Initialize the age to a scrambled version of the storage address.
- Generate a new counter for every newly allocated storage instance, except for identical copy-on-write copies.
- Increment the mutation counter after every removal.
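A minimal sketch of the idea, assuming a much simpler storage layout than the real one: `ToyStorage` and `ToyIndex` are hypothetical types, the slots hold plain `Int?` values, and `Hasher` is used as an assumed stand-in for the address scrambling.

```swift
// Toy model only; not the standard library's storage class.
struct ToyIndex {
    let offset: Int
    let age: Int32
}

final class ToyStorage {
    var scale: Int8        // small enough to leave room for the age
    var age: Int32 = 0     // mutation counter
    var slots: [Int?]

    init(scale: Int8) {
        self.scale = scale
        self.slots = [Int?](repeating: nil, count: 1 << Int(scale))
        // Start the age at a scrambled form of this instance's identity.
        var hasher = Hasher()
        hasher.combine(ObjectIdentifier(self))
        self.age = Int32(truncatingIfNeeded: hasher.finalize())
    }

    func index(at offset: Int) -> ToyIndex {
        return ToyIndex(offset: offset, age: age)
    }

    func isValid(_ index: ToyIndex) -> Bool {
        return index.age == age
            && slots.indices.contains(index.offset)
            && slots[index.offset] != nil
    }

    @discardableResult
    func remove(at index: ToyIndex) -> Int? {
        precondition(isValid(index), "Attempting to use an invalid index")
        let old = slots[index.offset]
        slots[index.offset] = nil
        age &+= 1          // every removal invalidates outstanding indices
        return old
    }
}
```

In this model, an index obtained before a removal no longer passes `isValid` afterwards, even if the slot it points to is still occupied.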
- Remove the old Index typealias for _HashTable.Bucket
- Rename _HashTable.AgedIndex to _HashTable.Index
- Rename _HashTable.Bucket.bucket to _HashTable.Bucket.offset
- Rename/update _HashTable members to adapt to the new terminology
_HashTable implements low-level hash table operations that are common between Set and Dictionary. It is a non-generic struct that maintains hash table metadata rather than any actual elements.
_HashTable is intended as a mostly transparent abstraction. Most operations are inlinable, except for the ones related to the maximum load factor.
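To make "metadata rather than elements" concrete, here is a rough sketch of a non-generic table that only tracks which buckets are occupied. `ToyHashTable` and its members are hypothetical, and the occupancy bitmap and linear probing are assumptions made for the illustration, not a description of _HashTable's actual layout.

```swift
// Toy model only; names and layout are assumptions for illustration.
struct ToyHashTable {
    let scale: Int8
    private(set) var count = 0
    private var words: [UInt64]   // one occupancy bit per bucket

    init(scale: Int8) {
        precondition(scale >= 0 && scale < 31)
        self.scale = scale
        let buckets = 1 << Int(scale)
        words = [UInt64](repeating: 0, count: (buckets + 63) / 64)
    }

    var bucketCount: Int { return 1 << Int(scale) }
    var bucketMask: Int { return bucketCount - 1 }

    func isOccupied(_ bucket: Int) -> Bool {
        return words[bucket / 64] & (UInt64(1) << UInt64(bucket % 64)) != 0
    }

    // Claims a free bucket for a hash value using linear probing and
    // returns its offset. The caller stores the element itself elsewhere.
    mutating func insertNew(hashValue: Int) -> Int {
        precondition(count < bucketCount, "Table is full")
        var bucket = hashValue & bucketMask
        while isOccupied(bucket) {
            bucket = (bucket + 1) & bucketMask
        }
        words[bucket / 64] |= UInt64(1) << UInt64(bucket % 64)
        count += 1
        return bucket
    }
}
```

A generic Set or Dictionary wrapper would keep its elements in separate storage and use the returned bucket offsets to address them, which is why the table itself never needs to know the element type.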