Instead of representing every same-type constraint we see as a rewrite rule,
only record rewrite rules when we merge equivalence classes, and record
rules that map between the anchors of the equivalence classes. This
gives us fewer, smaller rewrite rules that (by construction) build correct
anchors.
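
A minimal sketch of the idea in Swift, with invented names (EquivalenceClass, RewriteSystem, merge) standing in for the actual GenericSignatureBuilder machinery:

    // Hypothetical sketch; not the actual GenericSignatureBuilder types.
    struct EquivalenceClass {
        var members: Set<String>   // dependent types, e.g. "T.Iterator.Element"
        var anchor: String         // the most canonical member
    }

    struct RewriteSystem {
        var rules: [(from: String, to: String)] = []

        mutating func merge(_ a: inout EquivalenceClass, absorbing b: EquivalenceClass) {
            // Keep whichever anchor is more canonical; "shorter spelling, then
            // lexicographically smaller" stands in for the real ordering here.
            let survivor = (a.anchor.count, a.anchor) <= (b.anchor.count, b.anchor)
                ? a.anchor : b.anchor
            let loser = survivor == a.anchor ? b.anchor : a.anchor

            // One rule per merge, mapping anchor to anchor, rather than one
            // rule per same-type constraint that triggered the merge.
            rules.append((from: loser, to: survivor))

            a.members.formUnion(b.members)
            a.anchor = survivor
        }
    }

    var rewrites = RewriteSystem()
    var tElement = EquivalenceClass(members: ["T.Element"], anchor: "T.Element")
    let tIterElement = EquivalenceClass(members: ["T.Iterator.Element"],
                                        anchor: "T.Iterator.Element")
    rewrites.merge(&tElement, absorbing: tIterElement)
    // rewrites.rules now holds a single rule: T.Iterator.Element -> T.Element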
Use term rewriting exclusively in the computation of the canonical
dependent type (anchor) for an equivalence class, eliminating any dependence
on the specific potential archetypes and eliminating the hacks around
generic type parameters.
Previously, the term rewriting logic for same-type constraints was only
able to express “relative” rewrites, where the base of the type being
rewritten (i.e., the generic type at the root) is unchanged. This meant
that we were unable to capture same-type constraints such as “T == U” or
“T.Foo == T” within the rewriting system.
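
For concreteness, here are those two constraint shapes written as ordinary Swift requirements (the protocol P and its Foo are made up for illustration):

    protocol P {
        associatedtype Foo
    }

    // "T.Foo == T": rewriting T.Foo must replace the entire term, including
    // the generic parameter at its root, with T, so the base of the rewrite
    // changes.
    func collapse<T: P>(_ value: T) where T.Foo == T {}

    // "T == U": the two root parameters themselves form one equivalence
    // class. (Recent compilers warn that this makes T and U equivalent.)
    func unify<T, U>(_ t: T, _ u: U) where T == U {}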
Separate out the notion of a rewrite path and extend it with an optional
new base. When we’re simplifying a term and we encounter one of these
replacements, the whole type from the base up through the last-matched
path component gets replaced with the replacement path.
Fully rewriting a given type will produce the canonical representation
of that type, because all rewrite rules take a step toward a more-canonical
type. Use this approach to compute the anchor of an equivalence class,
replacing the existing ad hoc approach of enumerating known potential
archetypes, which was overly dependent on having the “right” set of
potential archetypes already computed.
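
A sketch of how such rewriting could compute an anchor; DependentType, RewriteRule, and the helpers below are invented for illustration and are not the GSB's actual representation:

    // Hypothetical sketch; not the actual GSB implementation.
    struct DependentType {
        var base: String            // root generic parameter, e.g. "T"
        var path: [String]          // associated type names, e.g. ["Iterator", "Element"]
    }

    struct RewriteRule {
        var matchBase: String?      // if set, only applies to terms rooted at this parameter
        var matchPath: [String]     // prefix of associated type names to match
        var newBase: String?        // replacement base, when the root itself changes
        var replacementPath: [String]
    }

    func simplifyOnce(_ type: DependentType, using rules: [RewriteRule]) -> DependentType? {
        for rule in rules
        where (rule.matchBase == nil || rule.matchBase == type.base)
            && type.path.starts(with: rule.matchPath) {
            // Replace everything from the base through the last matched component.
            let suffix = type.path.dropFirst(rule.matchPath.count)
            return DependentType(base: rule.newBase ?? type.base,
                                 path: rule.replacementPath + suffix)
        }
        return nil
    }

    // Every rule steps toward a more canonical term, so iterating to a
    // fixed point yields the anchor of the type's equivalence class.
    func anchor(of type: DependentType, using rules: [RewriteRule]) -> DependentType {
        var current = type
        while let next = simplifyOnce(current, using: rules) {
            current = next
        }
        return current
    }

For example, the constraint "U == T" can be recorded as a rule with matchBase "U", an empty matchPath, newBase "T", and an empty replacementPath; applying it rewrites U.Foo to T.Foo.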
Introduce a new representation of same-type constraints as rewrite
rules in a term-rewriting system, where each same-type constraint maps
to a rewrite rule that produces a "more canonical"
term. Fully-simplifying a type according to these rewrite rules will
produce the canonical representation of that type, i.e., the "anchor"
of its equivalence class.
The rewrite rules are stored as a prefix tree, such that a "match"
operation walks the tree matching the prefix of a path of associated
type references, e.g., SubSequence -> SubSequence -> Element, and any
node within the tree can provide a replacement path (e.g.,
"Element"). The "dump" operation provides a visualization of the tree,
e.g.,
::
`--(cont'd)
`--Sequence.Iterator
| `--IteratorProtocol.Element --> [Sequence.Element]
`--Sequence.SubSequence --> []
| `--Sequence.Element --> [Sequence.Iterator -> IteratorProtocol.Element]
| | `--(cont'd) --> [Sequence.Element]
| `--Sequence.Iterator --> [Sequence.Iterator]
| | `--IteratorProtocol.Element --> [Sequence.Iterator -> IteratorProtocol.Element]
| `--Sequence.SubSequence --> []
| | `--Sequence.Element --> [Sequence.Element]
| | `--Sequence.Iterator --> [Sequence.Iterator]
| `--C.Index --> [C.Index]
| `--C.Indices --> [C.Indices]
| `--Sequence.Element --> [C.Indices -> Sequence.Element]
| `--Sequence.SubSequence --> [C.Indices -> Sequence.SubSequence]
| `--C.Index --> [C.Indices -> C.Index]
| `--C.Indices --> [C.Indices -> C.Indices]
`--C.Index --> [Sequence.Element]
`--C.Indices
`--Sequence.Element --> [C.Index]
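
A toy version of such a prefix tree in Swift; RewriteTreeNode and its methods are invented for illustration and are not the real data structure:

    // Hypothetical sketch of the rewrite prefix tree; not the real GSB structure.
    final class RewriteTreeNode {
        var children: [String: RewriteTreeNode] = [:]   // keyed by associated type reference
        var replacement: [String]?                      // replacement path, if this node has one

        // Insert a rule: the given path rewrites to the given replacement.
        func insert(path: [String], replacement: [String]) {
            guard let first = path.first else {
                self.replacement = replacement
                return
            }
            let child = children[first] ?? RewriteTreeNode()
            children[first] = child
            child.insert(path: Array(path.dropFirst()), replacement: replacement)
        }

        // Match the longest prefix of `path` ending at a node with a replacement;
        // returns that replacement and how many components were consumed.
        func bestMatch(for path: [String]) -> (replacement: [String], matchedLength: Int)? {
            var best: (replacement: [String], matchedLength: Int)?
            if let repl = replacement {
                best = (repl, 0)
            }
            var node = self
            for (index, component) in path.enumerated() {
                guard let next = node.children[component] else { break }
                node = next
                if let repl = node.replacement {
                    best = (repl, index + 1)
                }
            }
            return best
        }
    }

In the dump above, an entry such as Sequence.Element --> [Sequence.Iterator -> IteratorProtocol.Element] nested under Sequence.SubSequence corresponds to one inserted rule: the matched path SubSequence.Element is replaced by Iterator.Element.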
Thus far, there are no clients for this information: this commit
starts building these rewriting trees and can visualize them for
debugging purposes.
If we used the requirement signature to create a protocol requirement
element in a requirement source, there's no need to verify that it's
from the requirement signature (duh).
Add a verification pass to ensure that all of the generic signatures in
a module are both minimal and canonical. The approach taken is quite
direct: for every (canonical) generic signature in the module, try
removing a single requirement and forming a new generic signature from
the result. If that new generic signature still implies the removed
requirement, then the original signature was not minimal.
Also canonicalize each resulting signature, to ensure that it meets the
requirements for a canonical signature.
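
A sketch of that check over hypothetical stand-ins (Requirement, GenericSignature, build, and implies are invented here; the real pass works on the compiler's own types):

    // Hypothetical sketch of the minimality check.
    struct Requirement: Equatable { var description: String }

    struct GenericSignature {
        var requirements: [Requirement]

        // Stand-in for rebuilding a signature from a requirement list.
        static func build(from requirements: [Requirement]) -> GenericSignature {
            GenericSignature(requirements: requirements)
        }

        // Stand-in for "does this signature already imply the given requirement?"
        func implies(_ requirement: Requirement) -> Bool {
            requirements.contains(requirement)   // placeholder; the real check derives requirements
        }
    }

    // A signature is minimal if no requirement is implied by the others:
    // drop each requirement in turn, rebuild, and see whether it comes back.
    func isMinimal(_ signature: GenericSignature) -> Bool {
        for (index, requirement) in signature.requirements.enumerated() {
            var remaining = signature.requirements
            remaining.remove(at: index)
            let rebuilt = GenericSignature.build(from: remaining)
            if rebuilt.implies(requirement) {
                return false   // the removed requirement was redundant
            }
        }
        return true
    }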
Add a test to ensure that all of the generic signatures in the Swift
module are minimal and canonical, since they are ABI.
Replace the pair of PotentialArchetype's getArchetypeAnchor() and
getNestedArchetypeAnchor() with a straightforward, more-efficient
computation based on equivalence classes. This reduces the number of
times we query the archetype anchor cache by 88% when building the
standard library, as well as eliminating some
PotentialArchetype-specific APIs.
Equivalence classes stored their same-type constraints in a MapVector
keyed on the source potential archetype, which allowed traversal along the
paths of the graph. However, this capability isn't required, because we
end up walking all of the edges each time. Flatten the list of same-type
constraints to a single vector, eliminating constraint duplication between
the source and the target out-edge lists.
This cuts down on the number of same-type constraints we record by 50%,
but performance gains are limited (6% of stdlib type-checking time)
because most of the time these same-type constraints were skipped
anyway.
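
Roughly the shape of the change, in illustrative Swift (Constraint and the two containers below are hypothetical, standing in for the MapVector and the flattened vector):

    struct Constraint { var source: String; var target: String }

    // Before: same-type constraints keyed by the source potential archetype,
    // so the edge "T.Element == T.Iterator.Element" shows up twice, once in
    // each endpoint's out-edge list.
    var constraintsBySource: [String: [Constraint]] = [
        "T.Element": [Constraint(source: "T.Element", target: "T.Iterator.Element")],
        "T.Iterator.Element": [Constraint(source: "T.Iterator.Element", target: "T.Element")],
    ]

    // After: a single flat list that records each same-type constraint once.
    var sameTypeConstraints: [Constraint] = [
        Constraint(source: "T.Element", target: "T.Iterator.Element"),
    ]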
Now that getBuilder() is gone, stash an ASTContext into root
PotentialArchetypes instead of a GenericSignatureBuilder. This eliminates
some ickiness where we had to re-root potential archetypes when moving a
GenericSignatureBuilder.
Refactor the interfaces that involve “walking” a requirement source
(e.g., minimization, computation of the affected type, etc.) to rely
only on interface types and not potential archetypes.
We want to delay the realization of potential archetypes as late as
possible; for layout constraints, push realization of potential archetypes
down past the point where we know whether we’re learning anything new
about the equivalence class.
No functionality change, yet.
Any potential archetype with the lowest depth in the equivalence class
will do when we're resolving types to their equivalence classes, so keep
the lowest-depth potential archetype handy.
Teach GenericSignatureBuilder::resolveEquivalenceClass() to perform
resolution of a type into a dependent type and its equivalence class
without realizing the corresponding potential archetype. Improves
type-checking
time for the standard library by ~6%.
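
One way to picture the new lookup, again with invented types; the real resolveEquivalenceClass() works on the builder's internal structures rather than strings:

    // Hypothetical sketch; not the actual GenericSignatureBuilder API.
    final class EquivalenceClass {
        var members: [String]
        var anchor: String
        init(anchor: String) {
            self.anchor = anchor
            self.members = [anchor]
        }
    }

    final class Builder {
        // Equivalence classes keyed by the canonical spelling of their member
        // types, so resolution never needs to materialize a potential archetype.
        private var classByCanonicalType: [String: EquivalenceClass] = [:]

        func resolveEquivalenceClass(forDependentType type: String) -> EquivalenceClass? {
            // Canonicalize the spelling first (e.g. via the rewrite system),
            // then look the class up directly.
            return classByCanonicalType[canonicalize(type)]
        }

        private func canonicalize(_ type: String) -> String {
            // Placeholder for term-rewriting-based canonicalization.
            return type
        }
    }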
Now that the GenericSignatureBuilder is no longer sensitive to the input
module, stop uniquing the canonical GSBs based on that module. The main
win here is when deserializing a generic environment: we would end up
creating a canonical GSB in the module we deserialized and another
canonical GSB in the module in which it is used.
Implement a module-agnostic conformance lookup operation within the GSB
itself, so it does not need to be supplied by the code constructing the
generic signature builder. This makes the generic signature builder
(closer to) being module-agnostic.
Use the "override" information in associated type declarations to provide
AST-level access to the associated type "anchor", i.e., the canonical
associated type that will be used in generic signatures, mangling,
etc.
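
As a concrete illustration (Base and Derived are made up; the standard library's own protocol hierarchies behave analogously):

    protocol Base {
        associatedtype Item
    }

    protocol Derived: Base {
        // Restates the inherited associated type to add a constraint; the
        // restatement "overrides" Base.Item, and the root of that override
        // chain (Base.Item here) serves as the anchor used in generic
        // signatures and mangling.
        associatedtype Item: Equatable
    }

    // T.Item resolves through Derived's restatement, but canonically it
    // refers to the anchor associated type.
    func firstItem<T: Derived>(in items: [T.Item]) -> T.Item? {
        return items.first
    }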
In the Generic Signature Builder, only build potential archetypes for
associated types that are anchors, which reduces the number of
potential archetypes we build when type-checking the standard library
by 14% and type-checking time for the standard library by 16%.
There's a minor regression here in some generic signatures that were
accidentally getting (correct) same-type constraints. There were
existing bugs in this area already (Huon found some of them), which
will be addressed as a follow-up.
Fixes SR-5726, where we were failing to type-check due to missed
associated type constraints.