Instead of representing every same-type constraint we see as a rewrite rule,
only record rewrite rules when we merge equivalence classes, and record
rules that map between the anchors of the equivalence classes. This
gives us fewer, smaller rewrite rules that (by construction) build correct
anchors.
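
As a rough illustration of recording rules only on merges, here is a minimal sketch. The names (`EquivalenceClasses`, `addSameType`) are hypothetical, plain strings stand in for dependent types, and the "shorter, then lexicographically smaller" anchor ordering is an assumption made for the example, not the builder's actual ordering.

```swift
// Hypothetical model: every known type maps to the anchor of its class.
struct EquivalenceClasses {
    var anchorOf: [String: String] = [:]
    // Rewrite rules recorded so far, always from one anchor to another.
    var rules: [(from: String, to: String)] = []

    // Assumed ordering: shorter is more canonical, ties broken by name.
    static func isMoreCanonical(_ a: String, than b: String) -> Bool {
        return a.count != b.count ? a.count < b.count : a < b
    }

    mutating func anchor(for type: String) -> String {
        if let anchor = anchorOf[type] { return anchor }
        anchorOf[type] = type
        return type
    }

    // Handle a same-type constraint; only a real merge records a rule.
    mutating func addSameType(_ lhs: String, _ rhs: String) {
        let a = anchor(for: lhs), b = anchor(for: rhs)
        if a == b { return }  // already equivalent: nothing to record
        let (winner, loser) = Self.isMoreCanonical(a, than: b) ? (a, b) : (b, a)
        rules.append((from: loser, to: winner))
        // Re-point every member of the losing class at the winning anchor.
        for (type, anchor) in anchorOf where anchor == loser {
            anchorOf[type] = winner
        }
    }
}
```

In this model, a constraint between two types that already share an anchor adds no rule, which is where the "fewer, smaller" rule set comes from.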
Although we were properly recording rewrite rules like tau_0_1 -> tau_0_0
in the rewrite tree, we were failing to apply them, so tau_0_1 wouldn’t
get properly canonicalized. Fix this, and add some assertions to make sure
we catch this with our current test suite.
Fixes rdar://problem/37469390.
The GenericSignatureBuilder is allowing unresolved dependent member
types to creep into generic signatures, which eventually blows up in
name mangling. Prefer to pick dependent member types that are
fully-resolved when choosing anchors.
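
A minimal sketch of such a preference, assuming a simplified model in which each member of a dependent type carries an `isResolved` flag. `Member`, `Candidate`, `isBetterAnchor`, and the tie-breaking order are hypothetical and only illustrate "resolved beats unresolved".

```swift
// Hypothetical stand-in for one component of a dependent member type.
struct Member {
    var name: String
    var isResolved: Bool  // true once bound to a specific associated type
}

// A candidate anchor: a generic parameter followed by nested members.
struct Candidate {
    var root: String
    var path: [Member]
    var isFullyResolved: Bool { path.allSatisfy { $0.isResolved } }
}

// Assumed ordering: fully-resolved first, then shorter paths, then names.
func isBetterAnchor(_ a: Candidate, than b: Candidate) -> Bool {
    if a.isFullyResolved != b.isFullyResolved { return a.isFullyResolved }
    if a.path.count != b.path.count { return a.path.count < b.path.count }
    let namesA = a.path.map { $0.name }
    let namesB = b.path.map { $0.name }
    return namesA.lexicographicallyPrecedes(namesB)
}

func chooseAnchor(_ candidates: [Candidate]) -> Candidate? {
    return candidates.min(by: { isBetterAnchor($0, than: $1) })
}
```

With this ordering, a longer but fully-resolved candidate wins over a shorter one that still contains an unresolved member.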
This is a spot fix; a better approach would eliminate the notion of
unresolved dependent member types entirely from
PotentialArchetype. That's tracked by rdar://problem/35839277.
Fixes rdar://problem/36549499.
Use term rewriting exclusively in the computation of the canonical
dependent type (anchor) for an equivalence class, eliminating any dependence
on the specific potential archetypes and eliminating the hacks around
generic type parameters.
Previously, the term rewriting logic for same-type constraints was only
able to express “relative” rewrites, where the base of the type being
rewritten (i.e., the generic type at the root) is unchanged. This meant
that we were unable to capture same-type constraints such as “T == U” or
“T.Foo == T” within the rewriting system.
Separate out the notion of a rewrite path and extend it with an optional
new base. When we’re simplifying a term and we encounter one of these
replacements, the whole type from the base up through the last-matched
path component gets replaced with the replacement path.
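
The replacement step can be sketched as follows. This is a simplified model, not the builder's data structures: `DependentType`, `RewritePath`, `Rule`, and `apply` are hypothetical names, and generic parameters are represented by strings.

```swift
// Hypothetical model of a dependent type: a generic parameter plus a path
// of associated type names, e.g. T.Foo.Bar is (base: "T", path: ["Foo", "Bar"]).
struct DependentType: Equatable {
    var base: String
    var path: [String]
}

// A replacement path; `newBase` is nil for a "relative" rewrite.
struct RewritePath {
    var newBase: String?
    var path: [String]
}

// A rule that matches `prefix` underneath `base`.
struct Rule {
    var base: String
    var prefix: [String]
    var replacement: RewritePath
}

// Apply the rule once if its prefix matches; returns nil when it does not.
func apply(_ rule: Rule, to type: DependentType) -> DependentType? {
    guard type.base == rule.base, type.path.starts(with: rule.prefix) else {
        return nil
    }
    let rest = Array(type.path.dropFirst(rule.prefix.count))
    // Everything from the base through the last-matched component is replaced.
    return DependentType(base: rule.replacement.newBase ?? type.base,
                         path: rule.replacement.path + rest)
}
```

For example, a rule with base `U`, an empty prefix, and a replacement whose new base is `T` models one direction of “T == U”: applying it to `U.Foo` yields `T.Foo`.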
Fully rewriting a given type will produce the canonical representation of
that type, because all rewrite rules take a step toward a more-canonical
type. Use this approach to compute the anchor of an equivalence class,
replacing the existing ad hoc approach of enumerating known potential
archetypes, which was overly dependent on having the “right” set of
potential archetypes already computed.
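
A minimal sketch of "rewrite until nothing applies", using paths of associated type names. `PathRule`, `simplifyOnce`, and `anchor(of:using:)` are hypothetical; the sketch assumes every rule has a non-empty left-hand side and a strictly more-canonical right-hand side, which is what makes the loop terminate.

```swift
// Hypothetical rule: the sub-path `lhs` rewrites to the more-canonical `rhs`.
typealias PathRule = (lhs: [String], rhs: [String])

// Rewrite the first occurrence of any rule's left-hand side, or return nil
// when no rule applies (the path is fully simplified).
func simplifyOnce(_ path: [String], using rules: [PathRule]) -> [String]? {
    for start in path.indices {
        for rule in rules where path[start...].starts(with: rule.lhs) {
            let end = start + rule.lhs.count
            return Array(path[..<start]) + rule.rhs + Array(path[end...])
        }
    }
    return nil
}

// Keep rewriting until a fixed point is reached; the result is the anchor.
func anchor(of path: [String], using rules: [PathRule]) -> [String] {
    var current = path
    while let next = simplifyOnce(current, using: rules) {
        current = next
    }
    return current
}
```

Because the anchor is computed purely from the rules, it does not depend on which potential archetypes happen to have been realized.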
Introduce a new representation of same-type constraints as rewrite
rules in a term-rewriting system, where each same-type constraint maps
to a rewrite rule that produces a "more canonical"
term. Fully-simplifying a type according to these rewrite rules will
produce the canonical representation of that type, i.e., the "anchor"
of its equivalence class.
The rewrite rules are stored as a prefix tree, such that a "match"
operation walks the tree matching the prefix of a path of associated
type references, e.g., SubSequence -> SubSequence -> Element, and any
node within the tree can provide a replacement path (e.g.,
"Element"). The "dump" operation provides a visualization of the tree,
e.g.,
::
`--(cont'd)
`--Sequence.Iterator
| `--IteratorProtocol.Element --> [Sequence.Element]
`--Sequence.SubSequence --> []
| `--Sequence.Element --> [Sequence.Iterator -> IteratorProtocol.Element]
| | `--(cont'd) --> [Sequence.Element]
| `--Sequence.Iterator --> [Sequence.Iterator]
| | `--IteratorProtocol.Element --> [Sequence.Iterator -> IteratorProtocol.Element]
| `--Sequence.SubSequence --> []
| | `--Sequence.Element --> [Sequence.Element]
| | `--Sequence.Iterator --> [Sequence.Iterator]
| `--C.Index --> [C.Index]
| `--C.Indices --> [C.Indices]
| `--Sequence.Element --> [C.Indices -> Sequence.Element]
| `--Sequence.SubSequence --> [C.Indices -> Sequence.SubSequence]
| `--C.Index --> [C.Indices -> C.Index]
| `--C.Indices --> [C.Indices -> C.Indices]
`--C.Index --> [Sequence.Element]
`--C.Indices
`--Sequence.Element --> [C.Index]
Thus far, there are no clients for this information: this commit
starts building these rewriting trees and can visualize them for debug
information.
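
For orientation, a prefix tree with a "match" operation might look like the sketch below. It is not the builder's implementation: `RewriteTreeNode`, `add`, and `match` are hypothetical names, associated types are reduced to bare names, and returning the longest matching prefix is an assumption.

```swift
// Hypothetical prefix tree keyed by associated type names.
final class RewriteTreeNode {
    var children: [String: RewriteTreeNode] = [:]
    var replacement: [String]?  // replacement path for the prefix ending here

    // Record a rule: the path `lhs` rewrites to `rhs`.
    func add(_ lhs: [String], rewritesTo rhs: [String]) {
        var node = self
        for name in lhs {
            if node.children[name] == nil {
                node.children[name] = RewriteTreeNode()
            }
            node = node.children[name]!
        }
        node.replacement = rhs
    }

    // Walk the tree along `path`; return the longest matched prefix that
    // has a replacement, together with that replacement.
    func match(_ path: [String]) -> (length: Int, replacement: [String])? {
        var node = self
        var best: (length: Int, replacement: [String])? = nil
        for (index, name) in path.enumerated() {
            guard let next = node.children[name] else { break }
            node = next
            if let replacement = node.replacement {
                best = (length: index + 1, replacement: replacement)
            }
        }
        return best
    }
}
```

A caller would replace the first `length` components of the path with `replacement` and keep the remaining components unchanged.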
A type Foo<...>.Bar may only exist conditionally (i.e., subject to constraints
on the generic parameters of Foo), in which case those conditional requirements
should be implied when Foo<...>.Bar is mentioned.
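
A small Swift example of the situation described, with illustrative names (`Foo`, `Bar`, `sidesMatch`). The expectation, assuming requirement inference behaves as this commit describes, is that the function needs no explicit `T: Equatable`.

```swift
struct Foo<T> {}

extension Foo where T: Equatable {
    // Bar only exists when T is Equatable.
    struct Bar {
        var lhs: T
        var rhs: T
    }
}

// No explicit `T: Equatable` here: mentioning Foo<T>.Bar is expected to
// imply it, which is what allows the body to use ==.
func sidesMatch<T>(_ bar: Foo<T>.Bar) -> Bool {
    return bar.lhs == bar.rhs
}
```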
Fixes SR-6850.
Rather than crashing when a generic signature is found to be non-minimal,
report the non-minimal requirement via the normal diagnostics machinery so
we can properly test for it.
Fixes rdar://problem/36912347 by letting us track which cases are
non-minimal in the standard library explicitly, so we can better
decide whether it's worth implementing a complete solution.
Queries against the generic signature might use types that are
ill-formed for that generic signature, e.g., because they refer to
associated types of unrelated protocols. Detect such conditions to
maintain GSB invariants that all potential archetypes in a well-formed
generic signature are well-formed.
Fixes the crash in SR-6797 / rdar://problem/36673825.
There is nothing specifically wrong with uttering a same-type
constraint in a where clause where both sides are concrete
types. Downgrade this to a warning; we'll check that the concrete
types match (of course), and such a well-formed constraint will simply
be canonicalized away.
This aids the migration of IndexDistance from an associated type to
Int.
Within the where clause of (e.g.) an extension, unqualified name lookup is
permitted to find associated types. Extend this to also include finding
typealiases declared within the protocol itself.
This is in service of the IndexDistance change (SE-0191), and
fixes rdar://problem/35490504.
Add a verification pass to ensure that all of the generic signatures in
a module are both minimal and canonical. The approach taken is quite
direct: for every (canonical) generic signature in the module, try
removing a single requirement and forming a new generic signature from
the result. If that new generic signature still provides the removed
requirement, then the original signature was not minimal.
Also canonicalize each resulting signature, to ensure that it meets the
requirements for a canonical signature.
Add a test to ensure that all of the generic signatures in the Swift
module are minimal and canonical, since they are ABI.
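
The "remove one requirement and see whether it is still provided" check can be sketched on a toy model. Everything here is an assumption for illustration: requirements are only conformances, the `inherited` table is made up, and `provides` stands in for building a new generic signature and querying it.

```swift
// Toy model: a requirement says `subject` conforms to `proto`.
struct Requirement: Hashable {
    var subject: String
    var proto: String
}

// Assumed protocol inheritance table, for illustration only.
let inherited: [String: [String]] = [
    "Collection": ["Sequence"],
    "BidirectionalCollection": ["Collection"],
]

// Does `proto` equal or (transitively) inherit from `target`?
func implies(_ proto: String, _ target: String) -> Bool {
    if proto == target { return true }
    return (inherited[proto] ?? []).contains(where: { implies($0, target) })
}

// Do the remaining requirements still provide `requirement`?
func provides(_ requirements: [Requirement], _ requirement: Requirement) -> Bool {
    return requirements.contains(where: {
        $0.subject == requirement.subject && implies($0.proto, requirement.proto)
    })
}

// Remove each requirement in turn; if the rest still provide it, the
// original signature was not minimal.
func redundantRequirements(in signature: [Requirement]) -> [Requirement] {
    var redundant: [Requirement] = []
    for index in signature.indices {
        var rest = signature
        let removed = rest.remove(at: index)
        if provides(rest, removed) { redundant.append(removed) }
    }
    return redundant
}
```

In this model, a signature containing both `T: Collection` and `T: Sequence` reports `T: Sequence` as redundant.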
Previously, we were inferring requirements from types within the definitions
of protocols, e.g., given something like:
    protocol P {
      associatedtype A: Collection
      associatedtype B where A.Element == Set<B>
    }
we would infer that B: Hashable. The code for doing this was actually
incorrect due to its mis-use of requirement sources, causing a few
crashers. Plus, it's not a good idea in general because it hides the
actual requirements on B. Stop doing this.
Also stop trying to infer requirements from conditional
requirements---those have already been canonicalized and minimized, so
there's nothing to infer from.
Fixes a former crasher that included well-formed code that was rejected
by my previous refactoring. Said crasher now passes, and IRGens properly
as well. Also, account for three more fixed crashers.
Replace the pair of PotentialArchetype's getArchetypeAnchor() and
getNestedArchetypeAnchor() with a straightforward, more-efficient
computation based on equivalence classes. This reduces the number of
times we query the archetype anchor cache by 88% when building the
standard library, as well as eliminating some
PotentialArchetype-specific APIs.
The first step in enumerating the minimal, canonical set of requirements for
a generic signature is identifying which "subject" types will show up in
the left-hand side of the requirements. Previously, this would require us
to realize all of the potential archetypes, and perform a number of
archetype-anchor computations and comparisons.
Replace that with a simpler walk over the equivalence classes,
identifying the anchor types within each derived same-type component
of those equivalence classes, which form the subject types. This is
more straightforward, doesn't rely on potential archetypes, simplifies
the code, and eliminates a silly O(n^2)-for-small-n that's been
bothering me for a while.
Associated type redeclarations occasionally occur to push around
associated type witness inference. Suppress the warning about redeclarations
that add no requirements (i.e., have neither an inheritance nor a
where clause).
Switch the mapping from types to components used by the same-type
connected-components computation to be indexed by CanType instead,
eliminating a few more places where we force realization of a potential
archetype.
Equivalence classes stored their same-type constraints in a MapVector
keyed on the source potential archetype, which allowed traversal along the
paths of the graph. However, this capability isn't required, because we
end up walking all of the edges each time. Flatten the list of same-type
constraints to a single vector, eliminating constraint duplication between
the source and the target out-edge lists.
This cuts down on the number of same-type constraints we record by 50%,
but performance gains are limited (6% of stdlib type-checking time)
because most of the time these same-type constraints were skipped
anyway.
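
To make the 50% figure concrete, here is a small sketch comparing the two storage schemes. `SameTypeConstraint` and `recordedCounts` are hypothetical, and a Swift dictionary stands in for the MapVector.

```swift
// Hypothetical same-type constraint between two types, named by strings.
struct SameTypeConstraint {
    var source: String
    var target: String
}

// Count how many constraint records each storage scheme ends up holding.
func recordedCounts(_ constraints: [SameTypeConstraint]) -> (perNode: Int, flat: Int) {
    // Per-node out-edge lists: each constraint is recorded from both endpoints.
    var outEdges: [String: [SameTypeConstraint]] = [:]
    for c in constraints {
        outEdges[c.source, default: []].append(c)
        outEdges[c.target, default: []].append(
            SameTypeConstraint(source: c.target, target: c.source))
    }
    let perNode = outEdges.values.reduce(0) { $0 + $1.count }

    // Flat vector: each constraint is recorded exactly once.
    let flat = constraints.count
    return (perNode: perNode, flat: flat)
}
```

The flat form gives up cheap "edges out of this node" lookups, which is acceptable when every traversal visits all edges anyway.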
When computing connected components for the graph of derived same-type
constraints within an equivalence class, replace the depth-first search with
union-find. The latter does not require us to have an out-edges list
for each node in the graph.
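
A minimal union-find sketch showing why no out-edge lists are needed: the components fall out of a single pass over a flat edge list. `UnionFind` and `componentCount` are hypothetical names, and nodes are assumed to be numbered from 0.

```swift
// Minimal union-find over node indices, with path compression.
struct UnionFind {
    private var parent: [Int]

    init(count: Int) { parent = Array(0..<count) }

    mutating func find(_ node: Int) -> Int {
        var root = node
        while parent[root] != root { root = parent[root] }
        // Path compression: point every node on the walk directly at the root.
        var current = node
        while parent[current] != root {
            let next = parent[current]
            parent[current] = root
            current = next
        }
        return root
    }

    mutating func union(_ a: Int, _ b: Int) {
        let rootA = find(a), rootB = find(b)
        if rootA != rootB { parent[rootB] = rootA }
    }
}

// Count connected components given only a flat list of edges.
func componentCount(nodes: Int, edges: [(Int, Int)]) -> Int {
    var sets = UnionFind(count: nodes)
    for (a, b) in edges { sets.union(a, b) }
    var roots = Set<Int>()
    for node in 0..<nodes { roots.insert(sets.find(node)) }
    return roots.count
}
```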
Now that getBuilder() is gone, stash an ASTContext into root
PotentialArchetypes instead of a GenericSignatureBuilder. This eliminates
some ickiness where we had to re-root potential archetypes when moving a
GenericSignatureBuilder.