Suppose we have these rules:
(1) [P].[P] => [P]
(2) [P].[P:X] => [P:X]
(3) [P].[P:Y] => [P:Y]
(4) [P:X].[concrete: G<τ_0_0> with <[P:Y]>] => [P:X]
Rules (2) and (4) overlap on the following term, which has had the concrete
type adjustment applied:
[P].[P:X].[concrete: G<τ_0_0> with <[P].[P:Y]>]
The critical pair, obtained by applying rule (2) to both sides of the
branching, is
[P:X].[concrete: G<τ_0_0> with <[P].[P:Y]>] => [P:X]
Note that this is a rule distinct from (4); this new rule again overlaps
with (2), producing yet another rule:
[P:X].[concrete: G<τ_0_0> with <[P].[P].[P:Y]>] => [P:X]
This process doesn't terminate. The root cause of this problem is the
existence of rule (1), which appears when there are multiple non-trivial
conformance paths witnessing the conformance Self : P. This occurs when a
same-type requirement is defined between Self and some other type conforming
to P.
To fix this, we simplify concrete substitutions when adding a new rule
during completion. Now that rewrite paths can represent this form of
simplification, it is easy to add.
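A minimal self-contained sketch of this simplification, using a toy model
of terms (vectors of symbol strings) rather than the real MutableTerm and
Symbol types; reduce() here is a naive stand-in for the real simplify():

  #include <algorithm>
  #include <string>
  #include <vector>

  using Symbol = std::string;        // e.g. "[P]", "[P:Y]"
  using Term = std::vector<Symbol>;  // e.g. {"[P]", "[P:Y]"}
  struct Rule { Term lhs, rhs; };

  // Rewrite `term` to a normal form by repeatedly replacing occurrences
  // of any rule's left hand side with its right hand side.
  bool reduce(Term &term, const std::vector<Rule> &rules) {
    bool changed = false, again = true;
    while (again) {
      again = false;
      for (const auto &rule : rules) {
        for (size_t i = 0; i + rule.lhs.size() <= term.size(); ++i) {
          if (std::equal(rule.lhs.begin(), rule.lhs.end(),
                         term.begin() + i)) {
            term.erase(term.begin() + i,
                       term.begin() + i + rule.lhs.size());
            term.insert(term.begin() + i,
                        rule.rhs.begin(), rule.rhs.end());
            changed = again = true;
            break;
          }
        }
        if (again) break;
      }
    }
    return changed;
  }

  // The fix: reduce each substitution term of a trailing concrete-type
  // symbol before adding a rule. With rule (3) [P].[P:Y] => [P:Y], the
  // substitution {"[P]", "[P:Y]"} collapses to {"[P:Y]"}, so the runaway
  // family of rules above is never created.
  bool simplifySubstitutions(std::vector<Term> &substitutions,
                             const std::vector<Rule> &rules) {
    bool changed = false;
    for (auto &substitution : substitutions)
      changed |= reduce(substitution, rules);
    return changed;
  }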
Previously, we did not check the length of terms appearing in concrete
substitutions, so runaway recursion there was only caught by the overall
completion step limit, which takes much longer to trigger.
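A sketch of the added guard, reusing the Term model above; the limit value
is a placeholder, not the real one:

  #include <cstddef>
  #include <vector>

  // Reject overly long substitution terms up front, instead of waiting
  // for the completion step limit to fire. The constant is illustrative.
  constexpr size_t MaxSubstitutionLength = 32;

  bool exceedsLengthLimit(const std::vector<Term> &substitutions) {
    for (const auto &substitution : substitutions)
      if (substitution.size() > MaxSubstitutionLength)
        return true;
    return false;
  }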
Also, while plumbing this through, don't record homotopy generators
unless we're minimizing a protocol signature, since they're not
used for anything else yet.
For implementation reasons we want the requirement signature of a
protocol to directly include all protocol refinement relationships,
even if they can be derived via same-type requirements between Self
and some nested type.
Therefore, a protocol refinement rule [P].[Q] => [P] can only be
replaced with a generating conformance equation that consists
entirely of other conformance rules.
This exactly simulates the existing behavior of the GSB's redundant
requirements algorithm.
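A hedged sketch of this restriction, using an illustrative rule-flags
model (not the compiler's actual data structures):

  #include <vector>

  struct RuleFlags {
    bool isConformanceRule;        // e.g. [P].[Q] => [P] or [P].[P:A] rules
    bool isProtocolRefinementRule; // specifically [P].[Q] => [P]
  };

  // A protocol refinement rule may only be replaced by a generating
  // conformance equation whose steps are all conformance rules;
  // otherwise it must stay in the requirement signature.
  bool canReplaceRule(const RuleFlags &candidate,
                      const std::vector<RuleFlags> &equationSteps) {
    if (!candidate.isProtocolRefinementRule)
      return true;  // no extra restriction for other conformance rules
    for (const auto &step : equationSteps)
      if (!step.isConformanceRule)
        return false;
    return true;
  }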
- Skip permanent rules (there's no point; we'll add them back next time)
- Skip conformance rules (these will be handled separately)
- Delete 3-cells that are entirely "in-context" (but don't quote me on
this one, I'm not quite yet convinced it's correct, but it feels right)
The left and right hand sides of a merged associated type candidate rule
share a common prefix, differing only in the associated type symbol at
the end.
Instead of passing two MutableTerms, we can pass a single uniqued
Term and a Symbol.
Also, we can compute the merged symbol before pushing the candidate
onto the vector. This avoids unnecessary work if the merged symbol
is equal to the right hand side's symbol.
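A sketch of the slimmer candidate representation, reusing the toy
Term/Symbol model above; the names and the mergeAssociatedTypes()
stand-in are illustrative:

  #include <optional>
  #include <utility>

  // Stand-in for the real associated type symbol merge.
  Symbol mergeAssociatedTypes(const Symbol &lhs, const Symbol &rhs);

  // Both sides share a prefix and differ only in the final symbol, so
  // one uniqued term plus one symbol describes a candidate.
  struct MergeCandidate {
    Term rhs;          // the full right hand side
    Symbol lhsSymbol;  // the left hand side is `rhs` with its last
                       // symbol replaced by `lhsSymbol`
  };

  std::optional<MergeCandidate> makeCandidate(Term rhs, Symbol lhsSymbol) {
    // Compute the merged symbol up front: if merging changes nothing,
    // there's no work to do, so don't queue the candidate at all.
    if (mergeAssociatedTypes(lhsSymbol, rhs.back()) == rhs.back())
      return std::nullopt;
    return MergeCandidate{std::move(rhs), std::move(lhsSymbol)};
  }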
If we have a rewrite rule of the form
X.[P:A] => X.[Q:A]
we introduce a pair of rules:
X.[P:A] => X.[P&Q:A]
X.[Q:A] => X.[P&Q:A]
But in reality only the second one is necessary. The first one is redundant
because you obtain the same result by applying the original rule followed by
the second rule.
In a confluent rewrite system, if the left hand side of a rule
X => Y can be reduced by some other rule X' => Y', then it is
permissible to delete the original rule X => Y altogether.
Confluence means that rewrite rules can be applied in any
order, so it is always valid to apply X' => Y' first, thus
X => Y is obsolete.
This was previously done in the completion procedure via a
quadratic algorithm that attempted to reduce each existing
rule via the newly-added rule obtained by resolving a critical
pair. Instead, we can do this in the post-processing pass
where we reduce right hand sides, using a trie to speed up
the lookup.
This increases the amount of work performed by the
post-processing pass, but eliminates the quadratic algorithm,
saving time overall.
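A sketch of this deletion pass on the toy model; the real pass answers
the "is this left hand side reducible?" question with a trie (described
below), but a linear scan keeps the sketch self-contained:

  #include <algorithm>
  #include <cstddef>
  #include <vector>

  struct DeletableRule { Term lhs, rhs; bool deleted = false; };

  // Does some other (live) rule's LHS occur as a subterm of `lhs`?
  bool reducibleByOtherRule(const std::vector<DeletableRule> &rules,
                            size_t self, const Term &lhs) {
    for (size_t j = 0; j < rules.size(); ++j) {
      if (j == self || rules[j].deleted)
        continue;
      const Term &inner = rules[j].lhs;
      for (size_t k = 0; k + inner.size() <= lhs.size(); ++k)
        if (std::equal(inner.begin(), inner.end(), lhs.begin() + k))
          return true;
    }
    return false;
  }

  // In a confluent system, a rule whose LHS another rule can reduce is
  // unreachable once that other rule fires, so it can be deleted.
  void deleteObsoleteRules(std::vector<DeletableRule> &rules) {
    for (size_t i = 0; i < rules.size(); ++i)
      if (reducibleByOtherRule(rules, i, rules[i].lhs))
        rules[i].deleted = true;
  }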
The PropertyMap wants to use a trie to map terms to PropertyBags, and it
needs longest-suffix rather than shortest-prefix matching.
Implement both by making Trie a template class with two parameters: the
ValueType and the MatchKind.
Note that while the MatchKind encodes the longest- vs shortest-match
behavior, matching on the prefix vs the suffix of a term is up to the
caller: the find() and insert() methods of Trie take a pair of iterators,
so simply passing begin()/end() vs rbegin()/rend() selects the direction.
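A minimal sketch of such a trie, reusing the toy Symbol type; the real
class stores more per node, but the two template parameters and the
iterator-pair interface are the point:

  #include <map>
  #include <optional>
  #include <utility>

  enum class MatchKind { Shortest, Longest };

  template <typename ValueType, MatchKind Kind>
  class Trie {
    struct Node {
      std::map<Symbol, Node> children;
      std::optional<ValueType> value;
    };
    Node root;

  public:
    // The caller picks the direction: begin()/end() stores the key
    // front-to-back (prefix matching), rbegin()/rend() back-to-front
    // (suffix matching).
    template <typename Iter>
    void insert(Iter begin, Iter end, ValueType value) {
      Node *node = &root;
      for (auto it = begin; it != end; ++it)
        node = &node->children[*it];
      node->value = std::move(value);
    }

    // Shortest match returns the first value found along the path;
    // longest match keeps walking and returns the deepest one.
    template <typename Iter>
    std::optional<ValueType> find(Iter begin, Iter end) const {
      std::optional<ValueType> best;
      const Node *node = &root;
      for (auto it = begin; it != end; ++it) {
        auto child = node->children.find(*it);
        if (child == node->children.end())
          break;
        node = &child->second;
        if (node->value) {
          if (Kind == MatchKind::Shortest)
            return node->value;
          best = node->value;
        }
      }
      return best;
    }
  };

With this shape, the PropertyMap's longest-suffix lookup would be
something like Trie<PropertyBag *, MatchKind::Longest> with
rbegin()/rend() keys; the exact instantiation here is an assumption.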
Previously RewriteSystem::simplify() would attempt to apply every
rewrite rule at every position in the original term, which was
obviously a source of overhead.
The trie itself is somewhat unoptimized; for example, with a bit of
effort it could merge a node with its only child, if nodes stored
a range of elements to compare rather than a single element.
RewriteSystem::addRule() did two things before adding the rewrite rule:
- Simplify both sides. If both sides are now equal, discard the rule.
- If the last symbol on the left hand side was a superclass or concrete type
symbol, simplify the substitution terms in that symbol.
The problem is that the second step can produce a term which can be further
simplified, and in particular, one that is exactly equal to the left hand
side of some other rule.
To fix this, swap the order of the two steps. The only wrinkle is that we
now have to check for a concrete type symbol at the end of _both_ the left
hand side and the right hand side, since we don't orient the rule until
both sides have been simplified.
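A sketch of the reordered steps on the toy model;
simplifyTrailingConcreteSymbol() and the reduction order lessThan() are
stand-ins, and reduce() is the helper from the first sketch:

  #include <utility>
  #include <vector>

  // Stand-ins for pieces of the real implementation:
  void simplifyTrailingConcreteSymbol(Term &term,
                                      const std::vector<Rule> &rules);
  bool reduce(Term &term, const std::vector<Rule> &rules);
  bool lessThan(const Term &a, const Term &b);  // the reduction order

  bool addRule(Term lhs, Term rhs, std::vector<Rule> &rules) {
    // Step 1 (previously step 2): simplify concrete substitutions first.
    // The rule isn't oriented yet, so check the end of *both* sides.
    simplifyTrailingConcreteSymbol(lhs, rules);
    simplifyTrailingConcreteSymbol(rhs, rules);

    // Step 2 (previously step 1): reduce both sides; step 1 may have
    // produced a term equal to some existing rule's left hand side.
    reduce(lhs, rules);
    reduce(rhs, rules);
    if (lhs == rhs)
      return false;  // trivial rule; discard

    // Only now orient the rule and add it.
    if (lessThan(lhs, rhs))
      std::swap(lhs, rhs);
    rules.push_back({std::move(lhs), std::move(rhs)});
    return true;
  }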
I don't have a reduced test case for this one, but it was revealed by
compiler_crashers_2_fixed/0109-sr4737.swift after I introduced the trie.
Intuitively, the first token of a rewrite rule determines if the rule
applies to a protocol requirement signature 'Self', or a generic
parameter in the top-level generic signature. A rewrite rule can never
take a type starting with the protocol 'Self' to a type starting with a
generic parameter, or vice versa.
Enforce this by defining the notion of a 'domain', which is the set of
protocols to which the first atom in a term applies. This set can be
empty (if we have a generic parameter atom), or it may contain more than
one element (if we have an associated type atom for a merged associated
type).
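A sketch of the invariant this enables, assuming the enforcement is an
equality check on the domains of a rule's two sides (the representation
and function name are illustrative):

  #include <set>
  #include <string>

  // The 'domain' of a term: the set of protocols its first symbol
  // applies to. Empty for a generic parameter symbol; more than one
  // element for a merged associated type symbol.
  using Domain = std::set<std::string>;

  // A rewrite rule must relate terms with the same domain: it can never
  // take a term rooted at a protocol 'Self' to one rooted at a generic
  // parameter, or vice versa.
  bool isValidRule(const Domain &lhsDomain, const Domain &rhsDomain) {
    return lhsDomain == rhsDomain;
  }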
Store the protocol's direct associated types separately from the inherited
associated types, since in a couple of places we only need the direct
associated types.
Also, factor out a new ProtocolGraph::compute() method that does all the
steps in the right order.
If a type parameter has a protocol conformance and a concrete type,
we want to map associated types of the conformance to their concrete
type witnesses.
This is implemented as a post-processing pass in the completion
procedure that runs after the equivalence class map has been built.
If we have an equivalence class T => { conforms_to: [ P ], concrete: Foo },
then for each associated type A of P, we generate a new rule:
T.[P:A].[concrete: Foo.A] => T.[P:A] (if Foo.A is concrete)
T.[P:A] => T.(Foo.A) (if Foo.A is abstract)
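A hedged sketch of this rule generation on the toy model; the symbol
spellings are schematic, following the T/P/A/Foo placeholders above:

  // Build the rule for one associated type A of protocol P on an
  // equivalence class T with concrete type Foo. Whether the type
  // witness Foo.A is itself concrete decides which rule shape we emit.
  Rule makeWitnessRule(const Term &T, bool witnessIsConcrete) {
    Term assoc = T;
    assoc.push_back("[P:A]");

    if (witnessIsConcrete) {
      // T.[P:A].[concrete: Foo.A] => T.[P:A]
      Term lhs = assoc;
      lhs.push_back("[concrete: Foo.A]");
      return Rule{lhs, assoc};
    }

    // T.[P:A] => T.(Foo.A): the abstract witness written out as a
    // term; a single placeholder symbol stands in for it here.
    Term rhs = T;
    rhs.push_back("(Foo.A)");
    return Rule{assoc, rhs};
  }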
If this process introduced any new rules, we check for any new
overlaps by re-running Knuth-Bendix completion; this may in turn
introduce new concrete associated type overlaps, and so on.
The overall completion procedure now alternates between Knuth-Bendix
and rebuilding the equivalence class map; the rewrite system is
complete when neither step is able to introduce any new rules.
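The alternation, as a sketch (both functions are stand-ins):

  void runKnuthBendix();              // resolve critical pairs, add rules
  bool rebuildEquivalenceClassMap();  // returns true if the concrete type
                                      // witness pass added any new rules

  // The system is complete when neither step can introduce a new rule.
  void completeRewriteSystem() {
    bool addedRules = true;
    while (addedRules) {
      runKnuthBendix();
      addedRules = rebuildEquivalenceClassMap();
    }
  }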