The full state of the GSB isn’t all that useful for testing, creates a ton of noise and gets in the way of some cleanups we’d like to make in the interface.
Stop dumping it as part of `-debug-generic-signatures`.
When type-checking a function or subscript that itself does not have generic
parameters (but is within a generic context), we were creating a generic
signature builder which will always produce the same generic signature as
the enclosing context. Stop creating that generic signature builder.
Instead, teach the CompleteGenericTypeResolver to use the generic signature
+ the canonical generic signature builder for that signature to resolve
types, which also eliminates some extraneous re-type-checking.
Improves type-checking performance of the standard library by 36%.
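As a minimal Swift sketch of the case in question (names are purely illustrative): the method below declares no generic parameters of its own, so its generic signature is exactly the enclosing type's, and there was nothing useful for a separate builder to compute.

```swift
struct Stack<Element> {
  var items: [Element] = []

  // push(_:) introduces no generic parameters of its own; its generic
  // signature is just <Element>, inherited from Stack, so building a
  // fresh generic signature builder for it was pure overhead.
  mutating func push(_ item: Element) {
    items.append(item)
  }
}
```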
When we have two nested types of a given potential archetype that have
the same name, introduce a (quietly) inferred constraint. This is
a future-proofing step for canonicalization, for a possible future
where we no longer require all of the nested types of a given name
to be equivalent, e.g., because we have a proper disambiguation
mechanism.
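A sketch of the situation (hypothetical protocols): two protocols declare same-named associated types, and a single type parameter conforms to both, so the builder sees two nested types named A for the same potential archetype and quietly infers that they are equivalent.

```swift
protocol P { associatedtype A }
protocol Q { associatedtype A }

// T has two same-named nested types (P's A and Q's A); the builder
// records an inferred same-type constraint making them equivalent,
// so T.A refers to a single equivalence class.
func takesBoth<T: P & Q>(_ value: T, _ element: T.A) { }
```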
When a potential archetype refers to a concrete (non-associated) type
declaration, we bind to that concrete type. Add a new requirement
source kind for this case that is always derived, separating it from
the nested-type-name-match source.
One important aspect of this is that typealiases in protocols that
"override" an associated type an inherited protocol will generate the
same requirement signature as the equivalent protocol that uses a
same-type constraint, making the suppression of the "hey, this is
equivalent to a same-type constraint now!" warning an ABI-preserving
change.
With this, remove a now-unnecessary hack for nested-name-match
requirement sources.
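A sketch of the ABI point above (hypothetical protocol names): the two derived protocols below are intended to have the same requirement signature, one spelling the constraint via a typealias that "overrides" the inherited associated type, the other via a same-type constraint in its where clause.

```swift
protocol Base { associatedtype Element }

// Constrains the inherited associated type with a typealias...
protocol UsesTypealias: Base { typealias Element = Int }

// ...which should produce the same requirement signature as spelling
// the constraint directly as a same-type requirement.
protocol UsesWhereClause: Base where Element == Int { }
```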
In some circumstances, we could end up growing increasingly-nested
potential archetypes due to a poor choice of representatives and
anchors. Address this in two places:
* Always prefer to use the potential archetype with a lower nesting
depth (== number of nested types) to one with a greater nesting
depth, so we don't accumulate more nested types onto the
already-longer potential archetypes, and
* Prefer archetype anchors with a lower nesting depth *except* that we
always prefer archetype anchors comprised of a sequence of
associated types (i.e., no concrete type declarations), which is
important for canonicalization.
Fixes SR-4757 / rdar://problem/31912838, as well as a regression
involving infinitely-recursive potential archetypes caused by the
previous commit.
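A minimal sketch of the kind of recursive constraint that can drive this growth (only illustrative of the class of problem, not the exact SR-4757 case): every nested type conforms to the protocol again, so a poor choice of representatives can keep producing deeper and deeper potential archetypes.

```swift
// Each A conforms to P again, so the builder can in principle generate
// an unbounded tower of nested potential archetypes (T.A, T.A.A, ...)
// unless it consistently prefers the shallower anchor.
protocol P { associatedtype A: P }

func collapse<T: P>(_ value: T) where T.A == T { }
```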
Infer same-type requirements among same-named associated
types/typealiases within inherited protocols. This is more staging; it
doesn't really have teeth until we stop wiring together these types as
part of lookup.
Introduce a warning about redeclaring the associated types from an
inherited protocol in the protocol being checked:
* If the new declaration is an associated type, note that the
declaration could be replaced by requirements in the protocol's
where clause.
* If the new declaration is a typealias, note that it could be
replaced by a same-type constraint in the protocol's where clause.
This stops after 5 recurrences of the same associated type. It is a
gross hack and a terrible idea, here as a placeholder to prevent us
from running off the rails in ill-formed code. This will go away when
we get further along the path with recursive protocol constraints.
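A sketch of what the redeclaration warning covers (hypothetical protocols): the derived protocol restates the inherited associated type, where the extra constraint could instead be written in the protocol's where clause.

```swift
protocol Base { associatedtype Element }

// Redeclaring Element triggers the new warning; the note suggests
// expressing the added constraint in the where clause instead, e.g.
// `protocol Derived: Base where Element: Equatable { }`.
protocol Derived: Base {
  associatedtype Element: Equatable
}
```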
Apply the same logic used for self-derived conformance constraints,
where we drop constraints derived from concrete conformances, to the
remaining kinds of constraints covered by isSelfDerivedSource().
When an otherwise abstract conformance constraint is derived from a
concrete conformance, retain the abstract conformance by removing the
requirement source that involves the concrete conformance. This
eliminates our reliance on the concrete conformance, which is not
retained as part of the generic signature.
Fixes rdar://problem/31163470 and rdar://problem/31520386.
When a requirement mentions a concrete type, that type might itself
mention other types (e.g., Set<T>) that imply further requirements
(here, T: Hashable). Perform requirement inference for such types.
Part of rdar://problem/31520386.
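A hedged sketch of the inference involved (hypothetical declarations): the concrete type Set<U> appearing in the requirement implies U: Hashable, which the builder now infers rather than requiring the user to state it.

```swift
protocol P { associatedtype A }

// The concrete type Set<U> mentioned in the same-type requirement
// implies U: Hashable; requirement inference introduces that
// constraint without it being written explicitly.
func store<T: P, U>(_ container: T, _ element: U) where T.A == Set<U> { }
```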
When a nested type is within the same equivalence class as its parent,
don't emit a redundant same-type-to-concrete constraint for the
corresponding potential archetype. The nested type's constraint will
be derived from the parent... which is technically a self-derived
constraint, yet needs to be suppressed.
Generic signature canonicalization/minimization never removes type
parameters, so we cannot suppress type-parameter-to-concrete
requirements even when they are derived.
Fixes the rest of the known cases of rdar://problem/30478915.
Move the storage for the protocols to which a particular potential
archetype conforms into EquivalenceClass, so that it is more easily
shared. More importantly, keep track of *all* of the constraint
sources that produced a particular conformance requirement, so we can
revisit them later, which provides a number of improvements:
* We can drop self-derived requirements at the end, once we've
established all of the equivalence classes
* We diagnose redundant conformance requirements, e.g., "T: Sequence"
is redundant if "T: Collection" is already specified.
* We can choose the best path when forming the conformance access
path.
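An illustration of the redundancy diagnostic mentioned in the second point above (a minimal sketch):

```swift
// T: Sequence is implied by T: Collection; with every conformance
// constraint source tracked, the builder can diagnose the explicit
// Sequence requirement as redundant.
func process<T>(_ values: T) where T: Collection, T: Sequence { }
```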
If a requirement is made redundant due to another requirement that was
inferred from the signature of a generic declaration, don't diagnose
the former as redundant. The user has likely written the requirement
explicitly for clarity purposes (e.g., to emphasize the Hashable
requirement on a function that takes a Set<T>). Removing the
requirement to silence the warning would make the code less clear.
This eliminates all of the annoying, spurious warnings from the build
of the overlays.
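A sketch of the case being exempted: the Hashable requirement below is already implied by the Set<U> parameter, yet it stays undiagnosed because it was presumably written for emphasis.

```swift
// U: Hashable is inferred from the parameter type Set<U>; the explicit
// requirement is left alone rather than flagged as redundant.
func elements<U: Hashable>(of set: Set<U>) -> [U] {
  Array(set)
}
```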
The stored dependent types in ProtocolRequirement elements within
requirement sources were incorrect for requirements created from the
requirement signature of another protocol, because we picked up the
already-substituted subject type. Thread the optional substitution map
through addRequirement(Requirement) as well, so we maintain the
original spelling of the stored dependent type.
This is a temporary fix; we should be able to recover the stored
dependent types from the potential archetypes in the requirement
source, so that we don't need to specify them explicitly at
construction time.
For a protocol requirement element within a requirement source, track
both the protocol in which the requirement was introduced as well as
the dependent type (relative to that protocol) on which the
requirement was introduced. This information is important when
reconstructing the path from a requirement-as-written to the location
of a desired protocol conformance.
Centralize the checking of a superclass constraint against a
same-type-to-concrete constraint. Additionally, produce a warning if
the superclass constraint is satisfied but is made redundant by an
existing same-type constraint.
Use the same infrastructure we have for same-type-to-concrete
constraints to check superclass constraints. Specifically,
* Track all superclass constraints; never "update" a requirement source
* Remove self-derived superclass constraints
* Pick the best superclass constraint within each connected component
of an equivalence class and use that for requirement generation.
* Diagnose conflicting superclass requirements during finalization
* Diagnose redundant superclass requirements (during finalization)
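A minimal sketch of a superclass redundancy this now diagnoses (hypothetical class and protocol names): the superclass requirement is satisfied, but made redundant by the same-type-to-concrete constraint.

```swift
class Base { }
class Derived: Base { }

protocol P { associatedtype A }

// T.A: Base is satisfied but redundant: T.A == Derived already pins
// T.A to a concrete subclass of Base, so the warning fires.
func use<T: P>(_ value: T) where T.A == Derived, T.A: Base { }
```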
Introduce an operation on RequirementSource to determine whether a
constraint with such a source, when it lands on a given potential
archetype, is "self-derived": i.e., the final constraint is derived
from the original constraint. Remove such constraints from the system
during finalization, because otherwise they would make the original
constraint redundant.
Fixes rdar://problem/30478915, although we still need to apply this
same logic to other kinds of constraints in the system.
Use TrailingObjects to help us efficiently store the root potential
archetype within requirement sources, so we can reconstruct the
complete path from the point where a requirement was created to the
potential archetype it affects.
The test changes are because we are now dumping the root potential
archetype as part of -debug-generic-signatures.
When a type parameter is made concrete via an existential type,
conformance requirements on that type parameter will be
abstract. Fixes rdar://problem/30610428.
Track each same-type-to-concrete constraint on the potential archetype
against which it was written, and ensure that the requirement sources
for such same-type constraints stay with the potential archetypes on
which they were described. This is similar to the way we track
same-type constraints among potential archetypes.
Use this information to canonicalize same-type-to-concrete constraints
appropriately. For each connected component within an equivalence
class of potential archetypes, select the best requirement source to
the concrete type, or substitute in an abstract requirement source if
none exists. This approach ensures that components that would be
equivalent to that concrete type anyway get a derived source, while
components that get the concrete-type equivalence only by being tied
to another component have their constraint modeled explicitly.
To get here, we also needed to change the notion of the archetype
anchor so that potential archetypes with no same-type constraints
directly in their path are preferred over potential archetypes that do
have a same-type constraint in their path. Otherwise, the anchor might
be something that is always concrete and is, therefore, not terribly
interesting.
Fixes the new case that popped up in rdar://problem/30478915.
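A hedged sketch of an equivalence class tied to a concrete type (hypothetical declarations): T.A, U.A, and Int all end up equivalent, and after canonicalization the concrete binding only needs to be stated once, at the selected anchor, with the rest reached through same-type constraints.

```swift
protocol P { associatedtype A }

// T.A == U.A merges the two nested types into one equivalence class,
// and T.A == Int makes that class concrete; minimization keeps one
// concrete-type constraint for the class rather than one per member.
func bind<T: P, U: P>(_ t: T, _ u: U) where T.A == U.A, T.A == Int { }
```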
Reimplement the RequirementSource class, which captures how
a particular requirement is satisfied by a generic signature. The
primary goal of this rework is to keep the complete path one follows
in a generic signature to get from some explicit requirement in the
generic signature to some derived requirement or type, e.g.,
1) Start at an explicit requirement "C: Collection"
2) Go to the inherited protocol Sequence,
3) Get the "Iterator" associated type
4) Get its conformance to "IteratorProtocol"
5) Get the "Element" associated type
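In Swift source, a path like that arises from something as simple as the following (a minimal sketch; the only explicit requirement is C: Collection):

```swift
// The conformance of C.Iterator to IteratorProtocol, and therefore the
// type C.Iterator.Element, is derived from the explicit C: Collection
// requirement along the path sketched above.
func firstElement<C: Collection>(_ collection: C) -> C.Iterator.Element? {
  var iterator = collection.makeIterator()
  return iterator.next()
}
```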
We don't currently capture all of the information we want in the path,
but the basic structure is there, and should also allow us to capture
more source-location information, find the "optimal" path, etc. There
are a number of potential uses:
* IRGen could eventually use this to dig out the witness tables and
type metadata it needs, instead of using its own fulfillment
strategy
* SubstitutionMap could use this to look up conformances, rather than
its egregious hacks
* The canonical generic signature builder could use this to look up
conformances as needed, e.g., for the recursive-conformances case.
... and probably more simplifications, once we get this right.
When emitting a requirement for a same-type-to-concrete constraint,
use the known concrete source for just the first of the component
anchors. For the rest, we need to explicitly model the constraint lest
it get lost. Fixes rdar://problem/30478915.
Clean up the representation of PotentialArchetype in a few small ways:
* Eliminate the GenericTypeParamType* at the root, and instead just
store a GenericParamKey. That makes the potential archetypes
independent of a particular set of generic parameters.
* Give potential archetypes a link back to their owning
ArchetypeBuilder, so we can get contextual information (etc.) when
needed. We can remove the "builder" arguments as a separate step.
Also, collapse getName()/getDebugName()/getFullName() into
getNestedName() and getDebugName(). Generic parameters don't have
"names" per se, so they should only show up in debug dumps.
In support of the former, clean up some of the diagnostics emitted by
the archetype builder that were using 'Identifier' or 'StringRef'
where they should have been using a 'Type' (i.e., the type behind the
dependent archetype).
Fix a couple of places where we failed to treat redundant constraints
as redundant, leaving canonicalized generic signatures that were not in
fact minimal. Address this in several places:
* When a type is made concrete, any superclass requirements on it are
redundant
* When a type is made concrete, any same-type relationships among its
nested types are redundant
* Only emit potential-archetype-to-concrete constraints in the
resulting signature for the component anchors; not for every type.
The first two bullets were existing problems that are now fixed; the
last is something I appear to have introduced with last week's
refactoring of canonicalized same-type constraints.
The recent work to make the canonicalization deterministic exacerbated
the issue considerably, leading to huge blow-up in the size of
canonical generic signatures that cost ~100k in standard library
size. This change recovers that 100k.
The canonicalization of dependent member types had some
nondeterminism. The root of the problem was that we couldn't
round-trip dependent member types through the archetype
builder---resolving them to a potential archetype lost the specific
associated type that was recorded in the dependent member type, which
affected canonicalization. Maintain that information, make sure that
we always get the right archetype anchor, and tighten up the
canonicalization logic within a generic signature.
Fixes rdar://problem/30274260 and should unblock some other work
that depends on sanity from the archetype builder and generic
signature canonicalization.
Introduce an algorithm to canonicalize and minimize same-type
constraints. The algorithm itself computes the equivalence classes
that would exist if all explicitly-provided same-type constraints are
ignored, and then forms a minimal, canonical set of explicit same-type
constraints to reform the actual equivalence class known to the type
checker. This should eliminate a number of problems we've seen with
inconsistently-chosen same-type constraints affecting
canonicalization.
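A hedged sketch of the redundancy this algorithm targets (hypothetical declarations): the three same-type constraints below describe a single equivalence class, but any spanning pair of them suffices, and the algorithm now picks a canonical pair deterministically.

```swift
protocol P { associatedtype A }

// {T.A, U.A, V.A} form one equivalence class; one of the three
// explicit constraints is redundant and is dropped during minimization.
func merge<T: P, U: P, V: P>(_ t: T, _ u: U, _ v: V)
  where T.A == U.A, U.A == V.A, T.A == V.A { }
```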
When enumerating requirements, always use the archetype anchors to
express requirements. Unlike "representatives", which are simply there
to maintain the union-find data structure used to track equivalence
classes of potential archetypes, archetype anchors are the
ABI-stable canonical types within a fully-formed generic signature.
The test case churn comes from two places. First, while
representatives are *often* the same as the archetype anchors, they
aren't *always* the same. Where they differ, we'll see a change in
both the printed generic signature and, therefore, its
mangling.
Additionally, requirement inference now takes much greater
care to make sure that the first types in the requirement follow
archetype anchor ordering, so actual conformance requirements occur in
the requirement list at the archetype anchor---not at the first type
that is equivalent to the anchor---which permits the simplification in
IRGen's emission of polymorphic arguments.
When a same-type constraint equates a potential archetype with a
concrete type, it implies equality of its nested types with the
appropriate witnesses. However, these were getting marked as
"explicit" when they should be "redundant".
Fixes rdar://problem/29295062.
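A minimal sketch (hypothetical names): the concrete binding of T.B implies an equality for its nested type through the conformance's type witness, and that implied equality should be classified as redundant rather than explicit.

```swift
protocol P { associatedtype A }
protocol Q { associatedtype B: P }

struct S: P { typealias A = Int }

// T.B == S implies T.B.A == Int via S's witness for A; that nested-type
// equality is derived from the concrete binding, not written explicitly.
func bindNested<T: Q>(_ value: T) where T.B == S { }
```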
Now that the previous patches have shaken out implicit assumptions
about the order of generic requirements and substitutions, we can
make a more radical change, dropping redundant protocol requirements
when building the original generic signature.
This means that the canonical ordering and minimization that we
used to only perform when building the mangling signature is done
all of the time, and hence getCanonicalManglingSignature() can go
away.
Usages now either call getCanonicalSignature(), or operate on the
original signature directly.
Move the sorting algorithm from construction of the canonical mangling
signature to requirement enumeration.
This also changes how same-type constraints pick representatives, using
the more canonical total order.
We used RequirementSources to identify redundant requirements
when building the canonical mangling signature. There were a
few problems with how this worked:
- We used the Inferred requirement source for both requirements
that were inferred from function parameter and result types
(for example, T : Hashable in `func foo<T, U>(dict: [T : U])`)
and some completely unrelated things, like same-type
constraints imposed on nested types by existing same-type
constraints on the outer types.
- We used the Protocol requirement source for associated type
requirements on protocols as well as conformance requirements
on inherited protocols.
- Introducing a new Redundant requirement did not mark existing
Explicit requirements as Redundant.
None of these were an issue because of how canonical mangling
signature construction works:
- We already started with a canonical signature, so there were
no Inferred requirements of the first kind (which we *do* want
to keep here)
- We dropped both Protocol and Redundant requirements, so it
didn't matter that the distinction here was not sufficiently
fine-grained
- We never introduced Explicit requirements that would be later
superseded by a Redundant requirement, because we always
started with an existing canonical signature where such Explicit
requirements were already dropped.
However, I want to use the same algorithm for building the
original signature as the canonical mangling signature, so this
patch introduces these changes:
- The Inferred source is now only used for inferred requirements of
the first kind; the second kind are now Redundant
- The Protocol source is now only used for associated type
requirements; inherited protocol conformances are now Redundant
- updateRequirementSource() now does the right thing with
introduced Redundant requirements
For now, this doesn't change much, but an upcoming patch
refactors ArchetypeBuilder::getGenericSignature() to also
use enumerateRequirements(), except dropping Redundant
requirements only, and not Protocol.
Warn about the old placement of the generic 'where' clause,
and provide a fix-it to move it to the new location as referenced
in SE-0081.
Fix up a few stray places in the standard library that were still using
the old syntax.
Update the ./test files so that they expect the new warning/fix-it
in -verify mode.
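For reference, the placement change described by SE-0081 (the old spelling shown only in a comment, since it no longer parses):

```swift
// Old placement, now warned about with a fix-it:
//   func allEqual<T where T: Equatable>(_ values: [T]) -> Bool
// New placement per SE-0081:
func allEqual<T>(_ values: [T]) -> Bool where T: Equatable {
  guard let first = values.first else { return true }
  return values.allSatisfy { $0 == first }
}
```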
While investigating what I thought was a new crash due to this new
diagnostic, I discovered two sources of quite a few compiler crashers
related to unterminated generic parameter lists, where the right
angle bracket source location was getting unconditionally set to
the current token, even though it wasn't actually a '>'.