The canonicalization of dependent member types had some
nondeterminism. The root of the problem was that we couldn't
round-trip dependent member types through the archetype
builder---resolving them to a potential archetype lost the specific
associated type that was recorded in the dependent member type, which
affected canonicalization. Maintain that information, make sure that
we always get the right archetype anchor, and tighten up the
canonicalization logic within a generic signature.
Fixes rdar://problem/30274260 and should unblock some other work
that depends on sanity from the archetype builder and generic
signature canonicalization.
Introduce an algorithm to canonicalize and minimize same-type
constraints. The algorithm itself computes the equivalence classes
that would exist if all explicitly-provided same-type constraints are
ignored, and then forms a minimal, canonical set of explicit same-type
constraints to reform the actual equivalence class known to the type
checker. This should eliminate a number of problems we've seen with
inconsistently-chosen same-type constraints affecting
canonicalization.
Make the "archetype anchor" algorithm use the same total order used to
compare dependent types in a generic signature, so that it is a
dependable total order. Tighten up the compareDependentTypes() logic
(and checking of it) a bit now that it's being used more frequently.
The weird code around GenericSignature::isCanonicalTypeInContext() is
due to a longstanding issue where
ArchetypeBuilder::resolveArchetype(t)->getDependentType()
is not the identity function when the incoming type is a
DependentMemberType with known associated types. We want to establish
this behavior at some point going forward, in which case we can go
back to the prior implementation of
GenericSignature::isCanonicalTypeInContext().
A change was recently made to canonicalize replacement types in
GenericSignature::getSubstitutions().
This resulted in ParenType being stripped off, which triggered
the 'tuple splat' diagnostic on code that was accepted in Swift 3.0.
I believe this canonicalization step is unnecessary; we
canonicalize using a brand-new ArchetypeBuilder that has no
generic signature added to it, so this is just equivalent to a
call to getCanonicalType().
Also adding the generic signature in question to the builder is
not the right answer either; the replacement types might be
written in terms of a different generic signature, or possibly
in terms of archetypes.
Taking this out seems to have no effect except changing a few
SIL dumps to contain sugared types, which should be harmless.
Part of fixing <rdar://problem/29739905>.
Be more tolerant of substitutions for erroneous types; the type
checker will diagnose these issues, but later code paths might still
form these substititions. Put in error types rather than crashing.
When enumerating requirements, always use the archetype anchors to
express requirements. Unlike "representatives", which are simply there
to maintain the union-find data structure used to track equivalence
classes of potential archetypes, archetype anchors are the
ABI-stable canonical types within a fully-formed generic signature.
The test case churn comes from two places. First, while
representatives are *often* the same as the archetype anchors, they
aren't *always* the same. Where they differ, we'll see a change in
both the printed generic signature and, therefore, it's
mangling.
Additionally, requirement inference now takes much greater
care to make sure that the first types in the requirement follow
archetype anchor ordering, so actual conformance requirements occur in
the requirement list at the archetype anchor---not at the first type
that is equivalent to the anchor---which permits the simplification in
IRGen's emission of polymorphic arguments.
This commit introduces new kind of requirements: layout requirements.
This kind of requirements allows to expose that a type should satisfy certain layout properties, e.g. it should be a trivial type, have a given size and alignment, etc.
Fixes assertion failures in SILGen and the optimizer with this
exotic setup:
protocol P {
associatedtype T : Q
}
protocol Q {
func requirement<U : P>(u: U) where U.T == Self
}
Here, we only have a U : P conformance, and not Self : Q,
because Self : Q is available as U.T : Q.
There were three problems here:
- The SIL verifier was too strict in verifying the generic signature.
All that matters is we can get the Self parameter conformance, not
that it's the first requirement, etc.
- GenericSignature::getSubstitutionMap() had a TODO concerning handling
of same-type constraints -- this is the first test-case I've found
that triggered the problem.
- GenericEnvironment::getSubstitutionMap() incorrectly ignored
same-type constraints where one of the two types was a generic
parameter.
Fixes <https://bugs.swift.org/browse/SR-3321>.
Not sure why but this was another "toxic utility method".
Most of the usages fell into one of three categories:
- The base value was always non-null, so we could just call
getCanonicalType() instead, making intent more explicit
- The result was being compared for equality, so we could
skip canonicalization and call isEqual() instead, removing
some boilerplate
- Utterly insane code that made no sense
There were only a couple of legitimate uses, and even there
open-coding the conditional null check made the code clearer.
Also while I'm at it, make the SIL open archetypes tracker
more typesafe by passing around ArchetypeType * instead of
Type and CanType.
- The DeclContext versions of these methods have equivalents
on the DeclContext class; use them instead.
- The GenericEnvironment versions of these methods are now
static methods on the GenericEnvironment class. Note that
these are not made redundant by the instance methods on
GenericEnvironment, since the static methods can also be
called with a null GenericEnvironment, in which case they
just assert that the type is fully concrete.
- Remove some unnecessary #includes of ArchetypeBuilder.h
and GenericEnvironment.h. Now changes to these files
result in a lot less recompilation.
This is a more flexible interface that allows substitution operations to avoid needing an eager SubstitutionMap without relying on module-based conformance lookup. NFC yet.
The "core" data structure used to record the substitutions to be
performed is a TypeSubstitutionMap, which is a DenseMap. This is a
fairly heavyweight, static data structure for something where
* We occasionally want a more dynamic, lazily-populated data structure, and
* We can usually provide more efficient storage than a DenseMap.
So, introduce a Type::subst() variant that takes a TypeSubstitutionFn,
which is just a function that maps a SubstitutableType * to a Type (or
nothing). Use this as the core variant of subst(). with an adapter for
existing TypeSubstitutionMaps. Over time, TypeSubstitutionMap should
go away.
This eliminates the really gross registration of archetype builders
within the ASTContext, and is another little step toward lazily
constructing archetypes.
In the archetype builder, separate out the generic parameters (which
were the keys of the MapVector) from the potential archetypes (which
where the values of the MapVector) into two parallel
SmallVectors. This saves on storage, lets us avoid building up a
SmallVector of GenericTypeParamTypes, and helps flesh out this pattern
as a way of eliminating DenseMaps in favor of arrays.
The root potential archetypes in an archetype builder are associated
with generic parameters. Start decoupling potential archetypes from a
specific GenericTypeParamType and instead work with the abstracted
depth/index. The goal here is to allow the same archetype builder to
be used within different generic environments (which includes both
different generic parameters and different archetypes).
As part of this, boost the archetype builder's GenericTypeParamKey
from a local type to a more generic GenericParamKey that can be used
in other interfaces that want to work with abstracted generic
parameters.
Rather than directly using the ArchetypeBuilder associated with a
canonical generic signature, use a canonical GenericEnvironment
associated with that canonical generic signature. This has a few
benefits:
* It's cleaner to not have IRGen working with archetype builders;
GenericEnvironment is the right abstraction for mapping between
dependent types and archetypes for a specific context.
* It helps us separate the archetype builder from a *specific*
set of archetypes. This is an ongoing refactor that is intended to
allow us to re-use archetype builders across different generic
environments.
As part of this, ArchetypeBuilder::substDependentType() has gone away
in favor of GenericEnvironment::mapTypeIntoContext().
Type substitution works on a fairly narrow set of types: generic type
parameters (to, e.g., use a generic) and archetypes (to map out of a
generic context). Historically, it was also used with
DependentMemberTypes, but recent refactoring to eliminate witness
markers eliminate that code path.
Therefore, narrow TypeSubstitutionMap's keys to SubstitutableType,
which covers archetypes and generic type parameters. NFC
Introduce a new operation on generic signatures,
enumeratePairedRequirements(), which invokes a callback with each
(dependent type, set-of-conformance-requirements) pair that needs to
be recorded within substitutions. Switch the last two remaining
consumers of witness markers (GenericSignature::getAllDependentTypes()
and GenericSignature::getSubstitutions()) over to this new entrypoint.
Type::subst()'s "IgnoreMissing" option was fairly unprincipled, dropping
unsubstituted types into the resulting AST without any indication
whatsoever that anything went wrong. Replace this notion with a new
form of ErrorType that explicitly tracks which substituted type caused
the problem. It's still an ErrorType, but it prints like the
substituted type (which is important for code completion) and allows
us to step back to the substituted type if needed (which is used by
associated type inference). Then, allow Type::subst(), when the new
UseErrorTypes flag is passed, to form partially-substituted types that
contain errors, which both code completion and associated type
inference relied on.
Over time, I hope we can use error-types-with-original-types more
often to eliminate "<<error type>>" from diagnostics and teach
Type::subst() never to return a "null" type. Clients can check
"hasError()" to deal with failure cases rather than checking null.
Now that the previous patches have shaken out implicit assumptions
about the order of generic requirements and substitutions, we can
make a more radical change, dropping redundant protocol requirements
when building the original generic signature.
This means that the canonical ordering and minimization that we
used to only perform when building the mangling signature is done
all of the time, and hence getCanonicalManglingSignature() can go
away.
Usages now either call getCanonicalSignature(), or operate on the
original signature directly.
Instead of walking over PotentialArchetypes representatives directly
and using a separate list to record same-type constraints, just use
enumerateRequirements() and check the RequirementSource to drop
redundant requirements.
This means getGenericSignature() and getCanonicalManglingSignature()
can share the same logic for collecting requirements; the only
differences are the following:
- both drop requirements from Redundant sources, but mangling
signatures also drop requirements from Protocol sources
- mangling signatures also canonicalize the types appearing in the
final requirement
Move the sorting algorithm from construction of the canonical mangling
signature to requirement enumeration.
This also changes how same-type constraints pick representatives, using
the more canonical total order.
We used RequirementSources to identify redundant requirements
when building the canonical mangling signature. There were a
few problems with how this worked:
- We used the Inferred requirement source for both requirements
that were inferred from function parameter and result types
(for example, T : Hashable in `func foo<T, U>(dict: [T : U])`
and some completely unrelated things, like same-type
constraints imposed on nested types by existing same-type
constraints on the outer types.
- We used the Protocol requirement source for associated type
requirements on protocols as well as conformance requirements
on inherited protocols.
- Introducing a new Redundant requirement did not mark existing
Explicit requirements as Redundant.
None of these were an issue because of how canonical mangling
signature construction works:
- We already started with a canonical signature, so there were
no Inferred requirements of the first kind (which we *do* want
to keep here)
- We dropped both Protocol and Redundant requirements, so it
didn't matter that the distinction here was not sufficiently
fine-grained
- We never introduced Explicit requirements that would be later
superceded by a Redundant requirement, because we always
started with an existing canonical signature where such Explicit
requirements were already dropped.
However, I want to use the same algorithm for building the
original signature as the canonical mangling signature, so this
patch introduces these changes:
- The Inferred source is now only used for inferred requirements of
the first kind; the second kind are now Redundant
- The Protocol source is now only used for associated type
requirements; inherited protocol conformances are now Redundant
- updateRequirementSource() now does the right thing with
introduced Redundant requirements
For now, this doesn't change much, but an upcoming patch
refactors ArchetypeBuilder::getGenericSignature() to also
use enumerateRequirements(), except dropping Redundant
requirements only, and not Protocol.
These will be more useful once substitutions in protocol conformances
are moved to use interface types.
At first, these are only going to be used by the SIL optimizer.
A GenericEnvironment stores the mapping between GenericTypeParamTypes
and context archetypes (or eventually, concrete types, once we allow
extensions to constrain a generic parameter to a concrete type).
The goals here are two-fold:
- Eliminate the GenericTypeParamDecl::getArchetype() method, and
always use mapTypeIntoContext() instead
- Replace SILFunction::ContextGenericParams with a GenericEnvironment
This patch adds the new data type as well as serializer and AST
verifier support. but nothing else uses it yet.
Note that GenericSignature::get() now asserts if there are no
generic parameters, instead of returning null. This requires a
few tweaks here and there.
This is the opposite of GenericSignature::getSubstitutionMap(),
transforming an interface type substitution mapping into
a Substitution array.
Previously, Substitution arrays were built with hand-rolled
logic, usually relying on a GenericParamList's AllArchetypes
list. We want to stop using the AllArchetypes list, instead
using the requirements array from a GenericSignature.
Nothing calls this method yet; existing code will be refactored
to call it in the next few patches.
This is obviously the right thing to do in terms of ensuring
that two different expressions of the same signature always result
in the same type. It also has the pleasant side-effect of causing
the canonical function type to never be expressed in terms of type
parameters which have been equated with concrete types, which means
that various consumers that work primarily with canonical types
(such as SILGen and IRGen) no longer have to worry about such types,
at least when decomposing a generic function signature.
from the witness tables for their associations rather than passing
them separately.
This drastically reduces the number of physical arguments required
to invoke a generic function with a complex protocol hierarchy. It's
also an important step towards allowing recursive protocol
constraints. However, it may cause some performance problems in
generic code that we'll have to figure out ways to remediate.
There are still a few places in IRGen that rely on recursive eager
expansion of associated types and protocol witnesses. For example,
passing generic arguments requires us to map from a dependent type
back to an index into the all-dependent-types list in order to
find the right Substitution; that's something we'll need to fix
more generally. Specific to IRGen, there are still a few abstractions
like NecessaryBindings that use recursive expansion and are therefore
probably extremely expensive under this patch; I intend to fix those
up in follow-ups to the greatest extent possible.
There are also still a few things that could be made lazier about
type fulfillment; for example, we eagerly project the dynamic type
metadata of class parameters rather than waiting for the first place
we actually need to do so. We should be able to be lazier about
that, at least when the parameter is @guaranteed.
Technical notes follow. Most of the basic infrastructure I set up
for this over the last few months stood up, although there were
some unanticipated complexities:
The first is that the all-dependent-types list still does not
reliably contain all the dependent types in the minimized signature,
even with my last patch, because the primary type parameters aren't
necessarily representatives. It is, unfortunately, important to
give the witness marker to the primary type parameter because
otherwise substitution won't be able to replace that parameter at all.
There are better representations for all of that, but it's not
something I wanted to condition this patch on; therefore, we have to
do a significantly more expensive check in order to figure out a
dependent type's index in the all-dependent-types list.
The second is that the ability to add requirements to associated
types in protocol refinements means that we have to find the *right*
associatedtype declaration in order to find the associated witness
table. There seems to be relatively poor AST support for this
operation; maybe I just missed it.
The third complexity (so far) is that the association between an
archetype and its parent isn't particularly more important than
any other association it has. We need to be able to recover
witness tables linked with *all* of the associations that lead
to an archetype. This is, again, not particularly well-supported
by the AST, and we may run into problems here when we eliminate
recursive associated type expansion in signatures.
Finally, it's a known fault that this potentially leaves debug
info in a bit of a mess, since we won't have any informaton for
a type parameter unless we actually needed it somewhere.
Parse 'var [behavior] x: T', and when we see it, try to instantiate the property's
implementation in terms of the given behavior. To start out, behaviors are modeled
as protocols. If the protocol follows this pattern:
```
protocol behavior {
associatedtype Value
}
extension behavior {
var value: Value { ... }
}
```
then the property is instantiated by forming a conformance to `behavior` where
`Self` is bound to the enclosing type and `Value` is bound to the property's
declared type, and invoking the accessors of the `value` implementation:
```
struct Foo {
var [behavior] foo: Int
}
/* behaves like */
extension Foo: private behavior {
@implements(behavior.Value)
private typealias `[behavior].Value` = Int
var foo: Int {
get { return value }
set { value = newValue }
}
}
```
If the protocol requires a `storage` member, and provides an `initStorage` method
to provide an initial value to the storage:
```
protocol storageBehavior {
associatedtype Value
var storage: Something<Value> { ... }
}
extension storageBehavior {
var value: Value { ... }
static func initStorage() -> Something<Value> { ... }
}
```
then a stored property of the appropriate type is instantiated to witness the
requirement, using `initStorage` to initialize:
```
struct Foo {
var [storageBehavior] foo: Int
}
/* behaves like */
extension Foo: private storageBehavior {
@implements(storageBehavior.Value)
private typealias `[storageBehavior].Value` = Int
@implements(storageBehavior.storage)
private var `[storageBehavior].storage`: Something<Int> = initStorage()
var foo: Int {
get { return value }
set { value = newValue }
}
}
```
In either case, the `value` and `storage` properties should support any combination
of get-only/settable and mutating/nonmutating modifiers. The instantiated property
follows the settability and mutating-ness of the `value` implementation. The
protocol can also impose requirements on the `Self` and `Value` types.
Bells and whistles such as initializer expressions, accessors,
out-of-line initialization, etc. are not implemented. Additionally, behaviors
that instantiate storage are currently only supported on instance properties.
This also hasn't been tested past sema yet; SIL and IRGen will likely expose
additional issues.
When a dependent type is mapped into context, the result will either be
an archetype or a concrete type. The latter occurs if a same-type
constraint exists between the dependent type and the concrete type.
The logic to decide if a type should be passed directly or indirectly
was not handling this case if an interface type was passed down -- we
would just check if there was a class constraint present.
This resulted in mismatching conventions between an interface type and
its corresponding contextual type, which would trigger assertions.
Note that same-type constraints between generic parameters and concrete
types are still not supported for other reasons; the subject of the
constraint must still be an associated type of a generic parameter.
Fixes <rdar://problem/24687460>.
As part of this, use a different enum for parsed generic requirements.
NFC except that I noticed that ASTWalker wasn't visiting the second
type in a conformance constraint; fixing this seems to have no effect
beyond producing better IDE annotations.
This eliminates some minor overheads, but mostly it eliminates
a lot of conceptual complexity due to the overhead basically
appearing outside of its context.