In order to allow this, I've had to rework the syntax of substituted function types; what was previously spelled `<T> in () -> T for <X>` is now spelled `@substituted <T> () -> T for <X>`. I think this is a nice improvement for readability, but it did require me to churn a lot of test cases.
Distinguishing the substitutions has two chief advantages over the existing representation. First, the semantics seem quite a bit clearer at use points; the `implicit` bit was very subtle and not always obvious how to use. More importantly, it allows the expression of generic function types that must satisfy a particular generic abstraction pattern, which was otherwise impossible to express.
As an example of the latter, consider the following protocol conformance:
```
protocol P { func foo() }
struct A<T> : P { func foo() {} }
```
The lowered signature of `P.foo` is `<Self: P> (@in_guaranteed Self) -> ()`. Without this change, the lowered signature of `A.foo`'s witness would be `<T> (@in_guaranteed A<T>) -> ()`, which does not preserve information about the conformance substitution in any useful way. With this change, the lowered signature of this witness could be `<T> @substituted <Self: P> (@in_guaranteed Self) -> () for <A<T>>`, which nicely preserves the exact substitutions which relate the witness to the requirement.
When we adopt this, it will both obviate the need for the special witness-table conformance field in SILFunctionType and make it far simpler for the SILOptimizer to devirtualize witness methods. This patch does not actually take that step, however; it merely makes it possible to do so.
As another piece of unfinished business, while `SILFunctionType::substGenericArgs()` conceptually ought to simply set the given substitutions as the invocation substitutions, that would disturb a number of places that expect that method to produce an unsubstituted type. This patch only set invocation arguments when the generic type is a substituted type, which we currently never produce in type-lowering.
My plan is to start by producing substituted function types for accessors. Accessors are an important case because the coroutine continuation function is essentially an implicit component of the function type which the current substitution rules simply erase the intended abstraction of. They're also used in narrower ways that should exercise less of the optimizer.
If an enum has a payload case with zero size, we treat it as an empty case
for ABI purposes. Unfortunately, this meant that reflection metadata was
incomplete for such cases, with a Mirror reporting that the enum value
had zero children.
Tweak the field type metadata emission slightly to preserve the payload
type for such enum cases.
Fixes <https://bugs.swift.org/browse/SR-12044> / <rdar://problem/58861157>.
Motivation: `GenericSignatureImpl::getCanonicalSignature` crashes for
`GenericSignature` with underlying `nullptr`. This led to verbose workarounds
when computing `CanGenericSignature` from `GenericSignature`.
Solution: `GenericSignature::getCanonicalSignature` is a wrapper around
`GenericSignatureImpl::getCanonicalSignature` that returns the canonical
signature, or `nullptr` if the underlying pointer is `nullptr`.
Rewrite all verbose workarounds using `GenericSignature::getCanonicalSignature`.
The RemoteMirror library in shipping versions of macOS/iOS/tvOS/watchOS crashes if the compiler
emits a BuiltinTypeDescriptor with size zero. Although this is fixed in top-of-tree RemoteMirror,
we want binaries built with the new compiler to still be inspectable when run on older OSes.
Generate the metadata as an empty struct with no fields when deploying back to these older
platforms, which should be functionally equivalent for most purposes.
Fixes rdar://problem/57924984.
https://forums.swift.org/t/improving-the-representation-of-polymorphic-interfaces-in-sil-with-substituted-function-types/29711
This prepares SIL to be able to more accurately preserve the calling convention of
polymorphic generic interfaces by letting the type system represent "substituted function types".
We add a couple of fields to SILFunctionType to support this:
- A substitution map, accessed by `getSubstitutions()`, which maps the generic signature
of the function to its concrete implementation. This will allow, for instance, a protocol
witness for a requirement of type `<Self: P> (Self, ...) -> ...` for a concrete conforming
type `Foo` to express its type as `<Self: P> (Self, ...) -> ... for <Foo>`, preserving the relation
to the protocol interface without relying on the pile of hacks that is the `witness_method`
protocol.
- A bool for whether the generic signature of the function is "implied" by the substitutions.
If true, the generic signature isn't really part of the calling convention of the function.
This will allow closure types to distinguish a closure being passed to a generic function, like
`<T, U> in (*T, *U) -> T for <Int, String>`, from the concrete type `(*Int, *String) -> Int`,
which will make it easier for us to differentiate the representation of those as types, for
instance by giving them different pointer authentication discriminators to harden arm64e
code.
This patch is currently NFC, it just introduces the new APIs and takes a first pass at updating
code to use them. Much more work will need to be done once we start exercising these new
fields.
This does bifurcate some existing APIs:
- SILFunctionType now has two accessors to get its generic signature.
`getSubstGenericSignature` gets the generic signature that is used to apply its
substitution map, if any. `getInvocationGenericSignature` gets the generic signature
used to invoke the function at apply sites. These differ if the generic signature is
implied.
- SILParameterInfo and SILResultInfo values carry the unsubstituted types of the parameters
and results of the function. They now have two APIs to get that type. `getInterfaceType`
returns the unsubstituted type of the generic interface, and
`getArgumentType`/`getReturnValueType` produce the substituted type that is used at
apply sites.
Structurally prevent a number of common anti-patterns involving generic
signatures by separating the interface into GenericSignature and the
implementation into GenericSignatureBase. In particular, this allows
the comparison operators to be deleted which forces callers to
canonicalize the signature or ask to compare pointers explicitly.
This removes it from the AST and largely replaces it with AnyObject
at the SIL and IRGen layers. Some notes:
- Reflection still uses the notion of "unknown object" to mean an
object with unknown refcounting. There's no real reason to make
this different from AnyObject (an existential containing a
single object with unknown refcounting), but this way nothing
changes for clients of Reflection, and it's consistent with how
native objects are represented.
- The value witness table and reflection descriptor for AnyObject
use the mangling "BO" instead of "yXl".
- The demangler and remangler continue to support "BO" because it's
still in use as a type encoding, even if it's not an AST-level
Type anymore.
- Type-based alias analysis for Builtin.UnknownObject was incorrect,
so it's a good thing we weren't using it.
- Same with enum layout. (This one assumed UnknownObject never
referred to an Objective-C tagged pointer. That certainly wasn't how
we were using it!)
This memoizes the result, which is fine for all callers; the only
exception is open existential types where each new open existential
now explicitly gets a unique generic environment, allocated by
calling GenericEnvironment::getIncomplete().
box descriptors
We want to substitute opaque result types in addTypeRef but when we pass
SILFunctionTypes this would fail because AST type substitution does not
support lowered SIL types.
Instead add addLoweredTypeRef which substitutes based on SILTypes.
rdar://54529445
If we're allowed to know at IRGen time what the underlying type of an opaque type is, we can
satisfy references to the opaque type's metadata or protocol witness tables by directly referencing
the underlying type instead.
When we generate code that asks for complete metadata for a fully concrete specific type that
doesn't have trivial metadata access, like `(Int, String)` or `[String: [Any]]`,
generate a cache variable that points to a mangled name, and use a common accessor function
that turns that cache variable into a pointer to the instantiated metadata. This saves a bunch
of code size, and should have minimal runtime impact, since the demangling of any string only
has to happen once.
This mostly just works, though it exposed a couple of issues:
- Mangling a type ref including objc protocols didn't cause the objc protocol record to get
instantiated. Fixed as part of this patch.
- The runtime type demangler doesn't correctly handle retroactive conformances. If there are
multiple retroactive conformances in a process at runtime, then even though the mangled string
refers to a specific conformance, the runtime still just picks one without listening to the
mangler. This is left to fix later, rdar://problem/53828345.
There is some more follow-up work that we can do to further improve the gains:
- We could improve the runtime-provided entry points, adding versions that don't require size
to be cached, and which can handle arbitrary metadata requests. This would allow for mangled
names to also be used for incomplete metadata accesses and improve code size of some generic
type accessors. However, we'd only be able to take advantage of the new entry points in
OSes that ship a new runtime.
- We could choose to always symbolic reference all type references, which would generally reduce
the size of mangled strings, as well as make runtime demangling more efficient, since it wouldn't
need to hit the runtime caches. This would however require that we be able to handle symbolic
references across files in the MetadataReader in order to avoid regressing remote mirror
functionality.
When a c++ namespace IRGens for reflection, it triggers a path that
expects the namespace to have a particular form. Because it is always
empty, we can treat it just like an empty swift enum.
This improves on the previous situation:
- The request ensures that the backing storage for lazy properties
and property wrappers gets synthesized first; previously it was
only somewhat guaranteed by callers.
- Instead of returning a range this just returns an ArrayRef,
which simplifies clients.
- Indexing into the ArrayRef is O(1), which addresses some FIXMEs
in the SIL optimizer.
When referencing a superclass type from a subclass, for example, the
type uses the subclass's generic parameters, not the superclass's.
This can be important if a nested type constrains away some of its
parent type's generic parameters.
This doesn't solve all the problems around mis-referenced generic
parameters when some are constrained away, though. That might
require a runtime change. See the FIXME comments in the test cases.
rdar://problem/51627403
Previously even if a type's metadata was optimized away, we would still
emit a field descriptor, which in turn could reference nominal type
descriptors for other types via symbolic references, etc.
Instead of a wholly separate lazyness mechanism for foreign metadata where
the first call to getAddrOfForeignTypeMetadataCandidate() would emit the
metadata, emit it using the lazy metadata mechanism.
This eliminates some code duplication. It also ensures that foreign
metadata is only emitted once per SIL module, and not once per LLVM
module, avoiding duplicate copies that must be ODR'd away in multi-threaded
mode.
This fixes the test case from <rdar://problem/49710077>.
Use the `IRLInkage::InternalLinkOnceODR` linkage rather than computing
that in a couple of sites. This gives us semantic meaning to the
linkage being applied. NFC.
When we emit "false" symbolic references to accesors for conformances or
type metadata, ensure that we end up with two-byte-aligned symbolic
references. This addresses a problem with ARM+Thumb compilation where the
low bit wasn't getting set for Thumb code.
Fixes rdar://problem/46067353.
Switch key path metadata over to mangled names for each of the places it
refers to either a type metadata accessor or a witness table accessor. For
now, the mangled name is a symbolic reference to the existing accessors.
Part of rdar://problem/38038799.
The current representation of an associated conformance in a witness
tables (e.g., Iterator: IteratorProtocol within a witness table for
Sequence) is a function that the client calls.
Replace this with something more like what we do for associated types:
an associated conformance is either a pointer to the witness table (once
it is known) or a pointer to a mangled name that describes that
conformance. On first access, demangle the mangled name and replace the
entry with the resulting witness table. This will give us a more compact
representation of associated conformances, as well as always caching
them.
For now, the mangled name is a sham: it’s a mangled relative reference to
the existing witness table accessors, not a true mangled name. In time,
we’ll extend the support here to handle proper mangled names.
Part of rdar://problem/38038799.