in associated type accessors.
The only remaining use of getAll{DependentTypes,Archetypes} in IRGen
is in code that interprets Substitution arrays, which is unavoidable
absent a change to that representation.
"minimal" is defined as the set of requirements that would be
passed to a function with the type's generic signature that
takes the thick metadata of the parent type as its only argument.
Each runtime function definition in RuntimeFunctions.def states which calling convention
should be used for this runtime function. But IRGen and LLVMPasses were not always
properly propagating this declared calling convention all the way down to llvm's Call instructions.
In many cases, the standard C convention was set for the call irrespective of the actual calling
convention defined for a given runtime function. As a result, incorrect code was generated.
This commit tries to fix all those places, where such a mismatch was found. In many cases this is
achieved by defining a helper function CreateCall in such a way that makes sure that the call instruction
gets the same calling convention as the one used by its callee operand.
We were checking for a @convention(witness_method) callee with an
abstract Self type in several places. Factor this out into a new
pair of methods on SILFunctionType, and fix the logic for static
methods, where the Self archetype is wrapped in a metatype.
from the witness tables for their associations rather than passing
them separately.
This drastically reduces the number of physical arguments required
to invoke a generic function with a complex protocol hierarchy. It's
also an important step towards allowing recursive protocol
constraints. However, it may cause some performance problems in
generic code that we'll have to figure out ways to remediate.
There are still a few places in IRGen that rely on recursive eager
expansion of associated types and protocol witnesses. For example,
passing generic arguments requires us to map from a dependent type
back to an index into the all-dependent-types list in order to
find the right Substitution; that's something we'll need to fix
more generally. Specific to IRGen, there are still a few abstractions
like NecessaryBindings that use recursive expansion and are therefore
probably extremely expensive under this patch; I intend to fix those
up in follow-ups to the greatest extent possible.
There are also still a few things that could be made lazier about
type fulfillment; for example, we eagerly project the dynamic type
metadata of class parameters rather than waiting for the first place
we actually need to do so. We should be able to be lazier about
that, at least when the parameter is @guaranteed.
Technical notes follow. Most of the basic infrastructure I set up
for this over the last few months stood up, although there were
some unanticipated complexities:
The first is that the all-dependent-types list still does not
reliably contain all the dependent types in the minimized signature,
even with my last patch, because the primary type parameters aren't
necessarily representatives. It is, unfortunately, important to
give the witness marker to the primary type parameter because
otherwise substitution won't be able to replace that parameter at all.
There are better representations for all of that, but it's not
something I wanted to condition this patch on; therefore, we have to
do a significantly more expensive check in order to figure out a
dependent type's index in the all-dependent-types list.
The second is that the ability to add requirements to associated
types in protocol refinements means that we have to find the *right*
associatedtype declaration in order to find the associated witness
table. There seems to be relatively poor AST support for this
operation; maybe I just missed it.
The third complexity (so far) is that the association between an
archetype and its parent isn't particularly more important than
any other association it has. We need to be able to recover
witness tables linked with *all* of the associations that lead
to an archetype. This is, again, not particularly well-supported
by the AST, and we may run into problems here when we eliminate
recursive associated type expansion in signatures.
Finally, it's a known fault that this potentially leaves debug
info in a bit of a mess, since we won't have any informaton for
a type parameter unless we actually needed it somewhere.
Fix some interface type/context type confusion in the AST synthesis from the previous patch, add a unique private mangling for behavior protocol conformances, and set up SILGen to emit the conformances when property declarations with behaviors are visited. Disable synthesis of the struct memberwise initializer if any instance properties use behaviors; codegen will need to be redesigned here.
If a resilient protocol is defined in a different resilience domain
than the conformance, new requirements may be added resiliently
without recompiling the module containing the conformance.
So we must use the accessor, and call a runtime function in the body
of the accessor to instantiate the conformance.
Similarly to how we've always handled parameter types, we
now recursively expand tuples in result types and separately
determine a result convention for each result.
The most important code-generation change here is that
indirect results are now returned separately from each
other and from any direct results. It is generally far
better, when receiving an indirect result, to receive it
as an independent result; the caller is much more likely
to be able to directly receive the result in the address
they want to initialize, rather than having to receive it
in temporary memory and then copy parts of it into the
target.
The most important conceptual change here that clients and
producers of SIL must be aware of is the new distinction
between a SILFunctionType's *parameters* and its *argument
list*. The former is just the formal parameters, derived
purely from the parameter types of the original function;
indirect results are no longer in this list. The latter
includes the indirect result arguments; as always, all
the indirect results strictly precede the parameters.
Apply instructions and entry block arguments follow the
argument list, not the parameter list.
A relatively minor change is that there can now be multiple
direct results, each with its own result convention.
This is a minor change because I've chosen to leave
return instructions as taking a single operand and
apply instructions as producing a single result; when
the type describes multiple results, they are implicitly
bound up in a tuple. It might make sense to split these
up and allow e.g. return instructions to take a list
of operands; however, it's not clear what to do on the
caller side, and this would be a major change that can
be separated out from this already over-large patch.
Unsurprisingly, the most invasive changes here are in
SILGen; this requires substantial reworking of both call
emission and reabstraction. It also proved important
to switch several SILGen operations over to work with
RValue instead of ManagedValue, since otherwise they
would be forced to spuriously "implode" buffers.
This is another incremental step toward protocol resilience.
To support resiliently adding requirements with default implementations,
we need to emit the witness thunk for each default requirement once,
and share it between conformances.
However, the body of the witness thunk can call witness methods from
the conformance of <Self : P>. Formerly, witness thunks were only emitted
with a concrete Self type, so any calls were resolved statically.
Now that Self can be abstract in a witness thunk signature, we have to
pass in the witness table and do the necessary gymnastics on both sides
of the call.
At the call site, the witness table is either abstract, concrete, or
undefined, as follows:
- If the unsubstituted Self type is concrete in the witness method
signature, no witness table is necessary; this is the case of a
concrete (non-default) witness thunk.
- If the unsubstituted Self type is abstract and the substituted Self
type is concrete, the witness table is accessed via direct reference.
- If the unsubstituted Self type is abstract and the substituted Self
type is also abstract, the witness table comes from type metadata
that was passed in to the function where the call is taking place.
Inside the body of the witness method thunk, we only bind the witness
table if Self is an abstract type; this rules out the first case above,
where the witness table is not needed and cannot be provided by the
caller.
The result of a SIL witness_method instruction now lowers as an
explosion containing two values, the function pointer itself and
the witness table.
Similarly, partial application thunks now grab the witness table and
package it up in the context.
Special care is taken to support function_ref + apply and
function_ref + partial_apply of @convention(witness_method) callees;
here, we can hit the case where we don't know the original conformance
because the callee is concrete, in which case we just pass in a null
pointer as the witness table.
Witness thunks with an abstract Self currently only work for protocols
without any associated type requirements; to support those, we need
to be able to fulfill associated type metadata from the witness
table for the <Self : P> conformance. This will be addressed as part
of @rjmccall's calling convention work.
Also I didn't make any attempt to support this for @objc protocols that
do not have a witness table. In this case, the extra parameter is not
necessary since we can perform dynamic dispatch on the 'self' value to
call requirements; however, @objc protocols will not support default
implementations, at least not in the near-term.
This is the first patch in a series that will allow new protocol
requirements to be added resiliently, with the runtime filling in
default implementations in witness tables.
First, this adds a new flag to the protocol descriptor indicating
that the protocol is resilient. In this case, there are two
additional fields, MinimumWitnessTableSizeInWords and
DefaultWitnessTableSizeInWords, followed by tail-allocated
default witnesses.
The swift_getGenericWitnessTable() entry point now fills in the
default witnesses from the protocol if the given witness table
template is smaller than the expected witness table size.
This also changes the layout of instantiated witness tables to move
the address point to the end of private data. Previously the private
data came after the requirements, but this meant that adding new
requirements would require sliding the private data at runtime and
accessing it indirectly. It is much simpler to access it from
negative offsets instead.
I updated IRGen to emit the new metadata, but currently all protocols
are flagged as not resilient, and default witnesses are not emitted;
this will come in a subsequent patch once some more plumbing is
in place.
To avoid generating GOT entries for references to protocols defined
in the current module, I had to add some hacks to the existing hack
for this. I'll hopefully clean this up in a principled manner later.
Instead of directly emitting calls to swift_getGenericMetadata*() and
referencing metadata templates, call a metadata accessor function
corresponding to the UnboundGenericType of the NominalTypeDecl.
The body of this accessor forwards arguments to a runtime metadata
instantiation function, together with the template.
Also, move some code around, so that metadata accesses which are
only done as part of the body of a metadata accessor function are
handled separately in emitTypeMetadataAccessFunction().
Apart from protocol conformances, this means metadata templates are
no longer referenced from outside the module where they were defined.
Recent changes added support for resiliently-sized enums, and
enums resilient to changes in implementation strategy.
This patch adds resilient case numbering, fixing the problem
where adding new payload cases would break existing code by
changing the numbering of no-payload cases.
The problem is that internally, enum cases are numbered with payload
cases coming first, followed by no-payload cases. While each list
is itself in declaration order, with new additions coming at the
end, we need to partition it to give us a fast runtime test for
"is this a payload or no-payload case index."
The resilient numbering strategy used here is that the getEnumTag
and destructiveInjectEnumTag value witness functions now take a
tag index in the range [-ElementsWithPayload..ElementsWithNoPayload-1].
Payload elements are numbered in *reverse* declaration order, so
adding new payload cases yields decreasing tag indices, and adding
new no-payload cases yields increasing tag indices, allowing use
sites to be resilient.
This adds the adjustment between 'fragile' and 'resilient' tag
indices in a somewhat unsatisfying manner, because the calculation
could be pushed down further into EnumImplStrategy, simplifying
both the IRGen code and the generated IR. I'll clean this up later.
In the meantime, clean up some other stuff in GenEnum.cpp, mostly
abstracting code that walks cases.
Use them to generate value witnesses when the type has dynamic packing.
Regularize the interface for calling value witnesses.
Not a huge difference yet, although we do re-use local type data
a little more effectively now.
Since that's somewhat expensive, allow the generation of meaningful
IR value names to be efficiently controlled in IRGen. By default,
enable meaningful value names only when generating .ll output.
I considered giving protocol witness tables the name T:Protocol
instead of T.Protocol, but decided that I didn't want to update that
many test cases.
Most notably, the source caches did not respect dominance. The
simplest solution was just to drop them in favor of the ordinary
caching system; this is unfortunate because it requires walking
over the path twice instead of exploiting the trie, but it's much
easier to make this work, especially in combination with the other
caching mechanisms at play.
This will be tested by later commits that enable lazy-loading of
local type data in various contexts.
This eliminates some minor overheads, but mostly it eliminates
a lot of conceptual complexity due to the overhead basically
appearing outside of its context.
The main idea here is that we really, really want to be
able to recover the protocol requirement of a conformance
reference even if it's abstract due to the conforming type
being abstract (e.g. an archetype). I've made the conversion
from ProtocolConformance* explicit to discourage casual
contamination of the Ref with a null value.
As part of this change, always make conformance arrays in
Substitutions fully parallel to the requirements, as opposed
to occasionally being empty when the conformances are abstract.
As another part of this, I've tried to proactively fix
prospective bugs with partially-concrete conformances, which I
believe can happen with concretely-bound archetypes.
In addition to just giving us stronger invariants, this is
progress towards the removal of the archetype from Substitution.
Instead of categorically forbidding caching within a conditional
scope, permit it but remember how to remove the cache entries.
This means that ad-hoc emission code can now get exactly the
right caching behavior if they use this properly. In keeping
with that, adjust a bunch of code to properly nest scopes
according to the conditional paths they enter.
There are several interesting new features here.
The first is that, when emitting a SILFunction, we're now able to
cache type data according to the full dominance structure of the
original function. For example, if we ask for type metadata, and
we've already computed it in a dominating position, we're now able
to re-use that value; previously, we were limited to only doing this
if the value was from the entry block or the LLVM basic block
matched exactly. Since this tracks the SIL dominance relationship,
things in IRGen which add their own control flow must be careful
to suppress caching within blocks that may not dominate the
fallthrough; this mechanism is currently very crude, but could be
made to allow a limited amount of caching within the
conditionally-executed blocks.
This query is done using a proper dominator tree analysis, even at -O0.
I do not expect that we will frequently need to actually build the
tree, and I expect that the code-size benefits of doing a real
analysis will be significant, especially as we move towards making
more metadata lazily computed.
The second feature is that this adds support for "abstract"
cache entries, which indicate that we know how to derive the metadata
but haven't actually done so. This code isn't yet tested, but
it's going to be the basis of making a lot of things much lazier.
of associated types in protocol witness tables.
We use the global access functions when the result isn't
dependent, and a simple accessor when the result can be cheaply
recovered from the conforming metadata. Otherwise, we add a
cache slot to a private section of the witness table, forcing
an instantiation per conformance. Like generic type metadata,
concrete instantiations of generic conformances are memoized.
There's a fair amount of code in this patch that can't be
dynamically tested at the moment because of the widespread
reliance on recursive expansion of archetypes / dependent
types. That's something we're now theoretically in a position
to change, and as we do so, we'll test more of this code.
This speculatively re-applies 7576a91009,
i.e. reverts commit 11ab3d537f.
We have not been able to duplicate the build failure in
independent testing; it might have been spurious or unrelated.
of associated types in protocol witness tables.
We use the global access functions when the result isn't
dependent, and a simple accessor when the result can be cheaply
recovered from the conforming metadata. Otherwise, we add a
cache slot to a private section of the witness table, forcing
an instantiation per conformance. Like generic type metadata,
concrete instantiations of generic conformances are memoized.
There's a fair amount of code in this patch that can't be
dynamically tested at the moment because of the widespread
reliance on recursive expansion of archetypes / dependent
types. That's something we're now theoretically in a position
to change, and as we do so, we'll test more of this code.
This reverts commit 6528ec2887, i.e.
it reapplies b1e3120a28, with a fix
to unbreak release builds.
The runtime entry doesn't just report the error, unlike the other report* functions, it also does the crashing.
Reapplying independent of unrelated reverted patches.
This reverts commit b1e3120a28.
Reverting because this patch uses WitnessTableBuilder::PI in NDEBUG code.
That field only exists when NDEBUG is not defined, but now NextCacheIndex, a
field that exists regardless, is being updated based on information from PI.
This problem means that Release builds do not work.
of associated types in protocol witness tables.
We use the global access functions when the result isn't
dependent, and a simple accessor when the result can be cheaply
recovered from the conforming metadata. Otherwise, we add a
cache slot to a private section of the witness table, forcing
an instantiation per conformance. Like generic type metadata,
concrete instantiations of generic conformances are memoized.
There's a fair amount of code in this patch that can't be
dynamically tested at the moment because of the widespread
reliance on recursive expansion of archetypes / dependent
types. That's something we're now theoretically in a position
to change, and as we do so, we'll test more of this code.