SILValue.h/.cpp just defines the SIL base classes. Referring to specific instructions is a (small) kind of layering violation.
Also I want to keep SILValue small so that it is really just a type alias of ValueBase*.
NFC.
As there are no instructions left which produce multiple result values, this is a NFC regarding the generated SIL and generated code.
Although this commit is large, most changes are straightforward adoptions to the changes in the ValueBase and SILValue classes.
And use the new project_existential_box to get to the address value.
SILGen now generates a project_existential_box for each alloc_existential_box.
And IRGen re-uses the address value from the alloc_existential_box if the operand of project_existential_box is an alloc_existential_box.
This lets the generated code be the same as before.
And use project_box to get to the address value.
SILGen now generates a project_box for each alloc_box.
And IRGen re-uses the address value from the alloc_box if the operand of project_box is an alloc_box.
This lets the generated code be the same as before.
Other than that most changes of this (quite large) commit are straightforward.
Now that all the pieces are in place, we can finally start seeing
some benefits. In particular, the code for witness thunk emission
is much simpler now.
If an abstraction pattern has interface types in it, we
already enforce that it was constructed with a generic
signature.
This generic signature is only used to answer questions
about the abstraction pattern's own formal type, and
not with the substituted type.
So only put type lowering cache entries in the dependent
cache if they contain interface types in the substituted
type. Otherwise, if only the abstraction pattern has
interface types in it, the entry can live in the
independent cache, allowing it to be looked up without
a pushGenericContext() / popGenericContext() call.
For correctness, we now have to store the generic signature
in the type key as well. Subsequent changes should reduce
the size of the cache, by lowering fewer archetypes.
NFC, since nothing uses this for now.
When lowering the original unsubstituted type to check for parameters
and results being passed indirectly, be careful to map it to archetypes,
since the abstraction pattern's generic signature might not equal
M.Types.getCurGenericContext().
Also, don't use '==' to compare canonical interface types.
NFC for now, since this code is largely not exercised.
Now that we can ask questions of a generic signature directly,
we can pass dependent types right on through to TypeClassifierBase
without consulting getArchetypes().
This required a small fix to getTypeLowering() to correctly form
an abstraction pattern from an interface type.
There are no remaining usages of getArchetypes(), so that
entire ArchetypeBuilder instance is now gone.
Also refactor away SIL's take on the GenericsRAII helper class.
In a bunch of use-cases we use stripSinglePredecessorArgs to eliminate this
case. There is no reason to assume that this is being done in the caller of
RCIdentity. Lets make sure that we handle this case here.
rdar://24156136
- isTypeParameter() -- check if this is an archetype or dependent
interface type.
- requiresClass() -- check if this is a class-constrained type
parameter.
The old isOpaque() check has been replaced by
(isTypeParameter() && !requiresClass(moduleDecl)).
This allows us to pass the ModuleDecl on down to
GenericSignature::requiresClass(), enabling the use of
interface types in abstraction patterns.
NFC for now.
This improves the quality of code but more importantly makes it easier to ensure
that new terminators are handled in this code since all of the switches are now
covered switches.
In all of the cases where this is being used, we already immediately perform an
unreachable if we find a TermKind::Invalid. So simplify the code and move it
into the conversion switch itself.
This eliminates some minor overheads, but mostly it eliminates
a lot of conceptual complexity due to the overhead basically
appearing outside of its context.
The main idea here is that we really, really want to be
able to recover the protocol requirement of a conformance
reference even if it's abstract due to the conforming type
being abstract (e.g. an archetype). I've made the conversion
from ProtocolConformance* explicit to discourage casual
contamination of the Ref with a null value.
As part of this change, always make conformance arrays in
Substitutions fully parallel to the requirements, as opposed
to occasionally being empty when the conformances are abstract.
As another part of this, I've tried to proactively fix
prospective bugs with partially-concrete conformances, which I
believe can happen with concretely-bound archetypes.
In addition to just giving us stronger invariants, this is
progress towards the removal of the archetype from Substitution.
If a global variable in a module we are compiling has a type containing
a resilient value type from a different module, we don't know the size
at compile time, so we cannot allocate storage for the global statically.
Instead, we will use a buffer, just like alloc_stack does for archetypes
and resilient value types.
This adds a new SIL instruction but does not yet make use of it.
The big differences here are that:
1. We no longer use the 4096 trick.
2. Now we store all indices inline so no mallocing is required and the
value is trivially copyable. We allow for much larger indices to be
stored inline which makes having an unrepresentable index a much smaller
issue. For instance on a 32 bit platform, in NewProjection, we are able
to represent an index of up to (1 << 26) - 1, which should be more than
enough to handle any interesting case.
3. We can now have up to 7 ptr cases and many more index cases (with each extra
bit needed to represent the index cases lowering the representable range of
indices).
The whole data structure is much simpler and easier to understand as a
bonus. A high level description of the ADT is as follows:
1. A PointerIntEnum for which bits [0, (num_tagged_bits(T*)-1)] are not all
set to 1 represent an enum with a pointer case. This means that one can have
at most ((1 << num_tagged_bits(T*)) - 2) enum cases associated with
pointers.
2. A PointerIntEnum for which bits [0, (num_tagged_bits(T*)-1)] are all set
is either an invalid PointerIntEnum or an index.
3. A PointerIntEnum with all bits set is an invalid PointerIntEnum.
4. A PointerIntEnum for which bits [0, (num_tagged_bits(T*)-1)] are all set
but for which the upper bits are not all set is an index enum. The case bits
for the index PointerIntEnum are stored in bits [num_tagged_bits(T*),
num_tagged_bits(T*) + num_index_case_bits]. Then the actual index is stored
in the remaining top bits. For the case in which this is used in swift
currently, we use 3 index bits meaning that on a 32 bit system we have 26
bits for representing indices meaning we can represent indices up to
67_108_862. Any index larger than that will result in an invalid
PointerIntEnum. On 64 bit we have many more bits than that.
By using this representation, we can make PointerIntEnum a true value type
that is trivially constructable and destructable without needing to malloc
memory.
In order for all of this to work, the user of this needs to construct an
enum with the appropriate case structure that allows the data structure to
determine what cases are pointer and which are indices. For instance the one
used by Projection in swift is:
enum class NewProjectionKind : unsigned {
// PointerProjectionKinds
Upcast = 0,
RefCast = 1,
BitwiseCast = 2,
FirstPointerKind = Upcast,
LastPointerKind = BitwiseCast,
// This needs to be set to ((1 << num_tagged_bits(T*)) - 1). It
// represents the first NonPointerKind.
FirstIndexKind = 7,
// Index Projection Kinds
Struct = PointerIntEnumIndexKindValue<0, EnumTy>::value,
Tuple = PointerIntEnumIndexKindValue<1, EnumTy>::value,
Index = PointerIntEnumIndexKindValue<2, EnumTy>::value,
Class = PointerIntEnumIndexKindValue<3, EnumTy>::value,
Enum = PointerIntEnumIndexKindValue<4, EnumTy>::value,
LastIndexKind = Enum,
};
Having a separate address and container value returned from alloc_stack is not really needed in SIL.
Even if they differ we have both addresses available during IRGen, because a dealloc_stack is always dominated by the corresponding alloc_stack in the same function.
Although this commit quite large, most changes are trivial. The largest non-trivial change is in IRGenSIL.
This commit is a NFC regarding the generated code. Even the generated SIL is the same (except removed #0, #1 and @local_storage).
This is something that we have wanted for a long time and will enable us to
remove some hacks from the compiler (i.e. how we determine in the ARC optimizer
that we have "fatalError" like function) and also express new things like
"noarc".
When enabled, generate closure functions with guaranteed conventions as their context parameters, and pass context arguments to them as guaranteed when possible. (When forming a closure by partial_apply, the partial apply still needs to take ownership of the parameters, regardless of their convention.)
enumerate them lazily.
This leads to compilation time improvement, as some of the LSValues previously
enumerated do not be created in this approach.
i.e. we enumerate LSValues created by loads previously, but the LoadInsts could be
target for RLE. In such case, the enumerated LSValues are not used.
Compilation time improvement: 1775ms to 1721ms (2.7% to 2.6% of the entire
compilation time for stdlib -O).
Existing tests ensure correctness.
Note: we still enumerate locations, as we need to know how many locations there are
in the function to resize the bitvector appropriately before the data flow runs.
Instead of bodging a representation of the SIL capture parameters for a closure into the formal type of closure SILDeclRefs, introduce those parameters in a separate lowering step. This lets us clean up some TypeLowering code that was tolerating things like SILBoxTypes and naked LValueTypes in formal types for nefarious ends (though requires some hacks in SILGen to keep the representation of curry levels consistent, which is something I hope to clean up next). This also decouples the handling of captures from the handling of other parameters, which should enable us to make them +0. For now, there should be NFC.