This change uses the 'gather bits' functionality of enum payloads
to create a contiguous value to switch over. This allows us to
remove the code that currently attempts to build a switch statement
by comparing each element in the payload in turn.
The downside of this technique is that we may do more work up front
gathering bits, and we may also need to compare larger values in some
situations. The upside is that we can remove a lot of complicated
code from IRGen. Also, we pass the responsibility for multi-way
branch generation to LLVM, which can make use of a wider range of
switch lowering strategies than IRGen can sensibly support.
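
A rough sketch of the shape this gives us, with hypothetical names
(gatherTagBits stands in for the emitted gather sequence; none of
these identifiers are the actual IRGen API):

    #include <cstdint>
    #include <cstdio>

    // Pull the scattered tag bits of a payload together, low bit first,
    // so a single integer can be switched over instead of comparing the
    // payload element by element against each case's pattern.
    static uint64_t gatherTagBits(uint64_t payload, uint64_t tagMask) {
      uint64_t tag = 0;
      unsigned outBit = 0;
      for (uint64_t m = tagMask; m; m &= m - 1)   // visit each set mask bit
        tag |= ((payload >> __builtin_ctzll(m)) & 1u) << outBit++;
      return tag;
    }

    void dispatch(uint64_t payload, uint64_t tagMask) {
      switch (gatherTagBits(payload, tagMask)) {  // one multi-way branch for LLVM
      case 0:  std::puts("first payload case");  break;
      case 1:  std::puts("second payload case"); break;
      default: std::puts("remaining cases");     break;
      }
    }
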
Add a new scatterBits function that is simpler and more generic
than the old interleaveSpareBits function. It is essentially a
constant version of the emitScatterBits function.
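
Roughly, the new function does something like the following sketch
(illustrative name and signature, not the actual declaration):

    #include "llvm/ADT/APInt.h"

    // Deposit the low bits of 'value', in order, into the set-bit
    // positions of 'mask' -- the compile-time counterpart of the
    // instruction sequence that emitScatterBits produces.
    llvm::APInt scatterBitsSketch(const llvm::APInt &mask,
                                  const llvm::APInt &value) {
      llvm::APInt result(mask.getBitWidth(), 0);
      unsigned nextSourceBit = 0;
      for (unsigned i = 0, e = mask.getBitWidth(); i != e; ++i) {
        if (!mask[i])
          continue;
        if (nextSourceBit < value.getBitWidth() && value[nextSourceBit])
          result.setBit(i);
        ++nextSourceBit;
      }
      return result;
    }
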
The change replaces 'set bit enumeration' with arithmetic
and bitwise operations. For example, the formula
'(((x & -x) + x) & x) ^ x' can be used to find the rightmost
contiguous bit mask. This is essentially the operation that
SetBitEnumerator.findNext() performed.
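
For example (a toy uint64_t model; the real code operates on APInt
masks):

    #include <cassert>
    #include <cstdint>

    // Extract the rightmost contiguous run of set bits using the formula
    // above, which is what SetBitEnumerator.findNext() effectively did.
    uint64_t rightmostRun(uint64_t x) {
      return (((x & -x) + x) & x) ^ x;
    }

    int main() {
      assert(rightmostRun(0b01101100) == 0b00001100);  // picks bits 2..3
      assert(rightmostRun(0b00000110) == 0b00000110);  // a lone run is itself
      assert(rightmostRun(0) == 0);
    }
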
Removing this functionality reduces the complexity of the
ClusteredBitVector (a.k.a. SpareBitVector) implementation and, more
importantly, of its API, which will make it easier to modify the
implementation of spare bit masks going forward. My end goal is to
make spare bit operations work more reliably on big endian systems.
Side note:
This change modifies the emit gather/scatter functions so that
they work with an APInt, rather than a SpareBitVector, which
makes these functions a bit more generic. These functions emit
instructions that are essentially equivalent to the parallel bit
extract/deposit (PEXT and PDEP) instructions in BMI2 on x86_64
(although we don't currently emit those directly). They also map
well to bitwise manipulation instructions on other platforms (e.g.
RISBG on IBM Z). So we might find uses for them outside spare bit
manipulation in the future.
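
For instance, on a BMI2-capable x86_64 machine the emitted sequences
compute the same results as these intrinsics (compile with -mbmi2):

    #include <immintrin.h>   // _pext_u64 / _pdep_u64
    #include <cassert>
    #include <cstdint>

    int main() {
      const uint64_t mask = 0xAA;            // bit positions 1, 3, 5, 7
      // Gather (PEXT): pack the masked bits of the source together.
      assert(_pext_u64(0x82, mask) == 0x9);  // bits 1 and 7 set -> 0b1001
      // Scatter (PDEP): spread a compact value back out into the mask.
      assert(_pdep_u64(0x9, mask) == 0x82);
    }
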
At some point it might be useful to explode payloads using an
appropriate explosion schema, but the code that currently handles
this is unused. There are many different ways we might add this
functionality in the future, and it isn't clear that the existing
code would be the best way to use explosion schemas when we do.
For now, remove this dead code so that its presence doesn't obscure
the code that is actually in use.
Previously, even if a type's metadata was optimized away, we would still
emit a field descriptor, which in turn could reference nominal type
descriptors for other types via symbolic references, etc.
The old logic was confusing. The LazyTypeGlobals map would contain
entries for all referenced types, even those without lazy metadata.
And for a type with lazy metadata, the IsLazy field would begin
with a value of false -- unless it was imported.
When a non-imported type was finally visited in the AST, we would
try to "enable" laziness for it, which meant queueing up any
metadata that had been requested before that point, or otherwise
emitting the metadata immediately.
Instead, let's add a separate map that caches whether a type has
lazy metadata or not. The first time we ask for the metadata of a
type, consult this map. If the type has lazy metadata according to
the map, queue up metadata emission for the type. Otherwise, emit
metadata eagerly when the type is visited in the AST.
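
A minimal sketch of that shape, with hypothetical names
(hasLazyMetadata, computeIsLazy, etc. are illustrative, not the
actual IRGen API):

    #include <unordered_map>

    struct NominalTypeDecl;   // stand-in for the AST declaration

    class MetadataEmitterSketch {
      // The new cache: did we decide this type's metadata is lazy?
      std::unordered_map<NominalTypeDecl *, bool> HasLazyMetadata;

      bool hasLazyMetadata(NominalTypeDecl *type) {
        auto found = HasLazyMetadata.find(type);
        if (found != HasLazyMetadata.end())
          return found->second;
        bool isLazy = computeIsLazy(type);   // decided once, up front
        HasLazyMetadata[type] = isLazy;
        return isLazy;
      }

      void requireTypeMetadata(NominalTypeDecl *type) {
        if (hasLazyMetadata(type))
          queueLazyMetadataEmission(type);   // emitted later, on demand
        // Otherwise nothing to queue: the metadata is emitted eagerly
        // when the type itself is visited in the AST.
      }

      // Hypothetical helpers standing in for the real logic.
      bool computeIsLazy(NominalTypeDecl *type);
      void queueLazyMetadataEmission(NominalTypeDecl *type);
    };
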
Single-payload enum layout unfortunately assumes that constructing
the payload case is a no-op, such as when building Optional.some(x)
for a value x. Now that multi-payload enums use their unused spare
bits to form extra inhabitants when they are in turn wrapped in an
Optional or another single-payload enum, we have to zero all of
those bits. The spare bits may have been selected from intra-field
or tail padding bytes that are undefined in the underlying payload
types. Fixes rdar://problem/47635801.
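
The fix boils down to masking when a value is injected into the
payload case, roughly along these lines (a sketch; the helper name
is made up and the payload is assumed to be an integer of the same
width as the mask):

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/IRBuilder.h"

    // Force every spare bit to zero so that padding garbage can't be
    // mistaken for an extra inhabitant of an enclosing single-payload enum.
    llvm::Value *clearSpareBits(llvm::IRBuilder<> &B, llvm::Value *payload,
                                const llvm::APInt &spareBits) {
      auto *keepMask = llvm::ConstantInt::get(payload->getType(), ~spareBits);
      return B.CreateAnd(payload, keepMask);
    }
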
This is essentially a long-belated follow-up to Arnold's #12606.
The key observation here is that the enum-tag-single-payload witnesses
are strictly more powerful than the XI witnesses: you can simulate
the XI witnesses by using an extra case count that's <= the XI count.
Of course the result is less efficient than the XI witnesses, but
that's less important than overall code size, and we can work on
fast-paths for that.
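
Concretely, the simulation looks something like this (a sketch with
simplified stand-in signatures, not the real value witness ABI):

    #include <cstdint>

    // Stand-ins for the enum-tag-single-payload witnesses: tag 0 means
    // "a valid value", tag n+1 means "empty case n".
    unsigned getEnumTagSinglePayload(const void *value, unsigned numEmptyCases);
    void storeEnumTagSinglePayload(void *value, unsigned whichCase,
                                   unsigned numEmptyCases);

    // XI witnesses expressed in terms of them, using an empty-case count
    // no larger than the extra inhabitant count.
    int getExtraInhabitantIndex(const void *value, unsigned xiCount) {
      return (int)getEnumTagSinglePayload(value, xiCount) - 1;  // -1 == valid
    }

    void storeExtraInhabitant(void *value, int index, unsigned xiCount) {
      storeEnumTagSinglePayload(value, (unsigned)index + 1, xiCount);
    }
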
The extra inhabitant count is stored in a 32-bit field (always present)
following the ValueWitnessFlags, which now occupy a fixed 32 bits.
This inflates non-XI VWTs on 32-bit targets by a word, but the net effect
on XI VWTs is to shrink them by two words, which is likely to be the
more important change. Also, being able to access the XI count directly
should be a nice win.
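
For reference, the tail of the layout now looks roughly like this
(a simplified sketch, not the actual declaration):

    #include <cstddef>
    #include <cstdint>

    struct ValueWitnessTableTailSketch {
      // ... the function witnesses come first ...
      size_t   size;
      size_t   stride;
      uint32_t flags;                 // ValueWitnessFlags: a fixed 32 bits
      uint32_t extraInhabitantCount;  // always present, directly loadable
    };
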
Because layout minimizes the number of tag bits used, and favors
high spare bits, the spare bit representations end up overlapping
the extra inhabitant representations, since we just counted down
from -1. If there are fewer tag bits than total spare bits, rotate
the extra inhabitant values so that they line up correctly with the
tag representations. rdar://problem/46468090
Previously, multi-payload enums would forward their unused spare
bits to be used by other multi-payload enums, but did not implement
anything for single-payload extra inhabitants.
In order to handle LinkOnceODR semantics correctly across various object
formats, introduce a new helper ApplyIRLinkage. This abstracts the need
to create a COMDAT group and set it on the GlobalValue. Adjust all
sites where we set IR linkage attributes to use this mechanism
instead, so that we do not have to track down symbols that were
never added to a COMDAT group.
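
A sketch of the shape of such a helper (the real implementation
differs in detail, e.g. in when a COMDAT is actually required):

    #include "llvm/IR/Comdat.h"
    #include "llvm/IR/GlobalObject.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/Casting.h"

    struct IRLinkage {
      llvm::GlobalValue::LinkageTypes Linkage;
      llvm::GlobalValue::VisibilityTypes Visibility;
      llvm::GlobalValue::DLLStorageClassTypes DLLStorage;
    };

    struct ApplyIRLinkageSketch {
      IRLinkage IRL;

      void to(llvm::GlobalValue *GV) const {
        GV->setLinkage(IRL.Linkage);
        GV->setVisibility(IRL.Visibility);
        GV->setDLLStorageClass(IRL.DLLStorage);

        // Place ODR symbols in a COMDAT group named after themselves
        // (object-format-specific conditions are omitted in this sketch).
        if (llvm::GlobalValue::isLinkOnceODRLinkage(IRL.Linkage) ||
            llvm::GlobalValue::isWeakODRLinkage(IRL.Linkage)) {
          if (auto *GO = llvm::dyn_cast<llvm::GlobalObject>(GV)) {
            llvm::Comdat *C =
                GO->getParent()->getOrInsertComdat(GO->getName());
            C->setSelectionKind(llvm::Comdat::Any);
            GO->setComdat(C);
          }
        }
      }
    };
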
The YAML format is the same one produced by the -dump-type-info
frontend mode.
For now this is only enabled if the -read-type-info-path frontend
flag is specified.
Progress on <rdar://problem/17528739>.
This allows us to layout-optimize Optional<T> when T is a struct with an
extra-inhabitant-bearing field anywhere in its definition, not only at
the beginning. rdar://problem/43019427
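
A sketch of the field-selection idea, using hypothetical stand-in
types (the real code works on the struct's field type infos):

    #include <vector>

    struct FieldLayoutSketch {
      unsigned offsetInBytes;
      unsigned extraInhabitantCount;
    };

    // Pick the field providing the most extra inhabitants, wherever it
    // appears in the struct, so the enclosing Optional's tag can be
    // projected into it. Previously only a leading field was usable.
    const FieldLayoutSketch *
    pickXIField(const std::vector<FieldLayoutSketch> &fields) {
      const FieldLayoutSketch *best = nullptr;
      for (const auto &field : fields)
        if (field.extraInhabitantCount > 0 &&
            (!best || field.extraInhabitantCount > best->extraInhabitantCount))
          best = &field;
      return best;   // null if no field provides extra inhabitants
    }
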
We also use this for field offset globals, which are not always
constant. I think in practice everything was getting set
correctly, but it was hard to follow the logic.
swift_unknownRetain does not work on indirect enum heap buffers.
The existing logic would reason: one case is
ReferenceCounting::BridgeObject (which on its own probably would not
work with swift_unknownRetain), and a second case is NativeObject
(for the indirect buffer), so let's use unknownRetain.
Pare the logic down to only use a single style of reference-count
operation if all cases have the same ReferenceCounting type.
rdar://40525268
Upstream has renamed the DEBUG() macro to LLVM_DEBUG. This updates swift
accordingly:
$ find . -name \*.cpp -print -exec sed -i "" -E "s/ DEBUG\(/ LLVM_DEBUG(/g" {} \;