This is essentially a long-belated follow-up to Arnold's #12606.
The key observation here is that the enum-tag-single-payload witnesses
are strictly more powerful than the XI witnesses: you can simulate
the XI witnesses by using an extra case count that's <= the XI count.
Of course the result is less efficient than the XI witnesses, but
that's less important than overall code size, and we can work on
fast-paths for that.
The extra inhabitant count is stored in a 32-bit field (always present)
following the ValueWitnessFlags, which now occupy a fixed 32 bits.
This inflates non-XI VWTs on 32-bit targets by a word, but the net effect
on XI VWTs is to shrink them by two words, which is likely to be the
more important change. Also, being able to access the XI count directly
should be a nice win.
The dummy WitnessTableEntry for the protocol conformance descriptor
was used to ensure that the witness indices were offset by 1 (to
account for the protocol conformance descriptor), but it led to some
odd index adjustments. Instead, make the index adjustments explicit
when we're passing a WitnessIndex off to be loaded for a witness
table.
A concrete conformance may involve conditional conformances, which are witness
tables that we can access from the original conformance's one. We need to track
metadata and be able to follow it in a metadata path.