The use of 'nocapture' for parameters and return values is incorrect for C++ types, as they can actually capture a pointer into its own value (e.g. std::string in libstdc++)
rdar://115062687
This is essentially a long-belated follow-up to Arnold's #12606.
The key observation here is that the enum-tag-single-payload witnesses
are strictly more powerful than the XI witnesses: you can simulate
the XI witnesses by using an extra case count that's <= the XI count.
Of course the result is less efficient than the XI witnesses, but
that's less important than overall code size, and we can work on
fast-paths for that.
The extra inhabitant count is stored in a 32-bit field (always present)
following the ValueWitnessFlags, which now occupy a fixed 32 bits.
This inflates non-XI VWTs on 32-bit targets by a word, but the net effect
on XI VWTs is to shrink them by two words, which is likely to be the
more important change. Also, being able to access the XI count directly
should be a nice win.
Previously we had a single mask for all x86-64 targets which included both the top and bottom bits. This accommodated simulators, which use the top bit, while macOS uses the bottom bit, but reserved one bit more than necessary on each. This change breaks out x86-64 simulators from non-simulators and reserves only the one bit used on each.
rdar://problem/34805348 rdar://problem/29765919