Prefer direct stored properties to computed ones — there is no reason
to risk inlining issues, esp. since things like `Slice.base` aren’t
even force-inlined.
Prefer using `_wholeGuts` to spelling out the full incantation.
To prevent unaligned indices from breaking well-defined index distance
and index offset calculations, round every index down to the nearest
whole Character.
For the horrific details, see the forum discussion below.
https://forums.swift.org/t/string-index-unification-vs-bidirectionalcollection-requirements/55946
To avoid rounding from regressing String performance in the regular
case (when indices aren’t being passed across string views), introduce
a new String.Index flag bit that indicates that the index is already
Character aligned.
This used to forward to `Slice.replaceSubrange`, but that’s a generic algorithm that isn’t aware of the pecularities of Unicode extended grapheme clusters, and it can be mislead by unusual cases, like a substring or subrange whose bounds aren’t `Character`-aligned, or a replacement string that starts with a continuation scalar.
There are three flavors, corresponding to i < endIndex, i <= endIndex, and range containment checks.
Additionally, we have separate variants for index validation in substrings.
Assign some previously reserved bits in String.Index and _StringObject to keep track of their associated storage encoding (either UTF-8 or UTF-16).
None of these bits will be reliably set in processes that load binaries compiled with older stdlib releases, but when they do end up getting set, we can use them opportunistically to more reliably detect cases where an index is applied on a string with a mismatching encoding.
As more and more code gets recompiled with 5.7+, the stdlib will gradually become able to detect such issues with complete accuracy.
Code that misuses indices this way was always considered broken; however, String wasn’t able to reliably detect these runtime errors before. Therefore, I expect there is a large amount of broken code out there that keeps using bridged Cocoa String indices (UTF-16) after a mutation turns them into native UTF-8 strings. Therefore, instead of trapping, this commit silently corrects the issue, transcoding the offsets into the correct encoding.
It would probably be a good idea to also emit a runtime warning in addition to recovering from the error. This would generate some noise that would gently nudge folks to fix their code.
rdar://89369680
The demangling library can't use the error handling from the main runtime
because it isn't always linked with it. However, it's useful to have
some error handling, and in particular to be able to get data into the
crash logs.
This is complicated because of the way the demangling library gets used,
the upshot of which is that I've had to add a second object library just
for libswiftCore's use, so that the demangler will use the runtime's
error handling functions when present, and fall back on its own when
they aren't.
rdar://89139049
- The previous implementation trapped when the source buffer was empty. That behaviour is both not documented and unnecessary. If the source buffer is empty, its length is necessarily shorter or equal than the length of the destination.
- The updated version simply returns when the source buffer is empty.
Windows ARM64 is a LLP64 platform, which means that `unsigned long` is a
32-bit value. This was already mapped properly for x86_64, but somehow
had missed ARM64. This repairs that which is required for building the
standard library.