Include the initial implementation of _StringGuts, a 2-word
replacement for _LegacyStringCore. 64-bit Darwin is supported; 32-bit and
Linux support will follow in subsequent commits.
In grand LLVM tradition, the first step in redesigning _StringCore is
to rename it to _LegacyStringCore. Subsequent commits will
introduce the replacement, and eventually all uses of the old one will
be moved to the new one.
NFC.
- Update NSRange -> Range guidance
- Fix example in Optional
- Improve RangeExpression docs
- Fix issue in UnsafeRawBufferPointer.initializeMemory
- Code point -> scalar value most places
- Reposition the dot above the scripty `i'
- Fix ExpressibleByArrayLiteral code sample
Because of the way grapheme breaking changes across updates to ICU and the Unicode standard, it may not even be legit to check this at all. It's certainly not unsafe to skip the check, so let's see if we can do that in release builds, as grapheme breaking is expensive.
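A minimal sketch of the "check only in debug builds" idea, using a hypothetical wrapper type (`SketchCharacter` is not a standard library type, and the standard library may use a different internal check than `assert`). Swift's `assert` is evaluated in debug builds and is not evaluated in `-O` builds, so the expensive grapheme count is skipped in release builds:

```swift
// Hypothetical wrapper for illustration only; not the stdlib's Character.
struct SketchCharacter {
    let storage: String

    init(unchecked text: String) {
        // `text.count` performs grapheme breaking, which is expensive.
        // `assert` only evaluates its condition in debug builds, so
        // release (-O) builds skip this work entirely.
        assert(text.count == 1, "expected exactly one grapheme cluster")
        storage = text
    }
}

let sketch = SketchCharacter(unchecked: "a")
print(sketch.storage)
```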
This takes care of the standard library portion, but we need a new
BuiltinUTF16ExtendedGraphemeClusterLiteralConvertible protocol in order to
fully recover the performance of character literals.
Note that part of the character_literals.swift test is currently disabled. That
will need to be fixed before we can merge this work.
A `Character` _should_ contain only a single grapheme, but we can't formally require it because of grapheme cluster literals and the shifting sands of Unicode. Fixes https://bugs.swift.org/browse/SR-4955
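For illustration, a small sketch of why "single grapheme" and "single scalar" are different things. It only uses public `String` views; the specific character is just an example:

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT:
// two Unicode scalars that form one grapheme cluster.
let accented: Character = "e\u{301}"
let text = String(accented)

let graphemeCount = text.count                // grapheme clusters
let scalarCount = text.unicodeScalars.count   // Unicode scalar values
let utf8Count = text.utf8.count               // UTF-8 code units

print(graphemeCount, scalarCount, utf8Count)
```

Whether a given scalar sequence counts as one grapheme depends on the grapheme breaking rules in use, which is why the requirement cannot be made formal.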
This speeds up construction of a String from large Character representations,
and various other operations that would otherwise require additional grapheme
breaking just to interpret the Character.
- remove additional 'characters' references from String docs
- improved language around escaping pointer arguments
- key path type abstracts
- codable type abstract revisions
- a few more NSString API fixes
* removing .characters from examples
* beginning new String doc revisions
* improvements to the String Foundation overlay docs
* minor revisions elsewhere
This is a follow-up fix for making struct constructors inline(__always) in
155db0a4bd: Let Character literals, which fit into 64 bits, be folded into a single integer constant.
and
d8f1caf4a6: Inline all the new low-level bits
If we decide that these structs should not have fixed layout, we must re-evaluate the performance difference of not being able to inline
the struct constructors.
This is done by ensuring that the corresponding Character constructor is inlined. LLVM will do the constant folding.
Also add a test which checks this.
It makes character literals much faster (3x improvement for the CharacterLiteralsSmall benchmark).
It also removes _a lot_ of redundant code (~80% for the CharacterLiteralsSmall benchmark).
This optimization checks to see if a Character can fit in the 63-bit
small representation instead of unconditionally constructing a String
and paying the cost of that allocation. If it does, the small
representation is computed directly from its UTF-8 code units.
In optimized builds, this turns Character literals <= 8 UTF-8 code units
long into single 64-bit integer loads -- a huge improvement.
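A simplified sketch of the idea, not the standard library's actual layout: the helper name `packSmallUTF8` and the little-endian byte order are assumptions, and the sketch uses all 64 bits rather than the 63-bit representation described above.

```swift
// Hypothetical helper: pack a Character's UTF-8 code units into one
// 64-bit integer when there are at most 8 of them; otherwise return
// nil, standing in for the fallback to a heap-allocated String.
func packSmallUTF8(_ character: Character) -> UInt64? {
    let codeUnits = Array(String(character).utf8)
    guard codeUnits.count <= 8 else { return nil }

    var packed: UInt64 = 0
    for (index, unit) in codeUnits.enumerated() {
        // Code unit 0 goes in the lowest byte (an assumed ordering).
        packed |= UInt64(unit) << UInt64(index * 8)
    }
    return packed
}

let smallA = packSmallUTF8("a")                  // one code unit: 0x61
let smallE = packSmallUTF8(Character("\u{E9}"))  // two code units: 0xC3 0xA9
// "e" plus four combining accents is 9 UTF-8 code units: too large.
let tooBig = packSmallUTF8(Character("e\u{301}\u{301}\u{301}\u{301}"))
```

When the input is a literal and the constructor is inlined, the whole computation can in principle be constant-folded, which is what turns the literal into a single integer load.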
As of now:
* old APIs are just marked as `deprecated`, not `unavailable`, to make it
easier to co-operate with other toolchain repos.
* Value variant of API is implemented as public @private
`_ofInstance(_:)`.
This is another necessary step in introducing changes
for SE-0107: UnsafeRawPointer.
UnsafeRawPointer is great for bytewise pointer operations.
OpaquePointer goes away.
The _RawByte type goes away.
StringBuffer always binds memory to the correct CodeUnit
when allocating memory.
Before accessing the string, a dynamic element width check
allows us to assume the bound memory type.
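A rough sketch of that pattern using the public raw-pointer API. `SketchBuffer` and its members are hypothetical names for illustration and are not the real StringBuffer:

```swift
// Hypothetical buffer type; not the stdlib's StringBuffer.
final class SketchBuffer {
    private let storage: UnsafeMutableRawPointer
    private let count: Int
    private let elementWidth: Int   // 1 for UInt8, 2 for UInt16

    init(utf8 units: [UInt8]) {
        let raw = UnsafeMutableRawPointer.allocate(
            byteCount: max(1, units.count),
            alignment: MemoryLayout<UInt8>.alignment)
        // Bind to the correct code unit type at allocation time.
        raw.bindMemory(to: UInt8.self, capacity: units.count)
            .initialize(from: units, count: units.count)
        storage = raw
        count = units.count
        elementWidth = 1
    }

    init(utf16 units: [UInt16]) {
        let raw = UnsafeMutableRawPointer.allocate(
            byteCount: max(1, units.count) * MemoryLayout<UInt16>.stride,
            alignment: MemoryLayout<UInt16>.alignment)
        raw.bindMemory(to: UInt16.self, capacity: units.count)
            .initialize(from: units, count: units.count)
        storage = raw
        count = units.count
        elementWidth = 2
    }

    deinit { storage.deallocate() }

    // The dynamic width check tells us which type the memory was
    // bound to, so assumingMemoryBound(to:) is justified in each branch.
    func utf16CodeUnits() -> [UInt16] {
        if elementWidth == 2 {
            let typed = storage.assumingMemoryBound(to: UInt16.self)
            return Array(UnsafeBufferPointer(start: typed, count: count))
        } else {
            let typed = storage.assumingMemoryBound(to: UInt8.self)
            return UnsafeBufferPointer(start: typed, count: count).map { UInt16($0) }
        }
    }
}
```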
Generic entry points like atomicCompareExchange no longer handle
both kinds of pointers. Normally that's good because you
should not be using generics in that case; just upcast
to a raw pointer. However, with pointers-to-pointers
you can't do that.
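To illustrate the asymmetry, a small sketch with hypothetical helper names (`firstByte`, `demoUpcast`, `demoPointerToPointer`). A typed pointer converts implicitly to a raw pointer at a call site, but a pointer to a typed pointer does not convert to a pointer to a raw pointer; one workaround, shown here as an assumption about how a caller might cope, is an explicit `withMemoryRebound`:

```swift
func firstByte(at address: UnsafeRawPointer) -> UInt8 {
    return address.load(as: UInt8.self)
}

func demoUpcast() -> UInt8 {
    let typed = UnsafeMutablePointer<UInt32>.allocate(capacity: 1)
    typed.initialize(to: 0x2A2A2A2A)   // same byte everywhere: endian-neutral
    defer { typed.deallocate() }
    // UnsafeMutablePointer<UInt32> is implicitly upcast to UnsafeRawPointer.
    return firstByte(at: typed)
}

func demoPointerToPointer() -> Bool {
    let target = UnsafeMutablePointer<UInt32>.allocate(capacity: 1)
    target.initialize(to: 7)
    let slot = UnsafeMutablePointer<UnsafeMutablePointer<UInt32>>.allocate(capacity: 1)
    slot.initialize(to: target)
    defer { slot.deallocate(); target.deallocate() }

    // Passing `slot` where UnsafeMutablePointer<UnsafeMutableRawPointer>
    // is expected does not compile: there is no implicit upcast for the
    // pointee type. The memory has to be rebound explicitly instead.
    return slot.withMemoryRebound(to: UnsafeMutableRawPointer.self, capacity: 1) { rawSlot in
        rawSlot.pointee == UnsafeMutableRawPointer(target)
    }
}
```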
In various cases where we had global operators for non-generic
concrete types (such as String + String), move those operators into
the type. This should not affect the sources, but makes the exposition
of the library cleaner.
Plus, it's a good test for the compiler, which uncovered a few issues
where the compiler was coupled with the library.
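As an illustration of the operator move, using a hypothetical `Meters` type rather than any standard library type. Call sites stay the same; only the declaration site changes:

```swift
struct Meters {
    var value: Double
}

// Before (global operator for a concrete type):
//
//     func + (lhs: Meters, rhs: Meters) -> Meters { ... }
//
// After: the operator is a static member of the type it operates on.
extension Meters {
    static func + (lhs: Meters, rhs: Meters) -> Meters {
        return Meters(value: lhs.value + rhs.value)
    }
}

let total = Meters(value: 1.5) + Meters(value: 2.0)
print(total.value)
```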