Add inlinability annotations to restore performance parity with 4.2 String.
Take advantage of known NFC as a fast-path for comparison, and
overhaul comparison dispatch.
RRC improvements and optmizations.
* Refactor out RRC implementation into dedicated file.
* Change our `_invariantCheck` pattern to generate efficient code in
asserts builds and make the optimizer job's easier.
* Drop a few Bidi shims we no longer need.
* Restore View decls to String, workaround no longer needed
* Cleaner unicode helper facilities
This is a giant squashing of a lot of individual changes prototyping a
switch of String in Swift 5 to be natively encoded as UTF-8. It
includes what's necessary for a functional prototype, dropping some
history, but still leaves plenty of history available for future
commits.
My apologies to anyone trying to do code archeology between this
commit and the one prior. This was the lesser of evils.
The code to compare character values can be huge; it doesn’t seem a good idea to forcibly inline it. The @inline(__always) here can make specialized Set<Character> operations balloon 10x their current size with unrelated modifications in Set.
The code size bloat doesn’t always happen; I *guess* that _fastPaths inside String.== cause unintuitive inlining decisions in callers.
Add an isSmall query to Character so testing doesn't have to bake in
internal format. Clarify the purpose of the invalid UTF-16 backdoor
creation method.
@effects is too low a level, and not meant for general usage outside
the standard library. Therefore it deserves to be underscored like
other such attributes.
This includes various revisions to the APIs landing in Swift 4.2, including:
- Random and other randomness APIs
- Hashable changes
- MemoryLayout.offset(of:)
The @inlinable attribute was left off by accident, but this turned out to be a measurable performance boost for Character hashing.
Also add @effects(releasenone), for good measure.
In theory there could be a "fixed-layout" enum that's not exhaustive
but promises not to add any more cases with payloads, but we don't
need that distinction today.
(Note that @objc enums are still "fixed-layout" in the actual sense of
"having a compile-time known layout". There's just no special way to
spell that.)
Use the visitor pattern in most of the opaque-by-hand call
sites. Inspecting the compiler output does not show excessive and
unanticipated ARC, but there may need to be further tweaks.
One downside of the visitor pattern as written is that there's extra
shuffling around of registers for the closure CC. Hopefully this will
also be fixed soon.
Stop inlining _asOpaque into user code. Inlining it bloats user code
as there's a bit-test-and-branch to a block containing the _asOpaque
call, followed up some operations to e.g. manipulate the range or
re-align the calling convention, etc., followed by a final branch to
opaque stdlib code.
Instead, branch directly into opaque stdlib code. In theory, this
means that supporting all opaque patterns can be done with minimal
bloat. On ARM, this is a single tbnz instruction.
In preparation for small strings optimizations, bifurcate StringGuts's
inits to denote when the caller is choosing to skip any checks for is
small. In the majority of cases, the caller has more local information
to guide the decision.
Adds todos and comments as well:
* TODO(SSO) denotes any task that should be done simultaneously with
the introduction of small strings.
* TODO(TODO: JIRA) denotes tasks that should eventually happen
later (and will have corresponding JIRAs filed for).
* TODO(Comparison) denotes tasks when the new string comparison
lands.
Include the initial implementation of _StringGuts, a 2-word
replacement for _LegacyStringCore. 64-bit Darwin supported, 32-bit and
Linux support in subsequent commits.
In grand LLVM tradition, the first step to redesigning _StringCore is
to first rename it to _LegacyStringCore. Subsequent commits will
introduce the replacement, and eventually all uses of the old one will
be moved to the new one.
NFC.
- Update NSRange -> Range guidance
- Fix example in Optional
- Improve RangeExpression docs
- Fix issue in UnsafeRawBufferPointer.initializeMemory
- Code point -> scalar value most places
- Reposition the dot above the scripty `i'
- Fix ExpressibleByArrayLiteral code sample
Because of the way grapheme breaking changes across updates to ICU and the Unicode standard, it may not even be legit to check this at all. It's certainly not unsafe to skip the check, so let's see if we can do that in release builds, as grapheme breaking is expensive.
This takes care of the standard library portion, but we need a new
BuiltinUTF16ExtendedGraphemeClusterLiteralConvertible protocol in order to
fully recover the performance of character literals.
Note that part of the character_literals.swift test is currently disabled. That
will need to be fixed before we can merge this work.
A `Character` _should_ contain only a single grapheme, but we can't formally require it because of grapheme cluster literals and the shifting sands of Unicode. Fixes https://bugs.swift.org/browse/SR-4955