Commit Graph

5 Commits

Author SHA1 Message Date
Karoy Lorentey
a1dae65528 [stdlib] Adjust availability of _CharacterRecognizer conformances
These never made it to 5.8, so their availability needs to be bumped to 5.9.
2023-04-11 16:43:06 -07:00
Karoy Lorentey
2f1ed631e2 [stdlib] _CharacterRecognizer: Add Sendable, Equatable, CustomStringConvertible conformances
Equatability allows faster implementations for updating cached grapheme boundary state after a text mutation, because it enables quick detection of before/after state equality, without having to feed the recognizers until they produce a synchronized grapheme break.
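
As a rough illustration of the state comparison described above, the sketch below feeds two texts into separate recognizers and compares the resulting states with `==`. It assumes a toolchain whose stdlib ships the `Equatable` conformance (5.9 or later) and the `hasBreak(before:)` method on this underscored SPI type; `recognizerState(after:)` is a hypothetical helper, not part of the stdlib. On Apple platforms an availability check would also be needed.

```swift
// Hypothetical helper: feed every scalar of `text` into a fresh recognizer
// and return the state it ends up in.
// Assumes Unicode._CharacterRecognizer (underscored SPI, subject to change).
func recognizerState(after text: String) -> Unicode._CharacterRecognizer {
    var recognizer = Unicode._CharacterRecognizer()
    for scalar in text.unicodeScalars {
        // The result is ignored here; only the accumulated state matters.
        _ = recognizer.hasBreak(before: scalar)
    }
    return recognizer
}

let cachedState = recognizerState(after: "ab")  // state cached before an edit
let editedState = recognizerState(after: "xb")  // state recomputed after an edit

if editedState == cachedState {
    // Equal states mean later breaks cannot differ, so feeding can stop here.
    print("states match: cached boundaries after this point can be reused")
} else {
    print("states differ: keep feeding scalars")
}
```

Without `Equatable`, the same check would require feeding both recognizers further until they both report a break at the same position.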

The CustomStringConvertible conformance makes it orders of magnitude more pleasant to debug code that uses this.

Sendable is a baseline requirement for value types these days.
2023-01-06 14:51:37 -08:00
Karoy Lorentey
fa2f63cae0 [stdlib] _CharacterRecognizer._firstBreak(inUncheckedUnsafeUTF8Buffer:startingAt:)
2023-01-03 21:00:01 -08:00
Karoy Lorentey
87422e5dc4 [stdlib] _CharacterRecognizer: Remove initializer argument
2023-01-03 20:59:24 -08:00
Karoy Lorentey
55583ac13c [stdlib] Add new SPI for grapheme breaking (outside String)
`Unicode._CharacterRecognizer` is a newly exported opaque type that
exposes the stdlib’s extended grapheme cluster breaking facility,
independent of `String`.

This essentially makes the underlying simple state machine public,
without exposing any of the (unstable) Unicode details.

The ability to perform grapheme breaking over, say, the scalars stored
in multiple `String` values can be extremely useful while building
custom text processing algorithms and data structures.

Ideally this would eventually become API, but before proposing this
to Swift Evolution, I’d like to prove the shape of the type in actual
use (and we’ll also need to find better names for its operations).
2022-12-30 16:32:01 -08:00