The user experience with extended grapheme literals is currently:
1. Strict: we hard error on "invalid" grapheme literals.
2. Complete: we validate all literals to either be
known-single-grapheme or not.
3. Incorrect: we have Unicode 8 semantics implemented but applications
will have some other version of Unicode as dictated by the OS they are
running on.
In Swift 4.0, this incorrectness mostly crops up in obscure corner
case areas, where we are overly restrictive in some ways and overly
relaxed in others. But, there is one particularly embarrassing area
where it does come up: we reject emoji introduced after Unicode 8 as
grapheme literals, counter to common user expectations.
In a future (sub-)version of Swift we should completely re-evaluate
this user story, but doing so in time for Swift 4.0 is untenable. This
patch attempts to tweak the way in which we are incorrect in the most
minimally invasive way possible to preserve the same user experience
while permitting many post-Unicode-8 emoji as valid grapheme literals.
This change overrides processing of ZWJ and emoji modifiers to not
declare a grapheme break after/before, respectively.
This allows UnicodeScalars to be constructed from an integer, rather
then from a string. Not only this avoids an unnecessary memory
allocation (!) when creating a UnicodeScalar, this also allows the
compiler to statically check that the string contains a single scalar
value (in the same way the compiler checks that Character contains only
a single extended grapheme cluster).
rdar://17966622
Swift SVN r21198
This is only for the frontend, not for stdlib. The implementation is very
slow, optimizing it is the next step.
rdar://16755123 rdar://16013860
Swift SVN r18928
double-quoted string literals that contain a single extended grapheme cluster
SEGCL by default infer type String, but you can ask to infer Character
for them.
Single quoted literals continue to infer Character.
Actual extended grapheme cluster segmentation is not implemented yet,
<rdar://problem/16755123> Implement extended grapheme cluster
segmentation in libSwiftBasic
This is part of
<rdar://problem/16363872> Remove single quoted characters
Swift SVN r17034