mirror of
https://github.com/apple/swift.git
synced 2025-12-14 20:36:38 +01:00
The user experience with extended grapheme literals is currently:

1. Strict: we hard error on "invalid" grapheme literals.
2. Complete: we validate all literals to either be known-single-grapheme or not.
3. Incorrect: we have Unicode 8 semantics implemented, but applications will have some other version of Unicode as dictated by the OS they are running on.

In Swift 4.0, this incorrectness mostly crops up in obscure corner case areas, where we are overly restrictive in some ways and overly relaxed in others. But there is one particularly embarrassing area where it does come up: we reject emoji introduced after Unicode 8 as grapheme literals, counter to common user expectations.

In a future (sub-)version of Swift we should completely re-evaluate this user story, but doing so in time for Swift 4.0 is untenable. This patch attempts to tweak the way in which we are incorrect in the most minimally invasive way possible to preserve the same user experience while permitting many post-Unicode-8 emoji as valid grapheme literals.

This change overrides processing of ZWJ and emoji modifiers to not declare a grapheme break after/before, respectively.
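
To illustrate the override described above, here is a minimal toy model in Python. It is not the lexer code from this patch. Every name in it (`is_extend_like`, `has_break_between`, `is_single_grapheme`, the `relaxed` flag) is hypothetical, and the baseline rule (break between every pair of code points unless the right-hand one is a combining mark, ZWJ, or U+FE0F) is a deliberately crude stand-in for the real Unicode 8 segmentation rules.

```python
import unicodedata

ZWJ = 0x200D                                # ZERO WIDTH JOINER
VARIATION_SELECTOR_16 = 0xFE0F              # emoji presentation selector
EMOJI_MODIFIERS = range(0x1F3FB, 0x1F400)   # skin-tone modifiers U+1F3FB...U+1F3FF


def is_extend_like(cp):
    """Toy approximation of code points that never start a new grapheme."""
    if cp in (ZWJ, VARIATION_SELECTOR_16):
        return True
    return unicodedata.combining(chr(cp)) != 0


def has_break_between(left, right, relaxed):
    """Return True if the toy model places a grapheme break between two code points."""
    if is_extend_like(right):
        return False            # baseline: never break before an extending code point
    if relaxed and left == ZWJ:
        return False            # override: no break after ZWJ
    if relaxed and right in EMOJI_MODIFIERS:
        return False            # override: no break before an emoji modifier
    return True                 # crude default: break everywhere else


def is_single_grapheme(text, relaxed=True):
    """Return True if the toy model sees exactly one grapheme in `text`."""
    cps = [ord(ch) for ch in text]
    if not cps:
        return False
    return not any(has_break_between(a, b, relaxed) for a, b in zip(cps, cps[1:]))


family = "\U0001F469\u200D\U0001F469\u200D\U0001F467"   # woman ZWJ woman ZWJ girl
waving = "\U0001F44B\U0001F3FD"                         # waving hand + skin-tone modifier

for sample in (family, waving):
    print(is_single_grapheme(sample, relaxed=False), is_single_grapheme(sample, relaxed=True))
```

In this sketch both sample sequences are rejected with `relaxed=False` and accepted with `relaxed=True`. The override is also permissive by construction: any code point that follows a ZWJ joins the preceding cluster, so a non-emoji sequence such as "a", ZWJ, "b" is accepted too. That mirrors the "overly relaxed in others" trade-off the message acknowledges, though the exact set of sequences the real lexer accepts may differ.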