There is some follow-up work remaining:
- test/stdlib/UnicodeTrie test kills the type checker without manual type annotations. <rdar://problem/17539704>
- test/Sema/availability test raises a type error on 'a: String == nil', which we want, but probably not as a side effect of string-to-pointer conversions. I'll fix this next.
Swift SVN r19477
In UTF-8 decoder:
- implement U+FFFD insertion according to the recommendation given in the
Unicode spec. This required changing the decoder to become stateful, which
significantly increased complexity due to the need to maintain an internal
buffer.
- reject invalid code unit sequences properly instead of crashing rdar://16767868
- reject overlong sequences rdar://16767911
In stdlib:
- change APIs that assume that UTF decoding can never fail to account for
possibility of errors
- fix a bug in UnicodeScalarView that could cause a crash during backward
iteration if U+8000 is present in the string
- allow noncharacters in UnicodeScalar. They are explicitly allowed in the
definition of "Unicode scalar" in the specification. Disallowing noncharacters
in UnicodeScalar prevents actually using these scalar values as internal
special values during string processing, which is exactly the reason why they
are reserved in the first place.
- fix a crash in String.fromCString() that could happen if it was passed a null
pointer
In Lexer:
- allow noncharacters in string literals. These Unicode scalar values are not
allowed to be exchanged externally, but it is totally reasonable to have them
in literals as long as they don't escape the program. For example, using
U+FFFF as a delimiter and then calling str.split("\uffff") is completely
reasonable.
This is a lot of changes in a single commit; the primary reason why they are
lumped together is the need to change stdlib APIs to account for the
possibility of UTF decoding failure, and this has long-reaching effects
throughout stdlib where these APIs are used.
Swift SVN r19045
Preserve the AbstractCC of a function symbol when emitting a SIL ConstantRefInst by adding a new StaticFunction variant to LoweredValue. This lets us avoid a bunch of bitcasting noise for static functions that aren't used as values, and will lets us emit C-to-Swift-ABI thunks on demand when C functions are used as values.
Swift SVN r4543