14 Commits

Author SHA1 Message Date
Dmitri Hrybenko
2103b1d995 stdlib/Unicode: fix UTF-16 decoder not to crash on invalid code unit sequences
Also implemented U+FFFD insertion in the UTF-16 decoder according to the Unicode
recommendation.
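As a rough illustration of the behavior this commit describes, the sketch below uses the present-day Swift standard library rather than the 2014 decoder changed here. It assumes `String(decoding:as:)`, which repairs ill-formed input by substituting U+FFFD; the variable names are illustrative only.

```swift
// Assumption: modern Swift standard library, not the decoder from this commit.
// "A", an unpaired high surrogate, then "B".
let illFormedUnits: [UInt16] = [0x0041, 0xD800, 0x0042]

// Decoding repairs the lone surrogate instead of crashing.
let repaired = String(decoding: illFormedUnits, as: UTF16.self)
let repairedScalars = repaired.unicodeScalars.map { $0.value }

// A well-formed surrogate pair still decodes to a single scalar (U+1F600).
let wellFormedUnits: [UInt16] = [0xD83D, 0xDE00]
let emoji = String(decoding: wellFormedUnits, as: UTF16.self)
let emojiScalars = emoji.unicodeScalars.map { $0.value }
```

Here the unpaired surrogate is replaced by one U+FFFD, while the valid pair is left intact.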


Swift SVN r19091
2014-06-23 14:52:24 +00:00
Dmitri Hrybenko
f370ca0746 stdlib: fix a bunch of various Unicode issues, primarily in UTF-8 decoding
In UTF-8 decoder:
- implement U+FFFD insertion according to the recommendation given in the
  Unicode spec.  This required changing the decoder to become stateful, which
  significantly increased complexity due to the need to maintain an internal
  buffer.
- reject invalid code unit sequences properly instead of crashing rdar://16767868
- reject overlong sequences rdar://16767911
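
The decoder behavior listed above can be sketched with the current `UTF8` codec, whose `decode` method reports failure through a result value. This is only an approximation of the idea, not the code from this commit; `decodeUTF8` and `sawError` are hypothetical names introduced for the example.

```swift
// Hypothetical helper: decodes bytes, substituting U+FFFD for each error
// reported by the standard library's UTF8 codec.
func decodeUTF8(_ bytes: [UInt8]) -> (scalars: [Unicode.Scalar], sawError: Bool) {
    var decoder = UTF8()                 // stateful decoder
    var iterator = bytes.makeIterator()
    var scalars: [Unicode.Scalar] = []
    var sawError = false
    while true {
        switch decoder.decode(&iterator) {
        case .scalarValue(let scalar):
            scalars.append(scalar)
        case .error:
            // Ill-formed input (including overlong forms) is rejected,
            // and a replacement character is emitted in its place.
            sawError = true
            scalars.append("\u{FFFD}")
        case .emptyInput:
            return (scalars, sawError)
        }
    }
}

let overlongNul = decodeUTF8([0xC0, 0x80])   // overlong encoding of U+0000
let invalidByte = decodeUTF8([0xFF])         // 0xFF never appears in UTF-8
let plainASCII = decodeUTF8([0x41])          // "A"
```

The overlong form is reported as an error rather than being decoded as U+0000, which is the property the commit message calls out.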

In stdlib:
- change APIs that assume that UTF decoding can never fail to account for
  possibility of errors
- fix a bug in UnicodeScalarView that could cause a crash during backward
  iteration if U+8000 is present in the string
- allow noncharacters in UnicodeScalar.  They are explicitly allowed in the
  definition of "Unicode scalar" in the specification.  Disallowing noncharacters
  in UnicodeScalar prevents actually using these scalar values as internal
  special values during string processing, which is exactly the reason why they
  are reserved in the first place.
- fix a crash in String.fromCString() that could happen if it was passed a null
  pointer

In Lexer:
- allow noncharacters in string literals.  These Unicode scalar values are not
  allowed to be exchanged externally, but it is totally reasonable to have them
  in literals as long as they don't escape the program.  For example, using
  U+FFFF as a delimiter and then calling str.split("\uffff") is completely
  reasonable.
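
The delimiter use case mentioned above might look like the following in present-day Swift. The `\u{...}` escape syntax and `split(separator:)` are today's spellings, not the 2014 ones in the message, and the field values are made up for the example.

```swift
// Assumption: modern Swift syntax; the commit message uses the older "\uffff" form.
let delimiter: Character = "\u{FFFF}"        // noncharacter used as an internal marker
let joined = "alpha\u{FFFF}beta\u{FFFF}gamma"
let fields = joined.split(separator: delimiter).map { String($0) }

// Noncharacters are valid Unicode scalar values; surrogates are not.
let noncharacter = Unicode.Scalar(UInt32(0xFFFF))
let surrogate = Unicode.Scalar(UInt32(0xD800))
```

The string never leaves the program with U+FFFF in it, which matches the restriction described in the message.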

This is a lot of changes in a single commit; the primary reason why they are
lumped together is the need to change stdlib APIs to account for the
possibility of UTF decoding failure, and this has far-reaching effects
throughout stdlib where these APIs are used.


Swift SVN r19045
2014-06-20 13:07:40 +00:00
Chris Lattner
62cad3dce8 add a "..<" operator, which is an alias for ".." and switch the stdlib to use it.
Until I have a chance to update the testsuite, we'll accept both.



Swift SVN r18998
2014-06-19 05:42:29 +00:00
Nadav Rotem
1dc6de3785 Add a comment that explains how we disable inlining. NFC.
Swift SVN r18646
2014-05-27 00:32:20 +00:00
Nadav Rotem
dd4d9f5820 Make sure we don't inline the cold path by wrapping it with a closure.
Swift SVN r18641
2014-05-26 20:24:05 +00:00
Dave Abrahams
d17c4171da [stdlib] 80-column fixes
Swift SVN r18382
2014-05-19 02:08:19 +00:00
Nadav Rotem
14f4dd3a97 fix a typo
Swift SVN r18182
2014-05-16 06:06:52 +00:00
Nadav Rotem
70dfa94160 Extend the coverage of the fast path in the UTF16 comparator to more than ascii.
Swift SVN r18180
2014-05-16 05:58:39 +00:00
Nadav Rotem
2d592b6af9 Rename local variables. NFC.
Swift SVN r18176
2014-05-16 05:05:34 +00:00
Nadav Rotem
c1db20d178 Rename member _NthContiguous -> _nthContiguous
Swift SVN r18172
2014-05-16 04:26:10 +00:00
Nadav Rotem
c77b97079c Hoist the check for contiguous storage outside of the comparison loop.
Once we know that the storage is contiguous we use the new API _NthContiguous.
We can further optimize this code by specializing the access to ascii or UTF-16.
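
A minimal sketch of the hoisting idea, under the assumption that it resembles checking for contiguous storage once and then looping over raw buffers. It uses the modern `withContiguousStorageIfAvailable` API in place of the internal `_NthContiguous` accessor, and `utf16Less` and `compareContiguous` are hypothetical names, not the stdlib's.

```swift
// Hypothetical fast path: compares two contiguous UTF-16 buffers directly.
func compareContiguous(_ l: UnsafeBufferPointer<UInt16>,
                       _ r: UnsafeBufferPointer<UInt16>) -> Bool {
    let n = min(l.count, r.count)
    // Return at the first differing code unit.
    for i in 0..<n where l[i] != r[i] {
        return l[i] < r[i]
    }
    // Equal common prefix: the shorter sequence orders first.
    return l.count < r.count
}

// Hypothetical comparator: the contiguity check happens once, before the loop.
func utf16Less<C: Collection>(_ lhs: C, _ rhs: C) -> Bool where C.Element == UInt16 {
    let fast: Bool?? = lhs.withContiguousStorageIfAvailable { l in
        rhs.withContiguousStorageIfAvailable { r in
            compareContiguous(l, r)
        }
    }
    if let outer = fast, let result = outer {
        return result
    }
    // Slow path for storage that is not contiguous.
    return lhs.lexicographicallyPrecedes(rhs)
}

let abc = Array("abc".utf16)
let abd = Array("abd".utf16)
let ab = Array("ab".utf16)
```

Both paths are meant to give the same ordering; only the per-element access cost differs.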



Swift SVN r18167
2014-05-16 03:44:32 +00:00
Dave Abrahams
a8bbc4c89b [stdlib] String internal API review changes
I had to XFAIL test/ClangModules/cf.swift, which is failing for reasons
I can't understand.  <rdar://problem/16911496>

Swift SVN r18071
2014-05-14 14:18:52 +00:00
Ted Kremenek
9eea282719 Switch range operators ".." and "...".
- 1..3 now means 1,2
- 1...3 now means 1,2,3
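
For comparison, the same two meanings in present-day Swift, where the half-open operator is spelled `..<` (the spelling introduced as an alias in r18998) and `..` is no longer accepted:

```swift
// Half-open range: upper bound excluded.
let halfOpen = Array(1..<3)
// Closed range: upper bound included.
let closed = Array(1...3)
```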

Implements <rdar://problem/16839891>

Swift SVN r18066
2014-05-14 07:36:00 +00:00
Ted Kremenek
49d4fca14d Rename UTF16Scalars to UnicodeScalarView.
Implements <rdar://problem/16821900>.

Swift SVN r17899
2014-05-11 23:51:07 +00:00