20 Commits

Author SHA1 Message Date
Tim Kientzle
1d961ba22d Add #include "swift/Basic/Assertions.h" to a lot of source files
Although I don't plan to bring over new assertions wholesale
into the current qualification branch, it's entirely possible
that various minor changes in main will use the new assertions;
having this basic support in the release branch will simplify that.
(This is why I'm adding the includes as a separate pass from
rewriting the individual assertions)
2024-06-05 19:37:30 -07:00
Tony Allevato
300a952ede Replace u8 string literal prefixes with SWIFT_UTF8 macro.
In C++20, `u8` literals create values of type `char8_t` instead of
`char`, and these can't be implicitly converted. This macro
mitigates the difference and allows the same code to compile under
C++14/17 modes and C++20, preserving the `char` type while ensuring
that the text is interpreted as UTF-8.
2023-08-04 16:41:36 -04:00
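
A minimal sketch of how such a macro might work. `EXAMPLE_UTF8` is a hypothetical stand-in written for illustration; it is not the actual `SWIFT_UTF8` definition, which this log does not show. It assumes the compiler defines the standard feature-test macro `__cpp_char8_t` when `char8_t` is enabled.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the macro described in the commit message.
#if defined(__cpp_char8_t)
// C++20: u8"..." has type const char8_t[N], so view the bytes as const char *.
#define EXAMPLE_UTF8(literal) reinterpret_cast<const char *>(u8##literal)
#else
// C++14/17: u8"..." already has type const char[N].
#define EXAMPLE_UTF8(literal) u8##literal
#endif

// Usage: the literal stays char-based and UTF-8 encoded in either mode.
inline void exampleUtf8Checks() {
  const char *text = EXAMPLE_UTF8("\u00E9"); // U+00E9 encodes as 0xC3 0xA9
  assert(std::strlen(text) == 2);
  assert(static_cast<unsigned char>(text[0]) == 0xC3);
  assert(static_cast<unsigned char>(text[1]) == 0xA9);
}
```

The token paste `u8##literal` attaches the prefix to the string literal, so call sites pass an ordinary unprefixed literal.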
Ben Barham
e2c9836a1d [CursorInfo] Add Clang documentation to SymbolGraph output
This currently doesn't check for inherited docs, i.e. either the
imported declaration has docs or it doesn't. There are also a few odd
cases with mixed doc types and when each line is prefixed with '*', but
it's good enough for an initial implementation.

Moves UTF8 sanitisation out of ASTPrinter.h and into Unicode.h so that
it can be used here as well.

Resolves rdar://91388603.
2022-04-08 13:46:38 -07:00
Becca Royal-Gordon
4bd532ab9a Don't import string macros with invalid UTF-8
Swift string literals are only permitted to contain well-formed UTF-8, but C does not share this restriction, and ClangImporter wasn't checking for that before it created `StringLiteralExpr`s for imported macros; this could cause crashes when importing a header. This commit makes us drop these macros instead.

Although invalid UTF-8 always *did* cause a segfault in my testing, I'm not convinced that there isn't a way to cause a miscompile with a bug like this. If we somehow did generate code that fed ill-formed UTF-8 to the builtin literal init for Swift.String, the resulting string could cause undefined behavior at runtime. So I have additionally added a defensive assertion to StringLiteralInst that any UTF-8 string represented in SIL is well-formed. Hopefully that will catch any non-crashing compiler bugs like this one.

Fixes rdar://67840900.
2022-01-26 20:57:13 -08:00
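
The kind of well-formedness check described above can be sketched as a standalone function. This is an illustration only, not the code ClangImporter or `StringLiteralInst` uses; the name `isWellFormedUTF8` is hypothetical. It rejects overlong encodings, surrogates, truncated sequences, and values above U+10FFFF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is well-formed UTF-8.
inline bool isWellFormedUTF8(const std::string &s) {
  const unsigned char *p = reinterpret_cast<const unsigned char *>(s.data());
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char b = p[i];
    if (b < 0x80) { ++i; continue; }             // ASCII
    std::size_t len = 0;
    unsigned cp = 0;
    if (b >= 0xC2 && b <= 0xDF) { len = 2; cp = b & 0x1F; }      // 0xC0/0xC1 are overlong
    else if (b >= 0xE0 && b <= 0xEF) { len = 3; cp = b & 0x0F; }
    else if (b >= 0xF0 && b <= 0xF4) { len = 4; cp = b & 0x07; }
    else return false;                           // stray continuation or invalid lead byte
    if (n - i < len) return false;               // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
      if ((p[i + k] & 0xC0) != 0x80) return false; // not a continuation byte
      cp = (cp << 6) | (p[i + k] & 0x3F);
    }
    if (len == 3 && cp < 0x800) return false;      // overlong 3-byte form
    if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return false;
    if (cp >= 0xD800 && cp <= 0xDFFF) return false; // UTF-16 surrogate
    i += len;
  }
  return true;
}

// Usage: a caller could drop a macro whose string value fails the check.
inline void exampleWellFormedChecks() {
  assert(isWellFormedUTF8("plain ASCII"));
  assert(!isWellFormedUTF8("\xFF"));
}
```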
Erik Eckstein
755f6aa2e4 AST, SIL: Remove UTF16 encoding from StringLiteralExpr and StringLiteralInst
The UTF16 encoding is no longer used. I think it became obsolete with the switch to the UTF8 String representation.
2020-08-06 19:09:09 +02:00
Michael Ilseman
bca1b74427 [Character] Permit tagged emoji Character literals
Loosen up the compiler's grapheme analysis to allow the emoji tagged
sequences as graphemes.
2018-11-15 09:06:20 -08:00
Michael Ilseman
f88fb9a97a Be a little more permissive in emoji grapheme literals.
The user experience with extended grapheme literals is currently:

1. Strict: we hard error on "invalid" grapheme literals.

2. Complete: we validate all literals to either be
known-single-grapheme or not.

3. Incorrect: we have Unicode 8 semantics implemented but applications
will have some other version of Unicode as dictated by the OS they are
running on.

In Swift 4.0, this incorrectness mostly crops up in obscure corner
case areas, where we are overly restrictive in some ways and overly
relaxed in others. But, there is one particularly embarrassing area
where it does come up: we reject emoji introduced after Unicode 8 as
grapheme literals, counter to common user expectations.

In a future (sub-)version of Swift we should completely re-evaluate
this user story, but doing so in time for Swift 4.0 is untenable. This
patch attempts to tweak the way in which we are incorrect in the most
minimally invasive way possible to preserve the same user experience
while permitting many post-Unicode-8 emoji as valid grapheme literals.

This change overrides processing of ZWJ and emoji modifiers to not
declare a grapheme break after/before, respectively.
2017-07-21 13:33:03 -07:00
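
The override described in the last paragraph can be sketched roughly as follows. This is a deliberately simplified illustration, not the compiler's segmentation code: all function names are hypothetical, and every scalar boundary is assumed to be a grapheme break unless one of the overrides applies, which ignores the rest of the Unicode segmentation rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

inline bool isZWJ(std::uint32_t scalar) { return scalar == 0x200D; }

// Emoji skin-tone modifiers occupy U+1F3FB through U+1F3FF.
inline bool isEmojiModifier(std::uint32_t scalar) {
  return scalar >= 0x1F3FB && scalar <= 0x1F3FF;
}

// Simplifying assumption: break everywhere except where an override applies.
inline bool hasBreakBetween(std::uint32_t lhs, std::uint32_t rhs) {
  if (isZWJ(lhs)) return false;           // override: no break after ZWJ
  if (isEmojiModifier(rhs)) return false; // override: no break before a modifier
  if (isZWJ(rhs)) return false;           // ZWJ attaches to the preceding scalar
  return true;
}

// A literal qualifies as a single grapheme if no break occurs inside it.
inline bool isSingleGrapheme(const std::vector<std::uint32_t> &scalars) {
  if (scalars.empty()) return false;
  for (std::size_t i = 1; i < scalars.size(); ++i)
    if (hasBreakBetween(scalars[i - 1], scalars[i])) return false;
  return true;
}

// Usage: thumbs-up plus a skin-tone modifier counts as one grapheme.
inline void exampleGraphemeChecks() {
  assert(isSingleGrapheme({0x1F44D, 0x1F3FD}));
  assert(!isSingleGrapheme({0x61, 0x62}));
}
```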
Bob Wilson
37e7d1c627 Merge remote-tracking branch 'origin/master' into master-next
2017-01-08 17:07:46 -08:00
practicalswift
6d1ae2a39c [gardening] 2016 → 2017
2017-01-06 16:41:22 +01:00
Bob Wilson
13da3fa8b1 Merge remote-tracking branch 'origin/master' into master-next
2016-12-04 18:16:09 -08:00
practicalswift
797b80765f [gardening] Use the correct base URL (https://swift.org) in references to the Swift website
Remove all references to the old non-TLS enabled base URL (http://swift.org)
2016-11-20 17:36:03 +01:00
Bob Wilson
c08a96a880 Update references to UTF* types and functions to match llvm r282822.
The content of LLVM's "Support/ConvertUTF.h" header was moved into the
"llvm" namespace. Update this code to match.
2016-10-13 17:05:51 -07:00
Saleem Abdulrasool
0c32cb3972 Basic: include missing header (#2239)
We use SmallVector, but do not include the header.  Include it explicitly.  NFC.
2016-04-18 19:50:17 -07:00
Zach Panzarino
e3a4147ac9 Update copyright date
2015-12-31 23:28:40 +00:00
Dmitri Hrybenko
f6e926224e Fix warnings about unused variables when assertions are turned off
Swift SVN r23641
2014-12-03 04:48:38 +00:00
Dmitri Hrybenko
8fa7112760 Fix 80-column violation
Swift SVN r22551
2014-10-06 21:01:22 +00:00
Roman Levenstein
97014172b7 [sil-combine] String literal concatenation optimization. Constant-fold concatenation of string literals known at compile-time.
Addresses rdar://17033696.

Swift SVN r21526
2014-08-28 11:33:21 +00:00
Dmitri Hrybenko
938e7c2676 stdlib: introduce UnicodeScalarLiteralConvertible protocol
This allows UnicodeScalars to be constructed from an integer, rather
than from a string.  Not only does this avoid an unnecessary memory
allocation (!) when creating a UnicodeScalar, it also allows the
compiler to statically check that the string contains a single scalar
value (in the same way the compiler checks that Character contains only
a single extended grapheme cluster).

rdar://17966622

Swift SVN r21198
2014-08-14 16:04:39 +00:00
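
A static "single scalar" check of the sort mentioned above can be sketched by counting UTF-8 lead bytes. The helper names are hypothetical and the input is assumed to be well-formed UTF-8 already; this is not the compiler's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts Unicode scalars in well-formed UTF-8 by counting every byte that is
// not a continuation byte (continuation bytes have the bit pattern 10xxxxxx).
inline std::size_t countUnicodeScalars(const std::string &utf8) {
  std::size_t count = 0;
  for (unsigned char byte : utf8)
    if ((byte & 0xC0) != 0x80) ++count;
  return count;
}

// Hypothetical check a compiler could run on a literal's contents.
inline bool containsSingleUnicodeScalar(const std::string &utf8) {
  return countUnicodeScalars(utf8) == 1;
}

// Usage: "e" followed by U+0301 is one grapheme cluster but two scalars.
inline void exampleScalarChecks() {
  assert(containsSingleUnicodeScalar("\xC3\xA9"));  // U+00E9
  assert(!containsSingleUnicodeScalar("e\xCC\x81")); // U+0065 U+0301
}
```

The second case shows why the scalar check and the grapheme cluster check are distinct: a literal can be a valid Character without being a valid UnicodeScalar.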
Dmitri Hrybenko
7704e19b7d libBasic: implement extended grapheme cluster segmentation algorithm
This is only for the frontend, not for stdlib.  The implementation is very
slow; optimizing it is the next step.

rdar://16755123 rdar://16013860


Swift SVN r18928
2014-06-16 14:20:43 +00:00
Dmitri Hrybenko
669f633070 Add "single extended grapheme cluster" literals (SEGCL) -- a subset of
double-quoted string literals that contain a single extended grapheme cluster

SEGCL by default infer type String, but you can ask to infer Character
for them.

Single quoted literals continue to infer Character.

Actual extended grapheme cluster segmentation is not implemented yet,
<rdar://problem/16755123> Implement extended grapheme cluster
segmentation in libSwiftBasic

This is part of
<rdar://problem/16363872> Remove single quoted characters

Swift SVN r17034
2014-04-29 14:08:16 +00:00