589 Commits

Author SHA1 Message Date
Rintaro Ishizaki
7b701d57d3 [Lexer] Improve diagnostics for single-quote string literal
Ignore the contents of interpolation.
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
e6e55c23e1 [Lexer] Code tweaks in lexStringLiteral()
NFC. Reorder code to improve readability.
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
1a4d37597f [Lexer] Remove unnecessary logic from lexCharacter
EOF and newline (in a non-multiline string literal) must be handled by
the call site, so lexCharacter doesn't need to handle them. Added an
assertion instead.
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
ba5172738c [Lexer] Advance pointer to the end of end-quote in lexCharacter
This simplifies main lexStringLiteral loop
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
893524ea77 [Lexer] Add assertion in advanceIfCustomDelimiter()
CurPtr[-1] must be '#' when called.
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
4536e69dd6 [Lexer] Don't emit diagnostics in skipToEndOfInterpolatedExpression()
Removed the Diags parameter from it.
Skipped bytes are revisited by the main lexer function anyway, so emitting
diagnostics here causes duplicated errors.
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
c9c69633c4 [Lexer] Simplify handling quotes in skipToEndOfInterpolatedExpression()
NFC
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
0e9b232755 [Lexer] Fix double diagnostics for unterminated string literal
We should not emit diagnostics in skipToEndOfInterpolatedExpression()
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
143f55a6e5 [Lexer] Add formStringLiteralToken dedicated for forming string literal 2018-09-19 18:58:54 +09:00
John Holdsworth
7ae5a7af6a Point link to Zero-width Jira (#19364) 2018-09-18 23:36:32 +09:00
John Holdsworth
e55d4254d2 [Parse] Allow multiline attribute messages (SE-200) (#19219)
Multiline string literal at attribute message position was disallowed in
59778f8ecb.

Reworked to at least get multiline strings working, which might be
useful as messages for attributes (for example, a detailed “unavailable”
annotation), minus the code that read off the start of the StringRef buffer.
2018-09-14 10:33:32 +09:00
Jordan Rose
e180bf9d4e [Parse] Tweak a utility function that relies on reading past the end (#19230)
Lexer::getEncodedStringSegment (now getEncodedStringSegmentImpl)
assumes that it can read one byte past the end of a string segment in
order to avoid bounds-checks on things like "is this a \r\n
sequence?". However, the function was being used for strings that did
not come from source, where this assumption was not always valid.
Change the reusable form of the function to always copy into a
temporary buffer, allowing the fast path to continue to be used for
normal parsing.

Caught by ASan!

rdar://problem/44306756
2018-09-11 20:11:58 -07:00
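
The copy-into-a-temporary-buffer idea from e180bf9d4e can be sketched roughly as follows. This is an illustration only, not the compiler's code: `normalizeNewlinesUnchecked` and `normalizeNewlines` are hypothetical names, and the "\r\n" handling is a simplified stand-in for what the real segment encoder does.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical fast path: collapses "\r\n" into "\n". It peeks at p[1] without
// a bounds check, so the caller must guarantee one readable byte at *end.
std::string normalizeNewlinesUnchecked(const char *begin, const char *end) {
  std::string out;
  for (const char *p = begin; p != end; ++p) {
    if (p[0] == '\r' && p[1] == '\n')  // may read *end when p == end - 1
      continue;                        // skip the '\r' of a "\r\n" pair
    out.push_back(*p);
  }
  return out;
}

// Hypothetical reusable form for text that did not come from a source buffer:
// copy first, so the one-byte lookahead always lands on owned memory.
std::string normalizeNewlines(std::string_view text) {
  std::string padded(text);  // std::string keeps a '\0' at data()[size()]
  const char *begin = padded.data();
  return normalizeNewlinesUnchecked(begin, begin + padded.size());
}
```

Callers that already hold a lexer-owned source buffer could keep using the unchecked form directly; everything else would go through the copying wrapper and pay for one allocation.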
Rintaro Ishizaki
59778f8ecb [Parse] Disable support for multiline/extended escaping string literal
in attribute message

Diagnostic message strings processed in EncodedDiagnosticMessage
aren't necessarily from parsed Swift source code. That means they might
not have quotes around them. Furthermore, the memory around them might not
be managed. The logic in 'Lexer::getEncodedStringSegment()' used to cause
an access violation.

For now, disable multiline string literal and extended escaping in string
literal for attribute message position. Considering the message might be
from Clang, we cannot simply enable this.

rdar://problem/44228891
2018-09-08 17:32:36 +09:00
John Holdsworth
4da8cbe655 Implement SE-0200 (extended escaping in string literals)
Supports string literals like #"foo"\n"bar"#.
2018-09-06 15:19:52 -07:00
Brent Royal-Gordon
df22ea1bfb Revert "[Parse] Implementation for SE-200 (raw strings)" 2018-09-06 12:22:41 -07:00
swift-ci
192587c98a Merge pull request #17668 from johnno1962a/master 2018-09-06 11:47:36 -07:00
John Holdsworth
9208c5bca6 Final nits in comments. 2018-09-06 14:20:19 +01:00
John Holdsworth
f0f08e1e86 Remove zero width detection for now 2018-09-05 21:51:19 +01:00
John Holdsworth
999bb40294 New diagnostic for closing delimiter 2018-09-04 20:21:20 +01:00
John Holdsworth
02f7cd5db6 generated zero-width characters 2018-09-02 23:59:34 +01:00
John Holdsworth
dc96342368 Response to xwu's review 2018-09-02 11:37:02 +01:00
John Holdsworth
3fc43bcb80 Check for zero-width characters in delimiters 2018-09-01 21:54:54 +01:00
John Holdsworth
032d865fa1 Response to rintaro's 2nd review 2018-08-31 18:51:37 +01:00
Jordan Rose
63cd1258ea Stop using SourceManager::getBufferIdentifierForLoc to find buffer IDs
The right way is findBufferContainingLoc. getBufferIdentifierForLoc is
both slower and wrong in the presence of #sourceLocation.

I couldn't come up with a test for the change in IDE/Utils.cpp because
refactoring still seems to be broken around #sourceLocation. I'll file
bugs for that.
2018-08-29 11:46:41 -07:00
John Holdsworth
9691076af0 Response to rintaro's review 2018-08-27 19:14:26 +01:00
John Holdsworth
4209b72a66 Delimiter specific diagnostic 2018-08-27 10:15:50 +01:00
John Holdsworth
6bd7cb884c Pragmatic support of multiline/delimited in attributes 2018-08-20 08:43:53 +01:00
John Holdsworth
7866093ea5 Extend token boundary to include delimiter 2018-08-17 02:12:01 +01:00
Alex Hoppen
ac512d4341 [libSyntax] Add a reference counted version of OwnedString
We cannot use unowned strings for token texts of incrementally parsed
syntax trees, since the source buffer that reused nodes refer to will
have been freed. Always copying the token text whenever
OwnedString is passed is too expensive. A reference counted copy of the
string allows us to keep the token's string alive across incremental
parses while eliminating unnecessary copies.
2018-08-13 15:37:53 -07:00
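
A minimal sketch of the reference-counting trade-off described in ac512d4341, assuming a hypothetical `SharedText` class built on `std::shared_ptr`. The actual OwnedString implementation is not shown in this log and may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical stand-in for a reference-counted token text.
class SharedText {
  std::shared_ptr<const std::string> Storage;

public:
  // Copies the bytes once, so the text outlives the buffer it came from.
  explicit SharedText(std::string_view text)
      : Storage(std::make_shared<std::string>(text)) {}

  std::string_view str() const { return *Storage; }

  // Number of SharedText objects currently sharing the same storage.
  long useCount() const { return Storage.use_count(); }
};
```

Copying a `SharedText` only bumps the reference count, so passing token text between parses costs no extra string copy, and the storage is released when the last holder goes away.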
John Holdsworth
74dd71ca9b Delimited/Raw strings inside interpolations 2018-08-10 17:27:38 +01:00
Jordan Rose
fc9ea1e329 Add Lexer::IsHashbangAllowed, drop SourceManager::getHashbangBufferID (#18534)
Having this be a single buffer hardcoded in the SourceManager and set
by all clients is silly. SourceFiles with the 'Main' kind are allowed
to have hashbang lines (`#!`), other files are not. And anyone
manually setting up a Lexer can decide for themselves.

No intended behavioral change.
2018-08-07 08:25:05 -07:00
Rintaro Ishizaki
1aacb8fefb Merge pull request #17788 from rintaro/parse-identifier-drop3
[Parse] Drop Swift3 support for '$', 'throws', and 'rethrows' as identifier
2018-07-25 19:36:39 +09:00
Rintaro Ishizaki
21db9723e4 [Lexer] Don't include backtick length into comment length
Although backtick is a kind of trivia piece, Token::getCommentRange doesn't
take it into account. Invalid Token::getCommentRange used to cause
compiler crash.

rdar://problem/42492793
https://bugs.swift.org/browse/SR-8315
2018-07-24 04:37:07 +09:00
John Holdsworth
2317048f44 Alternative implementation for raw strings 2018-07-09 18:38:54 +01:00
Rintaro Ishizaki
a810908da1 [Parse] Drop Swift3 support for '$' as an identifier
Swift3 used to parse '$' as an identifier without diagnostics.

Related 6accc5989e
2018-07-06 18:04:26 +09:00
John Holdsworth
14213b84bd Revised implementation for raw strings 2018-07-02 12:54:06 +01:00
Rintaro Ishizaki
da764e16f2 [Lexer][QoI] Diagnose and fix-it consecutive 'U+00A0's at once 2018-05-01 20:31:37 +09:00
Ahmad Alhashemi
036b2f534d Minor style edits 2018-04-23 13:21:06 -04:00
Ahmad Alhashemi
e8c17b6686 Move non-breaking space handling to lexUnknown 2018-04-22 15:54:07 -04:00
Ahmad Alhashemi
1603ec2bee [Parser] Detect nonbreaking space U+00A0 and fixit 2018-04-22 15:54:07 -04:00
Huon Wilson
00c32698e2 [Parse] Put indent-is-4-spaces assumption in one place.
This single location can, theoretically, be made more intelligent about
deducing indent from elsewhere and all the consumers will just work.
2018-04-04 10:34:33 +10:00
Rintaro Ishizaki
7237875870 [Parse] Eliminate square_lit token 2018-03-14 21:50:53 +09:00
omochimetaru
3c8057e13f [Parse] Remove unnecessary Lexer fields 2018-03-12 23:35:13 +09:00
Rintaro Ishizaki
1dc3e17f52 [Lexer] Performance improvement for Lexer::kindOfIdentifier (#15168)
Don't check SIL_KEYWORDs when we're not in SIL mode.
Lexer::kindOfIdentifier is a very hot path because many of the
tokens in a Swift source file are keywords or identifiers.
2018-03-12 15:03:58 +09:00
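
The shape of the optimization in 1dc3e17f52 can be sketched like this. The `TokKind` enum, the keyword set, and the `inSILMode` parameter are illustrative assumptions, not the real Lexer interface.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical, heavily reduced token kinds.
enum class TokKind { identifier, kw_func, kw_let, kw_sil_stage };

// Classifies an identifier-like spelling. SIL-only keywords are compared
// only when inSILMode is true, so ordinary Swift files skip those checks.
TokKind kindOfIdentifier(std::string_view text, bool inSILMode) {
  if (text == "func") return TokKind::kw_func;
  if (text == "let") return TokKind::kw_let;
  if (inSILMode) {
    if (text == "sil_stage") return TokKind::kw_sil_stage;
  }
  return TokKind::identifier;
}
```

Outside SIL mode a SIL keyword spelling falls through and is treated as a plain identifier, which is what lets the extra comparisons be skipped.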
omochimetaru
8592496b22 [Parse] Remove dead code in lexHash (#15154) 2018-03-12 08:47:54 +09:00
omochimetaru
5e555868fe [Parse] Reduce branch by running lexTrivia always (#15137) 2018-03-10 21:26:54 +09:00
Sho Ikeda
74ba135008 Merge pull request #15040 from ikesyo/gardening-not-empty
[gardening] Use `!empty()` over `size() > 0`
2018-03-08 18:47:12 +09:00
omochimetaru
c4c5b7130e [Parse] Fix error in unsigned char target (#15068) 2018-03-08 15:57:43 +09:00
omochimetaru
967a48cb77 [Parse] refactor Lexer initialization 2018-03-08 11:17:51 +09:00
Sho Ikeda
cea6c03eb2 [gardening] Use !empty() over size() > 0 2018-03-08 09:21:09 +09:00