Multiline string literal at attribute message position was disallowed in
59778f8ecb.
Reworked to try to at least get multiline strings working which might be
useful as messages for attributes (for example a detailed “unavailable”
annotation) minus the code which read off the start of the StringRef buffer.
Lexer::getEncodedStringSegment (now getEncodedStringSegmentImpl)
assumes that it can read one byte past the end of a string segment in
order to avoid bounds-checks on things like "is this a \r\n
sequence?". However, the function was being used for strings that did
not come from source where this assumption was not always valid.
Change the reusable form of the function to always copy into a
temporary buffer, allowing the fast path to continue to be used for
normal parsing.
Caught by ASan!
rdar://problem/44306756
in attribute message
Strings of diagnostics message processed in EncodedDiagnosticMessage
aren't necessarily from parsed Swift source code. That means, they might
not have quotes around it. Furthermore, memory around them might not be
managed. The logic in 'Lexer::getEncodedStringSegment()' used to cause
access violation.
For now, disable multiline string literal and extended escaping in string
literal for attribute message position. Considering the message might be
from Clang, we cannot simply enable this.
rdar://problem/44228891
The right way is findBufferContainingLoc. getBufferIdentifierForLoc is
both slower and wrong in the presence of #sourceLocation.
I couldn't come up with a test for the change in IDE/Utils.cpp because
refactoring still seems to be broken around #sourceLocation. I'll file
bugs for that.
We cannot use unowned strings for token texts of incrementally parsed
syntax trees since the source buffer to which reused nodes refer will
have been freed for reused nodes. Always copying the token text whenever
OwnedString is passed is too expensive. A reference counted copy of the
string allows us to keep the token's string alive across incremental
parses while eliminating unnecessary copies.
Having this be a single buffer hardcoded in the SourceManager and set
by all clients is silly. SourceFiles with the 'Main' kind are allowed
to have hashbang lines (`#!`), other files are not. And anyone
manually setting up a Lexer can decide for themselves.
No intended behavioral change.
Although backtick is a kind of trivia piece, Token::getCommentRange doesn't
take it into account. Invalid Token::getCommentRange used to cause
compiler crash.
rdar://problem/42492793
https://bugs.swift.org/browse/SR-8315
Don't check SIL_KEYWORDs when we're not in SIL mode.
Lexer::kindOfIdentifier is super hot-path because many of
tokens in a swift source file are keywords or identifiers.