Commit Graph

49 Commits

Author SHA1 Message Date
Ben Barham
9779c18da3 Rename startswith to starts_with
LLVM is presumably moving towards `std::string_view` -
`StringRef::startswith` is deprecated on tip. `SmallString::startswith`
was just renamed there (maybe with some small deprecation in between, but
if so, we've missed it).

The `SmallString::startswith` references were moved to
`.str().starts_with()`, rather than adding the `starts_with` on
`stable/20230725` as we only had a few of them. Open to switching that
over if anyone feels strongly though.
2024-03-13 22:25:47 -07:00
pinkjuice66
50d6b1f663 [Parse] Add test cases for validating UTF-8 correctness 2024-01-09 19:31:00 +09:00
Alex Hoppen
fe2ae72ad2 [IDE] Rename CodeCompletion to IDEInspection in cases where the code path no longer exclusively applies to code completion
The code completion infrastructure is also being used for cursor info now, so it should no longer be called code completion.

rdar://103251187
2022-12-13 11:41:05 +01:00
Robert Widmann
37e7052c68 Remove -emit-syntax and -verify-syntax-tree 2022-11-16 15:07:48 -08:00
Robert Widmann
d1cadb86c5 Remove Explicit Trivia from Lexer 2022-11-16 14:52:28 -08:00
Rintaro Ishizaki
a673043737 Terminology change: 'garbage' -> 'unexpected'
There are no "garbage" characters in Swift code. They are just
"unexpected."
2022-08-15 14:32:28 -07:00
Hamish Knight
080d59b3df [Lexer] Delay token diagnostics
Queue up diagnostics when lexing, waiting until
`Lexer::lex` is called before emitting them. This
allows us to re-lex without having to deal with
previously invalid tokens.
2022-04-12 16:03:47 +01:00
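The queueing idea described in 080d59b3df can be sketched roughly as follows. This is a toy illustration, not the Swift `Lexer`: `TinyLexer`, `PendingDiagnostic`, `resetTo`, and the rule that `@` is invalid are all hypothetical. The only point it shows is that diagnostics found while pre-lexing a token stay in a queue until `lex()` hands that token out, so re-lexing from an earlier position does not report the same problem twice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature; none of these names come from the Swift sources.
struct PendingDiagnostic {
  std::size_t offset;   // byte offset of the offending character
  std::string message;
};

class TinyLexer {
public:
  explicit TinyLexer(std::string source) : source_(std::move(source)) {
    scanNext();  // pre-lex the first token; its diagnostics stay queued
  }

  // Hands out the pre-lexed token, publishes its queued diagnostics,
  // then pre-lexes the following token.
  std::string lex() {
    std::string result = next_;
    emitted_.insert(emitted_.end(), pending_.begin(), pending_.end());
    pending_.clear();
    scanNext();
    return result;
  }

  // Re-lex from an earlier offset. Diagnostics queued for the token that
  // was never handed out are dropped instead of being reported twice.
  void resetTo(std::size_t offset) {
    assert(offset <= source_.size());
    pending_.clear();
    pos_ = offset;
    scanNext();
  }

  const std::vector<PendingDiagnostic>& emitted() const { return emitted_; }

private:
  // Tokens are space-separated words; '@' is treated as invalid (assumption).
  void scanNext() {
    next_.clear();
    while (pos_ < source_.size() && source_[pos_] == ' ') ++pos_;
    while (pos_ < source_.size() && source_[pos_] != ' ') {
      if (source_[pos_] == '@') pending_.push_back({pos_, "unexpected '@'"});
      next_ += source_[pos_++];
    }
  }

  std::string source_;
  std::size_t pos_ = 0;
  std::string next_;
  std::vector<PendingDiagnostic> pending_;
  std::vector<PendingDiagnostic> emitted_;
};
```

With the input `"a @b"`, calling `lex()`, then `resetTo(1)`, then `lex()` again leaves exactly one entry in `emitted()`, even though the `@` was scanned twice.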
Alex Hoppen
a7641a7cd2 [Lexer] Adjust tests for new delayed trivia lexing 2021-02-05 08:15:54 +01:00
Owen Voorhees
45bc578ae5 [SourceManager] Rename line and column APIs for clarity 2020-05-21 12:54:07 -05:00
Owen Voorhees
8a6711769e [Diagnostics] Refactor DiagnosticConsumer interface
DiagnosticInfo now holds all the information needed to consume
a diagnostic, so remove unneeded parameters from handleDiagnostic.
2019-10-29 13:52:12 -07:00
Gwen Mittertreiner
67cfef2d60 SectionRef::getContents() now returns Expected
Updated uses of object::SectionRef::getContents() since it now returns
an Expected<StringRef> instead of modifying the one it's passed.

See also: git-svn-id:
https://llvm.org/svn/llvm-project/llvm/trunk@360892
91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-20 16:02:41 -07:00
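The API shape change in 67cfef2d60 (returning a value-or-error instead of filling in a caller-supplied `StringRef`) can be illustrated with a small standard-library sketch. `Result<T>`, `Section`, `getContentsOld`, and `getContents` below are hypothetical stand-ins written for this example; they are not `llvm::Expected` or `object::SectionRef`, and the old signature shown is only an assumed shape.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Hypothetical stand-in for a value-or-error type in the spirit of
// llvm::Expected<T>; it is not the LLVM class.
template <typename T>
class Result {
  struct Error { std::string message; };

public:
  Result(T value) : storage_(std::move(value)) {}
  static Result failure(std::string message) {
    return Result(Error{std::move(message)});
  }

  bool ok() const { return std::holds_alternative<T>(storage_); }
  const T& value() const { assert(ok()); return std::get<T>(storage_); }
  const std::string& error() const {
    assert(!ok());
    return std::get<Error>(storage_).message;
  }

private:
  explicit Result(Error e) : storage_(std::move(e)) {}
  std::variant<T, Error> storage_;
};

// Hypothetical section type used only for this sketch.
struct Section {
  std::string_view bytes;
  bool readable;
};

// Assumed old shape: status in the return value, contents written into an
// out-parameter that the caller must remember not to use on failure.
bool getContentsOld(const Section& s, std::string_view& out) {
  if (!s.readable) return false;
  out = s.bytes;
  return true;
}

// New shape: the contents and the error travel together in the result.
Result<std::string_view> getContents(const Section& s) {
  if (!s.readable)
    return Result<std::string_view>::failure("section is not readable");
  return s.bytes;
}
```

Call sites change from "declare a variable, pass it in, check the status" to "check the returned result, then read its value", which is the kind of update the commit applies.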
David Ungar
44daa88ebd Format 2019-04-03 12:53:31 -07:00
David Ungar
9cc3a4a9d8 Rename defaultDiagnosticLoc to bufferIndirectlyCausingDiagnostic. 2019-04-03 12:52:49 -07:00
David Ungar
c139c5909a Pass defaultDiagnosticLoc to handleDiagnostic, not currentPrimaryInput. 2019-04-02 22:27:55 -07:00
David Ungar
7a0e0ffc8a Store current primary in diagnostic engine, pass it down via handleDiagnostic. Unformatted. 2019-04-02 00:43:28 -07:00
Harlan Haskins
f5fc6f0c57 [Lexer] Handle SwiftInterface files as well as SIL
Previously, the Lexer kept a single flag for whether we’re lexing Swift or SIL. Instead, keep track of whether we’re parsing Swift, SIL, or a .swiftinterface file. .swiftinterface files allow $-prefixed identifiers anywhere.
2019-01-22 11:02:36 -08:00
Argyrios Kyrtzidis
5b1aab1cc6 [unittests/Parse] Update unit tests to accommodate ParsedTrivia introduction 2019-01-17 13:47:29 -08:00
Jordan Rose
e180bf9d4e [Parse] Tweak a utility function that relies on reading past the end (#19230)
Lexer::getEncodedStringSegment (now getEncodedStringSegmentImpl)
assumes that it can read one byte past the end of a string segment in
order to avoid bounds-checks on things like "is this a \r\n
sequence?". However, the function was being used for strings that did
not come from source where this assumption was not always valid.
Change the reusable form of the function to always copy into a
temporary buffer, allowing the fast path to continue to be used for
normal parsing.

Caught by ASan!

rdar://problem/44306756
2018-09-11 20:11:58 -07:00
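The pattern from e180bf9d4e, a fast implementation that may peek one byte past its range plus a reusable wrapper that copies first, can be sketched like this. The functions below are hypothetical and only normalize `\r\n` to `\n`; they are not `Lexer::getEncodedStringSegment`, which does considerably more.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Fast path. Precondition (assumed, mirroring the commit message): the byte
// at *end is readable, e.g. because the range lies inside a larger
// NUL-terminated source buffer.
std::string normalizeNewlinesImpl(const char* begin, const char* end) {
  std::string out;
  for (const char* p = begin; p < end; ++p) {
    if (*p == '\r') {
      // No bounds check: when p is the last byte, p[1] reads *end.
      if (p[1] == '\n') ++p;
      out += '\n';
    } else {
      out += *p;
    }
  }
  return out;
}

// Reusable form for text that did not come from a source buffer: copy into
// a temporary std::string, whose data()[size()] is guaranteed to be '\0',
// so the one-byte peek above is always in bounds.
std::string normalizeNewlines(std::string_view text) {
  std::string copy(text);
  return normalizeNewlinesImpl(copy.data(), copy.data() + copy.size());
}
```

The copy costs an allocation, so in this sketch, as in the commit, only callers that cannot guarantee the precondition go through the wrapper; callers with a NUL-terminated buffer keep using the unchecked path.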
Jordan Rose
63cd1258ea Stop using SourceManager::getBufferIdentifierForLoc to find buffer IDs
The right way is findBufferContainingLoc. getBufferIdentifierForLoc is
both slower and wrong in the presence of #sourceLocation.

I couldn't come up with a test for the change in IDE/Utils.cpp because
refactoring still seems to be broken around #sourceLocation. I'll file
bugs for that.
2018-08-29 11:46:41 -07:00
Jordan Rose
fc9ea1e329 Add Lexer::IsHashbangAllowed, drop SourceManager::getHashbangBufferID (#18534)
Having this be a single buffer hardcoded in the SourceManager and set
by all clients is silly. SourceFiles with the 'Main' kind are allowed
to have hashbang lines (`#!`), other files are not. And anyone
manually setting up a Lexer can decide for themselves.

No intended behavioral change.
2018-08-07 08:25:05 -07:00
omochimetaru
967a48cb77 [Parse] refactor Lexer initialization 2018-03-08 11:17:51 +09:00
omochimetaru
1588fb914b [Parse] test current buggy behavior 2018-03-08 11:17:51 +09:00
omochimetaru
d12542503f [Syntax] test diagnostics in Lexer with libSyntax (#14954) 2018-03-04 08:53:54 +09:00
omochimetaru
ebd5323b42 [Syntax] add UTF-8 BOM support to libSyntax 2017-12-28 01:26:09 +09:00
omochimetaru
c8a6ef13da [Parse] add test about BOM + trivia 2017-12-27 18:06:18 +09:00
Rintaro Ishizaki
b4e7f74ab4 [Syntax] Add dedicated unittest for Lexer with trivia parsing 2017-12-19 09:14:20 +09:00
omochimetaru
ed58c152bf [Parse] Improve Lexer's UTF-8 BOM handling (#13483)
* Add BOM handling testcases
* Add ContentStart to Lexer for BOM handling
2017-12-18 17:22:11 +09:00
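As a rough illustration of what a "content start" position means for BOM handling in ed58c152bf: a UTF-8 byte order mark is the three bytes `EF BB BF` at the very beginning of a buffer, and lexing can begin just after it. The helper name `contentStartOffset` is hypothetical and is not the `ContentStart` member the commit adds.

```cpp
#include <cstddef>
#include <string_view>

// Returns the offset at which real content begins: 3 when the buffer opens
// with a UTF-8 BOM (EF BB BF), otherwise 0. Hypothetical helper.
std::size_t contentStartOffset(std::string_view buffer) {
  constexpr std::string_view bom = "\xEF\xBB\xBF";
  return buffer.substr(0, bom.size()) == bom ? bom.size() : 0;
}
```

A buffer that contains only part of the mark, such as `EF BB`, is treated as having no BOM in this sketch, so its content starts at offset 0.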
Rintaro Ishizaki
60bfa893b9 [Parse] Make PersistentParserState to hold ParserPosition
instead of PersistentParserState::ParserPos.
2017-12-09 13:58:50 +09:00
Rintaro Ishizaki
41c2cf2845 [Parse] Add test for Lexer::getStateForBeginningOfToken() with Trivia 2017-12-04 10:46:03 -08:00
David Farler
f450f0ccdf Revert "Preserve whitespace and comments during lexing as Trivia"
This reverts commit d6e2b58382.
2016-11-18 13:23:31 -08:00
David Farler
d6e2b58382 Preserve whitespace and comments during lexing as Trivia
Store leading and trailing "trivia" around a token, such as whitespace,
comments, doc comments, and escaping backticks. These are syntactically
important for preserving formatting when printing ASTs but don't
semantically affect the program.

Tokens take all trailing trivia up to, but not including, the next
newline. This is important to maintain checks that statements without
semicolon separators start on a new line, among other things.

Trivia are now data attached to the ends of tokens, not tokens
themselves.

Create a new Syntax sublibrary for upcoming immutable, persistent,
thread-safe ASTs, which will contain only the syntactic information
about source structure, as well as for generating new source code, and
structural editing. Proactively move swift::Token into there.

Since this patch is getting a bit large, a token fuzzer which checks
for round-trip equivalence with the workflow:

fuzzer => token stream => file1
  => Lexer => token stream => file 2 => diff(file1, file2)

Will arrive in a subsequent commit.

This patch does not change the grammar.
2016-11-15 16:11:57 -08:00
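The trivia attachment rule described in d6e2b58382 (a token takes trailing trivia up to, but not including, the next newline, and everything else becomes leading trivia of the following token) can be sketched with a toy lexer. The `Token` struct, `lexWithTrivia`, and `print` are hypothetical; only spaces and newlines count as trivia here, whereas the commit also covers comments and backticks.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical token shape: trivia is data attached to the token, not a
// token of its own.
struct Token {
  std::string leadingTrivia;
  std::string text;            // empty for the end-of-file token
  std::string trailingTrivia;
};

std::vector<Token> lexWithTrivia(std::string_view src) {
  std::vector<Token> tokens;
  std::size_t i = 0;
  while (true) {
    Token tok;
    // Leading trivia: any whitespace, newlines included.
    while (i < src.size() && (src[i] == ' ' || src[i] == '\n'))
      tok.leadingTrivia += src[i++];
    // Token text: a run of non-whitespace characters.
    while (i < src.size() && src[i] != ' ' && src[i] != '\n')
      tok.text += src[i++];
    // Trailing trivia: spaces up to, but not including, the next newline.
    while (i < src.size() && src[i] == ' ')
      tok.trailingTrivia += src[i++];
    tokens.push_back(tok);
    if (tok.text.empty()) break;  // end of input reached
  }
  return tokens;
}

// Printing every token with its trivia reproduces the input exactly.
std::string print(const std::vector<Token>& tokens) {
  std::string out;
  for (const Token& t : tokens)
    out += t.leadingTrivia + t.text + t.trailingTrivia;
  return out;
}
```

Because the newline always lands in the leading trivia of the next token, a parser can still ask "does this token start on a new line?" by inspecting that token alone, and `print(lexWithTrivia(src)) == src` gives the round-trip property the commit's planned fuzzer checks.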
Chris Lattner
0001dc27bb remove support for the experimental "character literals" feature.
Swift SVN r30509
2015-07-22 22:35:19 +00:00
Argyrios Kyrtzidis
a935e7c13e [Lexer] Recognize editor placeholders as identifiers and provide a specific error when encountered.
Swift SVN r26212
2015-03-17 01:52:59 +00:00
Dmitri Hrybenko
4d0a6d7db8 Update unittests for LLVM API changes in MemoryBuffer
Swift SVN r21411
2014-08-22 08:46:53 +00:00
Ted Kremenek
7da31bdfdd Disable parsing of single quoted character literals, enabling under a flag.
I didn't want to rip this logic out wholesale.  There is a possibility
the character lexing can be reborn/revisited later, and
disabling it in the parser was easy.

Swift SVN r18102
2014-05-15 07:05:59 +00:00
Dmitri Hrybenko
65cf5f2098 Lexer: compute ArtificialEOF correctly in a sublexer of a sublexer
This fixes code completion crash in rdar://15561934, but there are still no
code completion results in interpolated string literals.


Swift SVN r14539
2014-02-28 23:03:06 +00:00
Dmitri Hrybenko
d681b81641 Revert my r14516, it breaks the buildbot
Swift SVN r14518
2014-02-28 15:28:39 +00:00
Dmitri Hrybenko
3bb9166405 Lexer: don't advance current pointer past end of the source buffer in case
there is a string literal with embedded NUL just before EOF

This used to crash, rdar://15561934


Swift SVN r14516
2014-02-28 14:28:25 +00:00
John McCall
10ac15ed0d Lex $notAllDigits as an identifier and diagnose it in the lexer
outside of debugger-support mode.  Rip out the existing special-case
code when parsing expr-identifier.

This means that the Lexer needs a LangOptions.  Doug and I
talked about just adding that as a field of SourceMgr, but
decided that it was worth it to preserve the possibility of
parsing different dialects in different source files.

By design, the lexer doesn't tokenize fundamentally differently
in different language modes; it might decide something is invalid,
or it might (eventually) use a different token kind for the
same consumed text, but we don't want it deciding to consume more or
less of the stream per token.

Note that SIL mode does make that kind of difference, and that
arguably means that various APIs for tokenizing need to take a
"is SIL mode" flag, but we're getting away with it because we
just don't really care about fidelity of SIL source files.

rdar://14899000

Swift SVN r13896
2014-02-14 01:54:17 +00:00
Argyrios Kyrtzidis
44d46de7c9 Use swift::SourceManager's addNewSourceBuffer() instead of llvm::SourceMgr's AddNewSourceBuffer().
Also remove the SourceLoc parameter from addNewSourceBuffer(). In llvm::SourceMgr
it is used to indicate textual inclusion, which we don't have in swift.

Swift SVN r10014
2013-11-07 00:51:56 +00:00
Argyrios Kyrtzidis
5db368ce7b [Lexer] Introduce Lexer::getLocForStartOfToken() that returns the location at the start of the token that a given offset points to.
Swift SVN r8281
2013-09-16 18:41:16 +00:00
Dmitri Hrybenko
8a8b97985b Lexer: fix a bug where restoring lexer state could produce duplicate
code_complete tokens


Swift SVN r7636
2013-08-27 22:34:29 +00:00
Dmitri Hrybenko
dc439adb59 Lexer: improve recovery for invalid character literals and invalid escape
sequences in character literals

Now we return tok::character_literal with REPLACEMENT CHARACTER U+FFFD instead
of tok::unknown.


Swift SVN r7475
2013-08-22 20:56:47 +00:00
Argyrios Kyrtzidis
4908da8361 [Lexer] Remove the public Lexer constructor that accepts a StringRef.
Replace uses of it with the newly introduced constructor that accepts a buffer ID.
The StringRef constructor was rather unsafe since it had the implicit requirement that the StringRef
was null-terminated.

Swift SVN r6942
2013-08-06 14:59:03 +00:00
Dmitri Hrybenko
e1c4ae3174 Wrap llvm::SourceMgr in swift::SourceManager so that we can add new members
to the source manager.


Swift SVN r6815
2013-08-01 20:39:22 +00:00
Dmitri Hrybenko
68bba56cec Lexer tests: simplify helper function: no need to pass down a SourceMgr
Swift SVN r6810
2013-08-01 18:39:20 +00:00
Dmitri Hrybenko
72ae1fd842 swift::tokenize: don't include tok::eof, per feedback from Argyrios
Swift SVN r6716
2013-07-29 22:23:31 +00:00
Dmitri Hrybenko
464df1cc11 Lexer: make tok::eof length equal to 0
It used to be equal to 1, which makes Lexer::getLocForEndOfToken() return
an out-of-bounds location for tok::eof.


Swift SVN r6626
2013-07-26 00:14:09 +00:00
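The arithmetic behind 464df1cc11 is small enough to show directly. In this hypothetical model (`Tok`, `endOffset`, and `makeEOF` are not Swift compiler names), a token's end is its start plus its length, and the end-of-file token starts at the buffer size, so any nonzero length pushes its end out of bounds.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical token position: offsets are bytes from the buffer start.
struct Tok {
  std::size_t start;
  std::size_t length;
};

// End of a token is one past its last byte.
std::size_t endOffset(const Tok& t) { return t.start + t.length; }

// The EOF token sits at the very end of the buffer and covers no bytes.
Tok makeEOF(std::string_view buffer) { return Tok{buffer.size(), 0}; }
```

For a two-byte buffer, `endOffset(makeEOF(buffer))` is 2, which is still a valid one-past-the-end position. With a length of 1 it would be 3, matching the out-of-bounds location the commit message describes.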
Argyrios Kyrtzidis
6e7d0490f7 Allow optionally to produce comment tokens when lexing and add a tokenize() utility function.
Swift SVN r6008
2013-07-05 15:02:42 +00:00