588 Commits

Author SHA1 Message Date
omochimetaru
22cddbf033 [Syntax] Parse invalid chars as trivia 2018-03-06 08:25:09 -08:00
omochimetaru
b4192d80e9 [Parse] EmitDiagnosticsIfToken in lexUnknown
It is needed for a future lexTrivia update.
2018-03-06 08:25:09 -08:00
omochimetaru
e6f42fc63d [Parse] split lexUnknown function from lexImpl 2018-03-06 08:25:09 -08:00
omochimetaru
0a69bd7f8d [NFC] readable control flow in lexImpl default case
There are 3 token flows and 2 trivia flows.
This commit makes them clearer and more readable.

Also apply clang-format.
2018-03-06 08:25:09 -08:00
omochimetaru
3e252c98a8 [NFC] Fix naming style in lexImpl default case 2018-03-06 08:25:09 -08:00
omochimetaru
190af6c063 [Syntax] support nul character as garbage trivia 2018-03-05 16:53:24 +09:00
omochimetaru
58857fa1cb [Parse] refactor Lexer by adding NulCharacterKind type 2018-03-05 16:53:18 +09:00
Sho Ikeda
26d650292f [gardening] Use empty() over size() == 0 2018-03-05 14:43:13 +09:00
Rintaro Ishizaki
766774206b [Lexer] Don't setEscapedIdentifier(true) for tok::eof at ArtificialEOF
https://bugs.swift.org/browse/SR-6926

This happens when the Parser re-lexes comment tokens, which sets
ArtificialEOF at the end of the comment range.
It used to cause an assertion failure:
(!value || Kind == tok::identifier) && "only identifiers can be escaped identifiers"
2018-02-12 14:58:12 +09:00
Rintaro Ishizaki
0780c529c4 [Syntax] Unify RawSyntax and RawTokenSyntax using union and TrailingObjects
It better matches the SwiftSyntax model.

Using TrailingObjects reduces the number of heap allocations, which
yields an 18% performance improvement.
2018-01-18 14:49:46 +09:00
Erik Eckstein
a680768971 SIL: In textual SIL allow global SIL names starting with '$'.
For example: @$S1m3fooyyF
It's needed to change the mangling prefix to $S.
The parser change only affects SIL (and not swift).

I didn't add a test case because it will be fully tested when changing the mangling prefix.
2018-01-05 11:29:15 -08:00
omochimetaru
bc88330740 [Parse] Lexer builds backtick trivia around escaped identifier token 2017-12-29 00:22:49 +09:00
omochimetaru
ebd5323b42 [Syntax] add UTF-8 BOM support to libSyntax 2017-12-28 01:26:09 +09:00
omochimetaru
861ee3a112 [Parse] use pre increment for simple increment (#13624) 2017-12-27 15:46:58 +09:00
omochimetaru
70986a687f [Parse] fix lexTrivia LF bug 2017-12-22 14:04:14 +09:00
omochimetaru
fbe34e0f6f [Parse] improve LF handling efficiency. 2017-12-22 01:23:25 +09:00
omochimetaru
f86e1c8201 [Parse] add CRLF support in lexTrivia 2017-12-21 14:27:13 +09:00
omochimetaru
9daeaf0d06 [Parse] refactor lexTrivia with squash 2017-12-20 14:09:47 +09:00
omochimetaru
24509a0bde [Parse] Change LeadingTrivia type to Trivia 2017-12-20 14:09:47 +09:00
Rintaro Ishizaki
cc72a3b934 [Lexer] Use ContentStart position for hashbang trivia 2017-12-19 09:24:34 +09:00
Rintaro Ishizaki
2c06060165 [Syntax] Add CarriageReturn trivia kind
To distinguish '\r' from '\n'.
2017-12-19 09:24:34 +09:00
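The distinction made by the commit above can be illustrated with a minimal sketch. It is not the Lexer's actual code: `TriviaKind`, `TriviaPiece`, and `lexLineEndings` are hypothetical names chosen for this example, and only the two line-ending kinds are modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical trivia kinds; only the two relevant to line endings are shown.
enum class TriviaKind { Newline, CarriageReturn };

struct TriviaPiece {
  TriviaKind Kind;
  std::size_t Count; // number of consecutive characters of this kind
};

// Collects leading '\n' and '\r' characters, keeping the two kinds separate
// so the original text can be reproduced exactly.
std::vector<TriviaPiece> lexLineEndings(const std::string &Text) {
  std::vector<TriviaPiece> Pieces;
  for (char C : Text) {
    TriviaKind Kind;
    if (C == '\n')
      Kind = TriviaKind::Newline;
    else if (C == '\r')
      Kind = TriviaKind::CarriageReturn;
    else
      break; // stop at the first character that is not a line ending
    if (!Pieces.empty() && Pieces.back().Kind == Kind)
      ++Pieces.back().Count; // extend the current run
    else
      Pieces.push_back({Kind, 1}); // start a new run
  }
  return Pieces;
}
```

With separate kinds, an input such as "\r\r\n" yields one CarriageReturn piece of count 2 followed by one Newline piece of count 1, so printing the pieces back reproduces the source bytes.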
Rintaro Ishizaki
181333ce0f [Lexer] Lex conflict marker as a trivia 2017-12-19 09:24:33 +09:00
omochimetaru
5de598f34a [Parse] use skipHashbang in lexTrivia 2017-12-18 18:22:04 +09:00
omochimetaru
aeb9ba6f96 [Parse] use skipSlashSlashComment in lexTrivia 2017-12-18 18:22:04 +09:00
omochimetaru
f7136ae635 [Parse] delete skipUpToEndOfLine 2017-12-18 18:22:04 +09:00
omochimetaru
ed58c152bf [Parse] Improve Lexer's UTF-8 BOM handling (#13483)
* Add BOM handling testcases
* Add ContentStart to Lexer for BOM handling
2017-12-18 17:22:11 +09:00
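The idea of a content start position can be sketched as below. This is an assumption-laden illustration, not the Lexer's implementation: `contentStartOffset` is a hypothetical helper, and the only fact relied on is that the UTF-8 byte order mark is the three bytes EF BB BF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the offset where real content begins: 3 if the buffer starts with
// the UTF-8 byte order mark (EF BB BF), otherwise 0.
std::size_t contentStartOffset(const std::string &Buffer) {
  static const char BOM[] = "\xEF\xBB\xBF";
  // compare() clamps the range to the buffer size, so short buffers are safe.
  if (Buffer.compare(0, 3, BOM) == 0)
    return 3;
  return 0;
}
```

A lexer can then begin tokenizing at the returned offset while still keeping the BOM bytes in the buffer.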
Rintaro Ishizaki
5571e5cc76 [Lexer] Clear trivia at the top of lexImpl()
To make sure we only parse trivia for the current token.
2017-12-08 12:08:05 +09:00
Rintaro Ishizaki
9b32c62fbf [Lexer] Add TODO/FIXMEs for lexTrivia 2017-12-08 12:08:05 +09:00
Rintaro Ishizaki
2b1e316cf6 [Syntax] Add parsing hashbang (shebang) as a trivia.
Added GarbageText trivia kind for any skipped text.
2017-12-08 12:07:00 +09:00
Rintaro Ishizaki
e7a393f13f [Lexer] Lex vertical tab '\v' and form-feed '\f' trivias 2017-12-08 11:36:20 +09:00
Rintaro Ishizaki
d767dc39ba [Lexer] Improve implementation of lexTrivia 2017-12-08 11:36:20 +09:00
Rintaro Ishizaki
dcc37c3340 [Syntax] Normalize TriviaPiece internal value.
The Length field of a comment is always 1.
The Text field of whitespace is always "".
2017-12-04 10:46:03 -08:00
Rintaro Ishizaki
e01d525621 [Lexer] Remove some special trivia handling in Lexer
Even in multiline string mode, we should parse trailing trivia.

Removed special handling for backtick trivia, since it is not produced in
the Lexer anyway.
2017-12-04 10:46:03 -08:00
Rintaro Ishizaki
d46073dd75 [libSyntax] Backtracking restarts from leading trivia position
When reading syntax.
2017-12-04 10:46:03 -08:00
Rintaro Ishizaki
a78fda0720 [Syntax] Always lex Trivia when SF.shouldKeepSyntaxInfo()
For backward compatibility, don't lex comments as trailing trivia.
2017-11-17 14:56:49 +09:00
Rintaro Ishizaki
40b195d98c [Syntax] Get rid of fullLex
Defer (Token, Trivia) -> RawTokenSyntax conversion from Lexer to Parser.
This is part of the effort to consolidate Syntax and AST parsing.
2017-11-17 14:56:49 +09:00
Xi Ge
75db3c1db8 Re-apply libSyntax patches after fixing ASAN issue (#12730)
* Re-apply "libSyntax: Ensure round-trip printing when we build syntax tree from parser incrementally. (#12709)"

* Re-apply "libSyntax: Root parsing context should hold a reference to the current token in the parser, NFC."

* Re-apply "libSyntax: avoid copying token text when lexing token syntax nodes, NFC. (#12723)"

* Actually fix the container-overflow issue.
2017-11-03 13:25:33 -07:00
Xi Ge
7ebf66ed2d libSyntax: forward declare libSyntax entities in several header files, NFC. (#12735) 2017-11-02 20:55:18 -07:00
Xi Ge
4d1249aa82 Revert "libSyntax: Ensure round-trip printing when we build syntax tree from parser incrementally. (#12709)"
This reverts commit 0d98c4c5df.
2017-11-02 14:44:26 -07:00
Xi Ge
407db56b8d Revert "libSyntax: avoid copying token text when lexing token syntax nodes, NFC. (#12723)"
This reverts commit 7981630ddd.
2017-11-02 14:43:42 -07:00
Xi Ge
7981630ddd libSyntax: avoid copying token text when lexing token syntax nodes, NFC. (#12723)
This is likely the root cause of the memory surge when we always turn on
syntax token lexing. Since the underlying buffer outlives the syntax
tree, it's reasonable to refer to the text instead of copying and owning it.
2017-11-02 14:04:25 -07:00
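The trade-off described in the commit above can be sketched with a non-owning view. This is only an illustration under stated assumptions: `TokenRef` and `makeToken` are hypothetical names, and `std::string_view` stands in for whatever reference type the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical token type that refers to the source buffer instead of owning
// a copy of its text. It is valid only while the buffer is alive.
struct TokenRef {
  std::string_view Text;
};

// Builds a token that views Length bytes of Buffer starting at Offset.
// No characters are copied; the view points into Buffer's storage.
TokenRef makeToken(const std::string &Buffer, std::size_t Offset,
                   std::size_t Length) {
  return TokenRef{std::string_view(Buffer).substr(Offset, Length)};
}
```

Because the view shares storage with the buffer, this design is safe only under the condition the commit states: the buffer must outlive every token that refers to it.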
Xi Ge
0d98c4c5df libSyntax: Ensure round-trip printing when we build syntax tree from parser incrementally. (#12709) 2017-11-01 20:29:30 -07:00
Doug Gregor
8f43cba0b5 [Syntax] Replace TrivialList's std::deque with a std::vector.
For very large source files, the parser's syntax map---which contains a
very large number of TrivialLists---was taking an inordinate amount of
memory due to the inefficiency of std::deque. Specifically, a
std::deque containing just one trivial element would allocate 4k of
memory. With the ~120MB SIL output of one of the parse_stdlib tests,
these std::deques would add up to > 6GB of memory, most of which is
wasted.

Replacing the std::deque with a std::vector knocks the memory required
for one of the parse_stdlib tests from > 8GB down closer to 2 GB. The
parser's syntax map is still large (e.g., a 512MB allocation for the
overall vector plus a few hundred MB of raw-syntax data), but not
prohibitively so.

Part of rdar://problem/34771322.
2017-11-01 14:02:21 -07:00
Doug Gregor
945ac3de0a Revert " Re-enable parse_stdlib tests." 2017-11-01 06:59:35 -07:00
Doug Gregor
62f43ae75b [Syntax] Replace TrivialList's std::deque with a std::vector.
For very large source files, the parser's syntax map---which contains a
very large number of TrivialLists---was taking an inordinate amount of
memory due to the inefficiency of std::deque. Specifically, a
std::deque containing just one trivial element would allocate 4k of
memory. With the ~120MB SIL output of one of the parse_stdlib tests,
these std::deques would add up to > 6GB of memory, most of which is
wasted.

Replacing the std::deque with a std::vector knocks the memory required
for one of the parse_stdlib tests from > 8GB down closer to 2 GB. The
parser's syntax map is still large (e.g., a 512MB allocation for the
overall vector plus a few hundred MB of raw-syntax data), but not
prohibitively so.

Part of rdar://problem/34771322.
2017-10-31 23:33:19 -07:00
Xi Ge
844aeae2d5 Re-apply "libSyntax: create a basic infrastructure for generating libSyntax entities by using Parser." (#12538) 2017-10-20 22:58:28 -07:00
Greg Parker
48a6b9d464 Revert "libSyntax: create a basic infrastructure for generating libSyntax entities by using Parser."
This reverts commit ee7a06276d.
It causes build failures like "'swift/Syntax/SyntaxNodes.h' file not found".
2017-10-19 17:11:48 -07:00
Xi Ge
ee7a06276d libSyntax: create a basic infrastructure for generating libSyntax entities by using Parser. 2017-10-18 17:02:00 -07:00
Saleem Abdulrasool
7bd2256120 runtime: clean up last of -Wcast-qual warnings
This fixes up the remaining cast qualifier warnings from GCC 6.  Use
multiple casts to adjust the const qualification.  Prefer C++ style
casts.  NFC.
2017-09-22 14:14:13 -07:00
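The "multiple casts" approach mentioned above can be sketched as follows. This is a generic illustration of satisfying GCC's `-Wcast-qual` warning, not code from the runtime; `asMutableBytes` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// A C-style cast such as (std::uint8_t *)Ptr would drop const silently and
// trigger the warning. Splitting it in two makes each step explicit.
std::uint8_t *asMutableBytes(const void *Ptr) {
  void *Mutable = const_cast<void *>(Ptr);     // adjust qualification only
  return static_cast<std::uint8_t *>(Mutable); // change the pointee type only
}
```

Each C++ cast does one job, so a reader can see exactly where constness is removed. Writing through the result is only valid if the underlying object was not declared const.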
Xi Ge
2ba1ca2d8f IDE: simplify some code. NFC (#11935)
* IDE: simplify some code. NFC

* add assert.
2017-09-14 18:19:48 -07:00