Lexer::BufferID is the single point of truth; Lexer::BufferStart and
Lexer::BufferEnd are just a cache -- they always point to the beginning and end
of the buffer, even in a sublexer.
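
A minimal Python sketch of that invariant, for illustration only. The real code is C++; the SourceManager stand-in and every name below (add_buffer, scan_begin, sublexer, and so on) are assumptions, not the compiler's API. It shows the buffer ID as the only stored identity, with the cached bounds derived from it and unchanged in a sublexer.

```python
class SourceManager:
    """Hypothetical stand-in: owns the text of every buffer, keyed by ID."""

    def __init__(self):
        self._buffers = []

    def add_buffer(self, text):
        self._buffers.append(text)
        return len(self._buffers) - 1

    def text(self, buffer_id):
        return self._buffers[buffer_id]


class Lexer:
    def __init__(self, source_manager, buffer_id, begin=None, end=None):
        self.source_manager = source_manager
        self.buffer_id = buffer_id  # the single point of truth
        text = source_manager.text(buffer_id)
        # Cached bounds of the WHOLE buffer, always derived from buffer_id.
        self.buffer_start = 0
        self.buffer_end = len(text)
        # The range this lexer actually scans (the whole buffer by default).
        self.scan_begin = self.buffer_start if begin is None else begin
        self.scan_end = self.buffer_end if end is None else end
        if not (self.buffer_start <= self.scan_begin
                <= self.scan_end <= self.buffer_end):
            raise ValueError("scan range must lie inside the buffer")

    def sublexer(self, begin, end):
        """Create a lexer over part of the same buffer."""
        return Lexer(self.source_manager, self.buffer_id, begin, end)

    def scanned_text(self):
        text = self.source_manager.text(self.buffer_id)
        return text[self.scan_begin:self.scan_end]
```

A sublexer made this way narrows only scan_begin and scan_end; buffer_start and buffer_end still describe the full buffer.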
Swift SVN r7079
Now we have a clear separation between a primary lexer, which scans the whole
buffer, and a sublexer, which can be created from a primary lexer to scan a
part of the buffer.
Swift SVN r7077
Replace uses of it with the newly introduced constructor that accepts a buffer ID.
The StringRef constructor was rather unsafe since it had the implicit requirement that the StringRef
was null-terminated.
Swift SVN r6942
Decouple splitting an interpolated string into segments from encoding the string segments.
This allows us to tokenize or re-lex a string literal without having to allocate memory for
encoding the string segments when we don't need them encoded.
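
To illustrate the decoupling, here is a rough Python sketch (not the compiler's code). It assumes the `\(expr)` interpolation syntax and a small hypothetical escape table; split_segments and encode_literal are made-up names. Splitting only returns offsets into the literal body, and encoding is a separate step run only on the segments that need it.

```python
# Hypothetical escape table, just enough for the example.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def split_segments(body):
    """Return (kind, start, end) offsets into body; nothing is decoded."""
    segments = []
    literal_start = 0
    i = 0
    while i < len(body):
        if body.startswith("\\(", i):
            if i > literal_start:
                segments.append(("literal", literal_start, i))
            depth = 1
            j = i + 2
            while j < len(body) and depth > 0:
                if body[j] == "(":
                    depth += 1
                elif body[j] == ")":
                    depth -= 1
                j += 1
            if depth != 0:
                raise ValueError("unterminated interpolation")
            # The expression range excludes the "\(" and ")" delimiters.
            segments.append(("expr", i + 2, j - 1))
            literal_start = i = j
        elif body[i] == "\\":
            i += 2  # skip any other escape sequence as a unit
        else:
            i += 1
    if literal_start < len(body):
        segments.append(("literal", literal_start, len(body)))
    return segments


def encode_literal(body, start, end):
    """Decode escapes for one literal segment; called only on demand."""
    out = []
    i = start
    while i < end:
        if body[i] == "\\" and i + 1 < end:
            out.append(ESCAPES.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)
```

A tokenizer can stop after split_segments; only a client that needs the string value pays for encode_literal.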
Swift SVN r6940
-Parse the expression using a begin/end state sub-Lexer instead of a general StringRef Lexer
-Introduce a Lexer constructor accepting a BufferID and a range inside the buffer, and use it for swift::tokenize.
Swift SVN r6938
Also centralizes the knowledge about whether the hashbang is allowed in the
SourceManager. This fixes a bug in tokenize() because previously it just had
to guess.
Swift SVN r6822
around everywhere
Fixes:
rdar://14585108 Code completion does not work at the beginning of the file
rdar://14592634 Code completion returns zero results at EOF in a function
without a closing brace
Swift SVN r6820
completion token
This is required to handle cases like fooObject.#^A^#.bar where code completion
is invoked inside the ".." token. Previously, the token would not be split and
the lexer would produce an incorrect tokenization for this case. Now we
produce ".", tok::code_complete, ".".
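
The splitting behaviour can be modelled with a toy Python tokenizer. This is a sketch under assumptions: the completion point is passed as a character offset rather than a marker in the buffer, and the token kinds other than code_complete are invented for the example. The only rule it demonstrates is that no token is allowed to extend across the completion offset.

```python
def tokenize(text, completion_offset=None):
    """Toy lexer: identifiers, runs of '.', and single-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        if i == completion_offset:
            tokens.append(("code_complete", ""))
            completion_offset = None  # emit the token only once
            continue
        # A token may not extend across the completion point.
        limit = len(text)
        if completion_offset is not None:
            limit = min(limit, completion_offset)
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        j = i + 1
        if ch.isalpha() or ch == "_":
            while j < limit and (text[j].isalnum() or text[j] == "_"):
                j += 1
            kind = "identifier"
        elif ch == ".":
            # Maximal munch: ".." is normally one operator token.
            while j < limit and text[j] == ".":
                j += 1
            kind = "period" if j - i == 1 else "oper"
        else:
            kind = "punct"
        tokens.append((kind, text[i:j]))
        i = j
    if completion_offset == len(text):
        tokens.append(("code_complete", ""))
    return tokens
```

With the offset set between the two dots of "fooObject..bar", the ".." run is cut into two period tokens around the completion token; without an offset it stays a single operator.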
Swift SVN r6635
-Introduce PersistentParserState to represent state persistent among multiple parsing passes.
The advantage is that PersistentParserState is independent of a particular Parser or Lexer object.
-Use PersistentParserState to keep information about delayed function body parsing and eliminate parser-specific
state from the AST (ParserTokenRange).
-Introduce DelayedParsingCallbacks to abstract out of the parser the logic about which functions should be delayed
or skipped.
Many thanks to Dmitri for his valuable feedback!
Swift SVN r6580
* Added a mode in swift-ide-test to test code completion. Unlike c-index-test,
the code completion token in tests is a real token -- we don't need to
count lines and columns anymore.
* Added support in lexer to produce a code completion token.
* Added a parser interface to code completion. It is passed down from the
libFrontend to the parser, but its functions are not called yet.
* Added a sketch of the interface of code completion consumer and code
completion results.
Note: all this is not doing anything useful yet.
Swift SVN r6128
This is to make overload resolution for the protected constructor call sites more explicit.
Otherwise, adding a parameter in the protected constructor and a default parameter in the public one
can inadvertently result in the protected constructor call sites switching to picking the public one.
Swift SVN r6007
When we are interpreting escape sequences in the lexer, we copy the string
literal bytes to ASTContext instead of storing a pointer to the source buffer.
But then we used to try to get a source location for that string in the heap,
which is not a valid source buffer. It succeeds during parsing, but breaks
when we try to print a diagnostic using this location.
Added a verifier check for this.
Also added a real source range for StringLiteralExpr, instead of a source range
with begin == end, produced from the beginning location.
Swift SVN r5961
In order to do this, we need to save and restore parser state easily. The
important pieces of state are:
* lexer position;
* lexical scope stack.
Lexer position can be saved/restored easily. We don't need to store the tokens
for the function body because swift does not have a preprocessor and we can
easily re-lex everything we need. We just store the lexer state for the
beginning and the end of the body.
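
As a rough illustration of the save/restore idea, in Python rather than the compiler's C++: the lexer state here is just an offset, the token grammar is a toy, and get_state, restore_state, skip_function_body and relex are hypothetical names.

```python
import re

# Toy token grammar: identifiers, integers, or any single non-space char.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|\S)")


class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def get_state(self):
        return self.pos  # the position is all the state this toy needs

    def restore_state(self, state):
        self.pos = state

    def lex(self):
        """Return the next token text, or None at end of input."""
        m = TOKEN_RE.match(self.text, self.pos)
        if m is None:
            return None
        self.pos = m.end()
        return m.group(1)


def skip_function_body(lexer):
    """Skip a brace-delimited body; return (begin_state, end_state)."""
    begin = lexer.get_state()
    depth = 0
    while True:
        tok = lexer.lex()
        if tok is None:
            raise SyntaxError("unterminated function body")
        if tok == "{":
            depth += 1
        elif tok == "}":
            depth -= 1
            if depth == 0:
                return begin, lexer.get_state()


def relex(lexer, begin, end):
    """Second pass: re-lex the saved range, then put the lexer back."""
    saved = lexer.get_state()
    lexer.restore_state(begin)
    tokens = []
    while lexer.get_state() < end:
        tokens.append(lexer.lex())
    lexer.restore_state(saved)
    return tokens
```

No tokens are stored for the skipped body; the two saved states are enough to reproduce them later.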
To save the lexical scope stack, we had to change the underlying data
structure. Originally, the parser used the ScopedHashTable, which supports
only a stack of scopes. But we need a *tree* of scopes. I implemented
TreeScopedHashTable based on ScopedHashTable. It has an optimization for
pushing/popping scopes in a stack fashion -- these scopes will not be allocated
on the heap, whereas 'detached' scopes that we want to re-enter later, and all
their parent scopes, are moved to the heap.
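
A simplified Python model of a tree of scopes follows. It is not the TreeScopedHashTable implementation: the class and method names are assumptions, and the stack-versus-heap optimization has no Python equivalent, so it is not modelled. What it shows is why a plain stack is not enough: a popped scope keeps its parent link and can be re-entered later.

```python
class Scope:
    def __init__(self, parent=None):
        self.parent = parent
        self.bindings = {}


class TreeScopedTable:
    """Scopes form a tree through parent links, not a single stack."""

    def __init__(self):
        self.current = Scope()

    def push_scope(self):
        self.current = Scope(self.current)

    def pop_scope(self):
        """Leave the current scope and return it as a detached handle."""
        detached = self.current
        self.current = detached.parent
        return detached

    def reenter(self, scope):
        """Make a detached scope current; return the scope to go back to."""
        previous = self.current
        self.current = scope
        return previous

    def insert(self, key, value):
        self.current.bindings[key] = value

    def lookup(self, key):
        scope = self.current
        while scope is not None:
            if key in scope.bindings:
                return scope.bindings[key]
            scope = scope.parent
        return None
```

In the delayed-parsing setting, the handle returned by pop_scope would be kept next to the saved lexer state for a function body, and reenter would be called before parsing that body in the second pass.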
In parseIntoTranslationUnit() we do a second pass over the 'structural AST'
that does not contain function bodies to actually parse them from saved token
ranges.
Swift SVN r5886
parser state. Backtracking will be used a lot when we implement delayed
parsing for function bodies, and we don't want to leak lexer and parser state
details to AST classes when we store the state for the first and last token for
the function body.
Swift SVN r5759
Original message:
SIL Parsing: add plumbing to know when we're parsing a .sil file
Enhance the lexer to lex "sil" as a keyword in sil mode.
Swift SVN r4988
Now that we enforce semicolon or newline separation between statements, we can relax the whitespace requirements on '(' and '[' tokens. A "following" token is now just a token that isn't at the start of a line, and any token can be a "starting" token. This allows for:
a(b)
a (b)
a[b]
a [b]
to parse as applications and subscripts, and:
a
(b)
a
[b]
to parse as an expr followed by a tuple or an expr followed by a container literal.
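
The relaxed rule can be sketched in a few lines of Python. This is an illustration only: the "_starting" kind names are hypothetical (the log names only the "_following" kinds), and "start of a line" is assumed to mean the first non-whitespace token on the line.

```python
def classify_brackets(source):
    """Label each '(' and '[' by whether it is the first token on its line."""
    result = []
    at_line_start = True
    for ch in source:
        if ch == "\n":
            at_line_start = True
        elif ch.isspace():
            continue  # indentation does not end the "start of line" state
        else:
            if ch in "([":
                base = "l_paren" if ch == "(" else "l_square"
                suffix = "_starting" if at_line_start else "_following"
                result.append(base + suffix)
            at_line_start = False
    return result
```

Under this rule the whitespace before '(' or '[' no longer matters; only a preceding newline changes the classification.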
Swift SVN r4573
When parsing an expression, if we see the production [[identifier '<']], use the following heuristic to choose whether to parse it as a type list or as an operator expression:
- Speculatively parse the subsequent production as a type parameter list.
- If the parse succeeds, examine the token after the closing '>'. If it is one of the following:
l_paren_following
l_square_following
r_paren
r_square
l_brace
r_brace
comma
semicolon
period
then accept the parse as a type list.
- If the parse fails, or if the type list is not followed by one of those tokens, reject the type list and parse as an operator expression.
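
A minimal Python sketch of this heuristic, working on a list of token kinds. The kinds "identifier", "less" and "greater" and the simplified type grammar (identifiers with optional nested type lists, separated by commas) are assumptions made for the example; the set of accepted follow tokens is the list above.

```python
ACCEPT_AFTER = {
    "l_paren_following", "l_square_following", "r_paren", "r_square",
    "l_brace", "r_brace", "comma", "semicolon", "period",
}


def parse_type_list(tokens, i):
    """Speculative parse starting at a 'less' token.

    Returns the index just past the matching 'greater', or None on failure.
    """
    if i >= len(tokens) or tokens[i] != "less":
        return None
    i += 1
    while True:
        if i >= len(tokens) or tokens[i] != "identifier":
            return None
        i += 1
        if i < len(tokens) and tokens[i] == "less":
            i = parse_type_list(tokens, i)  # nested type list
            if i is None:
                return None
        if i < len(tokens) and tokens[i] == "comma":
            i += 1
            continue
        if i < len(tokens) and tokens[i] == "greater":
            return i + 1
        return None


def is_type_list(tokens, less_index):
    """Accept only if the parse succeeds AND the next token is in the list."""
    end = parse_type_list(tokens, less_index)
    return end is not None and end < len(tokens) and tokens[end] in ACCEPT_AFTER
```

When is_type_list returns False, the caller would fall back to treating '<' as an ordinary operator, as the last rule describes.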
This only implements the parsing rule. The type parameters are just dropped on the floor--the AST representation and Sema changes are forthcoming. Encouragingly, no test or library code appears to be broken by this rule.
Swift SVN r4044
APFloat's parser gives us the parsing for free. Unlike C99 we require at least one digit on both sides of the hexadecimal point in order to allow '0x1.method()' expressions, similar to Dave's proposed change to float lexing. Also, we were requiring a sign after 'e' in the exponent, which is inconsistent with C, C++, and the Java regex we claim to follow, so I made the exponent sign optional.
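
The digit requirement around the hexadecimal point can be illustrated with a small Python sketch. The regular expressions are assumptions, not the lexer's actual grammar: they use the C99-style 'p' binary exponent, treat it as mandatory, and allow an optional sign. The token kind names are hypothetical as well.

```python
import re

# At least one hex digit on each side of the point, then a 'p' exponent.
HEX_FLOAT = re.compile(r"0x[0-9a-fA-F]+(?:\.[0-9a-fA-F]+)?[pP][+-]?[0-9]+")
HEX_INT = re.compile(r"0x[0-9a-fA-F]+")


def lex_hex_literal(text, pos=0):
    """Return (kind, spelling) for a hex literal at pos, or None."""
    m = HEX_FLOAT.match(text, pos)
    if m:
        return ("floating_literal", m.group())
    m = HEX_INT.match(text, pos)
    if m:
        return ("integer_literal", m.group())
    return None
```

Because a digit is required after the point, "0x1.method()" lexes as the integer 0x1 followed by a member access instead of being swallowed as a malformed float.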
Swift SVN r3940
If we generalize John's insight about l_(paren|square) being about
"starting" and "following" tokens, then we can detect many statement
or declaration boundaries that are lacking either white space or a
semicolon.
Ensuring some amount of whitespace between statements and declarations
is good for future-proofing.
Swift SVN r3914
The lexer now models tuples, patterns, subscripting, function calls, and
field access robustly. The output tokens are now better named as well:
l_paren and l_paren_call, and l_square and l_square_subscript. It
should be much clearer now which one to use. Also, the use of
l_paren or l_square will not arbitrarily flip flop if the token before
it is a keyword or if the token before it was the trailing ']' of an
attribute list. Similarly, tuples will always cause the lexer to produce
l_paren, regardless of whether the user typed '((x,y))' or '( (x,y))'.
When we someday add array literals, the right token is now naturally
falling out of the lexer.
Swift SVN r3840