Commit Graph

192 Commits

Author SHA1 Message Date
Dmitri Hrybenko
ecd798b9fd Comment parsing: attaching comments to declarations
We can attach comments to declarations.  Right now we only support comments
that precede the declarations (trailing comments will be supported later).

The implementation approach is different from one we have in Clang.  In Swift
the Lexer attaches the comments to the next token, and parser checks if
comments are present on the first token of the declaration.  This is much
cleaner, and faster than Clang's approach (where we perform a binary search on
source locations and do ad-hoc fixups afterwards).

The comment <-> decl correspondence is modeled as "virtual" attributes that can
not be spelled in the source.  These attributes are not serialized at the
moment -- this will be implemented later.


Swift SVN r14031
2014-02-18 09:04:37 +00:00
Dmitri Hrybenko
7d29d25b7d Fix include order
Swift SVN r13912
2014-02-14 16:37:03 +00:00
John McCall
10ac15ed0d Lex $notAllDigits as an identifier and diagnose it in the lexer
outside of debugger-support mode.  Rip out the existing special-case
code when parsing expr-identifier.

This means that the Lexer needs a LangOptions.  Doug and I
talked about just adding that as a field of SourceMgr, but
decided that it was worth it to preserve the possibility of
parsing different dialects in different source files.

By design, the lexer doesn't tokenize fundamentally differently
in different language modes; it might decide something is invalid,
or it might (eventually) use a different token kind for the
same consumed text, but we don't want it deciding to consume more or
less of the stream per token.

Note that SIL mode does make that kind of difference, and that
arguably means that various APIs for tokenizing need to take a
"is SIL mode" flag, but we're getting away with it because we
just don't really care about fidelity of SIL source files.

rdar://14899000

Swift SVN r13896
2014-02-14 01:54:17 +00:00
Argyrios Kyrtzidis
e244f51229 [Lexer] Add some const goodness to the SourceManager that the Lexer uses.
No functionality change.

Swift SVN r12182
2014-01-11 01:09:30 +00:00
Dmitri Hrybenko
81dc5deee8 Change 'def' keyword back to 'func'
Swift SVN r10522
2013-11-17 07:45:28 +00:00
Dmitri Hrybenko
91ce21666d Change 'func' keyword to 'def'
I tried hard find all references to 'func' in documentation, comments and
diagnostics, but I am sure that I missed a few.  If you find something, please
let me know.

rdar://15346654


Swift SVN r9886
2013-11-02 01:00:42 +00:00
Argyrios Kyrtzidis
61b4e8ede5 [Lexer] Add some comments for Lexer::getLocForStartOfToken().
Swift SVN r8288
2013-09-16 19:32:13 +00:00
Argyrios Kyrtzidis
5db368ce7b [Lexer] Introduce Lexer::getLocForStartOfToken() that returns the location at the start of the token that a given offset points to.
Swift SVN r8281
2013-09-16 18:41:16 +00:00
Dmitri Hrybenko
c98719dc03 Cleanup includes and forward declarations
Swift SVN r7194
2013-08-13 03:36:47 +00:00
Dmitri Hrybenko
dc655eb1cb Remove more abuse of SourceLoc::Value::getPointer()
Swift SVN r7109
2013-08-09 22:35:33 +00:00
Dmitri Hrybenko
db08a32a95 Factor out SourceManager::getLocOffsetInBuffer()
and remove some abuse of SourceLoc::Value::getPointer()


Swift SVN r7105
2013-08-09 21:53:32 +00:00
Dmitri Hrybenko
23b4e9fa86 Lexer: make some implementation details private
Swift SVN r7096
2013-08-09 20:21:29 +00:00
Dmitri Hrybenko
888d9b4e58 Tighten the contract of Lexer::getEncodedCharacterLiteral():
treat the token as a token (instead of a string) and check that it has the
correct kind


Swift SVN r7094
2013-08-09 20:15:37 +00:00
Dmitri Hrybenko
de59d8dcd4 Remove unneeded llvm:: qualifier for llvm::StringRef and llvm::SmallVector
Swift SVN r7089
2013-08-09 18:41:46 +00:00
Dmitri Hrybenko
d1453b3f9d Lexer constuctors: allow to create a lexer that scans a subrange without having
to create a parent lexer first.


Swift SVN r7085
2013-08-09 18:19:55 +00:00
Dmitri Hrybenko
49e15b33e7 Remove the public Lexer::getBufferEnd() function which breaks encapsulation
of the source buffer


Swift SVN r7084
2013-08-09 18:10:29 +00:00
Dmitri Hrybenko
987590ae8c Lexer: constify permanent configuration parameters
Swift SVN r7082
2013-08-09 17:50:13 +00:00
Dmitri Hrybenko
056923e802 Lexer constructors: inline Lexer::initLexer into the principal constructor.
Introducing it was just a refactoring step.


Swift SVN r7081
2013-08-09 17:46:31 +00:00
Dmitri Hrybenko
ea710cd62a Lexer constructors: simplify and establish an invariant:
Lexer::BufferID is the single point of truth, Lexer::BufferStart and
Lexer::BufferEnd is just a cache -- they always point to the beginning and end
of the buffer, even in a sublexer.


Swift SVN r7079
2013-08-09 17:40:17 +00:00
Dmitri Hrybenko
c92fd7cbdd Lexer: refactor constructors to bring the parameter count back to a sane number
Now we have a clear separation between a primary lexer, which scans the whole
buffer, and a sublexer, which can be created from a primary lexer to scan a
part of the buffer.


Swift SVN r7077
2013-08-09 02:06:50 +00:00
Dmitri Hrybenko
02cf73dc30 Lexer: remove redundant parameters from the sublexer constructor
Swift SVN r7073
2013-08-09 00:15:01 +00:00
Dmitri Hrybenko
e5923592e1 Refactor Lexer::State to store a SourceLoc intead of a pointer into the buffer
Swift SVN r7065
2013-08-08 23:51:03 +00:00
Argyrios Kyrtzidis
4908da8361 [Lexer] Remove the public Lexer constructor that accepts a StringRef.
Replace uses of it with the newly introduced constructor that accepts a buffer ID.
The StringRef constructor was rather unsafe since it had the implicit requirement that the StringRef
was null-terminated.

Swift SVN r6942
2013-08-06 14:59:03 +00:00
Argyrios Kyrtzidis
d6b048dfe0 [Lexer] Refactor lexing of interpolated strings.
Decouple splitting an interpolated string to segments, from encoding the string segments.
This allows us to tokenize or re-lex a string literal without having to allocate memory for
encoding the string segments when we don't need them encoded.

Swift SVN r6940
2013-08-06 14:59:01 +00:00
Argyrios Kyrtzidis
cda7a109bd [Lexer] Remove the special hack in the Lexer that handles lexing an expression in an interpolated string.
-Parse the expression using a begin/end state sub-Lexer instead of a general StringRef Lexer
-Introduce a Lexer constructor accepting a BufferID and a range inside the buffer, and use it for swift::tokenize.

Swift SVN r6938
2013-08-06 14:58:59 +00:00
Dmitri Hrybenko
7684d6dc2a Add a good diagnostic for a hashbang line when it is not allowed
Also centralizes the knowledge about whether the hashbang is allowed in the
SourceManager.  This fixes a bug in tokenize() because previously it just had
to guess.


Swift SVN r6822
2013-08-01 23:07:43 +00:00
Dmitri Hrybenko
5cea4ebf41 Code completion: put CodeCompletionOffset on SourceManager instead of passing
around everywhere

Fixes:
rdar://14585108 Code completion does not work at the beginning of the file
rdar://14592634 Code completion returns zero results at EOF in a function
                without a closing brace


Swift SVN r6820
2013-08-01 22:08:58 +00:00
Dmitri Hrybenko
e1c4ae3174 Wrap llvm::SourceMgr in swift::SourceManager so that we can add new members
to the source manager.


Swift SVN r6815
2013-08-01 20:39:22 +00:00
Dmitri Hrybenko
015cea67a5 Move hashbang line detection to the Lexer implementation, instead of requiring
the Lexer client to skip it before constructing the Lexer.


Swift SVN r6801
2013-08-01 02:18:25 +00:00
Dmitri Hrybenko
a0cd8e93ae Lexer: don't consume the first token before we apply all code completion
settings

This allows us to code complete at EOF when there is no newline.


Swift SVN r6640
2013-07-26 03:48:20 +00:00
Dmitri Hrybenko
8363bc3ba4 Remove unused function Lexer::stateRangeHasCodeCompletionToken()
Swift SVN r6636
2013-07-26 01:14:07 +00:00
Dmitri Hrybenko
f9fa6aa8fc Code completion: insert a zero character into the buffer to mark the code
completion token

This is required to handle cases like fooObject.#^A^#.bar where code completion
is invoked inside the ".." token.  Previously, the token would not be split and
the lexer would produce an incorrect tokenization for this case.  Now we
produce ".", tok::code_complete, ".".


Swift SVN r6635
2013-07-26 00:45:57 +00:00
Argyrios Kyrtzidis
1e597f5d8a [Parser] When delay-parsing a function body, make sure the closing brace is included.
Swift SVN r6599
2013-07-25 16:20:13 +00:00
Argyrios Kyrtzidis
66d1e516c8 Refactor how multiple parsing passes and delayed parsing works.
-Introduce PersistentParserState to represent state persistent among multiple parsing passes.
  The advantage is that PersistentParserState is independent of a particular Parser or Lexer object.
-Use PersistentParserState to keep information about delayed function body parsing and eliminate parser-specific
  state from the AST (ParserTokenRange).
-Introduce DelayedParsingCallbacks to abstract out of the parser the logic about which functions should be delayed
  or skipped.

Many thanks to Dmitri for his valuable feedback!

Swift SVN r6580
2013-07-25 01:40:16 +00:00
Dmitri Hrybenko
02084efab7 Implement code completion for some function calls and member variable accesses
in expr-dot and expr-postfix that can be typechecked without typechecking the
beginning of the function body.


Swift SVN r6198
2013-07-12 02:00:41 +00:00
Dmitri Hrybenko
3c5b12fc0f Code completion: add a lot of infrastructure code
* Added a mode in swift-ide-test to test code completion.  Unlike c-index-test,
  the code completion token in tests is a real token -- we don't need to
  count lines and columns anymore.

* Added support in lexer to produce a code completion token.

* Added a parser interface to code completion.  It is passed down from the
  libFrontend to the parser, but its functions are not called yet.

* Added a sketch of the interface of code completion consumer and code
  completion results.

Note: all this is not doing anything useful yet.


Swift SVN r6128
2013-07-10 20:53:40 +00:00
Argyrios Kyrtzidis
6e7d0490f7 Allow optionally to produce comment tokens when lexing and add a tokenize() utility function.
Swift SVN r6008
2013-07-05 15:02:42 +00:00
Argyrios Kyrtzidis
fe8d9bba2f [Lexer] Make the order of parameters of the protected constructor significantly different than the public one.
This is to make overload resolution for the protected constructor call sites more explicit.
Otherwise, adding a parameter in the protected constructor and a default parameter in the public one
can inadvertently result in the protected constructor call sites switching to picking the public one.

Swift SVN r6007
2013-07-05 15:02:41 +00:00
Dmitri Hrybenko
03074dceb2 StringLiteralExpr: always include a good source range
When we are interpreting escape sequences in the lexer, we copy the string
literal bytes to ASTContext instead of storing a pointer to the source buffer.
But then we used to try to get a source location for that string in the heap,
which is not a valid source buffer. It succeeds during parsing, but breaks
when we try to print a diagnostic using this location.

Added a verifier check for this.

Also added a real source range for StringLiteralExpr, instead of a source range
with begin == end, produced from the beginning location.


Swift SVN r5961
2013-07-02 17:19:27 +00:00
Dmitri Hrybenko
0b8f72b9c1 Introduce an artificial EOF in the Lexer
This ensures that we don't go past the end of a subrange of a buffer when doing
delayed parsing.


Swift SVN r5952
2013-07-01 21:48:38 +00:00
Dmitri Hrybenko
f73d866d91 Implement delayed parsing for function bodies
In order to do this, we need to save and restore parser state easily.  The
important pieces of state are:

* lexer position;
* lexical scope stack.

Lexer position can be saved/restored easily.  We don't need to store the tokens
for the function body because swift does not have a preprocessor and we can
easily re-lex everything we need.  We just store the lexer state for the
beginning and the end of the body.

To save the lexical scope stack, we had to change the underlying data
structure.  Originally, the parser used the ScopedHashTable, which supports
only a stack of scopes.  But we need a *tree* of scopes.  I implemented
TreeScopedHashTable based on ScopedHashTable.  It has an optimization for
pushing/popping scopes in a stack fashion -- these scopes will not be allocated
on the heap.  While ‘detached’ scopes that we want to re-enter later, and all
their parent scopes, are moved to the heap.

In parseIntoTranslationUnit() we do a second pass over the 'structural AST'
that does not contain function bodies to actually parse them from saved token
ranges.


Swift SVN r5886
2013-06-28 22:38:10 +00:00
Dmitri Hrybenko
f4d6053ebd Fix typo in 'getStateForBeginnigOfToken'
Swift SVN r5761
2013-06-21 22:36:40 +00:00
Dmitri Hrybenko
a2ddd4e488 Refactor lexer backtracking. Introduce opaque types that encapsulate lexer and
parser state.  Backtracking will be used a lot when we implement delayed
parsing for function bodies, and we don't want to leak lexer and parser state
details to AST classes when we store the state for the first and last token for
the function body.


Swift SVN r5759
2013-06-21 22:26:41 +00:00
Chris Lattner
bde8f753da wire up minimal sil function parsing (still not validating or creating
a SIL function).  This introduces contextual lexing state for SIL function 
bodies for SIL-specific lexing rules.


Swift SVN r5179
2013-05-16 18:46:46 +00:00
Chris Lattner
5b4c31dc94 reland r4968, with a bugfix to avoid breaking the lexer measuring token lengths.
Original message:
SIL Parsing: add plumbing to know when we're parsing a .sil file
Enhance the lexer to lex "sil" as a keyword in sil mode.


Swift SVN r4988
2013-04-30 00:28:37 +00:00
Chris Lattner
aafe3bdbdc revert r4968, it apparently breaks the world. I'll recommit it when I have time to investigate.
Swift SVN r4971
2013-04-29 16:58:43 +00:00
Chris Lattner
b503206bec SIL Parsing: add plumbing to know when we're parsing a .sil file
Enhance the lexer to lex "sil" as a keyword in sil mode.


Swift SVN r4970
2013-04-29 05:42:59 +00:00
Joe Groff
88d4284178 Parser: Relax whitespace rules for ( and [.
Now that we enforce semicolon or newline separation between statements, we can relax the whitespace requirements on '(' and '[' tokens. A "following" token is now just a token that isn't at the start of a line, and any token can be a "starting" token. This allows for:

  a(b)
  a (b)
  a[b]
  a [b]

to parse as applications and subscripts, and:

  a
  (b)
  a
  [b]

to parse as an expr followed by a tuple or an expr followed by a container literal.

Swift SVN r4573
2013-04-02 16:43:20 +00:00
Dave Zarzycki
f043cdca5a Revert "wipUnifyFormToken"
This reverts commit 39e8ace65c29d9e83f4fbebc24be9c9896e9a3e1.

Sorry!

Swift SVN r4185
2013-02-25 03:13:14 +00:00
Dave Zarzycki
06d3c09d97 wipUnifyFormToken
Swift SVN r4183
2013-02-25 03:11:11 +00:00