Commit Graph

39 Commits

Author SHA1 Message Date
Alex Hoppen
294977534c [libSyntax] Remove incremental JSON transfer option
We were only keeping track of `RawSyntax` node IDs to incrementally transfer a syntax tree via JSON. However, AFAICT the incremental JSON transfer option has been superceeded by `SyntaxParseActions`, which are more efficient.

So, let’s clean up and remove the `RawSyntax` node ID and JSON incremental transfer option.

In places that still need a notion of `RawSyntax` identity (like determining the reused syntax regions), use the `RawSyntax`’s pointer instead of the manually created ID.

In `incr_transfer_round_trip.py` always use the code path that uses the `SyntaxParseActions` and remove the transitional code that was still using the incremental JSON transfer but was never called.
2021-04-07 10:01:34 +02:00
Alex Hoppen
d0cb7ad624 [libSyntax] Eliminate loop in RawSyntax constructor
Currently, when creating a `RawSyntax` layout node, the `RawSyntax` constructor needs to iterate over all child nodes to
a) sum up their sub node count
b) add their arena as a child arena of the new node's arena

But we are already iterating over all child nodes in every place that calls these constructors. So instead of looping twice, we can perform the above operations in the loop that already exists and pass the parameters to the `RawSyntax` constructor, which spees up `RawSyntax` node creation.

To ensure the integrity of the `RawSyntax` tree, the passed in values are still validated in release builds.
2021-03-26 18:30:46 +01:00
Alex Hoppen
28f5f79bb7 [libSyntax] Don't reference count RawSyntax
Instead, only reference count the SyntaxArena that the RawSyntax nodes
live in. The user of RawSyntax nodes must guarantee that the SyntaxArena
stays alive as long as the RawSyntax nodes are being accessed.

During parse time, the SyntaxTreeCreator holds on to the SyntaxArena
in which it creates RawSyntax nodes. When inspecting a syntax tree,
the root SyntaxData node keeps the SyntaxArena alive. The change should
be mostly invisible to the users of the public libSyntax API.

This change significantly decreases the overall reference-counting
overhead. Since we were not able to free individual RawSyntax nodes
anyway, performing the reference-counting on the level of the
SyntaxArena feels natural.
2021-03-01 09:43:54 +01:00
Alex Hoppen
e43bad2c71 [libSyntax] Store the token's text in the SyntaxArena
Do the same thing that we are already doing for trivia: Since RawSyntax
nodes always live inside a SyntaxArena, we don't need to tail-allocate
an OwnedString to store the token's text. Instead we can just copy it
to the SyntaxArena. If we copy the entire source buffer to the syntax
arena at the start of parsing, this means that no more copies are
required later on. Plus we also avoid ref-counting the OwnedString which
should also increase performance.
2021-02-10 09:50:12 +01:00
Alex Hoppen
5e1ba8b16e [libSyntax] Store raw trivia inside RawSyntax and only lex into pieces when requested 2021-02-05 08:15:54 +01:00
Alex Hoppen
803499e165 [libSyntax] Require RawSyntax to always live inside a SyntaxArena
This way, we will later be able to store additional information about
the node inside the same arena with a guarantee that they will always be
alive as long as the node is alive.

These additional information will include
a) the token's text (which can be a StringRef into a copy of the source
   code that lives inside the SyntaxArena)
b) the token's unparsed trivia, which can be decomposed into pieces when
   needed.
2021-02-01 10:34:44 +01:00
Alex Hoppen
8bb1167e21 [libSyntax] Restructure RawSyntax to more closely resemble the SwiftSyntax implementation 2021-01-29 13:08:12 +01:00
Alex Hoppen
8ec8516893 Remove ByteTree serialization format
It was originally designed for faster trasmission of syntax trees from
C++ to SwiftSyntax, but superceded by the CLibParseActions. There's no
deserializer for it anymore, so let's just remove it.
2021-01-14 20:37:49 +01:00
moatom
2e95a0d265 Fix include guards 2019-06-02 12:10:43 +09:00
Saleem Abdulrasool
d281b98220 litter the tree with llvm_unreachable
This silences the instances of the warning from Visual Studio about not all
codepaths returning a value.  This makes the output more readable and less
likely to lose useful warnings.  NFC.
2018-09-13 15:26:14 -07:00
Alex Hoppen
49d0d5b7a3 [libSyntax] Make the ByteTree protocol version consist of a major and minor component 2018-08-29 13:40:50 -07:00
Alex Hoppen
34a89d45e2 [libSyntax] Make the ByteTree format forwards-compatible 2018-08-22 12:07:57 -07:00
Alex Hoppen
9eb0208bb5 [libSyntax] Add support for incremental serialization of ByteTrees 2018-08-21 10:55:15 -07:00
Alex Hoppen
79e9113a58 Merge pull request #18677 from ahoppen/ref-counted-owned-string
[libSyntax] Add a reference counted version of OwnedString
2018-08-14 11:37:40 -07:00
Alex Hoppen
ac512d4341 [libSyntax] Add a reference counted version of OwnedString
We cannot use unowned strings for token texts of incrementally parsed
syntax trees since the source buffer to which reused nodes refer will
have been freed for reused nodes. Always copying the token text whenever
OwnedString is passed is too expensive. A reference counted copy of the
string allows us to keep the token's string alive across incremental
parses while eliminating unnecessary copies.
2018-08-13 15:37:53 -07:00
Alex Hoppen
280b186fa0 [libSyntax] Add a binary serialization format for syntax trees 2018-08-10 10:13:00 -07:00
Alex Hoppen
88937f93c4 Merge pull request #18295 from ahoppen/json-serialization-improvements
[libSyntax] JSON serialization improvements
2018-07-30 08:46:16 -07:00
Alex Hoppen
07b449bbd5 [JSONSerialization] Introduce ScalarReferenceTraits
For ScalarTraits, a buffer was always created on the heap to which the
scalar string value was written just to be copied to the output buffer
again. In case the value already exists in a memory buffer it is way
cheaper to avoid the heap allocation and copy it straight to the output
buffer.
2018-07-27 16:20:34 -07:00
Alex Hoppen
57196f8902 [libSyntax] Enable serialization of syntax trees for incremental transfer 2018-07-23 12:32:49 -07:00
Alex Hoppen
57ccdd89b6 [incrParse] Add validation of incremental parsing
If enabled using the environment variable
SOURCEKIT_INCREMENTAL_PARSE_VALIDATION, the incrementally parsed syntax
tree will be compared to the from-scratch parsing syntax tree. If they
differ a warning is emitted and log files showing the difference written
to a temporary directory.
2018-07-18 13:35:11 -07:00
Alex Hoppen
9d59cd286b [incrParse] Add a stable id to the syntax nodes
The id is meant to be stable across incremental parses
2018-07-13 16:56:03 -07:00
Dexin Li
e0f8b27117 [Syntax]Add a deserializer that convert json to libSyntax tree (#15203) 2018-03-16 15:22:04 -07:00
Xi Ge
a121ce65ca Syntax: add APIs to help syntax tree serialization. NFC (#15241) 2018-03-14 13:12:31 -07:00
Xi Ge
94c3f55117 libSyntax: extract meta-information of trivia kinds to syntax_gyb_support. NFC
The existing libSyntax infrastructure uses external python
dictionaries to share logic between C++ and Swift implementations.
This patch teaches trivia kinds to adapt to this infrastructure
 as well.
2018-03-06 17:45:43 -08:00
Rintaro Ishizaki
fced748790 [Syntax] Represent missing optioanl nodes as nullptr (#14300)
Allocating RawSyntax/SyntaxData for missing optional node is a waste of
resource.
2018-01-31 19:24:00 +09:00
Rintaro Ishizaki
0780c529c4 [Syntax] Unify RawSyntax and RawTokenSyntax using union and TrailingObjects
It better matches with SwiftSyntax model.

Using TrailingObjects reduces the number of heap allocation which
gains 18% performance improvement.
2018-01-18 14:49:46 +09:00
Rintaro Ishizaki
192434003d [Syntax] Introduce TOKEN macro in TokenKinds.def 2018-01-17 02:38:14 +09:00
Xi Ge
7476677bb2 libSyntax: create separate node kinds for quote (") and multiline quote (""").
This allows us to serialize the quote tokens without serializing their
underlying text.
2018-01-04 09:08:21 -08:00
Xi Ge
22ce6934bd libSyntax: parse string interpolation expression. (#13708)
A string interpolation expression is composed of { OpenQuote, Segments,
CloseQuote }. To represent OpenQuote, CloseQuote and StringSegment, we have to
introduce new token kinds correspondingly.
2018-01-03 20:41:34 -08:00
Marcel Jackwerth
a1a285d740 [Syntax] Serialize contextual_keyword text. (#13647) 2018-01-02 10:53:06 -08:00
omochimetaru
3a3e89ba0c [Syntax] add TriviaKind::CarriageReturnLineFeed 2017-12-20 20:50:40 +09:00
Rintaro Ishizaki
2c06060165 [Syntax] Add CarriageReturn trivia kind
To distinguish '\r' from '\n'.
2017-12-19 09:24:34 +09:00
Rintaro Ishizaki
2b1e316cf6 [Syntax] Add parsing hashbang (shebang) as a trivia.
Added GarbageText trivia kind for any skipped text.
2017-12-08 12:07:00 +09:00
Harlan
a5098e6b69 Generate libSyntax API (#10926)
* Generate libSyntax API

This patch removes the hand-rolled libSyntax API and replaces it with an
API that's entirely automatically generated. This means the API is
guaranteed to be internally stylistically and functionally consistent.
2017-07-25 18:19:58 -07:00
Harlan
70089a7bcc [Syntax] Represent TokenSyntax as a Syntax node (#10606)
Previously, users of TokenSyntax would always deal with RC<TokenSyntax>
which is a subclass of RawSyntax. Instead, provide TokenSyntax as a
fully-realized Syntax node, that will always exist as a leaf in the
Syntax tree.

This hides the implementation detail of RawSyntax and SyntaxData
completely from clients of libSyntax, and paves the way for future
generation of Syntax nodes.
2017-06-27 11:08:10 -07:00
Harlan Haskins
a273ca405d [Syntax] Don't serialize text for simple tokens
The text for tok::kw_struct, and similar, are always known.
Save space by not serializing them.
2017-06-20 15:43:34 -07:00
Harlan Haskins
841df8d8b5 Correct comment 2017-06-15 16:11:13 -07:00
Harlan Haskins
d0d4967f3f Fix unfinished comment 2017-06-15 16:10:03 -07:00
Harlan Haskins
bc6e56c17c Add simple diff test for serialized syntax 2017-06-15 16:08:11 -07:00