This eliminates the overhead of ParsedRawSyntaxNode needing to do memory management.
If ParsedRawSyntaxNode needs to point to some data the memory is allocated from a bump allocator.
There are also some improvements on how the ParsedSyntaxBuilders work.
Instead of creating syntax nodes directly, modify the parser to invoke an abstract interface 'SyntaxParseActions' while it is parsing the source code.
This decouples the act of parsing from the act of forming a syntax tree representation.
'SyntaxTreeCreator' is an implementation of SyntaxParseActions that handles the logic of creating a syntax tree.
To enforce the layering separation of parsing and syntax tree creation, a static library swiftSyntaxParse is introduced to compose the two.
This decoupling is important for introducing a syntax parser library for SwiftSyntax to directly access parsing.
Instead of creating multiple CodeBlockItemList nodes, that need to get merged and discarded later on, do this:
* Ensure for libSyntax parsing that we parse the whole file
* Create top-level CodeBlockItem nodes that we just directly wrap with a single CodeBlockItemList node at the end
The importance of this change will become more obvious later on when we'll decouple syntax parsing from the formation of libSyntax tree nodes.
This allows an elegant design in which we can still allocate RawSyntax
nodes using a bump allocator but are able to automatically free that
buffer once the last RawSyntax node within that buffer is freed.
This also resolves a memory leak of RawSyntax nodes that was caused by
ParserUnit not freeing its underlying ASTContext.
We cannot use unowned strings for token texts of incrementally parsed
syntax trees since the source buffer to which reused nodes refer will
have been freed for reused nodes. Always copying the token text whenever
OwnedString is passed is too expensive. A reference counted copy of the
string allows us to keep the token's string alive across incremental
parses while eliminating unnecessary copies.
To enhance the error-recovery of syntax parsing, this patch allows the
parser to synthesize missing nodes to satisfy the requirement of a
syntax node under parsing. As proof-of-concept, we synthesize r-braces
for function body to avoid regressing a function decl to an unknown
decl.
This also fixes several issues where attribute arguments could not be
parsed as a TokenList since some of its arguments already had structure
and were not tokens
Before this patch, we have one flag (KeepSyntaxInfo) to turn on two syntax
functionalities of parser: (1) collecting parsed tokens for coloring and
(2) building syntax trees. Since sourcekitd is the only consumer of either of these
functionalities, sourcekitd by default always enables such flag.
However, empirical results show (2) is both heavier and less-frequently
needed than (1). Therefore, separating the flag to two flags makes more
sense, where CollectParsedToken controls (1) and BuildSyntaxTree
controls (2).
CollectingParsedToken is always enabled by sourcekitd because
formatting and syntax-coloring need it; however BuildSyntaxTree should
be explicitly switched on by sourcekitd clients.
resolves: rdar://problem/37483076