It turns out that the bitpacked Commons struct is actually fairly expensive because the CPU needs to apply bitmasks to fetch the IsToken and Presence flag. We've got padding space available, so we might as well properly align these boolean flags.
Also on a source level, replace a couple of bit-restricted unsigned fields by their representing type (e.g. SyntaxKind).
Finally, we can pull out the common bits to RawSyntax and have the Bits union only contain the token- or layout-specific fields.
This also allows us to initialise these fields in the constructor's initialiser list (instead of in the initialiser body).
Lastly, change copyToArenaIfNecessary to work on a char *& and length, which allows us to initialise leading/trailing trivia/token text in the initialiser list and adjust if necessary later.
The getData and getRecordedOrDeferredNode methods broke the move-only semantics of ParsedRawSyntaxNode because it allowed retrieving the data without resetting it.
Change most uses to use takeData or takeRecordedOrDeferredNode instead of the get methods.
To retrieve the children of a ParsedRawSyntaxNode, we still need a way to access the OpaqueSyntaxNode for inspection by the SyntaxParseActions without resetting it. For this, we still maintain a method with the semantics of getData, but it’s now called getUnsafeOpaqueData to make sure it’s not accidentally used.
By now ParsedRawSyntaxNode does not have any knowledge about deferred
node data anymore, which frees up SyntaxParseActions (and, in
particular its sublass SyntaxTreeCreator) to perform optimisations to
more efficiently create and record deferred nodes.
This is a multi-commit effort to push the responsibility of deferred
node handling to the SyntaxParseActions which have more detailed
knowledge of their requirements on deferred nodes and might perform
additional optimisations. For example, the SyntaxTreeCreator can always
create RawSyntax nodes (even for deferred nodes) and decide to simply
not use them, should the deferred nodes not get recorded.
So that we can easily detect 'ParsedSyntaxNode' leaking. When it's
moved, the original node become "null" node. In the destructor of
'ParsedSyntaxNode', assert the node is not "recorded" node.
This eliminates the overhead of ParsedRawSyntaxNode needing to do memory management.
If ParsedRawSyntaxNode needs to point to some data the memory is allocated from a bump allocator.
There are also some improvements on how the ParsedSyntaxBuilders work.
Instead of creating syntax nodes directly, modify the parser to invoke an abstract interface 'SyntaxParseActions' while it is parsing the source code.
This decouples the act of parsing from the act of forming a syntax tree representation.
'SyntaxTreeCreator' is an implementation of SyntaxParseActions that handles the logic of creating a syntax tree.
To enforce the layering separation of parsing and syntax tree creation, a static library swiftSyntaxParse is introduced to compose the two.
This decoupling is important for introducing a syntax parser library for SwiftSyntax to directly access parsing.