Instead, only reference count the SyntaxArena that the RawSyntax nodes
live in. The user of RawSyntax nodes must guarantee that the SyntaxArena
stays alive as long as the RawSyntax nodes are being accessed.
During parse time, the SyntaxTreeCreator holds on to the SyntaxArena
in which it creates RawSyntax nodes. When inspecting a syntax tree,
the root SyntaxData node keeps the SyntaxArena alive. The change should
be mostly invisible to the users of the public libSyntax API.
This change significantly decreases the overall reference-counting
overhead. Since we were not able to free individual RawSyntax nodes
anyway, performing the reference-counting on the level of the
SyntaxArena feels natural.
Do the same thing that we are already doing for trivia: Since RawSyntax
nodes always live inside a SyntaxArena, we don't need to tail-allocate
an OwnedString to store the token's text. Instead we can just copy it
to the SyntaxArena. If we copy the entire source buffer to the syntax
arena at the start of parsing, this means that no more copies are
required later on. Plus we also avoid ref-counting the OwnedString which
should also increase performance.
In practice SyntaxArena.containsPointer is almost always called with a
pointer from the SyntaxArena's source buffer. To avoid walking through
all of the bump allocator's slabs until we find the one containing the
source buffer, add a hot use memory region (which lives inside the bump
allocator) that is checked first before consulting the bump allocator.
Referencing a string in arbitrary memory is not safe since the source
buffer to which it points may have been freed. Instead copy all strings
into the SyntaxArena. Since RawSyntax nodes retain their arena, they can
be sure that the string won't disappear if it lives in their arena.
To avoid lots of small copies, we copy the entire source buffer once
into the syntax arena and make StringRefs point into that buffer.
Currently when parsing a SourceFile, the parser
gets handed pointers so that it can write the
interface hash and collected tokens directly into
the file. It can also call `setSyntaxRoot` at
the end of parsing to set the syntax tree.
In preparation for the removal of
`performParseOnly`, this commit formalizes these
values as outputs of `ParseSourceFileRequest`,
ensuring that the file gets parsed when the
interface hash, collected tokens, or syntax tree
is queried.
Like the last commit, SourceFile is used a lot by Parse and Sema, but
less so by the ClangImporter and (de)Serialization. Split it out to
cut down on recompilation times when something changes.
This commit does /not/ split the implementation of SourceFile out of
Module.cpp, which is where most of it lives. That might also be a
reasonable change, but the reason I was reluctant to is because a
number of SourceFile members correspond to the entry points in
ModuleDecl. Someone else can pick this up later if they decide it's a
good idea.
No functionality change.
Most of AST, Parse, and Sema deal with FileUnits regularly, but SIL
and IRGen certainly don't. Split FileUnit out into its own header to
cut down on recompilation times when something changes.
No functionality change.
This eliminates the overhead of ParsedRawSyntaxNode needing to do memory management.
If ParsedRawSyntaxNode needs to point to some data the memory is allocated from a bump allocator.
There are also some improvements on how the ParsedSyntaxBuilders work.
Instead of creating syntax nodes directly, modify the parser to invoke an abstract interface 'SyntaxParseActions' while it is parsing the source code.
This decouples the act of parsing from the act of forming a syntax tree representation.
'SyntaxTreeCreator' is an implementation of SyntaxParseActions that handles the logic of creating a syntax tree.
To enforce the layering separation of parsing and syntax tree creation, a static library swiftSyntaxParse is introduced to compose the two.
This decoupling is important for introducing a syntax parser library for SwiftSyntax to directly access parsing.