Now that we have a fast SyntaxDataRef, create a corresponding SyntaxRef hierarchy. In contrast to the Syntax, SyntaxRef does *not* own the backing SyntaxDataRef, but merely has a pointer to the SyntaxDataRef.
In addition to the requirements imposed by SyntaxDataRef, the user of the SyntaxRef hierarchy needs to make sure that the backing SyntaxDataRef of a SyntaxRef node stays alive. While this sounds like a lot of requirements, it has performance advantages:
- Passing a SyntaxRef node around is just passing a pointer around.
- When casting a SyntaxRef node, we only need to create a new SyntaxRef (aka. pointer) that points to the same underlying SyntaxDataRef - there's no need to duplicate the SyntaxDataRef.
- As SyntaxDataRef is not ref-counted, there's no ref-counting overhead involved.
Furthermore, the requirements are typically fulfilled. The getChild methods on SyntaxRef return an OwnedSyntaxRef, which stores the SyntaxDataRef. As long as this variable is stored somewhere on the stack, the corresponding SyntaxRef can safely be used for the duration of the stack frame. Even calls like the following are possible, because OwnedSyntaxRef returned by getChild stays alive for the duration of the entire statement.
```
useSyntaxRef(mySyntaxRef.getChild(0).getRef())
```
For syntax nodes that previously didn’t have a `validate` method, the newly added `validate` method is a no-op. This will make validation easier in upcoming generic code.
Instead of having a heap-allocated RefCountedBox to store a SyntaxData's
parent, reference-count SyntaxData itself. This has a couple of
advantages:
- When passing SyntaxData around, only a pointer needs to be passed
instead of the entire struct contents. This is faster.
- We can later introduce a SyntaxDataRef, which behaves similar to
SyntaxData, but delegates the responsibility that the parent stays
alive to the user. While sacrificing guaranteed memory safety, this
means that SyntaxData can then be stack-allocated without any
ref-counting overhead.
Instead, only reference count the SyntaxArena that the RawSyntax nodes
live in. The user of RawSyntax nodes must guarantee that the SyntaxArena
stays alive as long as the RawSyntax nodes are being accessed.
During parse time, the SyntaxTreeCreator holds on to the SyntaxArena
in which it creates RawSyntax nodes. When inspecting a syntax tree,
the root SyntaxData node keeps the SyntaxArena alive. The change should
be mostly invisible to the users of the public libSyntax API.
This change significantly decreases the overall reference-counting
overhead. Since we were not able to free individual RawSyntax nodes
anyway, performing the reference-counting on the level of the
SyntaxArena feels natural.
Do the same thing that we are already doing for trivia: Since RawSyntax
nodes always live inside a SyntaxArena, we don't need to tail-allocate
an OwnedString to store the token's text. Instead we can just copy it
to the SyntaxArena. If we copy the entire source buffer to the syntax
arena at the start of parsing, this means that no more copies are
required later on. Plus we also avoid ref-counting the OwnedString which
should also increase performance.
Instead, reference count the SyntaxData's parent. This has a couple of
advantages:
1. We eliminate a const_cast that was potentially unsafe
2. It more closely resembles the architecture on the Swift side
3. It has the potential to be optimised further if the parent can be
accessed in an unsafe, non-reference-counted way
- Stop producing 'backtick' trivia for escaping identifier token. '`'s
are now parts of the token text
- Adjust and simplify C++ libSyntax APIs
- Add 'is_deprecated' property to Trivia.py to attribute SwiftSyntax
APIs
rdar://problem/54810608
* Generate libSyntax API
This patch removes the hand-rolled libSyntax API and replaces it with an
API that's entirely automatically generated. This means the API is
guaranteed to be internally stylistically and functionally consistent.
Previously, users of TokenSyntax would always deal with RC<TokenSyntax>
which is a subclass of RawSyntax. Instead, provide TokenSyntax as a
fully-realized Syntax node, that will always exist as a leaf in the
Syntax tree.
This hides the implementation detail of RawSyntax and SyntaxData
completely from clients of libSyntax, and paves the way for future
generation of Syntax nodes.
Add an option to the lexer to go back and get a list of "full"
tokens, which include their leading and trailing trivia, which
we can index into from SourceLocs in the current AST.
This starts the Syntax sublibrary, which will support structured
editing APIs. Some skeleton support and basic implementations are
in place for types and generics in the grammar. Yes, it's slightly
redundant with what we have right now. lib/AST conflates syntax
and semantics in the same place(s); this is a first step in changing
that to separate the two concepts for clarity and also to get closer
to incremental parsing and type-checking. The goal is to eventually
extract all of the syntactic information from lib/AST and change that
to be more of a semantic/symbolic model.
Stub out a Semantics manager. This ought to eventually be used as a hub
for encapsulating lazily computed semantic information for syntax nodes.
For the time being, it can serve as a temporary place for mapping from
Syntax nodes to semantically full lib/AST nodes.
This is still in a molten state - don't get too close, wear appropriate
proximity suits, etc.