To construct struct syntax, this patch first specialized type
inheritance clause. For protocol's class requirement, we currently
treat it as an unknown type.
This patch also teaches SyntaxParsingContext to collect syntax nodes
from back in place. This is useful to squash multiple decl modifiers
for declarations like function. This is not used for struct declaration
because only accessibility modifier is allowed.
Because generic where clause doesn't coerce well to our existing syntax
context kinds, we add a new syntax context kind with this patch called
"Syntax". This context kind indicates that when error occurs, the
collection of syntax nodes falling into the context should be coerced
to UnknownSyntax.
RawTokenSyntax is a derived class from RawSyntax that is reference
counted with its own destructor function registered. Unfortunately, the destructor
function of RawSyntax is non-virtual before this patch. This means when reference counter
releases a pointer of RawSyntax, it won't clean-up the additional stuff in RawTokenSyntax.
Avoid heap-allocated memory for syntax parsing context.
Add more assertions to ensure syntax nodes are created only at the top of context stack.
Allow syntax parsing context to delay the specifying of context kind and target syntax kind.
This commit also adds ArrayExpr and DictionaryExpr to the libSyntax nodes
family. Also, it refactors the original parser code for these two
expressions to better fit to the design of SyntaxParsingContext.
This commit has also fixed two crashers.
This commit teaches parser to parse two libSyntax nodes: FunctionCallArgument and
FunctionCallArgumentList. Along with the change, some libSyntax parsing infrastructure changes
as well: (1) parser doesn't directly insert token into the buffer for libSyntax node creation;
instead, when creating a simple libSyntax node like integer literal expression, parser should indicate the location of the last token in the node; (2) implicit libSyntax nodes like empty
statement list must contain a source location indicating where the implicit nodes should appear
(immediately before the token at the given location).
This commit teaches parser to generate code block syntax node. As a support for this,
SyntaxParsingContext can be created by a single syntax kind, indicating the whole context
should be parsed into a node of that given syntax. Another change is to bridge created syntax
node with the given context kind. For instance, if a statement context results into an expression
node, the expression node will be bridged to a statement by wrapping it with a ExpressionStmt
node.
* Re-apply "libSyntax: Ensure round-trip printing when we build syntax tree from parser incrementally. (#12709)"
* Re-apply "libSyntax: Root parsing context should hold a reference to the current token in the parser, NFC."
* Re-apply "libSyntax: avoid copying token text when lexing token syntax nodes, NFC. (#12723)"
* Actually fix the container-overflow issue.
Since all parsing contexts need a reference to the current token of the
parser, we should pass the token reference to the root context. Therefore, the derived
sub-contexts can just copy it while being spawned.
For very large source files, the parser's syntax map---which contains a
very large number of TrivialLists---was taking an inordinate amount of
memory due to the inefficiency of std::deque. Specifically, a
std::deque containing just one trivial element would allocate 4k of
memory. With the ~120MB SIL output of one of the parse_stdlib tests,
these std::deques would add up to > 6GB of memory, most of which is
wasted.
Replacing the std::deque with a std::vector knocks the memory required
for one of the parse_stdlib tests from > 8GB down closer to 2 GB. The
parser's syntax map is still large (e.g., a 512MB allocation for the
overall vector plus a few hundred MB of raw-syntax data), but not
prohibitively so.
Part of rdar://problem/34771322.
For very large source files, the parser's syntax map---which contains a
very large number of TrivialLists---was taking an inordinate amount of
memory due to the inefficiency of std::deque. Specifically, a
std::deque containing just one trivial element would allocate 4k of
memory. With the ~120MB SIL output of one of the parse_stdlib tests,
these std::deques would add up to > 6GB of memory, most of which is
wasted.
Replacing the std::deque with a std::vector knocks the memory required
for one of the parse_stdlib tests from > 8GB down closer to 2 GB. The
parser's syntax map is still large (e.g., a 512MB allocation for the
overall vector plus a few hundred MB of raw-syntax data), but not
prohibitively so.
Part of rdar://problem/34771322.
This patch allows Parser to generate a refined token stream to satisfy tooling's need. For syntax coloring, token stream from lexer is insufficient because (1) we have contextual keywords like get and set; (2) we may allow keywords to be used as argument labels and names; and (3) we need to split tokens like "==<". In this patch, these refinements are directly fulfilled through parsing without additional heuristics. The refined token vector is optionally saved in SourceFile instance.
I forgot to reset the macro parameters after converting them to
varargs, which didn't get caught running PR testing.
This patch ensures they're all the same.
* Generate libSyntax API
This patch removes the hand-rolled libSyntax API and replaces it with an
API that's entirely automatically generated. This means the API is
guaranteed to be internally stylistically and functionally consistent.
Previously, users of TokenSyntax would always deal with RC<TokenSyntax>
which is a subclass of RawSyntax. Instead, provide TokenSyntax as a
fully-realized Syntax node, that will always exist as a leaf in the
Syntax tree.
This hides the implementation detail of RawSyntax and SyntaxData
completely from clients of libSyntax, and paves the way for future
generation of Syntax nodes.