We cannot use unowned strings for token texts of incrementally parsed
syntax trees since the source buffer to which reused nodes refer will
have been freed for reused nodes. Always copying the token text whenever
OwnedString is passed is too expensive. A reference counted copy of the
string allows us to keep the token's string alive across incremental
parses while eliminating unnecessary copies.
To enhance the error-recovery of syntax parsing, this patch allows the
parser to synthesize missing nodes to satisfy the requirement of a
syntax node under parsing. As proof-of-concept, we synthesize r-braces
for function body to avoid regressing a function decl to an unknown
decl.
This also fixes several issues where attribute arguments could not be
parsed as a TokenList since some of its arguments already had structure
and were not tokens
Before this patch, we have one flag (KeepSyntaxInfo) to turn on two syntax
functionalities of parser: (1) collecting parsed tokens for coloring and
(2) building syntax trees. Since sourcekitd is the only consumer of either of these
functionalities, sourcekitd by default always enables such flag.
However, empirical results show (2) is both heavier and less-frequently
needed than (1). Therefore, separating the flag to two flags makes more
sense, where CollectParsedToken controls (1) and BuildSyntaxTree
controls (2).
CollectingParsedToken is always enabled by sourcekitd because
formatting and syntax-coloring need it; however BuildSyntaxTree should
be explicitly switched on by sourcekitd clients.
resolves: rdar://problem/37483076