ParsedSyntaxBuilder has a convenient function to add member to a syntax-collection
child. The function name uses the type name of the collection's members,
which can lead to name collision. This patch renames it.
To ensure SwiftSyntax calls a compatible parser library, this patch sets
up a C API that returns a constant string calculated during compilation time to indicate
the version of syntax node declarations. The same hash will be calculated
in the SwiftSyntax (client) side as well by using the same algorithm.
During runtime, SwiftSyntax will verify its hash value is identical to the
result of calling swiftparse_node_declaration_hash before actual
parsing happens.
This patch only sets the API up. The actual implementation of the
hashing algorithm will come later.
Instead of creating syntax nodes directly, modify the parser to invoke an abstract interface 'SyntaxParseActions' while it is parsing the source code.
This decouples the act of parsing from the act of forming a syntax tree representation.
'SyntaxTreeCreator' is an implementation of SyntaxParseActions that handles the logic of creating a syntax tree.
To enforce the layering separation of parsing and syntax tree creation, a static library swiftSyntaxParse is introduced to compose the two.
This decoupling is important for introducing a syntax parser library for SwiftSyntax to directly access parsing.
`llvm::ArrayRef<T>` does not define `erase`. However, since the intent
here is to remove the last n elements, we can use `drop_back` instead.
Furthermore, replace the direct use of `Layout` with `getLayout()`.
This was identified by gcc 8.2.
This silences the instances of the warning from Visual Studio about not all
codepaths returning a value. This makes the output more readable and less
likely to lose useful warnings. NFC.
This allows an elegant design in which we can still allocate RawSyntax
nodes using a bump allocator but are able to automatically free that
buffer once the last RawSyntax node within that buffer is freed.
This also resolves a memory leak of RawSyntax nodes that was caused by
ParserUnit not freeing its underlying ASTContext.
We cannot use unowned strings for token texts of incrementally parsed
syntax trees since the source buffer to which reused nodes refer will
have been freed for reused nodes. Always copying the token text whenever
OwnedString is passed is too expensive. A reference counted copy of the
string allows us to keep the token's string alive across incremental
parses while eliminating unnecessary copies.
The recommended way forward is to use the SyntaxClassifier on the Swift
side.
By removing the C++ SyntaxClassifier, we can also eliminate the
-force-libsyntax-based-processing option that was used to bootstrap
incremental parsing and would generate the syntax map from a syntax
tree.
For ScalarTraits, a buffer was always created on the heap to which the
scalar string value was written just to be copied to the output buffer
again. In case the value already exists in a memory buffer it is way
cheaper to avoid the heap allocation and copy it straight to the output
buffer.
For now, the accessors have been underscored as `_read` and `_modify`.
I'll prepare an evolution proposal for this feature which should allow
us to remove the underscores or, y'know, rename them to `purple` and
`lettuce`.
`_read` accessors do not make any effort yet to avoid copying the
value being yielded. I'll work on it in follow-up patches.
Opaque accesses to properties and subscripts defined with `_modify`
accessors will use an inefficient `materializeForSet` pattern that
materializes the value to a temporary instead of accessing it in-place.
That will be fixed by migrating to `modify` over `materializeForSet`,
which is next up after the `read` optimizations.
SIL ownership verification doesn't pass yet for the test cases here
because of a general fault in SILGen where borrows can outlive their
borrowed value due to being cleaned up on the general cleanup stack
when the borrowed value is cleaned up on the formal-access stack.
Michael, Andy, and I discussed various ways to fix this, but it seems
clear to me that it's not in any way specific to coroutine accesses.
rdar://35399664
If enabled using the environment variable
SOURCEKIT_INCREMENTAL_PARSE_VALIDATION, the incrementally parsed syntax
tree will be compared to the from-scratch parsing syntax tree. If they
differ a warning is emitted and log files showing the difference written
to a temporary directory.
Naming the bit-field structs is a significant readability improvement
because it's very clear that you shouldn't touch e.g. Bits.Token
without having checked/asserted that you're in a token case.
The assertions are all in statement context (which was obvious
because the NDEBUG versions all included semicolons), so there's no
reason not to use the traditional `do { } while (false)` trick instead
of a statement-expression.
This also clears up some warnings in atypical build configurations.