We were only keeping track of `RawSyntax` node IDs to incrementally transfer a syntax tree via JSON. However, AFAICT the incremental JSON transfer option has been superseded by `SyntaxParseActions`, which are more efficient.
So, let’s clean up and remove the `RawSyntax` node ID and JSON incremental transfer option.
In places that still need a notion of `RawSyntax` identity (like determining the reused syntax regions), use the `RawSyntax`’s pointer instead of the manually created ID.
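A minimal sketch of what pointer-based identity can look like. `RawNode` and `ReusedNodeSet` are hypothetical stand-ins for illustration, not the actual libSyntax types:

```cpp
#include <unordered_set>

// Hypothetical stand-in for a raw syntax node. Note that it carries no ID.
struct RawNode {
  int kind;
};

// Hypothetical registry of nodes reused from a previous parse.
// The node's address serves as its identity.
class ReusedNodeSet {
  std::unordered_set<const RawNode *> reused;

public:
  void markReused(const RawNode *node) { reused.insert(node); }
  bool isReused(const RawNode *node) const { return reused.count(node) != 0; }
};
```

An address is only a valid identity while the node is alive, so a set like this must not outlive the memory that holds the nodes.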
In `incr_transfer_round_trip.py`, always use the `SyntaxParseActions` code path and remove the transitional code that still used the incremental JSON transfer but was never called.
Currently, when creating a `RawSyntax` layout node, the `RawSyntax` constructor needs to iterate over all child nodes to
a) sum up their sub node count
b) add their arena as a child arena of the new node's arena
But we are already iterating over all child nodes in every place that calls these constructors. So instead of looping twice, we can perform the above operations in the loop that already exists and pass the results as parameters to the `RawSyntax` constructor, which speeds up `RawSyntax` node creation.
To ensure the integrity of the `RawSyntax` tree, the passed-in values are still validated in release builds.
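The idea can be sketched as follows, covering only the sub node count and leaving out arenas. `Node`, `makeLayout` and `hasConsistentCount` are hypothetical names, and when the consistency check runs is left open here:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified layout node.
struct Node {
  std::vector<const Node *> children; // null entries model missing children
  std::size_t subNodeCount;           // number of descendants

  // The constructor does not loop over `kids`; it stores the caller's count.
  Node(std::vector<const Node *> kids, std::size_t precomputedCount)
      : children(std::move(kids)), subNodeCount(precomputedCount) {}

  // Separate check that recomputes the count from the children.
  bool hasConsistentCount() const {
    std::size_t n = 0;
    for (const Node *c : children)
      if (c)
        n += c->subNodeCount + 1;
    return n == subNodeCount;
  }
};

// The caller already loops over the children to collect them, so it sums
// the sub node counts in that same loop.
Node makeLayout(const std::vector<const Node *> &parsedChildren) {
  std::vector<const Node *> kids;
  std::size_t count = 0;
  for (const Node *c : parsedChildren) {
    kids.push_back(c);
    if (c)
      count += c->subNodeCount + 1;
  }
  return Node(std::move(kids), count);
}
```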
For syntax nodes that previously didn’t have a `validate` method, the newly added `validate` method is a no-op. This will make validation easier in upcoming generic code.
In contrast to SyntaxData, SyntaxDataRef is not memory-safe, but
designed to be fast. In particular, the following guarantees from
SyntaxData are being dropped:
- SyntaxDataRef does not retain the SyntaxArena containing its
RawSyntax. The user of SyntaxDataRef has to provide that guarantee.
However, that's usually pretty easily done by just retaining the
SyntaxArena of the tree's root node.
- The parent of a SyntaxDataRef must outlive the child node. This is
the trickier constraint, but if a tree is just walked top to
bottom with nodes stored on the stack, it is guaranteed by the order
in which the stack unwinds.
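A rough sketch of the top-to-bottom walk described above. `Raw` and `NodeRef` are hypothetical simplifications, not the real SyntaxDataRef:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical raw tree node, owned elsewhere (e.g. by an arena that the
// caller keeps alive for the duration of the walk).
struct Raw {
  int kind;
  std::vector<const Raw *> children;
};

// Hypothetical non-owning view: plain pointers, no reference counting.
struct NodeRef {
  const Raw *raw;
  const NodeRef *parent; // must outlive this NodeRef
};

// Top-down walk. Each child NodeRef lives in the stack frame of the call
// that visits its parent, so the parent is still alive while the child is
// in use.
std::size_t maxDepth(const NodeRef &node) {
  std::size_t deepest = 0;
  for (const Raw *childRaw : node.raw->children) {
    NodeRef child{childRaw, &node}; // stack-allocated, points at the parent
    std::size_t d = maxDepth(child);
    if (d > deepest)
      deepest = d;
  }
  return deepest + 1;
}

// Number of ancestors, found by following the parent pointers.
std::size_t ancestorCount(const NodeRef &node) {
  std::size_t n = 0;
  for (const NodeRef *p = node.parent; p; p = p->parent)
    ++n;
  return n;
}
```

Storing such a reference beyond the frame that created its parent would leave a dangling parent pointer, which is exactly the guarantee the user has to provide.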
These methods are super small and setting up the stack frame etc. takes
up the majority (or at least a significant amount) of their execution
time. So let's inline them.
Instead of having a heap-allocated RefCountedBox to store a SyntaxData's
parent, reference-count SyntaxData itself. This has a couple of
advantages:
- When passing SyntaxData around, only a pointer needs to be passed
instead of the entire struct contents. This is faster.
- We can later introduce a SyntaxDataRef, which behaves similarly to
SyntaxData, but delegates the responsibility that the parent stays
alive to the user. While sacrificing guaranteed memory safety, this
means that SyntaxData can then be stack-allocated without any
ref-counting overhead.
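As an illustration only, the ownership model can be sketched with `std::shared_ptr` standing in for the compiler's own reference-counting machinery. `Data` and `makeChild` are hypothetical:

```cpp
#include <memory>
#include <utility>

// Hypothetical reference-counted node that retains its parent directly,
// without a separate heap-allocated box around the parent.
struct Data {
  int kind;
  std::shared_ptr<Data> parent; // keeps the parent alive

  Data(int kind, std::shared_ptr<Data> parent)
      : kind(kind), parent(std::move(parent)) {}
};

// Creating a child only copies a pointer and bumps the parent's count.
std::shared_ptr<Data> makeChild(const std::shared_ptr<Data> &parent, int kind) {
  return std::make_shared<Data>(kind, parent);
}
```

Passing a `std::shared_ptr<Data>` around moves a pointer rather than the node's contents, which is the first advantage listed above.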
Instead, only reference count the SyntaxArena that the RawSyntax nodes
live in. The user of RawSyntax nodes must guarantee that the SyntaxArena
stays alive as long as the RawSyntax nodes are being accessed.
During parse time, the SyntaxTreeCreator holds on to the SyntaxArena
in which it creates RawSyntax nodes. When inspecting a syntax tree,
the root SyntaxData node keeps the SyntaxArena alive. The change should
be mostly invisible to the users of the public libSyntax API.
This change significantly decreases the overall reference-counting
overhead. Since we were not able to free individual RawSyntax nodes
anyway, performing the reference-counting on the level of the
SyntaxArena feels natural.
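A minimal sketch of arena-level reference counting, again with `std::shared_ptr` as a stand-in. `Arena`, `Tree` and `buildTree` are hypothetical, and the real arena is a bump allocator rather than a `std::deque`:

```cpp
#include <cstddef>
#include <deque>
#include <memory>

struct RawNode {
  int kind;
};

// Hypothetical arena: owns all nodes and frees them together.
class Arena {
  std::deque<RawNode> nodes; // deque keeps addresses stable on push_back

public:
  const RawNode *create(int kind) {
    nodes.push_back(RawNode{kind});
    return &nodes.back();
  }
  std::size_t size() const { return nodes.size(); }
};

// Hypothetical root handle: the only place a reference count is touched.
struct Tree {
  std::shared_ptr<Arena> arena; // keeps every node alive
  const RawNode *root;          // plain pointer, no per-node ref-count
};

Tree buildTree() {
  auto arena = std::make_shared<Arena>();
  arena->create(1);
  arena->create(2);
  const RawNode *root = arena->create(0);
  return Tree{arena, root};
}
```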
This decreases the size of RawSyntax nodes from 88 to 64 bytes by
- Avoiding some padding by moving RefCount further up
- Limiting the length of tokens and their trivia to 32 bits. We would
hit this limit with files >4GB but we also hit this limit in other
places like the TextLength property in the Common bits.
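Both techniques can be illustrated with two hypothetical structs. These are not the real RawSyntax layouts, and the exact sizes depend on the target ABI:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical "before" layout: small fields interleaved with 64-bit ones
// force the compiler to insert padding.
struct WideNode {
  std::uint8_t kind;
  std::uint64_t refCount;
  std::uint8_t flags;
  std::uint64_t textLength;
  std::uint64_t triviaLength;
};

// Hypothetical "after" layout: the widest field first, lengths narrowed to
// 32 bits, small fields grouped at the end.
struct CompactNode {
  std::uint64_t refCount;
  std::uint32_t textLength;
  std::uint32_t triviaLength;
  std::uint8_t kind;
  std::uint8_t flags;
};

static_assert(sizeof(CompactNode) < sizeof(WideNode),
              "reordering and narrowing should shrink the node");

// A length is only representable in the compact layout if it fits in 32 bits.
inline bool fitsInCompactLength(std::uint64_t length) {
  return length <= std::numeric_limits<std::uint32_t>::max();
}
```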
Do the same thing that we are already doing for trivia: Since RawSyntax
nodes always live inside a SyntaxArena, we don't need to tail-allocate
an OwnedString to store the token's text. Instead we can just copy it
to the SyntaxArena. If we copy the entire source buffer to the syntax
arena at the start of parsing, this means that no more copies are
required later on. Plus we also avoid ref-counting the OwnedString which
should also increase performance.
Referencing a string in arbitrary memory is not safe since the source
buffer to which it points may have been freed. Instead copy all strings
into the SyntaxArena. Since RawSyntax nodes retain their arena, they can
be sure that the string won't disappear if it lives in their arena.
To avoid lots of small copies, we copy the entire source buffer once
into the syntax arena and make StringRefs point into that buffer.
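A small sketch of the single-copy approach. `TextRef` is a hypothetical stand-in for StringRef, and `Arena`, `Token` and `makeToken` are simplified illustrations:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical non-owning reference into the arena's copy of the source.
struct TextRef {
  const char *data;
  std::size_t length;
  std::string str() const { return std::string(data, length); }
};

// Hypothetical arena that copies the whole source buffer exactly once.
class Arena {
  std::string source;

public:
  explicit Arena(const std::string &originalSource) : source(originalSource) {}

  // Returns a reference to source[offset, offset + length) without copying.
  TextRef text(std::size_t offset, std::size_t length) const {
    assert(offset + length <= source.size());
    return TextRef{source.data() + offset, length};
  }
};

// The token retains the arena, so its text cannot disappear underneath it.
struct Token {
  std::shared_ptr<Arena> arena;
  TextRef text;
};

Token makeToken(const std::shared_ptr<Arena> &arena, std::size_t offset,
                std::size_t length) {
  return Token{arena, arena->text(offset, length)};
}
```

The original buffer can be freed or overwritten after the arena is created; the token's text stays valid because it points into the arena's copy.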
This way, we will later be able to store additional information about
the node inside the same arena with a guarantee that it will always be
alive as long as the node is alive.
This additional information will include
a) the token's text (which can be a StringRef into a copy of the source
code that lives inside the SyntaxArena)
b) the token's unparsed trivia, which can be decomposed into pieces when
needed.
Instead, reference count the SyntaxData's parent. This has a couple of
advantages:
1. We eliminate a const_cast that was potentially unsafe
2. It more closely resembles the architecture on the Swift side
3. It has the potential to be optimised further if the parent can be
accessed in an unsafe, non-reference-counted way
It was originally designed for faster transmission of syntax trees from
C++ to SwiftSyntax, but has been superseded by the CLibParseActions. There's no
deserializer for it anymore, so let's just remove it.
Use a newly introduced `swift_gyb_target_sources` to gyb and use the
generated sources when building. Let CMake figure out when to run the
command, let it invoke the command properly, and indicate that the
sources being added to the target are generated.
SIL differentiability witnesses are a new top-level SIL construct mapping
"original" SIL functions to derivative SIL functions.
SIL differentiability witnesses have the following components:
- "Original" `SILFunction`.
- SIL linkage.
- Differentiability parameter indices (`IndexSubset`).
- Differentiability result indices (`IndexSubset`).
- Derivative `GenericSignature` representing differentiability generic
requirements (optional).
- JVP derivative `SILFunction` (optional).
- VJP derivative `SILFunction` (optional).
- "Is serialized?" bit.
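For orientation, the component list above can be mirrored as a plain struct. Every type and member name below is a hypothetical simplification for illustration and does not reproduce the actual `SILDifferentiabilityWitness` interface; in particular, `IndexSubset` is modeled as a `std::set` and the linkage enum is reduced to three cases:

```cpp
#include <set>
#include <string>

// Hypothetical placeholders for the real compiler types.
struct Function {
  std::string name;
};
struct GenericSignature {
  std::string requirements;
};
using IndexSubset = std::set<unsigned>;
enum class Linkage { Public, Hidden, Private };
enum class DerivativeKind { JVP, VJP };

// One witness maps an original function plus parameter/result indices to
// its derivative functions. Optional components are modeled as nullable
// pointers.
struct DifferentiabilityWitness {
  const Function *original;
  Linkage linkage;
  IndexSubset parameterIndices;
  IndexSubset resultIndices;
  const GenericSignature *derivativeGenericSignature; // optional
  const Function *jvp;                                // optional
  const Function *vjp;                                // optional
  bool isSerialized;

  // Returns the requested derivative, or null if it is absent.
  const Function *derivative(DerivativeKind kind) const {
    return kind == DerivativeKind::JVP ? jvp : vjp;
  }
};
```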
This patch adds the `SILDifferentiabilityWitness` data structure, with
documentation, parsing, and printing.
Resolves TF-911.
Todos:
- TF-1136: upstream `SILDifferentiabilityWitness` serialization.
- TF-1137: upstream `SILDifferentiabilityWitness` verification.
- TF-1138: upstream `SILDifferentiabilityWitness` SILGen from
`@differentiable` and `@derivative` attributes.
- TF-20: robust mangling for `SILDifferentiabilityWitness` names.
By convention, most structs and classes in the Swift compiler include a `dump()` method which prints debugging information. These methods are meant to be called only from the debugger, which means they’re often unused and may be eliminated from optimized binaries. On the other hand, some parts of the compiler call `dump()` methods directly despite them being intended as a pure debugging aid. Clang supports attributes which can be used to avoid these problems, but they’re used very inconsistently across the compiler.
This commit adds `SWIFT_DEBUG_DUMP` and `SWIFT_DEBUG_DUMPER(<name>(<params>))` macros to declare `dump()` methods with the appropriate set of attributes and adopts these macros throughout the frontend. It does not pervasively adopt them in SILGen, SILOptimizer, or IRGen; these components use `dump()` methods in a different way where they’re frequently called from debugging code. Nor does it adopt them in runtime components like swiftRuntime and swiftReflection, because I’m a bit worried about size.
Despite the large number of files and lines affected, this change is NFC.
(implemented by Nathan Hawes @nathawes)
Advance \p Loc to the last non-missing token of the specified node or, if it
doesn't contain any, to the last non-missing token preceding it in the
tree.
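The fallback behaviour can be sketched over a flat token list in source order. `Token`, `NodeRange` and `advanceToLastPresentToken` are hypothetical and model the tree only as a range of token indices:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token: its source offset and whether it is missing.
struct Token {
  std::size_t offset;
  bool missing;
};

// Hypothetical node, described by the token indices [begin, end) it covers.
struct NodeRange {
  std::size_t begin, end;
};

// Scans backwards from the node's last token. Because the scan continues
// past node.begin, it falls back to tokens preceding the node when the node
// itself has no non-missing token. Returns false if no such token exists.
bool advanceToLastPresentToken(const std::vector<Token> &tokens,
                               NodeRange node, std::size_t &loc) {
  assert(node.end <= tokens.size());
  for (std::size_t i = node.end; i > 0; --i) {
    const Token &tok = tokens[i - 1];
    if (!tok.missing) {
      loc = tok.offset;
      return true;
    }
  }
  return false;
}
```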
Represents a type with a code-completion token:
type? '.'? <code-completion-token>
This is a "parser-only" node which is not exposed to SwiftSyntax.
Using this node, we can defer passing the parsed type to the code-completion callbacks.