Commit Graph

51 Commits

Author SHA1 Message Date
Josh Soref
871634b9f3 Spelling include (#42616)
* spelling: accessible

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: are

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: assume

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: attempt

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: children

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: convenience

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: creation

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: default

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: dereference

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: deserialization

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: embedded

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: enriched

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: excluding

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: for

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: global

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: guarantee

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: initialization

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: initialize

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: label

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: lifting

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: mangled

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: metadata

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: minimum

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: offset

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: only

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: otherwise

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: output

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: overall

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: parameter

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: passed

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: performance

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: referenced

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: standard

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: syntax

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: that

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: the

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: trivia

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: truncate

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: undesirable

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: uniformly

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: uninitialized

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: value

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: verification

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

Co-authored-by: Josh Soref <jsoref@users.noreply.github.com>
2022-04-25 09:00:59 -07:00
Alex Hoppen
294977534c [libSyntax] Remove incremental JSON transfer option
We were only keeping track of `RawSyntax` node IDs to incrementally transfer a syntax tree via JSON. However, AFAICT the incremental JSON transfer option has been superceeded by `SyntaxParseActions`, which are more efficient.

So, let’s clean up and remove the `RawSyntax` node ID and JSON incremental transfer option.

In places that still need a notion of `RawSyntax` identity (like determining the reused syntax regions), use the `RawSyntax`’s pointer instead of the manually created ID.

In `incr_transfer_round_trip.py` always use the code path that uses the `SyntaxParseActions` and remove the transitional code that was still using the incremental JSON transfer but was never called.
2021-04-07 10:01:34 +02:00
Alex Hoppen
d0cb7ad624 [libSyntax] Eliminate loop in RawSyntax constructor
Currently, when creating a `RawSyntax` layout node, the `RawSyntax` constructor needs to iterate over all child nodes to
a) sum up their sub node count
b) add their arena as a child arena of the new node's arena

But we are already iterating over all child nodes in every place that calls these constructors. So instead of looping twice, we can perform the above operations in the loop that already exists and pass the parameters to the `RawSyntax` constructor, which spees up `RawSyntax` node creation.

To ensure the integrity of the `RawSyntax` tree, the passed in values are still validated in release builds.
2021-03-26 18:30:46 +01:00
Alex Hoppen
ab7c51b9e2 [libSyntax] Improve data structure in RawSyntax
It turns out that the bitpacked Commons struct is actually fairly expensive because the CPU needs to apply bitmasks to fetch the IsToken and Presence flag. We've got padding space available, so we might as well properly align these boolean flags.

Also on a source level, replace a couple of bit-restricted unsigned fields by their representing type (e.g. SyntaxKind).

Finally, we can pull out the common bits to RawSyntax and have the Bits union only contain the token- or layout-specific fields.
This also allows us to initialise these fields in the constructor's initialiser list (instead of in the initialiser body).

Lastly, change copyToArenaIfNecessary to work on a char *& and length, which allows us to initialise leading/trailing trivia/token text in the initialiser list and adjust if necessary later.
2021-03-11 07:56:47 +01:00
Alex Hoppen
aa74b1bbe0 Merge pull request #36313 from ahoppen/pr/custom-opt-storage
[libSyntax] Hide null AbsoluteRawSyntax behind a custom OptionalStorage implementation
2021-03-09 09:44:20 +01:00
Alex Hoppen
f284529bd3 [libSyntax] Add a unsafe but fast SyntaxDataRef version of SyntaxData
In contrast to SyntaxData, SyntaxDataRef is not memory-safe, but
designed to be fast. In particular, the following guarantees from
SyntaxData are being dropped:
 - SyntaxDataRef does not retain the SyntaxArena containing its
   RawSyntax. The user of SyntaxDataRef has to provide that guarantee.
   However, that's usually pretty easily done by just retaining the
   SyntaxArena of the tree's root node.
 - The parent of a SyntaxDataRef must outlive the child node. This is
   the more tricky constraint, but if a tree is just walked top to
   bottom with nodes stored on the stack, this is given by the way the
   stack is being unrolled.
2021-03-05 16:59:54 +01:00
Alex Hoppen
3fa3e6cf82 [libSyntax] Make copyToArenaIfNecessary a method on SyntaxArena 2021-03-05 14:26:33 +01:00
Alex Hoppen
95acf4d959 [libSyntax] Inline commonly called methods in RawSyntax and AbsoluteRawSyntax
These methods are super small and setting up the stack frame etc. takes
up the majority (or at least a significant amount) of their execution
time. So let's inline them.
2021-03-04 10:48:41 +01:00
Alex Hoppen
28f5f79bb7 [libSyntax] Don't reference count RawSyntax
Instead, only reference count the SyntaxArena that the RawSyntax nodes
live in. The user of RawSyntax nodes must guarantee that the SyntaxArena
stays alive as long as the RawSyntax nodes are being accessed.

During parse time, the SyntaxTreeCreator holds on to the SyntaxArena
in which it creates RawSyntax nodes. When inspecting a syntax tree,
the root SyntaxData node keeps the SyntaxArena alive. The change should
be mostly invisible to the users of the public libSyntax API.

This change significantly decreases the overall reference-counting
overhead. Since we were not able to free individual RawSyntax nodes
anyway, performing the reference-counting on the level of the
SyntaxArena feels natural.
2021-03-01 09:43:54 +01:00
Alex Hoppen
c1d65de89c [libSyntax] Optimise layout of RawSyntax to be more space efficient
This decreases the size of RawSyntax nodes from 88 to 64 bytes by
- Avoiding some padding by moving RefCount further up
- Limiting the length of tokens and their trivia to 32 bits. We would
  hit this limit with files >4GB but we also hit this limit at other
  places like the TextLength property in the Common bits.
2021-02-10 09:50:12 +01:00
Alex Hoppen
e43bad2c71 [libSyntax] Store the token's text in the SyntaxArena
Do the same thing that we are already doing for trivia: Since RawSyntax
nodes always live inside a SyntaxArena, we don't need to tail-allocate
an OwnedString to store the token's text. Instead we can just copy it
to the SyntaxArena. If we copy the entire source buffer to the syntax
arena at the start of parsing, this means that no more copies are
required later on. Plus we also avoid ref-counting the OwnedString which
should also increase performance.
2021-02-10 09:50:12 +01:00
Alex Hoppen
5e1ba8b16e [libSyntax] Store raw trivia inside RawSyntax and only lex into pieces when requested 2021-02-05 08:15:54 +01:00
Alex Hoppen
803499e165 [libSyntax] Require RawSyntax to always live inside a SyntaxArena
This way, we will later be able to store additional information about
the node inside the same arena with a guarantee that they will always be
alive as long as the node is alive.

These additional information will include
a) the token's text (which can be a StringRef into a copy of the source
   code that lives inside the SyntaxArena)
b) the token's unparsed trivia, which can be decomposed into pieces when
   needed.
2021-02-01 10:34:44 +01:00
Alex Hoppen
8bb1167e21 [libSyntax] Restructure RawSyntax to more closely resemble the SwiftSyntax implementation 2021-01-29 13:08:12 +01:00
Brent Royal-Gordon
99faa033fc [NFC] Standardize dump() methods in frontend
By convention, most structs and classes in the Swift compiler include a `dump()` method which prints debugging information. This method is meant to be called only from the debugger, but this means they’re often unused and may be eliminated from optimized binaries. On the other hand, some parts of the compiler call `dump()` methods directly despite them being intended as a pure debugging aid. clang supports attributes which can be used to avoid these problems, but they’re used very inconsistently across the compiler.

This commit adds `SWIFT_DEBUG_DUMP` and `SWIFT_DEBUG_DUMPER(<name>(<params>))` macros to declare `dump()` methods with the appropriate set of attributes and adopts this macro throughout the frontend. It does not pervasively adopt this macro in SILGen, SILOptimizer, or IRGen; these components use `dump()` methods in a different way where they’re frequently called from debugging code. Nor does it adopt it in runtime components like swiftRuntime and swiftReflection, because I’m a bit worried about size.

Despite the large number of files and lines affected, this change is NFC.
2019-10-31 18:37:42 -07:00
Alex Hoppen
38732abd46 [libSyntax] Pass RC<SyntaxArena> by reference where possible 2018-08-24 08:39:54 -07:00
Alex Hoppen
66374a14ea [libSyntax] Make RawSyntax nodes hold a strong reference to their arena
This allows an elegant design in which we can still allocate RawSyntax
nodes using a bump allocator but are able to automatically free that
buffer once the last RawSyntax node within that buffer is freed.

This also resolves a memory leak of RawSyntax nodes that was caused by
ParserUnit not freeing its underlying ASTContext.
2018-08-24 08:39:54 -07:00
Alex Hoppen
ac512d4341 [libSyntax] Add a reference counted version of OwnedString
We cannot use unowned strings for token texts of incrementally parsed
syntax trees since the source buffer to which reused nodes refer will
have been freed for reused nodes. Always copying the token text whenever
OwnedString is passed is too expensive. A reference counted copy of the
string allows us to keep the token's string alive across incremental
parses while eliminating unnecessary copies.
2018-08-13 15:37:53 -07:00
Alex Hoppen
13aeb5440b [libSyntax] Lazily compute a node's text length 2018-08-09 15:39:48 -07:00
Alex Hoppen
c8226d1507 [libSyntax] Make a typealias to unsigned to represent SyntaxNodeIds 2018-07-19 13:57:08 -07:00
Alex Hoppen
419ba044f1 [libSyntax] Record reused node IDs
This is cheaper than recording reused region offsets and the reused node
IDs will later be used to incrementally transfer the syntax to
SwiftSyntax.
2018-07-19 13:55:57 -07:00
Alex Hoppen
9d59cd286b [incrParse] Add a stable id to the syntax nodes
The id is meant to be stable across incremental parses
2018-07-13 16:56:03 -07:00
John McCall
3247232aa3 Remove some gratutious uses of GCC extensions from the Syntax library.
Naming the bit-field structs is a significant readability improvement
because it's very clear that you shouldn't touch e.g. Bits.Token
without having checked/asserted that you're in a token case.

The assertions are all in statement context (which was obvious
because the NDEBUG versions all included semicolons), so there's no
reason not to use the traditional `do { } while (false)` trick instead
of a statement-expression.

This also clears up some warnings in atypical build configurations.
2018-06-30 03:59:49 -04:00
Alex Hoppen
f8cd1ca749 [libSyntax] Compute the text length of every node on the fly 2018-05-22 08:52:32 -07:00
Alex Hoppen
0eb16d6799 [libSyntax] Update space needed for raw syntax bits
The bit for ManualMemory was not taken care of previously
2018-05-22 08:52:32 -07:00
Alex Hoppen
f369d5d7d4 [libSyntax] Add static_asserts for the size of RawSyntaxBits 2018-05-22 08:52:31 -07:00
Alex Hoppen
d832a38fda [libSyntax] Documentation improvements 2018-05-22 08:52:31 -07:00
Xi Ge
7b4218c2f7 libSyntax: cache absolute positions on SyntaxData.
Aligning with what we did for SwiftSyntax, this patch uses caches for
absolute position calculation on the C++ side.
2018-04-30 15:09:00 -07:00
Rintaro Ishizaki
057254dbc1 [Syntax] Bump allocate and cache/reuse RawSyntax
Introduced SyntaxArena for managing memory and cache.

SyntaxArena holds BumpPtrAllocator as a allocation storage.
RawSyntax is now able to be constructed with normal heap allocation, or
by SyntaxArena. RawSyntax has ManualMemory flag which indicates it's managed by
SyntaxArena. If the flag is true, its Retain()/Release() is no-op thus it's
never destructed by IntrusiveRefCntPtr.
This speedups the memory allocation for RawSyntax.

Also, in Syntax parsing, "token" RawSyntax is reused if:
a) It's not string literal with >16 length; and
b) It doesn't contain random text trivia (e.g. comment).
This reduces the overall allocation cost.
2018-02-02 01:27:06 +09:00
Rintaro Ishizaki
6108c881be [Syntax] Use TrailingObjects for SyntaxData (#14301)
This should optimize memory usage for SyntaxData.
2018-01-31 21:50:04 +09:00
Rintaro Ishizaki
fced748790 [Syntax] Represent missing optioanl nodes as nullptr (#14300)
Allocating RawSyntax/SyntaxData for missing optional node is a waste of
resource.
2018-01-31 19:24:00 +09:00
Rintaro Ishizaki
0780c529c4 [Syntax] Unify RawSyntax and RawTokenSyntax using union and TrailingObjects
It better matches with SwiftSyntax model.

Using TrailingObjects reduces the number of heap allocation which
gains 18% performance improvement.
2018-01-18 14:49:46 +09:00
Xi Ge
031488bada libSyntax: several enhancements on source location bridging. (#13956)
libSyntax nodes don't maintain absolute source location on each
individual node. Instead, the absolute locations are calculated on
demand with a given root by accumulating the length of all the other
nodes before the target node. This bridging is important for issuing
diagnostics from libSyntax entities.

With the observation that our current implementation of the source
location calculation has multiple bugs, this patch re-implemented this
bridging by using the newly-added syntax visitor. Also, we moved the function
from RawSyntax to Syntax for better visibility.

To test this source location calculation, we added a new action in
swift-syntax-test. This action parses a given file as a
SourceFileSyntax, calculates the absolute location of the
EOF token in the SourceFileSyntax, and dump the buffer from the start
of the input file to the absolute location of the EOF. Finally, we compare
the dump with the original input to ensure they are identical.
2018-01-15 16:39:17 -08:00
omochimetaru
9b7f2502c4 [Syntax] line number counting for CRLF 2017-12-22 03:46:12 +09:00
Xi Ge
653de9f23f [test] libSyntax: add a flag to swift-syntax-test to print trivial node kinds.
These trivial node kinds include node collections like stmtlist and
unknown syntax like UnknownExpr.
2017-11-30 14:33:15 -08:00
Xi Ge
01796bf664 libSyntax: fix a memory leak issue because of non-virtual destructor. rdar://35116413
RawTokenSyntax is a derived class from RawSyntax that is reference
counted with its own destructor function registered. Unfortunately, the destructor
function of RawSyntax is non-virtual before this patch. This means when reference counter
releases a pointer of RawSyntax, it won't clean-up the additional stuff in RawTokenSyntax.
2017-11-13 13:18:29 -08:00
Xi Ge
57b077f971 libSyntax: Add convenient APIs to check the category of SyntaxKind. NFC (#12627) 2017-10-25 18:52:43 -07:00
Xi Ge
e4e486edea libSyntax: when printing syntax tree with kind, optionally give syntax kind a visual highlight. 2017-10-21 15:47:19 -07:00
Xi Ge
e0dfa6119f libSyntax: add a test to ensure the generated syntax kinds from parser are expected. 2017-10-21 14:12:59 -07:00
Robert Widmann
481715a227 Ensure RawSyntax macro parameters are the same with NDEBUG (#11196)
I forgot to reset the macro parameters after converting them to
varargs, which didn't get caught running PR testing.

This patch ensures they're all the same.
2017-07-26 09:59:58 -07:00
Harlan
a5098e6b69 Generate libSyntax API (#10926)
* Generate libSyntax API

This patch removes the hand-rolled libSyntax API and replaces it with an
API that's entirely automatically generated. This means the API is
guaranteed to be internally stylistically and functionally consistent.
2017-07-25 18:19:58 -07:00
Harlan
70089a7bcc [Syntax] Represent TokenSyntax as a Syntax node (#10606)
Previously, users of TokenSyntax would always deal with RC<TokenSyntax>
which is a subclass of RawSyntax. Instead, provide TokenSyntax as a
fully-realized Syntax node, that will always exist as a leaf in the
Syntax tree.

This hides the implementation detail of RawSyntax and SyntaxData
completely from clients of libSyntax, and paves the way for future
generation of Syntax nodes.
2017-06-27 11:08:10 -07:00
practicalswift
7eb7d5b109 [gardening] Fix 100 typos. 2017-04-18 17:01:42 +02:00
Harlan
631c7d8064 [Syntax] Refactor Tuple Type Syntax (#8254)
* Refactor Tuple Type Syntax

This patch:

- Refactors TypeArgumentListSyntax and
  TypeArgumentListSyntaxData to use the SyntaxCollection and
  SyntaxCollectionData APIs.
- Refactors TupleTypeElementSyntax to own its trailing comma, and
  updates the tests accordingly.
- Provides an infrastructure for promoting types to use
  the SyntaxCollection APIs

* Addressed comments.

* Renamed makeBlankTypeArgumentList()

* Update makeTupleType

* Changed makeTupleType to take an element list.

* Updated comment.

* Improved API for creating TupleTypeElementListSyntax'es

* Added round-trip test

* Removed last TypeArgumentList holdovers.

* Fixed round-trip test invocation
2017-03-22 08:02:29 -04:00
David Farler
c958cd65eb [Syntax] Allow UnknownSyntax to have children
This will make it easier to incrementally implement syntax nodes,
while allowing us to embed nodes that we do know about inside ones
that we don't.

https://bugs.swift.org/browse/SR-4062
2017-02-28 14:30:57 -08:00
David Farler
c343298b8f [Syntax] Implement return-statement and integer-literal-expr
A return statement needs something to return, so implement
integer-literal-expression too. This necessarily also forced
UnknownExprSyntax, UnknownStmtSyntax, and UnknownDeclSyntax,
which are stand-in token buckets for when we don't know
how to transform/migrate an AST.

This commit also contains the core function for caching
SyntaxData children. This is highly tricky code, with some
detailed comments in SyntaxData.{h,cpp}. The gist is that
we have to atomically swap in a SyntaxData pointer into the
child field, so we can maintain pointer identity of SyntaxData
nodes, while still being able to cache them internally.

To prove that this works, there is a multithreaded test that
checks that two threads can ask for a child that hasn't been
cached yet without crashing or violating pointer identity.

https://bugs.swift.org/browse/SR-4010
2017-02-22 18:45:29 -08:00
David Farler
2d4cb088f4 [Syntax] Remove dummy indent implementation
This isn't a robust implementation and is breaking the build. I'll
put it back once indentation is better specified.
2017-02-17 15:55:00 -08:00
David Farler
7ee42994c8 Start the Syntax library and optional full token lexing
Add an option to the lexer to go back and get a list of "full"
tokens, which include their leading and trailing trivia, which
we can index into from SourceLocs in the current AST.

This starts the Syntax sublibrary, which will support structured
editing APIs. Some skeleton support and basic implementations are
in place for types and generics in the grammar. Yes, it's slightly
redundant with what we have right now. lib/AST conflates syntax
and semantics in the same place(s); this is a first step in changing
that to separate the two concepts for clarity and also to get closer
to incremental parsing and type-checking. The goal is to eventually
extract all of the syntactic information from lib/AST and change that
to be more of a semantic/symbolic model.

Stub out a Semantics manager. This ought to eventually be used as a hub
for encapsulating lazily computed semantic information for syntax nodes.
For the time being, it can serve as a temporary place for mapping from
Syntax nodes to semantically full lib/AST nodes.

This is still in a molten state - don't get too close, wear appropriate
proximity suits, etc.
2017-02-17 12:57:04 -08:00
David Farler
44f15558d6 Revert "Refactor: Rename Parser::consumeToken, consumeLoc. Add consumeToken API."
This reverts commit 39bfc123a3.
2016-11-18 13:23:31 -08:00
practicalswift
6fa577dfbd [gardening] Fix recently introduced typos. Fix inconsistent headers. 2016-11-17 13:09:02 +01:00