Commit Graph

47 Commits

Author SHA1 Message Date
Alex Hoppen
294977534c [libSyntax] Remove incremental JSON transfer option
We were only keeping track of `RawSyntax` node IDs to incrementally transfer a syntax tree via JSON. However, AFAICT the incremental JSON transfer option has been superceeded by `SyntaxParseActions`, which are more efficient.

So, let’s clean up and remove the `RawSyntax` node ID and JSON incremental transfer option.

In places that still need a notion of `RawSyntax` identity (like determining the reused syntax regions), use the `RawSyntax`’s pointer instead of the manually created ID.

In `incr_transfer_round_trip.py` always use the code path that uses the `SyntaxParseActions` and remove the transitional code that was still using the incremental JSON transfer but was never called.
2021-04-07 10:01:34 +02:00
Alex Hoppen
d0cb7ad624 [libSyntax] Eliminate loop in RawSyntax constructor
Currently, when creating a `RawSyntax` layout node, the `RawSyntax` constructor needs to iterate over all child nodes to
a) sum up their sub node count
b) add their arena as a child arena of the new node's arena

But we are already iterating over all child nodes in every place that calls these constructors. So instead of looping twice, we can perform the above operations in the loop that already exists and pass the parameters to the `RawSyntax` constructor, which spees up `RawSyntax` node creation.

To ensure the integrity of the `RawSyntax` tree, the passed in values are still validated in release builds.
2021-03-26 18:30:46 +01:00
Alex Hoppen
95acf4d959 [libSyntax] Inline commonly called methods in RawSyntax and AbsoluteRawSyntax
These methods are super small and setting up the stack frame etc. takes
up the majority (or at least a significant amount) of their execution
time. So let's inline them.
2021-03-04 10:48:41 +01:00
Alex Hoppen
28f5f79bb7 [libSyntax] Don't reference count RawSyntax
Instead, only reference count the SyntaxArena that the RawSyntax nodes
live in. The user of RawSyntax nodes must guarantee that the SyntaxArena
stays alive as long as the RawSyntax nodes are being accessed.

During parse time, the SyntaxTreeCreator holds on to the SyntaxArena
in which it creates RawSyntax nodes. When inspecting a syntax tree,
the root SyntaxData node keeps the SyntaxArena alive. The change should
be mostly invisible to the users of the public libSyntax API.

This change significantly decreases the overall reference-counting
overhead. Since we were not able to free individual RawSyntax nodes
anyway, performing the reference-counting on the level of the
SyntaxArena feels natural.
2021-03-01 09:43:54 +01:00
Alex Hoppen
c1d65de89c [libSyntax] Optimise layout of RawSyntax to be more space efficient
This decreases the size of RawSyntax nodes from 88 to 64 bytes by
- Avoiding some padding by moving RefCount further up
- Limiting the length of tokens and their trivia to 32 bits. We would
  hit this limit with files >4GB but we also hit this limit at other
  places like the TextLength property in the Common bits.
2021-02-10 09:50:12 +01:00
Alex Hoppen
e43bad2c71 [libSyntax] Store the token's text in the SyntaxArena
Do the same thing that we are already doing for trivia: Since RawSyntax
nodes always live inside a SyntaxArena, we don't need to tail-allocate
an OwnedString to store the token's text. Instead we can just copy it
to the SyntaxArena. If we copy the entire source buffer to the syntax
arena at the start of parsing, this means that no more copies are
required later on. Plus we also avoid ref-counting the OwnedString which
should also increase performance.
2021-02-10 09:50:12 +01:00
Alex Hoppen
5637c25168 [libSyntax] Always copy leading and trailing trivia strings into a SyntaxArena buffer
Referencing a string in arbitrary memory is not safe since the source
buffer to which it points may have been freed. Instead copy all strings
into the SyntaxArena. Since RawSyntax nodes retain their arena, they can
be sure that the string won't disappear if it lives in their arena.

To avoid lots of small copies, we copy the entire source buffer once
into the syntax arena and make StringRefs point into that buffer.
2021-02-05 08:15:54 +01:00
Alex Hoppen
5e1ba8b16e [libSyntax] Store raw trivia inside RawSyntax and only lex into pieces when requested 2021-02-05 08:15:54 +01:00
Alex Hoppen
803499e165 [libSyntax] Require RawSyntax to always live inside a SyntaxArena
This way, we will later be able to store additional information about
the node inside the same arena with a guarantee that they will always be
alive as long as the node is alive.

These additional information will include
a) the token's text (which can be a StringRef into a copy of the source
   code that lives inside the SyntaxArena)
b) the token's unparsed trivia, which can be decomposed into pieces when
   needed.
2021-02-01 10:34:44 +01:00
Alex Hoppen
8bb1167e21 [libSyntax] Restructure RawSyntax to more closely resemble the SwiftSyntax implementation 2021-01-29 13:08:12 +01:00
Brent Royal-Gordon
99faa033fc [NFC] Standardize dump() methods in frontend
By convention, most structs and classes in the Swift compiler include a `dump()` method which prints debugging information. This method is meant to be called only from the debugger, but this means they’re often unused and may be eliminated from optimized binaries. On the other hand, some parts of the compiler call `dump()` methods directly despite them being intended as a pure debugging aid. clang supports attributes which can be used to avoid these problems, but they’re used very inconsistently across the compiler.

This commit adds `SWIFT_DEBUG_DUMP` and `SWIFT_DEBUG_DUMPER(<name>(<params>))` macros to declare `dump()` methods with the appropriate set of attributes and adopts this macro throughout the frontend. It does not pervasively adopt this macro in SILGen, SILOptimizer, or IRGen; these components use `dump()` methods in a different way where they’re frequently called from debugging code. Nor does it adopt it in runtime components like swiftRuntime and swiftReflection, because I’m a bit worried about size.

Despite the large number of files and lines affected, this change is NFC.
2019-10-31 18:37:42 -07:00
Rintaro Ishizaki
5c8cacec17 [Syntax] Include leading/trailing trivia size to the cache ID
We have to differenciate cache IDs between:
  (Token l_brace
    (trivia space 1)
    (text="{"))
and:
  (Token l_brace
    (text="{")
    (trivia space 1))
2019-10-21 15:42:19 -07:00
Rintaro Ishizaki
0e8010d8b9 Revert "Merge pull request #27592 from rintaro/syntaxparse-exprtuple"
This reverts commit cdfd1ab2cf, reversing
changes made to eb02f20f99.
2019-10-14 12:15:48 -07:00
Rintaro Ishizaki
33e561a810 [Syntax] Include leading/trailing trivia size to the cache ID
We have to differenciate cache IDs between:
  (Token l_brace
    (trivia space 1)
    (text="{"))
and:
  (Token l_brace
    (text="{")
    (trivia space 1))
2019-10-09 17:01:35 -07:00
Argyrios Kyrtzidis
668fa1d721 [ParsedRawSyntaxNode] Fix ParsedRawSyntaxNode::dump()
Using dumpTokenKind() function instead of getTokenText().
2019-01-07 19:56:37 -08:00
Alex Hoppen
38732abd46 [libSyntax] Pass RC<SyntaxArena> by reference where possible 2018-08-24 08:39:54 -07:00
Alex Hoppen
66374a14ea [libSyntax] Make RawSyntax nodes hold a strong reference to their arena
This allows an elegant design in which we can still allocate RawSyntax
nodes using a bump allocator but are able to automatically free that
buffer once the last RawSyntax node within that buffer is freed.

This also resolves a memory leak of RawSyntax nodes that was caused by
ParserUnit not freeing its underlying ASTContext.
2018-08-24 08:39:54 -07:00
Alex Hoppen
13aeb5440b [libSyntax] Lazily compute a node's text length 2018-08-09 15:39:48 -07:00
Alex Hoppen
419ba044f1 [libSyntax] Record reused node IDs
This is cheaper than recording reused region offsets and the reused node
IDs will later be used to incrementally transfer the syntax to
SwiftSyntax.
2018-07-19 13:55:57 -07:00
Alex Hoppen
705f5b79a2 [libSyntax] Rename getAbsolutePosition-related methods for more clarity 2018-07-19 09:15:53 -07:00
Alex Hoppen
9d59cd286b [incrParse] Add a stable id to the syntax nodes
The id is meant to be stable across incremental parses
2018-07-13 16:56:03 -07:00
John McCall
3247232aa3 Remove some gratutious uses of GCC extensions from the Syntax library.
Naming the bit-field structs is a significant readability improvement
because it's very clear that you shouldn't touch e.g. Bits.Token
without having checked/asserted that you're in a token case.

The assertions are all in statement context (which was obvious
because the NDEBUG versions all included semicolons), so there's no
reason not to use the traditional `do { } while (false)` trick instead
of a statement-expression.

This also clears up some warnings in atypical build configurations.
2018-06-30 03:59:49 -04:00
Alex Hoppen
15b2bae80a [libSyntax] Improve syntax related dump functions 2018-05-22 09:07:55 -07:00
Alex Hoppen
2c02b1e1b4 [incrParse] Fix lexer offset issue when missing tokens get synthesized 2018-05-22 08:52:39 -07:00
Alex Hoppen
f8cd1ca749 [libSyntax] Compute the text length of every node on the fly 2018-05-22 08:52:32 -07:00
Xi Ge
7b4218c2f7 libSyntax: cache absolute positions on SyntaxData.
Aligning with what we did for SwiftSyntax, this patch uses caches for
absolute position calculation on the C++ side.
2018-04-30 15:09:00 -07:00
Rintaro Ishizaki
6c0af2a24f [Syntax] Introduce CodeBlockItem (#14458)
CodeBlockItem represents Decl, Stmt or Expr that optionally followed by
semi-colon.
SourceFile syntax holds a list of CodeBlockItem.
2018-02-08 10:31:01 +09:00
Rintaro Ishizaki
057254dbc1 [Syntax] Bump allocate and cache/reuse RawSyntax
Introduced SyntaxArena for managing memory and cache.

SyntaxArena holds BumpPtrAllocator as a allocation storage.
RawSyntax is now able to be constructed with normal heap allocation, or
by SyntaxArena. RawSyntax has ManualMemory flag which indicates it's managed by
SyntaxArena. If the flag is true, its Retain()/Release() is no-op thus it's
never destructed by IntrusiveRefCntPtr.
This speedups the memory allocation for RawSyntax.

Also, in Syntax parsing, "token" RawSyntax is reused if:
a) It's not string literal with >16 length; and
b) It doesn't contain random text trivia (e.g. comment).
This reduces the overall allocation cost.
2018-02-02 01:27:06 +09:00
Rintaro Ishizaki
fced748790 [Syntax] Represent missing optioanl nodes as nullptr (#14300)
Allocating RawSyntax/SyntaxData for missing optional node is a waste of
resource.
2018-01-31 19:24:00 +09:00
Rintaro Ishizaki
d8f6ff0019 [Syntax] Reserve capacity for vector when possible 2018-01-31 15:21:21 +09:00
Rintaro Ishizaki
0780c529c4 [Syntax] Unify RawSyntax and RawTokenSyntax using union and TrailingObjects
It better matches with SwiftSyntax model.

Using TrailingObjects reduces the number of heap allocation which
gains 18% performance improvement.
2018-01-18 14:49:46 +09:00
Xi Ge
031488bada libSyntax: several enhancements on source location bridging. (#13956)
libSyntax nodes don't maintain absolute source location on each
individual node. Instead, the absolute locations are calculated on
demand with a given root by accumulating the length of all the other
nodes before the target node. This bridging is important for issuing
diagnostics from libSyntax entities.

With the observation that our current implementation of the source
location calculation has multiple bugs, this patch re-implemented this
bridging by using the newly-added syntax visitor. Also, we moved the function
from RawSyntax to Syntax for better visibility.

To test this source location calculation, we added a new action in
swift-syntax-test. This action parses a given file as a
SourceFileSyntax, calculates the absolute location of the
EOF token in the SourceFileSyntax, and dump the buffer from the start
of the input file to the absolute location of the EOF. Finally, we compare
the dump with the original input to ensure they are identical.
2018-01-15 16:39:17 -08:00
Xi Ge
653de9f23f [test] libSyntax: add a flag to swift-syntax-test to print trivial node kinds.
These trivial node kinds include node collections like stmtlist and
unknown syntax like UnknownExpr.
2017-11-30 14:33:15 -08:00
Rintaro Ishizaki
9eddbd1bc7 [libSyntax] Don't print missing nodes 2017-11-29 09:58:05 +09:00
Xi Ge
a4a01f9121 libSyntax: parse ternary expression.
Along with starting to support ternary expressions, this commit also
slightly changes SyntaxParsingContext APIs as follows:

1. Previously, makeNode() only supports node creation by using the nodes
from the underlying syntax token array; this commit allows it to use the nodes from
the pending syntax list as well.

2. This commit strictly limits that the pending syntax list should never
contain token syntax node.

3. The node kind test shouldn't include unknown kinds. They are noisy.
2017-11-14 23:29:23 -08:00
Xi Ge
a448a7371f libSyntax: parse codeblock syntax node. (#12771)
This commit teaches parser to generate code block syntax node. As a support for this, 
SyntaxParsingContext can be created by a single syntax kind, indicating the whole context 
should be parsed into a node of that given syntax. Another change is to bridge created syntax 
node with the given context kind. For instance, if a statement context results into an expression 
node, the expression node will be bridged to a statement by wrapping it with a ExpressionStmt 
node.
2017-11-05 17:37:59 -08:00
Xi Ge
e4e486edea libSyntax: when printing syntax tree with kind, optionally give syntax kind a visual highlight. 2017-10-21 15:47:19 -07:00
Xi Ge
e0dfa6119f libSyntax: add a test to ensure the generated syntax kinds from parser are expected. 2017-10-21 14:12:59 -07:00
Harlan
a5098e6b69 Generate libSyntax API (#10926)
* Generate libSyntax API

This patch removes the hand-rolled libSyntax API and replaces it with an
API that's entirely automatically generated. This means the API is
guaranteed to be internally stylistically and functionally consistent.
2017-07-25 18:19:58 -07:00
Harlan
70089a7bcc [Syntax] Represent TokenSyntax as a Syntax node (#10606)
Previously, users of TokenSyntax would always deal with RC<TokenSyntax>
which is a subclass of RawSyntax. Instead, provide TokenSyntax as a
fully-realized Syntax node, that will always exist as a leaf in the
Syntax tree.

This hides the implementation detail of RawSyntax and SyntaxData
completely from clients of libSyntax, and paves the way for future
generation of Syntax nodes.
2017-06-27 11:08:10 -07:00
practicalswift
797c2d8118 [gardening] Fix end of namespace comments 2017-04-20 22:01:01 +02:00
practicalswift
431e5a1440 [gardening] Use consistent end of namespace comments 2017-04-20 13:47:10 +02:00
David Farler
303a3e5824 Start the Migrator library
The Swift 4 Migrator is invoked through either the driver and frontend
with the -update-code flag.

The basic pipeline in the frontend is:

- Perform some list of syntactic fixes (there are currently none).
- Perform N rounds of sema fix-its on the primary input file, currently
  set to 7 based on prior migrator seasons.  Right now, this is just set
  to take any fix-it suggested by the compiler.
- Emit a replacement map file, a JSON file describing replacements to a
  file that Xcode knows how to understand.

Currently, the Migrator maintains a history of migration states along
the way for debugging purposes.

- Add -emit-remap frontend option
  This will indicate the EmitRemap frontend action.
- Don't fork to a separte swift-update binary.
  This is going to be a mode of the compiler, invoked by the same flags.
- Add -disable-migrator-fixits option
  Useful for debugging, this skips the phase in the Migrator that
  automatically applies fix-its suggested by the compiler.
- Add -emit-migrated-file-path option
  This is used for testing/debugging scenarios. This takes the final
  migration state's output text and writes it to the file specified
  by this option.
- Add -dump-migration-states-dir

  This dumps all of the migration states encountered during a migration
  run for a file to the given directory. For example, the compiler
  fix-it migration pass dumps the input file, the output file, and the
  remap file between the two.

  State output has the following naming convention:
  ${Index}-${MigrationPassName}-${What}.${extension}, such as:
  1-FixitMigrationState-Input.swift

rdar://problem/30926261
2017-04-17 16:25:02 -07:00
Harlan
631c7d8064 [Syntax] Refactor Tuple Type Syntax (#8254)
* Refactor Tuple Type Syntax

This patch:

- Refactors TypeArgumentListSyntax and
  TypeArgumentListSyntaxData to use the SyntaxCollection and
  SyntaxCollectionData APIs.
- Refactors TupleTypeElementSyntax to own its trailing comma, and
  updates the tests accordingly.
- Provides an infrastructure for promoting types to use
  the SyntaxCollection APIs

* Addressed comments.

* Renamed makeBlankTypeArgumentList()

* Update makeTupleType

* Changed makeTupleType to take an element list.

* Updated comment.

* Improved API for creating TupleTypeElementListSyntax'es

* Added round-trip test

* Removed last TypeArgumentList holdovers.

* Fixed round-trip test invocation
2017-03-22 08:02:29 -04:00
practicalswift
246cfa6c16 [gardening] Use consistent headers 2017-02-24 09:37:37 +01:00
David Farler
2d4cb088f4 [Syntax] Remove dummy indent implementation
This isn't a robust implementation and is breaking the build. I'll
put it back once indentation is better specified.
2017-02-17 15:55:00 -08:00
David Farler
7ee42994c8 Start the Syntax library and optional full token lexing
Add an option to the lexer to go back and get a list of "full"
tokens, which include their leading and trailing trivia, which
we can index into from SourceLocs in the current AST.

This starts the Syntax sublibrary, which will support structured
editing APIs. Some skeleton support and basic implementations are
in place for types and generics in the grammar. Yes, it's slightly
redundant with what we have right now. lib/AST conflates syntax
and semantics in the same place(s); this is a first step in changing
that to separate the two concepts for clarity and also to get closer
to incremental parsing and type-checking. The goal is to eventually
extract all of the syntactic information from lib/AST and change that
to be more of a semantic/symbolic model.

Stub out a Semantics manager. This ought to eventually be used as a hub
for encapsulating lazily computed semantic information for syntax nodes.
For the time being, it can serve as a temporary place for mapping from
Syntax nodes to semantically full lib/AST nodes.

This is still in a molten state - don't get too close, wear appropriate
proximity suits, etc.
2017-02-17 12:57:04 -08:00