This patch allows Parser to generate a refined token stream to satisfy tooling's need. For syntax coloring, token stream from lexer is insufficient because (1) we have contextual keywords like get and set; (2) we may allow keywords to be used as argument labels and names; and (3) we need to split tokens like "==<". In this patch, these refinements are directly fulfilled through parsing without additional heuristics. The refined token vector is optionally saved in SourceFile instance.
Previously, users of TokenSyntax would always deal with RC<TokenSyntax>
which is a subclass of RawSyntax. Instead, provide TokenSyntax as a
fully-realized Syntax node, that will always exist as a leaf in the
Syntax tree.
This hides the implementation detail of RawSyntax and SyntaxData
completely from clients of libSyntax, and paves the way for future
generation of Syntax nodes.
This adds support for SE-0168, multi-line string literals.
Extend the lexer to recognize the new literals. Test cases added.
There are still areas for future diagnostic improvement, such as fixits and notes as to why a multi-line string literal will be malformed. Multi-line literals are explicitly forbidden inside of string interpolation, though this may be relaxed in the future.
The Swift 4 Migrator is invoked through either the driver and frontend
with the -update-code flag.
The basic pipeline in the frontend is:
- Perform some list of syntactic fixes (there are currently none).
- Perform N rounds of sema fix-its on the primary input file, currently
set to 7 based on prior migrator seasons. Right now, this is just set
to take any fix-it suggested by the compiler.
- Emit a replacement map file, a JSON file describing replacements to a
file that Xcode knows how to understand.
Currently, the Migrator maintains a history of migration states along
the way for debugging purposes.
- Add -emit-remap frontend option
This will indicate the EmitRemap frontend action.
- Don't fork to a separte swift-update binary.
This is going to be a mode of the compiler, invoked by the same flags.
- Add -disable-migrator-fixits option
Useful for debugging, this skips the phase in the Migrator that
automatically applies fix-its suggested by the compiler.
- Add -emit-migrated-file-path option
This is used for testing/debugging scenarios. This takes the final
migration state's output text and writes it to the file specified
by this option.
- Add -dump-migration-states-dir
This dumps all of the migration states encountered during a migration
run for a file to the given directory. For example, the compiler
fix-it migration pass dumps the input file, the output file, and the
remap file between the two.
State output has the following naming convention:
${Index}-${MigrationPassName}-${What}.${extension}, such as:
1-FixitMigrationState-Input.swift
rdar://problem/30926261
Add an option to the lexer to go back and get a list of "full"
tokens, which include their leading and trailing trivia, which
we can index into from SourceLocs in the current AST.
This starts the Syntax sublibrary, which will support structured
editing APIs. Some skeleton support and basic implementations are
in place for types and generics in the grammar. Yes, it's slightly
redundant with what we have right now. lib/AST conflates syntax
and semantics in the same place(s); this is a first step in changing
that to separate the two concepts for clarity and also to get closer
to incremental parsing and type-checking. The goal is to eventually
extract all of the syntactic information from lib/AST and change that
to be more of a semantic/symbolic model.
Stub out a Semantics manager. This ought to eventually be used as a hub
for encapsulating lazily computed semantic information for syntax nodes.
For the time being, it can serve as a temporary place for mapping from
Syntax nodes to semantically full lib/AST nodes.
This is still in a molten state - don't get too close, wear appropriate
proximity suits, etc.
Store leading a trailing "trivia" around a token, such as whitespace,
comments, doc comments, and escaping backticks. These are syntactically
important for preserving formatting when printing ASTs but don't
semantically affect the program.
Tokens take all trailing trivia up to, but not including, the next
newline. This is important to maintain checks that statements without
semicolon separators start on a new line, among other things.
Trivia are now data attached to the ends of tokens, not tokens
themselves.
Create a new Syntax sublibrary for upcoming immutable, persistent,
thread-safe ASTs, which will contain only the syntactic information
about source structure, as well as for generating new source code, and
structural editing. Proactively move swift::Token into there.
Since this patch is getting a bit large, a token fuzzer which checks
for round-trip equivlence with the workflow:
fuzzer => token stream => file1
=> Lexer => token stream => file 2 => diff(file1, file2)
Will arrive in a subsequent commit.
This patch does not change the grammar.
Like cursor-info, range info (""source.request.cursorinfo"") answers some
questions clients have for a code snippet under selection, for instance, the type of a selected
expression. This commit implements this new quest kind and provides two
simple information about the selected code: (1) the kind of the
snippet, currently limited to single-statement and expression; and (2)
the type of the selected expression. Gradually, we will enrich the
response to provide more insight into the selected code snippet.
Lexing for conflict markers is inspired by the way clang handles them with a few
updates of our own. Clang's lexer currently searches for the start of the
conflict marker, attempts to find the divider points, lexes that, then finds the
end. We, unfortunately, cannot be so forgiving because of operator overloads.
Instead, we search for the start and end markers and ignore all text in between.
Even if what is found is not conflict markers, it certainly is not valid Swift either.
Introduce a new attribute, swift3_migration, that lets us describe the
transformation required to map a Swift 2.x API into its Swift 3
equivalent. The only transformation understood now is "renamed" (to
some other declaration name), but there's a message field where we can
record information about other changes. The attribute can grow
somewhat (e.g., to represent parameter reordering) as we need it.
Right now, we do nothing but store and validate this attribute.
This is a frequently reported and surprising issue where lack of whitespace leads to
rejecting common code like "X*-4". Fix this by diagnosing it specifically as a lack
of whitespace problem, including a fixit to insert the missing whitespace (to transform
it into "X * -4". This even handles the cases where there are multiple valid (single)
splits possible by emitting a series of notes.
Emit a fix-it replacing them with double-quote string literals.
<rdar://problem/21950709> QoI: Parse single-quoted literals like double-quoted literals
Swift SVN r31973
Rename 'assignment' attribute of infix operators to 'mutating'. Add
'has_assignment' attribute, which results in an implicit declaration of
the assignment version of the same operator. Parse "func =foo"
declaration and "foo.=bar" expression. Validate some basic properties of
in-place methods.
Not yet implemented: automatic generation of wrapper for =foo() if foo()
is implemented, or vice versa; likewise for operators.
Swift SVN r26508
Also, remove calls to isSwiftReservedName in
ClangImporter::Implementation::importName(), since 'true' and 'false'
are now keywords and rdar://problem/13187570 is no longer a problem.
rdar://problem/18797808
Swift SVN r23244
When a subclass is missing a required initializer, produce an error
within the subclass that mentions the required initializer along with
a Fix-It that provides an initializer stub, e.g.,
required init(coder aDecoder: NSCoder!) {
fatalError("init(coder:) has not been implemented")
}
We take care to insert this stub in the main class, after all of the
initializers (if there are any) or near the beginning of the class (if
there aren't any initializers), and try to match the existing
indentation. If this works out, we should handle unsatisfied protocol
requirements the same way. <rdar://problem/17923210>
Swift SVN r21055
backtracking, because we didn't restore the lexer state to before the comment, preventing
it from being attached to the token that followed it after backtracking was restored.
This is super obscure right now, but causes two tests to fail with my forthcoming patch,
lets nip this in the bud.
Swift SVN r19553
This consolidates the \x, \u, and \U escape sequences into one \u{abc} escape sequence.
For now we still parse and cleanly reject the old forms with a nice error message, this
will eventually be removed in a later beta (tracked by rdar://17527814)
Swift SVN r19435
* replaced yet another variant of isWhitespace with the version from
clang/Basic/CharInfo.h. The major difference is that our variant used to
consider '\0' whitespace.
* made sure that we don't construct StringRefs that point after the end of the
buffer. If the buffer ends with "#", then MemoryBuffer will only guarantee
that there is one additional NUL character. memcmp(), OTOH, is allowed to
access the complete span of the provided memory. I colud not actually get
this to crash on OSX 10.10, but I do remember similar crashes we fixed in Clang.
* added checks to reject extra tokens at the end of the build configuration
directive -- see tests, that code used to compile without diagnostics. The
lexer tried to do this, but in a self-referential way -- by checking the
NextToken variable (which is actually the previous token, when viewed from
the point of lexImpl()). The checks I added are a little too strict, they
reject comments at the end of the directive, but at least we don't accept
strange constructs. Allowing comments would not be hard, just requires
factoring out lexer's routines to skip comments so that they accept a pointer
to the buffer and return the comment end point. Filed
<rdar://problem/16301704> Allow comments at the end of bulid configuration directives
for that.
Found by inspection... I was grepping the codebase for 'isWhitespace'.
Swift SVN r14959
Lex a backtick-enclosed `[:identifier_start:][:identifier_cont:]+` as an identifier, even if it's a Swift keyword. For now, require that the escaped name still be a valid identifier, keyword collisions notwithstanding. (We could in theory allow an arbitrary string, but we'd have to invent a mangling for non-identifier characters and do other tooling which doesn't seem productive.)
Swift SVN r14671