588 Commits

Author SHA1 Message Date
Rintaro Ishizaki
a2d3ff4deb [SE-0182][Lexer] Diagnose escaped newline at the end of the last line in multiline string 2017-07-26 21:18:58 +09:00
John Holdsworth
c0fcc1afba [Parse] An implementation for SE-0182 2017-07-21 18:07:06 +01:00
David Rönnqvist
9ed9a860a0 Fix unicode handling when checking invalid characters
Use `advanceIfValidContinuationOfIdentifier` instead of `isValidIdentifierContinuationCodePoint` to handle unicode.
2017-07-08 19:27:42 +02:00
David Rönnqvist
57731ebc09 [QoI] [Parse] Improve error message when parsing floating point exponent
Update error messages to mention the invalid character.
Improve the diagnostic of floating point exponents.

Add tests for error messages when parsing floating point exponents.
Update existing tests for new error messages.
2017-07-08 13:32:34 +02:00
David Rönnqvist
a615d9ede3 [QoI] Improve error message when parsing integer literal
Rephrased error message to indicate which character is unexpected.
Provide error message variations when parsing binary, octal, decimal (default), and hexadecimal integer literals.
Look for unexpected digits in binary and octal integer literals.
Look for unexpected letters in hex integer literals.

Resolves: SR-5236 rdar://problem/32858684
2017-07-06 19:59:09 +02:00
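The per-radix check described in a615d9ede3 (unexpected digits in binary and octal literals, unexpected letters in hex literals) might be sketched as below. This is an illustration only, not the code from the commit: `isDigitInRadix` and `findUnexpectedDigit` are hypothetical names, and the input is assumed to be the literal's digits with any `0b`/`0o`/`0x` prefix already removed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: is `c` a valid digit for the given radix (2, 8, 10, 16)?
bool isDigitInRadix(char c, unsigned radix) {
  unsigned value;
  if (c >= '0' && c <= '9')
    value = static_cast<unsigned>(c - '0');
  else if (c >= 'a' && c <= 'f')
    value = static_cast<unsigned>(c - 'a') + 10;
  else if (c >= 'A' && c <= 'F')
    value = static_cast<unsigned>(c - 'A') + 10;
  else
    return false;
  return value < radix;
}

// Hypothetical helper: index of the first character that is not a digit in
// `radix`, or std::string::npos if every character is acceptable. A
// diagnostic could then name the offending character, digits[index].
std::size_t findUnexpectedDigit(const std::string &digits, unsigned radix) {
  for (std::size_t i = 0; i < digits.size(); ++i) {
    if (digits[i] == '_')
      continue; // Swift allows '_' as a digit separator.
    if (!isDigitInRadix(digits[i], radix))
      return i;
  }
  return std::string::npos;
}
```

Returning the position rather than a plain boolean is what lets the error message point at the specific unexpected character.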
Harlan
70089a7bcc [Syntax] Represent TokenSyntax as a Syntax node (#10606)
Previously, users of TokenSyntax would always deal with RC<TokenSyntax>
which is a subclass of RawSyntax. Instead, provide TokenSyntax as a
fully-realized Syntax node, that will always exist as a leaf in the
Syntax tree.

This hides the implementation detail of RawSyntax and SyntaxData
completely from clients of libSyntax, and paves the way for future
generation of Syntax nodes.
2017-06-27 11:08:10 -07:00
Rintaro Ishizaki
c8bd1aa401 [Parse] Fix skipping string interpolation in Lexer
Maintain inner most string literal mode to determine whether we allow
newline character or not.

* Disallow newline after multiline string in string interpolation. (SR-5171)
* Allow unbalanced `"` in multiline string in string interpolation.
2017-06-16 02:22:49 +09:00
Robert Widmann
3e2bbfe904 [Gardening] Cleanup TokenKinds.def (#10034)
* [Parse] Refactored internal structure of Tokens.def and documented usage.

Added a level of structure to the macro definitions to allow Swift
keywords to be cleanly accessed separately from SIL and Swift keywords
together. Documented structure and usage.

* [Parse] Made use of new guarantees and abstractions in Tokens.def

Used guarantees about undefining macros after import and new
SWIFT_KEYWORD abstraction to simplify usage of the Tokens.def
imports.

* Gardening
2017-06-01 15:08:48 -07:00
practicalswift
c9f576662a [gardening] Fix inconsistent spacing 2017-05-09 10:36:04 +02:00
Brent Royal-Gordon
1d70565d55 Better diagnostics for multi-line string literals (#9148)
[Parse] Improve multi-line string literal errors

Adds descriptive errors and fix-its for multi-line string literal indentation and newline errors.
2017-05-08 13:24:31 -07:00
Robin Kunde
6d63d90e0e SR-331: Diagnostic notes and fixits for unicode confusables (#9070) 2017-05-06 17:40:35 -04:00
Huon Wilson
07c5ab8fb2 Implement \ syntax for Swift key paths.
This introduces a few unfortunate things because the syntax is awkward.
In particular, the period and following token in \.[a], \.? and \.! are
token sequences that don't appear anywhere else in Swift, and so need
special handling. This is somewhat compounded by \foo.bar.baz possibly
being \(foo).bar.baz or \(foo.bar).baz (parens around the type), and,
furthermore, needing to distinguish \Foo?.bar from \Foo.?bar.

rdar://problem/31724243
2017-05-01 16:06:15 -07:00
John Holdsworth
fea17f22aa [Parse] Add support for multiline strings inside interpolations
Adds support for multiline string literals to appear inside of string interpolations. Tests added.
2017-04-27 19:02:23 -07:00
John Holdsworth
981e706fd9 An implementation for 0168-multi-line-string-literals.md (#8813)
This adds support for SE-0168, multi-line string literals.

Extend the lexer to recognize the new literals. Test cases added.

There are still areas for future diagnostic improvement, such as fixits and notes as to why a multi-line string literal will be malformed. Multi-line literals are explicitly forbidden inside of string interpolation, though this may be relaxed in the future.
2017-04-25 18:13:03 -07:00
Joe Groff
595e0e4ede Merge branch 'master' into keypaths 2017-04-19 18:38:24 -07:00
David Farler
303a3e5824 Start the Migrator library
The Swift 4 Migrator is invoked through either the driver or the frontend
with the -update-code flag.

The basic pipeline in the frontend is:

- Perform some list of syntactic fixes (there are currently none).
- Perform N rounds of sema fix-its on the primary input file, currently
  set to 7 based on prior migrator seasons.  Right now, this is just set
  to take any fix-it suggested by the compiler.
- Emit a replacement map file, a JSON file describing replacements to a
  file that Xcode knows how to understand.

Currently, the Migrator maintains a history of migration states along
the way for debugging purposes.

- Add -emit-remap frontend option
  This will indicate the EmitRemap frontend action.
- Don't fork to a separate swift-update binary.
  This is going to be a mode of the compiler, invoked by the same flags.
- Add -disable-migrator-fixits option
  Useful for debugging, this skips the phase in the Migrator that
  automatically applies fix-its suggested by the compiler.
- Add -emit-migrated-file-path option
  This is used for testing/debugging scenarios. This takes the final
  migration state's output text and writes it to the file specified
  by this option.
- Add -dump-migration-states-dir

  This dumps all of the migration states encountered during a migration
  run for a file to the given directory. For example, the compiler
  fix-it migration pass dumps the input file, the output file, and the
  remap file between the two.

  State output has the following naming convention:
  ${Index}-${MigrationPassName}-${What}.${extension}, such as:
  1-FixitMigrationState-Input.swift

rdar://problem/30926261
2017-04-17 16:25:02 -07:00
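The state-output naming convention from 303a3e5824, `${Index}-${MigrationPassName}-${What}.${extension}`, can be illustrated with a small formatter. The function name and signature here are hypothetical and are not taken from the Migrator sources.

```cpp
#include <string>

// Hypothetical helper: builds a migration state file name following the
// ${Index}-${MigrationPassName}-${What}.${extension} convention.
std::string migrationStateFileName(unsigned index, const std::string &passName,
                                   const std::string &what,
                                   const std::string &extension) {
  return std::to_string(index) + "-" + passName + "-" + what + "." + extension;
}
```

Called with `(1, "FixitMigrationState", "Input", "swift")`, this yields the example name given in the commit message, `1-FixitMigrationState-Input.swift`.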
Nathan Hawes
50aba8539b [cursor-info] Fix invalid assertion firing for symbols referenced within nested interpolated strings 2017-04-10 13:22:09 -07:00
Joe Groff
e3046d6f75 Parsing for native keypaths.
Use `#keyPath2` as a stand-in for the final syntax.
2017-04-04 11:31:15 -07:00
Hugh Bellamy
b564a917be Port swiftSyntax to Windows 2017-02-21 08:24:56 +07:00
David Farler
7ee42994c8 Start the Syntax library and optional full token lexing
Add an option to the lexer to go back and get a list of "full"
tokens, which include their leading and trailing trivia, which
we can index into from SourceLocs in the current AST.

This starts the Syntax sublibrary, which will support structured
editing APIs. Some skeleton support and basic implementations are
in place for types and generics in the grammar. Yes, it's slightly
redundant with what we have right now. lib/AST conflates syntax
and semantics in the same place(s); this is a first step in changing
that to separate the two concepts for clarity and also to get closer
to incremental parsing and type-checking. The goal is to eventually
extract all of the syntactic information from lib/AST and change that
to be more of a semantic/symbolic model.

Stub out a Semantics manager. This ought to eventually be used as a hub
for encapsulating lazily computed semantic information for syntax nodes.
For the time being, it can serve as a temporary place for mapping from
Syntax nodes to semantically full lib/AST nodes.

This is still in a molten state - don't get too close, wear appropriate
proximity suits, etc.
2017-02-17 12:57:04 -08:00
Hugh Bellamy
f001b7562b Use relatively new LLVM_FALLTHROUGH instead of our own SWIFT_FALLTHROUGH 2017-02-12 10:47:03 +07:00
David Farler
c427c46178 Fix overflow when searching for <#
ede6bf7a80 increments the buffer pointer too early
when searching for the <# prefix to stop lexing an operator,
so the source buffer is accessed 1 byte off the end.

Thanks ASan and thanks @gparker42!

rdar://problem/28457876
2017-01-27 06:14:20 -08:00
David Farler
ede6bf7a80 Lexer: Don't split an operator '.<' from '.<#placeholder#>'
'.<#placeholder#>' is actually an unresolved reference where the name is
an editor placeholder, not the operator '.<' followed by #placeholder#>.

rdar://problem/28457876
2017-01-26 09:57:02 -08:00
Rintaro Ishizaki
0affdb057d [Lexer] Disable all SIL_KEYWORDs in non-SIL mode
Instead of long '||' conditions, use SIL_KEYWORD in Tokens.def.
2017-01-16 23:33:54 +09:00
practicalswift
6d1ae2a39c [gardening] 2016 → 2017 2017-01-06 16:41:22 +01:00
Rintaro Ishizaki
509db744f1 [Lexer] Disallow '$' as a start of identifier, special handle '$' (#6004)
Allowing 'let `$0` = 1', introduced in 6accc598 was not intentional.
2016-12-21 11:31:58 +09:00
Michael Gottesman
59c6a64f5a [gardening] 0 => nullptr. Fixed with clang-tidy. 2016-12-06 23:14:13 -08:00
David Farler
330c2d96e6 Make the lexer UTF-8 RFC 3629 correct re: prefix octets
RFC 2279 states that, in UTF-8:
"The octet values FE and FF never appear."

RFC 3629 states that, in UTF-8:
"The octet values C0, C1, F5 to FF never appear."

Generalize the check to advance past invalid starting bytes for
a UTF-8 sequence to fix a crash in the lexer.
2016-12-05 17:21:17 -08:00
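The RFC 3629 rule quoted in 330c2d96e6 can be expressed as a lead-byte predicate. The sketch below is an illustration under that rule, not the lexer's actual code; both function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: may `byte` begin a UTF-8 encoded code point under
// RFC 3629? C0, C1, and F5 to FF never appear, and 80 to BF are continuation
// bytes, which cannot start a sequence.
bool isValidUTF8LeadByte(std::uint8_t byte) {
  if (byte < 0x80)
    return true; // ASCII
  if (byte < 0xC0)
    return false; // 0x80-0xBF: continuation bytes
  if (byte == 0xC0 || byte == 0xC1)
    return false; // would only encode overlong forms
  if (byte >= 0xF5)
    return false; // would encode values above U+10FFFF
  return true;    // 0xC2-0xF4
}

// Hypothetical helper: advance past bytes that cannot start a sequence, so a
// lexer can resynchronize instead of reading a malformed one.
const char *skipInvalidLeadBytes(const char *cur, const char *end) {
  while (cur != end && !isValidUTF8LeadByte(static_cast<std::uint8_t>(*cur)))
    ++cur;
  return cur;
}
```

Checking the whole invalid range in one predicate, rather than special-casing FE and FF as RFC 2279 would suggest, is the generalization the commit message describes.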
practicalswift
797b80765f [gardening] Use the correct base URL (https://swift.org) in references to the Swift website
Remove all references to the old non-TLS enabled base URL (http://swift.org)
2016-11-20 17:36:03 +01:00
David Farler
f450f0ccdf Revert "Preserve whitespace and comments during lexing as Trivia"
This reverts commit d6e2b58382.
2016-11-18 13:23:31 -08:00
David Farler
d6e2b58382 Preserve whitespace and comments during lexing as Trivia
Store leading and trailing "trivia" around a token, such as whitespace,
comments, doc comments, and escaping backticks. These are syntactically
important for preserving formatting when printing ASTs but don't
semantically affect the program.

Tokens take all trailing trivia up to, but not including, the next
newline. This is important to maintain checks that statements without
semicolon separators start on a new line, among other things.

Trivia are now data attached to the ends of tokens, not tokens
themselves.

Create a new Syntax sublibrary for upcoming immutable, persistent,
thread-safe ASTs, which will contain only the syntactic information
about source structure, as well as for generating new source code, and
structural editing. Proactively move swift::Token into there.

Since this patch is getting a bit large, a token fuzzer which checks
for round-trip equivalence with the workflow:

fuzzer => token stream => file1
  => Lexer => token stream => file 2 => diff(file1, file2)

Will arrive in a subsequent commit.

This patch does not change the grammar.
2016-11-15 16:11:57 -08:00
Xi Ge
ac3411234d [SourceKit] The initial implementation of range-info request.
Like cursor-info, range info ("source.request.cursorinfo") answers some
questions clients have for a code snippet under selection, for instance, the type of a selected
expression. This commit implements this new request kind and provides two
simple pieces of information about the selected code: (1) the kind of the
snippet, currently limited to single-statement and expression; and (2)
the type of the selected expression. Gradually, we will enrich the
response to provide more insight into the selected code snippet.
2016-11-03 16:07:04 -07:00
Robert Widmann
6accc5989e Disable the ability to use '$' as an identifier harder
When in Swift 3 Compatibility Mode we now accept a standalone
'$' as an identifier.  In all other cases this is now disallowed
and must be surrounded by backticks.
2016-10-27 16:51:18 -04:00
Rintaro Ishizaki
1df518c978 [Lexer] Simplify comment detection code in lexOperatorIdentifier 2016-10-24 11:17:12 +09:00
Rintaro Ishizaki
9d1a3fc62c [Lexer] Trim comments off operator-identifier before tokenize
Previously, builtin operators followed by comment block were tokenized as
normal operators (e.g. tok::oper_binary_spaced) instead of dedicated
tokens (e.g. tok::equal).
That used to cause strange parse errors:

  test.swift:1:3: error: use of unresolved operator '='
  _ =/* */2
    ^
2016-10-22 23:32:13 +09:00
Robert Widmann
389f779a27 Fixup tests 2016-07-31 19:28:45 -07:00
Wheerd
f008d03c73 Added a simple test for dollar identifier lexing. 2016-07-31 19:28:45 -07:00
Wheerd
110ec21802 Fixed a single '$' being accepted as a valid identifier by the lexer.
Now it becomes an unknown token instead.
2016-07-31 19:28:45 -07:00
Chris Lattner
45118037cc When we lex an invalid leftbound dot, instead of emitting an error and swallowing it
silently, have the lexer return it as an unknown token.  Enhance the expr parser to
detect these things and squash any expression in progress into an ErrorExpr.  This
allows us to silence really bad downstream errors.  For example, on:

struct S { func f() {} }
func f() {
  _ = S.
}

we formerly produced:

 x.swift:5:8: error: expected member name following '.'
 x.swift:5:7: error: expected member name or constructor call after type name
 x.swift:5:7: note: add arguments after the type to construct a value of the type
 x.swift:5:7: note: use '.self' to reference the type object

we now emit just the first one.  This fixes:
<rdar://problem/22290244> QoI: "UIColor." gives two issues, should only give one
2016-07-03 16:04:35 -07:00
Rintaro Ishizaki
a5eed3828a [Lexer][SR-1724] Handle hex letters after '.' on hex number literal
The following case used to emit an error:

  extension Int {
    var asUiColor: UIColor { ... }
  }

  0xfff.asUiColor
2016-06-22 13:07:41 +09:00
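The ambiguity in a5eed3828a arises because a member name such as `asUiColor` can begin with hex letters. Since a Swift hexadecimal floating-point literal requires a `p` exponent, one way to disambiguate is to look ahead for it. The sketch below assumes that approach and may differ from what the commit actually does; `dotStartsHexFraction` is a hypothetical name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: `dotPos` is the index of a '.' that follows the digits
// of a hex literal. Returns true only if the '.' begins a hex fraction, i.e.
// it is followed by hex digits, then 'p' or 'P', an optional sign, and a
// decimal digit. Otherwise the '.' is treated as member access.
bool dotStartsHexFraction(const std::string &src, std::size_t dotPos) {
  std::size_t i = dotPos + 1;
  const std::size_t firstDigit = i;
  while (i < src.size() && std::isxdigit(static_cast<unsigned char>(src[i])))
    ++i;
  if (i == firstDigit)
    return false; // no hex digits after the dot
  if (i >= src.size() || (src[i] != 'p' && src[i] != 'P'))
    return false; // no exponent, so not a hex float
  ++i;
  if (i < src.size() && (src[i] == '+' || src[i] == '-'))
    ++i;
  return i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]));
}
```

With this check, `0xfff.asUiColor` is read as a member access on the integer literal `0xfff`, while `0x1.8p3` is still read as a single floating-point literal.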
Robert Widmann
95b28cba98 [SR-1545] Implement lexing of source conflict markers. (#2924)
Lexing for conflict markers is inspired by the way clang handles them with a few
updates of our own.  Clang's lexer currently searches for the start of the
conflict marker, attempts to find the divider points, lexes that, then finds the
end.  We, unfortunately, cannot be so forgiving because of operator overloads.
Instead, we search for the start and end markers and ignore all text in between.
Even if what is found is not conflict markers, it certainly is not valid Swift either.
2016-06-09 17:15:12 -07:00
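The strategy described in 95b28cba98, finding the start and end markers and ignoring everything between them, can be sketched as follows. The marker strings are assumed to be the usual `<<<<<<< ` and `>>>>>>> ` at the start of a line, and `skipConflictRegion` is a hypothetical name; the real lexer may match more marker forms.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: if a conflict region starts at `pos` (which must be at
// the start of a line), returns the offset just past the line holding the end
// marker. Returns `pos` unchanged if no complete region is found.
std::size_t skipConflictRegion(const std::string &src, std::size_t pos) {
  const std::string startMarker = "<<<<<<< ";
  const std::string endMarker = "\n>>>>>>> ";

  const bool atLineStart = pos == 0 || src[pos - 1] == '\n';
  if (!atLineStart || src.compare(pos, startMarker.size(), startMarker) != 0)
    return pos;

  // The divider line ("=======") is deliberately not inspected: everything
  // up to the end marker is skipped.
  const std::size_t end = src.find(endMarker, pos);
  if (end == std::string::npos)
    return pos;

  const std::size_t lineEnd = src.find('\n', end + 1);
  return lineEnd == std::string::npos ? src.size() : lineEnd + 1;
}
```

Skipping the divider avoids having to decide whether a run of `=` characters is a marker or an operator, which is the overloading concern the commit message raises.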
Ted Kremenek
b8bbed8c13 [WIP] Implement SE-0039 (Modernizing Playground Literals) (#2215)
* Implement the majority of parsing support for SE-0039.

* Parse old object literal names using new syntax and provide FixIt.

For example, parse "#Image(imageLiteral:...)" and provide a FixIt to
change it to "#imageLiteral(resourceName:...)".  Now we see something like:

test.swift:4:9: error: '#Image' has been renamed to '#imageLiteral
var y = #Image(imageLiteral: "image.jpg")
        ^~~~~~ ~~~~~~~~~~~~
        #imageLiteral resourceName

Handling the old syntax, and providing a FixIt for that, will be handled in a separate
commit.

Needs tests.  Will be provided in later commit once full parsing support is done.

* Add back pieces of syntax map for object literals.

* Add parsing support for old object literal syntax.

... and provide fixits to new syntax.

Full tests to come in later commit.

* Improve parsing of invalid object literals with old syntax.

* Do not include bracket in code completion results.

* Remove defunct code in SyntaxModel.

* Add tests for migration fixits.

* Add literals to code completion overload tests.

@akyrtzi told me this should be fine.

* Clean up response tests not to include full paths.

* Further adjust offsets.

* Mark initializer for _ColorLiteralConvertible in UIKit as @nonobjc.

* Put attribute in the correct place.
2016-04-25 07:19:26 -07:00
practicalswift
66183cdbf7 [gardening] Fix unjustified spacing 2016-04-07 10:10:24 +02:00
Jesse Rusak
51ded13bce [Lexer] Updates to operator-whitespace handling from code review 2016-04-02 17:38:42 -04:00
Jesse Rusak
83b4c47222 [Lexer] Treat comments as whitespace for operator arity rules.
Previously, comments were treated as non-whitespace. Operators
also checked for right-boundedness before detecting comments
which start in the middle of an operator.

This resolves both of these issues so that comments are
consistently treated as whitespace when determining whether
operators are left- or right-bound.

Fixes SR-186 and SR-960 (SE-0037)
2016-03-18 11:51:27 -04:00
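The rule from 83b4c47222, that comments count as whitespace when deciding whether an operator is left- or right-bound, can be sketched as below. This is a deliberate simplification with hypothetical function names: it ignores the other characters (such as brackets and punctuation) that the full rules also take into account.

```cpp
#include <cstddef>
#include <string>

static bool isSpace(char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Hypothetical helper: `opStart` is the index of the operator's first
// character. The operator is left-bound if it is directly preceded by
// something other than whitespace or the end of a block comment.
bool isLeftBound(const std::string &src, std::size_t opStart) {
  if (opStart == 0 || isSpace(src[opStart - 1]))
    return false;
  // A block comment ending right before the operator counts as whitespace.
  if (opStart >= 2 && src[opStart - 2] == '*' && src[opStart - 1] == '/')
    return false;
  return true;
}

// Hypothetical helper: `opEnd` is the index one past the operator's last
// character. The operator is right-bound if it is directly followed by
// something other than whitespace or the start of a comment.
bool isRightBound(const std::string &src, std::size_t opEnd) {
  if (opEnd >= src.size() || isSpace(src[opEnd]))
    return false;
  // A comment starting right after the operator counts as whitespace.
  if (src[opEnd] == '/' && opEnd + 1 < src.size() &&
      (src[opEnd + 1] == '/' || src[opEnd + 1] == '*'))
    return false;
  return true;
}
```

Under this sketch the `+` in `a/* */+/* */b` is neither left- nor right-bound, so it is classified the same way as the `+` in `a + b`.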
Toni Suter
4f5a94e1a9 [Lexer] Remove redundant checks in lexNumber()
The if statements check whether a number literal consists
of just the 0b/0o/0x prefix which would be invalid. However, in all three cases there's a preceding check
that already addresses this issue.
2016-03-16 22:28:39 +01:00
Chris Lattner
1868856e26 update regex in comment, thanks to @schnippi for pointing this out. 2016-03-13 15:14:31 -07:00
Chris Lattner
4992474168 Add support for #sourceLocation in its ratified forms. Switch gyb to produce
the new form.  This keeps accepting #setline for now, but we should rip it out
at some point.
2016-03-11 22:21:42 -08:00
Slava Pestov
bbbe307980 SIL: Introduce SILDefaultWitnessTable and start plumbing
This will be used to help IRGen record protocol requirements
with resilient default implementations in protocol metadata.

To enable testing before all the Sema support is in place, this
patch adds SIL parser, printer and verifier support for default
witness tables.

For now, SILGen emits empty default witness tables for protocol
declarations in resilient modules, and IRGen ignores them when
emitting protocol metadata.
2016-02-05 20:57:11 -08:00
Chris Lattner
d522cd4270 Centralize the parsing logic for #identifiers and make it more similar to
the identifier parsing logic.
2016-02-03 22:37:28 -08:00