589 Commits

Author SHA1 Message Date
Hamish Knight
611fd33f58 Update regex literal delimiters
Update the lexing code for the replacement of the
`'/.../'` and `'|...|'` delimiters with `#/.../#`
and `#|...|#` respectively, in addition to
allowing the `re'...'` delimiter.
2022-02-24 17:37:16 -08:00
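The delimiter change described in this commit can be pictured with a small sketch. It is not the lexer's actual code: `stripRegexDelimiters` is a hypothetical helper, and the only facts taken from the commit message are the three delimiter pairs `#/.../#`, `#|...|#` and `re'...'`.

```cpp
#include <string>

// Hypothetical helper, not part of the Swift lexer: if `text` is wrapped in one
// of the delimiter pairs named in the commit message, store the inner pattern
// in `contents` and return true. Otherwise return false.
bool stripRegexDelimiters(const std::string &text, std::string &contents) {
  struct Delim { const char *open; const char *close; };
  static const Delim delims[] = {{"#/", "/#"}, {"#|", "|#"}, {"re'", "'"}};

  for (const Delim &d : delims) {
    const std::string open = d.open;
    const std::string close = d.close;
    // The text must be long enough to hold both delimiters.
    if (text.size() < open.size() + close.size())
      continue;
    // Check the opening delimiter at the front.
    if (text.compare(0, open.size(), open) != 0)
      continue;
    // Check the closing delimiter at the back.
    if (text.compare(text.size() - close.size(), close.size(), close) != 0)
      continue;
    contents = text.substr(open.size(),
                           text.size() - open.size() - close.size());
    return true;
  }
  return false;
}
```

Under these assumptions the old `'/.../'` form is rejected, while `#/a+b/#` yields the contents `a+b`.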
swift-ci
d423084f69 Merge remote-tracking branch 'origin/main' into rebranch 2021-12-19 16:16:40 -08:00
Michael Ilseman
7bff9da67d Revert "Revert "Merge pull request #40595 from hamishknight/straw-bales"" 2021-12-19 10:08:48 -07:00
swift-ci
82424d1d5a Merge remote-tracking branch 'origin/main' into rebranch 2021-12-18 21:13:14 -08:00
Arnold Schwaighofer
9511994e52 Revert "Merge pull request #40595 from hamishknight/straw-bales"
This reverts commit a67a0436f7, reversing
changes made to 9965df76d0.

This commit or the earlier commit this commit is based on (#40531) broke the
incremental bot.
2021-12-18 11:02:37 -08:00
swift-ci
f40e666b81 Merge remote-tracking branch 'origin/main' into rebranch 2021-12-17 18:13:25 -08:00
Hamish Knight
128f5d4bc6 Update regex literal lexing and emission
Update the lexing implementation to defer to the
regex library, which will pass back the pointer
from which to resume lexing, and update the emission to
call the new `Regex(_regexString:version:)`
overload, which will accept the regex string with
delimiters.

Because this uses the library's lexing
implementation, the delimiters are now `'/.../'`
and `'|...|'` instead of plain `'...'`.
2021-12-17 18:05:31 +00:00
swift-ci
fa68ec007b Merge remote-tracking branch 'origin/main' into rebranch 2021-12-07 02:33:57 -08:00
Hamish Knight
37f16520e6 Prototype regex literal AST and emission
With `-enable-experimental-string-processing`,
start lexing `'` delimiters as regex literals (this
is just a placeholder delimiter for now). The
contents are passed to the libswift library,
which can return an error string to be emitted,
or null for success.

The libswift side isn't yet hooked up to the Swift
regex parser, so for now just emit a dummy
diagnostic for regexes starting with quantifiers.

If successful, build an AST node which will be
emitted as an implicit call to an
`init(_regexString:)` initializer of an in-scope
`Regex` decl (which will eventually be a known
stdlib decl).
2021-12-06 21:16:14 +00:00
swift-ci
908cd43b0d Merge remote-tracking branch 'origin/main' into rebranch 2021-11-19 12:13:27 -08:00
Richard Wei
65bffd7ad7 Add _MatchingEngine and _StringProcessing modules.
These modules are part of the experimental declarative string processing feature. If accepted to the Standard Library, _StringProcessing will be available via implicit import just like _Concurrency, though _MatchingEngine will still be hidden as an implementation detail.

`_MatchingEngine` will contain the general-purpose pattern matching engine ISA, bytecode, and executor. `_StringProcessing` will contain regular expression and pattern matching APIs whose implementation depends on the matching engine.

Also consolidates frontend flag `-enable-experimental-regex` as `-enable-experimental-string-processing`.

Resolves rdar://85478647.
2021-11-19 09:27:33 -08:00
Ben Barham
11f28196bc Merge pull request #40168 from bnbarham/rebranch-failures
[rebranch] Fix compilation failures
2021-11-17 08:50:11 +10:00
Michael Ilseman
2740e2707c Experimental Regex Strawperson (use Swift in the parser) (#40117)
[regex] Use Swift in the parser

Add in a strawperson use of Swift by the parser, for
future regex support.
2021-11-14 07:11:47 -07:00
Ben Barham
e139a2f2ea [rebranch] Rename various functions to match new names in LLVM
llvm-project 601102d282d5e9a1429fea52ee17303aec8a7c10 renamed various
functions in `CharInfo.h` and `Lexer.h`. Rename uses in Swift.
2021-11-13 15:33:09 +10:00
Alex Hoppen
b888dc0e40 [Parser] Don't modify the current token kind when cutting off parsing
Previously, when we reached the maximum nesting level, we changed the current token’s kind to an EOF token. A lot of places in the parser are not set up to expect this token change. The intended workaround was to check whether pushing a structure marker failed (which would change the token kind) and bail out parsing if this happened. This was fragile and caused assertion failures in assert builds.

Instead of changing the current token’s kind, and failing to push the structure marker, let the lexer know that it should cut off lexing, essentially making the input buffer stop at the current position. The parser will continue to consume its current token (`Parser.Tok`) and the next token that’s already lexed in the lexer (`Lexer.NextToken`) before reaching the emulated EOF token. Thus two more tokens are parsed than before, but that shouldn’t make much of a difference.
2021-11-09 12:28:10 +01:00
Hamish Knight
71fb1691af [Frontend] Add -warn-on-editor-placeholder
This hidden frontend option lets us be more lax
when type-checking in the presence of editor
placeholders by treating them as holes during
constraint solving.
2021-08-18 13:21:05 +01:00
Dario Rexin
4486aac405 [Lexer] Avoid casting to signed char in switch statements (#37289)
This cast causes overflows on platforms that use unsigned char by default and, at least on linux-aarch64, causes false negatives in the switch statements.
2021-05-06 18:41:46 -07:00
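The log does not show the affected switch statements, so the following is only a sketch of one way such a false negative can arise. It assumes a case label written as a plain `char` constant on a platform where `char` is unsigned. All function names are hypothetical, and `unsigned char` stands in for that platform's `char`.

```cpp
// Value the switch condition has after the byte is cast to signed char and
// then promoted to int (0xFF becomes -1 on common platforms).
int switchValueWithSignedCast(unsigned char byte) {
  return static_cast<signed char>(byte);
}

// Value a plain-char case label has after promotion to int when char is
// unsigned (0xFF stays 255).
int caseLabelOnUnsignedCharPlatform(unsigned char byte) {
  return byte;
}

// A case label only fires when both promoted values agree.
bool labelMatches(unsigned char byte) {
  return switchValueWithSignedCast(byte) == caseLabelOnUnsignedCharPlatform(byte);
}
```

Under these assumptions ASCII bytes such as `'\n'` still match, while bytes at or above 0x80 never do, which would show up as a missed case rather than a crash.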
Alex Hoppen
f12c151823 [Lexer] Improve lexing of BOM trivia
Simplify lexing of BOM trivia, eliminating the need to manually
construct the trivia StringRef.
2021-02-11 10:53:07 +01:00
Alex Hoppen
a8c01365b8 [Lexer] Eliminate unnecessary calls to TriviaLexer::lexTrivia
If the lexer itself keeps track of where the first comment of a token
starts, we can avoid parsing trivia into pieces.
2021-02-05 08:15:55 +01:00
Alex Hoppen
a7641a7cd2 [Lexer] Adjust tests for new delayed trivia lexing 2021-02-05 08:15:54 +01:00
Alex Hoppen
6911553067 [Lexer] Push trivia lexing down to the parser
This is an intermediate state in which the lexer delegates the
responsibility for trivia lexing to the parser. Later, the parser will
delegate this responsibility to SyntaxParsingContext which will hand it
over to SyntaxParseAction, which will only lex the pieces if it is
really necessary to do so.
2021-02-05 08:15:54 +01:00
Alex Hoppen
2bf5e4e209 [Lexer] Extract trivia piece lexing to a separate TriviaLexer
The lexer is only responsible for skipping over trivia and noting their
length. A separate TriviaLexer can be invoked to split the raw trivia
string into its pieces.

Since most of the time the trivia pieces aren't needed, this will allow
us to later only parse trivia into pieces when they are explicitly
needed.
2021-02-04 14:27:28 +01:00
Minhyuk Kim
028594b740 [Parse] Move standalone_dollar_identifier diagnosis to Parser. Resolves SR-13092. 2020-11-04 21:26:54 +09:00
maustinstar
e53290de0e Remove redundant false delimiter check 2020-10-23 23:16:00 -04:00
maustinstar
1d2c426239 Distinguish raw strings from multiline delimiters [SR-10011] 2020-10-23 17:07:56 -04:00
Ben Barham
7594cfb0bc [Parse] Update Lexer::getLocFor*Line methods to return correct locations
When the location given to getLocForStartOfLine was an empty line, it
would actually return the location of the next line rather than the
given location as it should.

If the location given to getLocForEndOfLine was inside a token on a line
that was either empty or contained whitespace, it would skip to the end
of that token and then return the location for the next line. This was
an issue for multiline strings, where the string is a single token but
it's over multiple lines.
2020-08-12 17:25:54 +10:00
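The corrected behaviour for empty lines can be sketched over a plain character buffer. This is a simplified illustration using byte offsets rather than `SourceLoc`. The helpers `startOfLine` and `endOfLine` are hypothetical, and the sketch leaves out the token handling that the real methods need for multiline strings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: offset of the first character of the line containing
// `offset`. On an empty line (where buf[offset] is itself the newline) the
// result is `offset`, not the start of the following line.
std::size_t startOfLine(const std::string &buf, std::size_t offset) {
  while (offset > 0 && buf[offset - 1] != '\n')
    --offset;
  return offset;
}

// Hypothetical helper: offset just past the newline that ends the line
// containing `offset`, or the end of the buffer if there is no newline.
std::size_t endOfLine(const std::string &buf, std::size_t offset) {
  while (offset < buf.size() && buf[offset] != '\n')
    ++offset;
  return offset < buf.size() ? offset + 1 : offset;
}
```

For the buffer `"a\n\nb"`, offset 2 is the empty second line: `startOfLine` returns 2 and `endOfLine` returns 3, so neither skips a line.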
Suyash Srijan
7ee6319cdc [Parse] [Sema] Update confusables diagnostic to mention the character names as well (#33105)
* [Parser] Update 'Confusables.def' file to include confusable and base character names

* [Parser] Add a new utility method to return the names of the confusable and base characters for a given confusable codepoint

* [Parser] Update diagnostic for confusable character during lexing to mention confusable and base character names

* [Sema] If there is just a single confusable character, emit a tailored diagnostic that also mentions the character names

* [Diagnostics] Add new diagnostic messages to the localization file

* [Test] Update confusables test

* [Utils] Update unicode confusables txt file and update script to regenerate confusables def file

* [Parse] Regenerate 'Confusables.def' using updated script

* [Utils] Adjust generate_confusables script based on review feedback

Fix a mistake with name mapping. Updated header comment. Fix a couple of linting issues.

* [Parse] Regenerate 'Confusables.def' file once again after script changes

* [Parse] Add the newline after end of 'getConfusableAndBaseCodepointNames' method

* [Test] Update diagnostic message in 'Syntax/Parser/diags.swift'
2020-07-27 23:15:31 +01:00
Suyash Srijan
71f6797c8f [Lexer] Handle UTF8 characters in dollar identifier (#32961) 2020-07-18 13:20:22 +01:00
Anthony Latsis
a11cc4fcfc Handle more built-in operators and error intersections with the unwrap collision diagnostic 2020-02-22 03:55:43 +03:00
Frederick Kellison-Linn
71697c37ca Allow implicit self in escaping closures when self usage is unlikely to cause cycle (#23934)
* WIP implementation

* Cleanup implementation

* Install backedge rather than storing array reference

* Add diagnostics

* Add missing parameter to ResultFinderForTypeContext constructor

* Fix tests for correct fix-it language

* Change to solution without backedge, change lookup behavior

* Improve diagnostics for weak captures and captures under different names

* Remove ghosts of implementations past

* Address review comments

* Reorder member variable initialization

* Fix typos

* Exclude value types from explicit self requirements

* Add tests

* Add implementation for AST lookup

* Add tests

* Begin addressing review comments

* Re-enable AST scope lookup

* Add fixme

* Pull fix-its into a separate function

* Remove capturedSelfContext tracking from type property initializers

* Add const specifiers to arguments

* Address review comments

* Fix string literals

* Refactor implicit self diagnostics

* Add comment

* Remove trailing whitespace

* Add tests for capture list across multiple lines

* Add additional test

* Fix typo

* Remove use of ?: to fix linux build

* Remove second use of ?:

* Rework logic for finding nested self contexts
2019-12-20 02:38:41 +00:00
Rintaro Ishizaki
8edea315cd [Syntax] Abolish 'backtick' trivia
- Stop producing 'backtick' trivia for escaping identifier token. '`'s
  are now parts of the token text
- Adjust and simplify C++ libSyntax APIs
- Add 'is_deprecated' property to Trivia.py to attribute SwiftSyntax
  APIs

rdar://problem/54810608
2019-09-09 11:49:25 -07:00
Rintaro Ishizaki
cb308b7e53 Revert "Revert "[Parser] Decouple the parser from AST creation (part 2)""
This reverts commit 8ad3cc8a82.
2019-08-27 14:36:41 -07:00
Rintaro Ishizaki
8ad3cc8a82 Revert "[Parser] Decouple the parser from AST creation (part 2)" 2019-08-27 12:28:48 -07:00
Jan Svoboda
8da82d3272 Revert Lexer UTF8 replacement char changes 2019-08-26 23:15:45 +02:00
Rintaro Ishizaki
ad5b8253e3 [Lexer] Don't lex trailing trivia for eof (including artificial eof) 2019-08-26 22:37:20 +02:00
Jan Svoboda
77924c4b84 [Parser] Decouple the parser from AST creation (part 2)
Instead of creating the AST directly in the parser (and libSyntax or
SwiftSyntax via SyntaxParsingContext), make the Parser explicitly create
a tree of ParsedSyntaxNodes. Their OpaqueSyntaxNodes can be either
libSyntax or SwiftSyntax. If AST is needed, it can be generated from the
libSyntax tree.
2019-08-26 19:10:51 +02:00
Doug Gregor
7f293f66b3 [Parser] Allow use of $ declarations in all modes.
Allow the use of declarations whose names start with $ in all
modes. However, normal code cannot define new entities with names that
start with $: only the implementation can do that, e.g., for property
delegates.
2019-04-23 11:31:58 -07:00
Davide Italiano
49b2f4eba7 [DebuggerSupport] Unbreak closures in the expression parser.
The new trick is that of leaving an unresolved identifier for
the expression parser to fill in.

<rdar://problem/47982630>
2019-02-12 16:36:47 -08:00
Harlan Haskins
b81c491d3d [Lexer] Allow $-prefixed identifiers for parseable interfaces
We’re printing a new name for lazy storage in parseable interfaces,
`$__lazy_storage_$_{propname}`. This is intentionally $-prefixed so it cannot
conflict with variables written in source, but it doesn’t use a `.` anymore
because parseable interfaces need to be...parseable.
2019-01-22 11:02:36 -08:00
Harlan Haskins
f5fc6f0c57 [Lexer] Handle SwiftInterface files as well as SIL
Previously, the Lexer kept a single flag for whether we’re lexing Swift or SIL. Instead, keep track of whether we’re lexing Swift, SIL, or a .swiftinterface file. .swiftinterface files allow $-prefixed identifiers anywhere.
2019-01-22 11:02:36 -08:00
Argyrios Kyrtzidis
c7ac859310 [Parse] Optimize syntax parsing: Speed-up Lexer::lexTrivia()
Introduce ParsedTrivia which is a more efficient structure to use during lexing than syntax::Trivia.
2019-01-17 12:10:27 -08:00
Adrian Prantl
ff63eaea6f Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers in our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

      for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
2018-12-04 15:45:04 -08:00
Rintaro Ishizaki
56ce86178c [CodeCompletion] Eat identifier characters after CC position
Text in editors (e.g. Xcode) may contain identifiers following the CC
position which should be considered as "filtering text" for the code
completion.

For example, suppose a user types '@a' and the CC position is between
'@' and 'a'. The user probably expects attributes starting with 'a'.

Eat identifier characters after the CC token in the Lexer. With this change,
given '@<token>IB class' for instance, the parser can now detect that this
is the start of a class declaration and emit class attribute completions.

rdar://problem/46103294
2018-12-04 08:29:26 +09:00
Rintaro Ishizaki
4f058028b7 [Lexer] Micro optimization for getTokenAt()
lex() invokes lexImpl(). It doesn't need to lex next token.
2018-11-02 12:03:54 +09:00
Davide Italiano
3b96d21a2c Merge pull request #19808 from apple/bananaphone
[Lexer] Allow '$x' to be parsed as identifier in debugger mode.
2018-10-10 14:16:46 -07:00
Davide Italiano
2af9a1dc7f [Lexer] Allow '$x' to be parsed as identifier in debugger mode.
This is needed to support `print $0` in lldb where `$0` is the
shortcut for a closure argument. The associated lldb change
will be merged together with this one and will contain a test.

<rdar://problem/201719448>
2018-10-09 15:00:10 -07:00
Rintaro Ishizaki
8f7254d722 [Lexer] Skip comments in interpolated expression in string literal
when skipping to the end of the interpolated expression.
i.e. Skip the comment as a comment.

Previously, ')' or '"' in comment in interpolated expression used to
cause assertion failure or mis-compilation in no-assert build.

rdar://problem/20289969
2018-10-09 09:35:50 +09:00
Rintaro Ishizaki
a0ebdbc089 [Lexer] Fix assertion failure for unterminated string literal
in string interpolation in multiline string literal.

    """
    \("<-this is unterminated.
    """

In this case, the outer multiline literal should form 'tok::unknown'
along with an error message.
2018-09-22 02:09:19 +09:00
Rintaro Ishizaki
a5759b73b5 [Lexer] Improve fix-it to remove "too long" delimiter in string literal
Fix-it to remove extra '#'s at once.
2018-09-19 18:58:54 +09:00
Rintaro Ishizaki
aaa4110a7c [Lexer] Factor out diagnostics for single-quoted string literal
For readability.
2018-09-19 18:58:54 +09:00