Commit Graph

218 Commits

Author SHA1 Message Date
Anthony Latsis
06a5670c8f Basic: Untie swift::SourceLoc from llvm::SMLoc
Storing a `llvm::SMLoc` is a superfluous indirection, and getting rid of
it enables us to unconditionally import `SourceLoc` into Swift.
2025-07-11 18:48:42 +01:00
Tony Allevato
2b0f9aa765 Allow module aliases to use escaped identifiers as the alias name.
The original module names themselves must still be valid unescaped identifiers; most of the serialization logic in the compiler depends on the name of a module matching its name on the file system, and it would be very complex to turn escaped identifiers into file-safe names.
2025-03-11 17:18:43 -04:00
Tony Allevato
d94bd80c62 Add support for raw identifiers.
Raw identifiers are backtick-delimited identifiers that can contain any
non-identifier character other than the backtick itself, CR, LF, or other
non-printable ASCII code units, and which are also not composed entirely
of operator characters.
2025-03-11 17:18:43 -04:00
Ben Barham
ef8825bfe6 Migrate llvm::Optional to std::optional
LLVM has removed llvm::Optional, move over to std::optional. Also
clang-format to fix up all the renamed #includes.
2024-02-21 11:20:06 -08:00
pinkjuice66
88619a5c5a [Parse] Remove unused parameters and return value of lexTrivia function 2024-01-05 01:27:29 +09:00
Evan Wilde
f3ff561c6f [NFC] add llvm namespace to Optional and None
This is phase-1 of switching from llvm::Optional to std::optional in the
next rebranch. llvm::Optional was removed from upstream LLVM, so we need
to migrate off rather soon. On Darwin, std::optional, and llvm::Optional
have the same layout, so we don't need to be as concerned about ABI
beyond the name mangling. `llvm::Optional` is only returned from one
function in
```
getStandardTypeSubst(StringRef TypeName,
                     bool allowConcurrencyManglings);
```
It's the return value, so it should not impact the mangling of the
function, and the layout is the same as `std::optional`, so it should be
mostly okay. This function doesn't appear to have users, and the ABI was
already broken 2 years ago for concurrency and no one seemed to notice
so this should be "okay".

I'm doing the migration incrementally so that folks working on main can
cherry-pick back to the release/5.9 branch. Once 5.9 is done and locked
away, then we can go through and finish the replacement. Since `None`
and `Optional` show up in contexts where they are not `llvm::None` and
`llvm::Optional`, I'm preparing the work now by going through and
removing the namespace unwrapping and making the `llvm` namespace
explicit. This should make it fairly mechanical to go through and
replace llvm::Optional with std::optional, and llvm::None with
std::nullopt. It's also a change that can be brought onto the
release/5.9 with minimal impact. This should be an NFC change.
2023-06-27 09:03:52 -07:00
Robert Widmann
d1cadb86c5 Remove Explicit Trivia from Lexer 2022-11-16 14:52:28 -08:00
Hamish Knight
515945fc6d [Parse] Avoid skipping bodies with /.../ regex literals
While skipping, if we encounter a token that looks
like it could be the start of a `/.../` regex
literal, fall back to parsing the function or type
body normally, as such a token could become a
regex literal. As such, it could treat `{` and
`}` as literal, or otherwise have contents that
would be lexically invalid Swift.

To avoid falling back in too many cases, we apply
the existing regex literal heuristics. Cases that
pass the heuristic fall back to regular parsing.
Cases that fail the heuristic are further checked
to make sure they wouldn't contain an unbalanced
`{` or `}`, but otherwise are allowed to be
skipped. This allows us to continue skipping for
most occurrences of infix and prefix `/`.

This is meant as a lower risk workaround to fix the
the issue, we ought to go back to handling regex
literals in the lexer.

Resolves rdar://95354010
2022-06-22 20:05:21 +01:00
Hamish Knight
325ba43396 [Parse] NFC: Split off tryScanRegexLiteral
This is a `const` version of `tryLexRegexLiteral`
that does not advance `CurPtr`, but attempts to
scan ahead to the end of a regex literal.
2022-06-22 20:05:21 +01:00
Hamish Knight
4f5bf99e07 [Parse] NFC: Allow sub-lexers to disable diagnostics
And `const` qualify a couple of params/methods.
2022-06-22 20:05:19 +01:00
Hamish Knight
ba28b6a19b [Parse] Split prefix operators from regex in the lexer
Teach the lexer not to consider `/` an operator
character when attempting to re-lex a regex
literal. This allows us to split off a prefix
operator.

Previously this was done after-the-fact in the
parser, but that didn't cover the unapplied infix
operator case, and didn't form a `tok::amp_prefix`
for `foo(&/.../)`, which led to a suboptimal
diagnostic.

This also now means we'll split an operator for
cases such as `foo(!/^/)` rather than treating it
as an unapplied infix operator.

rdar://92469917
2022-04-29 10:53:56 +01:00
Josh Soref
4721852fcb Spelling parse (#42469)
* spelling: appear

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: availability

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: available

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: coerce

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: collection

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: condition

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: conditional

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: delimiter

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: derived

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: diagnostics

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: disambiguation

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: dropped

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: escaped

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: existence

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: expression

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: expressions

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: extended

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: furthermore

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: identifier

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: indentation

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: inspect

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: miscellaneous

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: multiline

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: offset

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: passthrough

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: precede

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: prefix

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: receiver

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: reference

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: registered

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: representing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: returned

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: sequence

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: should

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: successfully

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: that

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: the

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: trivia

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unsupported

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: whitespace

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

Co-authored-by: Josh Soref <jsoref@users.noreply.github.com>
2022-04-21 09:31:40 -07:00
Hamish Knight
f1a799037e [Parse] Introduce /.../ regex literals
Start parsing regex literals with `/.../`
delimiters.

rdar://83253726
2022-04-12 16:03:49 +01:00
Hamish Knight
080d59b3df [Lexer] Delay token diagnostics
Queue up diagnostics when lexing, waiting until
`Lexer::lex` is called before emitting them. This
allows us to re-lex without having to deal with
previously invalid tokens.
2022-04-12 16:03:47 +01:00
Michael Ilseman
7bff9da67d Revert "Revert "Merge pull request #40595 from hamishknight/straw-bales"" 2021-12-19 10:08:48 -07:00
Arnold Schwaighofer
9511994e52 Revert "Merge pull request #40595 from hamishknight/straw-bales"
This reverts commit a67a0436f7, reversing
changes made to 9965df76d0.

This commit or the earlier commit this commit is based on (#40531) broke the
incremental bot.
2021-12-18 11:02:37 -08:00
Hamish Knight
128f5d4bc6 Update regex literal lexing and emission
Update the lexing implementation to defer to the
regex library, which will pass back the pointer
from to resume lexing, and update the emission to
call the new `Regex(_regexString:version:)`
overload, that will accept the regex string with
delimiters.

Because this uses the library's lexing
implementation, the delimiters are now `'/.../'`
and `'|...|'` instead of plain `'...'`.
2021-12-17 18:05:31 +00:00
Hamish Knight
37f16520e6 Prototype regex literal AST and emission
With `-enable-experimental-string-processing`,
start lexing `'` delimiters as regex literals (this
is just a placeholder delimiter for now). The
contents of which gets passed to the libswift
library, which can return an error string to be
emitted, or null for success.

The libswift side isn't yet hooked up to the Swift
regex parser, so for now just emit a dummy
diagnostic for regexes starting with quantifiers.

If successful, build an AST node which will be
emitted as an implicit call to an
`init(_regexString:)` initializer of an in-scope
`Regex` decl (which will eventually be a known
stdlib decl).
2021-12-06 21:16:14 +00:00
Michael Ilseman
2740e2707c Experimental Regex Strawperson (use Swift in the parser) (#40117)
[regex] Use Swift in the parser

Add in a strawperson use of Swift by the parser, for
future regex support.
2021-11-14 07:11:47 -07:00
Alex Hoppen
b888dc0e40 [Parser] Don't modify the current token kind when cutting off parsing
Previously, when we reached the maximum nesting level, we changed the current token’s kind to an EOF token. A lot of places in the parser are not set up to expect this token change. The intended workaround was to check whether pushing a structure marker failed (which would change the token kind) and bail out parsing if this happened. This was fragile and caused assertion failures in assert builds.

Instead of changing the current token’s kind, and failing to push the structure marker, let the lexer know that it should cut off lexing, essentially making the input buffer stop at the current position. The parser will continue to consume its current token (`Parser.Tok`) and the next token that’s already lexed in the lexer (`Lexer.NextToken`) before reaching the emulated EOF token. Thus two more tokens are parsed than before, but that shouldn’t make much of a difference.
2021-11-09 12:28:10 +01:00
Rintaro Ishizaki
ed1db1bed2 [CodeCompletion] Explain why results aren't recommended
* Implement 'getDiagnosticSeverity()' and 'getDiagnosticMessage()' on
  'CodeCompletionResult'
* Differentiate 'RedundantImportIndirect' from 'RedundantImport'
* Make non-Sendable check respects '-warn-concurrency'

rdar://76129658
2021-08-10 17:11:14 -07:00
Alex Hoppen
f12c151823 [Lexer] Improve lexing of BOM trivia
Simplify lexing of BOM trivia, eliminating the need to manually
construct the trivia StringRef.
2021-02-11 10:53:07 +01:00
Alex Hoppen
a8c01365b8 [Lexer] Eliminate unnecessary calls to TriviaLexer::lexTrivia
If the lexer itself keeps track of where the first comment of a token
starts, we can avoid parsing trivia into pieces.
2021-02-05 08:15:55 +01:00
Alex Hoppen
6911553067 [Lexer] Push trivia lexing down to the parser
This is an intermediate state in which the lexer delegates the
responsibility for trivia lexing to the parser. Later, the parser will
delegate this responsibility to SyntaxParsingContext which will hand it
over to SyntaxParseAction, which will only lex the pieces if it is
really necessary to do so.
2021-02-05 08:15:54 +01:00
Alex Hoppen
2bf5e4e209 [Lexer] Extract trivia piece lexing to a separate TriviaLexer
The lexer is only responsible for skipping over trivia and noting their
length. A separate TriviaLexer can be invoked to split the raw trivia
string into its pieces.

Since most of the time the trivia pieces aren't needed, this will allow
us to later only parse trivia into pieces when they are explicitly
needed.
2021-02-04 14:27:28 +01:00
Ben Barham
7594cfb0bc [Parse] Update Lexer::getLocFor*Line methods to return correct locations
When the location given to getLocForStartOfLine was an empty line, it
would actually return the location of the next line rather than the
given location as it should.

If the location given to getLocForEndOfLine was inside a token on a line
that was either empty or contained whitespace, it would skip to the end
of that token and then return the location for the next line. This was
an issue for multiline strings, where the string is a single token but
it's over multiple lines.
2020-08-12 17:25:54 +10:00
Frederick Kellison-Linn
71697c37ca Allow implicit self in escaping closures when self usage is unlikely to cause cycle (#23934)
* WIP implementation

* Cleanup implementation

* Install backedge rather than storing array reference

* Add diagnostics

* Add missing parameter to ResultFinderForTypeContext constructor

* Fix tests for correct fix-it language

* Change to solution without backedge, change lookup behavior

* Improve diagnostics for weak captures and captures under different names

* Remove ghosts of implementations past

* Address review comments

* Reorder member variable initialization

* Fix typos

* Exclude value types from explicit self requirements

* Add tests

* Add implementation for AST lookup

* Add tests

* Begin addressing review comments

* Re-enable AST scope lookup

* Add fixme

* Pull fix-its into a separate function

* Remove capturedSelfContext tracking from type property initializers

* Add const specifiers to arguments

* Address review comments

* Fix string literals

* Refactor implicit self diagnostics

* Add comment

* Remove trailing whitespace

* Add tests for capture list across multiple lines

* Add additional test

* Fix typo

* Remove use of ?: to fix linux build

* Remove second use of ?:

* Rework logic for finding nested self contexts
2019-12-20 02:38:41 +00:00
Rintaro Ishizaki
8768832f24 Revert "Merge pull request #27281 from rintaro/reapply-syntaxparse-genericparam"
This reverts commit 5d3e8d6c83, reversing
changes made to 27e881d97e.
2019-10-14 12:46:31 -07:00
Rintaro Ishizaki
0569cbfb28 Revert "Revert "[SyntaxParse] Parse generic parameter clause and generic where clause""
This reverts commit 1584e87aa7.
2019-09-20 15:26:04 -07:00
Rintaro Ishizaki
1584e87aa7 Revert "[SyntaxParse] Parse generic parameter clause and generic where clause" 2019-09-20 14:02:53 -07:00
Rintaro Ishizaki
f919b2ddd8 [SyntaxParse] Parse generic parameter clause and generic where clause 2019-09-19 23:09:58 -07:00
Harlan Haskins
c82c9b8210 [ModuleInterfaces] Remove references to 'parseable' interfaces everywhere
Now that we've settled on Module Interface as the name, let's remove the
vestiges of "Parseable Interfaces"
2019-09-13 14:55:48 -07:00
Doug Gregor
7f293f66b3 [Parser] Allow use of $ declarations in all modes.
Allow the use of declarations whose names start with $ in all
modes. However, normal code cannot define new entities with names that
start with $: only the implementation can do that, e.g., for property
delegates.
2019-04-23 11:31:58 -07:00
Harlan Haskins
f5fc6f0c57 [Lexer] Handle SwiftInterface files as well as SIL
Previously, the Lexer kept a single flag whether we’re lexing Swift or SIL. Instead, keep track if we’re parsing Swift, SIL, or a Swiftinterface file. .swiftinterface files allow $-prefixed identifiers anywhere.
2019-01-22 11:02:36 -08:00
Argyrios Kyrtzidis
c7ac859310 [Parse] Optimize syntax parsing: Speed-up Lexer::lexTrivia()
Introduce ParsedTrivia which is a more efficient structure to use during lexing than syntax::Trivia.
2019-01-17 12:10:27 -08:00
Adrian Prantl
ff63eaea6f Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

      for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
2018-12-04 15:45:04 -08:00
Rintaro Ishizaki
143f55a6e5 [Lexer] Add formStringLiteralToken dedicated for forming string literal 2018-09-19 18:58:54 +09:00
John Holdsworth
e55d4254d2 [Parse] Allow multiline attribute messages (SE-200) (#19219)
Multiline string literal at attribute message position was disallowed in
59778f8ecb.

Reworked to try to at least get multiline strings working which might be
useful as messages for attributes (for example a detailed “unavailable”
annotation) minus the code which read off the start of the StringRef buffer.
2018-09-14 10:33:32 +09:00
Jordan Rose
e180bf9d4e [Parse] Tweak a utility function that relies on reading past the end (#19230)
Lexer::getEncodedStringSegment (now getEncodedStringSegmentImpl)
assumes that it can read one byte past the end of a string segment in
order to avoid bounds-checks on things like "is this a \r\n
sequence?". However, the function was being used for strings that did
not come from source where this assumption was not always valid.
Change the reusable form of the function to always copy into a
temporary buffer, allowing the fast path to continue to be used for
normal parsing.

Caught by ASan!

rdar://problem/44306756
2018-09-11 20:11:58 -07:00
John Holdsworth
4da8cbe655 Implement SE-0200 (extended escaping in string literals)
Supports string literals like #"foo"\n"bar"#.
2018-09-06 15:19:52 -07:00
Brent Royal-Gordon
df22ea1bfb Revert "[Parse] Implementation for SE-200 (raw strings)" 2018-09-06 12:22:41 -07:00
swift-ci
192587c98a Merge pull request #17668 from johnno1962a/master 2018-09-06 11:47:36 -07:00
John Holdsworth
dc96342368 Response to xwu's review 2018-09-02 11:37:02 +01:00
John Holdsworth
6bd7cb884c Pragmatic support of multiline/delimited in attributes 2018-08-20 08:43:53 +01:00
Jordan Rose
fc9ea1e329 Add Lexer::IsHashbangAllowed, drop SourceManager::getHashbangBufferID (#18534)
Having this be a single buffer hardcoded in the SourceManager and set
by all clients is silly. SourceFiles with the 'Main' kind are allowed
to have hashbang lines (`#!`), other files are not. And anyone
manually setting up a Lexer can decide for themselves.

No intended behavioral change.
2018-08-07 08:25:05 -07:00
John Holdsworth
2317048f44 Alternative implementation for raw strings 2018-07-09 18:38:54 +01:00
John Holdsworth
14213b84bd Revised implementation for raw strings 2018-07-02 12:54:06 +01:00
Alex Hoppen
de9737c946 [incrParse] Support incremental parsing for edited files 2018-05-22 08:52:33 -07:00
Huon Wilson
00c32698e2 [Parse] Put indent-is-4-spaces assumption in one place.
This single location can, theoretically, be made more intelligent about
deducing indent from elsewhere and all the consumers will just work.
2018-04-04 10:34:33 +10:00
omochimetaru
3c8057e13f [Parse] Remove unnecessary Lexer fields 2018-03-12 23:35:13 +09:00