`libInternalSwiftSyntaxParser.dylib` currently doesn’t link against `SwiftExperimentalStringProcessing`, so it can’t use the regex lexing functions defined within. This caused SwiftSyntax to fail if the source code contained regex literals.
Implement a fallback regex lexing function in C++ and use it for SwiftSyntax parsing.
rdar://93580240
Co-authored-by: Rintaro Ishizaki <rishizaki@apple.com>
Treat a prefix operator containing `/` the same as
the unapplied infix operator case, where we
tentatively lex. This means that we bail if there
is no closing `/` or the starting character is
invalid. This leaves a binary operator containing
`/` in expression position as the last place where
we know that we definitely have a regex literal.
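
A minimal sketch of the tentative lexing described above, not the compiler's actual code. The helper name `tryLexRegexLiteral` and the exact set of invalid starting characters are assumptions made for illustration. The point is only that the attempt bails, rather than diagnosing, when there is no closing `/` or the first character is invalid.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not the real lexer API.
// Tentatively lex a `/.../` literal starting at `start`. Returns the index one
// past the closing `/`, or std::nullopt if the caller should bail and lex an
// operator instead.
std::optional<std::size_t> tryLexRegexLiteral(std::string_view src,
                                              std::size_t start) {
  if (start >= src.size() || src[start] != '/')
    return std::nullopt;

  std::size_t i = start + 1;
  // Assumed rule: the body may not be empty or start with whitespace or `)`.
  if (i >= src.size() || src[i] == ' ' || src[i] == '\t' || src[i] == '\n' ||
      src[i] == ')' || src[i] == '/')
    return std::nullopt;

  for (; i < src.size(); ++i) {
    char c = src[i];
    if (c == '\n')
      return std::nullopt; // no closing `/` on this line: bail
    if (c == '\\') {
      ++i; // skip the escaped character
      continue;
    }
    if (c == '/')
      return i + 1; // found the closing delimiter
  }
  return std::nullopt; // ran off the end without a closing `/`
}
```

With this shape, `!/^/` can be handled by first splitting off `!` and then calling the helper at the `/`.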
Code completion might occur inside an attribute that isn’t part of the AST because it’s missing a `VarDecl` that it could be attached to. In these cases, record the `CustomAttr` and type check it standalone, pretending it was part of a `DeclContext`.
This also fixes a few issues where code completion previously wouldn’t find the attribute constructor call and thus wasn’t providing code completion inside the property wrapper.
rdar://92842803
Teach the lexer not to consider `/` an operator
character when attempting to re-lex a regex
literal. This allows us to split off a prefix
operator.
Previously this was done after-the-fact in the
parser, but that didn't cover the unapplied infix
operator case, and didn't form a `tok::amp_prefix`
for `foo(&/.../)`, which led to a suboptimal
diagnostic.
This also now means we'll split an operator for
cases such as `foo(!/^/)` rather than treating it
as an unapplied infix operator.
rdar://92469917
This fixes:
* An issue where the diagnostic messages were leaked
* Diagnostics being reported at the wrong position inside the regex literal
To do this:
* Introduce 'Parse' SwiftCompiler module that is a bridging layer
between '_CompilerRegexParser' and C++ libParse
* Move libswiftParseRegexLiteral and libswiftLexRegexLiteral to 'Parse'
Also this change makes 'SwiftCompilerSources/Package.swift' be configured
by CMake so it can actually be built with 'swift-build'.
rdar://92187284
Queue up diagnostics when lexing, waiting until
`Lexer::lex` is called before emitting them. This
allows us to re-lex without having to deal with
previously invalid tokens.
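
A toy model of the queueing idea, with hypothetical class and member names. It assumes that re-lexing discards whatever was queued for the abandoned attempt, which is one plausible reading of the description above rather than a statement about the real implementation.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy model; names are hypothetical and do not mirror the real Lexer.
class QueueingLexer {
  std::vector<std::string> Pending; // diagnostics queued while scanning
  std::vector<std::string> Emitted; // diagnostics actually shown to the user

public:
  // Called while scanning a token; nothing is surfaced yet.
  void diagnose(std::string Message) { Pending.push_back(std::move(Message)); }

  // Assumed behavior: re-lexing drops diagnostics from the abandoned attempt.
  void discardPending() { Pending.clear(); }

  // Models `Lexer::lex`: the token is committed, so flush the queue.
  void lex() {
    for (auto &M : Pending)
      Emitted.push_back(std::move(M));
    Pending.clear();
  }

  const std::vector<std::string> &emitted() const { return Emitted; }
};
```

Only diagnostics for the token that is finally handed out reach the user; a failed tentative attempt leaves no trace.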
When recovering from a parser error in an expression, we resumed parsing at a '{'. I assume this was because we wanted to continue inside e.g. an if-body if parsing the condition failed, but it actually causes more issues, because when parsing e.g.
```swift
expr + has - error +
functionTakesClosure {
}
```
we continue parsing at the `{` of the trailing closure, which is a completely garbage location to continue parsing.
The motivating example for this change was (in a result builder)
```swift
Text("\(island.#^COMPLETE^#)")
takeTrailingClosure {}
```
Here `Text(…)` has an error (because it contains a code completion token) and thus we skip `takeTrailingClosure`, effectively parsing
```swift
Text(….) {}
```
which the type checker wasn’t very happy with and thus refused to provide code completion. With this change, we completely drop `takeTrailingClosure {}`. The type checker is a lot happier with that.
As the _MatchingEngine module no longer contains the matching engine, this patch renames this module to describe its role more accurately. Because this module primarily contains the AST and the regex parsing logic, I propose we rename it to "_RegexParser".
Also renames the ExperimentalRegex module in SwiftCompilerSources to _RegexParser for consistency. This would prevent errors if sources in _RegexParser used qualified lookup with the module name.
Instead of setting the code completion position when parsing the if-statement, which doesn’t create a `CodeCompletionExpr`, parse it as a new top-level expression.
As far as test-cases are concerned, this removes the “RareKeyword” flair from top-level completions in the modified test case. This makes sense IMO.
When parsing a regular expression literal, accept a serialized capture structure from the regex parser. During type checking, decode it and form Swift types.
Examples:
```swift
'/(.)(.)/' // ==> `Regex<(Substring, Substring)>`
'/(?<label>.)(.)/' // ==> `Regex<(label: Substring, Substring)>`
'/((.))*((.)?)/' // ==> `Regex<([Substring], [Substring], Substring, Substring?)>`
```
Also:
- Fix a bug where a regex literal parsing error did not return an error parser result.
Note:
- This needs to land after apple/swift-experimental-string-processing#92 and after `dev/4` tag has been created.
- See apple/swift-experimental-string-processing#92 for regex parser changes and the capture structure encoding.
- The `RegexLiteralParsingFn` `CaptureStructureOut` pointer type change from `char *` to `void *` will not break builds due to implicit pointer conversion (SE-0324) and unchanged ABI.
Resolves rdar://83253511.
This reverts commit a67a0436f7, reversing
changes made to 9965df76d0.
This commit or the earlier commit this commit is based on (#40531) broke the
incremental bot.
Update the lexing implementation to defer to the
regex library, which will pass back the pointer
from which to resume lexing, and update the emission to
call the new `Regex(_regexString:version:)`
overload, which accepts the regex string with
delimiters.
Because this uses the library's lexing
implementation, the delimiters are now `'/.../'`
and `'|...|'` instead of plain `'...'`.
With `-enable-experimental-string-processing`,
start lexing `'` delimiters as regex literals (this
is just a placeholder delimiter for now). The
contents are passed to the libswift
library, which can return an error string to be
emitted, or null for success.
The libswift side isn't yet hooked up to the Swift
regex parser, so for now just emit a dummy
diagnostic for regexes starting with quantifiers.
If successful, build an AST node which will be
emitted as an implicit call to an
`init(_regexString:)` initializer of an in-scope
`Regex` decl (which will eventually be a known
stdlib decl).
Previously, when we reached the maximum nesting level, we changed the current token’s kind to an EOF token. A lot of places in the parser are not set up to expect this token change. The intended workaround was to check whether pushing a structure marker failed (which would change the token kind) and bail out parsing if this happened. This was fragile and caused assertion failures in assert builds.
Instead of changing the current token’s kind, and failing to push the structure marker, let the lexer know that it should cut off lexing, essentially making the input buffer stop at the current position. The parser will continue to consume its current token (`Parser.Tok`) and the next token that’s already lexed in the lexer (`Lexer.NextToken`) before reaching the emulated EOF token. Thus two more tokens are parsed than before, but that shouldn’t make much of a difference.
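
A toy model of the cut-off, with hypothetical names and a pre-split token list instead of a character buffer. It is only meant to show why two more tokens are seen before the emulated EOF: the parser's current token and the lexer's already-lexed lookahead both stay valid.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy lexer over pre-split tokens; names are hypothetical.
class CutoffLexer {
  std::vector<std::string> Tokens;
  std::size_t Next = 0;  // index of the next token to lex
  std::size_t End;       // artificial end of the input
  std::string NextToken; // already-lexed lookahead, like `Lexer.NextToken`

  std::string lexImpl() { return Next < End ? Tokens[Next++] : "<eof>"; }

public:
  explicit CutoffLexer(std::vector<std::string> T)
      : Tokens(std::move(T)), End(Tokens.size()) {
    NextToken = lexImpl();
  }

  // Hand out the lookahead token and lex a new one behind it.
  std::string lex() {
    std::string Result = std::move(NextToken);
    NextToken = lexImpl();
    return Result;
  }

  // Pretend the input stops at the current position. The lookahead token
  // that was already lexed is left untouched.
  void cutOffLexing() { End = Next; }
};
```

No token changes kind after the fact, so parser code that inspects the current token keeps working unmodified.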
Split up the expr list parsing members such that
there are separate entry points for tuple and
argument list parsing, and start using the argument
list parsing member for call and subscripts.
* Implement 'getDiagnosticSeverity()' and 'getDiagnosticMessage()' on
'CodeCompletionResult'
* Differentiate 'RedundantImportIndirect' from 'RedundantImport'
* Make the non-Sendable check respect '-warn-concurrency'
rdar://76129658
With the introduction of `isolated` as
a type modifier for actor types, the
parsing of a parameter regressed such
that `isolated` was no longer accepted
as an ordinary argument label. This patch
fixes that and adds a little lookahead
utility to clean up the code that
disambiguates the uses of `isolated`
as either a label or a type modifier.
Resolves rdar://80300022