C++ interop is now enabled in SwiftCompilerSources, so we can remove some of the C bridging layer and use C++ classes directly from Swift.
rdar://83361087
This is unfortunately needed to ensure we correctly
re-lex regex literal tokens correctly, which is
needed for diagnostic logic to correctly compute
source ranges.
rdar://92469692
This fixes:
* An issue where the diagnostic messages were leaked
* Diagnose at correct position inside the regex literal
To do this:
* Introduce 'Parse' SwiftCompiler module that is a bridging layer
between '_CompilerRegexParser' and C++ libParse
* Move libswiftParseRegexLiteral and libswiftLexRegexLiteral to 'Parse'
Also this change makes 'SwiftCompilerSources/Package.swift' be configured
by CMake so it can actually be built with 'swift-build'.
rdar://92187284
As the _MatchingEngine module no longer contains the matching engine, this patch renames this module to describe its role more accurately. Because this module primarily contains the AST and the regex parsing logic, I propose we rename it to "_RegexParser".
Also renames the ExperimentalRegex module in SwiftCompilerSources to _RegexParser for consistency. This would prevent errors if sources in _RegexParser used qualified lookup with the module name.
Instead of returning a parser error, which results
in generic parser recovery that skips until the
next decl, return an `ErrorExpr` so we can
continue parsing.
When parsing a regular expression literal, accept a serialized capture structure from the regex parser. During type checking, decode it and form Swift types.
Examples:
```swift
'/(.)(.)/' // ==> `Regex<(Substring, Substring)>`
'/(?<label>.)(.)/' // ==> `Regex<(label: Substring, Substring)`
'/((.))*((.)?)/' //==> `Regex<([Substring], [Substring], Substring, Substring?)>`
```
Also:
- Fix a bug where a regex literal parsing error is not returning an error parser result.
Note:
- This needs to land after apple/swift-experimental-string-processing#92 and after `dev/4` tag has been created.
- See apple/swift-experimental-string-processing#92 for regex parser changes and the capture structure encoding.
- The `RegexLiteralParsingFn` `CaptureStructureOut` pointer type change from `char *` to `void *` will not break builds due to implicit pointer conversion (SE-0324) and unchanged ABI.
Resolves rdar://83253511.
This reverts commit a67a0436f7, reversing
changes made to 9965df76d0.
This commit or the earlier commit this commit is based on (#40531) broke the
incremental bot.
Update the lexing implementation to defer to the
regex library, which will pass back the pointer
from to resume lexing, and update the emission to
call the new `Regex(_regexString:version:)`
overload, that will accept the regex string with
delimiters.
Because this uses the library's lexing
implementation, the delimiters are now `'/.../'`
and `'|...|'` instead of plain `'...'`.
- Frontend: Implicitly import `_StringProcessing` when frontend flag `-enable-experimental-string-processing` is set.
- Type checker: Set a regex literal expression's type as `_StringProcessing.Regex<(Substring, DynamicCaptures)>`. `(Substring, DynamicCaptures)` is a temporary `Match` type that will help get us to an end-to-end working system. This will be replaced by actual type inference based a regex's pattern in a follow-up patch (soon).
- SILGen: Lower a regex literal expression to a call to `_StringProcessing.Regex.init(_regexString:)`.
- String processing runtime: Add `Regex`, `DynamicCaptures` (matching actual APIs in apple/swift-experimental-string-processing), and `Regex(_regexString:)`.
Upcoming:
- Build `_MatchingEngine` and `_StringProcessing` modules with sources from apple/swift-experimental-string-processing.
- Replace `DynamicCaptures` with inferred capture types.
With `-enable-experimental-string-processing`,
start lexing `'` delimiters as regex literals (this
is just a placeholder delimiter for now). The
contents of which gets passed to the libswift
library, which can return an error string to be
emitted, or null for success.
The libswift side isn't yet hooked up to the Swift
regex parser, so for now just emit a dummy
diagnostic for regexes starting with quantifiers.
If successful, build an AST node which will be
emitted as an implicit call to an
`init(_regexString:)` initializer of an in-scope
`Regex` decl (which will eventually be a known
stdlib decl).