The code, previously, only properly handled such dependencies being a distinct category for Swift source and Swift textual dependency infos. Swift binary module dependencies must handle this similarly and this change adds the missing support for them. Recent refactor of the scanner also means that now Swift binary dependencies with Swift overlay dependencies may crash the scanner, and this change resolves this as well.
Resolves rdar://117088840
Update swift cache key computation mechanism from one cache key per
output, to one cache key per primary input file (for all outputs that
associated with that input).
The new schema allows fewer cache lookups while still preserving most of
the flexibility for batch mode and incremental mode.
It is possible that import resolution failed because we are attempting to resolve a module which can only be brought in via a modulemap of a different Clang module dependency which is not otherwise on the current search paths. For example, suppose we are scanning a '.swiftinterface' for module 'Foo', which contains:
'''
@_exported import Foo
import Bar
...
Where 'Foo' is the underlying Framework clang module whose '.modulemap' defines an auxiliary module 'Bar'. Because 'Foo' is a framework, its modulemap is under '<some_framework_search_path>/Foo.framework/Modules/module.modulemap'. Which means that lookup of `Bar` alone from Swift will not be able to locate the module in it. However, the lookup of Foo will itself bring in the auxiliary module becuase the Clang scanner instance scanning for clang module Foo will be able to find it in the corresponding framework module's modulemap and register it as a dependency which means it will be registered with the scanner's cache in the step above. To handle such cases, we first add all successfully-resolved modules and (for Clang modules) their transitive dependencies to the cache, and then attempt to re-query imports for which resolution originally failed from the cache. If this fails, then the scanner genuinely failed to resolve this dependency.
Allow DependencyScanner to canonicalize path using a prefix map. When
option `-scanner-prefix-map` option is used, dependency scanner will
remap all the input paths in following:
* all the paths in the CAS file system or clang include tree
* all the paths related to input on the command-line returned by scanner
This allows all the input paths to be canonicalized so cache key can be
computed reguardless of the exact on disk path.
The sourceFile field is not remapped so build system can track the exact
file as on the local file system.
'ModuleDependencyScanner' maintains a Thread Pool along with a pool of workers
which are capable of executing a filesystem lookup of a named module dependency.
When resolving imports of a given Swift module, each import's resolution
operation can be issued asunchronously.
From being a scattered collection of 'static' methods in ScanDependencies.cpp
and member methods of ASTContext. This makes 'ScanDependencies.cpp' much easier
to read, and abstracts the actual scanning logic away to a place with common
state which will make it easier to reason about in the future.
In case import resolution order somehow sometimes matters, it's prudent to process/resolve/locate implicitly-imported modules first.
Resolves rdar://113917657
dependencies
It is valuable for clients to be able to distinguish which dependencies of a
Swift module originated from 'import' statements, and which ones are implicit
dependency Swift overlays of imported Clang modules.
Instead of the code querying the compiler's built-in Clang instance, refactor the
dependency scanner to explicitly keep track of module output path. It is still
set according to '-module-cache-path' as it has been prior to this change, but
now the scanner can use a different module cache for scanning PCMs, as specified
with '-clang-scanner-module-cache-path', without affecting module output path.
Resolves rdar://113222853
The code of `ScanDependencies.cpp` was creating invalid JSON since #66031
because in the case of having `extraPcmArgs` and `swiftOverlayDependencies`,
but not `bridgingHeader`, a comma will not be added at the end of
`extraPcmArgs`, creating an invalid JSON file. Additionally that same PR
added a trailing comma at the end of the `swiftOverlayDependencies`, which
valid JSON does not allow, but that bug was removed in #66366.
Both problems are, however, present in the 5.9 branch, because #66936
included #66031, but not #66366.
Besides fixing the problem in `ScanDependencies.cpp` I modified every test
that uses `--scan-dependencies` to pass the produced JSON through
Python's `json.tool` in order to validate proper JSON is produced. In
most cases I was able to pipe the output of the tool into `FileCheck`,
but in some cases the validation is done by itself because the checks
depend on the exact format generated by `--scan-dependencies`. In
a couple of tests I added a call to `FileCheck` that seemed to be
missing.
Without these changes, two tests seems to be generating invalid JSON in
my machine:
- `ScanDependencies/local_cache_consistency.swift` (which outputs `Expecting ',' delimiter: line 525 column 11 (char 22799)`)
- `ScanDependencies/placholder_overlay_deps.swift`
Add a flag `finalized` to indicate that a module entry in the dependency
cache is finalized and no longer needs to be updated. This prevents the
command-line flags from dependency inputs get added multiple times on
re-scan with the same service.
While during normal compilation, adding the same command-line flags
multiple times are fine, it is bad for caching builds as a new
compilation cache key needs to be computed every time.
Make sure the sources for bridging header is added as part of the CAS
filesystem for the main module. Even bridging header should be compiled
into PCH, PCHs are not standalone that it can be used without source
file.
Unlike `swift-frontend -scan-dependencies` option, when dependency
scanner is used as a library by swift driver, the SwiftScanningService
is shared for multiple driver invocations. It can't keep states (like
common file dependencies) that can change from one invocation to
another.
Instead, the clang/SDK file dependencies are computed from each driver
invocations to avoid out-of-date information when scanning service is
reused.
The test case for a shared Service will be added to swift-driver repo
since there is no tool to test it within swift compiler.
Reformatting everything now that we have `llvm` namespaces. I've
separated this from the main commit to help manage merge-conflicts and
for making it a bit easier to read the mega-patch.
This is phase-1 of switching from llvm::Optional to std::optional in the
next rebranch. llvm::Optional was removed from upstream LLVM, so we need
to migrate off rather soon. On Darwin, std::optional, and llvm::Optional
have the same layout, so we don't need to be as concerned about ABI
beyond the name mangling. `llvm::Optional` is only returned from one
function in
```
getStandardTypeSubst(StringRef TypeName,
bool allowConcurrencyManglings);
```
It's the return value, so it should not impact the mangling of the
function, and the layout is the same as `std::optional`, so it should be
mostly okay. This function doesn't appear to have users, and the ABI was
already broken 2 years ago for concurrency and no one seemed to notice
so this should be "okay".
I'm doing the migration incrementally so that folks working on main can
cherry-pick back to the release/5.9 branch. Once 5.9 is done and locked
away, then we can go through and finish the replacement. Since `None`
and `Optional` show up in contexts where they are not `llvm::None` and
`llvm::Optional`, I'm preparing the work now by going through and
removing the namespace unwrapping and making the `llvm` namespace
explicit. This should make it fairly mechanical to go through and
replace llvm::Optional with std::optional, and llvm::None with
std::nullopt. It's also a change that can be brought onto the
release/5.9 with minimal impact. This should be an NFC change.
Rename `-enable-cas` to `-compile-cache-job` to align with clang option
names and promote that to a new driver only flag.
Few other additions to driver flag for caching behaviors:
* `-compile-cache-remarks`: now cache hit/miss remarks are guarded behind
this flag
* `-compile-cache-skip`: skip replaying from the cache. Useful as a
debugging tool to do the compilation using CAS inputs even the output
is a hit from the cache.
Teach swift dependency scanner to use CAS to capture the full dependencies for a build and construct build commands with immutable inputs from CAS.
This allows swift compilation caching using CAS.
Instead of being a part of 'directDependencies' on a module dependency info, make them a separate array of dependency IDs for Swift Source and Textual modules.
This will allow clients to still distinguish direct module dependencies imported from a given module, versus dependencies added because direct/transitive Clang module dependencies have Swift overlays.
This change does *not* remove overlay dependencies from 'directDependencies' yet, just adds them as a separate field on the module details info. A followup change will remove overlay and bridging header dependencies from 'directDependencies' once the clients have had a chance to adopt to this change.
Other parts of the scanner lib (e.g. target-info query) already do this. We must always make sure to process the incoming command-line strings and run them through 'llvm::cl::TokenizeGNUCommandLine' in order to process escaped paths.
Part of rdar://106712169
For a `@Testable` import in program source, if a Swift interface dependency is discovered, and has an adjacent binary `.swiftmodule`, open up the module, and pull in its optional dependencies. If an optional dependency cannot be resolved on the filesystem, fail silently without raising a diagnostic.
Using a virutal output backend to capture all the outputs from
swift-frontend invocation. This allows redirecting and/or mirroring
compiler outputs to multiple location using different OutputBackend.
As an example usage for the virtual outputs, teach swift compiler to
check its output determinism by running the compiler invocation
twice and compare the hash of all its outputs.
Virtual output will be used to enable caching in the future.
Instead, treat them like any other module that is specific to the scanning context hash of the scan it originates from.
Otherwise we may actually have simultaneous scans happening for the same source module but with different context hashes, and the current scheme leads to collisions.
For example, when scanning a source module `Foo`, which, when depending on module `Bar` causes a cross-import overlay `_Foo_Bar` to be added, do not add this cross-import overlay when scanning `Foo` itself. For example, if `Foo` adds a dependency on `Bar` itself in its own dependency graph.
Using mutual exclusion, ensuring that multiple threads executing dependency scans do not encounter data races on shared mutable state.
There are two layers with shared state where we need to be careful:
- `DependencyScanningTool`, as the main entity that scanning clients interact with. This tool instantiates compiler instances for individual scans, computing a scanning invocation hash. It needs to remember those instances for future use, and when creating instances it needs to reset LLVM argument processor's global state, meaning all uses of argument processing must be in a critical section.
- `SwiftDependencyScanningService`, as the main cache where dependency scanning results are stored. Each individual scan instantiates a `ModuleDependenciesCache`, which uses the scanning service as the underlying storage. The services' storage is segmented to storing dependencies discovered in a scan with a given context hash, which means two different scanning invocations running at the same time will be accessing different locations in its storage, thus not requiring synchronization. But the service still has some shared state that must be protected, such as the collection of discovered source modules, and the map used to query context-hash-specific underlying cache storage.
Add them to the set of direct dependencies of the Swift module the bridging header belongs to, therefore also ensuiring that their module info will be contained in in the output graph.
Part of rdar://105742859
- '-o <output_path>'
- '-disable-implicit-swift-modules'
- '-Xcc -fno-implicit-modules' and '-Xcc -fno-implicit-module-maps'
- '-candidate-module-file'
These were previously supplied by the driver. Instead, they will now be ready to be run directly from the dependency scanner's output.
Do this by computing a transitive closure on the computed dependency graph, relying on the fact that it is a DAG.
The used algorithm is:
```
for each v ∈ V {
T(v) = { v }
}
for v ∈ V in reverse topological order {
for each (v, w) ∈ E {
T(v) = T(v) ∪ T(w)
}
}
```
We would previously unconditionally treat the command line as GNU style arguments. However, Windows uses a different command-line style, and this would incorrectly process the arguments, potentially corrupting paths which do not quote the path separator. Ideally, we would introduce a new api (`swiftscan_compiler_target_info_query_v3`?) that takes a quoting style (matching `--rsp-quoting`) which would allow us to support both quoting styles properly.
Otherwise the scanning action will not look for them as dependencies, and the compilation it is used to inform will not specify these moduels as explicit inpouts.
Resolves rdar://104761392
This new version takes the path to the compiler executable as a parameter, in order for libSwiftScan to compute compiler-relative portions of runtimeLibraryPaths, runtimeResourcePath. V1, without knowing the path to the compiler executable, produced incomplete sets of these paths.