Avoid path encoding difference (for example, real_path vs. path from
symlink) by eliminating the path from cache key. Cache key is now
encoded with the index of the input file from all the input files from
the command-line, reguardless if those inputs will produce output or
not. This is to ensure stable ordering even the batching is different.
Add a new cache computation API that is preferred for using input index
directly. Old API for cache key is deprecated but still updated to
fallback to real_path comparsion if needed.
As a result of swift scan API change, rename the feature in JSON file to
avoid version confusion between swift-driver and libSwiftScan.
rdar://119387650
The code, previously, only properly handled such dependencies being a distinct category for Swift source and Swift textual dependency infos. Swift binary module dependencies must handle this similarly and this change adds the missing support for them. Recent refactor of the scanner also means that now Swift binary dependencies with Swift overlay dependencies may crash the scanner, and this change resolves this as well.
Resolves rdar://117088840
Update swift cache key computation mechanism from one cache key per
output, to one cache key per primary input file (for all outputs that
associated with that input).
The new schema allows fewer cache lookups while still preserving most of
the flexibility for batch mode and incremental mode.
Allow DependencyScanner to canonicalize path using a prefix map. When
option `-scanner-prefix-map` option is used, dependency scanner will
remap all the input paths in following:
* all the paths in the CAS file system or clang include tree
* all the paths related to input on the command-line returned by scanner
This allows all the input paths to be canonicalized so cache key can be
computed reguardless of the exact on disk path.
The sourceFile field is not remapped so build system can track the exact
file as on the local file system.
'ModuleDependencyScanner' maintains a Thread Pool along with a pool of workers
which are capable of executing a filesystem lookup of a named module dependency.
When resolving imports of a given Swift module, each import's resolution
operation can be issued asunchronously.
From being a scattered collection of 'static' methods in ScanDependencies.cpp
and member methods of ASTContext. This makes 'ScanDependencies.cpp' much easier
to read, and abstracts the actual scanning logic away to a place with common
state which will make it easier to reason about in the future.
In case import resolution order somehow sometimes matters, it's prudent to process/resolve/locate implicitly-imported modules first.
Resolves rdar://113917657
dependencies
It is valuable for clients to be able to distinguish which dependencies of a
Swift module originated from 'import' statements, and which ones are implicit
dependency Swift overlays of imported Clang modules.
Instead of the code querying the compiler's built-in Clang instance, refactor the
dependency scanner to explicitly keep track of module output path. It is still
set according to '-module-cache-path' as it has been prior to this change, but
now the scanner can use a different module cache for scanning PCMs, as specified
with '-clang-scanner-module-cache-path', without affecting module output path.
Resolves rdar://113222853
The code of `ScanDependencies.cpp` was creating invalid JSON since #66031
because in the case of having `extraPcmArgs` and `swiftOverlayDependencies`,
but not `bridgingHeader`, a comma will not be added at the end of
`extraPcmArgs`, creating an invalid JSON file. Additionally that same PR
added a trailing comma at the end of the `swiftOverlayDependencies`, which
valid JSON does not allow, but that bug was removed in #66366.
Both problems are, however, present in the 5.9 branch, because #66936
included #66031, but not #66366.
Besides fixing the problem in `ScanDependencies.cpp` I modified every test
that uses `--scan-dependencies` to pass the produced JSON through
Python's `json.tool` in order to validate proper JSON is produced. In
most cases I was able to pipe the output of the tool into `FileCheck`,
but in some cases the validation is done by itself because the checks
depend on the exact format generated by `--scan-dependencies`. In
a couple of tests I added a call to `FileCheck` that seemed to be
missing.
Without these changes, two tests seems to be generating invalid JSON in
my machine:
- `ScanDependencies/local_cache_consistency.swift` (which outputs `Expecting ',' delimiter: line 525 column 11 (char 22799)`)
- `ScanDependencies/placholder_overlay_deps.swift`
Add a flag `finalized` to indicate that a module entry in the dependency
cache is finalized and no longer needs to be updated. This prevents the
command-line flags from dependency inputs get added multiple times on
re-scan with the same service.
While during normal compilation, adding the same command-line flags
multiple times are fine, it is bad for caching builds as a new
compilation cache key needs to be computed every time.
Make sure the sources for bridging header is added as part of the CAS
filesystem for the main module. Even bridging header should be compiled
into PCH, PCHs are not standalone that it can be used without source
file.
Unlike `swift-frontend -scan-dependencies` option, when dependency
scanner is used as a library by swift driver, the SwiftScanningService
is shared for multiple driver invocations. It can't keep states (like
common file dependencies) that can change from one invocation to
another.
Instead, the clang/SDK file dependencies are computed from each driver
invocations to avoid out-of-date information when scanning service is
reused.
The test case for a shared Service will be added to swift-driver repo
since there is no tool to test it within swift compiler.
Reformatting everything now that we have `llvm` namespaces. I've
separated this from the main commit to help manage merge-conflicts and
for making it a bit easier to read the mega-patch.
This is phase-1 of switching from llvm::Optional to std::optional in the
next rebranch. llvm::Optional was removed from upstream LLVM, so we need
to migrate off rather soon. On Darwin, std::optional, and llvm::Optional
have the same layout, so we don't need to be as concerned about ABI
beyond the name mangling. `llvm::Optional` is only returned from one
function in
```
getStandardTypeSubst(StringRef TypeName,
bool allowConcurrencyManglings);
```
It's the return value, so it should not impact the mangling of the
function, and the layout is the same as `std::optional`, so it should be
mostly okay. This function doesn't appear to have users, and the ABI was
already broken 2 years ago for concurrency and no one seemed to notice
so this should be "okay".
I'm doing the migration incrementally so that folks working on main can
cherry-pick back to the release/5.9 branch. Once 5.9 is done and locked
away, then we can go through and finish the replacement. Since `None`
and `Optional` show up in contexts where they are not `llvm::None` and
`llvm::Optional`, I'm preparing the work now by going through and
removing the namespace unwrapping and making the `llvm` namespace
explicit. This should make it fairly mechanical to go through and
replace llvm::Optional with std::optional, and llvm::None with
std::nullopt. It's also a change that can be brought onto the
release/5.9 with minimal impact. This should be an NFC change.
Rename `-enable-cas` to `-compile-cache-job` to align with clang option
names and promote that to a new driver only flag.
Few other additions to driver flag for caching behaviors:
* `-compile-cache-remarks`: now cache hit/miss remarks are guarded behind
this flag
* `-compile-cache-skip`: skip replaying from the cache. Useful as a
debugging tool to do the compilation using CAS inputs even the output
is a hit from the cache.
Teach swift dependency scanner to use CAS to capture the full dependencies for a build and construct build commands with immutable inputs from CAS.
This allows swift compilation caching using CAS.
Instead of being a part of 'directDependencies' on a module dependency info, make them a separate array of dependency IDs for Swift Source and Textual modules.
This will allow clients to still distinguish direct module dependencies imported from a given module, versus dependencies added because direct/transitive Clang module dependencies have Swift overlays.
This change does *not* remove overlay dependencies from 'directDependencies' yet, just adds them as a separate field on the module details info. A followup change will remove overlay and bridging header dependencies from 'directDependencies' once the clients have had a chance to adopt to this change.
For a `@Testable` import in program source, if a Swift interface dependency is discovered, and has an adjacent binary `.swiftmodule`, open up the module, and pull in its optional dependencies. If an optional dependency cannot be resolved on the filesystem, fail silently without raising a diagnostic.
Using a virutal output backend to capture all the outputs from
swift-frontend invocation. This allows redirecting and/or mirroring
compiler outputs to multiple location using different OutputBackend.
As an example usage for the virtual outputs, teach swift compiler to
check its output determinism by running the compiler invocation
twice and compare the hash of all its outputs.
Virtual output will be used to enable caching in the future.
For example, when scanning a source module `Foo`, which, when depending on module `Bar` causes a cross-import overlay `_Foo_Bar` to be added, do not add this cross-import overlay when scanning `Foo` itself. For example, if `Foo` adds a dependency on `Bar` itself in its own dependency graph.
Add them to the set of direct dependencies of the Swift module the bridging header belongs to, therefore also ensuiring that their module info will be contained in in the output graph.
Part of rdar://105742859
- '-o <output_path>'
- '-disable-implicit-swift-modules'
- '-Xcc -fno-implicit-modules' and '-Xcc -fno-implicit-module-maps'
- '-candidate-module-file'
These were previously supplied by the driver. Instead, they will now be ready to be run directly from the dependency scanner's output.
Do this by computing a transitive closure on the computed dependency graph, relying on the fact that it is a DAG.
The used algorithm is:
```
for each v ∈ V {
T(v) = { v }
}
for v ∈ V in reverse topological order {
for each (v, w) ∈ E {
T(v) = T(v) ∪ T(w)
}
}
```
Otherwise the scanning action will not look for them as dependencies, and the compilation it is used to inform will not specify these moduels as explicit inpouts.
Resolves rdar://104761392
This changes the scanner's behavior to "resolve" a discovered module's dependencies to a set of Module IDs: module name + module kind (swift textual, swift binary, clang, etc.).
The 'ModuleDependencyInfo' objects that are stored in the dependency scanner's cache now carry a set of kind-qualified ModuleIDs for their dependencies, in addition to unqualified imported module names of their dependencies.
Previously, the scanner's internal state would cache a module dependnecy as having its own set of dependencies which were stored as names of imported modules. This led to a design where any time we needed to process the dependency downstream from its discovery (e.g. cycle detection, graph construction), we had to query the ASTContext to resolve this dependency's imports, which shouldn't be necessary. Now, upon discovery, we "resolve" a discovered dependency by executing a lookup for each of its imported module names (this operation happens regardless of this patch) and store a fully-resolved set of dependencies in the dependency module info.
Moreover, looking up a given module dependency by name (via `ASTContext`'s `getModuleDependencies`) would result in iterating over the scanner's module "loaders" and querying each for the module name. The corresponding modules would then check the scanner's cache for a respective discovered module, and if no such module is found the "loader" would search the filesystem.
This meant that in practice, we searched the filesystem on many occasions where we actually had cached the required dependency, as follows:
Suppose we had previously discovered a Clang module "foo" and cached its dependency info.
-> ASTContext.getModuleDependencies("foo")
--> (1) Swift Module "Loader" checks caches for a Swift module "foo" and doesn't find one, so it searches the filesystem for "foo" and fails to find one.
--> (2) Clang Module "Loader" checks caches for a Clang module "foo", finds one and returns it to the client.
This means that we were always searching the filesystem in (1) even if we knew that to be futile.
With this change, queries to `ASTContext`'s `getModuleDependencies` will always check all the caches first, and only delegate to the scanner "loaders" if no cached dependency is found. The loaders are then no longer in the business of checking the cached contents.
To handle cases in the scanner where we must only lookup either a Swift-only module or a Clang-only module, this patch splits 'getModuleDependencies' into an alrady-existing 'getSwiftModuleDependencies' and a newly-added 'getClangModuleDependencies'.
Adopts Clang's 'DependencyScanningWorkerFilesystem' for use by the scanner, with the persistent
scanner instance keeping a 'DependencyScanningFilesystemSharedCache'.
Introduces a concept of a dependency scanning action context hash, which is used to select an instance of a global dependency scanning cache which gets re-used across dependency scanning actions.