Instead of being a part of 'directDependencies' on a module dependency info, make them a separate array of dependency IDs for Swift Source and Textual modules.
This will allow clients to still distinguish direct module dependencies imported from a given module, versus dependencies added because direct/transitive Clang module dependencies have Swift overlays.
This change does *not* remove overlay dependencies from 'directDependencies' yet, just adds them as a separate field on the module details info. A followup change will remove overlay and bridging header dependencies from 'directDependencies' once the clients have had a chance to adopt to this change.
Using a virutal output backend to capture all the outputs from
swift-frontend invocation. This allows redirecting and/or mirroring
compiler outputs to multiple location using different OutputBackend.
As an example usage for the virtual outputs, teach swift compiler to
check its output determinism by running the compiler invocation
twice and compare the hash of all its outputs.
Virtual output will be used to enable caching in the future.
Instead, treat them like any other module that is specific to the scanning context hash of the scan it originates from.
Otherwise we may actually have simultaneous scans happening for the same source module but with different context hashes, and the current scheme leads to collisions.
Using mutual exclusion, ensuring that multiple threads executing dependency scans do not encounter data races on shared mutable state.
There are two layers with shared state where we need to be careful:
- `DependencyScanningTool`, as the main entity that scanning clients interact with. This tool instantiates compiler instances for individual scans, computing a scanning invocation hash. It needs to remember those instances for future use, and when creating instances it needs to reset LLVM argument processor's global state, meaning all uses of argument processing must be in a critical section.
- `SwiftDependencyScanningService`, as the main cache where dependency scanning results are stored. Each individual scan instantiates a `ModuleDependenciesCache`, which uses the scanning service as the underlying storage. The services' storage is segmented to storing dependencies discovered in a scan with a given context hash, which means two different scanning invocations running at the same time will be accessing different locations in its storage, thus not requiring synchronization. But the service still has some shared state that must be protected, such as the collection of discovered source modules, and the map used to query context-hash-specific underlying cache storage.
Do this by computing a transitive closure on the computed dependency graph, relying on the fact that it is a DAG.
The used algorithm is:
```
for each v ∈ V {
T(v) = { v }
}
for v ∈ V in reverse topological order {
for each (v, w) ∈ E {
T(v) = T(v) ∪ T(w)
}
}
```
This changes the scanner's behavior to "resolve" a discovered module's dependencies to a set of Module IDs: module name + module kind (swift textual, swift binary, clang, etc.).
The 'ModuleDependencyInfo' objects that are stored in the dependency scanner's cache now carry a set of kind-qualified ModuleIDs for their dependencies, in addition to unqualified imported module names of their dependencies.
Previously, the scanner's internal state would cache a module dependnecy as having its own set of dependencies which were stored as names of imported modules. This led to a design where any time we needed to process the dependency downstream from its discovery (e.g. cycle detection, graph construction), we had to query the ASTContext to resolve this dependency's imports, which shouldn't be necessary. Now, upon discovery, we "resolve" a discovered dependency by executing a lookup for each of its imported module names (this operation happens regardless of this patch) and store a fully-resolved set of dependencies in the dependency module info.
Moreover, looking up a given module dependency by name (via `ASTContext`'s `getModuleDependencies`) would result in iterating over the scanner's module "loaders" and querying each for the module name. The corresponding modules would then check the scanner's cache for a respective discovered module, and if no such module is found the "loader" would search the filesystem.
This meant that in practice, we searched the filesystem on many occasions where we actually had cached the required dependency, as follows:
Suppose we had previously discovered a Clang module "foo" and cached its dependency info.
-> ASTContext.getModuleDependencies("foo")
--> (1) Swift Module "Loader" checks caches for a Swift module "foo" and doesn't find one, so it searches the filesystem for "foo" and fails to find one.
--> (2) Clang Module "Loader" checks caches for a Clang module "foo", finds one and returns it to the client.
This means that we were always searching the filesystem in (1) even if we knew that to be futile.
With this change, queries to `ASTContext`'s `getModuleDependencies` will always check all the caches first, and only delegate to the scanner "loaders" if no cached dependency is found. The loaders are then no longer in the business of checking the cached contents.
To handle cases in the scanner where we must only lookup either a Swift-only module or a Clang-only module, this patch splits 'getModuleDependencies' into an alrady-existing 'getSwiftModuleDependencies' and a newly-added 'getClangModuleDependencies'.
Introduces a concept of a dependency scanning action context hash, which is used to select an instance of a global dependency scanning cache which gets re-used across dependency scanning actions.
`getValue` -> `value`
`getValueOr` -> `value_or`
`hasValue` -> `has_value`
`map` -> `transform`
The old API will be deprecated in the rebranch.
To avoid merge conflicts, use the new API already in the main branch.
rdar://102362022
Doing so will allow clients to know which Swift-specific PCM arguments are already captured from the scan that first discovered this module.
SwiftDriver, in particular, will be able to use this information to avoid re-scanning a given Clang module if the initial scan was sufficient for all possible sets of PCM arguments on Swift modules that depend on said Clang module.
And only resolve cached dependencies that came from scanning actions with the same target triple.
This change means that the `GlobalModuleDependenciesCache` must be configured with a specific target triple for every scannig action, and it will only resolve previously-found dependencies from previous scannig actions using the exact same triple.
Furthermore, the `GlobalModuleDependenciesCache` separately tracks source-file-based module dependencies as those represent main Swift modules of previous scanning actions, and we must be able to resolve those regardless of the target triple.
Resolves rdar://83105455
These kinds of modules differ from `SwiftTextual` modules in that they do not have an interface and have source-files.
It is cleaner to enforce this distinction with types, instead of checking for interface optionality everywhere.
This change causes the cache to be layered with a local "cache" that wraps the global cache, which will serve as the source of truth. The local cache persists only for the duration of a given scanning action, and has a store of references to dependencies resolved as a part of the current scanning action only, while the global cache is the one that persists across scanning actions (e.g. in `DependencyScanningTool`) and stores actual module dependency info values.
Only the local cache can answer dependency lookup queries, checking current scanning action results first, before falling back to querying the global cache, with queries disambiguated by the current scannning action's search paths, ensuring we never resolve a dependency lookup query with a module info that could not be found in the current action's search paths.
This change is required because search-path disambiguation can lead to false-negatives: for example, the Clang dependency scanner may find modules relative to the compiler's path that are not on the compiler's direct search paths. While such false-negative query responses should be functionally safe, we rely on the current scanning action's results being always-present-in-the-cache for the scanner's functionality. This layering ensures that the cache use-sites remain unchanged and that we get both: preserved global state which can be queried disambiguated with the search path details, and an always-consistent local (current action) cache state.
The dependency scanner's cache persists across different queries and answering a subsequent query's module lookup with a module not in the query's search path is not correct.
For example, suppose we are looking for a Swift module `Foo` with a set of search paths `SP`.
And dependency scanner cache already contains a module `Foo`, for which we found an interface file at location `L`. If `L`∉`SP`, then we cannot re-use the cached entry because we’d be resolving the scanning query to a filesystem location that the current scanning context is not aware of.
Resolves rdar://81175942
Using the serialization format added in https://github.com/apple/swift/pull/37585.
- Add load/save code for the `-scan-dependencies` code-path.
- Add `libSwiftDriver` entry-points to load/store the cache of a given scanner instance.