We're going to play a dirty, dirty trick - but it'll make our users'
lives better in the end so stick with me here.
In order to build up an incremental compilation, we need two sources of
dependency information:
1) "Priors" - Swiftdeps with dependency information from the past
build(s)
2) "Posteriors" - Swiftdeps with dependencies from after we rebuild the
file or module or whatever
With normal swift files built in incremental mode, the priors are given by the
swiftdeps files which are generated parallel to a swift file and usually
placed in the build directory alongside the object files. Because we
have entries in the output file map, we can always know where these
swiftdeps files are. The priors are integrated by the driver and then
the build is scheduled. As the build runs and jobs complete, their
swiftdeps are reloaded and re-integrated. The resulting changes are then
traversed and more jobs are scheduled if necessary. These give us the
posteriors we desire.
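
A minimal sketch of that load, build, reload loop, as a toy model rather
than the driver's actual code. The dictionary shape (`provides` mapping
names to fingerprints, `uses` as a set of names) and the `compile_file`
callback are assumptions made up for illustration:

```python
from collections import deque

def run_incremental_build(priors, changed_files, compile_file):
    """Toy scheduling loop (hypothetical model, not the real driver).

    priors:        {file: {"provides": {name: fingerprint}, "uses": {names}}}
    changed_files: files whose sources changed since the last build
    compile_file:  callback returning the posterior swiftdeps for a file
    """
    graph = dict(priors)                 # integrate the priors
    queue = deque(sorted(changed_files))
    scheduled = set(changed_files)
    order = []
    while queue:
        current = queue.popleft()
        old = graph.get(current, {"provides": {}, "uses": set()})
        new = compile_file(current)      # the job completes
        graph[current] = new             # reload and re-integrate
        order.append(current)
        names = set(old["provides"]) | set(new["provides"])
        changed_names = {
            n for n in names
            if old["provides"].get(n) != new["provides"].get(n)
        }
        # Traverse the changes and schedule any job that uses them.
        for other in sorted(graph):
            if other not in scheduled and graph[other]["uses"] & changed_names:
                scheduled.add(other)
                queue.append(other)
    return order
```

In this model a file is rescheduled only when a name it uses changes
fingerprint between the prior and the posterior.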
A module flips this on its head. The swiftdeps information serialized
in a module functions as the *posterior* since the driver consuming the
module has no way of knowing how to rebuild the module, and because its
dependencies are, for all intents and purposes, fixed in time. The
missing piece of the puzzle is the priors. That is, we need some way of
knowing what the "past" interface of the module looked like so we can
compare it to the "present" interface. Moreover, we need to always know
where to look for these priors.
We solve this problem by serializing a file alongside the build record:
the "external" build record. This is given by a... creative encoding
of multiple source file dependency graphs into a single source file
dependency graph. The rough structure of this is:
SourceFile => interface <BUILD_RECORD>.external
| - Incremental External Dependency => interface <MODULE_1>.swiftmodule
| | - <dependency> ...
| | - <dependency> ...
| | - <dependency> ...
| - Incremental External Dependency => interface <MODULE_2>.swiftmodule
| | - <dependency> ...
| | - <dependency> ...
| - Incremental External Dependency => interface <MODULE_3>.swiftmodule
| - ...
It's sorta like `cat`'ing a bunch of source file dependency graphs
together, but with incremental external dependency nodes acting as glue.
Now for the trick:
We have to unpack this structure and integrate it to get our priors.
This is easy. The tricky bit comes in integrate itself. Because the
top-level source file node points directly at the external build record,
not the original swift modules that defined these dependency nodes, we
swap the key it wants to use (the external build record) for the
incremental external dependency acting as the "parent" of the dependency
node. We do this by following the arc we carefully laid down in the
structure above.
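
A rough sketch of the packing and the key swap, using plain dictionaries
in place of the real serialized graph. The field names (`source_file`,
`external_dependencies`, `module`, `nodes`) and both function names are
hypothetical:

```python
def pack_external_record(record_path, module_graphs):
    """Glue several per-module graphs under one source file node.

    module_graphs: {swiftmodule path: iterable of dependency node names}
    """
    return {
        "source_file": record_path,
        "external_dependencies": [
            {"module": module, "nodes": list(nodes)}
            for module, nodes in module_graphs.items()
        ],
    }

def integrate_external_record(record):
    """Return {dependency node: owning swiftmodule}.

    A naive integration would key every node by record["source_file"].
    Instead, follow the arc from each node up to its incremental
    external dependency "parent" and use that module as the key.
    """
    owners = {}
    for external in record["external_dependencies"]:
        for node in external["nodes"]:
            owners[node] = external["module"]
    return owners
```

After integration, no node in this model is attributed to the external
build record itself; each one points back at the swiftmodule that
defined it.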
For rdar://69595010
Goes a long way towards rdar://48955139, rdar://64238133
In order to extract the module dependency graph from the compilation the driver just ran, define a separate semantic type to hold a result code and the graph itself.
The "wave" of a compilation job describes the number of indirections through other compile jobs the driver required to reach the decision to schedule a job. In incremental mode, it should always be the case that it takes no more than two complete waves to arrive at a fixpoint in the build. This is a natural consequence of the structure of the dependencies emitted by the Swift frontend - namely we rely on transitivity in dependency arcs.
A quick proof sketch: Suppose an arbitrary perturbation of the inputs to an incremental compilation session is made. In the first wave, dependency edges from the prior build's state (the "zeroth wave") are loaded and the files corresponding to invalidated edges are scheduled into the first wave. Supposing the second wave is not the null set - the trivial case - there are additional arcs that were invalidated. Now suppose that there were a third wave. Take an arbitrary arc invalidated by this third wave. It must be the case that the file containing the use is not new - else it would be scheduled. Further, it must be the case that its def was not invalidated by the zeroth or first waves of compilation; otherwise we would have scheduled it into the first or second waves. Finally, it must have a use that was discovered in the second wave. But in order for that use to have been included in the second wave, there must have been an invalidated arc created by the first wave. By transitivity of dependency arcs, there must therefore be a dependency arc from a definition invalidated in the first wave to our third wave job, which implies that the file would be scheduled into the second wave!
[Insert contradiction pig image here]
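
The argument can be illustrated with a toy simulation. This is a sketch under simplifying assumptions: arcs are modeled per file rather than per declaration, and every rebuilt file is assumed to invalidate all of its users (the worst case). Both function names are hypothetical.

```python
def transitive_closure(arcs):
    """arcs: {defining file: set of files that use it}."""
    closed = {d: set(users) for d, users in arcs.items()}
    changed = True
    while changed:
        changed = False
        for d in closed:
            for u in list(closed[d]):
                # Users of a user become direct users of d.
                extra = closed.get(u, set()) - closed[d] - {d}
                if extra:
                    closed[d] |= extra
                    changed = True
    return closed

def schedule_waves(arcs, modified):
    """Return {file: wave number} for every job that ends up scheduled."""
    waves = {f: 1 for f in modified}
    frontier = set(modified)
    wave = 1
    while frontier:
        wave += 1
        next_frontier = set()
        for d in frontier:
            for u in arcs.get(d, ()):
                if u not in waves:
                    waves[u] = wave
                    next_frontier.add(u)
        frontier = next_frontier
    return waves

# A chain a -> b -> c without transitive arcs needs three waves.
chain = {"a": {"b"}, "b": {"c"}}
chain_waves = schedule_waves(chain, {"a"})

# With transitive arcs, c is a direct user of a and lands in wave two.
closed_waves = schedule_waves(transitive_closure(chain), {"a"})
```

Here `chain_waves` places `c` in a third wave, while `closed_waves` never goes past the second, which is the shape of the contradiction above.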
Unblock the SwiftWASM project, which relies on an incremental build of
the Swift driver that in turn relies on the merge-modules job always
being run. The situation appears to be something like this:
1) An incremental build is run
2) Temporary swiftmodule outputs are laid down
3) merge-modules is skipped
4) modulewrap is run anyways and reads the empty temp file
We should fix this by skipping modulewrap if we can skip merge-modules.
But for now, be conservative and fall back to the status quo behavior of
always running merge-modules whenever we encounter a modulewrap job.
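
A sketch of the conservative fallback, with job kinds modeled as plain
strings. The function name and the string spellings of the job kinds are
assumptions for illustration only:

```python
def apply_modulewrap_fallback(planned_kinds, skipped_kinds):
    """Return the job kinds that may still be skipped.

    planned_kinds: kinds of all jobs in the build plan
    skipped_kinds: kinds the incremental logic would like to skip
    """
    skipped = set(skipped_kinds)
    if "modulewrap" in planned_kinds:
        # modulewrap would read an empty temp file if merge-modules
        # were skipped, so force merge-modules to run.
        skipped.discard("merge-modules")
    return skipped
```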
Plumb the logic necessary to schedule merge-modules incrementally. This means that if any input jobs to merge-modules run, merge-modules is run. But, if they are all skipped, merge-modules will be skipped as well.
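
The rule can be sketched as follows. The `Job` class and `needs_to_run` are hypothetical stand-ins, not the driver's types; the point is only that a job without a swiftdeps file takes its decision from its inputs.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Job:
    kind: str
    inputs: List["Job"] = field(default_factory=list)
    swiftdeps: Optional[str] = None  # merge-modules has no swiftdeps file

def needs_to_run(job, dirty_swiftdeps):
    """Compile jobs consult their swiftdeps; other jobs follow their inputs."""
    if job.swiftdeps is not None:
        return job.swiftdeps in dirty_swiftdeps
    # No swiftdeps to extract: run if and only if some input job runs.
    return any(needs_to_run(i, dirty_swiftdeps) for i in job.inputs)
```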
This requires some light special-casing of the legacy driver's incremental job handling because it assumes in a few places it can always extract a swiftdeps file. This invariant will be further broken when the precompile step for bridging headers is skipped as well.
rdar://65893400
A more durable form of #34218. Keep a side cache of externally-dependent
jobs for now. This ensures our pseudo-Jobs don't get prematurely
deallocated before the tracing machinery has had a chance to report the
structure of the Job graph.
rdar://70053563
An incremental build involving incremental external dependencies behaves as a hybrid between an external dependency and a normal swiftdeps-laden Swift file.
In the simplest case, we will fall back to the behavior of a plain external dependency today. That is, we will check its timestamp, then schedule all jobs that involve these external dependencies if it is out of date.
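
A minimal sketch of that fallback. Modification times are passed in as plain numbers so the example stays self-contained; the function and parameter names are assumptions.

```python
def jobs_for_stale_externals(external_mtimes, last_build_time, users):
    """Schedule every job that involves an out-of-date external dependency.

    external_mtimes: {external dependency path: modification time}
    last_build_time: time the previous build started
    users:           {external dependency path: set of jobs that load it}
    """
    stale = {p for p, mtime in external_mtimes.items() if mtime > last_build_time}
    to_run = set()
    for path in stale:
        to_run |= users.get(path, set())
    return to_run
```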
Where things get interesting is when cross-module incremental builds are enabled. In such a case, we know that a previous compiler has already emitted serialized swiftdeps information inside of a swiftmodule file. Moreover, we know that that swiftmodule file was loaded by the build of the current swift module. Finally, thanks to the previous stack of commits, we now know exactly how to extract this information from the swiftmodule file. To bring this all home, we unpack incremental dependency information from external dependencies, then integrate them into the current dependency graph - as though they were any other swiftdeps file. This neatly extends the single-module incremental logic to the multi-module case.
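
The hybrid behavior might be sketched like this. It is a toy model: the dictionary layout, the `out_of_date` flag standing in for the timestamp check, and the choice to gate the cross-module path on that flag are all assumptions, not a description of the driver's code.

```python
def jobs_to_schedule(module, prior, graph, cross_module_incremental):
    """Decide which jobs to run for one incremental external dependency.

    module: {"out_of_date": bool, "swiftdeps": {name: fingerprint} or None}
    prior:  {name: fingerprint} recorded for this module by the last build
    graph:  {"users_of_module": set of jobs,
             "users_of_name": {name: set of jobs}}
    """
    if not module["out_of_date"]:
        return set()
    posterior = module.get("swiftdeps")
    if not cross_module_incremental or posterior is None:
        # Plain external dependency: reschedule every job that uses it.
        return set(graph["users_of_module"])
    # Integrate the embedded swiftdeps as though it were any other
    # swiftdeps file and reschedule only users of names that changed.
    changed = {
        n for n in set(prior) | set(posterior)
        if prior.get(n) != posterior.get(n)
    }
    jobs = set()
    for name in changed:
        jobs |= graph["users_of_name"].get(name, set())
    return jobs
```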
Treat any incremental external depends like normal external depends. This will eventually become the fallback behavior for cross-module incremental builds.
Annotate the covered switches with `llvm_unreachable` to avoid a spew of
warnings from MSVC, which does not recognise covered switches.
Restructure fine-grained-dependencies to enable unit testing
Get frontend to emit correct swiftdeps file (fine-grained when needed) and only emit dot file for -emit-fine-grained-dependency-sourcefile-dot-files
Use deterministic order for more information outputs.
Set EnableFineGrainedDependencies consistently in frontend.
Tolerate errors that result in null getExtendedNominal()
Fix memory issue by removing node everywhere.
Break up print routine
Be more verbose so it will compile on Linux.
Sort batchable jobs, too.