Just as with the result cache, instead of a single DenseMap with
type-erased AnyRequest keys, we can use per-request maps for a
nice performance improvement.
This is based on an earlier patch by @hamishknight.
The idea is that instead of caching results in a single DenseMap
that maps AnyRequest to AnyValue, we instead define a separate
DenseMap for each request kind that directly uses the request as
the key, and the request value as the value.
This avoids type erasure and memory allocation overhead arising
from the use of AnyRequest and AnyValue. There are no remaining
usages of AnyValue, and the only usage of AnyRequest is now in
the reference dependency tracking, which can be refactored to use
a similar strategy of storing per-request maps as well.
Other than simplifying some code, the big improvement here is
that we 'freeze' the reference dependencies for a request down
to a simple vector. We only use a DenseSet to store dependencies
of active requests.
This method had a messy contract:
- Setting the diags parameter to nullptr inhibited caching
- The initExpr out parameter could only used if no result
had yet been cached
Let's instead use the request evaluator here.
LLVM, as of 77e0e9e17daf0865620abcd41f692ab0642367c4, now builds with
-Wsuggest-override. Let's clean up the swift sources rather than disable
the warning locally.
In order for private dependencies to be completely correct, it must perform the name lookup unioning step when a cached request is replayed - not just when lookups are first performed. In order to reduce the overhead of this union operation, it is not necessary to walk the entire active request stack, just walk to the nearest cached request in the stack and union into that. When it is popped, its replay step will itself union into the next cached request.
To see why, consider a request graph:
A* -> B -> C*
|
-> D*
where A, C, and D are cached.
If a caller were to force C and D, then force A independenty, today we would *only* replay the names looked up by C and D the first time A was evaluated. That is, subsequent evaluations of A do not replay the correct set of names. If we were to perform the union step during replay as well, requests that force A would also see C and D’s lookups.
Without this, callers that force requests like the DeclChecker have to be wary of the way they force the interface type request so other files see the right name sets.
rdar://64008262
Split off the notion of "recording" dependencies from the notion of
"collecting" dependencies. This corrects an oversight in the previous
design where dependency replay and recording were actually not "free" in
WMO where we actually never track dependencies. This architecture also
lays the groundwork for the removal of the referenced name trackers.
The algorithm builds upon the infrastructure for dependency sources and
sinks laid down during the cut over to request-based dependency tracking
in #30723.
The idea of the naive algorithm is this:
For a chain of requests A -> B* -> C -> D* -> ... -> L where L is a lookup
request and all starred requests are cached, once L writes into the
dependency collector, the active stack is walked and at each cache-point
the results of dependency collection are associated with the request
itself (in this example, B* and D* have all the names L found associated
with them). Subsequent evaluations of these cached requests (B* and D*
et al) will then *replay* the previous lookup results from L into the
active referenced name tracker. One complication is, suppose the
evaluation of a cached request involves multiple downstream name
lookups. More concretely, suppose we have the following request trace:
A* -> B -> L
|
-> C -> L
|
-> D -> L
|
-> ...
Then A* must see the union of the results of each L. If this reminds
anyone of a union-find, that is no accident! A persistent union-find
a la Conchon and Filliatre is probably in order to help bring down peak
heap usage...
Finish off private intransitive dependencies with an implementation of
dependency replay.
For the sake of illustration, imagine a chain of requests
A -> B -> C -> ...
Supposing each request is never cached, then every invocation of the
compiler with the same inputs will always kick off the exact same set of
requests. For the purposes of dependency tracking, that also means every
single lookup request will run without issue, and all dependencies will
be accurately reported. But we live in a world with cached requests.
Suppose request B* is cached. The first time we encounter that request,
its evaluation order looks identical:
A -> B* -> C -> ...
If we are in a mode that compiles single primaries, this is not
a problem because every request graph will look like this.
But if we are in a mode where we are compiling multiple primaries, then
subsequent request graphs will *actually* hit the cache and never
execute request C or any of its dependent computations!
A -> B*
Supposing C was a lookup request, that means the name(s) looked up
downstream of B* will *never* be recorded in the referenced name tracker
which can lead to miscompilation. Note that this is not a problem
inherent to the design of the request evaluator - caches in the compiler
have *always* hidden dependent lookups. In fact, the request evaluator
provides us our first opportunity to resolve this correctness bug!
Add a mode bit to the dependency collector that respects the frontend flag in the previous commit.
Notably, we now write over the dependency files at the end of the compiler pipeline when this flag is on so that dependency from SILGen and IRGen are properly written to disk.
Define a new type DependencyCollector that abstracts over the
incremental dependency gathering logic. This will insulate the
request-based name tracking code from future work on private,
intransitive dependencies.
* Document a number of legacy conditions and edge cases
* Add lexicon definitions for "dependency source", "dependency sink",
"cascading dependency" and "private dependency"
Convert most of the name lookup requests and a few other ancillary typechecking requests into dependency sinks.
Some requests are also combined sinks and sources in order to emulate the current scheme, which performs scope changes based on lookup flags. This is generally undesirable, since it means those requests cannot immediately be generalized to a purely context-based scheme because they depend on some client-provided entropy source. In particular, the few callers that are providing the "known private" name lookup flag need to be converted to perform lookups in the appropriate private context.
Clients that are passing "no known dependency" are currently considered universally incorrect and are outside the scope of the compatibility guarantees. This means that request-based dependency tracking registers strictly more edges than manual dependency tracking. It also means that once we fixup the clients that are passing "known private", we can completely remove these name lookup flags.
Finally, some tests had to change to accomodate the new scheme. Currently, we go out of our way to register a dependency edge for extensions that declare protocol conformances. However, we were also asserting in at least one test that extensions without protocol conformances weren't registering dependency edges. This is blatantly incorrect and has been undone now that the request-based scheme is automatically registering this edge.
Formalize DependencyScope, DependencySource, and the incremental dependency stack.
Also specialize SimpleRequest to formalize dependency sources and dependency sinks. This allows the evaluator's internal entrypoints to specalize away the incremental dependency tracking infrastructure if a request is not actually dependency-relevant.
A request is intended to be a pure function of its inputs. That function could, in theory, fail. In practice, there were basically no requests taking advantage of this ability - the few that were using it to explicitly detect cycles can just return reasonable defaults instead of forwarding the error on up the stack.
This is because cycles are checked by *the Evaluator*, and are unwound by the Evaluator.
Therefore, restore the idea that the evaluate functions are themselves pure, but keep the idea that *evaluation* of those requests may fail. This model enables the best of both worlds: we not only keep the evaluator flexible enough to handle future use cases like cancellation and diagnostic invalidation, but also request-based dependencies using the values computed at the evaluation points. These aforementioned use cases would use the llvm::Expected interface and the regular evaluation-point interface respectively.
This wrapper is used to type erase a reference to
a request on the stack, and can be converted into
an AnyRequest when persistent storage is required.
Switch the evaluator over to using ActiveRequest
for its stack of active requests, meaning it now
only needs to heap allocate requests when they
enter the cache, or if it's building the (disabled
by default) dependency graph.
This improves performance in the common case where the result has
already been cached, because the cycle check constructs an
AnyRequest existential, which is expensive.
By convention, most structs and classes in the Swift compiler include a `dump()` method which prints debugging information. This method is meant to be called only from the debugger, but this means they’re often unused and may be eliminated from optimized binaries. On the other hand, some parts of the compiler call `dump()` methods directly despite them being intended as a pure debugging aid. clang supports attributes which can be used to avoid these problems, but they’re used very inconsistently across the compiler.
This commit adds `SWIFT_DEBUG_DUMP` and `SWIFT_DEBUG_DUMPER(<name>(<params>))` macros to declare `dump()` methods with the appropriate set of attributes and adopts this macro throughout the frontend. It does not pervasively adopt this macro in SILGen, SILOptimizer, or IRGen; these components use `dump()` methods in a different way where they’re frequently called from debugging code. Nor does it adopt it in runtime components like swiftRuntime and swiftReflection, because I’m a bit worried about size.
Despite the large number of files and lines affected, this change is NFC.
Use the Dependencies map to standardize on a single allocation for
any particular Request. We could probably go all the way to unique
ownership here, but I didn't want to rock the boat.
Results in minor speedups compiling the stdlib. (I was experimenting
with using a new, auto-cached Request for a hot lookup path, which may
or may not turn out to be worth it, but even without that we can take
this.)
This patch removes the need for Request objects to provide a default
cycle-breaking value, instead opting to return llvm::Expected so clients
must handle a cycle failure explicitly.
Currently, all clients do the 'default' behavior, but this opens the
possibility for future requests to handle failures explicitly.
The bundling of the form of a request (e.g., the storage that makes up a request)
with the function that evaluates the request value requires us to perform
ad hoc indirection to address the AST —> Sema layering violation. For
example, ClassDecl::getSuperclass() calls through the LazyResolver (when
available) to form the appropriate request. This means that we cannot
use the the request-evaluator’s cache when LazyResolver is null, forcing
all cached state into the AST.
Provide the evaluator with a zone-based registration system, where each
request “zone” (e.g., the type checker’s requests) registers
callbacks to evaluate each kind of request within that zone. The
evaluator indirects through this table of function pointers, allowing
the request classes themselves to be available at a lower level (AST)
than the functions that perform the computation when the value isn’t
in the cache (e.g., Sema).
We are not taking advantage of the indirection yet; that’ll come in a
follow-up commit.
When dumping dependencies, clean up the output in two ways:
* Elide redundant but (non-cyclic) references to the same dependency.
* When dumping a cycle, highlight the path to the cycle so it stands out.
As a debugging aid, introduce a new frontend flag `-debug-cycles` that
will emit a debug dump whenever the request-evaluator encounters a cyclic
dependency, while otherwise allowing compilation to continue.
The type checker has *lots* cycles, and producing diagnostics for them
at this point in the development of the request-evaluator is not
productive because it breaks currently-working code. Disable cycle
diagnostics for now when using the request-evaluator in the type
checker. We'll enable it later as things improve, or as a separate
logging mode in the interim.