Take advantage of the binary swiftdeps serialization utliities built during #32131. Add a new optional information block to swiftdeps files. For now, don't actually serialize swiftdeps information.
Frontends will use this information to determine whether to write incremental dependencies across modules into their swiftdeps files. We will then teach the driver to deserialize the data from this section and integrate it into its incremental decision making.
Serialize derivative function configurations per module.
`@differentiable` and `@derivative` attributes register derivatives for
`AbstractFunctionDecl`s for a particular "derivative function configuration":
parameter indices and dervative generic signature.
To find `@derivative` functions registered in other Swift modules, derivative
function configurations must be serialized per module. When configurations for
a `AbstractFunctionDecl` are requested, all configurations from imported
modules are deserialized. This module serialization technique has precedent: it
is used for protocol conformances (e.g. extension declarations for a nominal
type) and Obj-C members for a class type.
Add `AbstractFunctionDecl::getDerivativeFunctionConfigurations` entry point
for accessing derivative function configurations.
In the differentiation transform: use
`AbstractFunctionDecl::getDerivativeFunctionConfigurations` to implement
`findMinimalDerivativeConfiguration` for canonical derivative function
configuration lookup, replacing `getMinimalASTDifferentiableAttr`.
Resolves TF-1100.
Remove the option to switch off nested types tables. In a world where
re-entrant direct lookup will cause deserialization to fail (or worse),
disabling these tables will only lead to further instability in the
compiler.
As part of this, we have to change the type export rules to
prevent `@convention(c)` function types from being used in
exported interfaces if they aren't serializable. This is a
more conservative version of the original rule I had, which
was to import such function-pointer types as opaque pointers.
That rule would've completely prevented importing function-pointer
types defined in bridging headers and so simply doesn't work,
so we're left trying to catch the unsupportable cases
retroactively. This has the unfortunate consequence that we
can't necessarily serialize the internal state of the compiler,
but that was already true due to normal type uses of aggregate
types from bridging headers; if we can teach the compiler to
reliably serialize such types, we should be able to use the
same mechanisms for function types.
This PR doesn't flip the switch to use Clang function types
by default, so many of the clang-function-type-serialization
FIXMEs are still in place.
Structurally prevent a number of common anti-patterns involving generic
signatures by separating the interface into GenericSignature and the
implementation into GenericSignatureBase. In particular, this allows
the comparison operators to be deleted which forces callers to
canonicalize the signature or ask to compare pointers explicitly.
Now that GenericSignatures store their single unique GenericEnvironment,
we can remove similar logic from deserialization to preserve identity
of GenericEnvironments.
A generic environment is always serialized as a GenericSignature with
a lazily-recreated environment, though sometimes it has to include
extra info specifically for generic environments used by SIL. The code
that was doing this claimed a bit for disambiguating between the two,
shrinking the permitted size of a compiled module from 2^31 bits to
2^30. (The code isn't just needlessly complicated; GenericEnvironments
used to be serialized with more information.)
Rather than have two representations for GenericEnvironmentID, this
commit just drops it altogether in favor of referencing
GenericSignatures directly. This causes a negligible file size
shrinkage for swiftmodules in addition to eliminating the problematic
disambiguation bit.
For now, the Deserialization logic will continue to cache
GenericEnvironments that are used directly by Deserialization, but
really that should probably be done at the AST level. Then we can
simplify further to ModuleFile tracking a plain list of
GenericSignatures.
...by making it a tagged union of either a DeclID or a
LocalDeclContextID. This should lead to smaller module files and be
slightly more efficient to deserialize, and also means that every
AST entity kind is serialized in exactly one way, which allows for
the following commit's refactoring.
The AST block in a compiled module represents an object graph, which
is essentially serialized in four steps:
- An entity (such as a Decl or Type) is encountered, given an ID, and
added to a worklist.
- The next entity is popped from the worklist and its offset in the
output stream is recorded.
- During the course of writing that entity, more entities will be
referenced and added to the worklist.
- Once the entire worklist is drained, the offsets get written to a
table in the Index block.
The implementation of this was duplicated for each kind of entity in
the AST block; this commit factors that out into a reusable helper.
No intended high-level functionality change, but the order in which
Decls and Types get emitted might change a little now that they're not
in the same queue.
It's a pretty obscure feature (and one we wish we didn't need), but
sometimes API is initially exposed through one module in order to
build another one, and we want the canonical presented name to be
something else. Push this concept into Swift's AST properly so that
other parts of the compiler stop having to know that this is a
Clang-specific special case.
No functionality change in this commit; will be used in the next
commit.
Now that we don't store requirements in the GenericParamList, there's
no reason to use trailing records to list out the
GenericTypeParamDecls.
No functionality change.
A module compiled with `-enable-private-imports` allows other modules to
import private declarations if the importing source file uses an
``@_private(from: "SourceFile.swift") import statement.
rdar://29318654
* Introduce stored inlinable function bodies
* Remove serialization changes
* [InterfaceGen] Print inlinable function bodies
* Clean up a little bit and add test
* Undo changes to InlinableText
* Add serialization and deserialization for inlinable body text
* Allow parser to parse accessor bodies in interfaces
* Fix some tests
* Fix remaining tests
* Add tests for usableFromInline decls
* Add comments
* Clean up function body printing throughout
* Add tests for subscripts
* Remove comment about subscript inlinable text
* Address some comments
* Handle lack of @objc on Linux
The string-keyed tables don't actually need Identifier keys on the
serialization side -- no reason to persist these in the ASTContext's
string table either.
They're not actually stored separately in the module file, nor
deserialized separately, but this way we're not sticking a bunch of
strings in the /ASTContext's/ string table, which would persist after
serialization. (The ASTContext's string table is also implemented
using a BumpPtrAllocator-backed StringMap, so the performance
characteristics of this should be about the same.)
This was originally added to support "derived" top-level declarations,
which in practice was just '==' implementations for enums. Now that
those are members, we don't have the notion of derived top-level
declarations anymore, and neither is there anything that would
normally be a cross-reference that we want to force to be serialized
directly.
No functionality change (because this was unused).
Allow substitution maps to be serialized directly (via an ID), writing out
the replacement types and conformances as appropriate. This is a more
efficient form of serialization than the current SubstitutionList approach,
because it maintains uniqueness of substitution maps within a module file,
and is a step toward eliminating SubstitutionList entirely.
Rather than inlining generic signatures in a half dozen places throughout
the serialization format, serialize (uniqued) generic signatures with their
own GenericSignatureID. Update various layouts (generic function types,
SIL function types, generic environments, extension cross-references) to
use GenericSignatureID.
Shaves ~187k off the size of Swift.swiftmodule.
Somewhat oddly, DeclID seems not to like being a key in DenseMap
due to the -1 value from getEmptyKey(), I don't quite get whether
this is intentional but using a raw uint32_t seems to work fine.
Special DeclNames represent names that do not have an identifier in the
surface language. This implies serializing the information about whether
a name is special together with its identifier (if it is not special)
in both the module file and the swift lookup table.
With the introduction of special decl names, `Identifier getName()` on
`ValueDecl` will be removed and pushed down to nominal declarations
whose name is guaranteed not to be special. Prepare for this by calling
to `DeclBaseName getBaseName()` instead where appropriate.
In order to accomplish this, cross-module references to typealiases
are now banned except from within conformances and NameAliasTypes, the
latter of which records the canonical type to determine if the
typealias has changed. For conformances, we don't have a good way to
check if the typealias has changed without trying to map it into
context, but that's all right---the rest of the compiler can already
fall back to the canonical type.
Module files store all of the Objective-C method entrypoints in a
central table indexed by selector, then filter the results based on
the specific class being requested. Rather than storing the class as
a TypeID---which requires a bunch of deserialization---store its
mangled name. This allows us to deserialize less, and causes circular
deserialization in rdar://problem/31615640.
Previously looking up an extension would result in all extensions for
types with the same name (nested or not) being deserialized; this
could even bring in base types that had not been deserialized yet. Add
in a string to distinguish an extension's base type; in the top-level
case this is just a module name, but for nested types it's a full
mangled name.
This is a little heavier than I'd like it to be, since it means we
mangle names and then throw them away, and since it means there's a
whole bunch of extra string data in the module just for uniquely
identifying a declaration. But it's correct, and does less work than
before, and fixes a circularity issue with a nested type A.B.A that
apparently used to work.
https://bugs.swift.org/browse/SR-3915
SubstitutionList is going to be a more compact representation of
a SubstitutionMap, suitable for inline allocation inside another
object.
For now, it's just a typedef for ArrayRef<Substitution>.
There's a class of errors in Serialization called "circularity
issues", where declaration A in file A.swift depends on declaration B
in file B.swift, and B also depends on A. In some cases we can manage
to type-check each of these files individually due to the laziness of
'validateDecl', but then fail to merge the "partial modules" generated
from A.swift and B.swift to form a single swiftmodule for the library
(because deserialization is a little less lazy for some things). A
common case of this is when at least one of the declarations is
nested, in which case a lookup to find that declaration needs to load
all the members of the parent type. This gets even worse when the
nested type is defined in an extension.
This commit sidesteps that issue specifically for nested types by
creating a top-level, per-file table of nested types in the "partial
modules". When a type is in the same module, we can then look it up
/without/ importing all other members of the parent type.
The long-term solution is to allow accessing any members of a type
without having to load them all, something we should support not just
for module-merging while building a single target but when reading
from imported modules as well. This should improve both compile time
and memory usage, though I'm not sure to what extent. (Unfortunately,
too many things still depend on the whole members list being loaded.)
Because this is a new code path, I put in a switch to turn it off:
frontend flag -disable-serialization-nested-type-lookup-table
https://bugs.swift.org/browse/SR-3707 (and possibly others)
Teach the serialization of SIL generic environments, which used to be
a trailing record following the SIL function definition, to use the
same uniqued "generic environment IDs" that are used for the AST
generic environments. Many of them overlap anyway, and SIL functions
tend to have AST generic environments anyway.
This approach guarantees that the AST + SIL deserialization provide
the same uniqueness of generic environments present prior to
serialization.
Serialize generic environments via a generic environment ID with a
separte offset table, so we have identity for the generic environments
and will share generic environments on deserialization.
When a pattern within a type context is serialized, serialize its
interface type (not its contextual type). When deserializing, record
the interface type and keep a side table of the associated
DeclContext, so that we can lazily map to the contextual type on first
access. This is designed to break recursion when we change the way
archetypes and generic environments are serialized.
An environment is always associated with a location with a signature, so
having them separate is pointless duplication. This patch also updates
the serialization to round-trip the signature data.
The witnesses in a NormalProtocolConformance have never been
completely serialized, because their substitutions involved a weird
mix of archetypes that blew up the deserialization code. So, only the
witness declarations themselves got serialized. Many clients (the type
checker, SourceKit, etc.) didn't need the extra information, but some
clients (e.g., the SIL optimizers) would end up recomputing this
information. Ick.
Now, serialize the complete Witness structure along with the AST,
including information about the synthetic environment, complete
substitutions, etc. This should obsolete some redundant code paths in
the SIL optimization infrastructure.
This (de-)serialization code takes a new-ish approach to serializing
the synthetic environment in that it avoids serializing any
archetypes. Rather, it maps everything back to interface types during
serialization, and deserialization forms a new generic environment
(with new archetypes!) on-the-fly, mapping deserialized types back
into that environment (and to those archetypes). This way, we don't
have to maintain identity of archetypes in the deserialization code,
and might get some better re-use of the archetypes.
More of rdar://problem/24079818.