A previous commit introduced the usage of buildContextManglingForSymbol
in buildContextDescriptorMangling, which was not quite correct, since
type nodes needed extra handling, which was done in the other version of
buildContextDescriptorMangling. This patch factors out the handling of
building context descriptors mangling from symbols, and updates both
call sites to use the new function instead.
rdar://165950673
If the MemoryReader can provide a symbol for the context descriptor, use
that instead of building the mangle tree from metadata. This shouldn't
hurt other clients, and would have a few benefits for LLDB:
- Small but numerous memory reads can be expensive for LLDB, building
the mangle tree from a string should be much faster.
- For types with private discriminators, this allows LLDB to provide the
same private discriminator that's in DWARF and in the swiftmodule (as
opposed to using the memory address where the context descriptor lives
at). This means that these private types would have the same mangled
name in every run. LLDB persistently caches field descriptors by name,
but since private discriminators aren't guaranteed to have the same
mangled name over multiple runs, LLDB would often need to invalidate
its entire cache and rebuild it. Using the stable UUID would solve
that issue.
rdar://165950673
A previous change introduced the "resolveRemoteAddress" API in the
MemoryReader interface, along with one use of it. However, the resolved
memory address is created, but left accidentally unused.
rdar://161919584
Take advantage of the ability to iterate over TrailingObjects when computing the size of certain kinds of type metadata. This avoids the occasional issue where a new trailing object is added to a type, the remote metadata reader isn't fully updated for it, and doesn't read enough data. This change fixes an issue with function type metadata where we didn't read the global actor field.
Not all type metadata is amenable to this, as some don't use TrailingObjects for their trailing data (e.g. various nominal types) and extended existentials need to dereference their Shape pointer to determine the number of TrailingObjects, which needs some additional code when done remotely. We are able to automatically calculate the sizes of Existential and Function.
rdar://162855053
Adds a hook so implementations of memory reader can add logic to
resolving remote addresses. This is needed because of an interaction
between LLDB, which tries to read memory from files instead of process
memory whenever possible and the DYLD shared cache.
The shared cache will merge pointers in the GOT sections from multiple
images into one location, and update the relative offsets to point to
the new location.
LLDB, will have initially read the offset pointing to the "old"
location, which will be zeroed out in live memory. This gives LLDB the
opportunity to re-read the relative offset, but from live memory, so it
can return the right pointer in the shared cache.
rdar://160837587
We scan the target's initial allocation pool, and all 16kB heap allocations. We check each pointer-aligned offset within those areas, and try to read it as Swift metadata and get a name from it. If that fails, quietly move on. It's very unlikely for some random memory to look enough like Swift metadata for this to produce a name, so this works very well to print the generic metadata instantiated in the remote process without requiring `SWIFT_DEBUG_ENABLE_METADATA_ALLOCATION_ITERATION`.
rdar://161120936
This patch adds a new SymbolicExtendedExistentialTypeRef kind, and
wires it up in TypeDecoder, TypeRefBuilder, TypeLowering, and
ASTDemangler.
This is tested indirectly via the matching LLDB commit.
Allow subclasses to override the behavior of readRemoteAddress. LLDB in
particular needs this to correctly set the address space of the result
remote address.
rdar://148361743
Add an extra opaque field to AddressSpace, which can be used by clients
of RemoteInspection to distinguish between different address spaces.
LLDB employs an optimization where it reads memory from files instead of
the running process whenever it can to speed up memory reads (these can
be slow when debugging something over a network). To do this, it needs
to keep track whether an address originated from a process or a file. It
currently distinguishes addresses by setting an unused high bit on the
address, but because of pointer authentication this is not a reliable
solution. In order to keep this optimization working, this patch adds an
extra opaque AddressSpace field to RemoteAddress, which LLDB can use on
its own implementation of MemoryReader to distinguish between addresses.
This patch is NFC for the other RemoteInspection clients, as it adds
extra information to RemoteAddress, which is entirely optional and if
unused should not change the behavior of the library.
Although this patch is quite big the changes are largely mechanical,
replacing threading StoredPointer with RemoteAddress.
rdar://148361743
We can end up querying the pointer size and similar things very frequently, which can be slow. These values don't change for any given reader, so call them once and cache the values locally instead.
rdar://150221008
This patch changes RemoteAbsolutePointer to store both the symbol and
the resolved address. This allows us to retire some ugly workarounds
to deal with non-symbolic addresses and it fixes code paths that would
need these workarounds, but haven't implemented them yet (i.e., the
pack shape handling in the symbolicReferenceResolver in MetadatyaReader.
Addresses parts of rdar://146273066.
rdar://153687085
them to hex strings when creating anonymous context descriptors. This
aims to solve a problem when LLDB reads reflection metadata directly
from local binary files instead of downloading them from in-process
memory.
LLDB's MemoryReader implements this to convert the file address into
an in-process address, so mangled names created from instance metadata
can be matched with the field info data read from the local files.
rdar://152743797
Determining a descriptor's size requires reading its contents, but reading its contents from out of process requires knowing its size.
Build up the size incrementally by walking over the TrailingObjects. Take advantage of the fact that each trailing object's presence/count depends only on data that comes before it. This allows us to read prefixes that we gradually expand until we've covered the whole thing.
Add calls to TrailingObjects to allow iterating over the prefix sizes, and modify readContextDescriptor to use them. This replaces the old code which attempted to determine the descriptor size in an ad-hoc fashion that didn't always get it right.
rdar://146006006
Our Bazel builds have become more strict about libc++
dependencies recently, so these are required to pick up
declarations of `malloc` and `uint32_t`, respectively.
When we hit the end of the hasResilientSuperclass chain, compute the bounds using the class descriptor's data instead of using the root class bounds and adjusting them.
rdar://134448718
Fix readMetadataBoundsOfSuperclass to recursively apply adjustForSubclass while going through the superclass chain.
Strip indirect descriptor pointers so we can deal with signed pointers here.
Make CMemoryReader assert (in asserts builds) that the addresses being read don't have any signature bits set, which helps track down places where we need to add stripping.
rdar://134448718
Introduce metadata and runtime support for describing conformances to
"suppressible" protocols such as `Copyable`. The metadata changes occur
in several different places:
* Context descriptors gain a flag bit to indicate when the type itself has
suppressed one or more suppressible protocols (e.g., it is `~Copyable`).
When the bit is set, the context will have a trailing
`SuppressibleProtocolSet`, a 16-bit bitfield that records one bit for
each suppressed protocol. Types with no suppressed conformances will
leave the bit unset (so the metadata is unchanged), and older runtimes
don't look at the bit, so they will ignore the extra data.
* Generic context descriptors gain a flag bit to indicate when the type
has conditional conformances to suppressible protocols. When set,
there will be trailing metadata containing another
`SuppressibleProtocolSet` (a subset of the one in the main context
descriptor) indicating which suppressible protocols have conditional
conformances, followed by the actual lists of generic requirements
for each of the conditional conformances. Again, if there are no
conditional conformances to suppressible protocols, the bit won't be
set. Old runtimes ignore the bit and any trailing metadata.
* Generic requirements get a new "kind", which provides an ignored
protocol set (another `SuppressibleProtocolSet`) stating which
suppressible protocols should *not* be checked for the subject type
of the generic requirement. For example, this encodes a requirement
like `T: ~Copyable`. These generic requirements can occur anywhere
that there is a generic requirement list, e.g., conditional
conformances and extended existentials. Older runtimes handle unknown
generic requirement kinds by stating that the requirement isn't
satisfied.
Extend the runtime to perform checking of the suppressible
conformances on generic arguments as part of checking generic
requirements. This checking follows the defaults of the language, which
is that every generic argument must conform to each of the suppressible
protocols unless there is an explicit generic requirement that states
which suppressible protocols to ignore. Thus, a generic parameter list
`<T, Y where T: ~Escapable>` will check that `T` is `Copyable` but
not that it is `Escapable`, and check that `U` is both `Copyable` and
`Escapable`. To implement this, we collect the ignored protocol sets
from these suppressed requirements while processing the generic
requirements, then check all of the generic arguments against any
conformances not suppressed.
Answering the actual question "does `X` conform to `Copyable`?" (for
any suppressible protocol) looks at the context descriptor metadata to
answer the question, e.g.,
1. If there is no "suppressed protocol set", then the type conforms.
This covers types that haven't suppressed any conformances, including
all types that predate noncopyable generics.
2. If the suppressed protocol set doesn't contain `Copyable`, then the
type conforms.
3. If the type is generic and has a conditional conformance to
`Copyable`, evaluate the generic requirements for that conditional
conformance to answer whether it conforms.
The procedure above handles the bits of a `SuppressibleProtocolSet`
opaquely, with no mapping down to specific protocols. Therefore, the
same implementation will work even with future suppressible protocols,
including back deployment.
The end result of this is that we can dynamically evaluate conditional
conformances to protocols that depend on conformances to suppressible
protocols.
Implements rdar://123466649.
LLVM is presumably moving towards `std::string_view` -
`StringRef::startswith` is deprecated on tip. `SmallString::startswith`
was just renamed there (maybe with some small deprecation inbetween, but
if so, we've missed it).
The `SmallString::startswith` references were moved to
`.str().starts_with()`, rather than adding the `starts_with` on
`stable/20230725` as we only had a few of them. Open to switching that
over if anyone feels strongly though.
We got crash reports from LLDB where protocolList is a nullptr when demangling a
symbolic reference. While we're also investigating the root cause of the issue,
the code in MetadataReader should also not just crash with such input.
rdar://122698966
Implement computeUnalignedFieldStartOffset as an alternative way to
finding out the start of the fields that belong to the instance when
reading the field's start from binary is not possible (for example,
embedded Swift doesn't emit any reflection metadata on the binary).
Not quite NFC because apparently the representation bleeds into what's
accepted in some situations where we're supposed to be warning about
conflicts and then making an arbitrary choice. But what we're doing
is nonsense, so we definitely need to break behavior here.
This is setting up for isolated(any) and isolated(caller). I tried
to keep that out of the patch as much as possible, though.
Create a version of the metadata specialization code which is abstracted so that it can work in different contexts, such as building specialized metadata from dylibs on disk rather than from inside a running process.
The GenericMetadataBuilder class is templatized on a ReaderWriter. The ReaderWriter abstracts out everything that's different between in-process and external construction of this data. Instead of reading and writing pointers directly, the builder calls the ReaderWriter to resolve and write pointers. The ReaderWriter also handles symbol lookups and looking up other Swift types by name.
This is accompanied by a simple implementation of the ReaderWriter which works in-process. The abstracted calls to resolve and write pointers are implemented using standard pointer dereferencing.
A new SWIFT_DEBUG_VALIDATE_EXTERNAL_GENERIC_METADATA_BUILDER environment variable uses the in-process ReaderWriter to validate the builder by running it in parallel with the existing metadata builder code in the runtime. When enabled, the GenericMetadataBuilder is used to build a second copy of metadata built by the runtime, and the two are compared to ensure that they match. When this environment variable is not set, the new builder code is inactive.
The builder is incomplete, and this initial version only works on structs. Any unsupported type produces an error, and skips the validation.
rdar://116592420
Extend function type metadata with an entry for the thrown error type,
so that thrown error types are represented at runtime as well. Note
that this required the introduction of "extended" function type
flags into function type metadata, because we would have used the last
bit. Do so, and define one extended flag bit as representing typed
throws.
Add `swift_getExtendedFunctionTypeMetadata` to the runtime to build
function types that have the extended flags and a thrown error type.
Teach IR generation to call this function to form the metadata, when
appropriate.
Introduce all of the runtime mangling/demangling support needed for
thrown error types.
Using symbolic references instead of a text based mangling avoids the
expensive type descriptor scan when objective c protocols are requested.
rdar://111536582
MetadataReader caches types only with the metadata address as the key.
However a type lookup can be requested skipping artificial subclasses or
not. This makes the cached results incorrect if two requests for the
same type, but skipping subclasses on one and not on the other, are made.
Fix this by adding a second dimension to the cache key.
rdar://101519300