Adds a hook so implementations of memory reader can add logic to
resolving remote addresses. This is needed because of an interaction
between LLDB, which tries to read memory from files instead of process
memory whenever possible and the DYLD shared cache.
The shared cache will merge pointers in the GOT sections from multiple
images into one location, and update the relative offsets to point to
the new location.
LLDB, will have initially read the offset pointing to the "old"
location, which will be zeroed out in live memory. This gives LLDB the
opportunity to re-read the relative offset, but from live memory, so it
can return the right pointer in the shared cache.
(cherry picked from commit 0f4a3ceb67a3669cd5433d5ae13fec6852876173)
rdar://163652093
Allow subclasses to override the behavior of readRemoteAddress. LLDB in
particular needs this to correctly set the address space of the result
remote address.
rdar://148361743
(cherry picked from commit 45d6d45623)
Add an extra opaque field to AddressSpace, which can be used by clients
of RemoteInspection to distinguish between different address spaces.
LLDB employs an optimization where it reads memory from files instead of
the running process whenever it can to speed up memory reads (these can
be slow when debugging something over a network). To do this, it needs
to keep track whether an address originated from a process or a file. It
currently distinguishes addresses by setting an unused high bit on the
address, but because of pointer authentication this is not a reliable
solution. In order to keep this optimization working, this patch adds an
extra opaque AddressSpace field to RemoteAddress, which LLDB can use on
its own implementation of MemoryReader to distinguish between addresses.
This patch is NFC for the other RemoteInspection clients, as it adds
extra information to RemoteAddress, which is entirely optional and if
unused should not change the behavior of the library.
Although this patch is quite big the changes are largely mechanical,
replacing threading StoredPointer with RemoteAddress.
rdar://148361743
(cherry picked from commit 58df5534d2)
(cherry picked from commit 8f3862b5e7)
This patch changes RemoteAbsolutePointer to store both the symbol and
the resolved address. This allows us to retire some ugly workarounds
to deal with non-symbolic addresses and it fixes code paths that would
need these workarounds, but haven't implemented them yet (i.e., the
pack shape handling in the symbolicReferenceResolver in MetadatyaReader.
Addresses parts of rdar://146273066.
rdar://153687085
(cherry picked from commit 9381a54c67)
(cherry picked from commit a6eafcb311)
them to hex strings when creating anonymous context descriptors. This
aims to solve a problem when LLDB reads reflection metadata directly
from local binary files instead of downloading them from in-process
memory.
LLDB's MemoryReader implements this to convert the file address into
an in-process address, so mangled names created from instance metadata
can be matched with the field info data read from the local files.
rdar://152743797
(cherry picked from commit 540df142d7)
Our Bazel builds have become more strict about libc++
dependencies recently, so these are required to pick up
declarations of `malloc` and `uint32_t`, respectively.
Reformatting everything now that we have `llvm` namespaces. I've
separated this from the main commit to help manage merge-conflicts and
for making it a bit easier to read the mega-patch.
This is phase-1 of switching from llvm::Optional to std::optional in the
next rebranch. llvm::Optional was removed from upstream LLVM, so we need
to migrate off rather soon. On Darwin, std::optional, and llvm::Optional
have the same layout, so we don't need to be as concerned about ABI
beyond the name mangling. `llvm::Optional` is only returned from one
function in
```
getStandardTypeSubst(StringRef TypeName,
bool allowConcurrencyManglings);
```
It's the return value, so it should not impact the mangling of the
function, and the layout is the same as `std::optional`, so it should be
mostly okay. This function doesn't appear to have users, and the ABI was
already broken 2 years ago for concurrency and no one seemed to notice
so this should be "okay".
I'm doing the migration incrementally so that folks working on main can
cherry-pick back to the release/5.9 branch. Once 5.9 is done and locked
away, then we can go through and finish the replacement. Since `None`
and `Optional` show up in contexts where they are not `llvm::None` and
`llvm::Optional`, I'm preparing the work now by going through and
removing the namespace unwrapping and making the `llvm` namespace
explicit. This should make it fairly mechanical to go through and
replace llvm::Optional with std::optional, and llvm::None with
std::nullopt. It's also a change that can be brought onto the
release/5.9 with minimal impact. This should be an NFC change.
Until recently, `MemoryReader` had a single function `resovlePointer` which did two things, and has a somewhat vague name. The two things were:
1. Tool-specific mapping between real addresses and tagged addresses (first implemented in `swift-reflection-dump` and then later in lldb)
2. Finding a "symbol" for a given address
Recently, `resolvePointerAsSymbol` was added, which overloaded the term "resolve" and it added another way to deal with symbols for addresses. Symbols themselves were a bit muddled, as `swift-reflection-dump` was dealing with dynamic symbols aka bindings, while lldb was dealing in regular (static) symbols.
This change separates these two parts of functionality, and also divides symbol lookup into two cases. The API surface will now be:
1. `resolvePointer` for mapping/tagging addresses
3. `getSymbol` for looking up a symbol for an address
4. `getDynamicSymbol` for looking up a dyld binding for an address
Note: each of these names could be improved. Some alternative terms: `lookup` instead of `get`, `Binding` or `BindingName` instead of `DynamicSymbol`. Maybe even another term instead of "resolve". Suggestions welcome!
Currently, `swift-reflection-dump` supports `getDynamicSymbol` but not `getSymbol`. For lldb it's the reverse, `getSymbol` is supported but `getDynamicSymbol` needs to be implemented.
For everything but lldb, this change is NFC. For lldb it fixes a bug where `LLDBMemoryReader` returns regular symbols where we should instead be returning dynamic symbols.
Add a few includes of Optional.h and Hashing.h. These files are failing
ot build in the "next" branch due to changes in llvm's own includes, but
it's general goodness to include them on main as well.
Most of the new inspection logic is in Remote Mirror. New code in swift-inspect calls the new Remote Mirror functions and formats the resulting information for display.
Specific Remote Mirror changes:
* Add a call to check if a given metadata is an actor.
* Add calls to get information about actors and tasks.
* Add a `readObj` call to MemoryReader that combines the read and the cast, greatly simplifying code chasing pointers in the remote process.
* Add a generalized facility to the C shims that can allocate a temporary object that remains valid until at least the next call, which is used to return various temporary arrays from the new calls. Remove the existing `lastString` and `lastChunks` member variables in favor of this new facility.
Swift-inspect changes:
* Add a new dump-concurrency command.
* Add a new `ConcurrencyDumper.swift` file with the implementation. The dumper needs to do some additional work with the results from Remote Mirror to build up the task tree and this keeps it all organized.
* Extend `Inspector` to query the target's threads and fetch each thread's current task.
Concurrency runtime changes:
* Add `_swift_concurrency_debug` variables pointing to the various future adapters. Remote Mirror uses these to provide a better view of a tasks's resume pointer.
rdar://85231338
This new function allows subclasses to attempt to resolve a pointer to
a symbol before any memory is read, which can be potentially expensive
(for example, in LLDB).
These calls can fail when passed absurd sizes, which can happen when we try to read data that's corrupt or doesn't contain what we think it should. Fail gracefully instead of crashing.
rdar://78210820
swift::reflection::TypeInfo for (Clang-)imported non-Objective-C types. This is
needed to reflect on the size mixed Swift / Clang types, when no type metadata
is available for the C types.
This is a necessary ingredient for the TypeRef-based Swift context in
LLDB. Because we do not have reflection metadata for pure C types in Swift,
reflection cannot compute TypeInfo for NominalTypeRefs for those types. By
providing this callback, LLDB can supply this information for DWARF, and
reflection can compute TypeInfos for mixed Swift/C types.
This code rearchitects and simplifies the projectEnumValue support by
introducing a new `TypeInfo` subclass for each kind of enum, including trivial,
no-payload, single-payload, and three different classes for multi-payload enums:
* "UnsupportedEnum" that we don't understand. This returns "don't know" answers for all requests in cases where the runtime lacks enough information to accurately handle a particular enum.
* MP Enums that only use a separate tag value. This includes generic enums and other dynamic layouts, as well as enums whose payloads have no spare bits.
* MP Enums that use spare bits, possibly in addition to a separate tag. This logic can only be used, of course, if we can in fact compute a spare bit mask that agrees with the compiler.
The final challenge is to choose one of the above three handlings for every MPE. Currently, we do not have an accurate source of information for the spare bit mask, so we never choose the third option above. We use the second option for dynamic MPE layouts (including generics) and the first for everything else.
TODO: Once we can arrange for the compiler to expose spare bit mask data, we'll be able to use that to drive more MPE cases.
* First part of multi-payload enum support
This handles multi-payload enums with fixed
layouts that don't use spare payload bits.
It includes XI calculations that allow us to
handle single-payload enums where the payload
ultimately includes a multi-payload enum
(For example, on 32-bit platforms, String uses
a multi-payload enum, so this now supports single-payload
enums carrying Strings.)
Teach RemoteMirror how to project enum values
This adds two new functions to the SwiftRemoteMirror
facility that support inspecting enum values.
Currently, these support non-payload enums and
single-payload enums, including nested enums and
payloads with struct, tuple, and reference payloads.
In particular, it handles nested `Optional` types.
TODO: Multi-payload enums use different strategies for
encoding the cases that aren't yet supported by this
code.
Note: This relies on information from dataLayoutQuery
to correctly decode invalid pointer values that are used
to encode enums. Existing clients will need to augment
their DLQ functions before using these new APIs.
Resolves rdar://59961527
```
/// Projects the value of an enum.
///
/// Takes the address and typeref for an enum and determines the
/// index of the currently-selected case within the enum.
///
/// Returns true iff the enum case could be successfully determined.
/// In particular, note that this code may fail for valid in-memory data
/// if the compiler is using a strategy we do not yet understand.
SWIFT_REMOTE_MIRROR_LINKAGE
int swift_reflection_projectEnumValue(SwiftReflectionContextRef ContextRef,
swift_addr_t EnumAddress,
swift_typeref_t EnumTypeRef,
uint64_t *CaseIndex);
/// Finds information about a particular enum case.
///
/// Given an enum typeref and index of a case, returns:
/// * Typeref of the associated payload or zero if there is no payload
/// * Name of the case if known.
///
/// The Name points to a freshly-allocated C string on the heap. You
/// are responsible for freeing the string (via `free()`) when you are finished.
SWIFT_REMOTE_MIRROR_LINKAGE
int swift_reflection_getEnumCaseTypeRef(SwiftReflectionContextRef ContextRef,
swift_typeref_t EnumTypeRef,
unsigned CaseIndex,
char **CaseName,
swift_typeref_t *PayloadTypeRef);
```
Co-authored-by: Mike Ash <mikeash@apple.com>
Pointer data in some remote reflection targets may required relocation, or may not be
fully resolvable, such as when we're dumping info from a single image on disk that
references other dynamic libraries. Add a `RemoteAbsolutePointer` type that can hold a
symbol, offset, or combination of both, and add APIs to `MemoryReader` and `MetadataReader`
for reading pointers that can get unresolved relocation info from an image, or apply
relocations to pointer information. MetadataReader can use the symbol name information to
fill in demanglings of symbolic-reference-bearing mangled names by using the information
from the symbol name to fill in the name even though the context descriptors are not
available.
For now, this is NFC (MemoryReader::resolvePointer just forwards the pointer data), but
lays the groundwork for implementation of relocation in ObjectMemoryReader.
* Change the RemoteMirror API to have extensible data layout callback
* Use DLQ_Get prefix on DataLayoutQueryType enum values
* Simplify MemoryReaderImpl and synthesize minimalDataLayoutQueryFunction