...and explicitly mark symbols we export, either for use by executables or for runtime-stdlib interaction. Until the stdlib supports resilience we have to allow programs to link to these SPI symbols.
* Switch to calling `putchar_unlocked()` instead of `putchar()` for
actual printing. We're already locking stdout with `flockfile()`, so
there's no need for the redundant lock that `putchar()` uses.
* Add an explicit lock to the output stream in `dump()`. This means the
entire dump is printed with the lock held, which will prevent the
output of `dump()` from mixing with prints on other threads.
* Use `_debugPrint_unlocked()` instead of `debugPrint()` in
`_adHocPrint()`. The output stream is already locked while this
function is executing. Rename the function to `_adHocPrint_unlocked()`
to make this explicit.
* Use `targetStream.write()` and `_print_unlocked()` instead of
`print()` in `_dumpObject()`. This removes the redundant locking, and
also eliminates the creation of intermediate strings. Rename the
function to `_dumpObject_unlocked()` to make this explicit.
* Use `targetStream.write()`, `_print_unlocked()`, and
`_debugPrint_unlocked()` in `_dumpSuperclass()`. This removes the
redundant locking, and also eliminates the creation of intermediate
strings. Rename the function to `_dumpSuperclass_unlocked()` to make
this explicit.
* Use `_debugPrint_unlocked()` instead of `debugPrint()` in
`String.init(reflecting:)`. This shouldn't really make much of a
difference but it matches the usage of `_print_unlocked()` in
`String.init(_:)`.
The net result is that all printing is still covered under locks like
before, but stdout is never recursively locked. This should result in
slightly faster printing. In addition, `dump()` is now covered under a
single lock so it can't mix its output with prints from other threads.