Use the generic type lowering algorithm described in
"docs/CallingConvention.rst#physical-lowering" to map from IRGen's explosion
type to the type expected by the ABI.
Change IRGen to use the Swift calling convention (swiftcc) for native Swift
functions.
Use the 'swiftself' attribute on self parameters and closure contexts.
Use the 'swifterror' attribute on Swift error parameters.
Change functions in the runtime that are called as native Swift functions to use
the Swift calling convention.
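As a rough illustration (not actual runtime code; the function, type, and
macro names below are made up for this sketch), a C/C++ runtime function can
opt into this convention with clang's swiftcall attributes, so that the
context is passed in the swiftself register and the error slot in the
swifterror register:

    // Requires a clang/target that supports the Swift calling convention.
    #define EXAMPLE_SWIFTCALL     __attribute__((swiftcall))
    #define EXAMPLE_CONTEXT       __attribute__((swift_context))
    #define EXAMPLE_ERROR_RESULT  __attribute__((swift_error_result))

    struct ExampleError;   // stand-in for the runtime's error object

    // 'context' plays the role of a swiftself parameter; 'error' is passed
    // and returned in the swifterror register and must follow the context.
    extern "C" EXAMPLE_SWIFTCALL
    long example_runtime_entry(long argument,
                               EXAMPLE_CONTEXT void *context,
                               EXAMPLE_ERROR_RESULT ExampleError **error) {
      (void)context;
      *error = nullptr;    // report success by leaving the error slot clear
      return argument + 1;
    }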
rdar://19978563
This seems to more than fix a performance regression that we
detected on a metadata-allocation microbenchmark.
A few months ago, I improved the metadata cache representation
and changed the metadata allocation scheme to primarily use malloc.
Previously, we'd been using malloc in the concurrent tree data
structure but a per-cache slab allocator for the metadata itself.
At the time, I was concerned about the overhead of per-cache
allocators, since many metadata patterns see only a small number
of instantiations. That's still an important factor, so in the
new scheme we're using a global allocator; but instead of using
malloc for individual allocations, we're using a slab allocator,
which should have better peak, single-thread performance, at the
cost of not easily supporting deallocation. Deallocation is
only used for metadata when there's contention on the cache, and
specifically only when there's contention for the same key, so
leaking a little isn't the worst thing in the world.
The initial slab is a 64K globally-allocated buffer.
Successive slabs are 16K and allocated with malloc.
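A condensed sketch of that strategy (illustrative only, not the runtime's
actual allocator; locking and failure handling are simplified):

    #include <cstddef>
    #include <cstdint>
    #include <cstdlib>
    #include <mutex>

    namespace {

    constexpr size_t InitialSlabSize = 64 * 1024;  // statically-reserved first slab
    constexpr size_t SlabSize = 16 * 1024;         // each later slab comes from malloc

    alignas(std::max_align_t) char InitialSlab[InitialSlabSize];

    class SlabAllocator {
      std::mutex Lock;     // simplified; the real allocator's synchronization differs
      char *Next = InitialSlab;
      size_t Remaining = InitialSlabSize;

    public:
      // Bump-allocates out of the current slab; never frees, so a little
      // memory is leaked when an allocation is abandoned.
      void *allocate(size_t size, size_t align = alignof(std::max_align_t)) {
        std::lock_guard<std::mutex> guard(Lock);
        for (;;) {
          uintptr_t base = reinterpret_cast<uintptr_t>(Next);
          uintptr_t aligned = (base + align - 1) & ~uintptr_t(align - 1);
          size_t needed = (aligned - base) + size;
          if (needed <= Remaining) {
            Next = reinterpret_cast<char *>(aligned + size);
            Remaining -= needed;
            return reinterpret_cast<void *>(aligned);
          }
          // Current slab is exhausted: grab a fresh 16K slab from malloc and
          // abandon the remainder of the old one. Assumes size << SlabSize.
          Next = static_cast<char *>(std::malloc(SlabSize));
          if (!Next) return nullptr;
          Remaining = SlabSize;
        }
      }
    };

    } // namespace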
rdar://28189496
Fixes for the differences between Cygwin and the other Windows variants
(MSVC, Itanium, MinGW).
- The platform name used when searching for the standard libraries is
  renamed from "windows" to "cygwin".
- DLL storage classes (DllExport/DllImport) do not need to be considered
  for Cygwin and MinGW; linking works in these environments without them.
- Cygwin should use the large memory model by default. (This may change
  if someone ports to 32-bit.)
- Cygwin and MinGW should use the autolink feature in the same way as
  Linux, due to a limitation of the linker.
For a value of an opaque generic type (`<T> x: T`), the language currently defines `type(of: x)` and `T.self` as both producing a value of type `T.Type`. Substituting an existential type for `T` (`T == P`) turns `T.Type` into `P.Protocol`, so when `T` is bound to an existential and `x` holds an existential value, `type(of: x)` can only give the concrete protocol metatype `P.self`. The optimizer understood this rule, but the runtime did not, causing SR-3304.
Swift uses rt_swift_* functions to call the Swift runtime without using dyld's stubs. These functions are renamed to swift_rt_* to reduce namespace pollution.
rdar://28706212
Fold MetadataCache's allocator into the shared metadata slab allocator.
The major functional change here is that MetadataCache will now use
the slab allocator for tree nodes, but I also switched the Hashable
conformances cache to use ConcurrentMap directly instead of a
Lazy<ConcurrentMap<>>.
Previously, these were all using MetadataCache. MetadataCache is a
more heavyweight structure which acquires a lock before building the
metadata. This is appropriate if building the metadata is very
expensive or might have semantic side-effects which cannot be rolled
back. It's also useful when there's a risk of re-entrancy, since it
can diagnose such cases instead of simply deadlocking or infinitely
recursing. However, none of that is necessary for structural cases like
tuple and function types; for those we can just use ConcurrentMap, which
does a compare-and-swap to publish the constructed metadata and
potentially destroys it if another thread won the race.
This is an optimization which we could not previously attempt.
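A condensed sketch of that publication pattern (illustrative only; the real
ConcurrentMap is a lock-free tree keyed by the type's components, which is
elided here in favor of a single slot):

    #include <atomic>

    struct Metadata { /* constructed type metadata */ };

    std::atomic<Metadata *> Slot{nullptr};

    Metadata *getOrBuildMetadata() {
      if (Metadata *existing = Slot.load(std::memory_order_acquire))
        return existing;

      // Build speculatively, with no lock held; this is only safe because
      // construction is cheap and has no irreversible side effects.
      Metadata *fresh = new Metadata();

      Metadata *expected = nullptr;
      if (Slot.compare_exchange_strong(expected, fresh,
                                       std::memory_order_acq_rel,
                                       std::memory_order_acquire))
        return fresh;     // we won the race and published our copy

      delete fresh;       // another thread won; discard ours
      return expected;    // compare_exchange filled in the winner's pointer
    }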
As part of this, fix tuple metadata uniquing to consider the label
string correctly. This exposes a bug where the runtime demangling
of tuple metadata nodes doesn't preserve labels; fix this as well.
When getting a mirror child that is a class existential, there
may be witness tables for the protocol composition to copy. Don't
just take the address of a class instance pointer from the stack;
make a temporary existential-like buffer before calling into the Mirror
constructor.
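For context (layout and names here are a local illustration, not the
runtime's actual declarations), a class-bounded existential for a
composition like 'A & B' is a class reference followed by one witness
table pointer per protocol, and all of those words have to travel together
when the value is handed to the Mirror machinery:

    // Hypothetical stand-in for a two-protocol class existential.
    struct ExampleClassExistential {
      void *instance;                // the retained class reference
      const void *witnessTables[2];  // one table per protocol in the composition
    };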
This now correctly covers reflecting weak optional class types, and weak
optional class existential types, along with fixing a stack buffer
overflow reported by the Address Sanitizer (thanks, ASan!).
Tests were also updated to check for the validity of the child's data.
rdar://problem/27348445
This makes it a bit easier to diagnose unexpected boxing problems in the debugger, by allowing `po [value _swiftTypeName]` to work, instead of forcing users to know how to call `swift_getTypeName` from lldb themselves.
I apologize in advance to @jrose-apple, who is not a fan
of this fix ;-)
In unoptimized builds, the convenience initializers on
DispatchQueue allocate and immediately deallocate an
instance of OS_dispatch_queue prior to calling the
C function that returns the "real" instance.
This is because we don't have a way to write user-defined
factory initializers yet; convenience initializers still
have an 'initializing' entry point that takes an existing
instance, which we have no choice but to throw away.
Unfortunately, when we perform the fake allocation, we
look up class metadata by calling the wrong Swift runtime
function, causing a crash when we send +allocWithZone:.
Fix this so that the metadata is accessed via a lookup
from the Objective-C runtime, instead of making a totally
fake 'foreign metadata' object -- it looks like there was
code for this already, it just wasn't used in all cases.
While getting metadata for a runtime-only class should be
rare, this feels like a real bug fix, to me.
Second, we would ultimately free the fake object by sending
-release; however, OS_dispatch_queue has an override of
-dealloc which doesn't like to be called with a completely
uninitialized instance.
Here, I'm going to drop all pretense of sanity. The patch
just changes IRGen to lower the dealloc_partial_ref instruction
as a call to the object_dispose() Objective-C runtime function
when the class in question is a runtime-only class. This
frees the object without running -dealloc, which *happens*
to work for OS_dispatch_queue.
Fixes <rdar://problem/27226313>.