Switch MetadataCache to use a global slab allocator.

This seems to more than fix a performance regression that we
detected on a metadata-allocation microbenchmark.

A few months ago, I improved the metadata cache representation
and changed the metadata allocation scheme to primarily use malloc.
Previously, we'd been using malloc in the concurrent tree data
structure but a per-cache slab allocator for the metadata itself.
At the time, I was concerned about the overhead of per-cache
allocators, since many metadata patterns see only a small number
of instantiations.  That's still an important factor, so in the
new scheme we're using a global allocator; but instead of using
malloc for individual allocations, we're using a slab allocator,
which should have better peak, single-thread performance, at the
cost of not easily supporting deallocation.  Deallocation is
only used for metadata when there's contention on the cache, and
specifically only when there's contention for the same key, so
leaking a little isn't the worst thing in the world.
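
To make that concrete, here is a minimal, self-contained sketch (toy names and a single-slot "cache", not the runtime's actual cache or allocator code) of the only path that deallocates metadata: two threads race to publish an entry for the same key, and the loser hands its freshly built entry back to the allocator.

// Illustrative sketch only; malloc/free stand in for the slab allocator and
// a single atomic slot stands in for one key in the concurrent cache.
#include <atomic>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <thread>

struct Entry { int payload; };

static void *allocate(std::size_t size) { return std::malloc(size); }
static void deallocate(void *ptr, std::size_t) { std::free(ptr); }

// One cache slot, standing in for one key in the metadata cache.
static std::atomic<Entry *> slot{nullptr};

static Entry *getOrBuild() {
  auto *candidate = static_cast<Entry *>(allocate(sizeof(Entry)));
  candidate->payload = 42;

  Entry *expected = nullptr;
  if (slot.compare_exchange_strong(expected, candidate))
    return candidate;  // We won the race; our entry is published.

  // We lost: another thread published an entry for this key first.  This is
  // the only point where metadata is handed back to the allocator, and in
  // the real scheme the bytes may simply be leaked if they are no longer on
  // top of the current slab.
  deallocate(candidate, sizeof(Entry));
  return expected;
}

int main() {
  std::thread t1([] { getOrBuild(); });
  std::thread t2([] { getOrBuild(); });
  t1.join();
  t2.join();
  std::printf("published payload: %d\n", slot.load()->payload);
  return 0;
}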

The initial slab is a 64K globally-allocated buffer.
Successive slabs are 16K and allocated with malloc.

rdar://28189496
John McCall
2017-01-27 17:52:24 -05:00
parent 400f3381cf
commit ccbe5fcf73
7 changed files with 117 additions and 8 deletions

@@ -2809,3 +2809,99 @@ swift::swift_getGenericWitnessTable(GenericWitnessTable *genericTable,
}
uint64_t swift::RelativeDirectPointerNullPtr = 0;
/***************************************************************************/
/*** Allocator implementation **********************************************/
/***************************************************************************/
namespace {
  struct PoolRange {
    static constexpr uintptr_t PageSize = 16 * 1024;
    static constexpr uintptr_t MaxPoolAllocationSize = PageSize / 2;

    /// The start of the allocation.
    char *Begin;

    /// The number of bytes remaining.
    size_t Remaining;
  };
}

// A statically-allocated pool.  It's zero-initialized, so this
// doesn't cost us anything in binary size.
LLVM_ALIGNAS(alignof(void*)) static char InitialAllocationPool[64*1024];
static std::atomic<PoolRange>
AllocationPool{PoolRange{InitialAllocationPool,
                         sizeof(InitialAllocationPool)}};
void *MetadataAllocator::Allocate(size_t size, size_t alignment) {
  assert(alignment <= alignof(void*));
  assert(size % alignof(void*) == 0);

  // If the size is larger than the maximum, just use malloc.
  if (size > PoolRange::MaxPoolAllocationSize)
    return malloc(size);

  // Allocate out of the pool.
  PoolRange curState = AllocationPool.load(std::memory_order_relaxed);
  while (true) {
    char *allocation;
    PoolRange newState;
    bool allocatedNewPage;

    // Try to allocate out of the current page.
    if (size <= curState.Remaining) {
      allocatedNewPage = false;
      allocation = curState.Begin;
      newState = PoolRange{curState.Begin + size, curState.Remaining - size};
    } else {
      allocatedNewPage = true;
      allocation = new char[PoolRange::PageSize];
      newState = PoolRange{allocation + size, PoolRange::PageSize - size};
      __asan_poison_memory_region(allocation, PoolRange::PageSize);
    }

    // Swap in the new state.
    if (std::atomic_compare_exchange_weak_explicit(&AllocationPool,
                                                   &curState, newState,
                                                   std::memory_order_relaxed,
                                                   std::memory_order_relaxed)) {
      // If that succeeded, we've successfully allocated.
      __msan_allocated_memory(allocation, size);
      __asan_unpoison_memory_region(allocation, size);
      return allocation;
    }

    // If it failed, go back to a neutral state and try again.
    if (allocatedNewPage) {
      delete[] allocation;
    }
  }
}
void MetadataAllocator::Deallocate(const void *allocation, size_t size) {
  __asan_poison_memory_region(allocation, size);

  if (size > PoolRange::MaxPoolAllocationSize) {
    free(const_cast<void*>(allocation));
    return;
  }

  // Check whether the allocation pool is still in the state it was in
  // immediately after the given allocation.
  PoolRange curState = AllocationPool.load(std::memory_order_relaxed);
  if (reinterpret_cast<const char*>(allocation) + size != curState.Begin) {
    return;
  }

  // Try to swap back to the pre-allocation state.  If this fails,
  // don't bother trying again; we'll just leak the allocation.
  PoolRange newState = { reinterpret_cast<char*>(const_cast<void*>(allocation)),
                         curState.Remaining + size };
  (void)
    std::atomic_compare_exchange_strong_explicit(&AllocationPool,
                                                 &curState, newState,
                                                 std::memory_order_relaxed,
                                                 std::memory_order_relaxed);
}
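
A quick illustration of how the two entry points compose (the allocator instance and the sizes here are made up for the example; the runtime's real call sites differ): Deallocate only rewinds the pool when the freed block is still the most recent allocation, otherwise the bytes are simply leaked.

// Illustrative only: assumes a MetadataAllocator instance named `allocator`.
void example(MetadataAllocator &allocator) {
  void *a = allocator.Allocate(64, alignof(void*));
  // `a` is still the top of the pool, so Deallocate's pointer check passes
  // and the CAS rewinds Begin/Remaining; the 64 bytes can be reused.
  allocator.Deallocate(a, 64);

  void *b = allocator.Allocate(64, alignof(void*));
  void *c = allocator.Allocate(64, alignof(void*));
  // `b` is no longer the most recent allocation, so the pointer check fails
  // and the block is intentionally leaked.
  allocator.Deallocate(b, 64);
  // `c` is still on top, so this call does rewind the pool.
  allocator.Deallocate(c, 64);
}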