Previously it was part of swiftBasic.
The demangler library does not depend on llvm (except some header-only utilities like StringRef). Putting it into its own library makes sure that no llvm stuff will be linked into clients which use the demangler library.
This change also contains other refactoring, like moving demangler code into different files. This makes it easier to remove the old demangler from the runtime library when we switch to the new symbol mangling.
Also in this commit: remove some unused API functions from the demangler Context.
fixes rdar://problem/30503344
The difference is that TransformArrayRef stores its function as an std::function
instead of using a template parameter. This is useful in situations where one
wants to define such a type in a header on forward-declared pointers. If one had
to define the function to be used as a template parameter, one would have to
define the function or provide a forward-declared version.
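A minimal sketch of the idea, not the actual TransformArrayRef: the class
name TransformView, the Node type, and the sumIds helper are hypothetical and
only illustrate how a type-erased std::function lets a header name the view
type over a forward-declared pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical stand-in for TransformArrayRef: a non-owning view over a
// contiguous array that applies a type-erased function on element access.
template <typename In, typename Out>
class TransformView {
  const In *Data;
  std::size_t Size;
  std::function<Out(const In &)> Fn;

public:
  TransformView(const In *Data, std::size_t Size,
                std::function<Out(const In &)> Fn)
      : Data(Data), Size(Size), Fn(std::move(Fn)) {}

  std::size_t size() const { return Size; }

  Out operator[](std::size_t I) const {
    assert(I < Size && "index out of range");
    return Fn(Data[I]);
  }
};

// Forward declaration only: naming the view type below needs neither the
// definition of Node nor the function that will be applied to it.
struct Node;
using NodeIdView = TransformView<Node *, int>;

// In a real project the rest would live in a .cpp file.
struct Node {
  int Id;
};

inline int sumIds(const NodeIdView &View) {
  int Sum = 0;
  for (std::size_t I = 0; I != View.size(); ++I)
    Sum += View[I];
  return Sum;
}
```

The cost of this design is an indirect call per element, which a template
parameter would avoid.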
Changes:
* Terminate all namespaces with the correct closing comment.
* Make sure argument names in comments match the corresponding parameter name.
* Remove redundant get() calls on smart pointers.
* Prefer using "override" or "final" instead of "virtual". Remove "virtual" where appropriate.
These are more instances of the problem with ArrayRef capturing a
reference to a temporary. The problem is exposed when compiling with a
recent version of clang. rdar://problem/28700044
C++ atomic's fetch_sub returns the previous value, whereas we want to
check the new value. This was causing massive memory leaks in SourceKit.
For ThreadSafeRefCountedBase, just switch to the one in LLVM that's
already correct. We should move the VPTR one to LLVM as well and then
we can get rid of this header.
rdar://problem/27358273
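A small sketch of the pitfall, using a hypothetical RefCount class rather
than the real ThreadSafeRefCountedBase: since fetch_sub yields the value from
before the subtraction, the last reference is detected when the new value,
Previous - 1, reaches zero.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal reference count; not the Swift or LLVM class.
class RefCount {
  std::atomic<int> Count{1};

public:
  void retain() { Count.fetch_add(1, std::memory_order_relaxed); }

  // Returns true when the caller dropped the last reference.
  bool release() {
    // fetch_sub returns the value held *before* the subtraction, so the
    // new value is Previous - 1.
    int Previous = Count.fetch_sub(1, std::memory_order_acq_rel);
    assert(Previous > 0 && "over-release");
    return Previous - 1 == 0;
  }
};
```

Comparing the returned value itself against zero would never report the last
release, so the object would never be destroyed.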
`unittests/Basic/BlotMapVectorTest.cpp` references `llvm::outs()`, which is declared in `llvm/Support/raw_ostream.h`. I believe this currently works without the include because the header is transitively included via some CMake incantations and conditionals in the googletest headers involving `GTEST_NO_LLVM_RAW_OSTREAM`. I'm not sure.
In any case, including the header is more explicit. The file uses `llvm::outs()`, so it should include the header that defines it.
We do this by doing a traversal of our sorted lists in a similar manner as one
would when one is merging two such sets, i.e. one has two iterators and always
advances the iterator that has a value that is less than the other. If we ever
hit a situation where the two iterators point at equal values, we must have a
non-empty intersection.
A unittest that exercises very basic functionality is provided as well.
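The traversal described above can be sketched as follows. This is an
illustration over std::vector with a hypothetical function name, not the code
added by the commit, and it assumes both inputs are sorted in ascending order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if two ascending-sorted sequences share at least one element.
// Hypothetical helper for illustration only.
template <typename T>
bool sortedRangesIntersect(const std::vector<T> &A, const std::vector<T> &B) {
  assert(std::is_sorted(A.begin(), A.end()) && "A must be sorted");
  assert(std::is_sorted(B.begin(), B.end()) && "B must be sorted");

  auto I = A.begin(), IE = A.end();
  auto J = B.begin(), JE = B.end();
  while (I != IE && J != JE) {
    if (*I < *J)
      ++I; // A's element is smaller: advance A.
    else if (*J < *I)
      ++J; // B's element is smaller: advance B.
    else
      return true; // Equal elements: the intersection is non-empty.
  }
  return false; // One sequence was exhausted without a match.
}
```

Each step advances one iterator, so the check is linear in the combined
length of the two sequences.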
This is an immutable data structure with the following properties:
1. All of the sets are sorted and can be iterated over.
2. It takes in a bump ptr allocator and uses that allocator for all
allocations.
3. All concatenation operations involve only one bump ptr allocation.
4. Since we are only storing pointers, the data structure does not need any
destructors to be invoked to be cleaned up. The bump ptr allocator memory just
needs to be freed.
I am going to use this to improve the compile time performance of ARC.
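A rough sketch of properties 1 to 3, under stated assumptions: ToyArena is a
hypothetical stand-in for a bump ptr allocator, SortedPtrSet and concat are
illustrative names, and dropping duplicates during concatenation is an
assumption rather than something the commit states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical toy arena standing in for a bump ptr allocator. It only
// counts allocations and frees everything at once when it is destroyed.
class ToyArena {
  std::vector<std::unique_ptr<const void *[]>> Slabs;

public:
  unsigned NumAllocations = 0;

  const void **allocate(std::size_t N) {
    ++NumAllocations;
    Slabs.push_back(std::make_unique<const void *[]>(N));
    return Slabs.back().get();
  }
};

// An immutable, sorted view of pointers owned by the arena. It holds no
// resources of its own, so no destructor needs to run.
struct SortedPtrSet {
  const void *const *Begin = nullptr;
  std::size_t Size = 0;

  const void *const *begin() const { return Begin; }
  const void *const *end() const { return Begin + Size; }
};

// Merges two sorted sets into a new one using a single arena allocation.
inline SortedPtrSet concat(ToyArena &Arena, SortedPtrSet A, SortedPtrSet B) {
  std::less<const void *> Less;
  assert(std::is_sorted(A.begin(), A.end(), Less) && "A must be sorted");
  assert(std::is_sorted(B.begin(), B.end(), Less) && "B must be sorted");

  // Allocate for the worst case (no shared elements) up front.
  const void **Out = Arena.allocate(A.Size + B.Size);
  const void **End =
      std::set_union(A.begin(), A.end(), B.begin(), B.end(), Out, Less);
  return {Out, static_cast<std::size_t>(End - Out)};
}
```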
The big differences here are that:
1. We no longer use the 4096 trick.
2. Now we store all indices inline so no mallocing is required and the
value is trivially copyable. We allow for much larger indices to be
stored inline which makes having an unrepresentable index a much smaller
issue. For instance on a 32 bit platform, in NewProjection, we are able
to represent an index of up to (1 << 26) - 1, which should be more than
enough to handle any interesting case.
3. We can now have up to 7 ptr cases and many more index cases (with each extra
bit needed to represent the index cases lowering the representable range of
indices).
The whole data structure is much simpler and easier to understand as a
bonus. A high level description of the ADT is as follows:
1. A PointerIntEnum for which bits [0, (num_tagged_bits(T*)-1)] are not all
set to 1 represent an enum with a pointer case. This means that one can have
at most ((1 << num_tagged_bits(T*)) - 2) enum cases associated with
pointers.
2. A PointerIntEnum for which bits [0, (num_tagged_bits(T*)-1)] are all set
is either an invalid PointerIntEnum or an index.
3. A PointerIntEnum with all bits set is an invalid PointerIntEnum.
4. A PointerIntEnum for which bits [0, (num_tagged_bits(T*)-1)] are all set
but for which the upper bits are not all set is an index enum. The case bits
for the index PointerIntEnum are stored in bits [num_tagged_bits(T*),
num_tagged_bits(T*) + num_index_case_bits]. Then the actual index is stored
in the remaining top bits. For the case in which this is used in swift
currently, we use 3 index case bits, so on a 32 bit system we have 26
bits for representing indices and can represent indices up to
67_108_862. Any index larger than that will result in an invalid
PointerIntEnum. On 64 bit we have many more bits than that.
By using this representation, we can make PointerIntEnum a true value type
that is trivially constructible and destructible without needing to malloc
memory.
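The index layout described above can be sketched on a plain 32 bit word. This
is an illustration only: the constants assume 3 tagged bits and 3 index case
bits as in the 32 bit example, and the function names are hypothetical, not
the PointerIntEnum API.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned NumTagBits = 3;       // assumed num_tagged_bits(T*)
constexpr unsigned NumIndexCaseBits = 3; // assumed number of index case bits
constexpr unsigned NumIndexBits = 32 - NumTagBits - NumIndexCaseBits;
static_assert(NumIndexBits == 26, "26 bits remain for the index");

constexpr std::uint32_t TagMask = (1u << NumTagBits) - 1;
constexpr std::uint32_t CaseMask = (1u << NumIndexCaseBits) - 1;
constexpr std::uint32_t Invalid = ~0u; // all bits set

// Packs an index case and an index into one word; the low tag bits are all
// set to mark the word as a non-pointer. Unrepresentable inputs yield Invalid.
inline std::uint32_t encodeIndex(std::uint32_t Case, std::uint32_t Index) {
  if (Case > CaseMask || (Index >> NumIndexBits) != 0)
    return Invalid;
  return (Index << (NumTagBits + NumIndexCaseBits)) | (Case << NumTagBits) |
         TagMask;
}

// A word is an index if its tag bits are all set but it is not all ones.
inline bool isIndex(std::uint32_t Bits) {
  return Bits != Invalid && (Bits & TagMask) == TagMask;
}

inline std::uint32_t indexCase(std::uint32_t Bits) {
  assert(isIndex(Bits) && "not an index word");
  return (Bits >> NumTagBits) & CaseMask;
}

inline std::uint32_t indexValue(std::uint32_t Bits) {
  assert(isIndex(Bits) && "not an index word");
  return Bits >> (NumTagBits + NumIndexCaseBits);
}
```

Because the whole value lives in one word, copying it is a plain integer copy.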
In order for all of this to work, the user of this needs to construct an
enum with the appropriate case structure that allows the data structure to
determine what cases are pointer and which are indices. For instance the one
used by Projection in swift is:
enum class NewProjectionKind : unsigned {
  // PointerProjectionKinds
  Upcast = 0,
  RefCast = 1,
  BitwiseCast = 2,
  FirstPointerKind = Upcast,
  LastPointerKind = BitwiseCast,

  // This needs to be set to ((1 << num_tagged_bits(T*)) - 1). It
  // represents the first NonPointerKind.
  FirstIndexKind = 7,

  // Index Projection Kinds
  Struct = PointerIntEnumIndexKindValue<0, EnumTy>::value,
  Tuple = PointerIntEnumIndexKindValue<1, EnumTy>::value,
  Index = PointerIntEnumIndexKindValue<2, EnumTy>::value,
  Class = PointerIntEnumIndexKindValue<3, EnumTy>::value,
  Enum = PointerIntEnumIndexKindValue<4, EnumTy>::value,
  LastIndexKind = Enum,
};
This commit adds a number of compression routines:
1. A dictionary based compression.
2. Huffman based compression.
3. A compression algorithm for swift names that's based on the other two.
This commit also adds two large autogenerated files: CBCTables.h and HuffTables.h.
These files contain the autogenerated string tables and auto-generated code for
fast compression/decompression. The internal tree data structures are lowered
into code that does the variable length encoding/decoding and searching of
fragments in the codebook. The files were generated by processing the symbols
from several large swift applications (stdlib, unittests, simd, ui app, etc).
The programs are listed as part of the output of the tool in the
header file.
I decided to commit the auto-generated files for two reasons. First, we have a
cyclic dependency problem where we need to analyze the output of the compiler
(swift files) in order to generate the tables. And second, these tables will
become a part of the Swift ABI and should remain constant.
It should be possible to split the code that generates the Trie-based data
structure and auto-generate it as part of the Swift build process.
PointerIntEnum is a more powerful PointerIntPair data structure. It uses
an enum with special cases to understand characteristics of the data and
then uses this information and some tricks to be able to
represent:
1. Up to tagged bit number of pointer cases. The cases are stored inline.
2. Inline indices up to 4096.
3. Out of line indices > 4096.
It takes advantage of the trick that we use in the runtime already to
distinguish pointers from indices: namely that modern OSes do not
allocate the zero page.
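A sketch of the zero page trick, assuming a 4096 byte page and treating values
strictly below that as inline indices; the names are hypothetical and the
exact boundary used by the real implementation is not confirmed here.

```cpp
#include <cassert>
#include <cstdint>

// Assumed size of the unmapped zero page.
constexpr std::uintptr_t ZeroPageSize = 4096;

// No valid object lives in the zero page, so any word smaller than the page
// size cannot be a real pointer and can be reused to hold an index.
inline bool isInlineIndex(std::uintptr_t Bits) { return Bits < ZeroPageSize; }

inline std::uintptr_t fromPointer(const void *P) {
  return reinterpret_cast<std::uintptr_t>(P);
}

inline std::uintptr_t asInlineIndex(std::uintptr_t Bits) {
  assert(isInlineIndex(Bits) && "word holds a pointer, not an index");
  return Bits;
}
```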
I made unittests for all of the operations so it is pretty well tested
out.
I am going to use this in a subsequent commit to compress projection in
the common case (the inline case) down to 1/3 of its size. The reason
why the inline case is common is that in most cases where projection is
used it will be targeting relative offsets in an array which are not
likely to be greater than a page. The mallocing of memory just enables
us to degrade gracefully.