* IRGen: EmptyBoxType's representation cannot be nil because of a conflict with extra inhabitant assumption in indirect enums
We map nil to the .None case of Optional. Instead use a singleton object.
SR-5148
rdar://32618580
- Separate out a uniquable KeyPathPattern that describes the context-free shape of the key path, with generic parameters and (eventually) subscript index slots factored out.
- Add component kinds for gettable and settable properties.
* IRGen: Change c-o-w existential implementation functions
* initialzeBufferWith(Copy|Take)OfBuffer value witness implementation for cow existentials
Implement and use initialzeBufferWith(Copy|Take)OfBuffer value witnesses for
copy-on-write existentials.
Before we used a free standing function but the overhead of doing so was
noticable (~20-30%) on micro benchmarks.
* IRGen: Use common getCopyOutOfLineBoxPointerFunction
* Add a runtime function to conditionally make a box unique
* Fix compilation of HeapObject.cpp on i386
* Fix IRGen test case
* Fix test case for i386
...and IRGen it into a call to __tsan_write1 in compiler-rt. This is
preparatory work for a later patch that will add an experimental
option to treat Swift inout accesses as TSan writes.
Use the generic type lowering algorithm described in
"docs/CallingConvention.rst#physical-lowering" to map from IRGen's explosion
type to the type expected by the ABI.
Change IRGen to use the swift calling convention (swiftcc) for native swift
functions.
Use the 'swiftself' attribute on self parameters and for closures contexts.
Use the 'swifterror' parameter for swift error parameters.
Change functions in the runtime that are called as native swift functions to use
the swift calling convention.
rdar://19978563
The new instructions are: ref_tail_addr, tail_addr and a new attribute [ tail_elems ] for alloc_ref.
For details see docs/SIL.rst
As these new instructions are not generated so far, this is a NFC.
The new instructions are: ref_tail_addr, tail_addr and a new attribute [ tail_elems ] for alloc_ref.
For details see docs/SIL.rst
As these new instructions are not generated so far, this is a NFC.
Now we can discern the types of values in heap boxes at runtime!
Closure reference captures are a common way of creating reference
cycles, so this provides some basic infrastructure for detecting those
someday.
A closure capture descriptor has the following:
- The number of captures.
- The number of sources of metadata reachable from the closure.
This is important for substituting generics at runtime since we
can't know precisely what will get captured until we observe a
closure.
- The number of types in the NecessaryBindings structure.
This is a holding tank in a closure for sources of metadata that
can't be gotten from the captured values themselves.
- The metadata source map, a list of pairs, for each
source of metadata for every generic argument needed to perform
substitution at runtime.
Key: The typeref for the generic parameter visible from the closure
in the Swift source.
Value: The metadata source, which describes how to crawl the heap from
the closure to get to the metadata for that generic argument.
- A list of typerefs for the captured values themselves.
Follow-up: IRGen tests for various capture scenarios, which will include
MetadataSource encoding tests.
rdar://problem/24989531
Properly lower reference counting SIL instructions with nonatomic attribute as invocations of corresponding non-atomic reference counting runtime functions.
initialization in-place on demand. Initialize parent metadata
references correctly on struct and enum metadata.
Also includes several minor improvements related to relative
pointers that I was using before deciding to simply switch the
parent reference to an absolute reference to get better access
patterns.
Includes a fix since the earlier commit to make enum metadata
writable if they have an unfilled payload size. This didn't show
up on Darwin because "constant" is currently unenforced there in
global data containing relocations.
This patch requires an associated LLDB change which is being
submitted in parallel.
initialization in-place on demand. Initialize parent metadata
references correctly on struct and enum metadata.
Also includes several minor improvements related to relative
pointers that I was using before deciding to simply switch the
parent reference to an absolute reference to get better access
patterns.
from the witness tables for their associations rather than passing
them separately.
This drastically reduces the number of physical arguments required
to invoke a generic function with a complex protocol hierarchy. It's
also an important step towards allowing recursive protocol
constraints. However, it may cause some performance problems in
generic code that we'll have to figure out ways to remediate.
There are still a few places in IRGen that rely on recursive eager
expansion of associated types and protocol witnesses. For example,
passing generic arguments requires us to map from a dependent type
back to an index into the all-dependent-types list in order to
find the right Substitution; that's something we'll need to fix
more generally. Specific to IRGen, there are still a few abstractions
like NecessaryBindings that use recursive expansion and are therefore
probably extremely expensive under this patch; I intend to fix those
up in follow-ups to the greatest extent possible.
There are also still a few things that could be made lazier about
type fulfillment; for example, we eagerly project the dynamic type
metadata of class parameters rather than waiting for the first place
we actually need to do so. We should be able to be lazier about
that, at least when the parameter is @guaranteed.
Technical notes follow. Most of the basic infrastructure I set up
for this over the last few months stood up, although there were
some unanticipated complexities:
The first is that the all-dependent-types list still does not
reliably contain all the dependent types in the minimized signature,
even with my last patch, because the primary type parameters aren't
necessarily representatives. It is, unfortunately, important to
give the witness marker to the primary type parameter because
otherwise substitution won't be able to replace that parameter at all.
There are better representations for all of that, but it's not
something I wanted to condition this patch on; therefore, we have to
do a significantly more expensive check in order to figure out a
dependent type's index in the all-dependent-types list.
The second is that the ability to add requirements to associated
types in protocol refinements means that we have to find the *right*
associatedtype declaration in order to find the associated witness
table. There seems to be relatively poor AST support for this
operation; maybe I just missed it.
The third complexity (so far) is that the association between an
archetype and its parent isn't particularly more important than
any other association it has. We need to be able to recover
witness tables linked with *all* of the associations that lead
to an archetype. This is, again, not particularly well-supported
by the AST, and we may run into problems here when we eliminate
recursive associated type expansion in signatures.
Finally, it's a known fault that this potentially leaves debug
info in a bit of a mess, since we won't have any informaton for
a type parameter unless we actually needed it somewhere.
Most notably, the source caches did not respect dominance. The
simplest solution was just to drop them in favor of the ordinary
caching system; this is unfortunate because it requires walking
over the path twice instead of exploiting the trie, but it's much
easier to make this work, especially in combination with the other
caching mechanisms at play.
This will be tested by later commits that enable lazy-loading of
local type data in various contexts.
Instead of categorically forbidding caching within a conditional
scope, permit it but remember how to remove the cache entries.
This means that ad-hoc emission code can now get exactly the
right caching behavior if they use this properly. In keeping
with that, adjust a bunch of code to properly nest scopes
according to the conditional paths they enter.
There are several interesting new features here.
The first is that, when emitting a SILFunction, we're now able to
cache type data according to the full dominance structure of the
original function. For example, if we ask for type metadata, and
we've already computed it in a dominating position, we're now able
to re-use that value; previously, we were limited to only doing this
if the value was from the entry block or the LLVM basic block
matched exactly. Since this tracks the SIL dominance relationship,
things in IRGen which add their own control flow must be careful
to suppress caching within blocks that may not dominate the
fallthrough; this mechanism is currently very crude, but could be
made to allow a limited amount of caching within the
conditionally-executed blocks.
This query is done using a proper dominator tree analysis, even at -O0.
I do not expect that we will frequently need to actually build the
tree, and I expect that the code-size benefits of doing a real
analysis will be significant, especially as we move towards making
more metadata lazily computed.
The second feature is that this adds support for "abstract"
cache entries, which indicate that we know how to derive the metadata
but haven't actually done so. This code isn't yet tested, but
it's going to be the basis of making a lot of things much lazier.
A single extra inhabitant is good enough for the most important case,
that being a single level of optionality. Otherwise, we want to
reserve maximal flexibility for the implementation.
This commit also fixes a bug where I was not correctly defining
the extra-inhabitant rules for all of the existential cases.
This is a bit of a hodge-podge of related changes that I decided
weren't quite worth teasing apart:
First, rename the weak{Retain,Release} entrypoints to
unowned{Retain,Release} to better reflect their actual use
from generated code.
Second, standardize the names of the rest of the entrypoints around
unowned{operation}.
Third, standardize IRGen's internal naming scheme and API for
reference-counting so that (1) there are generic functions for
emitting operations using a given reference-counting style and
(2) all operations explicitly call out the kind and style of
reference counting.
Finally, implement a number of new entrypoints for unknown unowned
reference-counting. These entrypoints use a completely different
and incompatible scheme for working with ObjC references. The
primary difference is that the new scheme abandons the flawed idea
(which I take responsibility for) that we can simulate an unowned
reference count for ObjC references, and instead moves towards an
address-only scheme when the reference might store an ObjC reference.
(The current implementation is still trivially takable, but that is
not something we should be relying on.) These will be tested in a
follow-up commit. For now, we still rely on the bad assumption of
reference-countability.
The drivers for this change are providing a simpler API to SIL pass
authors, having a more efficient of the in-memory representation,
and ruling out an entire class of common bugs that usually result
in hard-to-debug backend crashes.
Summary
-------
SILInstruction
Old New
+---------------+ +------------------+ +-----------------+
|SILInstruction | |SILInstruction | |SILDebugLocation |
+---------------+ +------------------+ +-----------------+
| ... | | ... | | ... |
|SILLocation | |SILDebugLocation *| -> |SILLocation |
|SILDebugScope *| +------------------+ |SILDebugScope * |
+---------------+ +-----------------+
We’re introducing a new class SILDebugLocation which represents the
combination of a SILLocation and a SILDebugScope.
Instead of storing an inline SILLocation and a SILDebugScope pointer,
SILInstruction now only has one SILDebugLocation pointer. The APIs of
SILBuilder and SILDebugLocation guarantees that every SILInstruction
has a nonempty SILDebugScope.
Developer-visible changes include:
SILBuilder
----------
In the old design SILBuilder populated the InsertedInstrs list to
allow setting the debug scopes of all built instructions in bulk
at the very end (as the responsibility of the user). In the new design,
SILBuilder now carries a "current debug scope" state and immediately
sets the debug scope when an instruction is inserted.
This fixes a use-after-free issue with with SIL passes that delete
instructions before destroying the SILBuilder that created them.
Because of this, SILBuilderWithScopes no longer needs to be a template,
which simplifies its call sites.
SILInstruction
--------------
It is neither possible or necessary to manually call setDebugScope()
on a SILInstruction any more. The function still exists as a private
method, but is only used when splicing instructions from one function
to another.
Efficiency
----------
In addition to dropping 20 bytes from each SILInstruction,
SILDebugLocations are now allocated in the SILModule's bump pointer
allocator and are uniqued by SILBuilder. Unfortunately repeat compiles
of the standard library already vary by about 5% so I couldn’t yet
produce reliable numbers for how much this saves overall.
rdar://problem/22017421
The swift_unknown* entry points are not available on the Linux port.
Previously we would still attempt to use them in a couple of cases:
1) Foreign classes
2) Existentials and archetypes
3) Optionals of boxed existentials
Note that this patch changes IRGen to never emit the
swift_errorRelease/Retain entry points on Linux. We would like to
use them in the future if we ever adopt a tagged-pointer representation
for small errors. In this case, they can be brought back, and the
TypeInfo for optionals will need to be generalized to propagate the
reference counting of the payload type, instead of defaulting to
unknown if the payload type is not natively reference counted.
A similar change will need to be made to support blocks, if we ever
want to use the blocks runtime on Linux.
Fixes <rdar://problem/23335318>, <rdar://problem/23335537>,
<rdar://problem/23335453>.
This means: handling of alloc_ref [stack].
It can be configured with two new options. See Option/FrontendOptions.td.
As the [stack] attribute is not generated yet, there should be NFC.
Swift SVN r32929
Full type metadata isn't necessary to calculate the runtime layout of a dependent struct or enum; we only need the non-function data from the value witness table (size, alignment, extra inhabitant count, and POD/BT/etc. flags). This can be generated more efficiently than the type metadata for many types--if we know a specific instantiation is fixed-layout, we can regenerate the layout information, or if we know the type has the same layout as another well-known type, we can get the layout from a common value witness table. This breaks a deadlock in most (but not all) cases where a value type is recursive using classes or fixed-layout indirected structs like UnsafePointer. rdar://problem/19898165
This time, factor out the ObjC-dependent parts of the tests so they only run with ObjC interop.
Swift SVN r30266
Full type metadata isn't necessary to calculate the runtime layout of a dependent struct or enum; we only need the non-function data from the value witness table (size, alignment, extra inhabitant count, and POD/BT/etc. flags). This can be generated more efficiently than the type metadata for many types--if we know a specific instantiation is fixed-layout, we can regenerate the layout information, or if we know the type has the same layout as another well-known type, we can get the layout from a common value witness table. This breaks a deadlock in most (but not all) cases where a value type is recursive using classes or fixed-layout indirected structs like UnsafePointer. rdar://problem/19898165
Swift SVN r30243
When producing TypeInfo for a box, try to reuse instantiations for common type structures:
- any POD type with the same stride and alignment can share a fixed HeapLayout;
- any single-refcounted-pointer type can share a fixed HeapLayout with types that have the same ReferenceCounting;
- dynamically-sized types can share a runtime-based box implementation.
For the runtime implementation, use new to-be-implemented variants of allocBox/deallocBox that will produce polymorphically-projectable boxes using instantiated metadata.
Swift SVN r29612
Don't project every value witness from the metadata every time we need one; this wastes code size in a way LLVM can't really optimize since it doesn't know the metadata is immutable. The code size wins on the standard library are disappointingly small (stdlib only shrinks by 4KB), but this makes generic IR a lot more compact and easier to read.
Swift SVN r28095
Preparation to fix <rdar://problem/18151694> Add Builtin.checkUnique
to avoid lost Array copies.
This adds the following new builtins:
isUnique : <T> (inout T[?]) -> Int1
isUniqueOrPinned : <T> (inout T[?]) -> Int1
These builtins take an inout object reference and return a
boolean. Passing the reference inout forces the optimizer to preserve
a retain distinct from what’s required to maintain lifetime for any of
the reference's source-level copies, because the called function is
allowed to replace the reference, thereby releasing the referent.
Before this change, the API entry points for uniqueness checking
already took an inout reference. However, after full inlining, it was
possible for two source-level variables that reference the same object
to appear to be the same variable from the optimizer's perspective
because an address to the variable was longer taken at the point of
checking uniqueness. Consequently the optimizer could remove
"redundant" copies which were actually needed to implement
copy-on-write semantics. With a builtin, the variable whose reference
is being checked for uniqueness appears mutable at the level of an
individual SIL instruction.
The kind of reference count checking that Builtin.isUnique performs
depends on the argument type:
- Native object types are directly checked by reading the
strong reference count:
(Builtin.NativeObject, known native class reference)
- Objective-C object types require an additional check that the
dynamic object type uses native swift reference counting:
(Builtin.UnknownObject, unknown class reference, class existential)
- Bridged object types allow the dymanic object type check to be
bypassed based on the pointer encoding:
(Builtin.BridgeObject)
Any of the above types may also be wrapped in an optional. If the
static argument type is optional, then a null check is also performed.
Thus, isUnique only returns true for non-null, native swift object
references with a strong reference count of one.
isUniqueOrPinned has the same semantics as isUnique except that it
also returns true if the object is marked pinned regardless of the
reference count. This allows for simultaneous non-structural
modification of multiple subobjects.
In some cases, the standard library can dynamically determine that it
has a native reference even though the static type is a bridge or
unknown object. Unsafe variants of the builtin are available to allow
the additional pointer bit mask and dynamic class lookup to be
bypassed in these cases:
isUnique_native : <T> (inout T[?]) -> Int1
isUniqueOrPinned_native : <T> (inout T[?]) -> Int1
These builtins perform an implicit cast to NativeObject before
checking uniqueness. There’s no way at SIL level to cast the address
of a reference, so we need to encapsulate this operation as part of
the builtin.
Swift SVN r27887
Some future-proofing to let us change ErrorType's reference counting in the future, or to use various tagged pointer optimizations in its representation.
Swift SVN r27213
As part of this, re-arrange the argument order so that
generic arguments come before the context, which comes
before the error result. Be more consistent about always
adding a context parameter on thick functions, even
when it's unused. Pull out the witness-method Self
argument so that it appears last after the error
argument.
Swift SVN r26667
This avoids redundant calls of swift_get*Metadata() if the metadata is reused in the same basic block
or if it is already availabe in the function entry block.
There is still room for improvement, e.g. by not limiting the reuse to a single block.
This change reduces the dylib codesize by about 11%, mostly in protocoll witnesses (-34%).
It also results in some improvements for the -Onone performance:
RangeAssignment: +31%
StdlibSort: +20%
Swift SVN r25123
OpaqueStorageTypeInfo uses iNNN types that don't always have the correct alloc size for an expected
size at the LLVM level. This needs to be fixed before my partial apply closure fixes can hold.
Swift SVN r24551