This commit is mostly refactoring.
*) Introduce a new OptimizationMode enum and use that in SILOptions and IRGenOptions
*) Allow the optimization mode also be specified for specific SILFunctions. This is not used in this commit yet and thus still a NFC.
Also, fixes a minor bug: we didn’t run mandatory IRGen passes for functions with @_semantics("optimize.sil.never")
This commit contains:
-) adding the new instructions + infrastructure, like parsing, printing, etc.
-) support in IRGen to generate global object-variables (i.e. "heap" objects) which are statically initialized in the data section.
-) IRGen for global_value which lazily initializes the object header and returns a reference to the object.
For details see the documentation of the new instructions in SIL.rst.
To make this stick, I've disallowed direct use of that overload of
CreateCall. I've left the Constant overloads available, but eventually
we might want to consider fixing those, too, just to get all of this
code out of the business of manually remembering to pass around
attributes and calling conventions.
The test changes reflect the fact that we weren't really setting
attributes consistently at all, in this case on value witnesses.
* IRGen: EmptyBoxType's representation cannot be nil because of a conflict with extra inhabitant assumption in indirect enums
We map nil to the .None case of Optional. Instead use a singleton object.
SR-5148
rdar://32618580
- Separate out a uniquable KeyPathPattern that describes the context-free shape of the key path, with generic parameters and (eventually) subscript index slots factored out.
- Add component kinds for gettable and settable properties.
* IRGen: Change c-o-w existential implementation functions
* initialzeBufferWith(Copy|Take)OfBuffer value witness implementation for cow existentials
Implement and use initialzeBufferWith(Copy|Take)OfBuffer value witnesses for
copy-on-write existentials.
Before we used a free standing function but the overhead of doing so was
noticable (~20-30%) on micro benchmarks.
* IRGen: Use common getCopyOutOfLineBoxPointerFunction
* Add a runtime function to conditionally make a box unique
* Fix compilation of HeapObject.cpp on i386
* Fix IRGen test case
* Fix test case for i386
...and IRGen it into a call to __tsan_write1 in compiler-rt. This is
preparatory work for a later patch that will add an experimental
option to treat Swift inout accesses as TSan writes.
Use the generic type lowering algorithm described in
"docs/CallingConvention.rst#physical-lowering" to map from IRGen's explosion
type to the type expected by the ABI.
Change IRGen to use the swift calling convention (swiftcc) for native swift
functions.
Use the 'swiftself' attribute on self parameters and for closures contexts.
Use the 'swifterror' parameter for swift error parameters.
Change functions in the runtime that are called as native swift functions to use
the swift calling convention.
rdar://19978563
The new instructions are: ref_tail_addr, tail_addr and a new attribute [ tail_elems ] for alloc_ref.
For details see docs/SIL.rst
As these new instructions are not generated so far, this is a NFC.
The new instructions are: ref_tail_addr, tail_addr and a new attribute [ tail_elems ] for alloc_ref.
For details see docs/SIL.rst
As these new instructions are not generated so far, this is a NFC.
Now we can discern the types of values in heap boxes at runtime!
Closure reference captures are a common way of creating reference
cycles, so this provides some basic infrastructure for detecting those
someday.
A closure capture descriptor has the following:
- The number of captures.
- The number of sources of metadata reachable from the closure.
This is important for substituting generics at runtime since we
can't know precisely what will get captured until we observe a
closure.
- The number of types in the NecessaryBindings structure.
This is a holding tank in a closure for sources of metadata that
can't be gotten from the captured values themselves.
- The metadata source map, a list of pairs, for each
source of metadata for every generic argument needed to perform
substitution at runtime.
Key: The typeref for the generic parameter visible from the closure
in the Swift source.
Value: The metadata source, which describes how to crawl the heap from
the closure to get to the metadata for that generic argument.
- A list of typerefs for the captured values themselves.
Follow-up: IRGen tests for various capture scenarios, which will include
MetadataSource encoding tests.
rdar://problem/24989531
Properly lower reference counting SIL instructions with nonatomic attribute as invocations of corresponding non-atomic reference counting runtime functions.
initialization in-place on demand. Initialize parent metadata
references correctly on struct and enum metadata.
Also includes several minor improvements related to relative
pointers that I was using before deciding to simply switch the
parent reference to an absolute reference to get better access
patterns.
Includes a fix since the earlier commit to make enum metadata
writable if they have an unfilled payload size. This didn't show
up on Darwin because "constant" is currently unenforced there in
global data containing relocations.
This patch requires an associated LLDB change which is being
submitted in parallel.
initialization in-place on demand. Initialize parent metadata
references correctly on struct and enum metadata.
Also includes several minor improvements related to relative
pointers that I was using before deciding to simply switch the
parent reference to an absolute reference to get better access
patterns.
from the witness tables for their associations rather than passing
them separately.
This drastically reduces the number of physical arguments required
to invoke a generic function with a complex protocol hierarchy. It's
also an important step towards allowing recursive protocol
constraints. However, it may cause some performance problems in
generic code that we'll have to figure out ways to remediate.
There are still a few places in IRGen that rely on recursive eager
expansion of associated types and protocol witnesses. For example,
passing generic arguments requires us to map from a dependent type
back to an index into the all-dependent-types list in order to
find the right Substitution; that's something we'll need to fix
more generally. Specific to IRGen, there are still a few abstractions
like NecessaryBindings that use recursive expansion and are therefore
probably extremely expensive under this patch; I intend to fix those
up in follow-ups to the greatest extent possible.
There are also still a few things that could be made lazier about
type fulfillment; for example, we eagerly project the dynamic type
metadata of class parameters rather than waiting for the first place
we actually need to do so. We should be able to be lazier about
that, at least when the parameter is @guaranteed.
Technical notes follow. Most of the basic infrastructure I set up
for this over the last few months stood up, although there were
some unanticipated complexities:
The first is that the all-dependent-types list still does not
reliably contain all the dependent types in the minimized signature,
even with my last patch, because the primary type parameters aren't
necessarily representatives. It is, unfortunately, important to
give the witness marker to the primary type parameter because
otherwise substitution won't be able to replace that parameter at all.
There are better representations for all of that, but it's not
something I wanted to condition this patch on; therefore, we have to
do a significantly more expensive check in order to figure out a
dependent type's index in the all-dependent-types list.
The second is that the ability to add requirements to associated
types in protocol refinements means that we have to find the *right*
associatedtype declaration in order to find the associated witness
table. There seems to be relatively poor AST support for this
operation; maybe I just missed it.
The third complexity (so far) is that the association between an
archetype and its parent isn't particularly more important than
any other association it has. We need to be able to recover
witness tables linked with *all* of the associations that lead
to an archetype. This is, again, not particularly well-supported
by the AST, and we may run into problems here when we eliminate
recursive associated type expansion in signatures.
Finally, it's a known fault that this potentially leaves debug
info in a bit of a mess, since we won't have any informaton for
a type parameter unless we actually needed it somewhere.
Most notably, the source caches did not respect dominance. The
simplest solution was just to drop them in favor of the ordinary
caching system; this is unfortunate because it requires walking
over the path twice instead of exploiting the trie, but it's much
easier to make this work, especially in combination with the other
caching mechanisms at play.
This will be tested by later commits that enable lazy-loading of
local type data in various contexts.
Instead of categorically forbidding caching within a conditional
scope, permit it but remember how to remove the cache entries.
This means that ad-hoc emission code can now get exactly the
right caching behavior if they use this properly. In keeping
with that, adjust a bunch of code to properly nest scopes
according to the conditional paths they enter.
There are several interesting new features here.
The first is that, when emitting a SILFunction, we're now able to
cache type data according to the full dominance structure of the
original function. For example, if we ask for type metadata, and
we've already computed it in a dominating position, we're now able
to re-use that value; previously, we were limited to only doing this
if the value was from the entry block or the LLVM basic block
matched exactly. Since this tracks the SIL dominance relationship,
things in IRGen which add their own control flow must be careful
to suppress caching within blocks that may not dominate the
fallthrough; this mechanism is currently very crude, but could be
made to allow a limited amount of caching within the
conditionally-executed blocks.
This query is done using a proper dominator tree analysis, even at -O0.
I do not expect that we will frequently need to actually build the
tree, and I expect that the code-size benefits of doing a real
analysis will be significant, especially as we move towards making
more metadata lazily computed.
The second feature is that this adds support for "abstract"
cache entries, which indicate that we know how to derive the metadata
but haven't actually done so. This code isn't yet tested, but
it's going to be the basis of making a lot of things much lazier.
A single extra inhabitant is good enough for the most important case,
that being a single level of optionality. Otherwise, we want to
reserve maximal flexibility for the implementation.
This commit also fixes a bug where I was not correctly defining
the extra-inhabitant rules for all of the existential cases.
This is a bit of a hodge-podge of related changes that I decided
weren't quite worth teasing apart:
First, rename the weak{Retain,Release} entrypoints to
unowned{Retain,Release} to better reflect their actual use
from generated code.
Second, standardize the names of the rest of the entrypoints around
unowned{operation}.
Third, standardize IRGen's internal naming scheme and API for
reference-counting so that (1) there are generic functions for
emitting operations using a given reference-counting style and
(2) all operations explicitly call out the kind and style of
reference counting.
Finally, implement a number of new entrypoints for unknown unowned
reference-counting. These entrypoints use a completely different
and incompatible scheme for working with ObjC references. The
primary difference is that the new scheme abandons the flawed idea
(which I take responsibility for) that we can simulate an unowned
reference count for ObjC references, and instead moves towards an
address-only scheme when the reference might store an ObjC reference.
(The current implementation is still trivially takable, but that is
not something we should be relying on.) These will be tested in a
follow-up commit. For now, we still rely on the bad assumption of
reference-countability.
The drivers for this change are providing a simpler API to SIL pass
authors, having a more efficient of the in-memory representation,
and ruling out an entire class of common bugs that usually result
in hard-to-debug backend crashes.
Summary
-------
SILInstruction
Old New
+---------------+ +------------------+ +-----------------+
|SILInstruction | |SILInstruction | |SILDebugLocation |
+---------------+ +------------------+ +-----------------+
| ... | | ... | | ... |
|SILLocation | |SILDebugLocation *| -> |SILLocation |
|SILDebugScope *| +------------------+ |SILDebugScope * |
+---------------+ +-----------------+
We’re introducing a new class SILDebugLocation which represents the
combination of a SILLocation and a SILDebugScope.
Instead of storing an inline SILLocation and a SILDebugScope pointer,
SILInstruction now only has one SILDebugLocation pointer. The APIs of
SILBuilder and SILDebugLocation guarantees that every SILInstruction
has a nonempty SILDebugScope.
Developer-visible changes include:
SILBuilder
----------
In the old design SILBuilder populated the InsertedInstrs list to
allow setting the debug scopes of all built instructions in bulk
at the very end (as the responsibility of the user). In the new design,
SILBuilder now carries a "current debug scope" state and immediately
sets the debug scope when an instruction is inserted.
This fixes a use-after-free issue with with SIL passes that delete
instructions before destroying the SILBuilder that created them.
Because of this, SILBuilderWithScopes no longer needs to be a template,
which simplifies its call sites.
SILInstruction
--------------
It is neither possible or necessary to manually call setDebugScope()
on a SILInstruction any more. The function still exists as a private
method, but is only used when splicing instructions from one function
to another.
Efficiency
----------
In addition to dropping 20 bytes from each SILInstruction,
SILDebugLocations are now allocated in the SILModule's bump pointer
allocator and are uniqued by SILBuilder. Unfortunately repeat compiles
of the standard library already vary by about 5% so I couldn’t yet
produce reliable numbers for how much this saves overall.
rdar://problem/22017421
The swift_unknown* entry points are not available on the Linux port.
Previously we would still attempt to use them in a couple of cases:
1) Foreign classes
2) Existentials and archetypes
3) Optionals of boxed existentials
Note that this patch changes IRGen to never emit the
swift_errorRelease/Retain entry points on Linux. We would like to
use them in the future if we ever adopt a tagged-pointer representation
for small errors. In this case, they can be brought back, and the
TypeInfo for optionals will need to be generalized to propagate the
reference counting of the payload type, instead of defaulting to
unknown if the payload type is not natively reference counted.
A similar change will need to be made to support blocks, if we ever
want to use the blocks runtime on Linux.
Fixes <rdar://problem/23335318>, <rdar://problem/23335537>,
<rdar://problem/23335453>.
This means: handling of alloc_ref [stack].
It can be configured with two new options. See Option/FrontendOptions.td.
As the [stack] attribute is not generated yet, there should be NFC.
Swift SVN r32929
Full type metadata isn't necessary to calculate the runtime layout of a dependent struct or enum; we only need the non-function data from the value witness table (size, alignment, extra inhabitant count, and POD/BT/etc. flags). This can be generated more efficiently than the type metadata for many types--if we know a specific instantiation is fixed-layout, we can regenerate the layout information, or if we know the type has the same layout as another well-known type, we can get the layout from a common value witness table. This breaks a deadlock in most (but not all) cases where a value type is recursive using classes or fixed-layout indirected structs like UnsafePointer. rdar://problem/19898165
This time, factor out the ObjC-dependent parts of the tests so they only run with ObjC interop.
Swift SVN r30266
Full type metadata isn't necessary to calculate the runtime layout of a dependent struct or enum; we only need the non-function data from the value witness table (size, alignment, extra inhabitant count, and POD/BT/etc. flags). This can be generated more efficiently than the type metadata for many types--if we know a specific instantiation is fixed-layout, we can regenerate the layout information, or if we know the type has the same layout as another well-known type, we can get the layout from a common value witness table. This breaks a deadlock in most (but not all) cases where a value type is recursive using classes or fixed-layout indirected structs like UnsafePointer. rdar://problem/19898165
Swift SVN r30243
When producing TypeInfo for a box, try to reuse instantiations for common type structures:
- any POD type with the same stride and alignment can share a fixed HeapLayout;
- any single-refcounted-pointer type can share a fixed HeapLayout with types that have the same ReferenceCounting;
- dynamically-sized types can share a runtime-based box implementation.
For the runtime implementation, use new to-be-implemented variants of allocBox/deallocBox that will produce polymorphically-projectable boxes using instantiated metadata.
Swift SVN r29612