While tightening the requirements of the debug info generator in
IRGenSIL I noticed that SILCloner didn't correctly transfer variable
debug info on alloc_box and alloc_stack instructions. To make these mistakes
easier to find, I added an assertion to SILBuilder and also fixed all the
issues that assertion uncovered.
The result is a moderate increase in debug info coverage in optimized code:
on stdlib/public/core/OSX/x86_64/Swift.o, "variables with location" increases
from 60134 to 60299.
Specifically, this transforms
```
builtin "generic_add"<Builtin.Vec4xInt32>(
```
->
```
builtin "add_Vec4xInt32"(
```
If we do not have a static overload for the type, we just leave the generic call
alone. If the generic builtin takes addresses as its arguments (i.e. 2x
in_guaranteed + 1x out), we load the arguments, evaluate the static overload of
the builtin, and then store the result into the out parameter.
TLDR: This patch introduces a new kind of builtin, a "polymorphic builtin". One
calls it like any other builtin, e.g.:
```
Builtin.generic_add(x, y)
```
but it has a contract: it must be specialized to a concrete builtin by the time
we hit Lowered SIL. In this commit, I add support for the following generic
operations:
Type            | Op
----------------|----------
FloatOrVector   | FAdd
FloatOrVector   | FDiv
FloatOrVector   | FMul
FloatOrVector   | FRem
FloatOrVector   | FSub
IntegerOrVector | AShr
IntegerOrVector | Add
IntegerOrVector | And
IntegerOrVector | ExactSDiv
IntegerOrVector | ExactUDiv
IntegerOrVector | LShr
IntegerOrVector | Mul
IntegerOrVector | Or
IntegerOrVector | SDiv
IntegerOrVector | SRem
IntegerOrVector | Shl
IntegerOrVector | Sub
IntegerOrVector | UDiv
IntegerOrVector | Xor
Integer         | URem
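As a rough illustration of the operations above, here is a minimal sketch, assuming -parse-stdlib so that the Builtin module is visible; the function name addLanes is purely illustrative.
```
// Minimal sketch, assuming -parse-stdlib (Builtin is not visible otherwise).
// `addLanes` is a hypothetical name used only for illustration.
func addLanes(_ x: Builtin.Vec4xInt32,
              _ y: Builtin.Vec4xInt32) -> Builtin.Vec4xInt32 {
  // At this concrete type, generic_add must be resolved to the static
  // overload (add_Vec4xInt32) by the time we reach Lowered SIL.
  return Builtin.generic_add(x, y)
}
```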
NOTE: I only implemented support for the builtins in SIL and in SILGen. I am
going to implement the optimizer parts of this in a separate series of commits.
DISCUSSION
----------
Today, LLVM IR has instructions that are effectively polymorphic. Yet at the
Swift and SIL level we instead represent these operations as Builtins whose
names are resolved by splatting the type into the name. For example, adding
two things in LLVM:
```
%2 = add i64 %0, %1
%2 = add <2 x i64> %0, %1
%2 = add <4 x i64> %0, %1
%2 = add <8 x i64> %0, %1
```
Each of the add operations is performed by the same polymorphic instruction. In
contrast, we splat out these Builtins in Swift today, i.e.:
```
let x, y: Builtin.Int32
Builtin.add_Int32(x, y)
let x, y: Builtin.Vec4xInt32
Builtin.add_Vec4xInt32(x, y)
...
```
In SIL, we translate these verbatim and then IRGen just lowers them to the
appropriate polymorphic instruction. Beyond being verbose, this prevents these
Builtins (which require static types) from being used in polymorphic contexts
where we can guarantee that a static type will eventually be provided.
In contrast, the polymorphic builtins introduced in this commit can be passed
any type, with the proviso that the expert user of this feature guarantees that
the generic_add has been eliminated before we reach Lowered SIL. This is
enforced by IRGen asserting if it is passed such a builtin and by the
SILVerifier checking that the underlying builtin is never called once the
module is in Lowered SIL.
In forthcoming commits, I am going to add two optimizations that give the stdlib
writer the tools needed to use this builtin:
1. I am going to add an optimization to constant propagation that resolves a
"generic_*" op to the concrete builtin for the type of its argument, if that
type is valid for the builtin (i.e. integer or vector).
2. I am going to teach the SILCloner how to specialize these as it inlines. This
ensures that when we inline transparent functions, we specialize the builtin
automatically and can then form SSA at -Onone using predictable memory access
operations.
The main implication of these polymorphic builtins is that if an author is
not able to specialize the builtin, they need to ensure that after constant
propagation, the generic builtin has been DCEed. The general rule is that the
-Onone optimizer will constant fold branches with constant integer operands. So
if one can guard the operation with a bool of some sort, one can be guaranteed
that the code will not be codegened. I am considering putting in some sort of
diagnostic to ensure that the stdlib writer has a good experience (e.g. gets an
error instead of crashing the compiler).
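A minimal sketch of that guard pattern, assuming -parse-stdlib; the names addOrFallback, hasVectorAdd, and fallback are hypothetical and only illustrate the rule, they are not real APIs.
```
// Minimal sketch, assuming -parse-stdlib. `addOrFallback`, `hasVectorAdd`,
// and `fallback` are hypothetical names used only for illustration.
@_transparent
func addOrFallback<T>(_ x: T, _ y: T, hasVectorAdd: Bool,
                      fallback: (T, T) -> T) -> T {
  if hasVectorAdd {
    // If this cannot be specialized to a concrete builtin, the caller must
    // pass a constant `false` so that even the -Onone optimizer folds the
    // branch and the generic builtin is DCEed before Lowered SIL.
    return Builtin.generic_add(x, y)
  }
  return fallback(x, y)
}
```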
Teach the constant evaluator to precisely evaluate Builtin.assert_configuration.
Unify UnknownReason::Trap and UnknownReason::AssertionFailure error
values in the constant evaluator, now that we have the 'condfail_message'
SIL instruction, which provides an error message for the traps.
@_semantics("constant_evaluable") annotation to denote constant
evaluable functions.
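As a rough sketch of how the annotation is spelled; encodedHeaderSize is a hypothetical function, not an OSLog overlay API, and its body uses only operations the evaluator is described as supporting.
```
// Hypothetical example of applying the annotation; not a real overlay API.
@_semantics("constant_evaluable")
func encodedHeaderSize(_ argumentCount: Int) -> Int {
  // Uses only integer arithmetic and shifts, which the evaluator handles.
  return (argumentCount << 1) &+ 1
}
```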
Add a test suite that uses the sil-opt pass ConstantEvaluableSubsetChecker.cpp
to check the constant evaluability of functions in the OSLog
overlay.
Add a new test, constant_evaluable_subset_test.swift, that tests Swift code
snippets that are expected to be constant evaluable and those that are not.
1. builtin "int_expect", which makes the evaluator work on more
integer operations such as left/right shift (with traps) and
integer conversions.
2. builtin "_assertConfiguration", which enables the evaluator
to work with debug stdlib.
3. builtin "ptrtoint", which enables the evaluator to track
StaticString
4. _assertionFailure API, which enables the evaluator to report
stdlib assertion failures encountered during constant evaluation.
Also, enable attaching auxiliary data to the enum "UnknownReason"
and use it to improve diagnostics for UnknownSymbolicValues,
which represent failures during constant evaluation.
This makes for a cleaner and less implicit-context-heavy API, and makes it easier for symbolic
reference resolvers to do context-dependent things (like map the in-memory base address back to a
remote address in MetadataReader).
Coroutines are so expensive (e.g. Array.subscript.read) that it makes sense to enable generic inlining of coroutines.
This speeds up array iteration (e.g. for elem in array { }) in a generic context significantly.
Another example is ManagedBuffer.header.read, which gets much faster.
In both cases, the speedup comes mainly from the fact that no malloc happens anymore.
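For instance, the kind of generic code that benefits looks roughly like the sketch below; sum is an illustrative function, and the note about the read coroutine assumes the collection iterates via the default IndexingIterator (as Array does).
```
// Illustrative only: in a generic Collection context, each element access in
// this loop goes through the read accessor coroutine for subscript(position:)
// (e.g. Array.subscript.read when C == [Int]).
func sum<C: Collection>(_ c: C) -> Int where C.Element == Int {
  var total = 0
  for elem in c {
    total += elem
  }
  return total
}
// e.g. sum(Array(1...1_000)); with generic coroutine inlining, the coroutine
// frame no longer needs a malloc, per the note above.
```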
https://bugs.swift.org/browse/SR-11231
rdar://problem/53777612
This mostly requires changing various entry points to pass around a
TypeConverter instead of a SILModule. I've left behind entry points
that take a SILModule for a few methods like SILType::subst() to
avoid creating even more churn.
This memoizes the result, which is fine for all callers; the only
exception is open existential types where each new open existential
now explicitly gets a unique generic environment, allocated by
calling GenericEnvironment::getIncomplete().
Returns `true` if `T.Type` is known to refer to a concrete type. The
implementation allows the optimizer to specialize this at -O and
eliminate conditional code.
Includes `Swift._isConcrete<T>(T.Type) -> Bool` wrapper function.
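A minimal usage sketch of that wrapper; describeSpecialization is a hypothetical function, and the example assumes the underscored stdlib entry point is visible to the caller.
```
// Hypothetical usage of the underscored wrapper described above.
func describeSpecialization<T>(_: T.Type) -> String {
  if _isConcrete(T.self) {
    // At -O, once T is concrete, the optimizer can fold this check to `true`
    // and eliminate the generic branch below.
    return "specialized for \(T.self)"
  }
  return "unspecialized generic context"
}
```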
Rather than storing the set of input requirements in a
(SIL)SpecializeAttr, store the specialized generic signature. This
prevents clients from having to rebuild the same specialized generic
signature on every use.
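For context, a hedged illustration of a use site of the attribute; maxValue and the chosen requirements are made up. With this change, each attribute records the specialized generic signature (roughly <T where T == Int>) rather than just the raw input requirements.
```
// Illustrative use of @_specialize; the function and requirements are made up.
@_specialize(where T == Int)
@_specialize(where T == String)
public func maxValue<T: Comparable>(_ a: T, _ b: T) -> T {
  return a > b ? a : b
}
```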
This provides a single instruction that converts an unmanaged value to a ref and
then strong_retains it. I expanded the definition of UNCHECKED_REF_STORAGE to
include these copy-like instructions. This instruction is valid in all SIL.
The reason why I am adding this instruction is that currently, when we emit an
access to an unowned (unsafe) ivar, we use an unmanaged_to_ref and a strong
retain. This can look to the optimizer like a strong retain that can potentially
be optimized. By combining the two into a new instruction, we avoid this
potential problem, since the pattern no longer matches.
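The Swift-level pattern that produces this sequence is roughly the following; the Node class is illustrative.
```
// Illustrative only: reading an unowned(unsafe) stored property converts the
// unmanaged value to a ref and retains it for the access, which previously
// showed up in SIL as unmanaged_to_ref + strong_retain.
final class Node {
  unowned(unsafe) var parent: Node
  init(parent: Node) { self.parent = parent }
}

func grandparent(of node: Node) -> Node {
  // Each `.parent` read is now a single copy-like instruction in SIL.
  return node.parent.parent
}
```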
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances in the swift repo.
This was not a real leak, because it only happened in an error condition where the block ends in a trap instruction anyway.
But it must be fixed to make the MemoryLifetimeVerifier happy.
This simplifies the IR and, in a certain sense, eliminates "constant information"
from the IR by allowing us to remove extract/nominal-literal round trips.
NOTE: I had to modify the pound_assert test slightly with respect to the
location of a note that it emits. The reason I did this is that the test output
is technically correct: the instruction we are interpreting when we error is
(after this commit) a debug_value in the prelude of the function (and thus has
the new location). I am going to talk with Ravi and others about what to do
with this.
When this was originally implemented, this transform was put into
SILCombine. This patch puts it into constant folding as well, so we can also
perform the transform at -Onone.
My hunch is that the reason this isn't happening with the -Onone
serialization change is that we were relying on this code being specialized
before serialization in the stdlib. But transparent functions are now not
optimized until after serialization, so it isn't done. Rather than mess with
that, I just added the support here.
I put in a little bit of infrastructure that should provide the appropriate
places for adding information about other cases where we can run into this with
other casts.
The reason I added this is that the codegen around condfail_message has changed
slightly and thus has this round trip in it, messing with various pattern
matching routines.
Example:
```
%x = string_literal
%x1 = ptr_to_int %x // <=== This messes with pattern matching from
%x2 = int_to_ptr %x1 // <=== the cond_fail to %x.
cond_fail(%x2)
```
=>
```
%x = string_literal // <=== Happiness = ).
cond_fail(%x)
```
This originally used a SmallSetVector before a refactoring that I did. I changed
it to a vector + sortUnique since AFAICT it only needed uniqueness, not
insertion order... it turns out I was wrong. I added a comment here explaining
that this has to be a SetVector to preserve both uniqueness and insertion order.
rdar://53401018
This improves on the previous situation:
- The request ensures that the backing storage for lazy properties
and property wrappers gets synthesized first; previously it was
only somewhat guaranteed by callers.
- Instead of returning a range this just returns an ArrayRef,
which simplifies clients.
- Indexing into the ArrayRef is O(1), which addresses some FIXMEs
in the SIL optimizer.
"globalStringTablePointer": String -> Builtin.RawPointer` to a
string_literal instruction if the string that is passed is constructed
from a literal. Otherwise, emit diagnostics.
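A hedged sketch of a use site, assuming -parse-stdlib so the builtin is spellable directly; stableFormatPointer is a hypothetical name, and this builtin is intended for compiler/stdlib-internal use rather than general code.
```
// Minimal sketch, assuming -parse-stdlib; `stableFormatPointer` is hypothetical.
func stableFormatPointer() -> Builtin.RawPointer {
  // `message` is constructed from a literal, so the builtin call can be
  // folded to a string_literal; a non-literal string would be diagnosed.
  let message = "log format string"
  return Builtin.globalStringTablePointer(message)
}
```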