Things that have come up recently but are somewhat blocked on this:
- Moving AccessMarkerElimination down in the pipeline
- SemanticARCOpts correctness and improvements
- AliasAnalysis improvements
- LICM performance regressions
- RLE/DSE improvements
Begin to formalize the model for valid memory access in SIL. Ignoring
ownership, every access is a def-use chain in three parts:
object root -> formal access base -> memory operation address
AccessPath abstracts over this path and standardizes the identity of a
memory access throughout the optimizer. This abstraction is the basis
for a new AccessPathVerification.
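A rough sketch of the three parts in SIL (the class C and the function
below are hypothetical, invented only to illustrate the chain):
```
class C { var prop: Int64 }

sil @read_prop : $@convention(thin) (@guaranteed C) -> Builtin.Int64 {
bb0(%0 : $C):
  // object root: %0, the class instance
  %1 = ref_element_addr %0 : $C, #C.prop
  // formal access base: %1, the address of the stored property,
  // guarded here by a begin_access scope
  %2 = begin_access [read] [dynamic] %1 : $*Int64
  // memory operation address: %3, the subobject address actually loaded
  %3 = struct_element_addr %2 : $*Int64, #Int64._value
  %4 = load %3 : $*Builtin.Int64
  end_access %2 : $*Int64
  return %4 : $Builtin.Int64
}
```
AccessPath records the base together with the trail of subobject
projections taken from it, so both %2 and %3 map back to the same
identified storage.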
With that verification, we now have all the properties we need for the
kind of analysis required for exclusivity enforcement, but now
generalized for any memory analysis. This is suitable for an extremely
lightweight analysis with no side data structures. We currently have a
massive amount of ad-hoc memory analysis throughout SIL, which is
incredibly unmaintainable, bug-prone, and not performance-robust. We
can begin taking advantage of this verifiably complete model to solve
that problem.
The properties this gives us are:
Access analysis must be complete over memory operations: every memory
operation needs a recognizable valid access. An access can be
unidentified only to the extent that it is rooted in some non-address
type and we can prove that it is at least *not* part of an access to a
nominal class or global property. Pointer provenance is also required
for future IRGen-level bitfield optimizations.
Access analysis must be complete over address users: for an identified
object root, all memory accesses, including subobjects, must be
discoverable.
Access analysis must be symmetric: use-def and def-use analysis must
be consistent.
AccessPath is merely a wrapper around the existing accessed-storage
utilities and IndexTrieNode. Existing passes already use this approach
very successfully, but in an ad-hoc way. With a general utility we
can:
- update passes to use this approach to identify memory access,
reducing the space and time complexity of those algorithms.
- implement an inexpensive on-the-fly, debug mode address lifetime analysis
- implement a lightweight debug mode alias analysis
- ultimately improve the power, efficiency, and maintainability of
full alias analysis
- make our type-based alias analysis sensitive to the access path (see
the sketch below)
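For instance, in this hypothetical sketch (the struct Pair and the
values %base, %x, %y are made up), two stores share a base but follow
disjoint access paths, so a path-sensitive analysis can prove they do
not alias even though their types are identical:
```
struct Pair { var a: Int64; var b: Int64 }

// %base : $*Pair
%a = struct_element_addr %base : $*Pair, #Pair.a
%b = struct_element_addr %base : $*Pair, #Pair.b
store %x to %a : $*Int64   // path: base + .a
store %y to %b : $*Int64   // path: base + .b
```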
The specific problem here is this:
```
sil @builtin_array_opt_index_raw_pointer_to_index_addr_no_crash : $@convention(thin) (Builtin.RawPointer, Builtin.Word) -> Builtin.Word {
bb0(%0 : $Builtin.RawPointer, %1 : $Builtin.Word):
%2 = index_raw_pointer %0 : $Builtin.RawPointer, %1 : $Builtin.Word
%3 = pointer_to_address %2 : $Builtin.RawPointer to [strict] $*Builtin.Word
%4 = load %3 : $*Builtin.Word
return %4 : $Builtin.Word
}
```
Before this commit, we unconditionally called getDefiningInstruction when
looking at %1. Since %1 is a block argument, this returned a null value,
causing the compiler to crash in a dyn_cast<TupleExtractInst>().
rdar://59138369
TLDR: This patch introduces a new kind of builtin, a "polymorphic builtin". One
calls it like any other builtin, e.g.:
```
Builtin.generic_add(x, y)
```
but it has a contract: it must be specialized to a concrete builtin by the time
we hit Lowered SIL. In this commit, I add support for the following generic
operations:
Type            | Op
----------------------------
FloatOrVector   | FAdd
FloatOrVector   | FDiv
FloatOrVector   | FMul
FloatOrVector   | FRem
FloatOrVector   | FSub
IntegerOrVector | AShr
IntegerOrVector | Add
IntegerOrVector | And
IntegerOrVector | ExactSDiv
IntegerOrVector | ExactUDiv
IntegerOrVector | LShr
IntegerOrVector | Mul
IntegerOrVector | Or
IntegerOrVector | SDiv
IntegerOrVector | SRem
IntegerOrVector | Shl
IntegerOrVector | Sub
IntegerOrVector | UDiv
IntegerOrVector | Xor
Integer         | URem
NOTE: I only implemented support for the builtins in SIL and in SILGen. I am
going to implement the optimizer parts of this in a separate series of commits.
DISCUSSION
----------
Today there are polymorphic instructions in LLVM IR. Yet, at the
Swift and SIL level we represent these operations instead as Builtins whose
names are resolved by splatting the concrete type into the builtin name. For
example, adding two things in LLVM:
```
%2 = add i64 %0, %1
%2 = add <2 x i64> %0, %1
%2 = add <4 x i64> %0, %1
%2 = add <8 x i64> %0, %1
```
Each of these add operations is performed by the same polymorphic instruction. In
contrast, we splat out these Builtins in Swift today, e.g.:
```
let x, y: Builtin.Int32
Builtin.add_Int32(x, y)
let x, y: Builtin.Vec4xInt32
Builtin.add_Vec4xInt32(x, y)
...
```
In SIL, we translate these verbatim and then IRGen just lowers them to the
appropriate polymorphic instruction. Beyond being verbose, this prevents these
Builtins (which need static types) from being used in polymorphic contexts where
we can guarantee that eventually a static type will be provided.
In contrast, the polymorphic builtins introduced in this commit can be passed
any type, with the proviso that the expert user using this feature can guarantee
that before we reach Lowered SIL, the generic_add has been eliminated. This is
enforced by IRGen asserting if passed such a builtin and by the SILVerifier
checking that the underlying builtin is never called once the module is in
Lowered SIL.
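For illustration only (this is a sketch of the contract rather than
output from this patch): once specialization has happened, all that
remains in Lowered SIL is an ordinary concrete builtin, e.g. for two
Builtin.Int64 values:
```
// No "generic_add" survives to Lowered SIL; IRGen only ever sees the
// concrete, type-suffixed form.
%2 = builtin "add_Int64"(%0 : $Builtin.Int64, %1 : $Builtin.Int64) : $Builtin.Int64
```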
In forthcoming commits, I am going to add two optimizations that give the stdlib
writer the tools needed to use these builtins:
1. I am going to add an optimization to constant propagation that changes a
"generic_*" op into the concrete builtin for the type of its argument, if that
type is valid for the builtin (i.e. integer or vector).
2. I am going to teach the SILCloner how to specialize these as it inlines. This
ensures that when we perform transparent inlining, we specialize the builtin
automatically and can then form SSA at -Onone using predictable memory access
operations.
The main implication of these polymorphic builtins is that if an author is
not able to specialize the builtin, they need to ensure that after constant
propagation, the generic builtin has been DCEed. The general rule is that the
-Onone optimizer will constant fold branches with constant integer operands. So
if one can use a bool of some sort to guard the operation, one can be
guaranteed that the unspecialized builtin will never reach code generation. I am
considering putting in some sort of diagnostic to ensure that the stdlib writer
has a good experience (e.g. get an error instead of crashing the compiler).
NOTE:
1. To test this I changed UnaryOp_match to use this under the hood.
2. These m_##ID##Inst matchers will now only accept compound types, and
I added a static assert to verify that this mistake doesn't happen. We
previously had matchers that would take an int or the like to match tuple
extract patterns. I converted those to use TupleExtractOperation, which also
properly handles destructures.
With the advent of dynamic_function_ref, the actual callee of such a ref
may vary. Optimizations should not assume they know the contents of a
function referenced by dynamic_function_ref. Introduce
getReferencedFunctionOrNull, which will return null for such function
refs, and getInitialReferencedFunction, which returns the initially
referenced function.
Use as appropriate.
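A small sketch of the distinction (the function @impl is hypothetical
and assumed to be dynamically replaceable):
```
// The callee may be replaced at runtime, so optimizations must not peek
// inside @impl. getReferencedFunctionOrNull returns null for %0;
// getInitialReferencedFunction still returns @impl.
%0 = dynamic_function_ref @impl : $@convention(thin) () -> ()
%1 = apply %0() : $@convention(thin) () -> ()
```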
rdar://50959798
I also ported the constant_propagation.sil tests over for ownership and updated
a few parts of the cast optimizer so that those tests pass with and without
ownership. I purposely only updated the parts of the cast optimizer that crashed
with ownership in the relevant test so that I could add new SIL code coverage
for those uncovered code paths.
* Remove apparently obsolete builtin functions.
- Remove s_to_u_checked_conversion and u_to_s_checked_conversion functions from builtin AST parsing, SIL/IR generation and from SIL optimisations.
* Remove apparently obsolete builtin functions - unit tests.
- Remove unit tests for SIL transformations relating to s_to_u_checked_conversion and u_to_s_checked_conversion builtin functions.
* Reduce array abstraction on Apple platforms dealing with literals
Part of the ongoing quest to reduce Swift array literal abstraction
penalties: make the SIL optimizer able to eliminate bridging overhead
when dealing with array literals.
Introduce a new classify_bridge_object SIL instruction to handle the
logic of extracting platform-specific bits from a Builtin.BridgeObject
value that indicate whether it contains an ObjC tagged pointer object
or a normal ObjC object. This allows the SIL optimizer to eliminate
these operations, which allows constant folding a ton of code. On the
example added to test/SILOptimizer/static_arrays.swift, this results in
4x less SIL code, and also leads to a lot more commonality between Linux
and Apple platform codegen when passing an array literal.
This also introduces a couple of SIL combines for patterns that occur
in the array literal passing case.
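A sketch of the new instruction (the surrounding values are made up and
the result is described only as I read it from the summary above):
```
// %0 : $Builtin.BridgeObject
// The result carries the platform-specific classification bits: whether
// %0 holds an ObjC tagged pointer or a normal ObjC object. For arrays
// built from literals these bits are statically known, so the optimizer
// can constant fold the instruction away.
%1 = classify_bridge_object %0 : $Builtin.BridgeObject
```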
This replaces the '[volatile]' flag. Now, class_method and
super_method are only used for vtable dispatch.
The witness_method instruction is still overloaded for use
with both ObjC protocol requirements and Swift protocol
requirements; the next step is to make it only mean the
latter, also using objc_method for ObjC protocol calls.
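A hedged sketch of the resulting split, for a hypothetical class Foo
with a method bar (the declaration details are illustrative only):
```
// Vtable dispatch only:
%1 = class_method %0 : $Foo, #Foo.bar : (Foo) -> () -> (), $@convention(method) (@guaranteed Foo) -> ()
// ObjC message send, previously spelled as class_method [volatile]:
%2 = objc_method %0 : $Foo, #Foo.bar!foreign : (Foo) -> () -> (), $@convention(objc_method) (Foo) -> ()
```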
Introduce a common superclass for SIL values and SIL instructions, SILNode.
This is in preparation for allowing instructions to have multiple
results. It is also a somewhat more elegant representation for
instructions that have zero results. Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction. Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.
A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.
Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
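For example (a sketch, assuming some address %addr and class instance
%obj):
```
// 'load' is a SingleValueInstruction, so binding one result is fine.
%1 = load %addr : $*Int64
// 'strong_retain' produces no results; the stricter parser rejects a
// form like '%2 = strong_retain ...', so it must be written bare.
strong_retain %obj : $C
```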
These instructions have the same semantics as the *ExistentialAddr instructions
but operate directly on the existential value, not its address.
This is in preparation for adding ExistentialBoxValue instructions.
The previous name would cause impossible confusion with "opaque existentials"
and "opaque existential boxes".
Replace `NameOfType foo = dyn_cast<NameOfType>(bar)` with DRY version `auto foo = dyn_cast<NameOfType>(bar)`.
The DRY auto version is by far the dominant form already used in the repo, so this PR merely brings the exceptional cases (redundant repetition form) in line with the dominant form (auto form).
See the [C++ Core Guidelines](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es11-use-auto-to-avoid-redundant-repetition-of-type-names) for a general discussion on why to use `auto` to avoid redundant repetition of type names.
We were giving special handling to ApplyInst when we were attempting to use
getMemoryBehavior(). This commit changes the special handling to work on all
full apply sites instead of just ApplyInst. Additionally, we look through
partial applies and thin-to-thick function conversions.
I also added a dumper called BasicInstructionPropertyDumper that just dumps the
results of SILInstruction::get{Memory,Releasing}Behavior() for all instructions
in order to verify this behavior.
There's a buggy SIL verifier check that was previously tautological,
and it turns out that it's violated, apparently harmlessly. Since it
was already doing nothing, I've commented it out temporarily while
I figure out the right way to fix SILGen to get the invariant right.
Remove the include of SILInstruction.h in a couple of places since it is
not needed and forces more recompilation than necessary when edits
are made to SILInstruction.h.
Swift SVN r32091
We need a SIL level unsafe cast that supports arbitrary usage of
UnsafePointer, generalizes Builtin.reinterpretCast, and has the same
semantics on generic vs. nongeneric code. In other words, we need to
be able to promote the cast of an address type to the cast of an
object type without changing semantics, and that cast needs to support
types that are not layout identical.
This patch introduces an unchecked_bitwise_cast instruction for that
purpose. It is different from unsafe_addr_cast, which has been our
fall-back "unknown" cast in the past. With unchecked_bitwise_cast we
cannot assume layout or RC identity. The cast implies a store and
reload of the value to obtain the low-order bytes. I know that
bit_cast is just an abbreviation for bitwise_cast, but we use
"bitcast" throughout to imply copying a same-sized value. No one could
come up with a better name for copying an object's low bytes via:
@addr = alloca $wideTy
store @addr, $wideTy
load @addr, $narrowTy
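Expressed as the actual instruction, that looks roughly like this (a
minimal sketch; the types are chosen only for illustration):
```
// Reinterpret the low-order bytes of %0; no layout or reference-counting
// identity is assumed between the source and destination types.
%1 = unchecked_bitwise_cast %0 : $Optional<AnyObject> to $Builtin.Word
```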
Followup patches will optimize unchecked_bitwise_cast into more
semantically useful unchecked casts when enough type information is
present. This way, the optimizer will rarely need to be taught about
the bitwise case.
Swift SVN r29510
For better consistency with other address-only instruction variants, and to open the door to new exciting existential representations (such as a refcounted boxed representation for ErrorType).
Swift SVN r25902