Specifically:
1. I made methods and variables camelCase.
2. I expanded variable names (e.g., bb -> block, predBB -> predBlocks, U -> wrappedUse).
3. I changed typedef -> using.
4. I changed a few C-style for loops into range-based loops using llvm::enumerate (see the sketch below).
NOTE: I left the parts needed for syncing to LLVM in the old style since LLVM
needs these to exist for CRTP to work correctly for the SILSSAUpdater.
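For reference, a minimal sketch of the llvm::enumerate pattern from point 4 (llvm::enumerate lives in llvm/ADT/STLExtras.h; the block/predecessor names are illustrative):

for (auto pair : llvm::enumerate(block->getPredecessorBlocks())) {
  unsigned index = pair.index();           // element position
  SILBasicBlock *predBlock = pair.value(); // the element itself
  // ... use index and predBlock instead of a manual counter ...
}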
Add isUniquelyIdentifiedAfterEnforcement for use outside the access enforcement passes.
Rename functions for clarity and generality.
Rename isUniquelyIdentifiedOrClass to isFormalAccessBase.
Rename findAccessedStorage to identifyFormalAccess.
Rename findAccessedStorageNonNested to findAccessedStorage.
Part of generalizing the utility for use outside the access
enforcement passes.
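A hedged sketch of how the renamed entry points might compose; treating these predicates as AccessedStorage members is my assumption, and the exact signatures in MemAccessUtils may differ:

// findAccessedStorage (formerly findAccessedStorageNonNested) walks to
// the base of a formally accessed address.
AccessedStorage storage = findAccessedStorage(address);
// Assumption: the predicate is a member of AccessedStorage.
if (storage && storage.isUniquelyIdentifiedAfterEnforcement()) {
  // The storage cannot be reached through an alias once exclusivity
  // enforcement has run.
}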
Use the new builtins for COW representation in Array, ContiguousArray and ArraySlice.
The basic idea is to strictly separate code which mutates an array buffer from code which reads from an array.
The concept is explained in more detail in docs/SIL.rst, section "Copy-on-Write Representation".
The main change is to use beginCOWMutation() instead of isUniquelyReferenced() and to insert endCOWMutation() at the end of all mutating functions. Also, reading from the array buffer must be done differently, depending on whether the buffer is in a mutable or immutable state.
All the required invariants are enforced by runtime checks - but only in an assert-build of the library: a bit in the buffer object's side table indicates whether the buffer is mutable.
Along with the library changes, two optimizations also needed to be updated: COWArrayOpt and ObjectOutliner.
For COW support in SIL, array literals must be "finalized".
_finalizeUninitializedArray is a compiler-known stdlib function which is called after all elements of an array literal are stored.
This runtime function marks the array literal as finished.
%uninitialized_result_tuple = apply %_allocateUninitializedArray(%count)
%mutable_array = tuple_extract %uninitialized_result_tuple, 0
%elem_base_address = tuple_extract %uninitialized_result_tuple, 1
...
store %elem_0 to %elem_addr_0
store %elem_1 to %elem_addr_1
...
%final_array = apply %_finalizeUninitializedArray(%mutable_array)
In this commit _finalizeUninitializedArray is still a no-op because the COW support is not used in the Array implementation yet.
Used to "finalize" an array literal. It's not used, yet. So this is NFC.
Also handle the "array.finalize_intrinsic" function in various array specific optimizations.
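A hedged sketch of how an optimization can recognize the intrinsic by its semantics attribute (the surrounding code varies per pass):

if (SILFunction *callee = apply->getReferencedFunctionOrNull()) {
  if (callee->hasSemanticsAttr("array.finalize_intrinsic")) {
    // The call takes the mutable array and returns the finalized array,
    // so passes can look through it to the underlying array value.
    SILValue mutableArray = apply->getArgument(0);
    // ...
  }
}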
This simplifies the handling of the subdirectories in the SIL and
SILOptimizer paths. Create individual libraries as object libraries,
which allows the analysis of source changes to be limited in scope.
Because these are object libraries, this has zero overhead compared to
the previous implementation, and string operations over the filenames
are avoided. The cost is that any new sub-library needs to be added to
the list explicitly rather than registered with the special local
function.
Even if a store does not dominate the loop exits, it makes sense to move it out of the loop if the pre-header also has a store to the same memory location.
When this is done, dead-store-elimination can then most likely remove the store in the pre-header.
This loop optimization hoists and sinks a group of loads and stores to
the same address.
Consider this SIL...
PRELOOP:
%stackAddr = alloc_stack $Index
%outerAddr1 = struct_element_addr %stackAddr : $*Index, #Index.value
%innerAddr1 = struct_element_addr %outerAddr1 : $*Int, #Int._value
%outerAddr2 = struct_element_addr %stackAddr : $*Index, #Index.value
%innerAddr2 = struct_element_addr %outerAddr2 : $*Int, #Int._value
LOOP:
%_ = load %innerAddr2 : $*Builtin.Int64
store %_ to %outerAddr2 : $*Int
%_ = load %innerAddr1 : $*Builtin.Int64
There are two bugs:
1) LICM miscompiles code during combined load/store hoisting and sinking.
When the loop contains an aliasing load from a different projection
of the same address, the optimization sinks the store but never
replaces the load. At runtime, the load reads a stale value.
FIX: isOnlyLoadedAndStored needs to check for other load instructions
before hoisting/sinking a seemingly unrelated set of loads/stores
(see the first sketch after this list). Checking side-effect
instructions is insufficient; the same bug could happen with stores,
which also do not produce side effects.
Fixes <rdar://61246061> LICM miscompile:
Combined load/store hoisting/sinking with aliases
2) The LICM algorithm is not robust with respect to address projections
because it identifies a projected address by its SILValue. This
should never be done! It is trivial to represent a projection path
using an IndexTrieNode (there is also an abstraction called
"ProjectionPath", but it should _never_ actually be stored by an
analysis because of the time and space complexity of doing so); see
the second sketch below.
The second bug does not need to be fixed for correctness, so it will
be fixed in a follow-up commit.
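A hedged sketch of the strengthened check from fix 1 (variable and analysis names approximate):

// isOnlyLoadedAndStored must reject the transformation if any other
// load or store in the loop may alias the address being hoisted/sunk.
for (SILInstruction *inst : loopInstructions) {
  if (auto *load = dyn_cast<LoadInst>(inst)) {
    if (load->getOperand() != addr &&
        !aliasAnalysis->isNoAlias(load->getOperand(), addr))
      return false; // aliasing load from a different projection
  }
  // An analogous check is needed for stores.
}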
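And a hedged sketch of the projection-path representation suggested in 2), using IndexTrieNode (accessor names approximate; other projection kinds elided):

// Map a projected address to a path in an index trie rooted at its
// base address.
IndexTrieNode *getProjectionPath(SILValue addr, IndexTrieNode *root) {
  if (auto *sea = dyn_cast<StructElementAddrInst>(addr))
    return getProjectionPath(sea->getOperand(), root)
               ->getChild(sea->getFieldIndex());
  return root; // reached the base address
}

With this, %innerAddr1 and %innerAddr2 from the SIL above map to the same trie node even though they are distinct SILValues.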
This avoids significant code size regressions if the loop block to be duplicated is very large.
Also remove the ShouldRotate option. It's no longer needed, since the pass can be disabled with -sil-disable-pass.
rdar://problem/33970453
Global initializers are executed only once.
Therefore it's possible to hoist such an initializer call to a loop pre-header, provided there are no conflicting side effects in the loop before the call.
Also, the call must post-dominate the loop pre-header. Otherwise it would be executed speculatively.
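A hedged sketch of both legality conditions (helper names hypothetical):

bool canHoistGlobalInitCall(ApplyInst *initCall, SILLoop *loop,
                            PostDominanceInfo *postDomTree) {
  SILBasicBlock *preheader = loop->getLoopPreheader();
  // The call must post-dominate the pre-header; otherwise hoisting it
  // would execute the initializer speculatively.
  if (!postDomTree->dominates(initCall->getParent(), preheader))
    return false;
  // There must be no conflicting side effects in the loop before the
  // call.
  return !mayConflictBeforeCall(initCall, loop); // hypothetical check
}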
calls over arrays created from array literals. This enables further
optimization of the output of the OSLogOptimization pass, and results
in highly compact and optimized IR for calls to the new os log API.
<rdar://58928427>
semantics attribute that is used by the top-level array initializer (in ArrayShared.swift),
which is the entry point used by the compiler to initialize arrays from array literals.
This initializer is early-inlined so that other optimizations can work on its body.
Fix DeadObjectElimination and ArrayCOWOpts optimization passes to work with this
semantics attribute in addition to "array.uninitialized", which they already use.
Refactor mapInitializationStores function from ArrayElementValuePropagation.cpp to
ArraySemantic.cpp so that the array-initialization pattern matching functionality
implemented by the function can be reused by other optimizations.
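A hedged sketch of how another pass can reuse the refactored matcher (constructor and enumerator names approximate):

ArraySemanticsCall arrayCall(applyInst);
if (arrayCall.getKind() == ArrayCallKind::kArrayUninitialized ||
    arrayCall.getKind() == ArrayCallKind::kArrayUninitializedIntrinsic) {
  // ... use the shared mapInitializationStores to collect the element
  // stores that initialize the literal ...
}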
* Add profitability check to array-property-opt
This change adds additional heuristics to array-property-opt to avoid
code size increase in cases where the optimization may not be
beneficial. With this change, we do not specialize a loop if it
contains any opaque function calls or its instruction count exceeds a
predefined threshold.
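A hedged sketch of the heuristic (the threshold name is hypothetical):

bool shouldSpecializeLoop(SILLoop *loop) {
  unsigned instCount = 0;
  for (SILBasicBlock *block : loop->getBlocks()) {
    for (SILInstruction &inst : *block) {
      if (auto *apply = dyn_cast<ApplyInst>(&inst))
        if (!apply->getReferencedFunctionOrNull())
          return false; // opaque call: unknown benefit, certain size cost
      if (++instCount > SpecializationSizeThreshold) // hypothetical limit
        return false;
    }
  }
  return true;
}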
and eliminate dead code. This is meant to be a replacement for the utility:
recursivelyDeleteTriviallyDeadInstructions. The new utility performs more aggressive
dead-code elimination for ownership SIL.
This patch also migrates most non-force-delete uses of
recursivelyDeleteTriviallyDeadInstructions to the new utility.
and migrates one force-delete use of recursivelyDeleteTriviallyDeadInstructions
(in IRGenPrepare) to use the new utility.
These are separate, mostly unrelated passes. Putting them in their own
files makes it easier to read the code, understand how to control the
passes, and makes it possible to independently trace and debug them.
In SILInstruction::isTriviallyDuplicatable():
- Make deallocating instructions trivially duplicatable. They are by
any useful definition: duplicating an instruction does not imply
reordering it. Tail duplication was already treating deallocations
as duplicatable, but did so inconsistently: sometimes it checked
isTriviallyDuplicatable and sometimes it didn't, which appears to
have been an accident. Disallowing duplication of deallocations would
cause severe performance regressions. Instead, consistently allow
them to be duplicated, making tail duplication more powerful, which
could expose other bugs.
- Do not duplicate on-stack AllocRefInst (without special
consideration). This is a correctness fix for a bug that apparently
was never exposed.
Fix SILLoop::canDuplicate():
- Handle isDeallocatingStack. It's not clear how we were avoiding an
assertion before when a stack-allocatable reference was confined to
a loop; probably just by luck.
- Handle begin/end_access inside a loop. This is extremely important
and probably prevented many loop optimizations from working with
exclusivity.
Update LoopRotate canDuplicateOrMoveToPreheader(). This is NFC.
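A hedged sketch of the begin/end_access handling in SILLoop::canDuplicate (the actual code may differ):

// Inside the per-instruction legality check: a begin_access can only
// be duplicated if all matching end_access instructions are also in
// the loop; otherwise cloning would unbalance the access scope.
if (auto *beginAccess = dyn_cast<BeginAccessInst>(&inst)) {
  for (auto *endAccess : beginAccess->getEndAccesses())
    if (!loop->contains(endAccess->getParent()))
      return false;
}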
This is a combination of load hoisting and store sinking, e.g.
preheader:
br header_block
header_block:
%x = load %not_aliased_addr
// use %x and define %y
store %y to %not_aliased_addr
...
exit_block:
is transformed to:
preheader:
%x = load %not_aliased_addr
br header_block
header_block:
// use %x and define %y
...
exit_block:
store %y to %not_aliased_addr
Requested by gottesmm during review.
Update the variable naming conventions to lower-camel.
Run clang-format.
I'm sure I missed some local variables somewhere; this was a best
effort.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.
New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h
Removed:
- Local.h
- Two conflicting CFG.h files
This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.
Move the control flow utilities out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.
Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.
Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances in the swift repo.
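The replacement is mechanical, e.g. (the constructed type is illustrative):

// Before (LLVM's pre-C++14 shim from llvm/ADT/STLExtras.h):
auto result = llvm::make_unique<SILPassManager>(module);
// After (C++14 standard library, <memory>):
auto result = std::make_unique<SILPassManager>(module);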
NOTE:
1. To test this I changed UnaryOp_match to use this under the hood.
2. These m_##ID##Inst matchers will now only accept compound types, and
I added a static assert to verify that this mistake doesn't happen
again. We previously had matchers that would take an int or the like
to match tuple extract patterns. I converted those to use
TupleExtractOperation, which also properly handles destructures.
With the advent of dynamic_function_ref, the actual callee of such a
ref may vary. Optimizations should not assume they know the body of a
function referenced by dynamic_function_ref. Introduce
getReferencedFunctionOrNull, which returns null for such function
refs, and getInitialReferencedFunction, which returns the initially
referenced function.
Use as appropriate.
rdar://50959798
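A hedged sketch of the intended usage pattern:

// Prefer getReferencedFunctionOrNull so that dynamic_function_ref
// callees stay opaque.
if (SILFunction *callee = apply->getReferencedFunctionOrNull()) {
  // Safe: `callee` is the function that will actually be called.
} else {
  // A dynamic_function_ref (or no function_ref at all): assume nothing
  // about the callee's body.
}
// getInitialReferencedFunction still returns the statically referenced
// function when a client explicitly wants it.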