The old analysis pass doesn't take into account profile data, nor does
it consider post-dominance. It primarily dealt with _fastPath/_slowPath.
A block that is dominated by a cold block is itself cold. That's true
whether it's forwards or backwards dominance.
We can also consider a call to any `Never` returning function as a
cold-exit, though the block(s) leading up to that call may be executed
frequently because of concurrency. For now, I'm ignoring the concurrency
case and assuming it's cold. To make use of this "no return" prediction,
use the `-enable-noreturn-prediction` flag, which is currently off by
default.
Although I don't plan to bring over new assertions wholesale
into the current qualification branch, it's entirely possible
that various minor changes in main will use the new assertions;
having this basic support in the release branch will simplify that.
(This is why I'm adding the includes as a separate pass from
rewriting the individual assertions)
This is phase-1 of switching from llvm::Optional to std::optional in the
next rebranch. llvm::Optional was removed from upstream LLVM, so we need
to migrate off rather soon. On Darwin, std::optional, and llvm::Optional
have the same layout, so we don't need to be as concerned about ABI
beyond the name mangling. `llvm::Optional` is only returned from one
function in
```
getStandardTypeSubst(StringRef TypeName,
bool allowConcurrencyManglings);
```
It's the return value, so it should not impact the mangling of the
function, and the layout is the same as `std::optional`, so it should be
mostly okay. This function doesn't appear to have users, and the ABI was
already broken 2 years ago for concurrency and no one seemed to notice
so this should be "okay".
I'm doing the migration incrementally so that folks working on main can
cherry-pick back to the release/5.9 branch. Once 5.9 is done and locked
away, then we can go through and finish the replacement. Since `None`
and `Optional` show up in contexts where they are not `llvm::None` and
`llvm::Optional`, I'm preparing the work now by going through and
removing the namespace unwrapping and making the `llvm` namespace
explicit. This should make it fairly mechanical to go through and
replace llvm::Optional with std::optional, and llvm::None with
std::nullopt. It's also a change that can be brought onto the
release/5.9 with minimal impact. This should be an NFC change.
`getValue` -> `value`
`getValueOr` -> `value_or`
`hasValue` -> `has_value`
`map` -> `transform`
The old API will be deprecated in the rebranch.
To avoid merge conflicts, use the new API already in the main branch.
rdar://102362022
Saves a bunch of compile time when compiling the stdlib. Specifically, when
compiling the iOS 32 bit stdlib + sil-verify-all, this shaves off ~20% of the
compile time.
Additional handling of copy_value/destroy_value/load[copy]/
begin_borrow/end_borrow is needed to support OSSA.
TODO: Support handling of 2d array in the pass for OSSA.
Currently hoisting of loads and borrows are not supported in OSSA
to check for improperly nested '@_semantic' functions.
Add a missing @_semantics("array.init") in ArraySlice found by the
diagnostic.
Distinguish between array.init and array.init.empty.
Categorize the types of semantic functions by how they affect the
inliner and pass pipeline, and centralize this logic in
PerformanceInlinerUtils. The ultimate goal is to prevent inlining of
"Fundamental" @_semantics calls and @_effects calls until the late
pipeline where we can safely discard semantics. However, that requires
significant pipeline changes.
In the meantime, this change prevents the situation from getting worse
and makes the intention clear. However, it has no significant effect
on the pass pipeline and inliner.
Change ProjectionIndex for ref_tail_addr to std::numeric_limits<int>::max();
This is necessary to disambiguate the tail elements from
ref_element_addr field zero.
Use the new builtins for COW representation in Array, ContiguousArray and ArraySlice.
The basic idea is to strictly separate code which mutates an array buffer from code which reads from an array.
The concept is explained in more detail in docs/SIL.rst, section "Copy-on-Write Representation".
The main change is to use beginCOWMutation() instead of isUniquelyReferenced() and insert endCOWMutation() at the end of all mutating functions. Also, reading from the array buffer must be done differently, depending on if the buffer is in a mutable or immutable state.
All the required invariants are enforced by runtime checks - but only in an assert-build of the library: a bit in the buffer object side-table indicates if the buffer is mutable or not.
Along with the library changes, also two optimizations needed to be updated: COWArrayOpt and ObjectOutliner.
For COW support in SIL it's required to "finalize" array literals.
_finalizeUninitializedArray is a compiler known stdlib function which is called after all elements of an array literal are stored.
This runtime function marks the array literal as finished.
%uninitialized_result_tuple = apply %_allocateUninitializedArray(%count)
%mutable_array = tuple_extract %uninitialized_result_tuple, 0
%elem_base_address = tuple_extract %uninitialized_result_tuple, 1
...
store %elem_0 to %elem_addr_0
store %elem_1 to %elem_addr_1
...
%final_array = apply %_finalizeUninitializedArray(%mutable_array)
In this commit _finalizeUninitializedArray is still a no-op because the COW support is not used in the Array implementation yet.
Used to "finalize" an array literal. It's not used, yet. So this is NFC.
Also handle the "array.finalize_intrinsic" function in various array specific optimizations.
semantics attribute that is used by the top-level array initializer (in ArrayShared.swift),
which is the entry point used by the compiler to initialize array from array literals.
This initializer is early-inlined so that other optimizations can work on its body.
Fix DeadObjectElimination and ArrayCOWOpts optimization passes to work with this
semantics attribute in addition to "array.uninitialized", which they already use.
Refactor mapInitializationStores function from ArrayElementValuePropagation.cpp to
ArraySemantic.cpp so that the array-initialization pattern matching functionality
implemented by the function can be reused by other optimizations.
These are separate, mostly unrelated passes. Putting them in their own
files makes it easier to read the code, understand how to control the
passes, and makes it possible to independently trace, and debug them.
The XXOptUtils.h convention is already established and parallels
the SIL/XXUtils convention.
New:
- InstOptUtils.h
- CFGOptUtils.h
- BasicBlockOptUtils.h
- ValueLifetime.h
Removed:
- Local.h
- Two conflicting CFG.h files
This reorganization is helpful before I introduce more
utilities for block cloning similar to SinkAddressProjections.
Move the control flow utilies out of Local.h, which was an
unreadable, unprincipled mess. Rename it to InstOptUtils.h, and
confine it to small APIs for working with individual instructions.
These are the optimizer's additions to /SIL/InstUtils.h.
Rename CFG.h to CFGOptUtils.h and remove the one in /Analysis. Now
there is only SIL/CFG.h, resolving the naming conflict within the
swift project (this has always been a problem for source tools). Limit
this header to low-level APIs for working with branches and CFG edges.
Add BasicBlockOptUtils.h for block level transforms (it makes me sad
that I can't use BBOptUtils.h, but SIL already has
BasicBlockUtils.h). These are larger APIs for cloning or removing
whole blocks.
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances in the swift repo.
With the advent of dynamic_function_ref the actual callee of such a ref
my vary. Optimizations should not assume to know the content of a
function referenced by dynamic_function_ref. Introduce
getReferencedFunctionOrNull which will return null for such function
refs. And getInitialReferencedFunction to return the referenced
function.
Use as appropriate.
rdar://50959798
If a make_mutable operation is done conditionally in a loop, the hoisting of this operation can cause an over-release of the array buffer in some cases.
rdar://problem/48906146
The ownership kind is Any for trivial types, or Owned otherwise, but
whether a type is trivial or not will soon depend on the resilience
expansion.
This means that a SILModule now uniques two SILUndefs per type instead
of one, and serialization uses two distinct sentinel IDs for this
purpose as well.
For now, the resilience expansion is not actually used here, so this
change is NFC, other than changing the module format.
This is a bug fix that just bails out of FSO, which is exactly what we
should be doing in this case anyway.
This issue will be exposed in stdlib builds once I fix another bug in
the passmanager. Once the pass pipeline restart works as intended, we
will perform FSO on `F`, then devirtualization will discover a new
reference to `F`, causing it to be pushed back on the function pass
pipeline.
This disables a bunch of passes when ownership is enabled. This will allow me to
keep transparent functions in ossa and skip most of the performance pipeline without
being touched by passes that have not been updated for ownership.
This is important so that we can in -Onone code import transparent functions and
inline them into other ossa functions (you can't inline from ossa => non-ossa).
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
This is in preparation for verifying that when ownership verification is enabled
that only enums and trivial values can have any ownership. I am doing this in
preparation for eliminating ValueOwnershipKind::Trivial.
rdar://46294760
Mostly functionally neutral:
- may fix latent bugs.
- may reduce useless basic blocks after inlining.
This rewrite encapsulates the cloner's internal state, providing a
clean API for the CRTP subclasses. The subclasses are rewritten to use
the exposed API and extension points. This makes it much easier to
understand, work with, and extend SIL cloners, which are central to
many optimization passes. Basic SIL invariants are now clearly
expressed and enforced. There is no longer a intricate dance between
multiple levels of subclasses operating on underlying low-level data
structures. All of the logic needed to keep the original SIL in a
consistent state is contained within the SILCloner itself. Subclasses
only need to be responsible for their own modifications.
The immediate motiviation is to make CFG updates self-contained so
that SIL remains in a valid state. This will allow the removal of
critical edge splitting hacks and will allow general SIL utilities to
take advantage of the fact that we don't allow critical edges.
This rewrite establishes a simple principal that should be followed
everywhere: aside from the primitive mutation APIs on SIL data types,
each SIL utility is responsibile for leaving SIL in a valid state and
the logic for doing so should exist in one central location.
This includes, for example:
- Generating a valid CFG, splitting edges if needed.
- Returning a valid instruction iterator if any instructions are removed.
- Updating dominance.
- Updating SSA (block arguments).
(Dominance info and SSA properties are fundamental to SIL verification).
LoopInfo is also somewhat fundamental to SIL, and should generally be
updated, but it isn't required.
This also fixes some latent bugs related to iterator invalidation in
recursivelyDeleteTriviallyDeadInstructions and SILInliner. Note that
the SILModule deletion callback should be avoided. It can be useful as
a simple cache invalidation mechanism, but it is otherwise bug prone,
too limited to be very useful, and basically bad design. Utilities
that mutate should return a valid instruction iterator and provide
their own deletion callbacks.
With removing of pinning and with addressors, the pattern matching did not work anymore.
The good thing is that the SIL is now much simpler and we can handle the 2D case without pattern matching at all.
This removes a lot of code from COWArrayOpts.
rdar://problem/43863081
The optimizations now handle the ref_tail_addr instructions for detecting element addresses
(in addition to the array semantics function _getElementAddress).
After _modify for Array subscript lands, we can get rid of _getElementAddress at all.
SIL passes were violating the existing invariant on non-cond-br
critical edges in several places. I fixed the places that I could
find. Wherever there was a post-pass to "clean up" critical edges, I
replaced it with a a call to verification that the critical edges
aren't broken in the first place.
We still need to eliminate critical edges entirely before enabling
ownership SIL.
This is a property of an instruction and should be a member
function of `SILInstruction` and not a free function in
`DebugUtils`. Discussed with Adrian.