When ownership is disabled this is NFC.
NOTE: In terms of testing, I am trying to get everything into place in
preparation for landing a complete ownership translation of the pmo tests. I
need to get enough in place for the full test translation to work. In the mean
time, I am going to start adding memaccess specific stuff in its own ownership
file.
The reason this is true is that an assign is an instruction in PMO parlance that
must destroy the value stored into memory at that location previously. So PMO
would need to be taught to ensure that said destroy is promoted. Consider the
following example:
%0 = alloc_stack $Foo
store %1 to [init] %0 : $*Foo
store %2 to [assign] %0 : $*Foo
destroy_addr %0 : $*Foo
dealloc_stack %0 : $*Foo
If PMO were to try to eliminate the alloc_stack as PMO is written today, one
would have:
destroy_value %2 : $Foo
That is clearly wrong.
PMO uses InitOrAssign for trivially typed things and Init/Assign for non-trivial
things, so I think this was an oversight from a long time ago. There is actually
no /real/ effect on the code today since after exploding the copy_addr, the
store will still be used to produce the right available value and since for
stores, init/assign/initorassign all result in allocations being removed. Once
though I change assign to not allow for allocation removal (the proper way to
model this), without this change, certain trivial allocations will no longer be
removed, harming perf as seen via the benchmarking run on the bots in #21918.
This models the semantics we are trying to have here more closely without
changing current output. Specifically, an assign in this case is supposed to
mean that the underlying value is overwritten and destroyed (via a ref count
decrement). This is obviously true with the non-init copy_addr form, i.e.:
%0 = alloc_stack $Builtin.NativeObject
%1 = alloc_stack $Builtin.NativeObject
...
copy_addr [take] %0 to %1 : $*Builtin.NativeObject
...
Notice how the instruction is actively going to destroy whatever is in %1. Lets
consider the same SIL after exploding the copy_addr.
%0 = alloc_stack $Builtin.NativeObject
%1 = alloc_stack $Builtin.NativeObject
...
%2 = load %0 : $*Builtin.NativeObject
%3 = load %1 : $*Builtin.NativeObject (1)
store %2 to %1 : $*Builtin.NativeObject
destroy_value %3 : $Builtin.NativeObject (2)
...
In this case, the store is actually acting like an initialization since the
destructive part of the assign is performed by (1), (2). So considering the
store to be an assign is incorrect since a store that is an assign /should/
destroy the underlying value itself.
In terms of the actual effect on the result of the pass today, Initialization
and Assign stores are treated the same when it comes to getting available values
from stores and (for stores) destroying allocations. So this should be an NFC
change, but gets us closer to the model where assigns are only used for things
that store into memory and destroy the underlying model directly.
I am removing these for the following reasons:
* PMO does not have any tests for these code paths. (1).
* PMO does not try to promote these loads (it explicitly pattern matches load,
copy_addr) or get available values from these (it explicitly pattern matches
store or explodes a copy_addr to get the copy_addr's stores). This means that
removing this code will not effect our constant propagation diagnostics. So,
removing this untested code path at worst could cause us to no longer
eliminate some dead objects that we otherwise would be able to eliminate at
-Onone (low-priority). (2).
----
(1). I believe that the lack of PMO tests is due to this being a vestigal
remnant of DI code in PMO. My suspicion arises since:
* The code was added when the two passes were both sharing the same use
collector and auxillary data structures. Since then I have changed DI/PMO
to each have their own copies.
* DI has a bunch of tests that verify behavior around these instructions.
(2). I expect the number of actually removed allocations that are no longer
removed should be small since we do not promote loads from such allocations
and PMO will not eliminate an allocation that has any loads.
PartialStore is a PMOUseKind that is a vestigal remnant of Definite Init in the
PMO source. This can be seen by noting that in Definite Init, PartialStore is
how Definite Init diagnoses partially initialized values and errors. In contrast
in PMO the semantics of PartialStore are:
1. It can only be produced if we have a raw store use or a copy_addr.
2. We allow for the use to provide an available value just like if it was an
assign or an init.
3. We ignore it for the purposes of removing store only allocations since by
itself without ownership, stores (and stores from exploded copy_addr) do not
effect ownership in any way.
Rather than keeping this around, in this commit I review it since it doesn't
provide any additional value or [init] or [assign]. Functionally there should be
no change.
I discovered while updating PMO for ownership that for ~5 years there has been a
bug where we were treating copy_addr of trivial values like an "Assign" (in PMO
terminology) of a non-trivial value and thus stopping allocation
elimination. When I fixed this I discovered that this caused us to no longer
emit diagnostics in a predictable way. Specifically, consider the following
swift snippet:
var _: UInt = (-1) >> 0
Today, we emit a diagnostic that -1 can not be put into a UInt. This occurs
since even though the underlying allocation is only stored into, the copy_addr
assign keeps it alive, causing the diagnostics pass to see the conversion. With
my fix though, we see that we are only storing into the allocation, causing the
allocation to be eliminated before the constant propagation diagnostic pass
runs, causing the diagnostic to no longer be emitted.
We should truly not be performing this type of DCE before we emit such
diagnostics. So in this commit, I split the pass into two parts:
1. A load promotion pass that performs the SSA formation needed for SSA based
diagnostics to actually work.
2. An allocation elimination passes that run /after/ SSA based diagnostics.
This should be NFC since the constant propagation SSA based diagnostics do not
create memory operations so the output should be the same.
The main contribution of this commit is that I refactored out a helper class for
managing the data used when promoting destroy_addr. This enabled me to make a
getData method on the helper class to simplify getting iterating over a
destroy_addr and its available values.
TLDR: This does not eliminate the struct/tuple flat namespace from Predictable
Mem Opts. Just the tuple specific flat namespace code from PMOMemoryUseCollector
that we were computing and then throwing away. I explain below in more detail.
First note that this is cruft from when def-init and pmo were one pass. What we
were doing here was maintaing a flattened tuple namespace while we were
collecting uses in PMOMemoryUseCollector. We never actually used them for
anything since we recomputed this information including information about
structs in PMO itself! So this information was truly completely dead.
This commit removes that and related logic and from a maintenance standpoint
makes PMOMemoryUseCollector a simple visitor that doesn't have any real special
logic in it beyond the tuple scalarization.
Specifically, we are putting dealloc_stack, destroy_box into the Releases array
in PMOMemoryUseCollector only to ignore them in the only place that we use the
Releases array in PredictableMemOpts.
These are vestigal remnants of the code when it needed to support DI and
PredMemOpts. Since both of the passes have been split, these are now dead in
PMO. So eliminate them.
This is in preparation for verifying that when ownership verification is enabled
that only enums and trivial values can have any ownership. I am doing this in
preparation for eliminating ValueOwnershipKind::Trivial.
rdar://46294760
This makes the use of the used attribute more portable.
`__attribute__((__used__))` is not accepted by Visual Studio, but LLVM
conveniently provides the helpful macro `LLVM_ATTRIBUTE_USED` to enable this.
I am going to DI and predictable mem opts have been split for a long time and
their subroutines aren't going to be joined in the future... so replace the DI
prefixes in pred-mem-opts impl with PMO and rename DIMemoryUseCollector =>
PMOUseCollector.
Been sitting on this for a long time... just happy to get it in.
Until the beginning of the ownership transition, DI and predictable mem opts
used the same memory use collector. I split them partially since I need to turn
on ownership for predictable mem opts at one time, but also b/c there was a huge
amount of special code that would only trigger if it was used by DI or used by
predictable mem opts. After I did the copy some of the asserts that were needed
for DI remained in the predictable mem opts code. When pred-memopts was only run
in the mandatory pipeline keeping these assertions were ok, but pred-memopts was
recently added to the perf pipeline meaning that it may see code that breaks
these DI invariants (and thus hit this assertion).
We should remove this limitation on predictable-memopts but that would require
some scheduled time to read the code (more than I have to fix this bug = p). So
instead I changed the code to just bail in these cases.
rdar://40032102
This mandatory pass inserts struct_extract operations before earlier
stores to the aggregate but didn't set the debug location of those new
instructions to the location of the store, which caused unexpected
stepping behavior in the debugger.
Fixes a regression from e74367f2b3.
<rdar://problem/35459092>
This simplifies the code and enables me to extract out the code for deleting
dead allocations into a separate struct from AllocOptimize. Otherwise, I would
have to pass in a reference to AllocOptimize creating back references, a
complexity that isn't necessary and will force me to break encapsulation
properties.
Just chopping this off of a larger commit.
rdar://31521023
Previously, we just always promoted destroy_addr. With ownership, this does not
work as well since we need to be able to reason about the take operation that we
are performing. In the general case, this would require adding code to
pred-memopts for tracking reads and for compensation store code since we would
need to eliminate the store of the taken value to prevent a double use.
I am going to loop back around later in the year and add back this code once the
time is available. I filed SR-6341. In case any third party contributor is
interested in looking at re-enabling this optimization before I get back to it.
rdar://31521023
Previously in PredMemOpts, we would insert any extracts at the load site, i.e.:
store %foo to %0
...
%1 = struct_element_addr %0, $Type, $Type.field
%2 = load %1
...
apply %use(%2)
would transform to:
store %foo to %0
...
%2 = struct_extract %foo
apply %use(%2)
This framework will not work with Ownership enabled since the value stored is
considered consumed by the store. This change fixes the issue by moving such
non-destructive extracts to occur while %foo is considered live, i.e. before the
store:
%2 = struct_extract %foo
store %foo to %0
...
apply %use(%2)
This means that we have to store insertion points for each store that provides
us with available values and insert the extracts at those points. This creates
some complications in the case where we have multiple stores since we need to
deal with phi nodes. Rather than dealing with it by hand, we just insert the
extracts at each point and then use the SSA updater to insert the relevant phi
nodes.
rdar://31521023
For pred-memopt to work with ownership, we need to insert instructions in two
different places:
1. When loading not-available values, we must insert the load at the site of the
load we are trying to promote.
2. When propagating an available value, we must destructure and or copy before
each one of the stores that provide our value.
By changing AvailableValue to be a struct, I am providing a cleaner API, but
more importantly the ability to start tracking that information.
rdar://31521023
In non-ownership SIL, PMOpt did not need to distinguish in between destructive
and non-destructive sub element extraction. This allowed PMOpt to just use the
"non-destructive" sub element (i.e. extractSubElement) everywhere. In
preparation for introducing this distinction (and a destructive sub element
extractor), rename extractSubElement to be nonDestructivelyExtractSubElement for
clarity.
NFC.
rdar://31521023
AssignInst is eliminated by DI so we should /never/ see any AssignInst once DI
runs. So this code is dead.
Just shaved off of a larger commit to make it easier to review the larger
commit.
rdar://31521023
introduce a common superclass, SILNode.
This is in preparation for allowing instructions to have multiple
results. It is also a somewhat more elegant representation for
instructions that have zero results. Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction. Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.
A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.
Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
At some point, pass definitions were heavily macro-ized. Pass
descriptive names were added in two places. This is not only redundant
but a source of confusion. You could waste a lot of time grepping for
the wrong string. I removed all the getName() overrides which, at
around 90 passes, was a fairly significant amount of code bloat.
Any pass that we want to be able to invoke by name from a tool
(sil-opt) or pipeline plan *should* have unique type name, enum value,
commend-line string, and name string. I removed a comment about the
various inliner passes that contradicted that.
Side note: We should be consistent with the policy that a pass is
identified by its type. We have a couple passes, LICM and CSE, which
currently violate that convention.
Handling address_to_pointer as a plain inout missed some mutations and lead to miscompiles.
We now treat address_to_pointer as escaping address.
Fixes SR-3554
Applying nontrivial generic arguments to a nontrivial SIL layout requires lowered SILType substitution, which requires a SILModule. NFC yet, just an API change.