Currently PMO tries to scalarize an empty tuple and ends up deleting the
store of the empty tuple. This causes a cascade of problems: Mem2Reg can
end up seeing loads of the empty tuple type without any stores, and creates
an undef while optimizing the alloca away. This is bad; we don't want to
create undefs unnecessarily in SIL. An optimization can then query the undef
for its parent function or block, get back a nullptr, and crash the compiler.
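A minimal hand-reduced sketch of the failure pattern (not taken from the
original report):
```
%mem = alloc_stack $()
store %t to %mem          // PMO deletes this store while scalarizing
%0 = load %mem            // Mem2Reg then sees a load with no store and
                          // replaces %0 with undef as it removes %mem
dealloc_stack %mem
```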
Fixes rdar://94829482
This commit fixes two things:
1. In certain cases, either SILGen or the optimizer eliminates a destroy_addr
along paths where we know that an enum is dynamically trivial. This cannot be
expressed in OSSA, so I added code to pred-deadalloc-elim to check whether any
of our available values, after we finish promoting away an allocation, now
need to have their consuming use sets completed.
2. That led me to discover that in certain cases a load [take] that we were
promoting was an available value of another load [take]. This means that we
have a memory safety issue if we promote one load before the other. Consider
the following SIL:
```
%mem = alloc_stack
store %arg to [init] %mem
%0 = load [take] %mem
store %0 to [init] %mem
%1 = load [take] %mem
destroy_value %1
dealloc_stack %mem
```
In this case, if we eliminate %0 before we eliminate %1, we will have a stale
pointer to %0.
I also took this as an opportunity to turn off predictable mem access opts on
deserialized, canonicalized, non-OSSA SIL. We evidently still need to run
predictable mem opts there for perf reasons (not sure why), but I am pretty
sure this one isn't needed, and turning it off lets me avoid some nasty code.
I discovered while updating PMO for ownership that for ~5 years there has been a
bug where we were treating copy_addr of trivial values like an "Assign" (in PMO
terminology) of a non-trivial value and thus stopping allocation
elimination. When I fixed this I discovered that this caused us to no longer
emit diagnostics in a predictable way. Specifically, consider the following
swift snippet:
```
var _: UInt = (-1) >> 0
```
Today, we emit a diagnostic that -1 cannot be put into a UInt. This occurs
because even though the underlying allocation is only stored into, the
copy_addr "assign" keeps the allocation alive, letting the diagnostics pass
see the conversion. With my fix, we see that we are only storing into the
allocation, so the allocation is eliminated before the constant propagation
diagnostics pass runs and the diagnostic is no longer emitted.
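A hand-written sketch of the pattern (value names hypothetical):
```
%mem = alloc_stack $UInt
store %c to %mem          // %c: the folded result of (-1) >> 0
copy_addr %src to %mem    // a trivial copy_addr into %mem, previously
                          // misclassified as a non-trivial "Assign",
                          // which blocked eliminating %mem
dealloc_stack %mem
```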
We should truly not be performing this type of DCE before we emit such
diagnostics. So in this commit, I split the pass into two parts:
1. A load promotion pass that performs the SSA formation needed for SSA based
diagnostics to actually work.
2. An allocation elimination pass that runs /after/ SSA based diagnostics.
This should be NFC, since the constant propagation SSA based diagnostics do
not create memory operations, so the output should be the same.
Since:
1. We only handle alloc_stack and alloc_box in predictable memopts.
2. alloc_stack cannot be released.
we know that the release collecting in collectFrom can just be done in
collectContainerUses() [which only processes alloc_box].
This also let me simplify some code and add a defensive check in case we are,
for some reason, passed a release_value on the box. NOTE: I verified that
previously this did not result in a bug since we would consider the
release_value to be an escape of the underlying value even though we didn't
handle it in collectFrom. But the proper way to handle release_value is like
strong_release, so I added code to do that as well.
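A small hand-written sketch of the two release forms this covers (only one of
the two would appear in real SIL):
```
%box = alloc_box ${ var Int }
...
strong_release %box       // collected in collectContainerUses()
release_value %box        // previously treated as an escape; now handled
                          // the same way as strong_release
```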
This was never implemented correctly way back in 2013-2014. It was originally
added, I believe, so we could do DI checks, but the promotion part was never
added.
Given that DI is now completely split from PMO, we can just turn this off and if
necessary add it back on master "properly".
rdar://41161408
Until the beginning of the ownership transition, DI and predictable mem opts
used the same memory use collector. I split them partly because I need to turn
on ownership for predictable mem opts at some point, but also because there was
a huge amount of special code that would only trigger when the collector was
used by DI versus by predictable mem opts. After I made the copy, some of the
asserts that were needed for DI remained in the predictable mem opts code. When
pred-memopts only ran in the mandatory pipeline, keeping these assertions was
OK, but pred-memopts was recently added to the perf pipeline, meaning that it
may see code that breaks these DI invariants (and thus hits this assertion).
We should remove this limitation on predictable-memopts, but that would require
some scheduled time to read the code (more than I have to fix this bug =p). So
instead I changed the code to just bail in these cases.
rdar://40032102
I.e., for:
```
enum Indirect<T> {
  indirect case payload(first: T, second: T)
}
let _ = Indirect<X>
```
the payload's SIL box type will be:
```
$<t_0_0> { var (first: t_0_0, second: t_0_0) } <X>
```
rdar://36799330
Previously, we just always promoted destroy_addr. With ownership, this does
not work as well, since we need to be able to reason about the take operation
that we are performing. In the general case, this would require adding code to
pred-memopts for tracking reads and for compensating-store code, since we would
need to eliminate the store of the taken value to prevent a double use.
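A hand-written illustration of why the take needs this bookkeeping:
```
store %v to [init] %mem
destroy_addr %mem    // promoting this to 'destroy_value %v' is a take of %v,
                     // so the store above would also have to be deleted to
                     // avoid double-consuming %v
```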
I am going to loop back around later in the year and add back this code once
the time is available. I filed SR-6341 in case any third party contributor is
interested in looking at re-enabling this optimization before I get back to it.
rdar://31521023
Previously in PredMemOpts, we would insert any extracts at the load site, i.e.:
```
store %foo to %0
...
%1 = struct_element_addr %0, $Type, $Type.field
%2 = load %1
...
apply %use(%2)
```
would transform to:
```
store %foo to %0
...
%2 = struct_extract %foo
apply %use(%2)
```
This framework will not work with Ownership enabled since the value stored is
considered consumed by the store. This change fixes the issue by moving such
non-destructive extracts to occur while %foo is considered live, i.e. before the
store:
```
%2 = struct_extract %foo
store %foo to %0
...
apply %use(%2)
```
This means that we have to track insertion points for each store that provides
us with available values and insert the extracts at those points. This creates
some complications in the case where we have multiple stores, since we need to
deal with phi nodes. Rather than dealing with that by hand, we just insert the
extracts at each point and then use the SSA updater to insert the relevant phi
nodes.
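A hand-written sketch of the multiple-store case (the types $S and $Int are
hypothetical); the phi is a block argument, which the SSA updater materializes
for us:
```
bb1:
  %e1 = struct_extract %foo1 : $S, #S.field
  store %foo1 to %0 : $*S
  br bb3(%e1 : $Int)

bb2:
  %e2 = struct_extract %foo2 : $S, #S.field
  store %foo2 to %0 : $*S
  br bb3(%e2 : $Int)

bb3(%phi : $Int):
  apply %use(%phi) : $@convention(thin) (Int) -> ()
```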
rdar://31521023
This is to ensure that when I add the destructive code once ownership is
enabled, the non-ownership variant doesn't break. I have to move the rest of the
pipeline to -Onone at the same time. Once that is done, I will remove more code.
NFC.
rdar://31521023
AssignInst is eliminated by DI so we should /never/ see any AssignInst once DI
runs. So this code is dead.
Just shaved this off of a larger commit to make the larger commit easier to
review.
rdar://31521023
Introduce a common superclass, SILNode.
This is in preparation for allowing instructions to have multiple
results. It is also a somewhat more elegant representation for
instructions that have zero results. Instructions that are known
to have exactly one result inherit from a class, SingleValueInstruction,
that subclasses both ValueBase and SILInstruction. Some care must be
taken when working with SILNode pointers and testing for equality;
please see the comment on SILNode for more information.
A number of SIL passes needed to be updated in order to handle this
new distinction between SIL values and SIL instructions.
Note that the SIL parser is now stricter about not trying to assign
a result value from an instruction (like 'return' or 'strong_retain')
that does not produce any.
Handling address_to_pointer as a plain inout use missed some mutations and led
to miscompiles. We now treat address_to_pointer as an escaping address.
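For example (hand-written):
```
%p = address_to_pointer %addr : $*Int to $Builtin.RawPointer
// the raw pointer can be used to read and write the memory in ways that a
// plain inout classification does not see, so it must be treated as an
// escape of the address
```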
Fixes SR-3554
Use a syntax that declares the layout's generic parameters and fields,
followed by the generic arguments to apply to the layout:
```
{ var Int, let String }                  // a concrete box layout with a mutable
                                         // Int and an immutable String field

<T, U> { var T, let U } <Int, String>    // a generic box layout, applied to
                                         // Int and String arguments
```
- All parts of the compiler now use 'P1 & P2' syntax
- The demangler and AST printer wrap the composition in parens if it is
  in a metatype lookup
- IRGen mangles compositions differently
  - "protocol<>" is now "swift.Any"
  - "protocol<_TP1P,_TP1Q>" is now "_TP1P&_TP1Q"
- Test cases are updated and added to test the new syntax and mangling
Similarly to how we've always handled parameter types, we
now recursively expand tuples in result types and separately
determine a result convention for each result.
The most important code-generation change here is that
indirect results are now returned separately from each
other and from any direct results. It is generally far
better, when receiving an indirect result, to receive it
as an independent result; the caller is much more likely
to be able to directly receive the result in the address
they want to initialize, rather than having to receive it
in temporary memory and then copy parts of it into the
target.
The most important conceptual change here that clients and
producers of SIL must be aware of is the new distinction
between a SILFunctionType's *parameters* and its *argument
list*. The former is just the formal parameters, derived
purely from the parameter types of the original function;
indirect results are no longer in this list. The latter
includes the indirect result arguments; as always, all
the indirect results strictly precede the parameters.
Apply instructions and entry block arguments follow the
argument list, not the parameter list.
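For example, a call to a function with two indirect results and one direct
result might look like this (hand-written sketch; all names hypothetical):
```
%r0 = alloc_stack $Big
%r1 = alloc_stack $Big
// the two indirect result addresses come first, then the parameters
%d = apply %f(%r0, %r1, %x) : $@convention(thin) (Int) -> (@out Big, @out Big, Int)
```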
A relatively minor change is that there can now be multiple
direct results, each with its own result convention.
This is a minor change because I've chosen to leave
return instructions as taking a single operand and
apply instructions as producing a single result; when
the type describes multiple results, they are implicitly
bound up in a tuple. It might make sense to split these
up and allow e.g. return instructions to take a list
of operands; however, it's not clear what to do on the
caller side, and this would be a major change that can
be separated out from this already over-large patch.
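For example (hand-written):
```
// two direct results, implicitly bound up in a tuple
%t = apply %g() : $@convention(thin) () -> (Int, Float)
%i = tuple_extract %t : $(Int, Float), 0
%f = tuple_extract %t : $(Int, Float), 1
```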
Unsurprisingly, the most invasive changes here are in
SILGen; this requires substantial reworking of both call
emission and reabstraction. It also proved important
to switch several SILGen operations over to work with
RValue instead of ManagedValue, since otherwise they
would be forced to spuriously "implode" buffers.
Use project_box to get to the address value.
SILGen now generates a project_box for each alloc_box.
IRGen re-uses the address value from the alloc_box if the operand of the
project_box is an alloc_box.
This lets the generated code be the same as before.
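A minimal hand-written sketch (type annotations elided, as in the other
examples here; the exact textual syntax of the time may differ):
```
%box = alloc_box $Int         // no longer also returns the payload address
%addr = project_box %box      // SILGen now emits this to get the address
store %x to %addr
```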
Other than that, most changes of this (quite large) commit are straightforward.
Having a separate address and container value returned from alloc_stack is not really needed in SIL.
Even if they differ we have both addresses available during IRGen, because a dealloc_stack is always dominated by the corresponding alloc_stack in the same function.
Although this commit is quite large, most changes are trivial. The largest
non-trivial change is in IRGenSIL.
This commit is NFC regarding the generated code. Even the generated SIL is the
same (except for the removed #0, #1 and @local_storage).
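A hand-written before/after sketch:
```
// before: alloc_stack returned a container (#0) and an address (#1)
%mem = alloc_stack $Int
store %x to %mem#1
dealloc_stack %mem#0

// after: alloc_stack returns a single address value
%mem = alloc_stack $Int
store %x to %mem
dealloc_stack %mem
```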