Specifically before this PR, if a caller did not customize a specific callback
of InstModCallbacks, we would store a static default std::function into
InstModCallbacks. This means that we always would have an indirect jump. That is
unfortunate since this code is often called in loops.
In this PR, I eliminate this problem by:
1. I made all of the actual callback std::function in InstModCallback private
and gave them a "Func" postfix (e.x.: deleteInst -> deleteInstFunc).
2. I created public methods with the old callback names to actually call the
callbacks. This ensured that as long as we are not escaping callbacks from
InstModCallback, this PR would not result in the need for any source changes
since we are changing a call of a std::function field to a call to a method.
3. I changed all of the places that were escaping inst mod's callbacks to take
an InstModCallback. We shouldn't be doing that anyway.
4. I changed the default value of each callback in InstModCallbacks to be a
nullptr and changed the public helper methods to check if a callback is
null. If the callback is not null, it is called, otherwise the getter falls
back to an inline default implementation of the operation.
All together this means that the cost of a plain InstModCallback is reduced and
one pays an indirect function cost price as one customizes it further which is
better scalability.
P.S. as a little extra thing, I added a madeChange field onto the
InstModCallback. Now that we have the helpers calling the callbacks, I can
easily insert instrumentation like this, allowing for users to pass in
InstModCallback and see if anything was RAUWed without needing to specify a
callback.
In non-OSSA we cannot reliably track the lifetime of non-trivial stored properties, in case the deallocation is inlined.
Removing such a dead alloc_ref might leak a property value (or crash the compiler).
rdar://70689545
It would be more abstractly correct if this got DI support so
that we destroy the member if the constructor terminates
abnormally, but we can get to that later.
When instructions are changed within a pass in a way that affects subsequent alias queries in the same pass run,
their alias analysis information must be invalidated.
Otherwise it can result in miscompiles and/or invalid SIL.
rdar://71924430
In derivatives of loops, no longer allocate boxes for indirect case payloads. Instead, use a custom pullback context in the runtime which contains a bump-pointer allocator.
When a function contains a differentiated loop, the closure context is a `Builtin.NativeObject`, which contains a `swift::AutoDiffLinearMapContext` and a tail-allocated top-level linear map struct (which represents the linear map struct that was previously directly partial-applied into the pullback). In branching trace enums, the payloads of previously indirect cases will be allocated by `swift::AutoDiffLinearMapContext::allocate` and stored as a `Builtin.RawPointer`.
For those who are unfamiliar, alloc-box-to-stack while generally not
interprocedural, will look one level into the callgraph to see if a
partial_apply that captures a box really needs to capture the box due to an
escape. If not, allocbox-to-stack clones the closure with the address inside the
box being passed instead of the box itself. This can then allow us to promote
the box from the heap to the stack.
What went wrong here is that in OSSA, this promoted param cloner drops
copy_value, destroy_value, and project_box on the given box. Both the copy_value
and destroy_value cases correctly looked through copy_values, but when porting,
the author forgot to handle project_box as well. This then caused the cloner to
assert since:
1. The project_box in the original function had a copy_value operand.
2. When we visited that copy_value, we saw it was for the box, so we dropped the
copy_value and did not add it to the cloner's Value -> op(Value) map.
3. Then when the cloner tried to create op(project_box), it tries to lookup the
value associated with the copy_value that is the project_box's operand... but we
don't have any such value due to (2). =><=.
The test change exercises this code path by adding a (project_box (copy_value))
to one of the allocbox to stack tests.
The premise of CopyForwarding was that memory locations have their own
ownership lifetime. We knew this was unmaintainable at the time, and
that was the original incentive for SIL opaque values, aided by
OSSA. In the meantime, we've been relying on SILGen producing
reasonable SIL patterns. Unfortunately, the CopyForwarding pass is
still with us while we make major changes to SIL ownership patterns
and agressively optimize ownership. That introduces risk.
Ultimately, the entire CopyForwarding pass should be redesigned for
OSSA-only and destroy hoisting should be a simple OSSA utility where
most of the work is done by AccessPath::collectUses.
But in the meantime, we should remove the source of risk by limiting
the CopyForwarding pass to OSSA. Any performance regressions will be
recovered as OSSA moves later in the pipeline. After that, opaque
values will improve even more over the current state by handling
generic SIL using the more powerful CopyPropagation pass.
Fixes rdar://71584600 (miscompile in CopyForwarding's release hoisting)
Here's an example of the kind of SIL the CopyForwarding does not
anticipate (although it only actually miscompiles in much more obscure
scenarios, which is why it's so dangerous):
bb0(%0 : $AnyObject):
%alloc1 = alloc_stack $AnyObject
store %0 to %objaddr : $*AnyObject
%ref = load %objaddr : $*AnyObject
%alloc2 = alloc_stack $ObjWrapper
# The in-memory reference is destroyed before retaining the loaded ref.
copy_addr [take] %alloc1 to [initialization] %alloc2 : $*ObjWrapper
retain_value %ref : $AnyObject
Interestingly this problem can only occur if one invokes
MarkUninitializedInst::getKind() directly. Once our instruction is just a
SILInstruction, we call the appropriate method so we didn't notice it.
I used Xcode's refactoring functionality to find all of the invocation
locations.
NOTE: I also added a partial_apply [guaranteed] test.
Whats interesting about these is that we only ever perform allocbox_to_stack if
we know that we are going to eliminate the allocbox completely. So if we break
dominance among some uses of the alloc box or insert destroy_value when we are
in non-ossa... it doesn't matter since we will eliminate the box and these uses
before the pass is done running.
This will harmless on the surface is an instance of the compiler being in a
"fixed point of correctness". This occurance is when the compiler implementation
is incorrect but the incorrectness is being hidden in the final output. If the
output of the compiler changes or the code in question is changed, new bugs can
be introduced due to the lack of preserving of standard invariants like
dominance.
I also added an additional helper: SILBuilder::insertAfter(SILValue). This
builds on Erik's commit that gave us insert(SILInstruction *). I wanted this
functionality, but additionally I wanted to make it so that if I had an
argument, I got back the first instruction in the block. So it was natural to
extend this to values.
This makes it easier to understand conceptually why a ValueOwnershipKind with
Any ownership is invalid and also allowed me to explicitly document the lattice
that relates ownership constraints/value ownership kinds.
I have a need to have SwitchEnum{,Addr}Inst have different base classes
(TermInst, OwnershipForwardingTermInst). To do this I need to add a template to
SwitchEnumInstBase so I can switch that BaseTy. Sadly since we are using
SwitchEnumInstBase as an ADT type as well as an actual base type for
Instructions, this is impossible to do without introducing a template in a ton
of places.
Rather than doing that, I changed the code that was using SwitchEnumInstBase as
an ADT to instead use a proper ADT SwitchEnumBranch. I am happy to change the
name as possible see fit (maybe SwitchEnumTerm?).
This makes it clearer that isConsumingUse() is not an owned oriented API and
returns also for instructions that end the lifetime of guaranteed values like
end_borrow.
`Builtin.createAsyncTask` takes flags, an optional parent task, and an
async/throwing function to execute, and passes it along to the
`swift_task_create_f` entry point to create a new (potentially child)
task, returning the new task and its initial context.