This is an obvious drive-by fix. It will crash when building
Foundation after I commit changes to the pipeline. My attempts at
creating a unit test were unsuccessful because it depends on some
interaction between inlining and specialization heuristics.
Reuse AccessStorageAnalysis to summarize accesses.
Don't ignore call sites in loops.
Don't consider a read access in a loop to conflict with a read
outside.
Use the unidentified access flag from the analysis.
Remove extraneous code, some of which was unreachable.
General cleanup.
Now we handle this case:
%stack = alloc_stack $Protocol
copy_addr %var to [initialization] %stack
open_existential_addr immutable_access %stack
...
destroy_addr %stack
dealloc_stack %stack
We only do this if we have immutable_access. To be conservative I still only let
the normal whitelist of other instructions through.
I am adding this optimization since I am going to be eliminating a SILGen
peephole so I can enable SIL ownership verification everywhere.
Directly implement the data flow. Eliminate the extraneous work.
Remove cubic behavior. Do a single iteration of the data flow state at
each program point only performing the necessary set operations. At
unidentified access, clear the sets for simplicity and efficiency.
This cleanup results in significant functional changes:
- Allowing scopes to merge even if they are enclosed.
- Handling unidentified access conservatively.
Note that most of the added lines of code are comments.
Somehow this cleanup incidentally fixes:
<rdar://problem/48514339> swift compiler hangs building project
(I expected the subsequent loop summary cleanup to fix that problem.)
The optimization already proves that there are no potential conflicts
between the two merged scopes, so merging them can't introduce a new
nested conflict.
AccessSummary was storing unnecessary state in every per-block entry in the
global map. It was also making most of the code in this pass very hard to read.
Rewriting mergePredAccesses allows AccessSummary to be removed. The
new implementation also avoids unnecessary DenseMap lookups.
There is also a functional change. mergePredAccesses was clearing the
state of predecessor blocks. This isn't logical and had no explanation.
Most of the API's operate on the data flow state (RegionState) with
respect to the information for a given access (AccessInfo). Calling
them both "info" made the code completely indecipherable.
The ownership kind is Any for trivial types, or Owned otherwise, but
whether a type is trivial or not will soon depend on the resilience
expansion.
This means that a SILModule now uniques two SILUndefs per type instead
of one, and serialization uses two distinct sentinel IDs for this
purpose as well.
For now, the resilience expansion is not actually used here, so this
change is NFC, other than changing the module format.
This adds a mostly flow-insensitive analysis that runs before the
dominator-based transformations. The analysis is simple and efficient
because it only needs to track data flow of currently in-scope
accesses. The original dominator tree walk remains simple, but it now
checks the flow insensitive analysis information to determine general
correctness. This is now correct in the presence of all kinds of nested
static and dynamic nested accesses, call sites, coroutines, etc.
This is a better compromise than:
(a) disabling the pass and taking a major performance loss.
(b) converting the pass itself to full-fledged data flow driven
optimization, which would be more optimal because it could remove
accesses when nesting is involved, but would be much more expensive
and complicated, and there's no indication that it's useful.
The new approach is also simpler than adding more complexity to
independently handle to each of many issues:
- Nested reads followed by a modify without a false conflict.
- Reads nested within a function call without a false conflict.
- Conflicts nested within a function call without dropping enforcement.
- Accesses within a generalized accessor.
- Conservative treatment of invalid storage locations.
- Conservative treatment of unknown apply callee.
- General analysis invalidation.
Some of these issues also needed to be considered in the
LoopDominatingAccess sub-pass. Rather than fix that sub-pass, I just
integrated it into the main pass. This is a simplification, is more
efficient, and also handles nested loops without creating more
redundant accesses. It is also generalized to:
- hoist non-uniquely identified accesses.
- Avoid unnecessarily promoting accesses inside the loop.
With this approach we can remove the scary warnings and caveats in the
comments.
While doing this I also took the opportunity to eliminate quadratic
behavior, make the domtree walk non-recursive, and eliminate cutoff
thresholds.
Note that simple nested dynamic reads to identical storage could very
easily be removed via separate logic, but it does not fit with the
dominator-based algorithm. For example, during the analysis phase, we
could simply mark the "fully nested" read scopes, then convert them to
[static] right after the analysis, removing them from the result
map. I didn't do this because I don't know if it happens in practice.
Where possible, pass around a ClassDecl or a CanType instead of a
SILType that might wrap a metatype; the unwrapping logic was
repeated in several places.
Also add a FIXME for a bug I found by inspection.
This optimization is doing 2 things: replacing getElement calls and replacing append(contentOf:) calls.
Now if the argument to a append(contentOf:) is a previously replaced getElement call, we ended up in a use-after-free crash.
The main change here is to do the transformations immediately after gathering the data (and that means: separately) and not collecting all the data and do both transformation afterwards
The pass does not use any Analysis (where invalidation could be a problem). Also iterator invalidation is not a problem here.
SR-10003
rdar://problem/48445856
In the statement
optional1?.objc_setter = optional2?.objc_getter?.objc_getter
we can eliminate all optional switches expect for the first switch on
optional1. We must only execute the setter if optional1 has some value.
We can simplify the following switch_enum with a branch as long all
sideffecting instructions in someBB are objc_method calls on the
optional payload or on another objc_method call that transitively uses
the payload.
switch_enum %optionalValue, case #Optional.some!enumelt.1: someBB,
case #Optional.none: noneBB
someBB(%optionalPayload):
%1 = objc_method %optionalPayload
%2 = apply %1(..., %optionalPayload) // self position
br mergeBB(%2)
noneBB:
%4 = enum #Optional.none
br mergeBB(%4)
rdar://48007302
When two access scopes have different access types, it is not safe to
merge them into one access. Doing so will introduce false conflicts
which manifest as crashes in user code.
To do this optimization, we would need additional data flow information
about *potential* read conflicts before converting a read to a modify
access.
Fixes rdar://48239213: Fatal access conflict detected.
NOTE: I changed all places that the CastOptimizer is created to just pass in
nullptr for now so this is NFC.
----
Right now the interface of the CastOptimizer is muddled and confused. Sometimes
it is returning a value that should be used by the caller, other times it is
returning an instruction that is meant to be reprocessed by the caller.
This series of patches is attempting to clean this up by switching to the
following model:
1. If we are optimizing a cast of a value, we return a SILValue. If the cast
fails, we return an empty SILValue().
2. If we are optimizing a cast of an address, we return a boolean value to show
success/failure and require the user to use the SILBuilderContext to get the
cast if they need to.
A recent fix added some code that is inconsistent with the design of
the pass, creates an implicit coupling between separate parts of the
optimization, and relies on incredibly subtle reasoning to ensure
correctness.
This sort of thing needs a big fat comment.
Beside fixing the compiler crash, this change also improves the stack-nesting correction mechanisms in the inliners:
* Instead of trying to correct the nesting after each inlining of a callee, correct the nesting once when inlining is finished for a caller function.
This fixes a potential compile time problem, because StackNesting iterates over the whole function.
In worst case this can lead to quadratic behavior in case many begin_apply instructions with overlapping stack locations are inlined.
* Because we are doing it only once for a caller, we can remove the complex logic for checking if it is necessary.
We can just do it unconditionally in case any coroutine gets inlined.
The inliners iterate over all instruction of a function anyway, so this does not increase the computational complexity (StackNesting is roughly linear with the number of instructions).
rdar://problem/47615442