This is a large patch; I couldn't split it up further while still
keeping things working. There are four things being changed at
once here:
- Places that call SILType::isAddressOnly()/isLoadable() now call
the SILFunction overload and not the SILModule one.
- SILFunction's overloads of getTypeLowering() and getLoweredType()
now pass the function's resilience expansion down, instead of
hardcoding ResilienceExpansion::Minimal.
- Various other places with '// FIXME: Expansion' now use a better
resilience expansion.
- A few tests were updated to reflect SILGen's improved code
generation, and some new tests are added to cover more code paths
that previously were uncovered and only manifested themselves as
standard library build failures while I was working on this change.
Actually: generate the array of (key, value) tuples in the data section, which is then passed to Dictionary.init(dictionaryLiteral:)
We already do this for simple arrays, e.g. arrays with trivial element types.
The only change needed for dictionary literals is to support tuple types in the ObjectOutliner.
The effect of this optimization is a significant reduction in code size for dictionary literals - and an increase in data size.
But in most cases there is a considerable net win for code+data size in total.
Disable constant folding and jump threading for functions with > 10000 blocks.
Those optimizations are not strictly linear with the number of blocks and cause compile time issues if the function is really huge.
SR-10209
rdar://problem/49522869
This is an obvious drive-by fix. It will crash when building
Foundation after I commit changes to the pipeline. My attempts at
creating a unit test were unsuccessful because it depends on some
interaction between inlining and specialization heuristics.
Reuse AccessStorageAnalysis to summarize accesses.
Don't ignore call sites in loops.
Don't consider a read access in a loop to conflict with a read
outside.
Use the unidentified access flag from the analysis.
Remove extraneous code, some of which was unreachable.
General cleanup.
Now we handle this case:
%stack = alloc_stack $Protocol
copy_addr %var to [initialization] %stack
open_existential_addr immutable_access %stack
...
destroy_addr %stack
dealloc_stack %stack
We only do this if we have immutable_access. To be conservative I still only let
the normal whitelist of other instructions through.
I am adding this optimization since I am going to be eliminating a SILGen
peephole so I can enable SIL ownership verification everywhere.
Directly implement the data flow. Eliminate the extraneous work.
Remove cubic behavior. Do a single iteration of the data flow state at
each program point only performing the necessary set operations. At
unidentified access, clear the sets for simplicity and efficiency.
This cleanup results in significant functional changes:
- Allowing scopes to merge even if they are enclosed.
- Handling unidentified access conservatively.
Note that most of the added lines of code are comments.
Somehow this cleanup incidentally fixes:
<rdar://problem/48514339> swift compiler hangs building project
(I expected the subsequent loop summary cleanup to fix that problem.)
The optimization already proves that there are no potential conflicts
between the two merged scopes, so merging them can't introduce a new
nested conflict.
AccessSummary was storing unnecessary state in every per-block entry in the
global map. It was also making most of the code in this pass very hard to read.
Rewriting mergePredAccesses allows AccessSummary to be removed. The
new implementation also avoids unnecessary DenseMap lookups.
There is also a functional change. mergePredAccesses was clearing the
state of predecessor blocks. This isn't logical and had no explanation.
Most of the API's operate on the data flow state (RegionState) with
respect to the information for a given access (AccessInfo). Calling
them both "info" made the code completely indecipherable.
The ownership kind is Any for trivial types, or Owned otherwise, but
whether a type is trivial or not will soon depend on the resilience
expansion.
This means that a SILModule now uniques two SILUndefs per type instead
of one, and serialization uses two distinct sentinel IDs for this
purpose as well.
For now, the resilience expansion is not actually used here, so this
change is NFC, other than changing the module format.
This adds a mostly flow-insensitive analysis that runs before the
dominator-based transformations. The analysis is simple and efficient
because it only needs to track data flow of currently in-scope
accesses. The original dominator tree walk remains simple, but it now
checks the flow insensitive analysis information to determine general
correctness. This is now correct in the presence of all kinds of nested
static and dynamic nested accesses, call sites, coroutines, etc.
This is a better compromise than:
(a) disabling the pass and taking a major performance loss.
(b) converting the pass itself to full-fledged data flow driven
optimization, which would be more optimal because it could remove
accesses when nesting is involved, but would be much more expensive
and complicated, and there's no indication that it's useful.
The new approach is also simpler than adding more complexity to
independently handle to each of many issues:
- Nested reads followed by a modify without a false conflict.
- Reads nested within a function call without a false conflict.
- Conflicts nested within a function call without dropping enforcement.
- Accesses within a generalized accessor.
- Conservative treatment of invalid storage locations.
- Conservative treatment of unknown apply callee.
- General analysis invalidation.
Some of these issues also needed to be considered in the
LoopDominatingAccess sub-pass. Rather than fix that sub-pass, I just
integrated it into the main pass. This is a simplification, is more
efficient, and also handles nested loops without creating more
redundant accesses. It is also generalized to:
- hoist non-uniquely identified accesses.
- Avoid unnecessarily promoting accesses inside the loop.
With this approach we can remove the scary warnings and caveats in the
comments.
While doing this I also took the opportunity to eliminate quadratic
behavior, make the domtree walk non-recursive, and eliminate cutoff
thresholds.
Note that simple nested dynamic reads to identical storage could very
easily be removed via separate logic, but it does not fit with the
dominator-based algorithm. For example, during the analysis phase, we
could simply mark the "fully nested" read scopes, then convert them to
[static] right after the analysis, removing them from the result
map. I didn't do this because I don't know if it happens in practice.
Where possible, pass around a ClassDecl or a CanType instead of a
SILType that might wrap a metatype; the unwrapping logic was
repeated in several places.
Also add a FIXME for a bug I found by inspection.
This optimization is doing 2 things: replacing getElement calls and replacing append(contentOf:) calls.
Now if the argument to a append(contentOf:) is a previously replaced getElement call, we ended up in a use-after-free crash.
The main change here is to do the transformations immediately after gathering the data (and that means: separately) and not collecting all the data and do both transformation afterwards
The pass does not use any Analysis (where invalidation could be a problem). Also iterator invalidation is not a problem here.
SR-10003
rdar://problem/48445856
In the statement
optional1?.objc_setter = optional2?.objc_getter?.objc_getter
we can eliminate all optional switches expect for the first switch on
optional1. We must only execute the setter if optional1 has some value.
We can simplify the following switch_enum with a branch as long all
sideffecting instructions in someBB are objc_method calls on the
optional payload or on another objc_method call that transitively uses
the payload.
switch_enum %optionalValue, case #Optional.some!enumelt.1: someBB,
case #Optional.none: noneBB
someBB(%optionalPayload):
%1 = objc_method %optionalPayload
%2 = apply %1(..., %optionalPayload) // self position
br mergeBB(%2)
noneBB:
%4 = enum #Optional.none
br mergeBB(%4)
rdar://48007302
When two access scopes have different access types, it is not safe to
merge them into one access. Doing so will introduce false conflicts
which manifest as crashes in user code.
To do this optimization, we would need additional data flow information
about *potential* read conflicts before converting a read to a modify
access.
Fixes rdar://48239213: Fatal access conflict detected.