executing unknown code
This means we have to claw back some performance by recognizing harmless
releases.
Such as releases on types we known don't call a deinit with unknown
side-effects.
rdar://143497196
rdar://143141695
It was used in the old redundant-load- and redundant-store-elimination passes which were replaced by new implementations.
TypeExpansionAnalysis is not used anymore.
As the utility runs, new gens may become local: as access scopes are
determined to contain deinit barriers, their `end_access` instructions
become kills; if such an `end_access` occurs in the same block above an
initially-non-local gen, that gen is now local.
Previously, it was asserted that initially-non-local gens would not
encounter when visiting the block backwards from that gen. Iteration
would also _stop_ at the discovered kill, if any. As described above,
the assertion was incorrect.
Stopping at the discovered kill was also incorrect. It's necessary to
continue walking the block after finding such a new kill because the
book-keeping the utility does for which access scopes contain barriers.
Concretely, there are two cases:
(1) It may contain another `end_access` and above it a deinit barrier
which must result in that second scope becoming a deinit barrier.
(2) Some of its predecessors may be in the region, all the access scopes
which are open at the begin of this block must be unioned into the set
of scopes open at each predecessors' end, and more such access scopes
may be discovered above the just-visited `end_access`.
Here, both the assertion failure and the early bailout are fixed by
walking from the indicated initially-non-local gen backwards over the
entire block, regardless of whether a kill was encountered. If a kill
is encountered, it is asserted that the kill is an `end_access` to
account for the case described above.
rdar://139840307
TLDR: Was looking at some performance traces and saw that we need to cache the
result of this value.
----
Specifically, I noticed that we were spending a lot of time computing this
operation. When I looked at the code I saw that we already had a cache along the
relevant code paths... but the cache was from equivalence class representative
-> state. Before we hit that cache, we were performing the work to map the value
to the equivalence class representative... so the work to perform the relevant
lookup from value -> state (which goes through the equivalence class
representative) was not just a hash table lookup. This operation makes it
cheaper by making it two cache lookups.
It may be possible to make this cheaper by redoing the actual mapping of
information so that we can go straight from value to state. I think it would be
slightly different since we would probably need to represent the state in a
separate array and map with indices... which is really just a more efficient
hash table. We could also use malloc/etc but lets not even talk about that.
rdar://139520959
The old analysis pass doesn't take into account profile data, nor does
it consider post-dominance. It primarily dealt with _fastPath/_slowPath.
A block that is dominated by a cold block is itself cold. That's true
whether it's forwards or backwards dominance.
We can also consider a call to any `Never` returning function as a
cold-exit, though the block(s) leading up to that call may be executed
frequently because of concurrency. For now, I'm ignoring the concurrency
case and assuming it's cold. To make use of this "no return" prediction,
use the `-enable-noreturn-prediction` flag, which is currently off by
default.
The main changes are:
*) Rewrite everything in swift. So far, parts of memory-behavior analysis were already implemented in swift. Now everything is done in swift and lives in `AliasAnalysis.swift`. This is a big code simplification.
*) Support many more instructions in the memory-behavior analysis - especially OSSA instructions, like `begin_borrow`, `end_borrow`, `store_borrow`, `load_borrow`. The computation of end_borrow effects is now much more precise. Also, partial_apply is now handled more precisely.
*) Simplify and reduce type-based alias analysis (TBAA). The complexity of the old TBAA comes from old days where the language and SIL didn't have strict aliasing and exclusivity rules (e.g. for inout arguments). Now TBAA is only needed for code using unsafe pointers. The new TBAA handles this - and not more. Note that TBAA for classes is already done in `AccessBase.isDistinct`.
*) Handle aliasing in `begin_access [modify]` scopes. We already supported truly immutable scopes like `begin_access [read]` or `ref_element_addr [immutable]`. For `begin_access [modify]` we know that there are no other reads or writes to the access-address within the scope.
*) Don't cache memory-behavior results. It turned out that the hit-miss rate was pretty bad (~ 1:7). The overhead of the cache lookup took as long as recomputing the memory behavior.
This is just moving up the declaration in the chain of dependencies so that I
can write logic in PartitionUtils.h using it. I also added entrypoints to lookup
the ReprensetativeValue for our various emitters.
This asserts only option is an option to make it quicker/easier to triage
unknown pattern match errors by aborting when we emit it (allowing one to
immediately drop into the debugger at that point).
Previously, it only happened for errors in RegionAnalysis not in
TransferNonSendable itself.
When merging SILIsolationInfo for regions, we want to drop
nonisolated(unsafe). This is important since nonisolated(unsafe) should only
apply to the specific "value" that it belongs to, not the entire region.
This creates a problem since in a few places in the code base we initialize a
value (producing a disconnected value) and then initialize it by merging in an
actor isolation. This no longer work since we will then always have
nonisolated(unsafe) stripped, so no values would ever be considered to be
nonisolated(unsafe). After analyzing the use case, I realized that these were
just initialization patterns and in this commit, I added a specific
initialization operation called SILIsolationInfo::initializeTrackableValue and
eliminated those calls to SILIsolationInfo::mergeIsolationRegionInfo.
Since SILIsolationInfo no longer has any merge operation on it, I then
eliminated that code in this commit. This completes the behavior split that I
put into the type system in the last commit. Specifically, I defined a
composition type called SILDynamicMergedIsolationInfo. It represents a
SILIsolationInfo that has been merged... that is why I called it the
DynamicMergedIsolationInfo. It could probably use a better name = (.
This fixes one of the last weird test case that I wrote where we were not letting through valid
nonisolated(unsafe) code.
At the same time, I discovered an additional issue (which can be seen in the
TODOs in this commit), where we are being too conservative around a non-Sendable
class var field. I am going to fix that in the next commit.
rdar://128299305
Specifically, I introduced a new composition type called
SILDynamicMergedIsolationInfo that just contains a
SILIsolationInfo. Importantly, whenever one merges a SILIsolationInfo with
another SILIsolationInfo, one gets back a SILDynamicMergedIsolationInfo.
The reason why I am doing this is that we drop nonisolated(unsafe) when merging
so I want to ensure that parts of the code that use merging (where the dropping
occurs) and normal SILIsolationInfo where we do not want to merge is
distinguished.
I made sure we match what we get without region isolation by turning off region
isolation in one of the test runs on the test for this.
There is one problem where for non-final classes with nonisolated(unsafe) var
fields, we currently do not properly squelch since I need to do more
infrastructure work. I am going to do that in the next commit.
rdar://128299305
The design change here is that instead of just initializing the regionInfo with
disconnected, we set it as .none and if we see .none, just return a newly
construct disconnected isolation region info when getIsolationRegionInfo() is
called.
This enables us to provide a setIsolationRegionInfo() helper for
RegionAnalysisValueMap::getTrackableValue that does not perform a merge. This is
important since for nonisolated(unsafe), we want to not have nonisolated(unsafe)
propagate through merging. So if we use merging to initialize the internal
regionInfo state of a SILIsolationInfo, we will never have a SILIsolationInfo
with that bit set since it will be lost in the merge. So we need some sort of
other assignment operator. Noting that we should only compute a value's
SILIsolationInfo once in RegionAnalysisValueMap before we cache it in the map,
it made sense to just represent it as an optional that way we can guarantee that
the regionInfo is only ever set exactly once by that routine.
We still only parse transferring... but this sets us up for adding the new
'sending' syntax by first validating that this internal change does not mess up
the current transferring impl since we want both to keep working for now.
rdar://128216574
As part of this I went through how we handled inference and rather than using a
grab-bag getActorIsolation that was confusing to use, I created split APIs for
specific use cases (actor instance, global actor, just an apply expr crossing)
that makes it clearer inside the SILIsolationInfo::get* APIs what we are
actually trying to model. I found a few issues as a result and fixed most of
them if they were small. I also fixed one bigger one around computed property
initializers in the next commit. There is a larger change I didn't fix around allowing function
ref/partial_apply with isolated self parameters have a delayed flow sensitive
actor isolation... this will be fixed in a subsequent commit.
This also fixes a bunch of cases where we were printing actor-isolated instead
of 'self' isolated.
rdar://127295657
This is backing out an approach that I thought would be superior, but ended up
causing problems.
Originally, we mapped a region number to an immutable pointer set containing
Operand * where the region was tranferred. This worked great for a time... until
I began to need to propagate other information from the transferring code in the
analysis to the actual diagnostic emitter.
To be able to do that, my thought was to make a wrapper type around Operand
called TransferringOperand that contained the operand and the other information
I needed. This seemed to provide me what I wanted but I later found that since
the immutable pointer set was tracking TransferringOperands which were always
newly wrapped with an Operand *, we actually always created new pointer
sets. This is of course wasteful from a memory perspective, but also prevents me
from tracking transferring operand sets during the dataflow since we would never
converge.
In this commit, I fix that issue by again tracking just an Operand * in the
TransferringOperandSet and instead map each operand to a state structure which
we merge dataflow state into whenever we visit it. This provides us with
everything we need to in the next commit to including a region -> transferring
operand set equality check in our dataflow equations and always converge.
This means that I added an IsolationHistory field to Partition. Just upstreaming
the beginning part of this work. I added some unittests to exercise the code as
well. NOTE: This means that I did need to begin tracking an
IsolationHistoryFactory and propagating IsolationHistory in the pass
itself... but we do not use it for anything.
A quick overview of the design.
IsolationHistory is the head of an immutable directed acyclic graph. It is
actually represented as an immutable linked list with a special node that ties
in extra children nodes. The users of the information are expected to get a
SmallVectorImpl and process those sibling nodes afterwards. The reason why we
use an immutable approach is that it fits well with the problem and saves space
since different partitions could be pointing at the same linked list
node. Operations occur on an isolation history by pushing/popping nodes. It is
assumed that the user will push nodes in batches with a sequence boundary at the
bottom of the addition which signals to stop processing nodes.
Tieing this together, each Partition within it contains an IsolationHistory. As
the PartitionOpEvaluator applies PartitionOps to Partition in
PartitionOpEvaluator::apply, the evaluator also updates the isolation history in
the partition by first pushing a SequenceBoundary node and then pushing nodes
that will undo the operation that it is performing. This information is used by
the method Partition::popHistory. This pops linked list nodes from its history,
performing the operation in reverse until it hits a SequenceBoundary node.
This allows for one to rewind Partition history. And if one stashes an isolation
history as a target, one can even unwind a partition to what its state was at a
specific transfer point or earlier. Once we are at that point, we can begin
going one node back at a time and see when two values that we are searching for
no longer are apart of the same region. That is a place where we want to emit a
diagnostic. We then process until we find for both of our values history points
where they were the immediate reason why the two regions merge.
rdar://123479934
This should be NFC since the only case where I used this was with self... and I
found another way of doing that using the API I added in the previous commit.
I am making this specific API since I am going to make it so that
SILIsolationInfo::get(SILInstruction *) can infer isolation info from self even
from functions that are not apply isolation crossing points. For example, in the
following, we need to understand that test is main actor isolated and we
shouldn't emit an error.
```swift
@MainActor func test(_ x: NonSendable) {}
@OtherActor func doSomething() {
let x = NonSendable()
Task.init { @MainActor in print(x) }
test(x)
}
```
I am going to need this so that I can use it when evaluating partition
ops. Specifically, so I can when I find two values that do not merge correctly,
emit an error.
Now that we actually know the region that non transferrable things belong to, we
can use this information to give a better diagnostic here.
A really nice effect of this is that we now emit that actor isolated parameters
are actually actor isolated instead of task isolated.
In a subsequent commit, this is going to let me begin handling parameters with
actor regions in a nice way (and standardize all of the errors).
This is meant to be a refactoring commit that uses the current tests in tree to
make sure I did it correctly, so no tests need to be updated.
To keep this as an NFC commit, I only modeled initially actor isolated using
this. I am going to make it so that we properly treat global actor isolated
values as actor isolated/etc in a subsequent commit.
When we run RegionAnalysis, since it uses RPO order, we do not visit dead
blocks. This can create a problem when we emit diagnostics since we may merge in
a value into the region that was never actually defined. In this patch, if we
actually visit the block while performing dataflow, I mark a bit in its state
saying that it was live. Then when we emit diagnostics, I do not visit blocks
that were not marked live.
rdar://124042351