This is code that was written at the very beginning of Swift and has some early
style remaining interspirsed with newer swift style code. Since I am going to be
touching this and I don't expect other people to touch it for a while, I am
fixing up the formatting/reorganizing rather than leaving the style inconsistent.
Problem: We continue to uncover code that assumes either precise local
variable lifetimes (to the end of the lexical scope) or extended
temporary lifetimes (to the end of the statement). These bugs require
heroic debugging to find the root cause. Because they only show up in
Release builds, they often manifest just before the affected project
“ships” under an impending deadline.
We now have enough information from projects that have been tested
with copy propagation that we can both understand common patterns and
identify some specific APIs that may cause trouble. We know what API
annotations the compiler will need for helpful warnings and can begin
adding those annotations.
Disabling copy propagation now is only a temporary deferral, we will
still need to bring it back by default. However, by then we should
have:
- LLDB and runtime support for debugging deinitialized objects
- A variant of lifetime sortening that can run in Debug builds to
catch problems before code ships
- Static compiler warnings for likely invalid lifetime assumptions
- Source annotations that allow those warnings to protect programmers
against existing dangerous APIs
In the meantime...
Projects can experiment with the behavior and gradually migrate.
Copy propagation will automatically be enabled in -enable-ossa-modules
mode. It is important to work toward a single performance
target. Supporting full OSSA and improving ARC performance without
copy propagation would be prohibitively complicated.
rdar://76438920 (Temporarily disable -O copy propagation by default)
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
While the comment is correct to state that this won't enable any
new optimizations with -Onone, it does enable IRGen's lazy
function emission, which is important for 'reasync' functions,
which we don't want to emit at all even at -Onone.
This fixes debug stdlib builds with the new reasync versions
of the &&, || and ?? operators.
This
1. fixes a bug, where we didn't consider that an object can escape via a function call. The pass issued a false warning in this case.
rdar://76115467
2. improves the accuracy by doing a simple form of interprocedural analysis. E.g. it now can see if a weak store is done in a called function.
rdar://76297286
Previously, because partial apply forwarders for async functions were
not themselves fully-fledged async functions, they were not able to
handle dynamic functions. Specifically, the reason was that it was not
possible to produce an async function pointer for the partial apply
forwarder because the size to be used was not knowable.
Thanks to https://github.com/apple/swift/pull/36700, that cause has been
eliminated. With it, partial apply forwarders are fully-fledged async
functions and in particular have their own async function pointers.
Consequently, it is again possible for these partial apply forwarders to
handle non-constant function pointers.
Here, that behavior is restored, by way of reverting part of
ee63777332 while preserving the ABI it
introduced.
rdar://76122027
This isn't _terribly_ useful as-is, because the only constant mask you can get at from Swift at present is the zeroinitializer, but even that is quite useful for optimizing the repeating: intializer on SIMD. At some future point we should wire up generating constant masks for the .even, .odd, .high and .low properties (and also eventually make shufflevector take non-constant masks in LLVM). But this is enough to be useful, so let's get it in.
This feature degrades the debugging experience and causes a large
number of unit test failures.
These were both known issues, but our planned debugger improvements
won't be ready for a while. Until then, we'll leave the feature under
a compiler option, and developers can adopt it at there own speed for
now when they are ready to fix lifetime issues in their code.
rdar://76177280 (Disable mandatory-copy-propagation (-Onone only))
Now we get opt-remarks like the following:
```
public func condCast5<NS, T>(_ ns: NS) -> T? {
// Make sure the colon info is right so that the arrow is under the a.
//
// Today, we lose that x was assigned.
if let x = ns as? T { // expected-remark @:17 {{conditional runtime cast of value with type 'NS' to 'T'}}
return x // expected-note @-5:32 {{of 'ns'}}
}
return nil
}
```
The comment in LowerHopToActor explains the design here.
We want SILGen to emit hops to actors, ignoring executors,
because it's easier to fully optimize in a world where deriving
an executor is a non-trivial operation. But we also want something
prior to IRGen to lower the executor derivation because there are
useful static optimizations we can do, such as doing the derivation
exactly once on a dominance path and strength-reducing the derivation
(e.g. exploiting static knowledge that an actor is a default actor).
There are probably phase-ordering problems with doing this so late,
but hopefully they're restricted to situations like actors that
share an executor. We'll want to optimize that eventually, but
in the meantime, this unblocks the executor work.
* add is_escaping_closure as allowed instruction
* don't restrict the uses of copy_value to be a store
Fixes a verification crash (only in assert builds).
https://bugs.swift.org/browse/SR-14394
rdar://75740622
For example, now we get the following diagnostic on globals:
public func getGlobal() -> Klass {
return global // expected-remark @:5 {{retain of type 'Klass'}}
// expected-note @-5:12 {{of 'global'}}
+ // expected-remark @-2:12 {{begin exclusive access to value of type 'Klass'}}
+ // expected-note @-7:12 {{of 'global'}}
+ // expected-remark @-4 {{end exclusive access to value of type 'Klass'}}
+ // expected-note @-9:12 {{of 'global'}}
+
}
and for classes when we can't eliminate the access:
+func simpleInOut() -> Klass {
+ let x = Klass() // expected-remark @:13 {{heap allocated ref of type 'Klass'}}
+ // expected-note @-1:9 {{of 'x'}}
+ simpleInOutUser(&x.next) // expected-remark @:5 {{begin exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-3:9 {{of 'x.next'}}
+ // expected-remark @-2:28 {{end exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-5:9 {{of 'x.next'}}
+ return x
+}
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
While cloning arguments for existential specializer, do not create copy
for object types. Currently, for guaranteed parameter types it
unnecessarily creates a copy and cleans it up with a destroy. The
discrepency is seen when cloning an instruction like open_existential_ref
which was previously using guaranteed operand and had guaranteed
forwarding ownership, is now replaced to have the owned copy as its
operand during cloning.
Don't move an end_access over a (non-aliasing) end_access. This would destroy the proper nesting of accesses.
Also, add some comments, asserts and tests.
* ``begin_access [modify]`` returned MayWrite, but "modify" means, it can be a read as well.
Instead, return MayReadWrite. Only for ``begin_access [init]`` return MayWrite.
This is more or less cosmetic - probably this bug had no real impact on any optimization.
* ``begin_access [deinit]`` needs to return MayReadWrite for the same reason ``destroy_addr`` returns MayReadWrite (see SILInstruction::MemoryBehavior).
Add more relevant instructions to dump in MemBehaviorDumper.
Therefore the -enable-mem-behavior-dump-all option can be removed.
Also mem-behavior-all.sil into mem-behavior.sil, because the sil-opt command lines don't differ anymore.
Try to move an end_access down to extend the access scope over all uses of the temporary.
For example:
%a = begin_access %src
copy_addr %a to [initialization] %temp : $*T
end_access %a
use %temp
We must not replace %temp with %a after the end_access. Instead we try to move the end_access after "use %temp".
This fixes generation of invalid SIL and/or the invalid removal of access checks.
Tasks shouldn't normally hog the actor context indefinitely after making a call that's bound to
that actor, since that prevents the actor from potentially taking on other jobs it needs to
be able to address. Set up SILGen so that it saves the current executor (using a new runtime
entry point) and hops back to it after every actor call, not only ones where the caller context
is also actor-bound.
The added executor hopping here also exposed a bug in the runtime implementation while processing
DefaultActor jobs, where if an actor job returned to the processing loop having already yielded
the thread back to a generic executor, we would still attempt to make the actor give up the thread
again, corrupting its state.
rdar://71905765
This directly adds support to BasicBlockCloner for updating OSSA.
It also adds a much more general-purpose GuaranteedPhiBorrowFixup
utility which can be used for more complicated SSA updates, in which
multiple phis need to be created. More generally, it handles adding
nested borrow scopes for guaranteed phis even when that phi is used by
other guaranteed phis.
This is a basic SSA-based liveness algorithm. It should just do what
it says. This change rescues the core logic from the nonsense that's
been hacked in. There are now two APIs:
(1) computeLifetimeBoundary (new) only does lifetime analysis. It
provides a new API that always succeeds and provides the last use
points so clients can do the right thing based on the user. I need
this for OSSA utilities.
(2) computeFrontier (old) emulates the original API with a simplified
version of the original logic.
Next steps:
- replace the old API with a separate utility that computes destroy
insertion points. Completely remove DeadEndBlocks from lifetime
analysis, because it simply has nothing to do with liveness.
- Gradually migrate clients to either the new lifetime API provided
here or the new destroy insertion API to be provided later.
Previously, thick async functions were represented sometimes as a pair
of (AsyncFunctionPointer, nullptr)--when the thick function was produced
via a thin_to_thick_function, e.g.--and sometimes as a pair of
(FunctionPointer, ThickContext)--when the thick function was produced by
a partial_apply--with the size stored in the slot of the ThickContext.
That optimized for the wrong case: partial applies of dynamic async
functions; in that case, there is no appropriate AsyncFunctionPointer to
form when lowering the partial_apply instruction. The far more common
case is to know exactly which function is being partially applied. In
that case, we can form the appropriate AsyncFunctionPointer.
Furthermore, the previous representation made calling a thick function
more complex: it was always necessary to check whether the context was
in fact null and then proceed along two different paths depending.
Here, that behavior is corrected by creating a thunk in a mandatory
IRGen SIL pass in the case that the function that is being partially
applied is dynamic. That new thunk is then partially applied in place
of the original partial_apply of the dynamic function.