We previously were computing this eagerly meaning that if we did not actually
need the DomTreeLevel map, we would calculate it and additionally incur a
relatively large malloc for the DenseMap (which stores by default IIRC 64 key
value pairs).
Now, we do it lazily ensuring that we only compute this if we actually need it
(when promoting non-single block stack allocations).
We have for a long time talked about creating a scope like data structure for
use in the SILOptimizer. The discussion was whether or not to reuse the
infrastructure in SILGen that does this already. There were concerns about doing
so since the code in the SILOptimizer and SILGen can work differently.
With that in mind, I added a small AssertingScope class and built on top of that
a composition SIL level class called SILOptScope that one can use to add various
cleanups. One is able to both destructively pop at end of scope and pop along
early exits.
At an implementation level, I kept it simple and:
1. Represented a scope as a stack of Optional<Cleanup> which are just a wrapper
around a std::function. The Optional is so that we can invalidate a cleanup.
2. Based all of these scopes around the idea that the user of the scope must
invalidate the scope by hand. If not, the scope object will assert at the end
of its RAII scope.
3. Rather than creating a whole class hierarchy, I just used std::function
closures to keep things simple.
Instead, put the archetype->instrution map into SIlModule.
SILOpenedArchetypesTracker tried to maintain and reconstruct the mapping locally, e.g. during a use of SILBuilder.
Having a "global" map in SILModule makes the whole logic _much_ simpler.
I'm wondering why we didn't do this in the first place.
This requires that opened archetypes must be unique in a module - which makes sense. This was the case anyway, except for keypath accessors (which I fixed in the previous commit) and in some sil test files.
... with a fix for a non-assert build crash: I used the wrong ilist type for SlabList. This does not explain the crash, though. What I think happened here is that llvm miscompiled and put the llvm_unreachable from the Slab's deleteNode function unconditionally into the SILModule destructor.
Now by using simple_ilist, there is no need for a deleteNode at all.
This is a pretty old style of code that we are trying to move away from.
Specifically, we use to always store a global SILBuilder and manually manipulate
its insertion point/other state when we moved places. This was very bugprone so
we instead now create new SILBuilders when we want to create instructions at a
new insertion point and pass down a SILBuilderContext for global state (like the
list of inserted instructions/etc) to each SILBuilder.
I went through and updated all of the code to now follow this pattern.
This code organization already existed in the file in content, but all of the
routines were mixed together. This splits the code by topic providing to the eye
easy code locality when reading.
This is code that was written at the very beginning of Swift and has some early
style remaining interspirsed with newer swift style code. Since I am going to be
touching this and I don't expect other people to touch it for a while, I am
fixing up the formatting/reorganizing rather than leaving the style inconsistent.
- stop storing the parent task in the TaskGroup at the .swift level
- make sure that swift_taskGroup_isCancelled is implied by the parent
task being cancelled
- make the TaskGroup structs frozen
- make the withTaskGroup functions inlinable
- remove swift_taskGroup_create
- teach IRGen to allocate memory for the task group
- don't deallocate the task group in swift_taskGroup_destroy
To achieve the allocation change, introduce paired create/destroy builtins.
Furthermore, remove the _swiftRetain and _swiftRelease functions and
several calls to them. Replace them with uses of the appropriate builtins.
I should probably change the builtins to return retained, since they're
working with a managed type, but I'll do that in a separate commit.
This isn't _terribly_ useful as-is, because the only constant mask you can get at from Swift at present is the zeroinitializer, but even that is quite useful for optimizing the repeating: intializer on SIMD. At some future point we should wire up generating constant masks for the .even, .odd, .high and .low properties (and also eventually make shufflevector take non-constant masks in LLVM). But this is enough to be useful, so let's get it in.
Now we get opt-remarks like the following:
```
public func condCast5<NS, T>(_ ns: NS) -> T? {
// Make sure the colon info is right so that the arrow is under the a.
//
// Today, we lose that x was assigned.
if let x = ns as? T { // expected-remark @:17 {{conditional runtime cast of value with type 'NS' to 'T'}}
return x // expected-note @-5:32 {{of 'ns'}}
}
return nil
}
```
The comment in LowerHopToActor explains the design here.
We want SILGen to emit hops to actors, ignoring executors,
because it's easier to fully optimize in a world where deriving
an executor is a non-trivial operation. But we also want something
prior to IRGen to lower the executor derivation because there are
useful static optimizations we can do, such as doing the derivation
exactly once on a dominance path and strength-reducing the derivation
(e.g. exploiting static knowledge that an actor is a default actor).
There are probably phase-ordering problems with doing this so late,
but hopefully they're restricted to situations like actors that
share an executor. We'll want to optimize that eventually, but
in the meantime, this unblocks the executor work.
For example, now we get the following diagnostic on globals:
public func getGlobal() -> Klass {
return global // expected-remark @:5 {{retain of type 'Klass'}}
// expected-note @-5:12 {{of 'global'}}
+ // expected-remark @-2:12 {{begin exclusive access to value of type 'Klass'}}
+ // expected-note @-7:12 {{of 'global'}}
+ // expected-remark @-4 {{end exclusive access to value of type 'Klass'}}
+ // expected-note @-9:12 {{of 'global'}}
+
}
and for classes when we can't eliminate the access:
+func simpleInOut() -> Klass {
+ let x = Klass() // expected-remark @:13 {{heap allocated ref of type 'Klass'}}
+ // expected-note @-1:9 {{of 'x'}}
+ simpleInOutUser(&x.next) // expected-remark @:5 {{begin exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-3:9 {{of 'x.next'}}
+ // expected-remark @-2:28 {{end exclusive access to value of type 'Optional<Klass>'}}
+ // expected-note @-5:9 {{of 'x.next'}}
+ return x
+}
The immediate desire is to minimize the set of ABI dependencies
on the layout of an ExecutorRef. In addition to that, however,
I wanted to generally reduce the code size impact of an unsafe
continuation since it now requires accessing thread-local state,
and I wanted resumption to not have to create unnecessary type
metadata for the value type just to do the initialization.
Therefore, I've introduced a swift_continuation_init function
which handles the default initialization of a continuation
and returns a reference to the current task. I've also moved
the initialization of the normal continuation result into the
caller (out of the runtime), and I've moved the resumption-side
cmpxchg into the runtime (and prior to the task being enqueued).
Don't move an end_access over a (non-aliasing) end_access. This would destroy the proper nesting of accesses.
Also, add some comments, asserts and tests.
Try to move an end_access down to extend the access scope over all uses of the temporary.
For example:
%a = begin_access %src
copy_addr %a to [initialization] %temp : $*T
end_access %a
use %temp
We must not replace %temp with %a after the end_access. Instead we try to move the end_access after "use %temp".
This fixes generation of invalid SIL and/or the invalid removal of access checks.
Tasks shouldn't normally hog the actor context indefinitely after making a call that's bound to
that actor, since that prevents the actor from potentially taking on other jobs it needs to
be able to address. Set up SILGen so that it saves the current executor (using a new runtime
entry point) and hops back to it after every actor call, not only ones where the caller context
is also actor-bound.
The added executor hopping here also exposed a bug in the runtime implementation while processing
DefaultActor jobs, where if an actor job returned to the processing loop having already yielded
the thread back to a generic executor, we would still attempt to make the actor give up the thread
again, corrupting its state.
rdar://71905765
This directly adds support to BasicBlockCloner for updating OSSA.
It also adds a much more general-purpose GuaranteedPhiBorrowFixup
utility which can be used for more complicated SSA updates, in which
multiple phis need to be created. More generally, it handles adding
nested borrow scopes for guaranteed phis even when that phi is used by
other guaranteed phis.
And rename MemoryDataflow -> BitDataflow.
MemoryLifetime contained MemoryLocations, MemoryDataflow and the MemoryLifetimeVerifier.
Three independent things, for which it makes sense to have them in three separated files.
NFC.
Ensure that any object lifetime that will be shortened ends in
destroy_value [poison], indicating that the reference should not be
dereferenced (e.g. by the debugger) beyond this point.
This has no way of knowing whether the object will actually be
deallocated at this point. It conservatively avoids showing garbage to
debuggers.
Part 1/2: rdar://75012368 (-Onone compiler support for early object
deinitialization with sentinel dead references)
In their previous form, the non-`_f` variants of these entry points were unused, and IRGen
lowered the `createAsyncTask` builtins to use the `_f` variants with a large amount of caller-side
codegen to manually unpack closure values. Amid all this, it also failed to make anyone responsible
for releasing the closure context after the task completed, causing every task creation to leak.
Redo the `swift_task_create_*` entry points to accept the two words of an async closure value
directly, and unpack the closure to get its invocation entry point and initial context size
inside the runtime. (Also get rid of the non-future `swift_task_create` variant, since it's unused
and it's subtly different in a lot of hairy ways from the future forms. Better to add it later
when it's needed than to have a broken unexercised version now.)
Replace a String initializer followed by String.utf8CString with a (UTF8 encoded) string literal.
This code pattern is generated when calling C functions with string literals, e.g.
puts("hello!")
rdar://74941849
Also, relax the check for enums a bit. Instead of only accepting single-payload enums, just require that the payload type is the same for an enum location.
Refactor SILGen's ApplyOptions into an OptionSet, add a
DoesNotAwait flag to go with DoesNotThrow, and sink it
all down into SILInstruction.h.
Then, replace the isNonThrowing() flag in ApplyInst and
BeginApplyInst with getApplyOptions(), and plumb it
through to TryApplyInst as well.
Set the flag when SILGen emits a sync call to a reasync
function.
When set, this disables the SIL verifier check against
calling async functions from sync functions.
Finally, this allows us to add end-to-end tests for
rdar://problem/71098795.
-enable-copy-propagation: enables whatever form of copy propagation
the current pipeline runs (mandatory-copy-propagation at -Onone,
regular copy-propation at -O).
-disable-copy-propagation: similarly disables any form of copy
propagation in the current pipelien.
This bleeds into the implementation where "guaranteed" is used
everywhere to talk about optimization of guaranteed values. We need to
use mandatory to indicate we're talking about the pass pipeline.
It is currently disabled so this commit is NFC.
MandatoryCopyPropagation canonicalizes all all OSSA lifetimes with
either CopyValue or DestroyValue operations. While regular
CopyPropagation only canonicalizes lifetimes with copies. This ensures
that more lifetime program bugs are found in debug builds. Eventually,
regular CopyPropagation will also canonicalize all lifetimes, but for
now, we don't want to expose optimized code to more behavior change
than necessary.
Add frontend flags for developers to easily control copy propagation:
-enable-copy-propagation: enables whatever form of copy propagation
the current pipeline runs (mandatory-copy-propagation at -Onone,
regular copy-propation at -O).
-disable-copy-propagation: similarly disables any form of copy
propagation in the current pipelien.
To control a specific variant of the passes, use
-Xllvm -disable-pass=mandatory-copy-propagation
or -Xllvm -disable-pass=copy-propagation instead.
The meaning of these flags will stay the same as we adjust the
defaults. Soon mandatory-copy-propagation will be enabled by
default. There are two reasons to do this, both related to predictable
behavior across Debug and Release builds.
1. Shortening object lifetimes can cause observable changes in program
behavior in the presense of weak/unowned reference and
deinitializer side effects.
2. Programmers need to know reliably whether a given code pattern will
copy the storage for copy-on-write types (Array, Set). Eliminating
the "unexpected" copies the same way at -Onone and -O both makes
debugging tractable and provides assurance that the code isn't
relying on the luck of the optimizer in a particular compiler
release.