Detect stores to globals which get the address from an addressor call (and not from global_addr).
This kind of SIL is currently not generated, so the bug most likely does not show up yet. But as soon as we CSE addressor calls, it could become a real problem.
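A rough Swift-level sketch of the situation (hypothetical names; whether the addressor call survives into the SIL the pass sees depends on earlier passes):
```
// A global whose initializer is not a compile-time constant is lazily
// initialized and addressed through an addressor function in SIL.
var counter: Int = makeInitialValue()

func makeInitialValue() -> Int { 42 }

// The store below may obtain the global's address from a call to the
// addressor rather than from a global_addr instruction; such a store was
// previously not detected as a store to the global.
func reset() { counter = 0 }
```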
This pass makes assumptions about the visibility of a type's memory
based on the visibility of its properties. This is the wrong way to
think about memory visibility.
Fixes <rdar://57564377> Struct component with 'let', unexpected
behavior
This pass wants to assume that the contents of a property are known based
on whether the property is declared as a 'let' and on the visibility of
the initializers that access the property. For example:
```
public struct X<T> {
  public let hidden: T
  init(t: T) { self.hidden = t }
}
```
The pass currently assumes that `X` only takes on values that are
assigned by invocations of `X.init`, which is only visible in `X`'s
module. This is wrong if the layout of `X` is exposed to other
modules. A struct's memory may be initialized by any module with
access to the struct's layout.
In fact, this assumption is wrong even if neither the struct nor its 'let'
property can be accessed externally by name. In this next example,
external modules cannot access `Impl` or `Impl.hidden` by name, but
can still access the memory.
```
internal struct Impl<T> {
  let hidden: T
  init(t: T) { self.hidden = t }
}
public struct Wrapper<T> {
  var impl: Impl<T>
  public var property: T {
    get {
      return impl.hidden
    }
  }
}
```
As long as `Wrapper`'s layout is exposed to other modules, the contents
of `Wrapper`, `Impl`, and `hidden` can all be initialized in another
module:
```
// Legal as long as Wrapper's home module is *not* built with library evolution
// (or if Wrapper is declared `@frozen`).
func inExternalModule(buffer: UnsafeRawPointer) -> Wrapper<Int64> {
return buffer.load(as: Wrapper<Int64>.self)
}
```
If library evolution is enabled and a `public` struct is not declared
`@frozen`, then external modules cannot assume its layout, and therefore
cannot initialize the struct's memory. In that case, it is possible to optimize
`X.hidden` and `Impl.hidden` as if the properties are only initialized inside
their home module.
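For example (a hedged sketch, assuming the home module is built with library
evolution enabled; `Opaque` and `Exposed` are hypothetical names):
```
// Resilient: external modules cannot assume this struct's layout, so they
// cannot initialize its memory directly. `hidden` can be optimized as if
// it is only initialized inside this module.
public struct Opaque {
  public let hidden: Int
  init(t: Int) { self.hidden = t }
}

// @frozen: the layout is part of the module's ABI. External modules can
// initialize the memory, e.g. via UnsafeRawPointer.load, so the
// assumption does not hold.
@frozen public struct Exposed {
  public let hidden: Int
  init(t: Int) { self.hidden = t }
}
```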
The right way to view a type's memory visibility is to consider whether
external modules have access to the layout of the type. If not, then the
property can still be optimized, as long as the struct is never enclosed in a
public, effectively-`@frozen` type. However, finding all places where a struct
is explicitly created is still insufficient. Instead, the optimization needs
to find all uses of the enclosing types and determine whether every use either
has a known constant initialization or is simply copied from another value. If
an escaping unsafe pointer to any enclosing type is created, then the
optimization is not valid.
When viewed this way, the fact that a property is declared 'let' is mostly
irrelevant to this optimization--it can be expanded to handle non-'let'
properties. The more salient feature is whether the property has a public
setter.
For now, this optimization only recognizes class properties because class
properties are only accessible via a ref_element_addr instruction. This is a
side effect of the fact that accessing a class property requires a "formal
access", which means that a begin_access marker must be emitted directly on
the address produced by the ref_element_addr. Struct properties are not
handled because, as explained above, they can be accessed indirectly via
addresses of outer types.
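For illustration (a sketch with hypothetical names): every access to a stored
class property goes through the class reference, so each access is directly
visible to the pass:
```
final class Box {
  let hidden: Int
  init(t: Int) { self.hidden = t }
}

func read(_ b: Box) -> Int {
  // Lowered to a ref_element_addr on `b`, with any begin_access emitted
  // directly on that address. There is no way to reach `hidden` through
  // the address of an enclosing type.
  return b.hidden
}
```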
In order to allow this, I've had to rework the syntax of substituted function types; what was previously spelled `<T> in () -> T for <X>` is now spelled `@substituted <T> () -> T for <X>`. I think this is a nice improvement for readability, but it did require me to churn a lot of test cases.
Distinguishing the substitutions has two chief advantages over the existing representation. First, the semantics seem quite a bit clearer at use points; the `implicit` bit was very subtle, and it was not always obvious how to use it. More importantly, it allows the expression of generic function types that must satisfy a particular generic abstraction pattern, which was otherwise impossible to express.
As an example of the latter, consider the following protocol conformance:
```
protocol P { func foo() }
struct A<T> : P { func foo() {} }
```
The lowered signature of `P.foo` is `<Self: P> (@in_guaranteed Self) -> ()`. Without this change, the lowered signature of `A.foo`'s witness would be `<T> (@in_guaranteed A<T>) -> ()`, which does not preserve information about the conformance substitution in any useful way. With this change, the lowered signature of this witness could be `<T> @substituted <Self: P> (@in_guaranteed Self) -> () for <A<T>>`, which nicely preserves the exact substitutions which relate the witness to the requirement.
When we adopt this, it will both obviate the need for the special witness-table conformance field in SILFunctionType and make it far simpler for the SILOptimizer to devirtualize witness methods. This patch does not actually take that step, however; it merely makes it possible to do so.
As another piece of unfinished business, while `SILFunctionType::substGenericArgs()` conceptually ought to simply set the given substitutions as the invocation substitutions, that would disturb a number of places that expect that method to produce an unsubstituted type. This patch only sets invocation substitutions when the generic type is a substituted type, which we currently never produce in type lowering.
My plan is to start by producing substituted function types for accessors. Accessors are an important case because the coroutine continuation function is essentially an implicit component of the function type, and the current substitution rules simply erase its intended abstraction. Accessors are also used in narrower ways that should exercise less of the optimizer.
This is in preparation for other bug fixes.
Clarify the SIL utilities that return canonical address values for
formal access given the address used by some memory operation:
- stripAccessMarkers
- getAddressAccess
- getAccessedAddress
These are closely related to the code in MemAccessUtils.
Make sure passes use these utilities consistently so that
optimizations aren't defeated by normal variations in SIL patterns.
Create an isLetAddress() utility alongside these basic utilities to
make sure it is used consistently with the address corresponding to
formal access. When this query is used inconsistently, it defeats
optimization. It can also cause correctness bugs because some
optimizations assume that 'let' initialization is only performed on a
unique address value.
Functional changes to Memory Behavior:
- An instruction with side effects now conservatively still has side
effects even when the queried value is a 'let'. Let values are
certainly sensitive to side effects, such as the parent object being
deallocated (see the sketch after this list).
- Return the correct MemBehavior for begin/end_access markers.
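A Swift-level sketch of the first point (hypothetical names; the release that
matters is the implicit one SILGen emits):
```
final class Node {
  let value: Int
  init(value: Int) { self.value = value }
}

func f() -> Int {
  let n = Node(value: 7)
  let v = n.value
  // n's final release may run here and deallocate the object. The memory
  // holding `value` disappears with it, so even a load from a 'let'
  // cannot be treated as unaffected by that side effect.
  return v
}
```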
* Simplified the logic for creating static initializers and constant folding for global variables: instead of creating a getter function, directly inline the constant value into the use-sites.
* Wired up the constant folder in GlobalOpt, so that chains of constant global variables can be propagated (the folded result is sketched after this list), e.g.
```
let a = 1
let b = a + 10
let c = b + 5
```
* Fixed a problem where we didn't create a static initializer if a global is not used in the same module, e.g. a public 'let' variable.
* Simplified the code in general.
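A sketch of the intended effect on the chain above (assuming all three globals
live in the same module):
```
// After folding, each global gets a static initializer and uses are
// rewritten to the constant value:
let a = 1       // static initializer: 1
let b = a + 10  // static initializer: 11
let c = b + 5   // static initializer: 16
print(c)        // constant folded to print(16)
```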
rdar://problem/31515927
Add ExecuteSILPipelineRequest which executes a
pipeline plan on a given SIL (and possibly IRGen)
module. This serves as a top-level request for
the SILOptimizer that we'll be able to hang
dependencies off.
Changes:
* Allow optimizing partial_apply capturing opened existential: we didn't do this originally because it was complicated to insert the required alloc/dealloc_stack instructions at the right places. Now we have the StackNesting utility, which makes this easier.
* Support indirect-in parameters. Not super important, but why not? It's also easy to do with the StackNesting utility.
* Share code between dead closure elimination and the apply(partial_apply) optimization. It's a bit of refactoring, and it allowed us to eliminate some code that is no longer used.
* Fix an ownership problem: We inserted copies of partial_apply arguments _after_ the partial_apply (which consumes the arguments).
* When replacing an apply(partial_apply) -> apply and the partial_apply becomes dead, avoid inserting copies of the arguments twice.
These changes don't have any immediate effect on our current benchmarks, but will allow eliminating curry thunks for existentials.
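A hedged Swift sketch of the existential case (hypothetical names):
```
protocol Greeter { func greet() }

func call(_ g: Greeter) {
  // Referencing the method as a value produces a curry thunk: a
  // partial_apply capturing the opened existential. With the changes
  // above, the subsequent apply of that closure can be folded into a
  // direct apply, allowing the thunk to be eliminated.
  let f = g.greet
  f()
}
```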
... including all SIL functions which are transitively referenced from such witness tables.
After the last devirtualizer run, witness tables are no longer needed in the optimizer.
We can delete witness tables with an available-externally linkage. IRGen does not emit such witness tables anyway.
This can save a little bit of compile time, because it reduces the amount of SIL at the end of the optimizer pipeline.
It also reduces the size of the SIL output after the optimizer, which makes debugging the SIL output easier.
This commit eliminates the need for the mark_uninitialized fixup by updating
the compiler so that we now emit:
```
%0 = alloc_box
%1 = mark_uninitialized %0
%2 = project_box %1
...
destroy_value %1
```
Instead of:
```
%0 = alloc_box
%1 = project_box %0
%2 = mark_uninitialized %1
...
destroy_value %0
```
Now that the first form of SIL is generated, I can change project_box to take
only guaranteed arguments. This will ensure that the OSSA ARC optimizer can
eliminate copies of boxes without needing to understand the usage of the
project_box.
To avoid duplicate-symbol linker errors, make them alwaysEmitIntoClient instead.
But only do that for thunks, to limit the code-size impact. In any case, it only matters for thunks, because thunks are generated in SILGen, i.e. before CrossModuleSerializationSetup.
Other shared functions, e.g. specializations, are created after CrossModuleSerializationSetup, so we don't have to deal with them.