Type annotations for instruction operands are omitted, e.g.
```
%3 = struct $S(%1, %2)
```
Operand types are redundant anyway and were only used for sanity checking in the SIL parser.
But: operand types _are_ printed if the definition of the operand value has not been printed yet.
This happens:
* if the block with the definition appears after the block where the operand's instruction is located
* if a block or instruction is printed in isolation, e.g. in a debugger
The old behavior can be restored with `-Xllvm -sil-print-types`.
This option has been added to many existing test files which check for operand types in their check-lines.
The ComputeEffects pass derives escape information for function arguments and adds those effects to the function.
This requires many changes to check-lines in the tests, because the effects are printed in SIL.
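For illustration, here is a minimal Swift sketch of the kind of function the pass analyzes (the class and function names are hypothetical, not from this patch): the argument is only read and never stored or returned, so the pass can record a non-escaping effect for it.
```
final class Node {
  var value = 0
}

// The pass can derive that the argument `node` does not escape:
// it is only loaded from here, never stored anywhere or returned.
func readValue(_ node: Node) -> Int {
  return node.value
}
```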
This pass makes assumptions about the visibility of a type's memory
based on the visibility of its properties. This is the wrong way to
think about memory visibility.
Fixes <rdar://57564377> Struct component with 'let', unexpected
behavior
This pass wants to assume that the contents of a property are known based
on whether the property is declared as a 'let' and the visibility of
the initializers that access the property. For example:
```
public struct X<T> {
  public let hidden: T
  init(t: T) { self.hidden = t }
}
```
The pass currently assumes that `X` only takes on values that are
assigned by invocations of `X.init`, which is only visible in `X`'s
module. This is wrong if the layout of `X` is exposed to other
modules. A struct's memory may be initialized by any module with
access to the struct's layout.
In fact, this assumption is wrong even if the struct and its 'let'
property cannot be accessed externally by name. In this next example,
external modules cannot access `Impl` or `Impl.hidden` by name, but
can still access the memory.
```
internal struct Impl<T> {
  let hidden: T
  init(t: T) { self.hidden = t }
}
public struct Wrapper<T> {
  var impl: Impl<T>
  public var property: T {
    get {
      return impl.hidden
    }
  }
}
```
As long as `Wrapper`'s layout is exposed to other modules, the contents
of `Wrapper`, `Impl`, and `hidden` can all be initialized in another
module:
```
// Legal as long as Wrapper's home module is *not* built with library evolution
// (or if Wrapper is declared `@frozen`).
func inExternalModule(buffer: UnsafeRawPointer) -> Wrapper<Int64> {
return buffer.load(as: Wrapper<Int64>.self)
}
```
If library evolution is enabled and a `public` struct is not declared
`@frozen`, then external modules cannot assume its layout, and therefore
cannot initialize the struct's memory. In that case, it is possible to optimize
`X.hidden` and `Impl.hidden` as if the properties are only initialized inside
their home module.
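As a hedged sketch of that distinction (the type names are invented, and this assumes the module is compiled with library evolution enabled):
```
// Without @frozen, clients of a resilient module cannot rely on the layout,
// so they cannot conjure values of Opaque from raw memory; `hidden` is then
// only ever initialized by code inside this module.
public struct Opaque {
  public let hidden: Int
  public init() { self.hidden = 7 }
}

// With @frozen, the layout becomes part of the ABI, so other modules may
// load or store Exposed through unsafe pointers, just like Wrapper above.
@frozen public struct Exposed {
  public let hidden: Int
  public init() { self.hidden = 7 }
}
```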
The right way to view a type's memory visibility is to consider whether
external modules have access to the layout of the type. If not, then the
property can still be optimized, as long as the struct is never enclosed in a
public, effectively `@frozen` type. However, finding all places where a struct
is explicitly created is still insufficient. Instead, the optimization needs
to find all uses of enclosing types and determine if every use has a known
constant initialization, or is simply copied from another value. If an
escaping unsafe pointer to any enclosing type is created, then the
optimization is not valid.
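As a hedged illustration of that last point (the function below is hypothetical): once code outside the module holds a raw pointer into a `Wrapper<Int64>` value, it can store arbitrary bytes over `hidden` without ever naming `Impl` or calling its initializer.
```
// In an external module:
func overwrite(_ p: UnsafeMutableRawPointer) {
  // Rewrites whatever Wrapper<Int64> lives at `p`, including Impl.hidden,
  // so the optimizer may not assume a known constant value for `hidden`.
  p.storeBytes(of: Int64(42), as: Int64.self)
}
```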
When viewed this way, the fact that a property is declared 'let' is mostly
irrelevant to this optimization; it can be expanded to handle non-'let'
properties. The more salient feature is whether the property has a public
setter.
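For example (a hypothetical type, not from the patch), a `var` with a non-public setter is, for this purpose, no different from a 'let', because external modules still cannot assign to it by name:
```
public struct Counter {
  // No public setter: external modules can only read `count` by name,
  // which for this purpose makes it behave like a 'let'.
  public private(set) var count: Int = 0
}
```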
For now, this optimization only recognizes class properties because class
properties are only accessible via a ref_element_addr instruction. This is a
side effect of the fact that accessing a class property requires a "formal
access". This means that a begin_access marker must be emitted directly on the
address produced by a ref_element_addr. Struct properties are not handled, as
explained above, because they can be indirectly accessed via addresses of
outer types.
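A minimal sketch of the shape that is handled (the names are illustrative): a class 'let' property can only be reached through the class reference, so every access in SIL goes through a ref_element_addr projection that the pass can see.
```
final class Config {
  let limit = 128
}

func readLimit(_ c: Config) -> Int {
  // In SIL, the only way to reach `limit` is through the class reference,
  // i.e. a ref_element_addr projection, so every access point is visible.
  return c.limit
}
```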
Tests which check whether the optimizer is able to generate certain code should never be "worked around" by adding command-line options.
This defeats the purpose of such tests.
Unfortunately, some optimizer deficiencies went unnoticed because this option was added.
To do: there are more such cases which I haven't fixed in this PR yet.
This change could impact Swift programs that previously appeared
well-behaved, but weren't fully tested in debug mode. Now, when running
in release mode, they may trap with the message "error: overlapping
accesses...".
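A hedged example of the kind of code that can now trap (the class and function names are invented): the call passes a class property inout while the callee also reads the same property, which is a dynamic exclusivity violation.
```
final class Cell {
  var value = 0
}

let cell = Cell()

func addCurrent(to target: inout Int) {
  // Reads cell.value while the inout access below is still in progress.
  target += cell.value
}

// Traps at runtime with an overlapping-access diagnostic: the inout access
// to cell.value overlaps the read inside addCurrent(to:).
addCurrent(to: &cell.value)
```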
Recent optimizations have brought performance where I think it needs
to be for adoption. More optimizations are planned, and some
benchmarks should be further improved, but at this point we're ready
to begin receiving bug reports. That will help prioritize the
remaining work for Swift 5.
Of the 656 public microbenchmarks in the Swift repository, there are
still several regressions larger than 10%:
| TEST | OLD | NEW | DELTA | RATIO |
|---|---:|---:|---:|---:|
| ClassArrayGetter2 | 139 | 1307 | +840.3% | **0.11x** |
| HashTest | 631 | 1233 | +95.4% | **0.51x** |
| NopDeinit | 21269 | 32389 | +52.3% | **0.66x** |
| Hanoi | 1478 | 2166 | +46.5% | **0.68x** |
| Calculator | 127 | 158 | +24.4% | **0.80x** |
| Dictionary3OfObjects | 391 | 455 | +16.4% | **0.86x** |
| CSVParsingAltIndices2 | 526 | 604 | +14.8% | **0.87x** |
| Prims | 549 | 626 | +14.0% | **0.88x** |
| CSVParsingAlt2 | 1252 | 1411 | +12.7% | **0.89x** |
| Dictionary4OfObjects | 206 | 232 | +12.6% | **0.89x** |
| ArrayInClass | 46 | 51 | +10.9% | **0.90x** |
The common pattern in these benchmarks is to define an array of data
as a class property and to repeatedly access that array through the
class reference. Each of those class property accesses now incurs a
runtime call. Naturally, introducing a runtime call in a loop that
otherwise does almost no work incurs substantial overhead. This is
similar to the issue caused by automatic reference counting. In some
cases, more sophisticated optimization will be able to determine that the
same object is repeatedly accessed. Furthermore, the overhead of the
runtime call itself can be improved. But regardless of how well we
optimize, there will always be a class of microbenchmarks in which the
runtime check has a noticeable impact.
As a general guideline, avoid performing class property access within
the most performance critical loops, particularly on different objects
in each loop iteration. If that isn't possible, it may help if the
visibility of those class properties is private or internal.
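A hedged sketch of that pattern and of the suggested workaround (the types are invented): hoisting the property access out of the loop pays the access cost once instead of once per iteration.
```
final class Dataset {
  var samples = Array(0 ..< 1_000)
}

func slowSum(_ d: Dataset) -> Int {
  var total = 0
  for i in 0 ..< d.samples.count {
    total += d.samples[i]   // class property access (and its runtime check) on every iteration
  }
  return total
}

func fastSum(_ d: Dataset) -> Int {
  let samples = d.samples   // one property access, hoisted out of the loop
  var total = 0
  for value in samples {
    total += value
  }
  return total
}
```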
Based on the review of my previous commit, I did some refactoring.
The code is now simpler and handles more cases. More tests were added.
The overall idea of this rewrite is that the pass basically tries to check whether it can
see all possible writes (i.e. initializations) into a given let property. Only if it can be proven
that the pass sees all possible writes, and that all of those initializations produce
the same constant, statically known value, does the pass propagate this constant value
into uses of the property.
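As a hedged sketch of the case the rewritten pass is looking for (the class below is hypothetical): every visible initialization of `flag` stores the same constant, so reads of `flag` can be folded to that constant; if any initializer stored a different or non-constant value, the propagation would not apply.
```
final class Settings {
  let flag: Bool

  init() {
    self.flag = true              // constant
  }

  init(copying other: Settings) {
    self.flag = true              // same constant in every visible initialization
  }
}

func isEnabled(_ s: Settings) -> Bool {
  return s.flag                   // can be folded to `true`
}
```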
SR-1026 and rdar://25303106