Computes the side effects for a function, which consists of argument- and global effects.
This is similar to the ComputeEscapeEffects pass, just for side-effects.
In order to fully optimize OptionSet literals, it's important that this function is inlined and fully optimized.
So far this was done by chance, but with COW representation it needs a hint to the optimizer.
Constant-propagate the 0 value when loading "count" or "capacity" from the empty Array, Set or Dictionary storage.
On high-level SIL this optimization is also done by the ArrayCountPropagation pass, but only for Array.
And even for Array it's sometimes needed to propagate the empty-array count when high-level semantics function are already inlined.
Fixes an optimization deficiency for empty OptionSet literals.
https://bugs.swift.org/browse/SR-12046
rdar://problem/58861171
The purpose of this test is to check if the compiler optimizes optionsets as expected. There should not be any additional tweaks necessary, like disabling exclusivity checks.
Fortunately, optimizations are good enough in the meantime that this test also works with enabled exclusivity checking.
This change could impact Swift programs that previously appeared
well-behaved, but weren't fully tested in debug mode. Now, when running
in release mode, they may trap with the message "error: overlapping
accesses...".
Recent optimizations have brought performance where I think it needs
to be for adoption. More optimizations are planned, and some
benchmarks should be further improved, but at this point we're ready
to begin receiving bug reports. That will help prioritize the
remaining work for Swift 5.
Of the 656 public microbenchmarks in the Swift repository, there are
still several regressions larger than 10%:
TEST OLD NEW DELTA RATIO
ClassArrayGetter2 139 1307 +840.3% **0.11x**
HashTest 631 1233 +95.4% **0.51x**
NopDeinit 21269 32389 +52.3% **0.66x**
Hanoi 1478 2166 +46.5% **0.68x**
Calculator 127 158 +24.4% **0.80x**
Dictionary3OfObjects 391 455 +16.4% **0.86x**
CSVParsingAltIndices2 526 604 +14.8% **0.87x**
Prims 549 626 +14.0% **0.88x**
CSVParsingAlt2 1252 1411 +12.7% **0.89x**
Dictionary4OfObjects 206 232 +12.6% **0.89x**
ArrayInClass 46 51 +10.9% **0.90x**
The common pattern in these benchmarks is to define an array of data
as a class property and to repeatedly access that array through the
class reference. Each of those class property accesses now incurs a
runtime call. Naturally, introducing a runtime call in a loop that
otherwise does almost no work incurs substantial overhead. This is
similar to the issue caused by automatic reference counting. In some
cases, more sophistacated optimization will be able to determine the
same object is repeatedly accessed. Furthermore, the overhead of the
runtime call itself can be improved. But regardless of how well we
optimize, there will always a class of microbenchmarks in which the
runtime check has a noticeable impact.
As a general guideline, avoid performing class property access within
the most performance critical loops, particularly on different objects
in each loop iteration. If that isn't possible, it may help if the
visibility of those class properties is private or internal.
Replacing an alloc_stack of a struct/tuple with multiple alloc_stacks of the struct/tuple elements should only be done if the elements are somehow accessed individually.
If not, e.g. if the whole struct/tuple is just copied, there is no benefit of doing SROA.
Although this change has little impact by its own (some small code size wins), it is important for the improvement of let-property optimization.