352 Commits

Author SHA1 Message Date
Andrew Trick
b2d1ac1631 Add AccessPathVerification pass and run it in the pipeline. 2020-10-16 15:00:10 -07:00
Arnold Schwaighofer
b994bf3191 Add support for _specialize(exported: true, ...)
This attribute allows defining a pre-specialized entry point of a
generic function in a library.

The following definition provides a pre-specialized entry point for
`genericFunc(_:)` for the parameter type `Int` that clients of the
library can call.

```
@_specialize(exported: true, where T == Int)
public func genericFunc<T>(_ t: T) { ... }
```

Pre-specializations of internal `@inlinable` functions are allowed.

```
@usableFromInline
internal struct GenericThing<T> {
  @_specialize(exported: true, where T == Int)
  @inlinable
  internal func genericMethod(_ t: T) {
  }
}
```

There is syntax to pre-specialize a method from a different module.

```
import ModuleDefiningGenericFunc

@_specialize(exported: true, target: genericFunc(_:), where T == Double)
func prespecialize_genericFunc(_ t: T) { fatalError("dont call") }

```

Specially marked extensions allow for pre-specialization of internal
methods across module boundaries (respecting `@inlinable` and
`@usableFromInline`).

```
import ModuleDefiningGenericThing
public struct Something {}

@_specializeExtension
extension GenericThing {
  @_specialize(exported: true, target: genericMethod(_:), where T == Something)
  func prespecialize_genericMethod(_ t: T) { fatalError("dont call") }
}
```

rdar://64993425
2020-10-12 09:19:29 -07:00
Erik Eckstein
68f485424c SILOptimizer: add an additional TempRValueOpt pass later in the pipeline.
This can compensate for the performance regression caused by the more conservative handling of function calls in TempRValueOpt (see previous commit).
The pass runs after the inlining passes and can therefore optimize in some cases where it's not possible before inlining.
2020-10-09 20:54:59 +02:00
Michael Gottesman
4c8d09feb3 [ownership] Move ownership lowering past SROA.
I already updated SROA for this and we already have tests/etc. We have just been
waiting on some other passes to be moved afterwards.
2020-09-30 16:08:44 -05:00
Michael Gottesman
c3bc8e8ef9 [ownership] Move ownership elimination on the stdlib past lower aggregate instrs. 2020-09-30 11:43:34 -05:00
Michael Gottesman
d1f43032fc [ownership] Move ownership past TempLValueOpt for the stdlib and add an ossa test case. 2020-09-29 16:36:12 -05:00
Meghana Gupta
163d47ec90 Revert "Revert #33106 and #33205" (#34106) 2020-09-28 23:08:14 -07:00
Meghana Gupta
49d93c58a7 Revert "[ownership] Move OME after CopyForwarding (#33106)"
This reverts commit ef972eb34d.
2020-09-25 11:49:07 -07:00
Meghana Gupta
ef972eb34d [ownership] Move OME after CopyForwarding (#33106)
* Move OME after CopyForwarding

* Minor fix in CopyForwarding test
2020-09-24 20:59:28 -07:00
Michael Gottesman
4cbc07c6c6 [ownership] Add a frontend option to stop optimizing right before we lower ownership.
Specifically the option: -sil-stop-optzns-before-lowering-ownership. This makes
it possible to write end-to-end tests on OSSA passes. Previously, one had to
pattern match after ownership was lowered, losing the ability to do fine-grained
FileCheck pattern matching on OSSA itself.
2020-09-17 18:02:33 -05:00
Erik Eckstein
4d03eb4f0f SILOptimizer: Move the StringOptimization a bit earlier in the pipeline.
Needed to make sure that global initializers are not optimized in mid-level SIL while other functions are still in high-level SIL.
Having the StringOptimization not in high-level SIL was just a mistake in my earlier PR.
2020-08-03 12:01:29 +02:00
Erik Eckstein
2a035432e7 SILOptimizer: make a separate SROA pass for high-level SIL, which doesn't split String types.
The StringOptimization relies on seeing String values as a whole and not being split.
2020-08-03 12:01:29 +02:00
Erik Eckstein
7f684b62e2 SIL optimizer: Add a new string optimization.
Optimizes String operations with constant operands.

Specifically:
  * Replaces x.append(y) with x = y if x is empty.
  * Removes x.append("")
  * Replaces x.append(y) with x = x + y if x and y are constant strings.
  * Replaces _typeName(T.self) with a constant string if T is statically known.

With this optimization it's possible to constant fold string interpolations, like "the \(Int.self) type" -> "the Int type"

This new pass runs on high-level SIL, where semantic calls are still in place.
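
As a source-level illustration of the patterns listed above, the following sketch shows Swift code of the kind the pass targets. The variable names are made up for this example, and no claim is made about which cases a particular compiler version actually folds; the observable results are the same with or without the optimization.

```swift
// Hypothetical source-level patterns matching the cases listed above.

// x.append(y) where x is empty: equivalent to x = y.
var greeting = ""
greeting.append("hello")

// x.append(""): a no-op that can be removed.
greeting.append("")

// x.append(y) where both are constant strings: foldable to one constant.
var path = "usr"
path.append("/bin")

// Interpolating the name of a statically known type.
let description = "the \(Int.self) type"

print(greeting, path, description)
```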

rdar://problem/65642843
2020-07-27 21:32:56 +02:00
Joe Groff
b4a0ceac71 Add PruneVTables to the performance optimizer passes. 2020-07-23 20:40:49 -07:00
Michael Gottesman
76c7c3e579 [opt-remark] Add support for emitting opt-remark-generator remarks when compiling with optimization.
In order to test this, I implemented a small source loc inference routine for
instructions without valid SILLocations. This is an optional knob that the
opt-remark writer can enable on a per-remark basis. The current
behaviors are just forward/backward scans in the same basic block. When scanning
forwards, if we find a valid SourceLoc, we just use that. When scanning
backwards, we instead grab the SourceRange and, if it is valid, use the end of
the source range of the given instruction. This seems to give a good result for
retain (forward scan) and release (backward scan).

The specific reason I did this is that my test cases are retain/release
operations. Due to code motion, these operations are often moved around by
optimizations and given auto-generated SIL locations (rightly, to prevent user
confusion). Since that is the test case I am using, I needed said engine to
test this.
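
The forward/backward scan described above can be sketched with a toy model. Everything here (`SourceLoc`, `Instruction`, `inferLocation`) is a hypothetical stand-in written for illustration, not the compiler's data structures or API.

```swift
// Toy model of the scan; these types are hypothetical stand-ins.
struct SourceLoc: Equatable { var line: Int }

struct Instruction {
    var name: String
    var loc: SourceLoc?        // nil models an invalid location
    var rangeEnd: SourceLoc?   // end of the source range, nil if invalid
}

enum ScanDirection { case forward, backward }

/// Infers a location for `block[index]` by scanning the same basic block.
func inferLocation(in block: [Instruction], at index: Int,
                   direction: ScanDirection) -> SourceLoc? {
    switch direction {
    case .forward:
        // Use the first valid location found after the instruction.
        for inst in block[(index + 1)...] {
            if let loc = inst.loc { return loc }
        }
    case .backward:
        // Use the end of the first valid source range found before it.
        for inst in block[..<index].reversed() {
            if let end = inst.rangeEnd { return end }
        }
    }
    return nil
}

let block = [
    Instruction(name: "apply", loc: SourceLoc(line: 10), rangeEnd: SourceLoc(line: 12)),
    Instruction(name: "retain", loc: nil, rangeEnd: nil),
    Instruction(name: "release", loc: nil, rangeEnd: nil),
    Instruction(name: "store", loc: SourceLoc(line: 20), rangeEnd: SourceLoc(line: 21)),
]
let retainLoc = inferLocation(in: block, at: 1, direction: .forward)
let releaseLoc = inferLocation(in: block, at: 2, direction: .backward)
```

In this model the retain picks up the location of the following store, and the release picks up the end of the preceding apply's range.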
2020-07-20 12:01:34 -07:00
Michael Gottesman
0dbed44ddd [ownership] Move ownership lowering past the eager specializer on the stdlib. 2020-07-10 15:31:59 -07:00
Meghana Gupta
f8d8091c98 [ownership] Move ome after GlobalOpt (#32742) 2020-07-08 14:54:59 -07:00
Michael Gottesman
ba2e04be7e [ownership] Move the stdlib ome point to before global opt.
This just moves it past the SIL linker (which, since the stdlib doesn't link
anything, will not change anything) and past TempRValueOpt, which is already
updated for OSSA.
2020-06-26 14:04:48 -07:00
Michael Gottesman
5b6918fd3f Merge pull request #32505 from gottesmm/pr-12b4fd6015e37d9a95ea6a81da117e6678369d02
[ownership] Split ownership lowering in the pass pipeline for non-transparent stdlib vs non-stdlib functions.
2020-06-23 15:02:35 -07:00
Michael Gottesman
3530f8e26d [ownership] Split ownership lowering in the pass pipeline for non-transparent stdlib vs non-stdlib functions.
I am going to be moving back ownership lowering first in the stdlib so that we
can bring up the optimizer on ownership without needing to deal with
serialization issues (the stdlib doesn't deserialize SIL from any other
modules).

This patch just begins the mechanical process with a nice commit message. Should
be NFC.
2020-06-22 18:32:17 -07:00
Erik Eckstein
a7425c16ff Improvements for cross-module-optimization
* Include small non-generic functions for serialization
* Serialize initializers of global variables so that global let variables can be constant propagated across modules

rdar://problem/60696510
2020-06-22 16:49:26 +02:00
Erik Eckstein
9e92389fa5 SILOptimizer: a new "TempLValueOpt" optimization pass for copy_addr
Optimizes copies from a temporary (an "l-value") to a destination.

    %temp = alloc_stack $Ty
    instructions_which_store_to %temp
    copy_addr [take] %temp to %destination
    dealloc_stack %temp

is optimized to

    destroy_addr %destination
    instructions_which_store_to %destination

The name TempLValueOpt refers to the TempRValueOpt pass, which performs a related transformation, just with the temporary on the "right" side.
The TempLValueOpt is similar to CopyForwarding::backwardPropagateCopy.
It's more restricted (e.g. the copy-source must be an alloc_stack).
That enables other patterns to be optimized, which backwardPropagateCopy cannot handle.

This pass also performs a small peephole optimization which simplifies copy_addr - destroy sequences.

    copy_addr %source to %destination
    destroy_addr %source

is replaced with

    copy_addr [take] %source to %destination
2020-06-22 13:47:31 +02:00
Michael Gottesman
46432404f3 [ownership] Remove dead option: enable-ownership-stripping-after-serialization.
We always lower ownership now after the diagnostic passes (what this option
actually controlled). So remove it.

NFC.
2020-06-16 10:52:02 -07:00
Michael Gottesman
702c1bc5e8 [arc] Change guaranteed arc opts to be based on SemanticARCOpts and move from Diagnostic pipeline -> Onone pipeline.
The pass is already not being run during normal compilation scenarios today
since it bails on OSSA except in certain bit-rot situations where a test wasn't
updated and so was inadvertently invoking the pass. I discovered these while
originally just trying to eliminate the pass from the diagnostic pipeline. The
reason why I am doing this in one larger change is that I found there were a
bunch of sil tests inadvertently relying on guaranteed arc opts to eliminate
copy traffic. So, if I just removed this and did this in two steps, I would
basically be unoptimizing then re-optimizing the tests.

Some notes:

1. The new guaranteed arc opts is based off of SemanticARCOpts and runs only on
   ossa. Specifically, in this new pass, we just perform simple
   canonicalizations that do not involve any significant analysis. Some
   examples: a copy_value all of whose uses are destroys. This will do what the
   original pass did and more without more compile time. I did a conservative
   first approximation, but we can probably tune this a bit.

2. The reason why I am doing this now is that I was trying to eliminate the
   enable-ownership-stripping-after-serialization flag and discovered that the
   test opaque_value_mandatory implicitly depends on this since sil-opt by
   default was the only place left in the compiler with that option set to false
   by default. So I am eliminating that dependency before I land the larger
   change.
2020-06-15 17:00:18 -07:00
Erik Eckstein
6569c98332 SIL optimizer: add an additional stack promotion pass to the late pipeline
Sometimes stack promotion can catch cases only at a late stage of the pipeline, after FunctionSignatureOpts.

https://bugs.swift.org/browse/SR-12773
rdar://problem/63068408
2020-05-28 10:23:40 +02:00
Erik Eckstein
216eec2d21 SIL optimizer: add an additional LICM pass to the pipeline.
The COWOpts optimization relies more on LICM. This additional run of the pass ensures that there is no phase ordering issue between LICM and COWOpts.
2020-05-26 18:01:17 +02:00
Erik Eckstein
9722578df6 SILOptimizer: a new optimization for copy-on-write
Constant folds the uniqueness result of begin_cow_mutation instructions, if it can be proved that the buffer argument is uniquely referenced.
For example:

     %buffer = end_cow_mutation %mutable_buffer
     // ...
     // %buffer does not escape here
     // ...
     (%is_unique, %mutable_buffer2) = begin_cow_mutation %buffer
     cond_br %is_unique, ...

is replaced with

     %buffer = end_cow_mutation [keep_unique] %mutable_buffer
     // ...
     (%not_used, %mutable_buffer2) = begin_cow_mutation %buffer
     %true = integer_literal 1
     cond_br %true, ...

Note that the keep_unique flag is set on the end_cow_mutation because the code now relies on the buffer really being uniquely referenced.

The optimization can also handle def-use chains between end_cow_mutation and begin_cow_mutation which involve phi-arguments.

An additional peephole optimization is performed: if the begin_cow_mutation is the only use of the end_cow_mutation, the whole pair of instructions is eliminated.
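
For readers less familiar with copy-on-write, the sketch below shows the kind of source-level uniqueness check whose result this optimization aims to fold. It is a hand-written analogue using `isKnownUniquelyReferenced`; the `Storage` and `Buffer` types are hypothetical, and the sketch does not claim to produce the SIL shown above.

```swift
// Minimal hand-written copy-on-write container (illustrative only).
final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct Buffer {
    private var storage = Storage([])
    private(set) var copies = 0   // counts how often the storage was copied

    mutating func append(_ value: Int) {
        // Uniqueness check: copy the storage only if it is shared.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.values)
            copies += 1
        }
        storage.values.append(value)
    }

    var values: [Int] { storage.values }
}

var a = Buffer()
a.append(1)   // storage is unique: no copy
a.append(2)   // nothing escaped in between, so the check must succeed again
var b = a     // storage is now shared between a and b
b.append(3)   // shared: the storage is copied before mutating
print(a.values, b.values, a.copies, b.copies)
```

The second `append` on `a` is the interesting case: the buffer cannot have become shared since the previous mutation, so the outcome of the check is known in advance.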
2020-05-26 18:01:17 +02:00
Erik Eckstein
ad99b9d4f8 SILOptimizer: a new phi-argument expansion optimization.
If only a single field of a struct phi-argument is used, replace the argument by the field value.

     br bb(%str)
   bb(%phi):
     %f = struct_extract %phi, #Field // the only use of %phi
     use %f

is replaced with

     %f = struct_extract %str, #Field
     br bb(%f)
   bb(%phi):
     use %phi

This also works if the phi-argument is in a def-use cycle.

The new PhiExpansionPass is in the same file as the RedundantPhiEliminationPass. Therefore I renamed the source file to PhiArgumentOptimizations.cpp
2020-05-25 09:36:09 +02:00
Meghana Gupta
47fe49a2a9 Fix the mid-level function-pass pipeline (#31424)
* Fix the mid-level pass pipeline.

Module passes need to be in a separate pipeline, otherwise the
pipeline restart mechanism will be broken.

This makes GlobalOpt and serialization run earlier in the
pipeline. There's no explicit reason for them to be run later, in the
middle of a function pass pipeline.

Also, pipeline boundaries, like serialization and module passes should
be explicit at the top level function that creates the pass
pipelines.

* SILOptimizer: Add enforcement of function-pass pipelines.

Don't allow module passes to be inserted within a function pass
pipeline. This silently breaks the function pipeline both interfering
with analysis and the normal pipeline restart mechanism.

* Add missing pass in addFunctionPasses
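
The enforcement idea can be sketched as a toy model. The types and pass names below are hypothetical and are not the compiler's API; in particular, reporting the violation as a thrown error is an assumption made for this example.

```swift
// Toy model; these types are hypothetical and are not the compiler's API.
enum PassKind { case function, module }

struct Pass {
    var name: String
    var kind: PassKind
}

struct PipelineError: Error {
    var message: String
}

struct FunctionPassPipeline {
    private(set) var passes: [Pass] = []

    // Rejects module passes so the function pipeline stays homogeneous.
    mutating func add(_ pass: Pass) throws {
        guard pass.kind == .function else {
            throw PipelineError(message: "\(pass.name) is a module pass")
        }
        passes.append(pass)
    }
}

var pipeline = FunctionPassPipeline()
try pipeline.add(Pass(name: "ExampleFunctionPass", kind: .function))

var rejected = false
do {
    try pipeline.add(Pass(name: "ExampleModulePass", kind: .module))
} catch {
    rejected = true
}
print(pipeline.passes.map { $0.name }, rejected)
```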

Co-authored-by: Andrew Trick <atrick@apple.com>
2020-05-03 18:23:40 -07:00
Erik Eckstein
53f6fdadc6 SILOptimizer: reorganize the optimization-prepare pass pipeline
Don't create a separate pass manager for those passes, just let them run at the beginning of the performance pipeline.
Regarding generated code this is a NFC.

This change fixes a problem with pass-bisecting (for debugging). Having two instances of the pass manager can cause trouble with bisecting, because -sil-opt-pass-count affects both pass managers at the same time.
2020-04-24 15:48:48 +02:00
ematejska
4cd68edf8c [Autodiff upstream] Add DifferentiabilityWitnessDevirtualizer SILOptimizer pass (#30984)
Add DifferentiabilityWitnessDevirtualizer: an optimization pass that
devirtualizes `differentiability_witness_function` instructions into
`function_ref` instructions.

Co-authored-by: Dan Zheng <danielzheng@google.com>
2020-04-23 02:13:05 -07:00
Dan Zheng
1775e8ae16 [AutoDiff upstream] Add VJPEmitter.
`VJPEmitter` is a cloner that emits VJP functions. It implements reverse-mode
automatic differentiation, along with `PullbackEmitter`.

`VJPEmitter` clones an original function, replacing function applications with
VJP function applications. In VJP functions, each basic block takes a pullback
struct (containing callee pullbacks) and produces a predecessor enum: these data
structures are consumed by pullback functions.
2020-04-05 20:35:35 -07:00
Dan Zheng
aa66cce808 [AutoDiff upstream] Add differentiation transform.
The differentiation transform does the following:
- Canonicalizes differentiability witnesses by filling in missing derivative
  function entries.
- Canonicalizes `differentiable_function` instructions by filling in missing
  derivative function operands.
- If necessary, performs automatic differentiation: generating derivative
  functions for original functions.
  - When encountering non-differentiable code, produces a diagnostic and
    errors out.

Partially resolves TF-1211: add the main canonicalization loop.

To incrementally stage changes, derivative functions are currently created
with empty bodies that fatal error with a nice message.

Derivative emitters will be upstreamed separately.
2020-04-02 15:43:57 -07:00
Erik Eckstein
93a0dfc578 SILOptimizer: a new small optimization pass to remove redundant basic block arguments.
RedundantPhiElimination eliminates block phi-arguments which have the same value as other arguments of the same block.
This also works with cycles, like two equivalent loop induction variables. Such patterns are generated e.g. when using stdlib's enumerated() on Array.

   preheader:
     br header(%initval, %initval)
   header(%phi1, %phi2):
     %next1 = builtin "add" (%phi1, %one)
     %next2 = builtin "add" (%phi2, %one)
     cond_br %loopcond, header(%next1, %next2), exit
   exit:

is replaced with

   preheader:
     br header(%initval)
   header(%phi1):
     %next1 = builtin "add" (%phi1, %one)
     %next2 = builtin "add" (%phi1, %one) // dead: will be cleaned-up later
     cond_br %loopcond, header(%next1), exit
   exit:

Any remaining dead or "trivially" equivalent instructions will then be cleaned-up by DCE and CSE, respectively.
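
As a source-level illustration of the `enumerated()` case mentioned above, the sketch below shows a loop in which the counter and the array position advance in lockstep, next to a hand-written equivalent with a single induction variable. The names are made up for this example, and no claim is made about the exact SIL generated for it.

```swift
// enumerated() pairs each element with a counter. For an Array, the counter
// and the array index advance in lockstep, which is the kind of equivalent
// induction variable pair described above.
let values = [10, 20, 30]
var weightedSum = 0
for (offset, value) in values.enumerated() {
    weightedSum += offset * value
}

// Hand-written equivalent with a single induction variable.
var weightedSumSingle = 0
for index in values.indices {
    weightedSumSingle += index * values[index]
}
print(weightedSum, weightedSumSingle)
```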

rdar://problem/33438123
2020-03-26 19:30:01 +01:00
Erik Eckstein
e0c4fa2d92 SILOptimizer: add a LICM pass at mid-level in the pass pipeline.
It's needed to hoist global_init calls.
2020-03-23 15:53:23 +01:00
Michael Gottesman
e976aa9071 Merge pull request #29111 from gottesmm/pr-d5a69902a451af42884c0e9cc4b6f18ecf246ada
[passmanager] Change SIL pass pipeline plan to use an LLVM YAML representation.
2020-03-11 16:11:55 -07:00
Michael Gottesman
52c5f721b9 [passmanager] Change SIL pass pipeline plan to use an LLVM YAML representation.
This eliminates a bunch of code and will make it significantly easier to
maintain/add to this code/use this code.
2020-03-11 14:14:18 -07:00
Michael Gottesman
d1b41e9ac4 [ownership] Add an extra run of TempRValueOpt before Semantic ARC Opts.
There is a natural synergy between the two passes since TempRValueOpt often
eliminates temporaries that prevent Semantic ARC Opts from removing ARC
traffic.

NOTE: The reason why I am adding an extra run rather than moving the
TempRValueOpt that runs slightly after SemanticARCOpts on non-ownership SIL is
that the run afterwards is able to run on non-ossa code from the stdlib/etc and
eliminate copies that way. With time once we transition to always serializing in
OSSA form, we will be able to get rid of that second run.
2020-02-26 15:24:07 -08:00
Ravi Kandhadai
ec9844b2d9 [SIL Optimization] Add a new mandatory pass for unrolling forEach
calls over arrays created from array literals. This enables further
optimizing the output of the OSLogOptimization pass, and results in
highly compact and optimized IR for calls to the new os log API.
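
To illustrate what unrolling means at the source level, here is a sketch of a `forEach` call over an array created from a literal, followed by a hand-unrolled equivalent. The names are made up for this example; the pass itself works on SIL and is not reproduced here.

```swift
// A forEach call over an array created from an array literal ...
let items = [1, 2, 3]
var total = 0
items.forEach { total += $0 }

// ... and a hand-unrolled equivalent: the closure body is applied once per
// literal element, with no loop and no array needed.
var unrolledTotal = 0
let body: (Int) -> Void = { unrolledTotal += $0 }
body(1)
body(2)
body(3)

print(total, unrolledTotal)
```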

<rdar://58928427>
2020-02-07 20:06:29 -08:00
Michael Gottesman
c09f397ce6 Experiment, eliminate early semantic arc opts run. 2020-02-03 10:08:41 -08:00
Erik Eckstein
03b0a6c148 DeadFunctionElimination: remove externally available witness tables at the end of the pipeline
... including all SIL functions which are transitively referenced from such witness tables.

After the last devirtualizer run witness tables are not needed in the optimizer anymore.
We can delete witness tables with an available-externally linkage. IRGen does not emit such witness tables anyway.
This can save a little bit of compile time, because it reduces the amount of SIL at the end of the optimizer pipeline.
It also reduces the size of the SIL output after the optimizer, which makes debugging the SIL output easier.
2020-01-27 14:45:10 +01:00
Meghana Gupta
5285bf7200 Turn off speculative devirtualization by default. (#29359)
Turn off speculative devirtualization by default. Add a flag to support enabling the pass.
Fixes rdar://58778959 and rdar://58429282
2020-01-24 10:23:06 -08:00
Michael Gottesman
28ffcf9a7a [ownership] Eliminate the need for mark_uninitialized fixup.
This commit eliminates the need for mark uninitialized fixup by updating the
compiler so that we now emit:

```
%0 = alloc_box
%1 = mark_uninitialized %0
%2 = project_box %1
...
destroy_value %1
```

Instead of:

```
%0 = alloc_box
%1 = project_box %0
%2 = mark_uninitialized %1
...
destroy_value %0
```

Now that the first type of code is generated, I can change project_box to only
take guaranteed arguments. This will ensure that the OSSA ARC optimizer can
eliminate copies of boxes without needing to understand the usage of the
project_box.
2020-01-02 09:54:18 -08:00
Erik Eckstein
f03956b30c Cross-module-optimization: Serialize immediately after CrossModuleSerializationSetup
Otherwise it can happen that e.g. specialization runs between CrossModuleSerializationSetup and serialization, with the result that an inlinable function references a shared function (which doesn't have a public linkage).
The solution is to move serialization right after CrossModuleSerializationSetup. But only do that if cross-module-optimization is enabled (it would be a disruptive change to move serialization in general).
2019-12-11 18:14:41 +01:00
swift-ci
f0157b0f87 Merge pull request #28473 from atrick/arrayproperty-opt 2019-12-03 10:32:49 -08:00
Erik Eckstein
a5397b434c Cross module optimization
This is a first version of cross module optimization (CMO).

The basic idea for CMO is to use the existing library evolution compiler features, but in an automated way. A new SIL module pass "annotates" functions and types with @inlinable and @usableFromInline. This results in functions being serialized into the swiftmodule file and thus available for optimizations in client modules.
The annotation is done with a worklist-algorithm, starting from public functions and continuing with entities which are used from already selected functions. A heuristic performs a preselection on which functions to consider - currently just generic functions are selected.

The serializer then writes annotated functions (including function bodies) into the swiftmodule file of the compiled module. Client modules are able to de-serialize such functions from their imported modules and use them for optimizations, like generic specialization.

The optimization is gated by a new compiler option -cross-module-optimization (also available in the swift driver).
By default this option is off. Without turning the option on, this change is (almost) a NFC.
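
The worklist selection described above can be sketched as a toy model. `FunctionInfo` and `selectForSerialization` are hypothetical names, and how the heuristic interacts with the worklist is an assumption here: public generic functions seed the worklist, and everything they use transitively is selected as well.

```swift
// Toy model of the worklist; all types and names are hypothetical. The real
// pass works on SIL functions and types.
struct FunctionInfo {
    var isPublic: Bool
    var isGeneric: Bool
    var uses: [String]   // names of functions referenced from the body
}

/// Returns the names of the functions selected for serialization.
func selectForSerialization(_ module: [String: FunctionInfo]) -> Set<String> {
    // Assumed heuristic: seed the worklist with public generic functions.
    var worklist = module.filter { $0.value.isPublic && $0.value.isGeneric }
                         .map { $0.key }
    var selected = Set(worklist)

    // Continue with everything used from already selected functions.
    while let name = worklist.popLast() {
        for callee in module[name]?.uses ?? [] where !selected.contains(callee) {
            selected.insert(callee)
            worklist.append(callee)
        }
    }
    return selected
}

let module: [String: FunctionInfo] = [
    "publicGeneric": FunctionInfo(isPublic: true, isGeneric: true, uses: ["helper"]),
    "helper": FunctionInfo(isPublic: false, isGeneric: false, uses: ["leaf"]),
    "leaf": FunctionInfo(isPublic: false, isGeneric: false, uses: []),
    "publicPlain": FunctionInfo(isPublic: true, isGeneric: false, uses: ["unrelated"]),
    "unrelated": FunctionInfo(isPublic: false, isGeneric: false, uses: []),
]
let selected = selectForSerialization(module)
print(selected.sorted())
```

In this model `publicPlain` and `unrelated` are left out because nothing selected reaches them, while `helper` and `leaf` are pulled in through `publicGeneric`.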

rdar://problem/22591518
2019-12-03 14:37:01 +01:00
Andrew Trick
4da33e15ad Cleanup: move ArrayPropertyOpt out of COWArrayOpt.cpp.
These are separate, mostly unrelated passes. Putting them in their own
files makes it easier to read the code, understand how to control the
passes, and makes it possible to independently trace, and debug them.
2019-11-25 11:53:49 -08:00
Michael Gottesman
47de65c17a [ownership] Add an additional run of the SemanticARCOpts at the beginning of the perf pipeline.
I left in the run before DestroyHoisting since I believe that DestroyHoisting
depends a bit on SemanticARCOpts running, but at the same time I don't want to
deal with any regressions that may come from moving DestroyHoisting.
2019-11-20 17:08:03 -08:00
Michael Gottesman
8de96f3959 [silopt] Add a new SerializeSIL utility pipeline.
This pipeline just runs the Serialize SIL pass. The reason I am adding this is
that currently, if one passes -disable-sil-perf-optzns or messes with
-sil-opt-pass-count, one can cause serialization to not occur, making it
difficult to bisect/turn off-on opts.
2019-11-18 16:14:57 -08:00
Arnold Schwaighofer
8aaa7b4dc1 SILOptimizer: Pipe through TypeExpansionContext 2019-11-11 14:21:52 -08:00