Commit Graph

298 Commits

Author SHA1 Message Date
Erik Eckstein
569a520db2 SILOptimizer: add a mandatory generic-specialization pass.
This pass is only used for functions with performance annotations (@_noLocks, @_noAllocation).
It runs in the mandatory pipeline and specializes all function calls in performance-annotated functions and functions which are called from such functions.
In addition, the pass performs some related optimizations: devirtualization, constant folding of Builtin.canBeClass, inlining of transparent functions, and memory-access optimizations.
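For illustration, a minimal Swift sketch (hypothetical functions, and assuming these experimental performance annotations are enabled in the compiler being used):

```
func addChecked<T: FixedWidthInteger>(_ a: T, _ b: T) -> T {
    a &+ b   // generic, but allocation-free once specialized
}

@_noAllocation
func combine(_ x: Int, _ y: Int) -> Int {
    // The mandatory pass specializes this generic call for Int so the
    // compiler can verify that no allocation happens at runtime.
    return addChecked(x, y)
}
```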
2021-10-28 18:43:14 +02:00
Meghana Gupta
2d7e20e599 Remove unused mode in OME 2021-10-05 14:19:58 -07:00
Nate Chandler
4ff34aab31 [SIL] Enabled printing canonical module.
To print the module, use the new LLVM flag -sil-print-canonical-module,
which parallels the existing flag -sil-view-canonical-cfg.  When that
flag is passed, the new pass ModulePrinter is added to the diagnostic
pass pipeline after mandatory diagnostics have run.  The new pass just
prints the module to stdout.
2021-09-22 12:35:41 -07:00
Andrew Trick
e85228491d Rename AccessedStorage to AccessStorage
to be consistent with AccessPath and AccessBase.

Otherwise, the arbitrary name difference adds constant friction.
2021-09-21 23:18:24 -07:00
Erik Eckstein
8be0ca07c2 SIL optimizer: remove unbalanced retains/releases from immortal objects
ARC operations don't have an effect on immortal objects, like the empty array singleton or statically allocated arrays.
Therefore we can freely remove retain/release instructions on such objects, even if there is no paired balancing ARC operation.

This optimization can only be done with a minimum deployment target of Swift 5.1, because in that version we added immortal ref count bits.

The optimization is implemented in libswift. Additionally, the remaining logic of simplifying strong_retain and strong_release is also ported to libswift.
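As a hedged Swift-level illustration (hypothetical function), both values below refer to the shared empty array singleton, an immortal object on Swift 5.1+ deployment targets:

```
func makeBuffers() -> ([Int], [Int]) {
    let a: [Int] = []   // references the immortal empty array singleton
    let b = a           // copying it only retains the singleton
    // Any retain/release the optimizer sees on the singleton can be
    // deleted, even without a matching balanced ARC operation.
    return (a, b)
}
```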

rdar://81482156
2021-08-23 10:23:59 +02:00
Michael Gottesman
18670fc389 [assembly-vision] Rename opt remark generator to assembly vision remark generator.
TLDR: The reason I am doing this is that people often confuse assembly
vision remarks with normal opt remarks. I want to accentuate that this is
actually trying to do something different from a traditional opt remark. To that
end I renamed things in the compiler and added a real attribute,
`@_assemblyVision`, to trigger the compiler to emit these remarks and help
everyone remember what this is in their ontology. I explain the
difference below.

----

Normal opt remarks work by the optimizer telling you whether it succeeded or failed
to perform an optimization. Another way of putting this is that opt remarks
try to give the user feedback from an expert system about why it did
or did not do something. There is inherently an act of interpretation in the
optimizer about whether or not to report to the user an 'action' that it
performed.

Assembly vision remarks are instead trying to be an expert tool that acts like an
x-ray. Instead of telling the user what the optimizer did, it is a
simple visitor that walks the IR and emits source locations for where specific
hazards ended up in the program. In this sense it is just telling the user
where certain instructions ended up and using heuristics to relate this to
information at the IR level. To get a sense of this difference, consider the
following Swift code:

```
public class Klass {
    func doSomething() {}
}

var global: Klass = Klass()

@inline(__always)
func bar() -> Klass { global }

@_assemblyVision
@inline(never)
func foo() {
  bar().doSomething()
}
```

In this case, we will emit the following remarks:

```
test.swift:16:5: remark: begin exclusive access to value of type 'Klass'
    bar().doSomething()
    ^
test.swift:7:5: note: of 'global'
var global: Klass = Klass()
    ^
test.swift:16:9: remark: end exclusive access to value of type 'Klass'
    bar().doSomething()
        ^
test.swift:7:5: note: of 'global'
var global: Klass = Klass()
    ^
test.swift:16:11: remark: retain of type 'Klass'
    bar().doSomething()
          ^
test.swift:7:5: note: of 'global'
var global: Klass = Klass()
    ^
test.swift:16:23: remark: release of type 'Klass'
    bar().doSomething()
                      ^
test.swift:7:5: note: of 'global'
var global: Klass = Klass()
    ^
```

Notice how the begin/end of the exclusive access are marked as occurring before
the retain and release of the global. That seems weird, since exclusive access to memory
seems like something that should not escape an exclusivity scope... but in fact
this corresponds directly to what we eventually see in the SIL:

```
// test.sil
sil hidden [noinline] [_semantics "optremark"] @$ss3fooyyF : $@convention(thin) () -> () {
bb0:
  %0 = global_addr @$ss6globals5KlassCvp : $*Klass
  %1 = begin_access [read] [dynamic] [no_nested_conflict] %0 : $*Klass
  %2 = load %1 : $*Klass
  end_access %1 : $*Klass
  %4 = class_method %2 : $Klass, #Klass.doSomething : (Klass) -> () -> (), $@convention(method) (@guaranteed Klass) -> ()
  strong_retain %2 : $Klass
  %6 = apply %4(%2) : $@convention(method) (@guaranteed Klass) -> ()
  strong_release %2 : $Klass
  %8 = tuple ()
  return %8 : $()
} // end sil function '$ss3fooyyF'
```

and assembly,

```
// test.S
_$ss3fooyyF:
	pushq	%rbp
	movq	%rsp, %rbp
	pushq	%r13
	pushq	%rbx
	subq	$32, %rsp
	leaq	_$ss6globals5KlassCvp(%rip), %rdi
	leaq	-40(%rbp), %rsi
	xorl	%edx, %edx
	xorl	%ecx, %ecx
	callq	_swift_beginAccess
	movq	_$ss6globals5KlassCvp(%rip), %r13
	movq	(%r13), %rax
	movq	80(%rax), %rbx
	movq	%r13, %rdi
	callq	_swift_retain
	callq	*%rbx
	movq	%r13, %rdi
	callq	_swift_release
	addq	$32, %rsp
	popq	%rbx
	popq	%r13
	popq	%rbp
	retq
```

So, as one can see, what we are trying to do is inform the user of hazards in the
code without trying to reason about them, automating a task that users often have
to perform by hand: inspecting assembly to determine where runtime calls and
other hazards ended up.
2021-06-12 15:09:46 -07:00
Erik Eckstein
e8d408d036 libswift: implement SILCombine's global_value optimization as an instruction pass in libswift.
The old visit function is kept as a legacy fallback.
2021-06-09 11:33:41 +02:00
Erik Eckstein
1010f1e2ae libswift: infrastructure to define "instruction" passes.
Instruction passes are basically visit functions in SILCombine for a specific instruction type.
With the macro SWIFT_INSTRUCTION_PASS such a pass can be declared in Passes.def.
SILCombine then calls the run function of the pass in libswift.
2021-06-09 11:33:41 +02:00
Erik Eckstein
6fe719624a libswift: implement the MergeCondFail pass in libswift
The old pass is kept as a legacy pass.
2021-06-09 11:33:41 +02:00
Erik Eckstein
9c363fb807 libswift: add an example SILPrinter pass.
SILPrinter prints the SIL of a function.
It's an example pass which demonstrates the basic SIL API of libswift.
2021-06-09 11:33:41 +02:00
Erik Eckstein
81f5e2f467 libswift: Infrastructure to call libswift function passes from the SILOptimizer's PassManager
With the macro SWIFT_FUNCTION_PASS a new libswift function pass can be defined in Passes.def.
The SWIFT_FUNCTION_PASS_WITH_LEGACY macro is similar, but it allows keeping an original C++ “legacy” implementation of the pass, which is used if the compiler is not built with libswift.
2021-06-09 11:30:59 +02:00
Nate Chandler
d765434e2b [IRGen] Restored non-constant async function calls in PAFs.
Previously, because partial apply forwarders for async functions were
not themselves fully-fledged async functions, they were not able to
handle dynamic functions.  Specifically, the reason was that it was not
possible to produce an async function pointer for the partial apply
forwarder because the size to be used was not knowable.

Thanks to https://github.com/apple/swift/pull/36700, that cause has been
eliminated.  With it, partial apply forwarders are fully-fledged async
functions and in particular have their own async function pointers.
Consequently, it is again possible for these partial apply forwarders to
handle non-constant function pointers.

Here, that behavior is restored, by way of reverting part of
ee63777332 while preserving the ABI it
introduced.

rdar://76122027
2021-04-06 15:51:32 -07:00
John McCall
4f6f8b3377 Rewrite hop_to_executor so that it takes a Builtin.Executor in IRGen
The comment in LowerHopToActor explains the design here.
We want SILGen to emit hops to actors, ignoring executors,
because it's easier to fully optimize in a world where deriving
an executor is a non-trivial operation.  But we also want something
prior to IRGen to lower the executor derivation because there are
useful static optimizations we can do, such as doing the derivation
exactly once on a dominance path and strength-reducing the derivation
(e.g. exploiting static knowledge that an actor is a default actor).

There are probably phase-ordering problems with doing this so late,
but hopefully they're restricted to situations like actors that
share an executor.  We'll want to optimize that eventually, but
in the meantime, this unblocks the executor work.
2021-03-30 20:08:41 -04:00
Nate Chandler
ee63777332 [IRGen] Fix ABI for thick async functions.
Previously, thick async functions were represented sometimes as a pair
of (AsyncFunctionPointer, nullptr)--when the thick function was produced
via a thin_to_thick_function, e.g.--and sometimes as a pair of
(FunctionPointer, ThickContext)--when the thick function was produced by
a partial_apply--with the size stored in the slot of the ThickContext.

That optimized for the wrong case: partial applies of dynamic async
functions; in that case, there is no appropriate AsyncFunctionPointer to
form when lowering the partial_apply instruction.  The far more common
case is to know exactly which function is being partially applied.  In
that case, we can form the appropriate AsyncFunctionPointer.

Furthermore, the previous representation made calling a thick function
more complex: it was always necessary to check whether the context was
in fact null and then proceed along two different paths depending on the result.

Here, that behavior is corrected by creating a thunk in a mandatory
IRGen SIL pass in the case that the function that is being partially
applied is dynamic.  That new thunk is then partially applied in place
of the original partial_apply of the dynamic function.
2021-03-15 13:37:40 -07:00
Andrew Trick
b689b1dabe Rename GuaranteedARCOpts to MandatoryARCOpts.
This bleeds into the implementation where "guaranteed" is used
everywhere to talk about optimization of guaranteed values. We need to
use mandatory to indicate we're talking about the pass pipeline.
2021-03-02 22:20:13 -08:00
Erik Eckstein
a17f8c2f3f SILOptimizer: add a diagnostics pass to warn about lifetime issues with weak references.
The DiagnoseLifetimeIssuesPass pass prints a warning if an object is stored to a weak property (or is weakly captured) and destroyed before the property (or captured reference) is ever used again.
This can happen if the programmer relies on the lexical scope to keep an object alive, but copy-propagation can shrink the object's lifetime to its last use.
For example:

  func test() {
    let k = Klass()
    // k is deallocated immediately after the closure capture (a store_weak).
    functionWithClosure({ [weak k] in
                          // crash!
                          k!.foo()
                        })
  }

Unfortunately this pass can only catch simple cases, but it's better than nothing.

rdar://73910632
2021-02-15 11:11:35 +01:00
Erik Eckstein
9f616e5e37 DeadFunctionElimination: rename the pass and some other refactoring
The main change is to rename DeadFunctionElimination -> DeadFunctionAndGlobalElimination, because the pass is now also doing dead-global elimination.
A second change is to remove the FunctionLivenessComputation base class. It’s not used anywhere else.
2021-02-09 19:56:43 +01:00
Andrew Trick
02784a2dc2 Add a variation on CopyPropagation to handle debug_value correctly.
MandatoryCopyPropagation must be a separate pass in order to preserve
all debug_value instructions. CopyPropagation cannot preserve
debug_value because, as a rule, debug information cannot affect -O
behavior.
2021-01-05 20:38:31 -08:00
Andrew Trick
cead6a5122 Add an OptimizedMandatoryCombine pass variant.
It's against the principles of pass design to check the driver mode
within the pass. A pass always needs to do the same thing regardless
of where it runs in the pass pipeline. It also needs to be possible to
test passes in isolation.
2021-01-01 19:22:19 -08:00
eeckstein
56928ba851 Merge pull request #34593 from eeckstein/optimize_hte
[concurrency] SILOptimizer: optimize hop_to_executor instructions.
2020-11-09 09:23:46 +01:00
Ben Barham
7cee600bcd [SILGen] Add flag to skip typechecking and SIL gen for function bodies
Adds a new flag "-experimental-skip-all-function-bodies" that skips
typechecking and SIL generation for all function bodies (where
possible).

`didSet` functions are still typechecked and have SIL generated, because their
body is checked for the `oldValue` parameter, but they are not serialized.
Parsing will generally be skipped as well, but this isn't necessarily
the case, since other flags (e.g. "-verify-syntax-tree") may force delayed
parsing off.
2020-11-06 12:08:19 +10:00
Erik Eckstein
a47ebabe54 [concurrency] SILOptimizer: optimize hop_to_executor instructions.
* Redundant hop_to_executor elimination: if a hop_to_executor is dominated by another hop_to_executor with the same operand, it is eliminated:

      hop_to_executor %a
      ... // no suspension points
      hop_to_executor %a // can be eliminated

* Dead hop_to_executor elimination: if a hop_to_executor is not followed by any code which must run on its actor's executor, it is eliminated:

      hop_to_executor %a
      ... // no instructions which must run on %a
      return

rdar://problem/70304809
2020-11-05 18:48:22 +01:00
Andrew Trick
3128eae3f0 Add NestedSemanticFunctionCheck diagnostic
to check for improperly nested '@_semantics' functions.

Add a missing @_semantics("array.init") in ArraySlice found by the
diagnostic.

Distinguish between array.init and array.init.empty.

Categorize the types of semantic functions by how they affect the
inliner and pass pipeline, and centralize this logic in
PerformanceInlinerUtils. The ultimate goal is to prevent inlining of
"Fundamental" @_semantics calls and @_effects calls until the late
pipeline where we can safely discard semantics. However, that requires
significant pipeline changes.

In the meantime, this change prevents the situation from getting worse
and makes the intention clear. However, it has no significant effect
on the pass pipeline and inliner.
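A hypothetical sketch of the nesting pattern the check inspects (the functions and annotations below are illustrative, not actual stdlib code):

```
@_semantics("array.init.empty")
func emptyInts() -> [Int] { [] }

@_semantics("array.init")
func makeInts(_ n: Int) -> [Int] {
    // One semantic function calling another: the kind of nesting
    // NestedSemanticFunctionCheck diagnoses when it is improper.
    return n == 0 ? emptyInts() : Array(repeating: 0, count: n)
}
```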
2020-10-26 17:02:33 -07:00
Andrew Trick
b2d1ac1631 Add AccessPathVerification pass and run it in the pipeline. 2020-10-16 15:00:10 -07:00
Arnold Schwaighofer
b994bf3191 Add support for _specialize(exported: true, ...)
This attribute allows defining a pre-specialized entry point of a
generic function in a library.

The following definition provides a pre-specialized entry point for
`genericFunc(_:)` for the parameter type `Int` that clients of the
library can call.

```
@_specialize(exported: true, where T == Int)
public func genericFunc<T>(_ t: T) { ... }
```

Pre-specializations of internal `@inlinable` functions are allowed.

```
@usableFromInline
internal struct GenericThing<T> {
  @_specialize(exported: true, where T == Int)
  @inlinable
  internal func genericMethod(_ t: T) {
  }
}
```

There is syntax to pre-specialize a method from a different module.

```
import ModuleDefiningGenericFunc

@_specialize(exported: true, target: genericFunc(_:), where T == Double)
func prespecialize_genericFunc<T>(_ t: T) { fatalError("dont call") }
```

Specially marked extensions allow for pre-specialization of internal
methods across module boundaries (respecting `@inlinable` and
`@usableFromInline`).

```
import ModuleDefiningGenericThing
public struct Something {}

@_specializeExtension
extension GenericThing {
  @_specialize(exported: true, target: genericMethod(_:), where T == Something)
  func prespecialize_genericMethod(_ t: T) { fatalError("dont call") }
}
```

rdar://64993425
2020-10-12 09:19:29 -07:00
Erik Eckstein
2a035432e7 SILOptimizer: make a separate SROA pass for high-level SIL, which doesn't split String types.
The StringOptimization relies on seeing String values as a whole and not being split up.
2020-08-03 12:01:29 +02:00
Erik Eckstein
7f684b62e2 SIL optimizer: Add a new string optimization.
Optimizes String operations with constant operands.

Specifically:
  * Replaces x.append(y) with x = y if x is empty.
  * Removes x.append("")
  * Replaces x.append(y) with x = x + y if x and y are constant strings.
  * Replaces _typeName(T.self) with a constant string if T is statically known.

With this optimization it's possible to constant fold string interpolations, like "the \(Int.self) type" -> "the Int type"
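For example, a hedged sketch (hypothetical function) in which every operand is constant, so the rewrites listed above can fold the whole body:

```
func banner() -> String {
    var s = ""
    s.append("the ")          // append to an empty string: s = "the "
    s.append("\(Int.self)")   // interpolation of a statically known type
    s.append("")              // append of an empty literal: removed
    s.append(" type")         // constant + constant: folded
    return s                  // ideally constant-folds to "the Int type"
}
```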

This new pass runs on high-level SIL, where semantic calls are still in place.

rdar://problem/65642843
2020-07-27 21:32:56 +02:00
Andrew Trick
5091aee6f2 Add an AccessedStorageDumper pass to verify findAccessedStorage.
Rename the existing pass to AccessedStorageAnalysisDumper.

AccessedStorage is useful on its own as a utility without the
analysis. We need a way to test the utility itself.

Add test cases for the previous commit that introduced
FindPhiStorageVisitor.
2020-07-20 16:42:28 -07:00
Michael Gottesman
4559db01cb [opt-remark] Add a pass called Opt Remark Generator that implements small opt remarks that do not result from large dataflow checks.
We do allow for limited def-use traversal to do things like finding debug_value
instructions, but nothing like a full dataflow pass. As a proof of concept I just emit
opt remarks when we see retains and releases. Eventually I would like to also print
out which lvalue the retain comes from, but that is more than I want to do with this
proof of concept.
2020-07-14 19:32:46 -07:00
Erik Eckstein
9e92389fa5 SILOptimizer: a new "TempLValueOpt" optimization pass for copy_addr
Optimizes copies from a temporary (an "l-value") to a destination.

    %temp = alloc_stack $Ty
    instructions_which_store_to %temp
    copy_addr [take] %temp to %destination
    dealloc_stack %temp

is optimized to

    destroy_addr %destination
    instructions_which_store_to %destination

The name TempLValueOpt refers to the TempRValueOpt pass, which performs a related transformation, just with the temporary on the "right" side.
The TempLValueOpt is similar to CopyForwarding::backwardPropagateCopy.
It's more restricted (e.g. the copy source must be an alloc_stack), which in turn
enables it to optimize patterns that backwardPropagateCopy cannot handle.

This pass also performs a small peephole optimization which simplifies copy_addr - destroy sequences.

    copy_addr %source to %destination
    destroy_addr %source

is replaced with

    copy_addr [take] %source to %destination
2020-06-22 13:47:31 +02:00
Michael Gottesman
c749f37e80 [ownership] Add new OwnershipEliminatorPass that does not run when the current module is the stdlib.
As part of bringing up OSSA on the rest of the optimizer, I am going to first be
moving OSSA back for the stdlib module, since the stdlib module does not have any
SIL-based dependencies. To do so, I am adding this pass, which I can place at the
beginning of the pipeline (where NonTransparentFunctionOwnershipModelEliminator
runs today), and then move NonTransparentFunctionOwnershipModelEliminator down as
I update passes. If we are processing the stdlib, the OME doesn't run early and
instead runs late. If we are not processing the stdlib, we first perform an OME
run early and then perform an additional OME run that does nothing, since
lowering ownership is an idempotent operation.
2020-06-09 13:55:02 -07:00
Joe Groff
e3ce94d97d Introduce a PruneVTables pass to mark non-overridden entries.
Start with some high-level checks of whether a declaration is formally final, or has no visible overrides
in the domain of the program visible to the SIL module.

We can eventually adopt more of the logic from the Devirtualizer pass to tell whether call sites are
"effectively final", and maybe make the Devirtualizer consume that information applied by this pass.
2020-06-05 11:03:04 -07:00
Erik Eckstein
9722578df6 SILOptimizer: a new optimization for copy-on-write
Constant folds the uniqueness result of begin_cow_mutation instructions, if it can be proved that the buffer argument is uniquely referenced.
For example:

     %buffer = end_cow_mutation %mutable_buffer
     // ...
     // %buffer does not escape here
     // ...
     (%is_unique, %mutable_buffer2) = begin_cow_mutation %buffer
     cond_br %is_unique, ...

is replaced with

     %buffer = end_cow_mutation [keep_unique] %mutable_buffer
     // ...
     (%not_used, %mutable_buffer2) = begin_cow_mutation %buffer
     %true = integer_literal 1
     cond_br %true, ...

Note that the keep_unique flag is set on the end_cow_mutation because the code now relies on the buffer really being uniquely referenced.

The optimization can also handle def-use chains between end_cow_mutation and begin_cow_mutation which involve phi-arguments.

An additional peephole optimization is performed: if the begin_cow_mutation is the only use of the end_cow_mutation, the whole pair of instructions is eliminated.
2020-05-26 18:01:17 +02:00
Erik Eckstein
ad99b9d4f8 SILOptimizer: a new phi-argument expansion optimization.
If only a single field of a struct phi-argument is used, replace the argument by the field value.

     br bb(%str)
   bb(%phi):
     %f = struct_extract %phi, #Field // the only use of %phi
     use %f

is replaced with

     %f = struct_extract %str, #Field
     br bb(%f)
   bb(%phi):
     use %phi

This also works if the phi-argument is in a def-use cycle.

The new PhiExpansionPass is in the same file as the RedundantPhiEliminationPass. Therefore I renamed the source file to PhiArgumentOptimizations.cpp
2020-05-25 09:36:09 +02:00
Michael Gottesman
82b08c5722 [ownership] Add new pass OwnershipVerifierTextualErrorDumper and use it in ownership verifier FileCheck tests instead of SILVerifier.
This will make it easier for me, with a few further refactors, to make the
ownership verifier's testing mode emit per-function error numbers instead of the
global error number that it emits now.

The reason why this is necessary is that today, the verification done by
-sil-verify-all causes the errors to be emitted. That verification is done on a
per-value level, rather than a per-function level, so it is hard to get per-function
error numbers without doing unprincipled things like propagating around
state saying which function is currently being verified.

This pass instead will let me make the error counter be per ErrorBuilder which
are created per function.

One thing to be aware of is that this /will/ cause SILValue::verifyOwnership to
not emit any output when the testing flag is enabled. This is to ensure that I
do not get duplicate textual error messages from the SILVerifier.
2020-05-04 13:58:56 -07:00
ematejska
4cd68edf8c [Autodiff upstream] Add DifferentiabilityWitnessDevirtualizer SILOptimizer pass (#30984)
Add DifferentiabilityWitnessDevirtualizer: an optimization pass that
devirtualizes `differentiability_witness_function` instructions into
`function_ref` instructions.

Co-authored-by: Dan Zheng <danielzheng@google.com>
2020-04-23 02:13:05 -07:00
Dan Zheng
aa66cce808 [AutoDiff upstream] Add differentiation transform.
The differentiation transform does the following:
- Canonicalizes differentiability witnesses by filling in missing derivative
  function entries.
- Canonicalizes `differentiable_function` instructions by filling in missing
  derivative function operands.
- If necessary, performs automatic differentiation: generating derivative
  functions for original functions.
  - When encountering non-differentiable code, produces a diagnostic and
    errors out.

Partially resolves TF-1211: add the main canonicalization loop.

To incrementally stage changes, derivative functions are currently created
with empty bodies that fatal error with a nice message.

Derivative emitters will be upstreamed separately.
2020-04-02 15:43:57 -07:00
Erik Eckstein
93a0dfc578 SILOptimizer: a new small optimization pass to remove redundant basic block arguments.
RedundantPhiElimination eliminates block phi-arguments which have the same value as other arguments of the same block.
This also works with cycles, like two equivalent loop induction variables. Such patterns are generated e.g. when using stdlib's enumerated() on Array.

   preheader:
     br bb1(%initval, %initval)
   header(%phi1, %phi2):
     %next1 = builtin "add" (%phi1, %one)
     %next2 = builtin "add" (%phi2, %one)
     cond_br %loopcond, header(%next1, %next2), exit
   exit:

is replaced with

   preheader:
     br bb1(%initval)
   header(%phi1):
     %next1 = builtin "add" (%phi1, %one)
     %next2 = builtin "add" (%phi1, %one) // dead: will be cleaned-up later
     cond_br %loopcond, header(%next1), exit
   exit:

Any remaining dead or "trivially" equivalent instructions will then be cleaned-up by DCE and CSE, respectively.
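A hedged Swift-level sketch (hypothetical function) of where such equivalent phi-arguments come from:

```
func weightedSum(_ values: [Int]) -> Int {
    var result = 0
    // enumerated() introduces an offset that advances in lockstep with
    // the array index, i.e. two equivalent loop induction variables
    // which RedundantPhiElimination can merge into one.
    for (offset, value) in values.enumerated() {
        result += offset * value
    }
    return result
}
```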

rdar://problem/33438123
2020-03-26 19:30:01 +01:00
Michael Gottesman
e3f2bb74d2 [inliner] Add a new Inliner that only inlines AlwaysInline functions (but do not put it in the pass pipeline).
We need this anyway for -Onone, and I want to do some experiments with running
this very early so I can expose more of the stdlib (modulo inlining) to the new
ownership optimizing passes.

I also changed how the inliner handles inlining around OSSA: it now checks early
that, if the caller is in OSSA, we only inline if all of the callees that the
caller calls are in OSSA. The intention is to hopefully avoid
weird swings in code-size/perf due to the inliner heuristic's calculation being
artificially skewed by some callees not being available to inline (due
to this difference) when others are already available.
2020-03-10 19:53:18 -07:00
Ravi Kandhadai
ec9844b2d9 [SIL Optimization] Add a new mandatory pass for unrolling forEach
calls over arrays created from array literals. This enables further optimization
of the output of the OSLogOptimization pass, and results in
highly compact, optimized IR for calls to the new os log API.
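A hedged sketch of the targeted pattern, with a hypothetical `log` helper standing in for the os log calls:

```
func log(_ message: String) { print(message) }

func reportPhases() {
    // The array comes from a literal, so the mandatory pass can unroll
    // the forEach into three direct invocations of the closure body.
    ["setup", "run", "teardown"].forEach { phase in
        log("entering phase \(phase)")
    }
}
```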

<rdar://58928427>
2020-02-07 20:06:29 -08:00
Erik Eckstein
03b0a6c148 DeadFunctionElimination: remove externally available witness tables at the end of the pipeline
... including all SIL functions which are transitively referenced from such witness tables.

After the last devirtualizer run witness tables are not needed in the optimizer anymore.
We can delete witness tables with an available-externally linkage. IRGen does not emit such witness tables anyway.
This can save a little bit of compile time, because it reduces the amount of SIL at the end of the optimizer pipeline.
It also reduces the size of the SIL output after the optimizer, which makes debugging the SIL output easier.
2020-01-27 14:45:10 +01:00
Michael Gottesman
5b557afbbe [sil] Now that we aren't using mark-uninitialized-fixup anywhere, remove it. 2020-01-02 12:10:36 -08:00
Erik Eckstein
f03956b30c Cross-module-optimization: Serialize immediately after CrossModuleSerializationSetup
Otherwise it can happen that e.g. specialization runs between CrossModuleSerializationSetup and serialization, resulting in an inlinable function referencing a shared function (which doesn't have public linkage).
The solution is to move serialization right after CrossModuleSerializationSetup, but only do that if cross-module optimization is enabled (it would be a disruptive change to move serialization in general).
2019-12-11 18:14:41 +01:00
swift-ci
f0157b0f87 Merge pull request #28473 from atrick/arrayproperty-opt 2019-12-03 10:32:49 -08:00
Erik Eckstein
a5397b434c Cross module optimization
This is a first version of cross module optimization (CMO).

The basic idea for CMO is to use the existing library evolution compiler features, but in an automated way. A new SIL module pass "annotates" functions and types with @inlinable and @usableFromInline. This results in functions being serialized into the swiftmodule file and thus being available for optimizations in client modules.
The annotation is done with a worklist algorithm, starting from public functions and continuing with entities which are used by already-selected functions. A heuristic performs a preselection of which functions to consider - currently just generic functions are selected.

The serializer then writes annotated functions (including function bodies) into the swiftmodule file of the compiled module. Client modules are able to de-serialize such functions from their imported modules and use them for optimizations, like generic specialization.

The optimization is gated by a new compiler option -cross-module-optimization (also available in the swift driver).
By default this option is off. Without turning the option on, this change is (almost) NFC.

rdar://problem/22591518
2019-12-03 14:37:01 +01:00
Andrew Trick
4da33e15ad Cleanup: move ArrayPropertyOpt out of COWArrayOpt.cpp.
These are separate, mostly unrelated passes. Putting them in their own
files makes it easier to read the code and understand how to control the
passes, and makes it possible to independently trace and debug them.
2019-11-25 11:53:49 -08:00
Arnold Schwaighofer
8aaa7b4dc1 SILOptimizer: Pipe through TypeExpansionContext 2019-11-11 14:21:52 -08:00
Harlan Haskins
b904133c42 [Modules] Add flag to skip non-inlinable function bodies
This flag, currently staged in as `-experimental-skip-non-inlinable-function-bodies`, will cause the typechecker to skip typechecking bodies of functions that will not be serialized in the resulting `.swiftmodule`. This patch also includes a SIL verifier that ensures that we don’t accidentally include a body that we should have skipped.

There is still some work left to make sure the emitted .swiftmodule is exactly the same as what’s emitted without the flag, which is what’s causing the benchmark noise above. I’ll be committing follow-up patches to address those, but for now I’m going to land the implementation behind a flag.
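A hedged illustration (hypothetical type) of which bodies the flag affects:

```
public struct Counter {
    public var value = 0

    @inlinable
    public mutating func bump() { value += 1 }   // serialized: still typechecked

    public mutating func reset() { value = 0 }   // not serialized: body skipped
}
```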
2019-09-26 10:40:11 -07:00
Ravi Kandhadai
032442da93 [Constant Evaluator] Add a new sil-opt pass to check constant evaluability
of Swift code snippets. Add a new test: constant_evaluable_subset_test.swift
that tests Swift code snippets that are expected to be constant evaluable and
those that are not.
2019-09-16 18:49:19 -07:00
Nate Chandler
211e6562d8 [SILOptimizer] Added MandatoryCombiner.
Minimal commit to get the mandatory combiner rolling.  The intent of the
mandatory combiner is to do basic transformations after the diagnostics
passes have run at the beginning of both the Onone and the performance
pipelines.  For now, it simply visits reachable instructions and does
no work.  In a subsequent commit, apply instructions will be processed
and replaced if they are calls to partial applies defined in the same
function.  At that point an attempt will be made to eliminate the
partial apply altogether.

For now, the pass does no work and is not part of any pipeline.
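A hedged Swift-level sketch (hypothetical function) of the pattern that subsequent commit targets:

```
func offset(_ value: Int, base: Int) -> Int {
    // The closure captures `base`, so it lowers to a partial_apply in SIL.
    let add = { (x: Int) in x + base }
    // This call is an apply of that partial_apply in the same function; the
    // combiner aims to rewrite it into a direct call and then eliminate the
    // partial_apply altogether.
    return add(value)
}
```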
2019-09-06 15:44:57 -07:00