Commit Graph

871 Commits

Author SHA1 Message Date
Michael Gottesman
baf293a0f6 Merge pull request #11846 from gottesmm/pr-0b7638743feaaad8531e324038be3fe1cd536022
[mandatory-inlining] Make fixupReferenceCounts not delete instructions.
2017-09-12 14:15:46 -07:00
Roman Levenstein
f0a39e9e14 Add support for collecting various SIL optimizer counters
This patch implements collection and dumping of statistics about SILModules, SILFunctions and memory consumption during the execution of SIL optimization pipelines.

The following statistics can be collected:
  *  For SILFunctions: the number of SIL basic blocks, the number of SIL instructions, the number of SIL instructions of a specific kind, duration of a pass
  *  For SILModules: the number of SIL basic blocks, the number of SIL instructions, the number of SIL instructions of a specific kind, the number of SILFunctions, the amount of memory used by the compiler, duration of a pass

By default, any collection of statistics is disabled to avoid affecting compile times.

One can enable the collection of statistics and dumping of these statistics for the whole SILModule and/or for SILFunctions.

To reduce the amount of produced data, one can set thresholds in such a way that changes in the statistics are only reported if the delta between the old and the new values are at least X%. The deltas are computed as using the following formula:

   Delta = (NewValue - OldValue) / OldValue

Thresholds provide a simple way to perform a simple filtering of the collected statistics during the compilation. But if there is a need for a more complex analysis of collected data (e.g. aggregation by a pipeline stage or by the type of a transformation), it is often better to dump as much data as possible into a file using e.g. -sil-stats-dump-all -sil-stats-modules -sil-stats-functions and then e.g. use the helper scripts to store the collected data into a database and then perform complex queries on it. Many kinds of analysis can be then formulated pretty easily as SQL queries.
2017-09-10 21:47:55 -07:00
Michael Gottesman
49bf82245b [mandatory-inlining] Make fixupReferenceCounts not delete instructions.
The main loop of mandatory inlining is spending a lot of time managing complex
iterator invalidation issues. This is the first in a series of commits that move
the main inlining loop to only delete the callee and to do all cleanups after we
have finished inlining.

This specific optimization (the quick retain/release peephole), I am not going
to do in MandatoryInlining, we already have guaranteed arc opts afterwards that
will be able to hit such a peephole so no perf should be lost.

*NOTE* The reason why I had to touch some of the code motion tests is that the
routine I am using to ensure that strong_retain/release_value is emitted as
appropriate is also used by codemotion. Code motion tests had cargo culted some
code from previous tests that retained Builtin.Int32. I changed the routines
though so that when a retain/release is inserted, if it is trivial, nothing is
inserted. No routine was relying on the actual usage of the inserted
retain/releases, so everything will be safe. This addition to the relevant code
caused me to need to change the tests in code motion to use actual non-trivial
values. The same code paths are being tested in terms of blocking code
motion/etc.

rdar://31521023
2017-09-10 13:23:48 -07:00
Michael Gottesman
9f53380824 [gardening] Add a note to DevirtualizationResult explaining that it can contain an Argument. 2017-09-09 15:54:47 -07:00
Michael Gottesman
430f865f73 [inliner] Extract out checking if we can inline from inlineFunction into canInlineFunction. NFC.
The reason to do this is:

1. The check in SILInliner if we can inline can be done without triggering
side-effects.

2. This enables us to know if inlining will succeed before attempting to inline.
This enables for arguments to be adjusted with new SILInstructions and the like
before inlining occurs. I use this in a forthcoming patch that updates mandatory
inlining for ownership.

rdar://31521023
2017-09-08 18:25:57 -07:00
Arnold Schwaighofer
b625d4da8a Osize: Add a SIL Outliner pass that outlines the bridging of objective c calls.
Implements outlining of bridged objective c property and method calls.

rdar://33387700
2017-09-06 08:37:37 -07:00
Jordan Rose
f8b7db4e76 Excise the terms "blacklist" and "whitelist" from Swift source. (#11687)
The etymology of these terms isn't about race, but "black" = "blocked"
and "white" = "allowed" isn't really a good look these days. In most
cases we weren't using these terms particularly precisely anyway, so
the rephrasing is actually an improvement.
2017-08-30 09:28:00 -07:00
Joe Shajrawi
570a82aea5 Reduce expansion of large types in the optimizer 2017-08-25 13:56:26 -07:00
Michael Gottesman
b1debfc401 [epilogue-arc-analysis] Be more efficient with memory usage.
This patch fixes a number of issues:

The analysis was using EpilogueARCContext as a temporary when computing. This is
an performance problem since EpilogueARCContext contains all of the memory used
in the analysis. So essentially, we were mallocing tons of memory every time we
missed the analyses cache. This patch changes the pass to instead have 1
EpilogueARCContext whose internal state is cleared in between invocations. Since
the data structures (see below) used after this patch do not shrink memory after
being cleared, this should cause us to have far less memory churn.

The analysis was managing its block state data structure by allocating the
individual block state structs using a BumpPtrAllocator/DenseMap stored in
EpilogueARCContext. The individual state structures were allocated from the
BumpPtrAllocator and the DenseMap then mapped a specific SILBasicBlock to its
State data structure. Ignoring that we were mallocing this memory every time we
computed rather than reusing global state, this pessimizes performance on small
functions significantly. This is because the BumpPtrAllocator by default heap
allocates initially a page and DenseMap initially mallocs a 64 entry hash
table. Thus for a 1 block function, we would be allocating a large amount of
memory that is just unneeded.

Instead this patch changes the analysis to use a std::vector in combination with
PostOrderFunctionInfo to manage the per block state. The way this works is that
PostOrderFunctionInfo already contains a map from a SILBasicBlock to its post
order number. So, when we are allocating memory for each block, we visit the CFG
in post order. Thus we know that each block's state will be stored in the vector
at vector[post order number].

This has a number of nice effects:

1. By eliminating the need for the DenseMap, in large test cases, we are
signficiantly reducing the memory overhead (by 24 bytes per basic block assuming
8 byte ptrs).
2. We will use far less memory when applying this analysis to small functions.

rdar://33841629
2017-08-11 18:18:39 -07:00
Michael Gottesman
b70c8b64a1 [sil-analysis] Add a new utility class for FunctionBaseInfo based analyses: LazyFunctionInfo.
Commonly when an analysis uses subanalyses, we eagerly create the sub function
info when constructing the main function info. This is not always necessary and
when the subanalyses do work in their constructor, can be actively harmful if
the parent analysis is never invoked.

This utility class solves this problem by being a very easy way to perform a
delayed call to the sub-analysis to get the sub-functioninfo combined with a
cache so that after the LazyFunctionInfo is used once, we do not reuse the
DenseMap in the sub-analysis unnecessarily.

An example of where this can happen is in EpilogueARCAnalysis in combination
with PostOrderFunctionInfo. PostOrderFunctionInfo eagerly creates a new post
order. So, if we were to eagerly create the PostOrderFunctionInfo (the
sub-functioninfo) when we created an EpilogueARCFunctionInfo, we would be
creating a post order even if we never actually invoke EpilogueARCFunctionInfo.
2017-08-11 18:18:39 -07:00
Michael Gottesman
ae25f44408 [epilogue-arc] Use maybeGet instead of get when handling delete notifications.
By using this the maybeGet API on FunctionAnalysisBase instead of get, we stop
EpilogueARCAnalysis from building itself if it does not yet exist, only to
invalidate itself.

rdar://33841629
2017-08-11 14:41:57 -07:00
Michael Gottesman
eb4d94f10b [sil-analysis] Add FunctionAnalysisBase::{hasAnalysis,maybeGet}(SILFunction *F).
Today, if one wants to invalidate state relative to your own function analysis,
you have to use FunctionAnalysisBase::get() to get the analysis. The problem
here is that if the analysis does not exist yet, then you are actually creating
the analysis. This is an issue when one wants to perform an action on an
analysis only if the analysis has already been built. An example of such a
situation is when one is processing a delete notification. If one does not have
an analysis for a function, one should just do nothing.

I am going to use this to fix a delete notification problem in
EpilogueARCAnalysis.

rdar://33841629
2017-08-11 14:41:57 -07:00
Michael Gottesman
6b54531455 Move the include guard of Analysis.h /above/ the includes.
Otherwise, every time we include Analysis.h, we will try to include those other
files even if we have already included Analysis.h. This can increase compile
time.

rdar://33841629
2017-08-11 14:40:36 -07:00
Erik Eckstein
21ab99bf80 SILOptimizer: fix performance problem in EpilogueARCAnalysis
The cache for analysis result was never set. This resulted in a pretty bad quadratic behavior.
2017-08-09 14:24:44 -07:00
Roman Levenstein
d493859d4b Merge pull request #11251 from swiftix/generic-specialization-fixes
Implement a more robust way to avoid infinite generic specialization loops
2017-08-07 09:05:56 -07:00
Roman Levenstein
8503daee0d Implement a more robust way to avoid infinite generic specialization loops
The existing simple mechanism for avoiding infinite generic specialization loops is based on checking the structural depth and width of types passed as generic type parameters. If the depth or the width of a type is above a certain threshold, the type is considered too complex for generic specialization and no specialization is produced. While this approach prevents the possibility of producing an infinite number of generic specializations for ever-growing generic type parameters, it catches the issue too late in some cases, leading to excessive CPU and memory usage.

Therefore, the new method tries to solve the problem at its root. An infinite generic specialization loop can be triggered by specializing a given generic call-site if and only if:
-  Doing so would result in a loop inside the specialization graph represented by the `GenericSpecializationInformations`, i.e. it would produce direct or indirect recursion involving a generic call
-  The substitutions used by the current generic call-site are structurally more complex than the substitutions used by the same call-site in the previous iteration inside specialization graph. More complex in this context means that the new generic type parameter structurally contains the generic type parameter from a previous iteration inside the specialization graph and has greater structural depth, e.g. `Array<Int>` is more complex than `Int`.

The generic specializer now records all the required information about specializations it produces and uses it later to detect and prevent any generic specializations which would result in an infinite specialization loop. It detects them as early as possible and thus reduces compile times, memory consumption and potentially also reduces the code-size by not generating useless specializations.
2017-08-06 12:51:49 -07:00
Erik Eckstein
6c93798acc SILOptimizer: Add a new TempRValue optimization pass
This is a separate optimization that detects short-lived temporaries that can be eliminated.
This is necessary now that SILGen no longer performs basic RValue forwarding in some cases.

SR-5508: Performance regression in benchmarks caused by removing SILGen peephole for LoadExpr in +0 context
2017-08-05 17:23:51 -07:00
Erik Eckstein
6377cc095a SIL: Replace TransitivelyUnreachableBlocks with DeadEndBlocks
We had both utilities doing the same thing.
NFC
2017-07-24 09:50:42 -07:00
Erik Eckstein
f561176f14 EsacpeAnalysis: add a utility to get the list of use-points of a node 2017-07-21 10:47:26 -07:00
Erik Eckstein
a0e6082d25 SILOptimizer: change the way how ValueLifetimeAnalysis handles dead-end (unreachable) CFG paths.
In dead-array elimination we assume that the array allocation is post-dominated by all its final releases.
The only exception are branches to dead-end ("unreachable") blocks. So we just ignored all paths which didn't end up in a final release.
Now we explicitly pass the set of dead-end blocks and just ignore those blocks.
This is safer and it's also needed in the upcoming re-write of StackPromotion.
2017-07-21 10:46:03 -07:00
Erik Eckstein
3b54966ff2 SILOptimizer: Add a utility to find dead-end blocks. 2017-07-21 10:37:54 -07:00
Devin Coughlin
47d9de9751 [Exclusivity] Relax closure enforcement on separate stored properties (#10789)
Make the static enforcement of accesses in noescape closures stored-property
sensitive. This will relax the existing enforcement so that the following is
not diagnosed:

struct MyStruct {
   var x = X()
   var y = Y()

  mutating
  func foo() {
    x.mutatesAndTakesClosure() {
      _ = y.read() // no-warning
   }
  }
}

To do this, update the access summary analysis to summarize accesses to
subpaths of a capture.

rdar://problem/32987932
2017-07-10 13:33:22 -07:00
Michael Gottesman
9933f0f3b2 Fix the swap_refcnt test on linux.
The problem here is that we were performing a naive negative FileCheck test for
retain/release. In certain modes, we would not have any retains/releases along
normal control paths but would have retains on unreachable paths. This test only
is trying to test if normal code paths have this issue.

To work around this issue, I created a small utility pass that prunes all
non-unreachable instructions from blocks with an unreachable terminator. This is
useful functionality in general when analyzing SIL since often times one will
have large fatal error blocks that disguise the true behavior of the
function. In this specific case, I just pipe in the normal sil output and run it
through sil-opt. sil-opt then runs just the utility pass and I then FileCheck
that sil-opt output.

rdar://30181104
2017-07-07 13:03:25 -06:00
Andrew Trick
89985ebacd Use the pass's "tag" for command-line options.
A pass has an ID (C++ identifier), Tag (shell identifier),
and Name (human identifier).

Command line options that identify passes should obviously be compatibile with
with the pass' command line identifier. This is also what the user is used to
typing for the -debug-only option.
2017-07-06 14:10:23 -07:00
Devin Coughlin
2501dd71de Revert "[Exclusivity] Relax closure enforcement on separate stored properties" 2017-07-05 20:19:50 -07:00
swift-ci
eb5d21b6e2 Merge pull request #10595 from devincoughlin/exclusivity-interprocedural-separate-stored-structs 2017-07-05 18:52:57 -07:00
Devin Coughlin
86dff5c0a7 [Exclusivity] Relax closure enforcement on separate stored properties
Make the static enforcement of accesses in noescape closures stored-property
sensitive. This will relax the existing enforcement so that the following is
not diagnosed:

struct MyStruct {
   var x = X()
   var y = Y()

  mutating
  func foo() {
    x.mutatesAndTakesClosure() {
      _ = y.read()
   }
  }
}

To do this, update the access summary analysis to be stored-property sensitive.

rdar://problem/32987932
2017-07-05 16:09:54 -07:00
Andrew Trick
d45f171c98 Cleanup AccessMarkerElimination.
In raw SIL, access markers are unconditionally retained. In canonical SIL,
markers are still removed prior to optimization.

A new flag, -sil-optimized-access-markers, allows testing access markers in
optimized builds, but it is not yet fully supported.
2017-07-05 15:18:48 -07:00
Andrew Trick
4575525786 NFC: Cleanup ClosureScope/AccessEnforcementSelection/Tests.
Per Devin and John's review.
2017-06-20 14:57:56 -07:00
Andrew Trick
00a72b8517 Comment typo. 2017-06-17 16:27:02 -07:00
Andrew Trick
0ce81c90f0 Doxygen formatting. 2017-06-17 16:19:16 -07:00
Andrew Trick
94db617471 Add TopDownClosureFunctionOrder based on ClosureScopeAnalysis.
Simple utility for transfersing functions such that parent scopes are always
visited before noescape closures.

Note that recursion is disallowed. Noescape closures are not reentrant.
2017-06-16 19:08:40 -07:00
Andrew Trick
3bec7d81ac Add ClosureScopeAnalysis.
Record noescape closure scopes. This allows passes to process closures and their
parent scopes in a controlled order. AccessEnforcementSelection needs this
because it needs to process parent scopes before selecting enforcement within
noescape closures.

Eventually this could be used by the PassManager so that
AccessEnforcementSelection can go back to being a function transform.
2017-06-16 19:08:39 -07:00
Devin Coughlin
06b9ed7501 [Exclusivity] Switch static checking to use IndexTrie instead of ProjectionPath
IndexTrie is a more light-weight representation and it works well in this case.
This requires recovering the represented sequence from an IndexTrieNode, so
also add a getParent() method.
2017-06-15 18:37:23 -07:00
swift-ci
32082f6ecb Merge pull request #10191 from devincoughlin/noescape-closure-access-summary 2017-06-15 13:28:54 -07:00
Devin Coughlin
d2ac3d556b [Exclusivity] Add analysis pass summarizing accesses to inout_aliasable args
Add an interprocedural SIL analysis pass that summarizes the accesses that
closures make on their @inout_aliasable captures. This will be used to
statically enforce exclusivity for calls to functions that take noescape
closures.

The analysis summarizes the accesses on each argument independently and
uses the BottomUpIPAnalysis utility class to iterate to a fixed point when
there are cycles in the call graph.

For now, the analysis is not stored-property-sensitive -- that will come in a
later commit.
2017-06-15 07:59:18 -07:00
Devin Coughlin
2896d5b93f [SIL Utils] Move IndexTrieNode into its own header in Utils. NFC.
Move IndexTrieNode from DeadObjectElimination into its own header. I plan to
use this data structure when diagnosing static violations of exclusive access.
2017-06-14 21:23:14 -07:00
Joe Shajrawi
edea7d04b3 Add a flag (false by default) for large loadable types pass 2017-05-22 14:25:25 -07:00
Roman Levenstein
dd93027a0e Always inline pure functions with constant arguments
A function is pure if it has no side-effects.
If there is a call of a pure function with constant arguments, it always makes sense to inline it, because we know that the whole computation will be constant folded.
2017-05-15 11:52:36 -07:00
Roman Levenstein
d66924b01e [sil-inliner] Move some functionality from PerformanceInliner into PerformanceInlinerUtils. NFC.
It does not change any functionality. The only purpose it to make some functions reusable by other passes.
2017-05-15 09:03:53 -07:00
Joe Shajrawi
17effee303 Disable large types irgen pass 2017-05-09 18:58:27 -07:00
practicalswift
492f5cd35a [gardening] Remove redundant repetition of type names (DRY): RepeatedTypeName foo = dyn_cast<RepeatedTypeName>(bar)
Replace `NameOfType foo = dyn_cast<NameOfType>(bar)` with DRY version `auto foo = dyn_cast<NameOfType>(bar)`.

The DRY auto version is by far the dominant form already used in the repo, so this PR merely brings the exceptional cases (redundant repetition form) in line with the dominant form (auto form).

See the [C++ Core Guidelines](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es11-use-auto-to-avoid-redundant-repetition-of-type-names) for a general discussion on why to use `auto` to avoid redundant repetition of type names.
2017-05-05 09:45:53 +02:00
Joe Shajrawi
4dc0801785 IRGen Mandatory Module Pass: Pass large loadable types by address instead of by value 2017-05-01 12:04:06 -07:00
Erik Eckstein
9ac13ae606 stdlib, optimizer: add Array. reserveCapacityForAppend as a new array semantics operation.
This function reserves capacity in an Array for new elements which are about to be appended.
2017-04-27 09:06:55 -07:00
Andrew Trick
e8b0947897 [Exclusivity] Allow testing the -Onone pipeline with access markers.
Markers are always eliminated before -O passes.

At -Onone, markers can be enabled via command line for all -Onone passes.
2017-04-26 17:32:48 -07:00
Roman Levenstein
8fb8cc4367 [generic-specializer] Cosmetic renaming of some vars and functions to match the new naming scheme 2017-04-20 08:16:24 -07:00
Roman Levenstein
686b83b6cb [generic-specializer] Improve comments 2017-04-20 08:16:24 -07:00
Roman Levenstein
f28e28c0a8 [generic-specializer] Big re-factoring of the partial specialization implementation
- Introduced a new helper class FunctionSignaturePartialSpecializer which provides most of the functionality required for producing a specialized generic signature based on the provided substitutions or requirements. The class consists of many small functions, which should make it easier to understand the code.

- Added a full support for partial specialization of generic parameters with generic substitutions (use flag `-Xllvm -sil-partial-specialization-with-generic-substitutions` to enable it)

- Removed the simpler version of the partial specializer which could partially specialize only generic parameters with non-generic substitutions. It is not needed anymore, because we can handle any substations now when performing the partial specialization.

- The functionality used by the EagerSpecializer to implement the partial specializations required by @_specialize is expressed in terms of FunctionSignaturePartialSpecializer as well. The code implementing it is much smaller now.

Partial specialization of generic parameters with generic substitutions is fully functional, but it is disabled by default, because it needs some tweaks when it comes to compile times and size of produced code. These issues will be addressed in the subsequent commits.
2017-04-20 08:16:24 -07:00
Roman Levenstein
ce8d986999 [generic-specializer] Rename OriginalF into Callee 2017-04-20 08:16:24 -07:00
practicalswift
7eb7d5b109 [gardening] Fix 100 typos. 2017-04-18 17:01:42 +02:00