11224 Commits

Author SHA1 Message Date
Xin Tong
a48584ccbc Create a fast path for not-final release instruction.
For a release on a guaranteed function parameter, we know right away
that it is not the final release and therefore does not call deinit.

Therefore we know it does not read or write memory other than the reference
count.

This reduces the compilation time of dead store elimination and redundant load
elimination, as we otherwise need to go through alias analysis to make sure
tracked locations do not alias with it.
2016-02-20 22:00:36 -08:00
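Editorial sketch for a48584ccbc (not code from the commit): a minimal C++ model of the fast path described above, assuming a simplified world where a release either targets a guaranteed parameter or falls back to an expensive alias query. All names here (`Value`, `Release`, `mayReadOrWrite`, `slowAliasQuery`) are hypothetical and do not come from the Swift sources.

```cpp
#include <string>

// Hypothetical, simplified model; these types are illustrative only.
enum class ParamConvention { Owned, Guaranteed };

struct Value {
  bool isFunctionParam;
  ParamConvention convention;
};

struct Release {
  const Value *operand;
};

// Stand-in for an expensive alias-analysis query; conservatively says "yes".
inline bool slowAliasQuery(const Release &, const std::string &) { return true; }

// Returns true if the release may read or write the tracked location.
// slowQueries counts how often the expensive path was taken.
inline bool mayReadOrWrite(const Release &r, const std::string &location,
                           int &slowQueries) {
  // Fast path: a guaranteed parameter is kept alive by the caller, so this
  // release cannot be the final one and cannot run deinit.
  if (r.operand->isFunctionParam &&
      r.operand->convention == ParamConvention::Guaranteed)
    return false;
  ++slowQueries;
  return slowAliasQuery(r, location);
}
```

In this sketch, a release of a guaranteed parameter returns immediately without incrementing `slowQueries`, while any other release pays for the full query.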
Dmitri Gribenko
f27315b6f8 Merge remote-tracking branch 'origin/master' into swift-3-api-guidelines 2016-02-20 15:36:04 -08:00
Mark Lacey
594a0d8c08 Use AddSSAPasses to add low-level passes.
This eliminates a pretty similar list of passes added in a similar order
by just reusing the ordering from AddSSAPasses. Beyond the particular
inliner pass (which is maintained with this change), there was nothing
really specific to low-level code with the order that was present before.

I measure a 1% increase in compile time of the stdlib, no perf
regressions (at -O), and a few decent improvements:
 19 CaptureProp                           5233             4129     -1104    -21.1%     1.27x
 30 ErrorHandling                         3053             2678      -375    -12.3%     1.14x
 65 Sim2DArray                             610              518       -92    -15.1%     1.18x

I expect to be able to get back the 1% compile-time hit (and probably
more) with future changes.
2016-02-20 14:38:21 -08:00
Dmitri Gribenko
3d3d4540e1 Merge remote-tracking branch 'origin/master' into swift-3-api-guidelines 2016-02-20 14:37:49 -08:00
practicalswift
9a96e412f9 [gardening] Fix recently introduced typo: "transistive" → "transitive" 2016-02-20 07:27:59 +01:00
Nadav Rotem
b4d836880f [Doc] Rename a function and change 'auto' to an explicit type as suggested by @slavapestov in code review. 2016-02-19 21:54:07 -08:00
Xin Tong
95f3280461 Remove a double negative. NFC 2016-02-19 20:45:23 -08:00
Xin Tong
e42bd372eb Skip processing block without loads.
After collecting enough information in the first iteration of the
data flow, we skip the second (last) iteration for blocks
without loads, as we will not forward any load there.

This improves compilation time of redundant load elimination.
2016-02-19 20:45:23 -08:00
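Editorial sketch for e42bd372eb (not code from the commit): a toy version of the idea, assuming each block remembers from the first iteration whether it contains a load. The `Block` struct and `runFinalIteration` function are hypothetical names used only for illustration.

```cpp
#include <vector>

// Hypothetical block summary; not the pass's real data structure.
struct Block {
  bool hasLoad;    // recorded during the first data-flow iteration
  bool processed;  // set if the final iteration visited this block
};

// Runs the final (forwarding) iteration, visiting only blocks that contain
// loads. Returns the number of blocks actually visited.
inline int runFinalIteration(std::vector<Block> &blocks) {
  int visited = 0;
  for (Block &b : blocks) {
    if (!b.hasLoad)
      continue;  // nothing to forward here, so skip the block entirely
    b.processed = true;
    ++visited;
  }
  return visited;
}
```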
Nadav Rotem
2d20eb6c54 Update the pass to use the destructured result types that John introduced a few days ago. 2016-02-19 16:48:29 -08:00
Nadav Rotem
30927d3459 Implement CSE of the trio open_ext + witness_method + apply.
When we emit calls to existential methods silgen produces a sequence of the
three instructions below:

open_existential_addr %0 : $*Pingable to $*@opened("1E467EB8-...") Pingable
witness_method $@opened("1E467EB8-...") Pingable, #Pingable.ping!1
apply %3<@opened("1E467EB8-...") Pingable>(%2)

This commit adds a new CSE-like pass that finds sequences of calls to protocol
methods and reuses the first two instructions open_existential_addr and
witness_method. The optimization finds arguments that must not alias and may not
escape and combines all of the existential method calls to use the same method
lookup. The optimization handles control flow by finding the top dominating
open_existential instruction, and uses that instruction.

related to rdar://22704464.
2016-02-19 16:48:29 -08:00
Xin Tong
fa2daeb5b8 Fix headers. NFC 2016-02-19 16:22:41 -08:00
Xin Tong
fcb707b40c Improve epilogue retain matcher.
Instead of only checking the return block, we can also check
its predecessors, its predecessors' predecessors, etc.

Also put in a threshold to throttle this and make sure it is cheap.

We are still only able to find a small number of epilogue retains.
The bail on MayDecrement is blocking many of the opportunities.

This should bring us closer to being able to handle Walsh.

This is part of rdar://24022375.
2016-02-19 16:22:41 -08:00
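Editorial sketch for fcb707b40c (not code from the commit): a bounded backward walk from the return block through its predecessors, giving up once a threshold is exceeded. This is a deliberate simplification; the real matcher has additional bail-out conditions (such as the MayDecrement check mentioned above) that are not modelled. `Block`, `findEpilogueRetains`, and `maxBlocks` are hypothetical names.

```cpp
#include <cstddef>
#include <deque>
#include <set>
#include <vector>

// Hypothetical CFG node; indices refer to positions in the cfg vector.
struct Block {
  std::vector<int> preds;  // predecessor block indices
  bool hasRetain;          // block contains a retain of the returned value
};

// Walks backwards from returnBlock. Returns the blocks in which a retain was
// found, or an empty vector if more than maxBlocks blocks were discovered.
inline std::vector<int> findEpilogueRetains(const std::vector<Block> &cfg,
                                            int returnBlock,
                                            std::size_t maxBlocks) {
  std::vector<int> found;
  std::set<int> seen;
  seen.insert(returnBlock);
  std::deque<int> worklist(1, returnBlock);
  while (!worklist.empty()) {
    if (seen.size() > maxBlocks)
      return std::vector<int>();  // too expensive: bail out
    int b = worklist.front();
    worklist.pop_front();
    if (cfg[b].hasRetain) {
      found.push_back(b);  // retain found on this path; stop walking it
      continue;
    }
    for (int p : cfg[b].preds)
      if (seen.insert(p).second)
        worklist.push_back(p);
  }
  return found;
}
```

For a diamond-shaped CFG whose two middle blocks each hold a retain, the walk finds both when the budget allows and returns nothing when the budget is too small.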
Adrian Prantl
0758f31458 Let isUserCode() take into account that SILFileLocation no longer
exists. This fixes a bunch of spurious unreachable code warnings
introduced in 40c7a1a.
2016-02-19 15:51:39 -08:00
Adrian Prantl
45a7197081 Fix a several-year-old copy&paste error. 2016-02-19 15:51:39 -08:00
Mark Lacey
57b2db0648 Silence some unused variable warnings. 2016-02-19 14:51:12 -08:00
Mark Lacey
945065f37d Change where in the pass manager we validate that analyses are unlocked.
Verify just prior to running passes, and after running each pass, that
no analyses are locked from being invalidated.
2016-02-19 13:32:40 -08:00
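Editorial sketch for 945065f37d (not code from the commit): a toy pass manager that asserts no analysis is locked just before running passes and again after each pass. `Analysis`, `PassManager`, and `allUnlocked` are hypothetical names; the real pass manager's interfaces are not reproduced here.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical analysis with a lock flag that blocks invalidation.
struct Analysis {
  bool locked = false;
};

struct PassManager {
  std::vector<Analysis *> analyses;

  // True if no registered analysis is currently locked.
  bool allUnlocked() const {
    for (const Analysis *a : analyses)
      if (a->locked)
        return false;
    return true;
  }

  // Verifies the invariant once up front and once after every pass.
  void run(const std::vector<std::function<void()>> &passes) {
    assert(allUnlocked() && "analysis locked before running passes");
    for (const auto &pass : passes) {
      pass();
      assert(allUnlocked() && "pass left an analysis locked");
    }
  }
};
```

Checking after each pass, rather than once at the end, points directly at the pass that forgot to unlock.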
Adrian Prantl
40c7a1abee Separate underlying storage and location kind in SILLocation and
remove the mixed concept that was SILFileLocation.
Also add support for a third type of underlying storage that will be used
for deserialized debug locations from textual SIL.

NFC

<rdar://problem/22706994>
2016-02-19 11:16:48 -08:00
practicalswift
0b67f52823 [gardening] Fix recently introduced typo: "anyting" → "anything" 2016-02-19 14:20:02 +01:00
Dmitri Gribenko
f39b443e24 Merge remote-tracking branch 'origin/master' into swift-3-api-guidelines 2016-02-19 01:16:19 -08:00
Michael Gottesman
13cc88f694 Revert "[arc] Put back in the RCIdentity cache."
This reverts commit 6c728daa61.

This is a speculative revert to try and fix the ASAN build.
2016-02-18 18:25:01 -08:00
Dmitri Gribenko
0f36bec31f Merge remote-tracking branch 'origin/master' into swift-3-api-guidelines 2016-02-18 16:41:35 -08:00
John McCall
e249fd680e Destructure result types in SIL function types.
Similarly to how we've always handled parameter types, we
now recursively expand tuples in result types and separately
determine a result convention for each result.

The most important code-generation change here is that
indirect results are now returned separately from each
other and from any direct results.  It is generally far
better, when receiving an indirect result, to receive it
as an independent result; the caller is much more likely
to be able to directly receive the result in the address
they want to initialize, rather than having to receive it
in temporary memory and then copy parts of it into the
target.

The most important conceptual change here that clients and
producers of SIL must be aware of is the new distinction
between a SILFunctionType's *parameters* and its *argument
list*.  The former is just the formal parameters, derived
purely from the parameter types of the original function;
indirect results are no longer in this list.  The latter
includes the indirect result arguments; as always, all
the indirect results strictly precede the parameters.
Apply instructions and entry block arguments follow the
argument list, not the parameter list.

A relatively minor change is that there can now be multiple
direct results, each with its own result convention.
This is a minor change because I've chosen to leave
return instructions as taking a single operand and
apply instructions as producing a single result; when
the type describes multiple results, they are implicitly
bound up in a tuple.  It might make sense to split these
up and allow e.g. return instructions to take a list
of operands; however, it's not clear what to do on the
caller side, and this would be a major change that can
be separated out from this already over-large patch.

Unsurprisingly, the most invasive changes here are in
SILGen; this requires substantial reworking of both call
emission and reabstraction.  It also proved important
to switch several SILGen operations over to work with
RValue instead of ManagedValue, since otherwise they
would be forced to spuriously "implode" buffers.
2016-02-18 01:26:28 -08:00
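Editorial sketch for e249fd680e (not code from the commit): a small model of the distinction between a function type's parameters and its argument list, where all indirect results precede the parameters. The `FunctionType` struct and its string-based fields are hypothetical and only illustrate the ordering rule stated above.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a lowered function type; not SILFunctionType itself.
struct FunctionType {
  std::vector<std::string> parameters;       // formal parameters only
  std::vector<std::string> indirectResults;  // addresses supplied by the caller
  std::vector<std::string> directResults;    // may now hold several entries

  // The argument list: every indirect result first, then the parameters.
  // Apply instructions and entry block arguments follow this order.
  std::vector<std::string> argumentList() const {
    std::vector<std::string> args(indirectResults);
    args.insert(args.end(), parameters.begin(), parameters.end());
    return args;
  }
};
```

A function with parameters `x`, `y` and one indirect result `r0` has two parameters but three arguments, in the order `r0`, `x`, `y`.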
Dmitri Gribenko
65d840c0ae stdlib: lowercase cases in Optional and ImplicitlyUnwrappedOptional 2016-02-18 00:40:33 -08:00
Michael Gottesman
479893e4db [arc] We do not need to resummarize the subregion blocks of a loop when we summarize the loop's ARC interesting instructions. This will save compile time.
The reason why this work is not needed is that ARC, before any dataflow is
performed, first summarizes the interesting instructions in all blocks. This
information is kept up to date by the ARC optimizer as it moves around
retains/releases.

Thus while performing dataflow, all we need to summarize are loops.
2016-02-17 23:42:14 -08:00
Michael Gottesman
35abad24d1 Merge pull request #1343 from gottesmm/arc-rc-identity-cache
[arc] Put back in the RCIdentity cache. This shaves off ~0.5 seconds f…
2016-02-17 23:26:49 -08:00
Xin Tong
b69706734d Implement @owned to @unowned retain value conversion.
If a value is returned as @owned, we can move the epilogue retain
to the caller and convert the return value to @unowned. This gives
the ARC optimizer more freedom to optimize the retain out on the caller's
side.

It appears that epilogue retains are harder to find than epilogue
releases. Most of the time they are not in the return block.

(1) Sometimes, they are in predecessors
(2) Sometimes they come from a call which returns an @owned return value.
This should be improved if we fix (1) and go bottom up.
(3) We do not handle exploded retain_value.

Currently, this catches a small number of opportunities.

We probably need to improve the epilogue retain matcher if we are to handle
more cases.

This is part of rdar://24022375.

We also need some refactoring in the pass, e.g. breaking functions into smaller
functions. I will do that in a subsequent commit.
2016-02-17 21:59:55 -08:00
Michael Gottesman
6c728daa61 [arc] Put back in the RCIdentity cache.
This shaves off ~0.5 seconds from ARC when compiling the stdlib on my machine.

I wired up the cache to the delete notification trigger so we are still memory
safe.
2016-02-17 21:56:32 -08:00
Michael Gottesman
9638e2917e Merge pull request #1338 from gottesmm/arc-make-trivial
[arc] Make all *RefCountStates and RCStateTransition trivially destru…
2016-02-17 17:02:16 -08:00
Michael Gottesman
d55ebaec42 [arc] Make all *RefCountStates and RCStateTransition trivially destructible and constructible.
Tested via static assert.

There is no reason for these data structures to not have these properties.
Adding these properties will improve the compile time efficiency of ARC by
allowing for cheaper copying and 0 cost destruction.
2016-02-17 16:06:32 -08:00
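Editorial sketch for d55ebaec42 (not code from the commit): one way to enforce such properties at compile time with `static_assert` and the standard type traits. The `RefCountState` struct below is a hypothetical stand-in with made-up fields, and reading "constructible" as trivially default constructible is an assumption.

```cpp
#include <type_traits>

// Hypothetical stand-in for a ref-count state; fields are illustrative only.
struct RefCountState {
  void *rcRoot;
  unsigned latticeState;
  bool knownSafe;
};

// Compilation fails if someone later adds a member or a special member
// function that makes the type non-trivial.
static_assert(std::is_trivially_destructible<RefCountState>::value,
              "RefCountState must be trivially destructible");
static_assert(std::is_trivially_default_constructible<RefCountState>::value,
              "RefCountState must be trivially default constructible");
static_assert(std::is_trivially_copyable<RefCountState>::value,
              "RefCountState must be trivially copyable");
```

Adding, say, a `std::string` member to the struct would trip all three assertions at build time.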
Arnold Schwaighofer
31e01a5dd9 CopyForwarding: More places to check whether we have a function arg 2016-02-17 15:08:43 -08:00
Dmitri Gribenko
dd75aed67a Merge remote-tracking branch 'origin/master' into swift-3-api-guidelines 2016-02-17 14:40:05 -08:00
Arnold Schwaighofer
2f81e4eaf8 CopyForwarding: We need to check whether an argument is a function argument before checking its convention 2016-02-17 14:38:44 -08:00
Erik Eckstein
fd566d92bd EscapeAnalysis: fix a problem where an inconsistent connection graph can be generated.
rdar://problem/24686791
2016-02-17 13:27:31 -08:00
practicalswift
7f8052d289 [gardening] Fix incorrect file name in file header 2016-02-17 10:24:25 +01:00
Xin Tong
b156824f3b Rename EpilogueReleaseMatcherDumper.cpp.
Rename to EpilogueRetainReleaseMatcherDumper.cpp to better reflect
what it does now.
2016-02-16 22:29:32 -08:00
Xin Tong
a007d47dd4 Add a simple epilogue retain matcher.
This is similar to, yet different from, the epilogue release matcher, particularly
in how a retain is found and when to bail. Therefore it is put into a different
class than ConsumedArgToEpilogueReleaseMatcher.

This is currently NFC other than some basic testing using the epilogue dumper.
2016-02-16 22:27:40 -08:00
practicalswift
12012b1a15 [gardening] Fix recently introduced typo: "idenity" → "identity" 2016-02-16 22:20:48 +01:00
Xin Tong
81cc962c54 Create a fast path for release silargument.
When we have a single release that can be traced back to a SILArgument,
i.e. the released value is RC-identical to the SILArgument, we do not
need projection to check whether there are overlapping/uncovered releases
for the SILArgument (which would cause us to exit the epilogue walking sequence).

This brings the # of Total owned args -> guaranteed args from 118 to 149.

There are 23 owned args for which we cannot find epilogue releases at this
point, and many (if not most) are a result of partial_apply, which neither
rc-identity nor projection can handle.

With this commit I see differences in only 2 benchmarks. Baseline is without this
change. I am looking at the SuperChars regression.

StrToInt                              9297             8402      -895     -9.6%     1.11x
SuperChars                          676554           748375    +71821    +10.6%     0.90x (!)
2016-02-16 09:59:48 -08:00
Michael Gottesman
e93af4c119 Merge pull request #1321 from gottesmm/immutable_pointer_set_feedback
Immutable Pointer Set Fixes
2016-02-16 09:45:17 -08:00
Michael Gottesman
90dcaa7de3 Rename ImmutablePointerSet::concat => ImmutablePointerSet::merge. 2016-02-16 02:13:56 -08:00
Xin Tong
670df6dc9d Wire up improved epilogue release matcher with function signature
optimization.

We get some improvements in the # of parameters converted from owned
to guaranteed on the stdlib.

before
======
103 sil-function-signature-opts      - Total owned args -> guaranteed args

after
======
118 sil-function-signature-opts      - Total owned args -> guaranteed args

I see the following improvements by running benchmarks with and without this
change. Only differences >= 1.05x are shown.

ErrorHandling         8154             7497      -657     -8.1%     1.09x
LinkedList            9973             9529      -444     -4.5%     1.05x
ObjectAllocation       239              222       -17     -7.1%     1.08x
RC4                  23167            21993     -1174     -5.1%     1.05x (!)

This is part of rdar://22380547
2016-02-15 16:10:40 -08:00
Xin Tong
99ca08e4af Check whether epilogue releases cover all non-trivial fields.
When we have all the epilogue releases, make sure they cover all the non-trivial
parts of the base. Otherwise, treat it as if we've found no releases for the base.

Currently, this is NFC other than the epilogue dumper. I will wire it up with
function signature optimization in the next commit.

This is part of rdar://22380547
2016-02-15 16:00:02 -08:00
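Editorial sketch for 99ca08e4af (not code from the commit): the coverage check reduced to set containment, assuming each non-trivial field of the base can be named by a string. `releasesCoverBase` and both parameter names are hypothetical; the real check works on projections, not strings.

```cpp
#include <set>
#include <string>

// Returns true only if every non-trivial field of the base has a matching
// epilogue release. A caller would treat "false" as "no releases found".
inline bool releasesCoverBase(const std::set<std::string> &nonTrivialFields,
                              const std::set<std::string> &releasedFields) {
  for (const std::string &field : nonTrivialFields)
    if (releasedFields.count(field) == 0)
      return false;  // at least one non-trivial part is left uncovered
  return true;
}
```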
Max Moiseev
3a3984877a Merge remote-tracking branch 'origin/master' into swift-3-api-guidelines 2016-02-15 15:43:34 -08:00
practicalswift
109eb8063f [gardening] Fix recently introduced typo: "thats" → "that is" 2016-02-15 20:46:44 +01:00
Xin Tong
4f66bc88b4 Move ProjectionTree::isRedundantRelease to ConsumedArgToEpilogueReleaseMatcher::isRedundantRelease.
NFC.
2016-02-15 10:22:47 -08:00
practicalswift
2f547e9356 [gardening] Use consistent header structure in newly introduced files 2016-02-15 16:47:48 +01:00
Michael Gottesman
f718111a4f [arc] Integrate ImmutablePointerSet{,Factory} into ARC Sequence Opts.
This speeds up compilation and reduces the memory consumption of test cases with large
CFGs. The specific test case that spawned this fix was a large function
with many dictionary assignments:

public func func_0(dictIn : [String : MyClass]) -> [String : MyClass] {
  var dictOut : [String : MyClass] = [:]
  dictOut["key5000"] = dictIn["key500"]
  dictOut["key5010"] = dictIn["key501"]
  dictOut["key5020"] = dictIn["key502"]
  dictOut["key5030"] = dictIn["key503"]
  dictOut["key5040"] = dictIn["key504"]
  ...
}

This continued for 10k - 20k values.

This commit reduces the compile time by 2.5x and reduces the amount of
memory allocated by ARC by 2.6x (the memory allocation number includes
memory that is subsequently freed).

rdar://24350646
2016-02-14 15:26:59 -08:00
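Editorial sketch for f718111a4f (not code from the commit): the general idea of an immutable, sorted pointer set whose merge produces a new set instead of mutating either input. `PointerSet` is a hypothetical class written for illustration; it does not reproduce ImmutablePointerSet or ImmutablePointerSetFactory, and in particular does no uniquing of equal sets.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical immutable pointer set: sorted, duplicate-free, never mutated
// after construction.
template <typename T>
class PointerSet {
  std::vector<T *> ptrs;

public:
  explicit PointerSet(std::vector<T *> v) : ptrs(std::move(v)) {
    // std::less gives a total order over pointers, unlike raw operator<.
    std::sort(ptrs.begin(), ptrs.end(), std::less<T *>());
    ptrs.erase(std::unique(ptrs.begin(), ptrs.end()), ptrs.end());
  }

  // Returns the union of both sets as a new set; the inputs stay unchanged.
  PointerSet merge(const PointerSet &other) const {
    std::vector<T *> out;
    std::set_union(ptrs.begin(), ptrs.end(), other.ptrs.begin(),
                   other.ptrs.end(), std::back_inserter(out),
                   std::less<T *>());
    return PointerSet(std::move(out));
  }

  bool contains(T *p) const {
    return std::binary_search(ptrs.begin(), ptrs.end(), p, std::less<T *>());
  }

  std::size_t size() const { return ptrs.size(); }
};
```

Because a set never changes once built, data-flow states can share one instance freely instead of each holding a private copy.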
Michael Gottesman
79de928006 [arc] Only visit instructions that are actually interesting from the perspective of ARC when performing the dataflow.
This will improve ARC compile time performance.

rdar://24350646
2016-02-14 14:56:13 -08:00
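Editorial sketch for 79de928006 (not code from the commit): one full scan records where the retains and releases are, and the later "dataflow" touches only those positions. The `Kind` enum and both functions are hypothetical, and the net retain count is only a toy stand-in for the real analysis.

```cpp
#include <cstddef>
#include <vector>

enum class Kind { Retain, Release, Other };

// Single pass over every instruction: remember the positions of the ones
// that matter to ARC.
inline std::vector<std::size_t>
collectInteresting(const std::vector<Kind> &insts) {
  std::vector<std::size_t> interesting;
  for (std::size_t i = 0; i < insts.size(); ++i)
    if (insts[i] != Kind::Other)
      interesting.push_back(i);
  return interesting;
}

// Toy dataflow: visits only the recorded positions and computes the net
// number of retains (retains minus releases).
inline int netRetainCount(const std::vector<Kind> &insts,
                          const std::vector<std::size_t> &interesting) {
  int net = 0;
  for (std::size_t i : interesting)
    net += (insts[i] == Kind::Retain) ? 1 : -1;
  return net;
}
```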
Michael Gottesman
02609c0209 [arc] Remove CodeMotionOrDeleteCallback.
Now that the pairing and the actual pass are together I can remove this. It is no longer needed.
2016-02-14 14:56:13 -08:00
Michael Gottesman
e152746277 [arc] Split GlobalARCPairingAnalysis into the matching set builder part and the
top level driver. Move the top level driver of the pairing analysis into
ARCSequenceOpts and have ARCSequenceOpts use ARCMatchingSetBuilder directly.

This patch is the first in a series of patches that improve ARC compile
time performance by ensuring that ARC only visits the full CFG at most
one time.

Previously when ARC was split into an analysis and a pass, the split in
the codebase occurred at the boundary in between ARCSequenceOpts and
ARCPairingAnalysis. I used a callback to allow ARCSequenceOpts to inject
code into ARCPairingAnalysis.

Now that the analysis has been moved together with the pass this
unnecessarily complicates the code. More importantly though it creates
obstacles towards reducing compile time by visiting the CFG only once.

Specifically, we need to visit the full CFG once to gather interesting
instructions. Then when performing the actual dataflow analysis, we only
visit the interesting instructions. This causes an interesting problem:
since retains/releases can have dependencies on each other,
I need to be able to update where various "interesting instructions" are
located after ARC moves them. The "interesting instruction" information is
stored at the pairing analysis level, but the moving/removal of
instructions is injected in via the callback.

By moving the top level driver part of ARCPairingAnalysis into
ARCSequenceOpts, we simplify the code by eliminating the dependency
injection callback and also make it easier to manage the cached CFG
state in the face of the ARC optimizer moving/removing retains/releases.
2016-02-14 14:56:13 -08:00