swift-mirror

mirror of https://github.com/apple/swift.git synced 2025-12-14 20:36:38 +01:00

Author	SHA1	Message	Date
Xin Tong	cc1dda478e	Move to a genset and killset for the dataflow in redundant load elimination. Previously we processed every instruction every time the data flow re-iterated, which is very inefficient. In addition to moving to a genset and killset, we also group functions into OneIterationFunction, for which we know that the data flow converges in 1 iteration, and functions that require the iterative data flow, mostly due to backedges in loops. We process them differently. I observed that ~93% of the functions require just a single iteration to perform the RLE, but the other 7% account for 2321 (out of 6318) of the redundant loads we eliminated. This change reduces RLE compilation from 4.1% to 2.7% of the entire compilation time (frontend+OPT+LLVM) on stdlib with -O. This represents 6.9% of the time spent in SILOptimizations (38.8%). ~2 weeks ago, RLE was taking 1.9% of the entire compilation time. It rose to 4.1% mostly because we are now eliminating many more redundant loads (mostly thanks to Erik's integration of escape analysis in alias analysis), i.e. 3945 redundant loads eliminated before Erik's change vs. 6318 redundant loads eliminated now.	2015-12-28 15:04:58 -08:00
Xin Tong	7c43b47bd0	revert some WIP code	2015-12-28 14:41:16 -08:00
Xin Tong	0c05064429	Speculatively disable escape analysis in local variable DSE. There is a test case failure on the real device, but not on simulators. The failure could be a result of this change.	2015-12-28 14:39:54 -08:00
practicalswift	149b50d901	Fix typos in code (non-comment/documentation typos).	2015-12-28 11:42:15 +01:00
Chris Lattner	487b0d1365	Merge pull request #796 from practicalswift/fix-comment-typos Fix typos in code comments	2015-12-27 20:46:46 -08:00
Xin Tong	7e49985b00	Turn a function into early exit style. NFC	2015-12-27 17:35:24 -08:00
practicalswift	fd70b26033	Fix typos in comments.	2015-12-28 02:15:34 +01:00
Xin Tong	656894567f	Some functions do not really need the iterative data flow in DSE, i.e. functions for which a single post-order traversal would be enough for DSE. In this case, we do not really need to compute the genset and killset (which is a costly operation). On stdlib, I see that 93% of the functions are "OneIterationFunction". With this change, I see the compilation time of DSE drop from 2.0% to 1.7% of the entire compilation. This represents 4.3% of all the time spent in SILOptimizations (39.5%).	2015-12-26 20:33:35 -08:00
Chris Lattner	bee0d955ff	Merge pull request #784 from practicalswift/a-vs-an-again Fix typos: "a" vs. "an"	2015-12-26 17:45:36 -08:00
Xin Tong	173fc871ff	Use a SmallBitVector instead of BitVector in DSE. I observed that most functions do not have over 64 locations which makes SmallBitVector a more suitable choice than BitVector. I see a ~10% drop in compilation time in DSE. i.e. 1430ms to 1270ms (2.2% to 2.0% of overall compilation time).	2015-12-26 12:30:49 -08:00
Xin Tong	8be2c51875	Merge 2 steps in data flow in DSE into 1. The more we iterate over the instructions in the function, the more compilation time DSE takes. I see a ~10% drop in DSE compilation time, from 2.4% to 2.2%. (Last time (~1 week ago) I checked, DSE was taking 1.4% of the compilation time. Now it's taking 2.2%. I will look into where the increase comes from later.)	2015-12-26 11:06:52 -08:00
practicalswift	fa0b339a21	Fix typos.	2015-12-26 17:51:59 +01:00
practicalswift	85e2e6eb9a	Fix a vs. an	2015-12-26 14:40:16 +01:00
practicalswift	db13bcb22e	Fix typos.	2015-12-26 14:11:42 +01:00
Nadav Rotem	07d4558c1c	[Mangler] Change the Swift mangler into a symbol builder. This commit changes the Swift mangler from a utility that writes tokens into a stream into a name-builder that has two phases: "building a name", and "ready". This clear separation is needed for the implementation of the compression layer. Users of the mangler can continue to build the name using the mangleXXX methods, but to access the results the users of the mangler need to call the finalize() method. This method can write the result into a stream, like before, or return an std::string.	2015-12-25 21:40:25 -08:00
practicalswift	22e10737e2	Fix typos	2015-12-26 01:19:40 +01:00
Xin Tong	ed7ef800ab	More refactoring in DSE. Split 1 more function and update some comments.	2015-12-24 16:11:19 -08:00
Xin Tong	30ed2f15aa	Move local variable store checks into helper function. NFC	2015-12-24 15:26:37 -08:00
Xin Tong	1dd13b5d92	Optimize how basic blocks are processed in DSE. I do not see a real compilation time improvement	2015-12-24 15:10:39 -08:00
Xin Tong	6a942d2d20	Split bigger functions into multiple smaller functions in DSE. NFC	2015-12-24 14:58:44 -08:00
Xin Tong	32dc903339	Enable local variable dead store elimination. If a variable cannot escape the function, we mark the store to it as dead before the function exits. It was disabled due to some TBAA and side-effect analysis changes. We were removing 8 dead stores on the stdlib. With this change, we are now removing 203 dead stores. I only see a noise-level performance difference on PerfTestSuite. I do not see a real increase in compilation time either.	2015-12-24 14:18:36 -08:00
practicalswift	4ff12be924	Fix typos.	2015-12-24 14:26:57 +01:00
Erik Eckstein	dd204e68ef	DSE: Use escape analysis for checking if memory locations are dead at the end of a function. Currently NFC as local DSE is still disabled.	2015-12-23 16:43:11 -08:00
practicalswift	81e7439a9a	Fix typos.	2015-12-23 11:16:34 +01:00
Mark Lacey	99d17a47f9	Change the invalidation kind for the generic specializer. It adds functions, so technically it needs to invalidate more than just the function body.	2015-12-22 23:19:44 -08:00
Mark Lacey	75f2de5c79	Use a more appropriate invalidation kind in the devirtualizer. We're not touching branches, so we do not need to invalidate those.	2015-12-22 23:18:38 -08:00
Andrew Trick	2da0f601d8	Explicitly restrict NRVO optimization to "out" args. Don't allow this optimization to kick in for "inout" args. The optimization may expose local writes to any aliases of the argument. I can't prove that is memory safe. Erik pointed out this case.	2015-12-22 21:03:03 -08:00
Andrew Trick	000e630b2f	Teach CopyForwarding to handle existential initialization ...in addition to enum initialization, as Slava requested. Benchmark results:	2015-12-22 21:03:03 -08:00
| benchmark   | bestbase | bestopt | delta | %delta | speedup |
| ArrayAppend |     2514 |    2382 |  -132 |  -5.3% |   1.06x |
| NSXMLParser |    12076 |   11223 |  -853 |  -7.1% |   1.08x |
| NopDeinit   |    54961 |   50619 | -4342 |  -7.9% |   1.09x |
| StringWalk  |    20503 |   16119 | -4384 | -21.4% |   1.27x |
Nadav Rotem	a981c80571	[Mangler] Move the SILMangler out of the AST Mangler namespace.	2015-12-22 17:19:40 -08:00
Slava Pestov	36ddea64ae	Merge pull request #729 from ken0nek/fix-can-not Convert [Cc]an not -> [Cc]annot	2015-12-22 16:06:20 -08:00
practicalswift	6e3b700b44	Fix typos.	2015-12-23 00:31:13 +01:00
Andrew Trick	0c7ee1f283	Teach CopyForwarding to handle enum initialization sequences. This requires a bit of code motion. e.g. 1. %Tmp = alloc_stack 2. copy_addr %InArg to [initialization] %Tmp 3. %DataAddr = init_enum_data_addr %OutArg 4. copy_addr %Tmp#1 to [initialization] %DataAddr becomes 1. %Tmp = alloc_stack 4. %DataAddr = init_enum_data_addr %OutArg 2. copy_addr %InArg to [initialization] %DataAddr. Fixes at least one regression resulting from '++' removal. See rdar://23874709 [perf] -Onone Execution Time regression of up-to 19%. -Onone results:	2015-12-22 14:35:52 -08:00
| benchmark             | bestbase | bestopt |  delta | %delta | speedup |
| StringWalk            |    33570 |   20967 | -12603 | -37.5% |   1.60x |
| OpenClose             |      446 |     376 |    -70 | -15.7% |   1.19x |
| SmallPT               |    98959 |   83964 | -14995 | -15.2% |   1.18x |
| StrToInt              |    17550 |   16377 |  -1173 |  -6.7% |   1.07x |
| BenchLangCallingCFunc |      453 |     428 |    -25 |  -5.5% |   1.06x |
| CaptureProp           |    50758 |   48156 |  -2602 |  -5.1% |   1.05x |
| ProtocolDispatch      |     5276 |    5017 |   -259 |  -4.9% |   1.05x |
| Join                  |     1433 |    1372 |    -61 |  -4.3% |   1.04x |
Andrew Trick	9a85a0254f	Make CopyForwarding more conservative with destination addresses. AFAICT, this does not fix any existing bug, but eliminates unverified assumptions about well-formed SIL, which could be broken by future optimization. Forward: The optimization will replace all in-scope uses of the destination address with the source. With this change we will be sure not to eliminate writes into a destination address unless the destination is an AllocStackInst. This hasn't been a problem in practice because the optimization requires an in-scope deinit of the destination address, which can't happen on typical address projections. Backward: The optimization will replace in-scope uses of the source with the destination. With this change we will be sure not to write into the destination location prior to the copy unless the destination is an AllocStackInst. This hasn't been a problem in practice because the optimization requires the copy to be an initialization of the address, which can't happen on typical address projections. This change prevents both optimizations without an obvious guarantee that any dependency on the destination address will manifest as a SIL-level dependence on the address producer. For example, init_enum_data_addr would not qualify because it simply projects an address within a value that may have other dependencies.	2015-12-22 14:35:52 -08:00
ken0nek	fcd8fcee91	Convert [Cc]an not -> [Cc]annot	2015-12-23 00:55:48 +09:00
Arsen Gasparyan	be738abb7c	Fix else code style	2015-12-22 11:13:26 +03:00
Mark Lacey	70938b1aee	Add a stand-alone devirtualizer pass. Add back a stand-alone devirtualizer pass, running prior to generic specialization. As with the stand-alone generic specializer pass, this may add functions to the pass manager's work list. This is another step in unbundling these passes from the performance inliner.	2015-12-21 23:42:37 -08:00
Mark Lacey	4bb33dc3fd	Remove some inadvertently committed code. Remove some things that were supposed to have been removed prior to the original commit, and fix a typo in the DEBUG_TYPE string.	2015-12-21 23:42:37 -08:00
Michael Gottesman	976d39fa08	Add 3 cases that I missed to make a switch truly exhaustive.	2015-12-21 19:06:25 -06:00
Michael Gottesman	3eca15623b	Change 6 non-exhaustive switches on ValueKind to be exhaustive switches on TermKind. NFC. This exposed the first interesting bug found by using TermKind: in DCE we were not properly handling switch_enum_addr and checked_cast_addr_br. SR-335 rdar://23980060	2015-12-21 17:12:06 -06:00
Xin Tong	ee1396aaa6	Refactor RLE into stages. After the refactoring, RLE runs in the following phases: Phase 1. We use an iterative data flow to compute whether there is an available value at a given point; we do not yet care about what the value is. Phase 2. We compute the real forwardable value at a given point. Phase 3. We set up the SILValues for the redundant load elimination. Phase 4. We perform the redundant load elimination. Previously we were computing the available bit as well as the available value in every iteration of the data flow. I do not see a compilation time improvement, but this helps the later move to a genset and killset, as we only need to expand Phase 1 into a few smaller phases to compute the genset & killset first and then iterate the data flow until convergence. I verified that we perform the same # of RLE on stdlib as before the change. Existing tests ensure correctness.	2015-12-21 14:56:56 -08:00
practicalswift	36d7072013	Remove immediately adjacent repeated words ("the the", "for for", "an an", etc.).	2015-12-21 22:16:04 +01:00
Erik Eckstein	b3691c83a6	SimplifyCFG: fix crash if a branch argument is replaced with a SILValue with result number != 0 Fixes https://bugs.swift.org/browse/SR-329 rdar://problem/23969954	2015-12-21 10:08:36 -08:00
Xin Tong	5ce5a5ec4d	Move to use SSAUpdater to generate the SILArgument when a location has a covering value, i.e. multiple different values from predecessors. Previously, RLE was placing the SILArguments and branch edge values itself. This is probably not as reliable/robust as using the SSAUpdater. RLE uses a single SSAUpdater to create the SILArguments, so that previously created SILArguments can be reused. One test is created specifically for that, to check that we do not generate extraneous SILArguments.	2015-12-21 00:31:11 -08:00
Xin Tong	5d10f31b0e	Add a limit on the maximum number of locations in a function for RLE to optimize. RLE is an iterative data flow. Functions with too many locations may take a long time for the data flow to converge. Once we move to a genset and killset for RLE, we should be able to relax the condition a bit more. I have observed no difference in # of redundant loads eliminated on the stdlib (currently we eliminate 3862 redundant loads).	2015-12-20 22:42:49 -08:00
Xin Tong	ec83f8c661	Revert "Move to use SSAUpdater to generate the SILArgument when a location has a covering value," This reverts commit `bc172647c7`. Caused a compiler crash on Linux.	2015-12-20 18:19:40 -08:00
Xin Tong	bc172647c7	Move to use SSAUpdater to generate the SILArgument when a location has a covering value, i.e. multiple different values from predecessors. Previously, RLE was placing the SILArguments and branch edge values itself. This is probably not as reliable/robust as using the SSAUpdater. RLE uses a single SSAUpdater to create the SILArguments, so that previously created SILArguments can be reused. One test is created specifically for that, to check that we do not generate extraneous SILArguments.	2015-12-20 17:40:41 -08:00
Xin Tong	375bef1295	Minor comment fix in DeadObjectElimination.cpp	2015-12-20 14:23:20 -08:00
Xin Tong	a95e25c887	Refactor in DSE: 1. Add some comments regarding how the pass builds and uses genset and killset. 2. Merge some similar functions. 3. Rename DSEComputeKind to DSEKind. 4. Some other small comment changes.	2015-12-20 14:19:47 -08:00
Mark Lacey	faba6e56b7	Add a stand-alone generic specializer pass. Begin unbundling devirtualization, specialization, and inlining by recreating the stand-alone generic specializer pass. I've added a use of the pass to the pipeline, but this is almost certainly not going to be the final location of where it runs. It's primarily there to ensure this code gets exercised. Since this is running prior to inlining, it changes the order that some functions are specialized in, which means differences in the order of output of one of the tests (one which similarly changed when devirtualization, specialization, and inlining were bundled together).	2015-12-18 14:08:56 -08:00
Xin Tong	cb94f5fdd1	Update some comments and remove dead code in dead store elimination	2015-12-18 10:18:36 -08:00
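
Commit cc1dda478e in the list above describes moving RLE to a genset/killset data flow and treating functions that converge in one iteration separately from those with loop backedges. The following is a minimal sketch of that general technique on a toy CFG, not the code from the Swift sources. The names `Block`, `LocationSet`, `MaxLocations` and `solveAvailability` are hypothetical, and the sketch assumes a forward "is a value available" problem with blocks stored in reverse post-order.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cap on tracked memory locations; not taken from the Swift sources.
constexpr std::size_t MaxLocations = 64;
using LocationSet = std::bitset<MaxLocations>;

// One basic block of a toy CFG. Blocks are assumed to be stored in
// reverse post-order, with block 0 as the entry block.
struct Block {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  LocationSet Gen;                // locations this block makes available
  LocationSet Kill;               // locations this block invalidates
  LocationSet In;                 // locations available on entry
  LocationSet Out;                // locations available on exit
};

// Forward availability data flow:
//   In[B]  = intersection of Out[P] over all predecessors P (empty for entry)
//   Out[B] = Gen[B] | (In[B] & ~Kill[B])
// Returns the number of passes that changed at least one Out set.
std::size_t solveAvailability(std::vector<Block> &Blocks) {
  // Optimistic start: every Out set begins as "all locations available".
  for (Block &B : Blocks)
    B.Out.set();

  std::size_t ChangingPasses = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (Block &B : Blocks) {
      LocationSet In; // stays empty for a block without predecessors
      if (!B.Preds.empty()) {
        In.set();
        for (std::size_t P : B.Preds) {
          assert(P < Blocks.size() && "predecessor index out of range");
          In &= Blocks[P].Out;
        }
      }
      // Only the precomputed Gen/Kill sets are consulted here; the
      // instructions of the block are not revisited on each pass.
      LocationSet Out = B.Gen | (In & ~B.Kill);
      B.In = In;
      if (Out != B.Out) {
        B.Out = Out;
        Changed = true;
      }
    }
    if (Changed)
      ++ChangingPasses;
  }
  return ChangingPasses;
}
```

In this sketch a straight-line CFG settles after a single changing pass, while a CFG with a backedge whose loop body kills a location needs a second pass before the loop header sees the kill. That is the distinction the commit message draws between one-iteration functions and functions that need the iterative data flow.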


2592 Commits
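
Commit 07d4558c1c in the list above describes turning the mangler into a name builder with two phases, "building a name" and "ready", where results are obtained through finalize(), either written into a stream or returned as an std::string. A rough sketch of that two-phase shape is shown below. Only the finalize() name comes from the commit message. `ToyNameBuilder`, `mangleIdentifier`, the length-prefixed encoding and the `_T` prefix are illustrative assumptions, not the Swift compiler's Mangler API.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical two-phase name builder. Everything except the finalize()
// name is an illustrative assumption.
class ToyNameBuilder {
public:
  // Phase 1 ("building a name"): append one length-prefixed identifier.
  void mangleIdentifier(const std::string &Name) {
    assert(!Ready && "cannot extend a name after finalize()");
    Buffer += std::to_string(Name.size());
    Buffer += Name;
  }

  // Phase 2 ("ready"): return the finished name as a string. Because the
  // whole name is known at this point, a post-processing step such as a
  // compression layer could run here.
  std::string finalize() {
    assert(!Ready && "finalize() may only be called once");
    Ready = true;
    return std::move(Buffer);
  }

  // Alternative finalizer that writes the result into a stream instead.
  void finalize(std::ostream &OS) { OS << finalize(); }

private:
  std::string Buffer = "_T"; // illustrative prefix
  bool Ready = false;
};
```

A caller would invoke mangleIdentifier() any number of times and then call finalize() exactly once, for example with a std::ostringstream when the stream form is wanted. Keeping the buffer private until finalize() is what gives a later layer a single point at which the complete name is available.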