swift-mirror

mirror of https://github.com/apple/swift.git synced 2025-12-14 20:36:38 +01:00

Author	SHA1	Message	Date
Xin Tong	64e2710102	Move LSBase.x to SILOptimizer/Utils/. NFC.	2016-03-07 22:07:13 -05:00
Xin Tong	ba0249c924	Rename SILValueProjection.x to LSBase.x. NFC	2016-03-07 21:26:56 -05:00
Xin Tong	bfc258f628	Simplify LSValue::reduce for redundant load elimination LSValue::reduce reduces a set of LSValues (mapped to a set of LSLocations) to a single LSValue. It can then be used as the forwarding value for the location. Previously, we expand into intermediate nodes and leaf nodes and then go bottom up, trying to create a single LSValue out of the given LSValues. Instead, we now use a recursion to go top down. This simplifies the code. And this is fine as we do not expect to run into type tree that are too deep. Existing test cases ensure correctness.	2016-03-07 21:26:56 -05:00
Kevin Yu	8f193c856e	[gardening] Fix typos "cant" -> "can't", "dont" -> "don't"	2016-03-06 00:25:14 +00:00
Xin Tong	ddb9bba50e	Improve the compilation time of redundant load elimination For forwarding on allocstacks, we can invalidate the forwable bit when we hit the deallocate stack. This helps compilation time as we do not need to propagate these bits down to subsequent basic blocks.	2016-02-29 10:11:40 -08:00
Xin Tong	19c528e59d	Make more passes respect no.optimize	2016-02-26 16:03:17 -08:00
Xin Tong	a48584ccbc	Create a fast path for not-final release instruction. For a release on a guaranteed function paramater, we know right away that its not the final release and therefore does not call deinit. Therefore we know it does not read or write memory other than the reference count. This reduces the compilation time of dead store and redundant load elim. As we need to go over alias analysis to make sure tracked locations do not alias with it.	2016-02-20 22:00:36 -08:00
Xin Tong	e42bd372eb	Skip processing block without loads. After collected enough information in the first iteration of the data flow. We do not do second iteration (last iteration) for blocks without loads as we will not forward any load there. This improves compilation time of redundant load elimination.	2016-02-19 20:45:23 -08:00
Xin Tong	84a6ff1d98	And lastly rename NewProjection to Projection. This is a NFC. rdar://24520269	2016-02-09 22:20:10 -08:00
saisi	7f1da6adcc	Fixed more niggling typos	2016-01-29 23:52:24 -05:00
Xin Tong	5c96bc4945	RLE marks the LiveOut of unreachable block as 0. This is done to simplfy the SSAupdate etc, i.e. we do not need to place bogus value in the unreachable blocks in case a SILArgument needs to be constrcuted for this block's successors. This relies on simplifycfg or other passes to clean up the CFG before RLE is ran. isReachable logic is incorrect. This make RLE too conservative in some cases and incorrect in others . This fixed ASAN build break caused by commit `925eb2e0d9` I see more redundant loads elim'ed, but I do not see a performance difference with this change.	2016-01-27 14:58:28 -08:00
practicalswift	75bec87b5a	[gardening] Fix recently introduced typo: optimsitic → optimistic	2016-01-27 11:59:07 +01:00
Xin Tong	9c3cdcc00e	Replace some DenseMap with SmallDenseMap. Many, if not most functions do not have more than 64 Locations. which is the default for DenseMap.	2016-01-26 19:45:01 -08:00
Xin Tong	925eb2e0d9	Correct an inefficiency in initial state of the data flow in RLE	2016-01-26 19:45:01 -08:00
Xin Tong	b3d0d815fc	Change SmallDenseMap initial size from 4 to 16. It seems this gives a bit better compilation time as we do not resize the densemap as much. NFC.	2016-01-26 19:45:01 -08:00
Xin Tong	5034e5ba72	Add in some throttle logic for RLE. This is mostly intended for functions that are way too large to process. I do not see compilation time difference in stdlib -O nor any change in # of redundant loads eliminated. I am more looking at compilation time and precision in stdlibunittest. === Before Throttle Logic === compilation time stdlibunit -O: Running Time Self (ms) Symbol Name 27016.0ms 26.4% 0.0 swift::runSILOptimizationPasses(swift::SILModule&) 26885.0ms 26.2% 0.0 swift::SILPassManager::runOneIteration() 22355.0ms 21.8% 15.0 swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform>) 21416.0ms 20.9% 42.0 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 5662.0ms 5.5% 10.0 (anonymous namespace)::ARCSequenceOpts::run() 3916.0ms 3.8% 58.0 (anonymous namespace)::RedundantLoadElimination::run() 2707.0ms 2.6% 3.0 (anonymous namespace)::SILCombine::run() 2248.0ms 2.1% 5.0 (anonymous namespace)::SimplifyCFGPass::run() 1974.0ms 1.9% 121.0 (anonymous namespace)::SILCSE::run() 1592.0ms 1.5% 30.0 (anonymous namespace)::DeadStoreElimination::run() 746.0ms 0.7% 170.0 (anonymous namespace)::DCE::run() === After Throttle Logic === compilation time stdlibunit -O: Running Time Self (ms) Symbol Name 25735.0ms 25.4% 0.0 swift::runSILOptimizationPasses(swift::SILModule&) 25611.0ms 25.3% 0.0 swift::SILPassManager::runOneIteration() 21260.0ms 21.0% 21.0 swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform>) 20340.0ms 20.1% 43.0 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 5319.0ms 5.2% 8.0 (anonymous namespace)::ARCSequenceOpts::run() 3265.0ms 3.2% 58.0 (anonymous namespace)::RedundantLoadElimination::run() 2661.0ms 2.6% 1.0 (anonymous namespace)::SILCombine::run() 2185.0ms 2.1% 5.0 (anonymous namespace)::SimplifyCFGPass::run() 1847.0ms 1.8% 105.0 (anonymous namespace)::SILCSE::run() 1499.0ms 1.4% 21.0 (anonymous namespace)::DeadStoreElimination::run() 708.0ms 0.7% 150.0 (anonymous namespace)::DCE::run() 498.0ms 0.4% 7.0 (anonymous namespace)::SILCodeMotion::run() 370.0ms 0.3% 0.0 (anonymous namespace)::StackPromotion::run()	2016-01-25 20:10:53 -08:00
Xin Tong	f5bd3eab49	Optimize compilation time for RLE and DSE with respective to the new projection path. We do not need to trace from the accessed field to the base object when we've done it before in enumerateLSLOcations Stdlib -O === Before === Running Time Self (ms) Symbol Name 25137.0ms 37.3% 0.0 swift::runSILOptimizationPasses(swift::SILModule&) 24939.0ms 37.0% 0.0 swift::SILPassManager::runOneIteration() 20226.0ms 30.0% 29.0 swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform>) 19241.0ms 28.5% 83.0 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 3214.0ms 4.7% 10.0 (anonymous namespace)::SimplifyCFGPass::run() 3005.0ms 4.4% 14.0 (anonymous namespace)::ARCSequenceOpts::run() 2438.0ms 3.6% 7.0 (anonymous namespace)::SILCombine::run() 2217.0ms 3.2% 54.0 (anonymous namespace)::RedundantLoadElimination::run() 2212.0ms 3.2% 131.0 (anonymous namespace)::SILCSE::run() 1195.0ms 1.7% 11.0 (anonymous namespace)::GenericSpecializer::run() 1168.0ms 1.7% 39.0 (anonymous namespace)::DeadStoreElimination::run() 853.0ms 1.2% 150.0 (anonymous namespace)::DCE::run() 499.0ms 0.7% 7.0 (anonymous namespace)::SILCodeMotion::run() === After === Running Time Self (ms) Symbol Name 22955.0ms 38.2% 0.0 swift::runSILOptimizationPasses(swift::SILModule&) 22777.0ms 37.9% 0.0 swift::SILPassManager::runOneIteration() 18447.0ms 30.7% 30.0 swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform>) 17510.0ms 29.1% 67.0 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 2944.0ms 4.9% 5.0 (anonymous namespace)::SimplifyCFGPass::run() 2884.0ms 4.8% 12.0 (anonymous namespace)::ARCSequenceOpts::run() 2277.0ms 3.7% 1.0 (anonymous namespace)::SILCombine::run() 1951.0ms 3.2% 117.0 (anonymous namespace)::SILCSE::run() 1803.0ms 3.0% 54.0 (anonymous namespace)::RedundantLoadElimination::run() 1096.0ms 1.8% 10.0 (anonymous namespace)::GenericSpecializer::run() 911.0ms 1.5% 53.0 (anonymous namespace)::DeadStoreElimination::run() 795.0ms 1.3% 135.0 (anonymous namespace)::DCE::run() 453.0ms 0.7% 9.0 (anonymous namespace)::SILCodeMotion::run()	2016-01-25 20:10:04 -08:00
Xin Tong	4084c2383f	Refactor redundant load elimination. NFC	2016-01-25 20:09:17 -08:00
Xin Tong	546471ac4d	Port dead store elimination and redundant load elimination to use the new projection. This patch also implements some of the missing functions used by RLE and DSE in new projection that exist in the old projection. New projection provides better memory usage, eventually we will phase out the old projection code. New projection is now copyable, i.e. we have a proper constructor for it. This helps make the code more readable. We do see a bit increase in compilation time in compiling stdlib -O, this is a result of the way we now get types of a projection path, but I expect this to go down (away) with further improvement on how memory locations are constructed and cached with later patches. === With the OLD Projection. === Total amount of memory allocated. -------------------------------- Bytes Used Count Symbol Name 13032.01 MB 50.6% 2158819 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 2879.70 MB 11.1% 3076018 (anonymous namespace)::ARCSequenceOpts::run() 2663.68 MB 10.3% 1375465 (anonymous namespace)::RedundantLoadElimination::run() 1534.35 MB 5.9% 5067928 (anonymous namespace)::SimplifyCFGPass::run() 1278.09 MB 4.9% 576714 (anonymous namespace)::SILCombine::run() 1052.68 MB 4.0% 935809 (anonymous namespace)::DeadStoreElimination::run() 771.75 MB 2.9% 1677391 (anonymous namespace)::SILCSE::run() 715.07 MB 2.7% 4198193 (anonymous namespace)::GenericSpecializer::run() 434.87 MB 1.6% 652701 (anonymous namespace)::SILSROA::run() 402.99 MB 1.5% 658563 (anonymous namespace)::SILCodeMotion::run() 341.13 MB 1.3% 962459 (anonymous namespace)::DCE::run() 279.48 MB 1.0% 415031 (anonymous namespace)::StackPromotion::run() Compilation time breakdown. -------------------------- Running Time Self (ms) Symbol Name 25716.0ms 35.8% 0.0 swift::runSILOptimizationPasses(swift::SILModule&) 25513.0ms 35.5% 0.0 swift::SILPassManager::runOneIteration() 20666.0ms 28.8% 24.0 swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform>) 19664.0ms 27.4% 77.0 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 3272.0ms 4.5% 12.0 (anonymous namespace)::SimplifyCFGPass::run() 3266.0ms 4.5% 7.0 (anonymous namespace)::ARCSequenceOpts::run() 2608.0ms 3.6% 5.0 (anonymous namespace)::SILCombine::run() 2089.0ms 2.9% 104.0 (anonymous namespace)::SILCSE::run() 1929.0ms 2.7% 47.0 (anonymous namespace)::RedundantLoadElimination::run() 1280.0ms 1.7% 14.0 (anonymous namespace)::GenericSpecializer::run() 1010.0ms 1.4% 45.0 (anonymous namespace)::DeadStoreElimination::run() 966.0ms 1.3% 191.0 (anonymous namespace)::DCE::run() 496.0ms 0.6% 6.0 (anonymous namespace)::SILCodeMotion::run() === With the NEW Projection. === Total amount of memory allocated. -------------------------------- Bytes Used Count Symbol Name 11876.64 MB 48.4% 22112349 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 2887.22 MB 11.8% 3079485 (anonymous namespace)::ARCSequenceOpts::run() 1820.89 MB 7.4% 1877674 (anonymous namespace)::RedundantLoadElimination::run() 1533.16 MB 6.2% 5073310 (anonymous namespace)::SimplifyCFGPass::run() 1282.86 MB 5.2% 577024 (anonymous namespace)::SILCombine::run() 772.21 MB 3.1% 1679154 (anonymous namespace)::SILCSE::run() 721.69 MB 2.9% 936958 (anonymous namespace)::DeadStoreElimination::run() 715.08 MB 2.9% 4196263 (anonymous namespace)::GenericSpecializer::run() Compilation time breakdown. -------------------------- Running Time Self (ms) Symbol Name 25137.0ms 37.3% 0.0 swift::runSILOptimizationPasses(swift::SILModule&) 24939.0ms 37.0% 0.0 swift::SILPassManager::runOneIteration() 20226.0ms 30.0% 29.0 swift::SILPassManager::runFunctionPasses(llvm::ArrayRef<swift::SILFunctionTransform>) 19241.0ms 28.5% 83.0 swift::SILPassManager::runPassesOnFunction(llvm::ArrayRef<swift::SILFunctionTransform>, swift::SILFunction) 3214.0ms 4.7% 10.0 (anonymous namespace)::SimplifyCFGPass::run() 3005.0ms 4.4% 14.0 (anonymous namespace)::ARCSequenceOpts::run() 2438.0ms 3.6% 7.0 (anonymous namespace)::SILCombine::run() 2217.0ms 3.2% 54.0 (anonymous namespace)::RedundantLoadElimination::run() 2212.0ms 3.2% 131.0 (anonymous namespace)::SILCSE::run() 1195.0ms 1.7% 11.0 (anonymous namespace)::GenericSpecializer::run() 1168.0ms 1.7% 39.0 (anonymous namespace)::DeadStoreElimination::run() 853.0ms 1.2% 150.0 (anonymous namespace)::DCE::run() 499.0ms 0.7% 7.0 (anonymous namespace)::SILCodeMotion::run()	2016-01-25 20:08:29 -08:00
Erik Eckstein	74d44b74e7	SIL: remove SILValue::getDef and add a cast operator to ValueBase * as a repelacement. NFC.	2016-01-25 15:00:49 -08:00
Erik Eckstein	ec172cde5b	Remove SILValue::replaceAllUsesWith. It's not needed anymore because we can use ValueBase::replaceAllUses	2016-01-21 16:04:30 -08:00
practicalswift	6d0eee9b8c	Remove unused variables.	2016-01-21 10:33:17 +01:00
Xin Tong	9f8dcfab07	Make Fix_lifetime instruction inert from a load prospective	2016-01-20 15:01:56 -08:00
practicalswift	1339b5403b	Consistent use of header comment format. Correct format: //===--- Name of file - Description ----------------------------- Lang --===//	2016-01-04 13:26:31 +01:00
practicalswift	50baf2e53b	Use consistent formatting in top of file headers.	2016-01-04 02:17:48 +01:00
Zach Panzarino	e3a4147ac9	Update copyright date	2015-12-31 23:28:40 +00:00
practicalswift	2e995a8ba4	Fix recently introduced typos.	2015-12-31 15:28:54 +01:00
Xin Tong	a35eabd6f7	Instead of enumerating all the LSValues before RLE is ran on the function. we enumerate them lazily. This leads to compilation time improvement, as some of the LSValues previously enumerated do not be created in this approach. i.e. we enumerate LSValues created by loads previously, but the LoadInsts could be target for RLE. In such case, the enumerated LSValues are not used. Compilation time improvement: 1775ms to 1721ms (2.7% to 2.6% of the entire compilation time for stdlib -O). Existing tests ensure correctness. Note: we still enumerate locations, as we need to know how many locations there are in the function to resize the bitvector appropriately before the data flow runs.	2015-12-30 14:35:59 -08:00
Xin Tong	0c162cd3d1	Refactor in redundant load elimination. NFC. 1. Update some comments. 2. Rename a few functions, e.g. runIterativeDF -> runIterativeRLE, getLSValueBit -> getValueBit. 3. Remove unused headers. 4. Remove no-longer used function, mergePredecessorStates and mergePredecessorState. 5. A few other small NFCs.	2015-12-29 15:35:49 -08:00
practicalswift	fb314c37aa	Fix typos introduced yesterday :-)	2015-12-29 13:16:02 +01:00
Xin Tong	cc1dda478e	Move to a genset and killset for the dataflow in redundant load elimination. Previously we process every instruction every time the data flow re-iterates. This is very inefficient. In addition to moving to genset and killset, we also group function into OneIterationFunction which we know that the data flow would converge in 1 iteration and functions that requre the iterative data flow, mostly due to backedges in loops. we process them differently. I observed that there are ~93% of the functions that require just a single iteration to perform the RLE. But the other 7% accounts for 2321 (out of 6318) of the redundant loads we eliminated. This change reduces RLE compilation from 4.1% to 2.7% of the entire compilation time (frontend+OPT+LLVM) on stdlib with -O. This represents 6.9% of the time spent in SILOptimizations (38.8%). ~2 weeks ago, RLE was taking 1.9% of the entire compilation time. It rose to 4.1% mostly due to that we are now eliminating many more redundant loads (mostly thank to Erik's integragtion of escape analysis in alias analysis). i.e. 3945 redundant loads elimnated before Erik's change to 6318 redundant loads eliminated now.	2015-12-28 15:04:58 -08:00
practicalswift	fa0b339a21	Fix typos.	2015-12-26 17:51:59 +01:00
practicalswift	22e10737e2	Fix typos	2015-12-26 01:19:40 +01:00
practicalswift	81e7439a9a	Fix typos.	2015-12-23 11:16:34 +01:00
ken0nek	fcd8fcee91	Convert [Cc]an not -> [Cc]annot	2015-12-23 00:55:48 +09:00
Xin Tong	ee1396aaa6	Refactor RLE into stages. After the refactoring, RLE runs in the following phases: Phase 1. we use an iterative data flow to compute whether there is an available value at a given point, we do not yet care about what the value is. Phase 2. we compute the real forwardable value at a given point. Phase 3. we setup the SILValues for the redundant load elimination. Phase 4. we perform the redundant load elimination. Previously we were computing available bit as well as what the available value is every iteration of the data flow. I do not see a compilation time improvement though, but this helps to move to a genset and killset later as we only need to expand Phase 1 into a few smaller phases to compute genset & killset first and then iterate until convergence for the data flow. I verified that we are performing same # of RLE on stdlib before the change. Existing test ensure correctness.	2015-12-21 14:56:56 -08:00
Xin Tong	5ce5a5ec4d	Move to use SSAUpdater to generate the SILArgument when a location has a covering value, i.e. multiple different values from predecessors Previously, RLE is placing the SILArguments and branch edgevalues itself. This is probably not as reliable/robust as using the SSAupdater. RLE uses a single SSAupdater to create the SILArguments, this way previously created SILArguments can be reused. One test is created specifically for that so that we do not generate extraneous SILArguments.	2015-12-21 00:31:11 -08:00
Xin Tong	5d10f31b0e	Add a condition on what the maximum of locations there are in a function for RLE to optimize RLE is an iterative data flow. Functions with too many locations may take a long time for the data flow to converge. Once we move to a genset and killset for RLE. we should be able to lessen the condition a bit more. I have observed no difference in # of redundant loads eliminated on the stdlib (currently we eliminate 3862 redundant loads).	2015-12-20 22:42:49 -08:00
Xin Tong	ec83f8c661	Revert "Move to use SSAUpdater to generate the SILArgument when a location has a covering value," This reverts commit `bc172647c7`. Caused a compiler crash on Linux.	2015-12-20 18:19:40 -08:00
Xin Tong	bc172647c7	Move to use SSAUpdater to generate the SILArgument when a location has a covering value, i.e. multiple different values from predecessors Previously, RLE is placing the SILArguments and branch edgevalues itself. This is probably not as reliable/robust as using the SSAupdater. RLE uses a single SSAupdater to create the SILArguments, this way previously created SILArguments can be reused. One test is created specifically for that so that we do not generate extraneous SILArguments.	2015-12-20 17:40:41 -08:00
practicalswift	8ab8847684	Fix typos.	2015-12-16 22:09:32 +01:00
Dmitri Gribenko	f957a68170	Merge pull request #513 from practicalswift/fix-typos-17 Fix typos (17 of 30)	2015-12-14 01:11:28 -08:00
practicalswift	a297b26278	Fix typo: invaliate → invalidate	2015-12-14 00:12:00 +01:00
practicalswift	45cb275de0	Fix typo: avalable → available	2015-12-14 00:11:18 +01:00
Xin Tong	3b62b2f15c	Rename reduceWithValues to LSValue::reduce	2015-12-13 11:37:34 -08:00
Xin Tong	27b5a40359	Rename MemLocation to LSLocation and LoadStoreValue to LSValue	2015-12-13 10:02:45 -08:00
Xin Tong	642a555a6c	Rename MemLocation.h to SILValueProjection.h	2015-12-13 09:31:56 -08:00
Xin Tong	7559f2013d	Implement a LoadStoreValue vault so that every basic block keeps an unsigned index instead of the real LoadStoreValue in RLE	2015-12-13 09:10:32 -08:00
Andrew Trick	739b0e9c56	Reorganize SILOptimizer directories for better discoverability. (libraries now) It has been generally agreed that we need to do this reorg, and now seems like the perfect time. Some major pass reorganization is in the works. This does not have to be the final word on the matter. The consensus among those working on the code is that it's much better than what we had and a better starting point for future bike shedding. Note that the previous organization was designed to allow separate analysis and optimization libraries. It turns out this is an artificial distinction and not an important goal.	2015-12-11 15:14:23 -08:00

49 Commits