inlined-at chain.
The previous implementation was only correct for cases where the inliner
inlined bottom-up in the call graph, which happened to cover the majority
of all cases.
rdar://problem/24462475
This allows devirtualization of witness method calls if the initialization of the existential is not in the same basic block.
This change also fixes a bug where promotion is done even if the stack is overwritten after initialization, although I'm not sure this kind of code is ever generated.
function signature opt.
Instead of replacing %1 with UNDEF in debugvalueinst %1, we form an aggregate,
taking the live part of %1 and filling the dead part with undef.
rdar://23727705
Now that we process functions in bottom-up order in the pass manager and
have a mechanism to restart the pass pipeline on the current
function (or on a newly created callee function), we can split these
passes back out from the inliner and end up with the same benefits we
had from initially integrating them. We get the further benefit of fully
optimizing newly created callee functions before continuing with the
function that resulted in the creation of those callee
functions (e.g. as a result of a specialization pass running).
Previously we treated the * platform as checking for the minimum
deployment target, but that's definitely unnecessary.
There is a bit of a hack here to avoid diagnosing the 'else' branch as
unreachable: if a constant true/false came from #available, ignore it.
This effectively returns us to the code from dc65f70.
With the recent pass manager changes, combined with upcoming inliner
changes, we can potentially run the inliner more than we currently
do. Allowing self-recursive functions to be inlined, and running the
inliner more often, can result in a lot of code bloat, which increases
binary sizes and compile times. Even with a relatively small value (10)
for the number of times we allow a function to run through the pass
pipeline, we end up with a significant increase in the stdlib and
stdlib unit test build times.
This results in some performance regressions, but I think the trade-off
here is reasonable.
This is done by splitting the transformation into an analysis phase and a transformation phase (which does not use the dominator tree anymore).
The dominator tree is recalculated once after the whole function is processed.
This change eventually solves the compile time problem of rdar://problem/24410167.
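
A minimal sketch of the analysis/transformation split described above, not the actual SimplifyCFG code. ToyFunction, collectCandidates, rewrite and rebuildDominatorTree are hypothetical names, and "negate negative blocks" is only a placeholder for the real rewrite. The point it illustrates is that the expensive rebuild happens once per function rather than once per rewritten block.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a function body: each int plays the role of one block.
// All names here are hypothetical and do not come from the Swift sources.
struct ToyFunction {
  std::vector<int> blocks;
  unsigned domTreeRebuilds = 0; // counts simulated dominator-tree rebuilds
};

// Simulated expensive recomputation.
void rebuildDominatorTree(ToyFunction &fn) { ++fn.domTreeRebuilds; }

// Phase 1 (analysis): read-only; records which blocks should be rewritten.
std::vector<std::size_t> collectCandidates(const ToyFunction &fn) {
  std::vector<std::size_t> candidates;
  for (std::size_t i = 0; i < fn.blocks.size(); ++i)
    if (fn.blocks[i] < 0)
      candidates.push_back(i);
  return candidates;
}

// Phase 2 (transformation): mutates without consulting the dominator tree.
void rewrite(ToyFunction &fn, const std::vector<std::size_t> &candidates) {
  for (std::size_t i : candidates)
    fn.blocks[i] = -fn.blocks[i];
}

// Driver: at most one rebuild per function, after all rewrites are done.
void optimize(ToyFunction &fn) {
  std::vector<std::size_t> candidates = collectCandidates(fn);
  if (candidates.empty())
    return;
  rewrite(fn, candidates);
  rebuildDominatorTree(fn);
}
```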
If a class has @objc ancestry, it can be dynamically overridden, and
therefore we don't know the default case even if we see the full class
hierarchy.
rdar://23228386
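
The check described above could look roughly like the following. This is a sketch under assumptions: ClassDecl, hasObjCAncestry and canAssumeDefaultCase are hypothetical names, and treating the class itself as part of its own ancestry is an assumption, not something the message states.

```cpp
#include <string>

// Hypothetical, simplified class declaration; not the compiler's ClassDecl.
struct ClassDecl {
  std::string name;
  bool isObjC;                 // declared @objc (assumed flag)
  const ClassDecl *superclass; // nullptr for a root class
};

// Walks the superclass chain, starting at the class itself.
bool hasObjCAncestry(const ClassDecl *decl) {
  for (; decl != nullptr; decl = decl->superclass)
    if (decl->isObjC)
      return true;
  return false;
}

// The default case is only known when the whole hierarchy is visible and
// nothing in the ancestry allows subclasses to be created dynamically.
bool canAssumeDefaultCase(const ClassDecl &decl, bool seesFullHierarchy) {
  return seesFullHierarchy && !hasObjCAncestry(&decl);
}
```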
Allow function passes to:
1. Add new functions, to be optimized before continuing with the current
function.
2. Restart the pipeline on the current function after the current pass
completes.
This makes it possible to fully optimize callees that are the result of
specialization prior to generating interprocedural information or making
inlining choices about these callees.
It also allows us to solve a phase-ordering issue we have with generic
specialization, devirtualization, and inlining, by rescheduling the
current function after changes happen in one of these passes as opposed
to running all of these as part of the inlining pass as happens today.
Currently this is NFC since we have no passes that use this
functionality.
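
The two capabilities listed above can be sketched as a worklist-driven pipeline. This is an illustration only, not the SIL pass manager: Function, PassContext, WorkItem and runPipeline are hypothetical names, and the maxRestarts cap is an added assumption so that the sketch always terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names come from the Swift sources.
struct Function {
  std::string name;
};

struct PassContext {
  std::vector<Function> newFunctions; // callees created by the current pass
  bool restartRequested = false;      // rerun the pipeline on this function
};

using Pass = std::function<void(Function &, PassContext &)>;

struct WorkItem {
  Function fn;
  std::size_t nextPass = 0; // index of the next pass to run on fn
  unsigned restarts = 0;    // how often the pipeline was restarted on fn
  explicit WorkItem(Function f) : fn(std::move(f)) {}
};

// Runs the pipeline over a stack of functions and returns the names in the
// order in which they finished. The top of the stack is the current function.
std::vector<std::string> runPipeline(std::vector<Function> functions,
                                     const std::vector<Pass> &pipeline,
                                     unsigned maxRestarts = 10) {
  std::vector<WorkItem> worklist;
  for (Function &f : functions)
    worklist.emplace_back(std::move(f));

  std::vector<std::string> finished;
  while (!worklist.empty()) {
    std::size_t current = worklist.size() - 1;
    if (worklist[current].nextPass == pipeline.size()) {
      finished.push_back(worklist[current].fn.name);
      worklist.pop_back();
      continue;
    }
    const Pass &pass = pipeline[worklist[current].nextPass];
    assert(pass && "pipeline entries must be callable");
    PassContext ctx;
    pass(worklist[current].fn, ctx);

    WorkItem &item = worklist[current];
    ++item.nextPass;
    // 2. Restart the pipeline once the current pass has completed.
    if (ctx.restartRequested && item.restarts < maxRestarts) {
      ++item.restarts;
      item.nextPass = 0;
    }
    // 1. New functions go on top, so they are fully optimized before the
    // current function continues. Done last: emplace_back may invalidate
    // the `item` reference.
    for (Function &f : ctx.newFunctions)
      worklist.emplace_back(std::move(f));
  }
  return finished;
}
```

With a single pass that creates one specialized callee the first time it sees a caller, the callee finishes the pipeline before the caller does.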
The main intention for this change is to eliminate the use of the post/dominator trees in this transformation.
These were recalculated on every conversion, which caused long compile times for functions with a lot of switch_enum instructions: rdar://problem/24410167
Besides that, the code for collecting the target block's predecessors is now simpler. It's not necessary to handle arbitrary control flow paths because jump threading simplifies the CFG anyway.
Now SimplifyCFG does not use the PostDominanceAnalysis anymore.
This reverts commit 0515889cf0.
I made a mistake and did not catch this regression when I measured the change on
my local machine. The regression was detected by our automatic performance
tests. Thank you @slavapestov for identifying the commit.
we do not need to place a bogus value in the unreachable blocks in case a SILArgument needs to be
constructed for this block's successors.
This relies on simplifycfg or other passes to clean up the CFG before RLE is run.
isReachable logic is incorrect. This makes RLE too conservative in some cases and incorrect in
others.
This fixes the ASAN build break caused by commit 925eb2e0d9.
I see more redundant loads eliminated, but I do not see a performance difference with this change.
This is a first step towards moving the analysis portion of function
signature optimization into an actual SILAnalysis. I'll split this class
into two pieces (one for the analysis it does, and one for the rewrites
it does) next.
Removing one of the invocations of the ARC optimizer. I did not measure any
regressions on the performance test suite (using -O), but I did see a
reduction in compile time on rdar://24350646.