For functions whose connection graphs would exceed 10,000 nodes, just bail and don't compute the connection graph.
The node-merging algorithm is quadratic and can result in significant compile times for very large functions.
rdar://problem/56268570
Attach each `SynthesizedFileUnit` to a `SourceFile`. This seemed like the
least ad-hoc approach to avoid doing unnecessary work for other `FileUnit`s.
TBDGen: when visiting a `SourceFile`, also visit its `SynthesizedFileUnit` if
it exists.
Serialization: do not treat `SynthesizedFileUnit` declarations as xrefs when
serializing the companion `SourceFile`.
Resolves TF-1239: AutoDiff test failures.
JVP functions are forward-mode derivative functions. They take original
arguments and return original results and a differential function. Differential
functions take derivatives with respect to arguments and return derivatives
with respect to results.
`JVPEmitter` is a cloner that emits JVP and differential functions at the same
time. In JVP functions, function applications are replaced with JVP function
applications. In differential functions, function applications are replaced
with differential function applications.
In JVP functions, each basic block takes a differential struct containing callee
differentials. These structs are consumed by differential functions.
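As a hedged sketch (the function names and the use of Float are illustrative, not the transform's output), the JVP of a simple function has roughly this shape:

    func square(_ x: Float) -> Float { x * x }

    // The JVP returns the original result together with a differential that
    // maps a tangent of the argument to a tangent of the result (for Float,
    // the tangent vector type is Float itself).
    func jvpSquare(_ x: Float) -> (value: Float, differential: (Float) -> Float) {
      (value: x * x, differential: { dx in 2 * x * dx })
    }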
Make `ADContext` lazily create a `SynthesizedFileUnit` instead of creating one
during `ADContext` construction. This avoids always creating a
`SynthesizedFileUnit` in every module, since differentiation is a mandatory
transform that always runs.
Always creating a `SynthesizedFileUnit` was nonetheless useful for testing
purposes.
Add implicit declarations generated by the differentiation transform to a
`SynthesizedFileUnit` instead of an ad-hoc pre-existing `SourceFile`.
Resolves TF-1232: type reconstruction for AutoDiff-generated declarations.
Previously, type reconstruction failed because retroactively adding declarations
to a `SourceFile` did not update name lookup caches.
`PullbackEmitter` is a visitor that emits pullback functions. It implements
reverse-mode automatic differentiation, along with `VJPEmitter`.
Pullback functions take derivatives with respect to outputs and return
derivatives with respect to inputs. Every active value/address in an original
function has a corresponding adjoint value/buffer in the pullback function.
Pullback functions consume pullback structs and predecessor enums constructed
by VJP functions.
`VJPEmitter` is a cloner that emits VJP functions. It implements reverse-mode
automatic differentiation, along with `PullbackEmitter`.
`VJPEmitter` clones an original function, replacing function applications with
VJP function applications. In VJP functions, each basic block takes a pullback
struct (containing callee pullbacks) and produces a predecessor enum: these data
structures are consumed by pullback functions.
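For contrast with the JVP shape sketched earlier, here is a hedged sketch of the VJP/pullback shape for the same toy function (illustrative names, not transform output):

    // The VJP returns the original result together with a pullback that maps
    // a derivative with respect to the output back to a derivative with
    // respect to the input.
    func vjpSquare(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
      (value: x * x, pullback: { seed in seed * 2 * x })
    }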
`LinearMapInfo` contains information about linear map structs and branching
trace enums, which are auxiliary data structures created by the differentiation
transform.
These data structures are constructed in JVP/VJP functions and consumed in
differential/pullback functions.
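As a hedged Swift-level sketch of these data structures (the real ones are SIL-level types with mangled names; these names are illustrative), for a function whose entry block bb0 calls `g` and branches to bb1:

    // Linear map struct for bb0: stores the linear maps (here, pullbacks)
    // of the calls made in bb0.
    struct BB0PullbackStruct {
      var pullbackOfG: (Float) -> Float
    }

    // Branching trace enum for bb1: records which predecessor control flow
    // came from, carrying that predecessor's linear map struct.
    enum BB1BranchingTrace {
      case fromBB0(BB0PullbackStruct)
    }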
Differentiable activity analysis is a dataflow analysis which marks values in
a function as varied, useful, or active (both varied and useful).
Only active values need a derivative.
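A hedged example of how these markings fall out when differentiating a simple function with respect to `x` (the annotations are mine, not compiler output):

    func f(_ x: Float) -> Float {
      let c: Float = 42   // useful (flows into the result) but not varied: inactive
      let y = x * x       // varied (depends on x) and useful: active
      let unused = x + 1  // varied but not useful (never reaches the result): inactive
      _ = unused
      return y + c        // varied and useful: active
    }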
Canonicalizes `differentiable_function` instructions by filling in missing
derivative function operands.
Derivative function emission rules, based on the original function value:
- `function_ref`: look up a differentiability witness with the exact or a
minimal superset derivative configuration, and emit a
`differentiability_witness_function` for the derivative function.
- `witness_method`: emit a `witness_method` with the minimal superset derivative
configuration for the derivative function.
- `class_method`: emit a `class_method` with the minimal superset derivative
configuration for the derivative function.
If an *actual* emitted derivative function has a superset derivative
configuration versus the *desired* derivative configuration, create a "subset
parameters thunk" adapting the actual derivative to the desired type, as
sketched below.
For `differentiable_function` instructions formed from curry thunk applications:
clone the curry thunk (with type `(Self) -> (T, ...) -> U`) and create a new
version with type `(Self) -> @differentiable (T, ...) -> U`.
Progress towards TF-1211.
The differentiation transform does the following:
- Canonicalizes differentiability witnesses by filling in missing derivative
function entries.
- Canonicalizes `differentiable_function` instructions by filling in missing
derivative function operands.
- If necessary, performs automatic differentiation: generating derivative
functions for original functions.
- When encountering non-differentiable code, produces a diagnostic and
errors out.
Partially resolves TF-1211: add the main canonicalization loop.
To incrementally stage changes, derivative functions are currently created
with placeholder bodies that trap with a descriptive fatal error.
Derivative emitters will be upstreamed separately.
* [Typechecker] Allow enum cases without payloads to witness a static, get-only protocol property requirement of type `Self` (see the sketch after this list)
* [SIL] Add support for payload cases as well
* [SILGen] Clean up comment
* [Typechecker] Re-enable some previously disabled witness matching code
Also properly handle the matching in some cases
* [Test] Update typechecker tests with payload enum test cases
* [Test] Update SILGen test
* [SIL] Add two FIXME's to address soon
* [SIL] Emit the enum case constructor unconditionally when an enum case is used as a witness
Also, tweak SILDeclRef::getLinkage to update the 'limit' to 'OnDemand' if we have an enum declaration
* [SILGen] Properly handle an enum witness in addMethodImplementation
Also remove a FIXME and code added to workaround the original bug
* [TBDGen] Handle enum case witness
* [Typechecker] Fix conflicts
* [Test] Fix tests
* [AST] Fix indentation in diagnostics def file
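A source-level example of what this enables, along the lines of the example in SE-0280 (enum cases as protocol witnesses):

    protocol DecodingError {
      static var fileCorrupted: Self { get }
      static func keyNotFound(_ key: String) -> Self
    }

    enum JSONError: DecodingError {
      case fileCorrupted               // payload-less case witnesses the static property
      case keyNotFound(_ key: String)  // payload case witnesses the static function
    }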
RedundantPhiElimination eliminates block phi-arguments that have the same value as other phi-arguments of the same block.
This also works with cycles, such as two equivalent loop induction variables. Such patterns are generated, e.g., when using the stdlib's enumerated() on Array.
    preheader:
      br header(%initval, %initval)
    header(%phi1, %phi2):
      %next1 = builtin "add" (%phi1, %one)
      %next2 = builtin "add" (%phi2, %one)
      cond_br %loopcond, header(%next1, %next2), exit
    exit:

is replaced with

    preheader:
      br header(%initval)
    header(%phi1):
      %next1 = builtin "add" (%phi1, %one)
      %next2 = builtin "add" (%phi1, %one) // dead: will be cleaned up later
      cond_br %loopcond, header(%next1), exit
    exit:
Any remaining dead or "trivially" equivalent instructions will then be cleaned up by DCE and CSE, respectively.
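For reference, a Swift-level pattern that tends to produce such equivalent phis (my example): after inlining enumerated(), both the offset counter and the array index can become equivalent induction variables.

    func sum(_ a: [Int]) -> Int {
      var total = 0
      for (i, x) in a.enumerated() {
        total += i + x
      }
      return total
    }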
rdar://problem/33438123
Devirtualizing try_apply modified the CFG, but passes that run
devirtualization were not invalidating any CFG analyses, such as the
dominator tree.
This could hypothetically have caused miscompilation, but it would be
caught by running with -sil-verify-all.
If EscapeAnalysis verification runs on unreachable code, it asserts
with "Missing escape connection graph mapping" because the connection
graph builder only runs on reachable blocks.
Add a ReachableBlocks utility and use it during verification.
Fixes <rdar://problem/60373501> EscapeAnalysis crashes with CFG with
unreachable blocks
We need this anyway for -Onone, and I want to do some experiments with running
this very early so I can expose more of the stdlib (modulo inlining) to the new
ownership optimizing passes.
I also changed how the inliner handles inlining around OSSA: if the caller is
in OSSA, we only inline when all of the callees that the caller calls are also
in OSSA. The intention is to avoid weird swings in code size/performance
caused by the inliner heuristic's calculation being artificially skewed when
some callees are unavailable for inlining (due to this difference) while
others are available.
This is in preparation for other bug fixes.
Clarify the SIL utilities that return canonical address values for
formal access given the address used by some memory operation:
- stripAccessMarkers
- getAddressAccess
- getAccessedAddress
These are closely related to the code in MemAccessUtils.
Make sure passes use these utilities consistently so that
optimizations aren't defeated by normal variations in SIL patterns.
Create an isLetAddress() utility alongside these basic utilities to
make sure it is used consistently with the address corresponding to
formal access. When this query is used inconsistently, it defeats
optimization. It can also cause correctness bugs because some
optimizations assume that 'let' initialization is only performed on a
unique address value.
Functional changes to Memory Behavior:
- An instruction with side effects now conservatively still has side
effects even when the queried value is a 'let'. 'let' values are
certainly sensitive to side effects, such as the parent object being
deallocated (see the sketch after this list).
- Return the correct MemBehavior for begin/end_access markers.
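A hedged example of why loads from 'let' properties remain sensitive to side effects (my example, not from the test suite):

    final class Box {
      let value: Int
      init(_ v: Int) { value = v }
    }

    func f() -> Int {
      var box: Box? = Box(42)
      let v = box!.value  // load from a 'let' stored property
      box = nil           // side effect: may deallocate the parent object
      return v            // the load must not be reordered past the release
    }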
* Simplified the logic for creating static initializers and constant folding for global variables: instead of creating a getter function, directly inline the constant value into the use-sites.
* Wired up the constant folder in GlobalOpt so that chains of global variables can be propagated, e.g.

    let a = 1
    let b = a + 10
    let c = b + 5

* Fixed a problem where we didn't create a static initializer if a global is not used in the same module, e.g. a public 'let' variable.
* Simplified the code in general.
rdar://problem/31515927
I was inconsistently providing initialized or uninitialized memory
to the callback when projecting a settable address, depending on
component type. We should always provide an uninitialized address.
We have an optimization in SILCombiner that "inlines" the use of compile-time-constant key paths by performing the property access directly instead of calling a runtime function. This leads to huge performance gains, e.g. for heavy use of @dynamicMemberLookup. Previously, this optimization only supported key paths that solely access stored properties; computed properties, optional chaining, etc. still had to call a runtime function. This commit generalizes the optimization to support all kinds of key paths.
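A hedged example of the kind of code that now benefits (my example; the optimization itself operates on SIL `keypath` instructions):

    struct Point {
      var x: Double
      var y: Double
      // A computed property: with this change, constant key path accesses to
      // it can also be folded into direct accesses.
      var magnitude: Double { (x * x + y * y).squareRoot() }
    }

    let p = Point(x: 3, y: 4)
    // The key path is a compile-time constant, so SILCombiner can replace the
    // runtime key path application with a direct property access.
    let m = p[keyPath: \Point.magnitude]  // effectively folds to p.magnitude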
Add ExecuteSILPipelineRequest which executes a
pipeline plan on a given SIL (and possibly IRGen)
module. This serves as a top-level request for
the SILOptimizer that we'll be able to hang
dependencies off.
Rather than registering individual IRGen passes
when we want to execute them, store function
pointers to all the pass constructors on the
ASTContext. This will make it easier to requestify
the execution of pass pipelines.
Changes:
* Allow optimizing partial_apply capturing opened existential: we didn't do this originally because it was complicated to insert the required alloc/dealloc_stack instructions at the right places. Now we have the StackNesting utility, which makes this easier.
* Support indirect-in parameters. Not super important, but why not? It's also easy to do with the StackNesting utility.
* Share code between dead closure elimination and the apply(partial_apply) optimization. It's a bit of refactoring, and it allowed us to eliminate some code which is no longer used.
* Fix an ownership problem: We inserted copies of partial_apply arguments _after_ the partial_apply (which consumes the arguments).
* When replacing an apply(partial_apply) -> apply and the partial_apply becomes dead, avoid inserting copies of the arguments twice.
These changes don't have any immediate effect on our current benchmarks, but will allow eliminating curry thunks for existentials.
Support constant folding of calls over arrays created from array literals. This enables
further optimizing the output of the OSLogOptimization pass, and results in
highly compact and optimized IR for calls to the new os log API.
<rdar://58928427>
Add a new semantics attribute that is used by the top-level array initializer (in ArrayShared.swift),
which is the entry point used by the compiler to initialize arrays from array literals.
This initializer is early-inlined so that other optimizations can work on its body.
Fix DeadObjectElimination and ArrayCOWOpts optimization passes to work with this
semantics attribute in addition to "array.uninitialized", which they already use.
Refactor mapInitializationStores function from ArrayElementValuePropagation.cpp to
ArraySemantic.cpp so that the array-initialization pattern matching functionality
implemented by the function can be reused by other optimizations.
setPointsToEdge should assert that its target isn't already merged,
but now that we batch up multiple merge requests, it's fine to allow
the target to be scheduled-for-merge.
Many assertions have been recently added and tightened in order to
"discover" unexpected cases. There's nothing incorrect about how these
cases were handled, but they lack unit tests. In this case I still
haven't been able to reduce a test case. I'm continuing to work on
it, but don't want to further delay the fix.
... including all SIL functions which are transitively referenced from such witness tables.
After the last devirtualizer run, witness tables are no longer needed in the optimizer.
We can delete witness tables with available-externally linkage; IRGen does not emit such witness tables anyway.
This can save a little bit of compile time, because it reduces the amount of SIL at the end of the optimizer pipeline.
It also reduces the size of the SIL output after the optimizer, which makes debugging the SIL output easier.