FSO can handle self-recursive calls.
But this only works if the result of the self-recursive call is actually returned and not used otherwise.
The check for this was missing.
https://bugs.swift.org/browse/SR-12677
rdar://problem/62895040
This will make it easier for me with a few further refactors to make the
ownership verifier testing mode emit per function error numbers instead of the
global error number that it is emitting now.
The reason why this is necessary is that today, the verification by
-sil-verify-all causes the errors to be emitted. That verification is done on a
per value level, rather than a per function level, so it is hard to get per
function error numbers without doing unprincipled things like propagating around
state saying what the current function being verified is.
This pass instead will let me make the error counter be per ErrorBuilder which
are created per function.
One thing to be aware of is that this /will/ cause SILValue::verifyOwnership to
not emit any output when the testing flag is enabled. This is to ensure I only
do not get duplicate textual error messages from the SILVerifier.
This became necessary after recent function type changes that keep
substituted generic function types abstract even after substitution to
correctly handle automatic opaque result type substitution.
Instead of performing the opaque result type substitution as part of
substituting the generic args the underlying type will now be reified as
part of looking at the parameter/return types which happens as part of
the function convention apis.
rdar://62560867
Move differentiation-related SILOptimizer files to
{include/swift,lib}/SILOptimizer/Differentiation/.
This reduces directory nesting and gathers files together.
* Fix the mid-level pass pipeline.
Module passes need to be in a separate pipeline, otherwise the
pipeline restart mechanism will be broken.
This makes GlobalOpt and serialization run earlier in the
pipeline. There's no explicit reason for them to be run later, in the
middle of a function pass pipeline.
Also, pipeline boundaries, like serialization and module passes should
be explicit at the the top level function that creates the pass
pipelines.
* SILOptimizer: Add enforcement of function-pass pipelines.
Don't allow module passes to be inserted within a function pass
pipeline. This silently breaks the function pipeline both interfering
with analysis and the normal pipeline restart mechanism.
* Add misssing pass in addFunctionPasses
Co-authored-by: Andrew Trick <atrick@apple.com>
* Update Devirtualizer's analysis invalidation
castValueToABICompatibleType can change CFG, Devirtualizer uses this api but doesn't check if it modified the cfg
MSVC does not realize that the switch is exhaustive and requires that
the path is explicitly marked as unreachable. This silences the C4715
warning ("not all control paths return a value").
-sil-verify-all flag will verify analyses before and after a pass to
confirm correct invalidations. But if an analysis was never
constructed or invalidated as per current pass order,
it may never detect insufficient invalidations.
-sil-verify-force-analysis will force construct an analysis so that we
can better check for insufficient invalidations.
It is also terribly slow compared to -sil-verify-all.
Don't create a separate pass manager for those passes, just let them run at the beginning of the performance pipeline.
Regarding generated code this is a NFC.
This change fixes a problem with pass-bisecting (for debugging). Having two instances of the pass manager can cause troubles with bisecting, because -sil-opt-pass-count affects both pass managers at the same time.
I am going to use this in mandatory combine, and it seems like a generally
useful transformation.
I also updated the routine to construct its own SILBuilder that injects a user
passed in SILBuilderContext eliminating the bad pattern of passing in
SILBuilders.
This should be an NFC change.
Add DifferentiabilityWitnessDevirtualizer: an optimization pass that
devirtualizes `differentiability_witness_function` instructions into
`function_ref` instructions.
Co-authored-by: Dan Zheng <danielzheng@google.com>
For functions which results in > 10000 nodes, just bail and don't compute the connection graph.
The node merging algorithm is quadratic and can result in significant compile times for very large functions.
rdar://problem/56268570
Make `SynthesizedFileUnit` attached to a `SourceFile`. This seemed like the
least ad-hoc approach to avoid doing unnecessary work for other `FileUnit`s.
TBDGen: when visiting a `SourceFile`, also visit its `SynthesizedFileUnit` if
it exists.
Serialization: do not treat `SynthesizedFileUnit` declarations as xrefs when
serializing the companion `SourceFile`.
Resolves TF-1239: AutoDiff test failures.
JVP functions are forward-mode derivative functions. They take original
arguments and return original results and a differential function. Differential
functions take derivatives wrt arguments and return derivatives wrt results.
`JVPEmitter` is a cloner that emits JVP and differential functions at the same
time. In JVP functions, function applications are replaced with JVP function
applications. In differential functions, function applications are replaced
with differential function applications.
In JVP functions, each basic block takes a differential struct containing callee
differentials. These structs are consumed by differential functions.
Make `ADContext` lazily create a `SynthesizedFileUnit` instead of creating one
during `ADContext` construction. This avoids always creating a
`SynthesizedFileUnit` in every module, since differentiation is a mandatory
transform that always runs.
It was nonetheless useful to test always creating a `SynthesizedFileUnit` for
testing purposes.
Add implicit declarations generated by the differentiation transform to a
`SynthesizedFileUnit` instead of an ad-hoc pre-existing `SourceFile`.
Resolves TF-1232: type reconstruction for AutoDiff-generated declarations.
Previously, type reconstruction failed because retroactively adding declarations
to a `SourceFile` did not update name lookup caches.
`PullbackEmitter` is a visitor that emits pullback functions. It implements
reverse-mode automatic differentiation, along with `VJPEmitter`.
Pullback functions take derivatives with respect to outputs and return
derivatives with respect to inputs. Every active value/address in an original
function has a corresponding adjoint value/buffer in the pullback function.
Pullback functions consume pullback structs and predecessor enums constructed
by VJP functions.
`VJPEmitter` is a cloner that emits VJP functions. It implements reverse-mode
automatic differentiation, along with `PullbackEmitter`.
`VJPEmitter` clones an original function, replacing function applications with
VJP function applications. In VJP functions, each basic block takes a pullback
struct (containing callee pullbacks) and produces a predecessor enum: these data
structures are consumed by pullback functions.
`LinearMapInfo` contains information about linear map structs and branching
trace enums, which are auxiliary data structures created by the differentiation
transform.
These data structures are constructed in JVP/VJP functions and consumed in
differential/pullback functions.
Differentiable activity analysis is a dataflow analysis which marks values in
a function as varied, useful, or active (both varied and useful).
Only active values need a derivative.
Canonicalizes `differentiable_function` instructions by filling in missing
derivative function operands.
Derivative function emission rules, based on the original function value:
- `function_ref`: look up differentiability witness with the exact or a minimal
superset derivative configuration. Emit a `differentiability_witness_function`
for the derivative function.
- `witness_method`: emit a `witness_method` with the minimal superset derivative
configuration for the derivative function.
- `class_method`: emit a `class_method` with the minimal superset derivative
configuration for the derivative function.
If an *actual* emitted derivative function has a superset derivative
configuration versus the *desired* derivative configuration, create a "subset
parameters thunk" to thunk the actual derivative to the desired type.
For `differentiable_function` instructions formed from curry thunk applications:
clone the curry thunk (with type `(Self) -> (T, ...) -> U`) and create a new
version with type `(Self) -> @differentiable (T, ...) -> U`.
Progress towards TF-1211.
The differentiation transform does the following:
- Canonicalizes differentiability witnesses by filling in missing derivative
function entries.
- Canonicalizes `differentiable_function` instructions by filling in missing
derivative function operands.
- If necessary, performs automatic differentiation: generating derivative
functions for original functions.
- When encountering non-differentiability code, produces a diagnostic and
errors out.
Partially resolves TF-1211: add the main canonicalization loop.
To incrementally stage changes, derivative functions are currently created
with empty bodies that fatal error with a nice message.
Derivative emitters will be upstreamed separately.
* [Typechecker] Allow enum cases without payload to witness a static get-only property with Self type protocol requirement
* [SIL] Add support for payload cases as well
* [SILGen] Clean up comment
* [Typechecker] Re-enable some previously disabled witness matching code
Also properly handle the matching in some cases
* [Test] Update typechecker tests with payload enum test cases
* [Test] Update SILGen test
* [SIL] Add two FIXME's to address soon
* [SIL] Emit the enum case constructor unconditionally when an enum case is used as a witness
Also, tweak SILDeclRef::getLinkage to update the 'limit' to 'OnDemand' if we have an enum declaration
* [SILGen] Properly handle a enum witness in addMethodImplementation
Also remove a FIXME and code added to workaround the original bug
* [TBDGen] Handle enum case witness
* [Typechecker] Fix conflicts
* [Test] Fix tests
* [AST] Fix indentation in diagnostics def file
RedundantPhiElimination eliminates block phi-arguments which have the same value as other arguments of the same block.
This also works with cycles, like two equivalent loop induction variables. Such patterns are generated e.g. when using stdlib's enumerated() on Array.
preheader:
br bb1(%initval, %initval)
header(%phi1, %phi2):
%next1 = builtin "add" (%phi1, %one)
%next2 = builtin "add" (%phi2, %one)
cond_br %loopcond, header(%next1, %next2), exit
exit:
is replaced with
preheader:
br bb1(%initval)
header(%phi1):
%next1 = builtin "add" (%phi1, %one)
%next2 = builtin "add" (%phi1, %one) // dead: will be cleaned-up later
cond_br %loopcond, header(%next1), exit
exit:
Any remaining dead or "trivially" equivalent instructions will then be cleaned-up by DCE and CSE, respectively.
rdar://problem/33438123
Devirtualizing try_apply modified the CFG, but passes that run
devirtualization were not invalidating any CFG analyses, such as the
domtree.
This could hypothetically have caused miscompilation, but will be
caught by running with -sil-verify-all.
If EscapeAnalysis verification runs on unreachable code, it asserts
with "Missing escape connection graph mapping" because the connection
graph builder only runs on reachable blocks.
Add a ReachableBlocks utility and use it during verification.
Fixes <rdar://problem/60373501> EscapeAnalysis crashes with CFG with
unreachable blocks
We need this anyways for -Onone and I want to do some experiments with running
this very early so I can expose more of the stdlib (modulo inlining) to the new
ownership optimizing passes.
I also changed how the inliner handles inlining around OSSA by changing it to
check early that if the caller is in ossa, then we only inline if all of the
callees that the caller calls are in ossa. The intention is to hopefully avoid
weird swings in code-size/perf due to the inliner heuristic's calculation being
artificially manipulated due to some callees not being available to inline (due
to this difference) when others are already available.
This is in prepration for other bug fixes.
Clarify the SIL utilities that return canonical address values for
formal access given the address used by some memory operation:
- stripAccessMarkers
- getAddressAccess
- getAccessedAddress
These are closely related to the code in MemAccessUtils.
Make sure passes use these utilities consistently so that
optimizations aren't defeated by normal variations in SIL patterns.
Create an isLetAddress() utility alongside these basic utilities to
make sure it is used consistently with the address corresponding to
formal access. When this query is used inconsistently, it defeats
optimization. It can also cause correctness bugs because some
optimizations assume that 'let' initialization is only performed on a
unique address value.
Functional changes to Memory Behavior:
- An instruction with side effects now conservatively still has side
effects even when the queried value is a 'let'. Let values are
certainly sensitive to side effects, such as the parent object being
deallocated.
- Return the correct MemBehavior for begin/end_access markers.
* Simplified the logic for creating static initializers and constant folding for global variables: instead of creating a getter function, directly inline the constant value into the use-sites.
* Wired up the constant folder in GlobalOpt, so that a chains for global variables can be propagated, e.g.
let a = 1
let b = a + 10
let c = b + 5
* Fixed a problem where we didn't create a static initializer if a global is not used in the same module. E.g. a public let variable.
* Simplified the code in general.
rdar://problem/31515927
I was inconsistently providing initialized or uninitialized memory
to the callback when projecting a settable address, depending on
component type. We should always provide an uninitialized address.
We have an optimization in SILCombiner that "inlines" the use of compile-time constant key paths by performing the property access directly instead of calling a runtime function (leading to huge performance gains e.g. for heavy use of @dynamicMemberLookup). However, this optimization previously only supported key paths which solely access stored properties, so computed properties, optional chaining, etc. still had to call a runtime function. This commit generalizes the optimization to support all types of key paths.