# Backends

## List

We support the following dense backend choices from ADTypes.jl:
- `AutoChainRules`
- `AutoDiffractor`
- `AutoEnzyme`
- `AutoFastDifferentiation`
- `AutoFiniteDiff`
- `AutoFiniteDifferences`
- `AutoForwardDiff`
- `AutoMooncake`
- `AutoPolyesterForwardDiff`
- `AutoReverseDiff`
- `AutoSymbolics`
- `AutoTracker`
- `AutoZygote`
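
To make the list concrete, here is a minimal sketch of constructing such backend objects. They are plain structs from ADTypes.jl, so creating one is cheap and does not load the corresponding AD package, which you must import yourself.

```julia
# Sketch: backend objects are lightweight structs from ADTypes.jl.
using ADTypes: AutoForwardDiff, AutoZygote

forward_backend = AutoForwardDiff()  # a forward-mode choice
reverse_backend = AutoZygote()       # a reverse-mode choice
```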

## Features

Given a backend object, you can use:
- `check_available` to know whether the required AD package is loaded
- `check_inplace` to know whether the backend supports in-place functions (all backends support out-of-place functions)
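
For instance, a quick sanity check might look like this (a minimal sketch, assuming the corresponding AD package is imported in your session):

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff  # loading the AD package activates the corresponding extension

backend = AutoForwardDiff()
check_available(backend)  # true once ForwardDiff is loaded
check_inplace(backend)    # true: ForwardDiff can differentiate in-place functions f!(y, x)
```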
In theory, all we need from each backend is either a `pushforward` or a `pullback`: we can deduce every other operator from these two. In practice, many AD backends have custom implementations for high-level operators like `gradient` or `jacobian`, which we reuse whenever possible.
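
To see why a single pushforward is enough in principle, here is a rough sketch (not this package's actual fallback code, and using a crude finite-difference stand-in for the pushforward) of how a full Jacobian can be deduced from pushforwards alone:

```julia
# Stand-in pushforward: directional derivative of f at x along dx (forward difference).
pushforward(f, x, dx) = (f(x .+ 1e-6 .* dx) .- f(x)) ./ 1e-6

# Deduce the Jacobian column by column: the j-th column is the pushforward of e_j.
function jacobian_from_pushforward(f, x)
    columns = map(eachindex(x)) do j
        e = zero(x)
        e[j] = 1
        pushforward(f, x, e)
    end
    reduce(hcat, columns)
end

jacobian_from_pushforward(x -> abs2.(x), [1.0, 2.0])  # ≈ [2 0; 0 4]
```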

### Details

In the rough summary table below,
- ✅ means that we reuse the custom implementation from the backend;
- ❌ means that a custom implementation doesn't exist, so we use our default fallbacks;
- 🔀 means it's complicated or not done yet.
| Backend | `pushforward` | `pullback` | `derivative` | `gradient` | `jacobian` | `hessian` | `hvp` | `second_derivative` |
|---|---|---|---|---|---|---|---|---|
| `AutoChainRules` | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| `AutoDiffractor` | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| `AutoEnzyme` (forward) | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| `AutoEnzyme` (reverse) | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | 🔀 | ❌ |
| `AutoFastDifferentiation` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| `AutoFiniteDiff` | 🔀 | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| `AutoFiniteDifferences` | 🔀 | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| `AutoForwardDiff` | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| `AutoMooncake` | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| `AutoPolyesterForwardDiff` | 🔀 | ❌ | 🔀 | ✅ | ✅ | 🔀 | 🔀 | 🔀 |
| `AutoReverseDiff` | ❌ | 🔀 | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ |
| `AutoSymbolics` | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| `AutoTracker` | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| `AutoZygote` | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 🔀 | ❌ |

Moreover, each context type is supported by a specific subset of backends:
| Backend | `Constant` |
|---|---|
| `AutoChainRules` | ✅ |
| `AutoDiffractor` | ❌ |
| `AutoEnzyme` (forward) | ✅ |
| `AutoEnzyme` (reverse) | ✅ |
| `AutoFastDifferentiation` | ❌ |
| `AutoFiniteDiff` | ✅ |
| `AutoFiniteDifferences` | ✅ |
| `AutoForwardDiff` | ✅ |
| `AutoMooncake` | ✅ |
| `AutoPolyesterForwardDiff` | ✅ |
| `AutoReverseDiff` | ✅ |
| `AutoSymbolics` | ❌ |
| `AutoTracker` | ✅ |
| `AutoZygote` | ✅ |
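
As an illustration, here is a hedged sketch of passing a `Constant` context to an operator, assuming the usual `gradient(f, backend, x, contexts...)` calling convention:

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

# f takes the differentiated input x plus an extra parameter c that is held constant.
f(x, c) = c * sum(abs2, x)

x = [1.0, 2.0]
gradient(f, AutoForwardDiff(), x, Constant(3.0))  # == 3 .* 2 .* x == [6.0, 12.0]
```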

## Second order

For second-order operators like `second_derivative`, `hessian` and `hvp`, there are two main options. You can either use a single backend, or combine two of them within the `SecondOrder` struct:

```julia
backend = SecondOrder(outer_backend, inner_backend)
```

The inner backend will be called first, and the outer backend will differentiate the generated code. In general, using a forward outer backend over a reverse inner backend will yield the best performance.
Second-order AD is tricky, and many backend combinations will fail (even if you combine a backend with itself). Be ready to experiment and open issues if necessary.
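
For example, a forward-over-reverse Hessian might look like this (a hedged sketch; whether this particular combination works depends on the backend versions you have installed):

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff, AutoZygote
import ForwardDiff, Zygote

f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]

# Zygote computes the inner gradient, ForwardDiff differentiates that gradient.
backend = SecondOrder(AutoForwardDiff(), AutoZygote())
H = hessian(f, backend, x)  # ≈ 2I
```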

## Backend switch

The wrapper `DifferentiateWith` allows you to switch between backends. It takes a function `f` and specifies that `f` should be differentiated with the substitute backend of your choice, instead of whatever true backend the surrounding code is trying to use. In other words, when someone tries to differentiate `dw = DifferentiateWith(f, substitute_backend)` with `true_backend`, then `substitute_backend` steps in and `true_backend` does not dive into the function `f` itself. At the moment, `DifferentiateWith` only works when `true_backend` is either ForwardDiff.jl or a ChainRules.jl-compatible backend.
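
A hedged sketch of what this can look like in practice (the inner function here is only a stand-in for something the true backend cannot handle):

```julia
using DifferentiationInterface
using ADTypes: AutoZygote, AutoFiniteDiff
import Zygote, FiniteDiff

inner(x) = sum(abs2, x)  # imagine this function is incompatible with Zygote
dw = DifferentiateWith(inner, AutoFiniteDiff())

outer(x) = 2 * dw(x)  # calling dw behaves like calling inner

x = [1.0, 2.0]
gradient(outer, AutoZygote(), x)  # Zygote never looks inside inner; FiniteDiff handles it
```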

## Implementations

What follows is a list of implementation details from the package extensions of DifferentiationInterface.jl. It is not part of the public API or protected by semantic versioning, and it may become outdated. When in doubt, refer to the code itself.

### ChainRulesCore

We only implement `pullback`, using the `RuleConfig` mechanism to call back into AD. Same-point preparation runs the forward sweep and returns the pullback closure.
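
A rough sketch of that mechanism outside of DifferentiationInterface, using Zygote as the ChainRules-compatible backend (an assumption for the example):

```julia
import Zygote
using ChainRulesCore: rrule_via_ad

config = Zygote.ZygoteRuleConfig()   # the RuleConfig that lets rules call back into Zygote
f(x) = sum(abs2, x)
x = [1.0, 2.0]

y, pb = rrule_via_ad(config, f, x)   # forward sweep, returns the pullback closure
_, grad = pb(1.0)                    # seeding with 1.0 yields (NoTangent(), gradient)
```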

### Diffractor

We only implement `pushforward`.
The latest releases of Diffractor broke DifferentiationInterface.

### Enzyme

Depending on the `mode` attribute inside `AutoEnzyme`, we implement either `pushforward` or `pullback` based on `Enzyme.autodiff`. When necessary, preparation chooses a number of chunks (for `gradient` and `jacobian` in forward mode, for `jacobian` only in reverse mode).
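
For reference, a sketch of selecting the mode when constructing the backend object:

```julia
using ADTypes: AutoEnzyme
import Enzyme

forward_backend = AutoEnzyme(mode=Enzyme.Forward)  # pushforward-based operators
reverse_backend = AutoEnzyme(mode=Enzyme.Reverse)  # pullback-based operators
```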

### FastDifferentiation

For every operator, preparation generates an executable function from the symbolic expression of the differentiated function.
Preparation can be very slow for symbolic AD.

### FiniteDiff

Whenever possible, preparation creates a cache object. Pushforward is implemented rather slowly using a closure.

### FiniteDifferences

Nothing specific to mention.

### ForwardDiff

We implement `pushforward` directly using `Dual` numbers, and preparation allocates the necessary space. For higher-level operators, preparation creates a config object, which can be type-unstable.
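
A minimal sketch of the `Dual`-number idea in the scalar case (not DifferentiationInterface's actual implementation):

```julia
import ForwardDiff

# Carry the tangent dx alongside the value x; the output's partial is the pushforward.
function dual_pushforward(f, x::Real, dx::Real)
    y = f(ForwardDiff.Dual(x, dx))
    ForwardDiff.partials(y, 1)
end

dual_pushforward(sin, 0.0, 1.0)  # ≈ cos(0.0) == 1.0
```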

### PolyesterForwardDiff

Most operators fall back on `AutoForwardDiff`.

### ReverseDiff

Wherever possible, preparation records a tape of the function's execution. This tape is computed from the arguments `x` and `contexts...` provided at preparation time. It is control-flow dependent, so only one branch is recorded at each `if` statement.

If your function has value-specific control flow (like `if x[1] > 0` or `if c == 1`), you may get silently wrong results whenever it takes new branches that were not taken during preparation. You must make sure to run preparation with an input and contexts whose values trigger the correct control flow for future executions.
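
To make the pitfall concrete, here is a hedged illustration using ReverseDiff's tape API directly rather than through DifferentiationInterface:

```julia
import ReverseDiff

f(x) = x[1] > 0 ? sum(abs2, x) : sum(x)

# The tape records only the branch taken for this particular input (x[1] > 0).
tape = ReverseDiff.GradientTape(f, [1.0, 2.0])

# Replaying the tape on an input that should take the other branch silently reuses
# the recorded abs2 branch: we get 2 .* x instead of the correct [1.0, 1.0].
ReverseDiff.gradient!(zeros(2), tape, [-1.0, 2.0])  # == [-2.0, 4.0]
```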

### Symbolics

For all operators, preparation generates an executable function from the symbolic expression of the differentiated function.
Preparation can be very slow for symbolic AD.

### Mooncake

For `pullback`, preparation builds the reverse rule of the function.

### Tracker

We implement `pullback` based on `Tracker.back`. Same-point preparation runs the forward sweep and returns the pullback closure at `x`.

### Zygote

We implement `pullback` based on `Zygote.pullback`. Same-point preparation runs the forward sweep and returns the pullback closure at `x`.
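
A sketch of what this same-point reuse looks like with Zygote directly (hedged; DifferentiationInterface wraps this in its preparation mechanism):

```julia
import Zygote

f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]

# One forward sweep at x, then the closure `back` can be reused for several seeds.
y, back = Zygote.pullback(f, x)
grad_a = only(back(1.0))  # gradient of f at x
grad_b = only(back(2.0))  # pullback of a different cotangent, same forward sweep
```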