User guide
The list of functions on this page is the officially supported differentiation interface in AbstractDifferentiation.
Loading AbstractDifferentiation
To load AbstractDifferentiation, it is recommended to use
import AbstractDifferentiation as AD
With the AD alias you can access names inside of AbstractDifferentiation using AD.<> instead of typing the long name AbstractDifferentiation.
AbstractDifferentiation backends
To use AbstractDifferentiation, first construct a backend instance ab::AD.AbstractBackend using your favorite differentiation package in Julia that supports AbstractDifferentiation.
Here's an example:
julia> import AbstractDifferentiation as AD, Zygote
julia> backend = AD.ZygoteBackend();
julia> f(x) = log(sum(exp, x));
julia> AD.gradient(backend, f, collect(1:3))
([0.09003057317038046, 0.2447284710547977, 0.665240955774822],)
The following backends are temporarily made available by AbstractDifferentiation as soon as their corresponding package is loaded (thanks to weak dependencies on Julia ≥ 1.9 and Requires.jl on older Julia versions):
AbstractDifferentiation.ReverseDiffBackend — Type
ReverseDiffBackend
AD backend that uses reverse mode with ReverseDiff.jl.
To be able to use this backend, you have to load ReverseDiff.
AbstractDifferentiation.ReverseRuleConfigBackend — Type
ReverseRuleConfigBackend
AD backend that uses reverse mode with any ChainRulesCore.jl-compatible reverse-mode AD package.
Constructed with a RuleConfig object:
backend = AD.ReverseRuleConfigBackend(rc)
On Julia >= 1.9, you have to load ChainRulesCore (possibly implicitly by loading a ChainRules-compatible AD package) to be able to use this backend.
AbstractDifferentiation.FiniteDifferencesBackend — Type
FiniteDifferencesBackend{M}
AD backend that uses forward mode with FiniteDifferences.jl.
The type parameter M is the type of the method used to perform finite differences.
To be able to use this backend, you have to load FiniteDifferences.
AbstractDifferentiation.ZygoteBackend — Function
ZygoteBackend()
Create an AD backend that uses reverse mode with Zygote.jl.
It is a special case of ReverseRuleConfigBackend.
To be able to use this backend, you have to load Zygote.
AbstractDifferentiation.ForwardDiffBackend — Type
ForwardDiffBackend{CS}
AD backend that uses forward mode with ForwardDiff.jl.
The type parameter CS denotes the chunk size of the differentiation algorithm. If it is Nothing, then ForwardDiff uses a heuristic to set the chunk size based on the input.
See also: ForwardDiff.jl: Configuring Chunk Size
To be able to use this backend, you have to load ForwardDiff.
AbstractDifferentiation.TrackerBackend — Type
TrackerBackend
AD backend that uses reverse mode with Tracker.jl.
To be able to use this backend, you have to load Tracker.
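For example, once ForwardDiff is loaded, a forward-mode backend can be constructed and used in the same way as the Zygote backend above. This is a minimal sketch; the test function and inputs are illustrative, and the printed result is simply the gradient of sum(abs2, x), i.e. 2x.
julia> import AbstractDifferentiation as AD, ForwardDiff
julia> backend = AD.ForwardDiffBackend();
julia> AD.gradient(backend, x -> sum(abs2, x), [1.0, 2.0, 3.0])
([2.0, 4.0, 6.0],)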
In the long term, these backend objects (and many more) will be defined within their respective packages to enforce the AbstractDifferentiation interface. This is already the case for:
Diffractor.DiffractorForwardBackend() for Diffractor.jl in forward mode
For higher order derivatives, you can build higher order backends using AD.HigherOrderBackend.
AbstractDifferentiation.HigherOrderBackend — Type
AD.HigherOrderBackend{B}
Let ab_f be a forward-mode automatic differentiation backend and let ab_r be a reverse-mode automatic differentiation backend. To construct a higher order backend for doing forward-over-reverse-mode automatic differentiation, use AD.HigherOrderBackend((ab_f, ab_r)). To construct a higher order backend for doing reverse-over-forward-mode automatic differentiation, use AD.HigherOrderBackend((ab_r, ab_f)).
Fields
backends::B
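For illustration, here is a minimal sketch of a forward-over-reverse setup built from ForwardDiff and Zygote (both assumed to be loaded); the test function is illustrative, and its Hessian is 2I.
julia> import AbstractDifferentiation as AD, ForwardDiff, Zygote
julia> forward_backend = AD.ForwardDiffBackend();
julia> reverse_backend = AD.ZygoteBackend();
julia> hess_backend = AD.HigherOrderBackend((forward_backend, reverse_backend));  # forward-over-reverse
julia> H = AD.hessian(hess_backend, x -> sum(abs2, x), [1.0, 2.0]);
Here H should hold the 2×2 matrix with 2.0 on the diagonal and 0.0 off the diagonal.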
Derivatives
The following list of functions can be used to request the derivative, gradient, Jacobian or Hessian without the function value.
AbstractDifferentiation.derivative — Function
AD.derivative(ab::AD.AbstractBackend, f, xs::Number...)
Compute the derivatives of f with respect to the numbers xs using the backend ab.
The function returns a Tuple of derivatives, one for each element in xs.
AbstractDifferentiation.gradient — Function
AD.gradient(ab::AD.AbstractBackend, f, xs...)
Compute the gradients of f with respect to the inputs xs using the backend ab.
The function returns a Tuple of gradients, one for each element in xs.
AbstractDifferentiation.jacobian — Function
AD.jacobian(ab::AD.AbstractBackend, f, xs...)
Compute the Jacobians of f with respect to the inputs xs using the backend ab.
The function returns a Tuple of Jacobians, one for each element in xs.
AbstractDifferentiation.hessian — Function
AD.hessian(ab::AD.AbstractBackend, f, x)
Compute the Hessian of f with respect to the input x using the backend ab.
The function returns a single matrix because hessian currently only supports a single input.
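As a brief sketch (assuming ForwardDiff is loaded; the function and inputs are illustrative), the expected results follow directly from the hand-computed derivatives:
julia> import AbstractDifferentiation as AD, ForwardDiff
julia> backend = AD.ForwardDiffBackend();
julia> AD.derivative(backend, x -> x^2 + 3x, 2.0)  # d/dx (x^2 + 3x) = 2x + 3 = 7 at x = 2
(7.0,)
julia> AD.jacobian(backend, x -> [x[1]^2, x[1] * x[2]], [1.0, 2.0])
([2.0 0.0; 2.0 1.0],)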
Value and derivatives
The following list of functions can be used to request the function value along with its derivative, gradient, Jacobian or Hessian. You can also request the function value, its gradient and Hessian for single-input functions.
AbstractDifferentiation.value_and_derivative — Function
AD.value_and_derivative(ab::AD.AbstractBackend, f, xs::Number...)
Return the tuple (v, ds) of the function value v = f(xs...) and the derivatives ds = AD.derivative(ab, f, xs...).
See also AbstractDifferentiation.derivative.
AbstractDifferentiation.value_and_gradient — Function
AD.value_and_gradient(ab::AD.AbstractBackend, f, xs...)
Return the tuple (v, gs) of the function value v = f(xs...) and the gradients gs = AD.gradient(ab, f, xs...).
See also AbstractDifferentiation.gradient.
AbstractDifferentiation.value_and_jacobian — Function
AD.value_and_jacobian(ab::AD.AbstractBackend, f, xs...)
Return the tuple (v, Js) of the function value v = f(xs...) and the Jacobians Js = AD.jacobian(ab, f, xs...).
See also AbstractDifferentiation.jacobian.
AbstractDifferentiation.value_and_hessian — Function
AD.value_and_hessian(ab::AD.AbstractBackend, f, x)
Return the tuple (v, H) of the function value v = f(x) and the Hessian H = AD.hessian(ab, f, x).
See also AbstractDifferentiation.hessian.
AbstractDifferentiation.value_gradient_and_hessian — Function
AD.value_gradient_and_hessian(ab::AD.AbstractBackend, f, x)
Return the tuple (v, g, H) of the function value v = f(x), the gradient g = AD.gradient(ab, f, x), and the Hessian H = AD.hessian(ab, f, x).
See also AbstractDifferentiation.gradient and AbstractDifferentiation.hessian.
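For example, value_and_gradient combines the two computations in one call. A minimal sketch, assuming ForwardDiff is loaded and using an illustrative function whose value at [1.0, 2.0, 3.0] is 14 and whose gradient is 2x:
julia> import AbstractDifferentiation as AD, ForwardDiff
julia> backend = AD.ForwardDiffBackend();
julia> AD.value_and_gradient(backend, x -> sum(abs2, x), [1.0, 2.0, 3.0])
(14.0, ([2.0, 4.0, 6.0],))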
Jacobian-vector products
This operation goes by a few names, like "pushforward". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pushforward operator pf_f is equivalent to applying the function v -> J * v on a (tangent) vector v.
The following functions can be used to request a function that returns the pushforward operator/function. In order to request the pushforward function pf_f of a function f at the inputs xs, you can use either of:
AbstractDifferentiation.pushforward_function — Function
AD.pushforward_function(ab::AD.AbstractBackend, f, xs...)
Return the pushforward function pff of the function f at the inputs xs using backend ab.
The pushforward function pff accepts as input a Tuple of tangents, one for each element in xs. If xs consists of a single element, pff can also accept a single tangent instead of a 1-tuple.
AbstractDifferentiation.value_and_pushforward_function — Function
AD.value_and_pushforward_function(ab::AD.AbstractBackend, f, xs...)
Return a single function vpff which, given tangents ts, computes the tuple (v, p) = vpff(ts) composed of
- the function value v = f(xs...)
- the pushforward value p = pff(ts) given by the pushforward function pff = AD.pushforward_function(ab, f, xs...) applied to ts.
See also AbstractDifferentiation.pushforward_function.
This name should be understood as "(value and pushforward) function", and thus is not aligned with the reverse mode counterpart AbstractDifferentiation.value_and_pullback_function.
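A minimal sketch of requesting and applying a pushforward, assuming ForwardDiff is loaded; the function g and the tangent are illustrative, and the tangent [1.0, 0.0] picks out the first column of the Jacobian, i.e. [2.0, 2.0]:
julia> import AbstractDifferentiation as AD, ForwardDiff
julia> backend = AD.ForwardDiffBackend();
julia> g(x) = [x[1]^2, x[1] * x[2]];
julia> pff = AD.pushforward_function(backend, g, [1.0, 2.0]);
julia> Jv = pff(([1.0, 0.0],));  # one tangent per input, wrapped in a Tuple
julia> v, p = AD.value_and_pushforward_function(backend, g, [1.0, 2.0])(([1.0, 0.0],));
Here Jv and p should both represent J * [1.0, 0.0] = [2.0, 2.0], and v should equal g([1.0, 2.0]).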
Vector-Jacobian products
This operation goes by a few names, like "pullback". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pullback operator pb_f is equivalent to applying the function v -> v' * J on a (co-tangent) vector v.
The following functions can be used to request the pullback operator/function with or without the function value. In order to request the pullback function pb_f of a function f at the inputs xs, you can use either of:
AbstractDifferentiation.pullback_function — Function
AD.pullback_function(ab::AD.AbstractBackend, f, xs...)
Return the pullback function pbf of the function f at the inputs xs using backend ab.
The pullback function pbf accepts as input a Tuple of cotangents, one for each output of f. If f has a single output, pbf can also accept a single input instead of a 1-tuple.
AbstractDifferentiation.value_and_pullback_function — Function
AD.value_and_pullback_function(ab::AD.AbstractBackend, f, xs...)
Return a tuple (v, pbf) of the function value v = f(xs...) and the pullback function pbf = AD.pullback_function(ab, f, xs...).
See also AbstractDifferentiation.pullback_function.
This name should be understood as "value and (pullback function)", and thus is not aligned with the forward mode counterpart AbstractDifferentiation.value_and_pushforward_function.
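A minimal sketch of requesting and applying a pullback with a reverse-mode backend, assuming Zygote is loaded; the function g and the cotangent are illustrative, and the cotangent [1.0, 0.0] picks out the first row of the Jacobian, i.e. [2.0, 0.0]:
julia> import AbstractDifferentiation as AD, Zygote
julia> backend = AD.ZygoteBackend();
julia> g(x) = [x[1]^2, x[1] * x[2]];
julia> pbf = AD.pullback_function(backend, g, [1.0, 2.0]);
julia> vJ = pbf(([1.0, 0.0],));  # one cotangent per output of g, wrapped in a Tuple
julia> v, pbf2 = AD.value_and_pullback_function(backend, g, [1.0, 2.0]);
Here vJ should represent [1.0, 0.0]' * J = [2.0, 0.0] (a cotangent for the single input), v should equal g([1.0, 2.0]), and pbf2 behaves like pbf.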
Lazy operators
You can also get a struct for the lazy derivative/gradient/Jacobian/Hessian of a function. You can then use the * operator to apply the lazy operator on a value or tuple of the correct shape. To get a lazy derivative/gradient/Jacobian/Hessian use any one of:
AbstractDifferentiation.lazy_derivative — Function
AD.lazy_derivative(ab::AbstractBackend, f, xs::Number...)
Return an operator ld for multiplying by the derivative of f at xs.
You can apply the operator by multiplication e.g. ld * y where y is a number if f has a single input, a tuple of the same length as xs if f has multiple inputs, or an array of numbers/tuples.
AbstractDifferentiation.lazy_gradient — Function
AD.lazy_gradient(ab::AbstractBackend, f, xs...)
Return an operator lg for multiplying by the gradient of f at xs.
You can apply the operator by multiplication e.g. lg * y where y is a number if f has a single input or a tuple of the same length as xs if f has multiple inputs.
AbstractDifferentiation.lazy_jacobian — Function
AD.lazy_jacobian(ab::AbstractBackend, f, xs...)
Return an operator lj for multiplying by the Jacobian of f at xs.
You can apply the operator by multiplication e.g. lj * y or y' * lj where y is a number, vector or tuple of numbers and/or vectors. If f has multiple inputs, y in lj * y should be a tuple. If f has multiple outputs, y in y' * lj should be a tuple. Otherwise, it should be a scalar or a vector of the appropriate length.
AbstractDifferentiation.lazy_hessian — Function
AD.lazy_hessian(ab::AbstractBackend, f, x)
Return an operator lh for multiplying by the Hessian of the scalar-valued function f at x.
You can apply the operator by multiplication e.g. lh * y or y' * lh where y is a number or a vector of the appropriate length.
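A minimal sketch of the lazy Jacobian operator, assuming ForwardDiff is loaded; the function g and the vectors are illustrative, so lj * [1.0, 0.0] corresponds to J * [1.0, 0.0] = [2.0, 2.0] and [1.0, 0.0]' * lj to [1.0, 0.0]' * J = [2.0, 0.0]:
julia> import AbstractDifferentiation as AD, ForwardDiff
julia> backend = AD.ForwardDiffBackend();
julia> g(x) = [x[1]^2, x[1] * x[2]];
julia> lj = AD.lazy_jacobian(backend, g, [1.0, 2.0]);
julia> Jv = lj * [1.0, 0.0];   # Jacobian-vector product
julia> vJ = [1.0, 0.0]' * lj;  # vector-Jacobian product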
Index
AbstractDifferentiation.FiniteDifferencesBackend
AbstractDifferentiation.ForwardDiffBackend
AbstractDifferentiation.HigherOrderBackend
AbstractDifferentiation.ReverseDiffBackend
AbstractDifferentiation.ReverseRuleConfigBackend
AbstractDifferentiation.TrackerBackend
AbstractDifferentiation.ZygoteBackend
AbstractDifferentiation.derivative
AbstractDifferentiation.gradient
AbstractDifferentiation.hessian
AbstractDifferentiation.jacobian
AbstractDifferentiation.lazy_derivative
AbstractDifferentiation.lazy_gradient
AbstractDifferentiation.lazy_hessian
AbstractDifferentiation.lazy_jacobian
AbstractDifferentiation.pullback_function
AbstractDifferentiation.pushforward_function
AbstractDifferentiation.value_and_derivative
AbstractDifferentiation.value_and_gradient
AbstractDifferentiation.value_and_hessian
AbstractDifferentiation.value_and_jacobian
AbstractDifferentiation.value_and_pullback_function
AbstractDifferentiation.value_and_pushforward_function
AbstractDifferentiation.value_gradient_and_hessian