User guide
The list of functions on this page is the officially supported differentiation interface in AbstractDifferentiation.
Loading AbstractDifferentiation
To load AbstractDifferentiation, it is recommended to use
import AbstractDifferentiation as AD
With the AD alias you can access names inside of AbstractDifferentiation using AD.<> instead of typing the long name AbstractDifferentiation.
AbstractDifferentiation backends
To use AbstractDifferentiation, first construct a backend instance ab::AD.AbstractBackend using your favorite differentiation package in Julia that supports AbstractDifferentiation.
Here's an example:
julia> import AbstractDifferentiation as AD, Zygote
julia> backend = AD.ZygoteBackend();
julia> f(x) = log(sum(exp, x));
julia> AD.gradient(backend, f, collect(1:3))
([0.09003057317038046, 0.2447284710547977, 0.665240955774822],)
The following backends are temporarily made available by AbstractDifferentiation as soon as their corresponding package is loaded (thanks to weak dependencies on Julia ≥ 1.9 and Requires.jl on older Julia versions):
AbstractDifferentiation.ReverseDiffBackend — Type
ReverseDiffBackend
AD backend that uses reverse mode with ReverseDiff.jl.
To be able to use this backend, you have to load ReverseDiff.
AbstractDifferentiation.ReverseRuleConfigBackend — Type
ReverseRuleConfigBackend
AD backend that uses reverse mode with any ChainRulesCore.jl-compatible reverse-mode AD package.
Constructed with a RuleConfig object:
backend = AD.ReverseRuleConfigBackend(rc)
On Julia >= 1.9, you have to load ChainRulesCore (possibly implicitly by loading a ChainRules-compatible AD package) to be able to use this backend.
AbstractDifferentiation.FiniteDifferencesBackend — Type
FiniteDifferencesBackend{M}
AD backend that uses forward mode with FiniteDifferences.jl.
The type parameter M is the type of the method used to perform finite differences.
To be able to use this backend, you have to load FiniteDifferences.
AbstractDifferentiation.ZygoteBackend — Type
ZygoteBackend
Create an AD backend that uses reverse mode with Zygote.jl.
Alternatively, you can perform AD with Zygote using a special ReverseRuleConfigBackend, namely ReverseRuleConfigBackend(Zygote.ZygoteRuleConfig()). Note, however, that the behaviour of this backend is not equivalent to ZygoteBackend() since the former uses a generic implementation of jacobian etc. for ChainRules-compatible AD backends whereas ZygoteBackend uses implementations in Zygote.jl.
To be able to use this backend, you have to load Zygote.
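As a minimal sketch of the two Zygote-based options described above, the snippet below builds both backends and computes the same gradient with each. The test function and input are illustrative choices, and the expectation that both results agree numerically is an assumption about this simple case, not a guarantee for arbitrary functions.

```julia
import AbstractDifferentiation as AD
import Zygote

# Illustrative test function and input (not prescribed by the package).
f(x) = log(sum(exp, x))
x = [1.0, 2.0, 3.0]

# Backend that dispatches to implementations in Zygote.jl.
zygote_backend = AD.ZygoteBackend()

# Backend that uses the generic ChainRules-based implementation with Zygote's rule config.
ruleconfig_backend = AD.ReverseRuleConfigBackend(Zygote.ZygoteRuleConfig())

# AD.gradient returns a Tuple with one gradient per input.
(g_zygote,) = AD.gradient(zygote_backend, f, x)
(g_ruleconfig,) = AD.gradient(ruleconfig_backend, f, x)
```

Both backends take the same interface calls; they differ in which code path computes the result.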
AbstractDifferentiation.ForwardDiffBackend — Type
ForwardDiffBackend{CS}
AD backend that uses forward mode with ForwardDiff.jl.
The type parameter CS denotes the chunk size of the differentiation algorithm. If it is Nothing, then ForwardDiff uses a heuristic to set the chunk size based on the input.
See also: ForwardDiff.jl: Configuring Chunk Size
To be able to use this backend, you have to load ForwardDiff.
AbstractDifferentiation.TrackerBackend — Type
TrackerBackend
AD backend that uses reverse mode with Tracker.jl.
To be able to use this backend, you have to load Tracker.
In the long term, these backend objects (and many more) will be defined within their respective packages to enforce the AbstractDifferentiation interface. This is already the case for:
Diffractor.DiffractorForwardBackend() for Diffractor.jl in forward mode
For higher order derivatives, you can build higher order backends using AD.HigherOrderBackend.
AbstractDifferentiation.HigherOrderBackend — Type
AD.HigherOrderBackend{B}
Let ab_f be a forward-mode automatic differentiation backend and let ab_r be a reverse-mode automatic differentiation backend. To construct a higher order backend for doing forward-over-reverse-mode automatic differentiation, use AD.HigherOrderBackend((ab_f, ab_r)). To construct a higher order backend for doing reverse-over-forward-mode automatic differentiation, use AD.HigherOrderBackend((ab_r, ab_f)).
Fields
backends::B
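The two constructions described above can be sketched as follows. The choice of ForwardDiff for the forward-mode backend and Zygote for the reverse-mode backend is an assumption made for illustration; any pair of supported backends could be substituted.

```julia
import AbstractDifferentiation as AD
import ForwardDiff, Zygote

# Illustrative backend choices: one forward-mode, one reverse-mode.
ab_f = AD.ForwardDiffBackend()
ab_r = AD.ZygoteBackend()

# Forward-over-reverse: the forward-mode backend comes first in the tuple.
forward_over_reverse = AD.HigherOrderBackend((ab_f, ab_r))

# Reverse-over-forward: the reverse-mode backend comes first in the tuple.
reverse_over_forward = AD.HigherOrderBackend((ab_r, ab_f))

# The tuple passed to the constructor is stored in the `backends` field.
stored = forward_over_reverse.backends
```

The resulting object is itself a backend, so it can be passed wherever an AD.AbstractBackend is expected, for example to the second-order functions listed in the following sections.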
Derivatives
The following list of functions can be used to request the derivative, gradient, Jacobian, second derivative or Hessian without the function value.
AbstractDifferentiation.derivative — Function
AD.derivative(ab::AD.AbstractBackend, f, xs::Number...)
Compute the derivatives of f with respect to the numbers xs using the backend ab.
The function returns a Tuple of derivatives, one for each element in xs.
AbstractDifferentiation.gradient — Function
AD.gradient(ab::AD.AbstractBackend, f, xs...)
Compute the gradients of f with respect to the inputs xs using the backend ab.
The function returns a Tuple of gradients, one for each element in xs.
AbstractDifferentiation.jacobian — Function
AD.jacobian(ab::AD.AbstractBackend, f, xs...)
Compute the Jacobians of f with respect to the inputs xs using the backend ab.
The function returns a Tuple of Jacobians, one for each element in xs.
AbstractDifferentiation.second_derivative — Function
AD.second_derivative(ab::AD.AbstractBackend, f, x)
Compute the second derivative of f with respect to the input x using the backend ab.
The function returns a single value because second_derivative currently only supports a single input.
AbstractDifferentiation.hessian — Function
AD.hessian(ab::AD.AbstractBackend, f, x)
Compute the Hessian of f with respect to the input x using the backend ab.
The function returns a single matrix because hessian currently only supports a single input.
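A short sketch of the first-order functions above, using the ForwardDiff backend as an illustrative choice. The test functions g and h are hypothetical helpers introduced only for this example. Each call returns a Tuple with one entry per input, so the results are destructured with a one-element tuple pattern.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

backend = AD.ForwardDiffBackend()

# Derivative of a scalar function of one number: d/dx sin(x) at 0 is cos(0).
(d,) = AD.derivative(backend, sin, 0.0)

# Gradient of a scalar-valued function of a vector: here 2 .* x.
g(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]
(grad,) = AD.gradient(backend, g, x)

# Jacobian of a vector-valued function: here a diagonal matrix with entries 2 .* x.
h(x) = x .^ 2
(J,) = AD.jacobian(backend, h, x)
```

With several inputs, for example AD.gradient(backend, f, x, y), the returned Tuple would contain one gradient per input in the same order.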
Value and derivatives
The following list of functions can be used to request the function value along with its derivative, gradient, Jacobian, second derivative, or Hessian. You can also request the function value, its derivative (or its gradient) and its second derivative (or Hessian) for single-input functions.
AbstractDifferentiation.value_and_derivative — Function
AD.value_and_derivative(ab::AD.AbstractBackend, f, xs::Number...)
Return the tuple (v, ds) of the function value v = f(xs...) and the derivatives ds = AD.derivative(ab, f, xs...).
See also AbstractDifferentiation.derivative.
AbstractDifferentiation.value_and_gradient — Function
AD.value_and_gradient(ab::AD.AbstractBackend, f, xs...)
Return the tuple (v, gs) of the function value v = f(xs...) and the gradients gs = AD.gradient(ab, f, xs...).
See also AbstractDifferentiation.gradient.
AbstractDifferentiation.value_and_jacobian — Function
AD.value_and_jacobian(ab::AD.AbstractBackend, f, xs...)
Return the tuple (v, Js) of the function value v = f(xs...) and the Jacobians Js = AD.jacobian(ab, f, xs...).
See also AbstractDifferentiation.jacobian.
AbstractDifferentiation.value_and_second_derivative — Function
AD.value_and_second_derivative(ab::AD.AbstractBackend, f, x)
Return the tuple (v, d2) of the function value v = f(x) and the second derivative d2 = AD.second_derivative(ab, f, x).
AbstractDifferentiation.value_and_hessian — Function
AD.value_and_hessian(ab::AD.AbstractBackend, f, x)
Return the tuple (v, H) of the function value v = f(x) and the Hessian H = AD.hessian(ab, f, x).
See also AbstractDifferentiation.hessian.
AbstractDifferentiation.value_derivative_and_second_derivative — Function
AD.value_derivative_and_second_derivative(ab::AD.AbstractBackend, f, x)
Return the tuple (v, d, d2) of the function value v = f(x), the first derivative d = AD.derivative(ab, f, x), and the second derivative d2 = AD.second_derivative(ab, f, x).
AbstractDifferentiation.value_gradient_and_hessian — Function
AD.value_gradient_and_hessian(ab::AD.AbstractBackend, f, x)
Return the tuple (v, g, H) of the function value v = f(x), the gradient g = AD.gradient(ab, f, x), and the Hessian H = AD.hessian(ab, f, x).
See also AbstractDifferentiation.gradient and AbstractDifferentiation.hessian.
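The combined functions can be sketched as below, again assuming the ForwardDiff backend for illustration. Because the derivative part is itself a Tuple (one entry per input), the result is destructured in two levels.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

backend = AD.ForwardDiffBackend()

# Log-sum-exp, whose gradient is the softmax of the input.
f(x) = log(sum(exp, x))
x = [1.0, 2.0, 3.0]

# v is the function value; the second element is a Tuple of gradients.
v, (g,) = AD.value_and_gradient(backend, f, x)

# The same pattern for a scalar function of one number.
s, (ds,) = AD.value_and_derivative(backend, sin, 0.0)
```

Requesting the value together with the derivative avoids a separate call to f when both are needed.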
Jacobian-vector products
This operation goes by a few names, like "pushforward". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pushforward operator pf_f is equivalent to applying the function v -> J * v on a (tangent) vector v.
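As a worked illustration of the map v -> J * v, consider the hypothetical function f(x_1, x_2) = (x_1 x_2, x_1 + x_2), chosen only for this example. Evaluated at x = (2, 3) with tangent v = (1, 0), the pushforward is:

```latex
\[
J(x) = \begin{pmatrix} x_2 & x_1 \\ 1 & 1 \end{pmatrix},
\qquad
J(2, 3) = \begin{pmatrix} 3 & 2 \\ 1 & 1 \end{pmatrix},
\qquad
J(2, 3)\, v = \begin{pmatrix} 3 & 2 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} 3 \\ 1 \end{pmatrix}.
\]
```

The result is the directional derivative of f at x along v; a forward-mode backend can typically compute it without forming J explicitly.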
The following functions can be used to request a function that returns the pushforward operator/function. In order to request the pushforward function pf_f of a function f at the inputs xs, you can use either of:
AbstractDifferentiation.pushforward_function — Function
AD.pushforward_function(ab::AD.AbstractBackend, f, xs...)
Return the pushforward function pff of the function f at the inputs xs using backend ab.
The pushforward function pff accepts as input a Tuple of tangents, one for each element in xs. If xs consists of a single element, pff can also accept a single tangent instead of a 1-tuple.
AbstractDifferentiation.value_and_pushforward_function — Function
AD.value_and_pushforward_function(ab::AD.AbstractBackend, f, xs...)
Return a single function vpff which, given tangents ts, computes the tuple (v, p) = vpff(ts) composed of
- the function value v = f(xs...)
- the pushforward value p = pff(ts) given by the pushforward function pff = AD.pushforward_function(ab, f, xs...) applied to ts.
See also AbstractDifferentiation.pushforward_function.
This name should be understood as "(value and pushforward) function", and thus is not aligned with the reverse mode counterpart AbstractDifferentiation.value_and_pullback_function.
Vector-Jacobian products
This operation goes by a few names, like "pullback". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pullback operator pb_f is equivalent to applying the function v -> v' * J on a (co-tangent) vector v.
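As a worked illustration of the map v -> v' * J, consider the hypothetical function f(x_1, x_2) = (x_1 x_2, x_1 + x_2), chosen only for this example. Evaluated at x = (2, 3) with cotangent v = (1, 1), the pullback is:

```latex
\[
J(x) = \begin{pmatrix} x_2 & x_1 \\ 1 & 1 \end{pmatrix},
\qquad
J(2, 3) = \begin{pmatrix} 3 & 2 \\ 1 & 1 \end{pmatrix},
\qquad
v^{\top} J(2, 3) = \begin{pmatrix} 1 & 1 \end{pmatrix}
\begin{pmatrix} 3 & 2 \\ 1 & 1 \end{pmatrix}
= \begin{pmatrix} 4 & 3 \end{pmatrix}.
\]
```

The result has one entry per input of f; a reverse-mode backend can typically compute it without forming J explicitly.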
The following functions can be used to request the pullback operator/function with or without the function value. In order to request the pullback function pb_f of a function f at the inputs xs, you can use either of:
AbstractDifferentiation.pullback_function — Function
AD.pullback_function(ab::AD.AbstractBackend, f, xs...)
Return the pullback function pbf of the function f at the inputs xs using backend ab.
The pullback function pbf accepts as input a Tuple of cotangents, one for each output of f. If f has a single output, pbf can also accept a single input instead of a 1-tuple.
AbstractDifferentiation.value_and_pullback_function — Function
AD.value_and_pullback_function(ab::AD.AbstractBackend, f, xs...)
Return a tuple (v, pbf) of the function value v = f(xs...) and the pullback function pbf = AD.pullback_function(ab, f, xs...).
See also AbstractDifferentiation.pullback_function.
This name should be understood as "value and (pullback function)", and thus is not aligned with the forward mode counterpart AbstractDifferentiation.value_and_pushforward_function.
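The difference between the two naming conventions can be sketched as follows. The backends (ForwardDiff for the pushforward, Zygote for the pullback), the test function, and the tangent and cotangent values are all illustrative assumptions. The exact container type of the pushforward and pullback results may depend on the backend and package version, so the sketch does not rely on it.

```julia
import AbstractDifferentiation as AD
import ForwardDiff, Zygote

# Illustrative scalar-valued function and input.
f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]
t = [1.0, 0.0, 0.0]  # tangent for the single input x

# "(value and pushforward) function": one function is returned,
# and calling it with the tangents yields both the value and the pushforward.
vpff = AD.value_and_pushforward_function(AD.ForwardDiffBackend(), f, x)
v_fwd, p = vpff((t,))

# "value and (pullback function)": the value is returned immediately,
# together with a function that maps cotangents to pullback results.
v_rev, pbf = AD.value_and_pullback_function(AD.ZygoteBackend(), f, x)
ct = pbf(1.0)  # f has a single output, so a single cotangent is accepted
```

In the forward-mode case the function value only becomes available once tangents are supplied, whereas in the reverse-mode case it is available before any cotangent is chosen.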
Lazy operators
You can also get a struct for the lazy derivative/gradient/Jacobian/Hessian of a function. You can then use the * operator to apply the lazy operator on a value or tuple of the correct shape. To get a lazy derivative/gradient/Jacobian/Hessian use any one of:
AbstractDifferentiation.lazy_derivative — Function
AD.lazy_derivative(ab::AbstractBackend, f, xs::Number...)
Return an operator ld for multiplying by the derivative of f at xs.
You can apply the operator by multiplication e.g. ld * y where y is a number if f has a single input, a tuple of the same length as xs if f has multiple inputs, or an array of numbers/tuples.
AbstractDifferentiation.lazy_gradient — Function
AD.lazy_gradient(ab::AbstractBackend, f, xs...)
Return an operator lg for multiplying by the gradient of f at xs.
You can apply the operator by multiplication e.g. lg * y where y is a number if f has a single input or a tuple of the same length as xs if f has multiple inputs.
AbstractDifferentiation.lazy_jacobian — Function
AD.lazy_jacobian(ab::AbstractBackend, f, xs...)
Return an operator lj for multiplying by the Jacobian of f at xs.
You can apply the operator by multiplication e.g. lj * y or y' * lj where y is a number, vector or tuple of numbers and/or vectors. If f has multiple inputs, y in lj * y should be a tuple. If f has multiple outputs, y in y' * lj should be a tuple. Otherwise, it should be a scalar or a vector of the appropriate length.
AbstractDifferentiation.lazy_hessian — Function
AD.lazy_hessian(ab::AbstractBackend, f, x)
Return an operator lh for multiplying by the Hessian of the scalar-valued function f at x.
You can apply the operator by multiplication e.g. lh * y or y' * lh where y is a number or a vector of the appropriate length.
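A minimal sketch of a lazy Jacobian, assuming the ForwardDiff backend and a hypothetical elementwise function h introduced only for this example. The product lj * y applies the Jacobian to y; whether the result comes back as a plain vector or wrapped in a Tuple may depend on the backend and package version, so that is left unspecified here.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

backend = AD.ForwardDiffBackend()

# Illustrative function with Jacobian Diagonal(2 .* x).
h(x) = x .^ 2
x = [1.0, 2.0, 3.0]
y = [1.0, 1.0, 1.0]

# Build the lazy operator; no Jacobian matrix is materialized at this point.
lj = AD.lazy_jacobian(backend, h, x)

# Apply the operator to a vector of the appropriate length (h has a single input).
jy = lj * y

# For comparison, the same product using the explicit Jacobian.
(J,) = AD.jacobian(backend, h, x)
jy_explicit = J * y
```

The lazy form is mainly useful when only products with the Jacobian are needed and building the full matrix would be wasteful.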
Index
- AbstractDifferentiation.FiniteDifferencesBackend
- AbstractDifferentiation.ForwardDiffBackend
- AbstractDifferentiation.HigherOrderBackend
- AbstractDifferentiation.ReverseDiffBackend
- AbstractDifferentiation.ReverseRuleConfigBackend
- AbstractDifferentiation.TrackerBackend
- AbstractDifferentiation.ZygoteBackend
- AbstractDifferentiation.derivative
- AbstractDifferentiation.gradient
- AbstractDifferentiation.hessian
- AbstractDifferentiation.jacobian
- AbstractDifferentiation.lazy_derivative
- AbstractDifferentiation.lazy_gradient
- AbstractDifferentiation.lazy_hessian
- AbstractDifferentiation.lazy_jacobian
- AbstractDifferentiation.pullback_function
- AbstractDifferentiation.pushforward_function
- AbstractDifferentiation.second_derivative
- AbstractDifferentiation.value_and_derivative
- AbstractDifferentiation.value_and_gradient
- AbstractDifferentiation.value_and_hessian
- AbstractDifferentiation.value_and_jacobian
- AbstractDifferentiation.value_and_pullback_function
- AbstractDifferentiation.value_and_pushforward_function
- AbstractDifferentiation.value_and_second_derivative
- AbstractDifferentiation.value_derivative_and_second_derivative
- AbstractDifferentiation.value_gradient_and_hessian