User guide

The list of functions on this page is the officially supported differentiation interface in AbstractDifferentiation.

Loading `AbstractDifferentiation`

To load AbstractDifferentiation, it is recommended to use

import AbstractDifferentiation as AD

With the AD alias, you can access names inside AbstractDifferentiation as AD.<name> instead of typing the long name AbstractDifferentiation.

`AbstractDifferentiation` backends

To use AbstractDifferentiation, first construct a backend instance ab::AD.AbstractBackend using your favorite differentiation package in Julia that supports AbstractDifferentiation.

Here's an example:

julia> import AbstractDifferentiation as AD, Zygote

julia> backend = AD.ZygoteBackend();

julia> f(x) = log(sum(exp, x));

julia> AD.gradient(backend, f, collect(1:3))
([0.09003057317038046, 0.2447284710547977, 0.665240955774822],)

The following backends are temporarily made available by AbstractDifferentiation as soon as their corresponding package is loaded (thanks to weak dependencies on Julia ≥ 1.9 and Requires.jl on older Julia versions):

AbstractDifferentiation.ReverseDiffBackend — Type

ReverseDiffBackend

AD backend that uses reverse mode with ReverseDiff.jl.

Note

To be able to use this backend, you have to load ReverseDiff.

AbstractDifferentiation.ReverseRuleConfigBackend — Type

ReverseRuleConfigBackend

AD backend that uses reverse mode with any ChainRulesCore.jl-compatible reverse-mode AD package.

Constructed with a RuleConfig object:

backend = AD.ReverseRuleConfigBackend(rc)

Note

On Julia >= 1.9, you have to load ChainRulesCore (possibly implicitly by loading a ChainRules-compatible AD package) to be able to use this backend.

AbstractDifferentiation.FiniteDifferencesBackend — Type

FiniteDifferencesBackend{M}

AD backend that uses forward mode with FiniteDifferences.jl.

The type parameter M is the type of the method used to perform finite differences.

Note

To be able to use this backend, you have to load FiniteDifferences.

AbstractDifferentiation.ZygoteBackend — Type

ZygoteBackend

Create an AD backend that uses reverse mode with Zygote.jl.

Alternatively, you can perform AD with Zygote using a special ReverseRuleConfigBackend, namely ReverseRuleConfigBackend(Zygote.ZygoteRuleConfig()). Note, however, that the behaviour of this backend is not equivalent to ZygoteBackend() since the former uses a generic implementation of jacobian etc. for ChainRules-compatible AD backends whereas ZygoteBackend uses implementations in Zygote.jl.

Note

To be able to use this backend, you have to load Zygote.
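As a rough illustration of the two ways to use Zygote described above, the sketch below computes the same gradient with ZygoteBackend() and with ReverseRuleConfigBackend(Zygote.ZygoteRuleConfig()). The function f is taken from the earlier example; the variable names native, generic, g_native and g_generic are illustrative only. The two backends follow different code paths, so the results are expected to agree only up to floating-point error.

```julia
import AbstractDifferentiation as AD
import Zygote  # loading Zygote makes AD.ZygoteBackend available

f(x) = log(sum(exp, x))
x = [1.0, 2.0, 3.0]

# Backend that dispatches to the implementations in Zygote.jl.
native = AD.ZygoteBackend()

# Backend that uses the generic ChainRules-based implementation.
generic = AD.ReverseRuleConfigBackend(Zygote.ZygoteRuleConfig())

# AD.gradient returns a Tuple with one gradient per input.
g_native = first(AD.gradient(native, f, x))
g_generic = first(AD.gradient(generic, f, x))

println(g_native ≈ g_generic)
```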

AbstractDifferentiation.ForwardDiffBackend — Type

ForwardDiffBackend{CS}

AD backend that uses forward mode with ForwardDiff.jl.

The type parameter CS denotes the chunk size of the differentiation algorithm. If it is Nothing, then ForwardDiff uses a heuristic to set the chunk size based on the input.

See also: ForwardDiff.jl: Configuring Chunk Size

Note

To be able to use this backend, you have to load ForwardDiff.

AbstractDifferentiation.TrackerBackend — Type

TrackerBackend

AD backend that uses reverse mode with Tracker.jl.

Note

To be able to use this backend, you have to load Tracker.

In the long term, these backend objects (and many more) will be defined within their respective packages to enforce the AbstractDifferentiation interface. This is already the case for:

Diffractor.DiffractorForwardBackend() for Diffractor.jl in forward mode

For higher order derivatives, you can build higher order backends using AD.HigherOrderBackend.

AbstractDifferentiation.HigherOrderBackend — Type

AD.HigherOrderBackend{B}

Let ab_f be a forward-mode automatic differentiation backend and let ab_r be a reverse-mode automatic differentiation backend. To construct a higher order backend for doing forward-over-reverse-mode automatic differentiation, use AD.HigherOrderBackend((ab_f, ab_r)). To construct a higher order backend for doing reverse-over-forward-mode automatic differentiation, use AD.HigherOrderBackend((ab_r, ab_f)).

Fields

backends::B
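A minimal sketch of forward-over-reverse differentiation, assuming ForwardDiff as the forward-mode backend and Zygote as the reverse-mode backend (any other supported pair could be substituted). The helper unwrap is hypothetical and not part of AbstractDifferentiation; it is used only so the example does not depend on whether the result arrives bare or wrapped in a 1-tuple.

```julia
import AbstractDifferentiation as AD
import ForwardDiff, Zygote

# Hypothetical helper (not part of AbstractDifferentiation):
# accept either a bare result or a 1-tuple wrapping it.
unwrap(r) = r isa Tuple ? first(r) : r

ab_f = AD.ForwardDiffBackend()  # forward mode
ab_r = AD.ZygoteBackend()       # reverse mode

# Forward-over-reverse: forward mode applied to a reverse-mode gradient.
backend = AD.HigherOrderBackend((ab_f, ab_r))

f(x) = sum(abs2, x) / 2         # analytic Hessian: the identity matrix
H = unwrap(AD.hessian(backend, f, [1.0, 2.0, 3.0]))
```

Swapping the tuple order to (ab_r, ab_f) would request reverse-over-forward instead; whether a given pair of packages supports being nested in that order depends on the packages themselves.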

Derivatives

The following list of functions can be used to request the derivative, gradient, Jacobian, second derivative or Hessian without the function value.

AbstractDifferentiation.derivative — Function

AD.derivative(ab::AD.AbstractBackend, f, xs::Number...)

Compute the derivatives of f with respect to the numbers xs using the backend ab.

The function returns a Tuple of derivatives, one for each element in xs.
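For instance, a single-input call might look like the sketch below. It assumes ForwardDiff is installed and used as the backend; any other supported backend should behave the same way.

```julia
import AbstractDifferentiation as AD
import ForwardDiff  # loading ForwardDiff makes AD.ForwardDiffBackend available

backend = AD.ForwardDiffBackend()

# AD.derivative returns a Tuple with one derivative per input,
# so take the first entry for this single-input call.
ds = AD.derivative(backend, sin, 1.0)
d = first(ds)

println(d ≈ cos(1.0))  # the analytic derivative of sin is cos
```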

AbstractDifferentiation.gradient — Function

AD.gradient(ab::AD.AbstractBackend, f, xs...)

Compute the gradients of f with respect to the inputs xs using the backend ab.

The function returns a Tuple of gradients, one for each element in xs.

AbstractDifferentiation.jacobian — Function

AD.jacobian(ab::AD.AbstractBackend, f, xs...)

Compute the Jacobians of f with respect to the inputs xs using the backend ab.

The function returns a Tuple of Jacobians, one for each element in xs.
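The following sketch shows gradient and jacobian side by side for single-input functions, again assuming ForwardDiff as the backend. The functions f and h are made up for illustration.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

backend = AD.ForwardDiffBackend()
x = [1.0, 2.0, 3.0]

# Scalar-valued function: the gradient has the same shape as x.
f(x) = sum(abs2, x)                 # analytic gradient: 2x
g = first(AD.gradient(backend, f, x))

# Vector-valued function: the Jacobian has one row per output
# and one column per input component.
h(x) = [x[1] * x[2], x[2] + x[3]]   # analytic Jacobian: [x2 x1 0; 0 1 1]
J = first(AD.jacobian(backend, h, x))
```

Both results are extracted with first because each function returns a Tuple with one entry per input.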

AbstractDifferentiation.second_derivative — Function

AD.second_derivative(ab::AD.AbstractBackend, f, x)

Compute the second derivative of f with respect to the input x using the backend ab.

The function returns a single value because second_derivative currently only supports a single input.

AbstractDifferentiation.hessian — Function

AD.hessian(ab::AD.AbstractBackend, f, x)

Compute the Hessian of f with respect to the input x using the backend ab.

The function returns a single matrix because hessian currently only supports a single input.
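A short sketch of the second-order functions, assuming ForwardDiff as the backend. The helper unwrap is hypothetical and not part of AbstractDifferentiation; it is included as a precaution so the example does not rely on whether a backend hands back the bare result or a 1-tuple containing it.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

# Hypothetical helper (not part of AbstractDifferentiation):
# accept either a bare result or a 1-tuple wrapping it.
unwrap(r) = r isa Tuple ? first(r) : r

backend = AD.ForwardDiffBackend()

# Second derivative of a scalar function of one number.
d2 = unwrap(AD.second_derivative(backend, sin, 1.0))  # analytic: -sin(1.0)

# Hessian of a scalar function of one vector.
f(x) = sum(abs2, x) / 2                               # analytic: identity matrix
H = unwrap(AD.hessian(backend, f, [1.0, 2.0, 3.0]))
```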

Value and derivatives

The following list of functions can be used to request the function value along with its derivative, gradient, Jacobian, second derivative, or Hessian. You can also request the function value, its derivative (or its gradient) and its second derivative (or Hessian) for single-input functions.

AbstractDifferentiation.value_and_derivative — Function

AD.value_and_derivative(ab::AD.AbstractBackend, f, xs::Number...)

Return the tuple (v, ds) of the function value v = f(xs...) and the derivatives ds = AD.derivative(ab, f, xs...).

See also AbstractDifferentiation.derivative.

AbstractDifferentiation.value_and_gradient — Function

AD.value_and_gradient(ab::AD.AbstractBackend, f, xs...)

Return the tuple (v, gs) of the function value v = f(xs...) and the gradients gs = AD.gradient(ab, f, xs...).

See also AbstractDifferentiation.gradient.

AbstractDifferentiation.value_and_jacobian — Function

AD.value_and_jacobian(ab::AD.AbstractBackend, f, xs...)

Return the tuple (v, Js) of the function value v = f(xs...) and the Jacobians Js = AD.jacobian(ab, f, xs...).

See also AbstractDifferentiation.jacobian.

AbstractDifferentiation.value_and_second_derivative — Function

AD.value_and_second_derivative(ab::AD.AbstractBackend, f, x)

Return the tuple (v, d2) of the function value v = f(x) and the second derivative d2 = AD.second_derivative(ab, f, x).

See also AbstractDifferentiation.second_derivative.

AbstractDifferentiation.value_and_hessian — Function

AD.value_and_hessian(ab::AD.AbstractBackend, f, x)

Return the tuple (v, H) of the function value v = f(x) and the Hessian H = AD.hessian(ab, f, x).

See also AbstractDifferentiation.hessian.

AbstractDifferentiation.value_derivative_and_second_derivative — Function

AD.value_derivative_and_second_derivative(ab::AD.AbstractBackend, f, x)

Return the tuple (v, d, d2) of the function value v = f(x), the first derivative d = AD.derivative(ab, f, x), and the second derivative d2 = AD.second_derivative(ab, f, x).

AbstractDifferentiation.value_gradient_and_hessian — Function

AD.value_gradient_and_hessian(ab::AD.AbstractBackend, f, x)

Return the tuple (v, g, H) of the function value v = f(x), the gradient g = AD.gradient(ab, f, x), and the Hessian H = AD.hessian(ab, f, x).

See also AbstractDifferentiation.gradient and AbstractDifferentiation.hessian.
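As an illustration of the combined calls, the sketch below requests the value and the gradient together, assuming ForwardDiff as the backend. The function f and the variable names are illustrative.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

backend = AD.ForwardDiffBackend()
f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]

# One call returns the function value and the Tuple of gradients.
v, gs = AD.value_and_gradient(backend, f, x)
g = first(gs)  # gradient with respect to the only input

println((v ≈ f(x), g ≈ 2 .* x))
```

Requesting both at once lets a backend reuse the work of evaluating f, instead of calling f(x) and AD.gradient separately.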

Jacobian-vector products

This operation goes by a few names, like "pushforward". Refer to the ChainRules documentation for more on terminology. For a single-input, single-output function f with a Jacobian J, the pushforward operator pf_f is equivalent to applying the function v -> J * v to a (tangent) vector v.

The following functions can be used to request a function that returns the pushforward operator/function. In order to request the pushforward function pf_f of a function f at the inputs xs, you can use either of:

AbstractDifferentiation.pushforward_function — Function

AD.pushforward_function(ab::AD.AbstractBackend, f, xs...)

Return the pushforward function pff of the function f at the inputs xs using backend ab.

The pushforward function pff accepts as input a Tuple of tangents, one for each element in xs. If xs consists of a single element, pff can also accept a single tangent instead of a 1-tuple.

AbstractDifferentiation.value_and_pushforward_function — Function

AD.value_and_pushforward_function(ab::AD.AbstractBackend, f, xs...)

Return a single function vpff which, given tangents ts, computes the tuple (v, p) = vpff(ts) composed of

the function value v = f(xs...)
the pushforward value p = pff(ts) given by the pushforward function pff = AD.pushforward_function(ab, f, xs...) applied to ts.

See also AbstractDifferentiation.pushforward_function.

Warning

This name should be understood as "(value and pushforward) function", and thus is not aligned with the reverse mode counterpart AbstractDifferentiation.value_and_pullback_function.
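To make the calling convention concrete, here is a sketch that builds both functions for a small vector-valued f, assuming ForwardDiff as the backend. The helper unwrap is hypothetical and not part of AbstractDifferentiation; it guards against the product being returned either bare or inside a 1-tuple. The Jacobian J is written out by hand for comparison.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

# Hypothetical helper (not part of AbstractDifferentiation):
# accept either a bare result or a 1-tuple wrapping it.
unwrap(r) = r isa Tuple ? first(r) : r

backend = AD.ForwardDiffBackend()
f(x) = [x[1] * x[2], x[1] + x[2]]
x = [2.0, 3.0]
t = [1.0, 0.0]                    # tangent, same shape as x

# Pushforward function: takes a Tuple of tangents, one per input.
pff = AD.pushforward_function(backend, f, x)
jvp = unwrap(pff((t,)))

# "(value and pushforward) function": a single function that returns both.
vpff = AD.value_and_pushforward_function(backend, f, x)
v, p = vpff((t,))

J = [x[2] x[1]; 1.0 1.0]          # hand-written Jacobian of f at x
println((jvp ≈ J * t, v ≈ f(x), unwrap(p) ≈ J * t))
```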

Vector-Jacobian products

This operation goes by a few names, like "pullback". Refer to the ChainRules documentation for more on terminology. For a single-input, single-output function f with a Jacobian J, the pullback operator pb_f is equivalent to applying the function v -> v' * J to a (co-tangent) vector v.

The following functions can be used to request the pullback operator/function with or without the function value. In order to request the pullback function pb_f of a function f at the inputs xs, you can use either of:

AbstractDifferentiation.pullback_function — Function

AD.pullback_function(ab::AD.AbstractBackend, f, xs...)

Return the pullback function pbf of the function f at the inputs xs using backend ab.

The pullback function pbf accepts as input a Tuple of cotangents, one for each output of f. If f has a single output, pbf can also accept a single cotangent instead of a 1-tuple.

AbstractDifferentiation.value_and_pullback_function — Function

AD.value_and_pullback_function(ab::AD.AbstractBackend, f, xs...)

Return a tuple (v, pbf) of the function value v = f(xs...) and the pullback function pbf = AD.pullback_function(ab, f, xs...).

See also AbstractDifferentiation.pullback_function.

Warning

This name should be understood as "value and (pullback function)", and thus is not aligned with the forward mode counterpart AbstractDifferentiation.value_and_pushforward_function.
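The reverse-mode counterpart can be sketched as follows, assuming Zygote as the backend. As before, unwrap is a hypothetical helper that is not part of AbstractDifferentiation, and J is the hand-written Jacobian used only for comparison.

```julia
import AbstractDifferentiation as AD
import Zygote

# Hypothetical helper (not part of AbstractDifferentiation):
# accept either a bare result or a 1-tuple wrapping it.
unwrap(r) = r isa Tuple ? first(r) : r

backend = AD.ZygoteBackend()
f(x) = [x[1] * x[2], x[1] + x[2]]
x = [2.0, 3.0]
w = [1.0, 2.0]                    # cotangent, same shape as f(x)

# "value and (pullback function)": the value comes back immediately,
# together with a function that maps cotangents to input cotangents.
v, pbf = AD.value_and_pullback_function(backend, f, x)

# f has a single output, so a single cotangent is passed directly.
vjp = unwrap(pbf(w))

J = [x[2] x[1]; 1.0 1.0]          # hand-written Jacobian of f at x
println((v ≈ f(x), vjp ≈ J' * w))
```

Note the contrast with the forward-mode function: here the value is available before any cotangent is supplied.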

Lazy operators

You can also get a struct for the lazy derivative/gradient/Jacobian/Hessian of a function. You can then use the * operator to apply the lazy operator on a value or tuple of the correct shape. To get a lazy derivative/gradient/Jacobian/Hessian use any one of:

AbstractDifferentiation.lazy_derivative — Function

AD.lazy_derivative(ab::AbstractBackend, f, xs::Number...)

Return an operator ld for multiplying by the derivative of f at xs.

You can apply the operator by multiplication e.g. ld * y where y is a number if f has a single input, a tuple of the same length as xs if f has multiple inputs, or an array of numbers/tuples.

AbstractDifferentiation.lazy_gradient — Function

AD.lazy_gradient(ab::AbstractBackend, f, xs...)

Return an operator lg for multiplying by the gradient of f at xs.

You can apply the operator by multiplication e.g. lg * y where y is a number if f has a single input or a tuple of the same length as xs if f has multiple inputs.

AbstractDifferentiation.lazy_jacobian — Function

AD.lazy_jacobian(ab::AbstractBackend, f, xs...)

Return an operator lj for multiplying by the Jacobian of f at xs.

You can apply the operator by multiplication e.g. lj * y or y' * lj where y is a number, vector or tuple of numbers and/or vectors. If f has multiple inputs, y in lj * y should be a tuple. If f has multiple outputs, y in y' * lj should be a tuple. Otherwise, it should be a scalar or a vector of the appropriate length.

AbstractDifferentiation.lazy_hessian — Function

AD.lazy_hessian(ab::AbstractBackend, f, x)

Return an operator lh for multiplying by the Hessian of the scalar-valued function f at x.

You can apply the operator by multiplication e.g. lh * y or y' * lh where y is a number or a vector of the appropriate length.
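A minimal sketch of a lazy Jacobian applied to a vector, assuming ForwardDiff as the backend. The helper unwrap is hypothetical and not part of AbstractDifferentiation; it is used because the product may come back wrapped in a tuple. The Jacobian J is written out by hand for comparison only; the lazy operator itself never materializes it.

```julia
import AbstractDifferentiation as AD
import ForwardDiff

# Hypothetical helper (not part of AbstractDifferentiation):
# accept either a bare result or a 1-tuple wrapping it.
unwrap(r) = r isa Tuple ? first(r) : r

backend = AD.ForwardDiffBackend()
f(x) = [x[1] * x[2], x[1] + x[2]]
x = [2.0, 3.0]
t = [1.0, 0.0]                    # vector of the same length as x

# Build the lazy operator once, then apply it by multiplication.
lj = AD.lazy_jacobian(backend, f, x)
jvp = unwrap(lj * t)

J = [x[2] x[1]; 1.0 1.0]          # hand-written Jacobian of f at x
println(jvp ≈ J * t)
```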

Index

AbstractDifferentiation.FiniteDifferencesBackend
AbstractDifferentiation.ForwardDiffBackend
AbstractDifferentiation.HigherOrderBackend
AbstractDifferentiation.ReverseDiffBackend
AbstractDifferentiation.ReverseRuleConfigBackend
AbstractDifferentiation.TrackerBackend
AbstractDifferentiation.ZygoteBackend
AbstractDifferentiation.derivative
AbstractDifferentiation.gradient
AbstractDifferentiation.hessian
AbstractDifferentiation.jacobian
AbstractDifferentiation.lazy_derivative
AbstractDifferentiation.lazy_gradient
AbstractDifferentiation.lazy_hessian
AbstractDifferentiation.lazy_jacobian
AbstractDifferentiation.pullback_function
AbstractDifferentiation.pushforward_function
AbstractDifferentiation.second_derivative
AbstractDifferentiation.value_and_derivative
AbstractDifferentiation.value_and_gradient
AbstractDifferentiation.value_and_hessian
AbstractDifferentiation.value_and_jacobian
AbstractDifferentiation.value_and_pullback_function
AbstractDifferentiation.value_and_pushforward_function
AbstractDifferentiation.value_and_second_derivative
AbstractDifferentiation.value_derivative_and_second_derivative
AbstractDifferentiation.value_gradient_and_hessian