# ReverseDiff API

## Gradients of `f(x::AbstractArray{<:Real}...)::Real`

**ReverseDiff.gradient** (*Function*)

```
ReverseDiff.gradient(f, input, cfg::GradientConfig = GradientConfig(input))
```

If `input` is an `AbstractArray`, assume `f` has the form `f(::AbstractArray{<:Real})::Real` and return `∇f(input)`.

If `input` is a tuple of `AbstractArray`s, assume `f` has the form `f(::AbstractArray{<:Real}...)::Real` (such that it can be called as `f(input...)`) and return a `Tuple` where the `i`th element is the gradient of `f` w.r.t. `input[i]`.

Note that `cfg` can be preallocated and reused for subsequent calls.

If possible, it is highly recommended to use `ReverseDiff.GradientTape` to prerecord `f`. Otherwise, this method will have to re-record `f`'s execution trace for every subsequent call.
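A minimal sketch of both call forms follows. The functions `f` and `h` and the sample inputs are illustrative assumptions, not part of the API.

```julia
using ReverseDiff

# Single-array form: f maps a vector to a scalar.
f(x) = sum(x .* x)              # mathematically, ∇f(x) = 2x
x = [1.0, 2.0, 3.0]
g = ReverseDiff.gradient(f, x)

# Tuple form: h is called as h(a, b), and one gradient is returned per argument.
h(a, b) = sum(a .* b)           # mathematically, ∂h/∂a = b and ∂h/∂b = a
a, b = [1.0, 2.0], [3.0, 4.0]
ga, gb = ReverseDiff.gradient(h, (a, b))
```

Each call here re-records the function, which is fine for one-off use but wasteful inside a loop.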

**ReverseDiff.gradient!** (*Function*)

```
ReverseDiff.gradient!(result, f, input, cfg::GradientConfig = GradientConfig(input))
```

Returns `result`. This method is exactly like `ReverseDiff.gradient(f, input, cfg)`, except it stores the resulting gradient(s) in `result` rather than allocating new memory.

`result` can be an `AbstractArray` or a `Tuple` of `AbstractArray`s. The `result` (or any of its elements, if `isa(result, Tuple)`) can also be a `DiffBase.DiffResult`, in which case the primal value `f(input)` (or `f(input...)`, if `isa(input, Tuple)`) will be stored in it as well.
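The in-place form might be used as in the sketch below. The functions and buffer names are illustrative assumptions, and only plain array buffers are shown (no `DiffResult`).

```julia
using ReverseDiff

f(x) = sum(x .* x)
x = [1.0, 2.0, 3.0]

# Preallocate the output buffer once; gradient! fills it and returns it.
out = similar(x)
ret = ReverseDiff.gradient!(out, f, x)

# Tuple inputs take a tuple of buffers, one per argument.
h(a, b) = sum(a .* b)
a, b = [1.0, 2.0], [3.0, 4.0]
outs = (similar(a), similar(b))
ReverseDiff.gradient!(outs, h, (a, b))
```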

```
ReverseDiff.gradient!(tape::Union{GradientTape,CompiledGradient}, input)
```

If `input` is an `AbstractArray`, assume `tape` represents a function of the form `f(::AbstractArray)::Real` and return `∇f(input)`.

If `input` is a tuple of `AbstractArray`s, assume `tape` represents a function of the form `f(::AbstractArray...)::Real` and return a `Tuple` where the `i`th element is the gradient of `f` w.r.t. `input[i]`.

```
ReverseDiff.gradient!(result, tape::Union{GradientTape,CompiledGradient}, input)
```

Returns `result`. This method is exactly like `ReverseDiff.gradient!(tape, input)`, except it stores the resulting gradient(s) in `result` rather than allocating new memory.

`result` can be an `AbstractArray` or a `Tuple` of `AbstractArray`s. The `result` (or any of its elements, if `isa(result, Tuple)`) can also be a `DiffBase.DiffResult`, in which case the primal value `f(input)` (or `f(input...)`, if `isa(input, Tuple)`) will be stored in it as well.

## Jacobians of `f(x::AbstractArray{<:Real}...)::AbstractArray{<:Real}`

**ReverseDiff.jacobian** (*Function*)

```
ReverseDiff.jacobian(f, input, cfg::JacobianConfig = JacobianConfig(input))
```

If `input` is an `AbstractArray`, assume `f` has the form `f(::AbstractArray{<:Real})::AbstractArray{<:Real}` and return `J(f)(input)`.

If `input` is a tuple of `AbstractArray`s, assume `f` has the form `f(::AbstractArray{<:Real}...)::AbstractArray{<:Real}` (such that it can be called as `f(input...)`) and return a `Tuple` where the `i`th element is the Jacobian of `f` w.r.t. `input[i]`.

Note that `cfg` can be preallocated and reused for subsequent calls.

If possible, it is highly recommended to use `ReverseDiff.JacobianTape` to prerecord `f`. Otherwise, this method will have to re-record `f`'s execution trace for every subsequent call.
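As a rough illustration of the single-array form, the sketch below differentiates an elementwise square. The function `g` and the input are assumptions chosen so the expected Jacobian is easy to check by hand.

```julia
using ReverseDiff

# g maps a vector to a vector by squaring each element, so mathematically
# its Jacobian is diagonal with entries 2 * x[i].
g(x) = x .* x
x = [1.0, 2.0, 3.0]

# The result has one row per output element and one column per input element.
J = ReverseDiff.jacobian(g, x)
```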

```
ReverseDiff.jacobian(f!, output, input, cfg::JacobianConfig = JacobianConfig(output, input))
```

Exactly like `ReverseDiff.jacobian(f, input, cfg)`, except the target function has the form `f!(output::AbstractArray{<:Real}, input::AbstractArray{<:Real}...)`.

**ReverseDiff.jacobian!** (*Function*)

```
ReverseDiff.jacobian!(result, f, input, cfg::JacobianConfig = JacobianConfig(input))
```

Returns `result`. This method is exactly like `ReverseDiff.jacobian(f, input, cfg)`, except it stores the resulting Jacobian(s) in `result` rather than allocating new memory.

`result` can be an `AbstractArray` or a `Tuple` of `AbstractArray`s. The `result` (or any of its elements, if `isa(result, Tuple)`) can also be a `DiffBase.DiffResult`, in which case the primal value `f(input)` (or `f(input...)`, if `isa(input, Tuple)`) will be stored in it as well.

```
ReverseDiff.jacobian!(result, f!, output, input, cfg::JacobianConfig = JacobianConfig(output, input))
```

Exactly like `ReverseDiff.jacobian!(result, f, input, cfg)`, except the target function has the form `f!(output::AbstractArray{<:Real}, input::AbstractArray{<:Real}...)`.

```
ReverseDiff.jacobian!(tape::Union{JacobianTape,CompiledJacobian}, input)
```

If `input` is an `AbstractArray`, assume `tape` represents a function of the form `f(::AbstractArray{<:Real})::AbstractArray{<:Real}` or `f!(::AbstractArray{<:Real}, ::AbstractArray{<:Real})` and return `tape`'s Jacobian w.r.t. `input`.

If `input` is a tuple of `AbstractArray`s, assume `tape` represents a function of the form `f(::AbstractArray{<:Real}...)::AbstractArray{<:Real}` or `f!(::AbstractArray{<:Real}, ::AbstractArray{<:Real}...)` and return a `Tuple` where the `i`th element is `tape`'s Jacobian w.r.t. `input[i]`.

Note that if `tape` represents a function of the form `f!(output, input...)`, you can only execute `tape` with new `input` values. There is no way to re-run `tape` with new `output` values; since `f!` can mutate `output`, there exists no stable "hook" for loading new `output` values into the tape.

```
ReverseDiff.jacobian!(result, tape::Union{JacobianTape,CompiledJacobian}, input)
```

Returns `result`. This method is exactly like `ReverseDiff.jacobian!(tape, input)`, except it stores the resulting Jacobian(s) in `result` rather than allocating new memory.

`result` can be an `AbstractArray` or a `Tuple` of `AbstractArray`s. The `result` (or any of its elements, if `isa(result, Tuple)`) can also be a `DiffBase.DiffResult`, in which case the primal value of the target function will be stored in it as well.

## Hessians of `f(x::AbstractArray{<:Real})::Real`

**ReverseDiff.hessian** (*Function*)

```
ReverseDiff.hessian(f, input::AbstractArray, cfg::HessianConfig = HessianConfig(input))
```

Given `f(input::AbstractArray{<:Real})::Real`, return `f`'s Hessian w.r.t. the given `input`.

Note that `cfg` can be preallocated and reused for subsequent calls.

If possible, it is highly recommended to use `ReverseDiff.HessianTape` to prerecord `f`. Otherwise, this method will have to re-record `f`'s execution trace for every subsequent call.
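A small sketch of a Hessian call, using a hand-checkable polynomial. The function `f` and the evaluation point are assumptions made for illustration.

```julia
using ReverseDiff

# f(x) = x1^2 * x2 + x2^3, written with scalar indexing and multiplication.
# Mathematically: ∂²f/∂x1² = 2*x2, ∂²f/∂x1∂x2 = 2*x1, ∂²f/∂x2² = 6*x2.
f(x) = x[1] * x[1] * x[2] + x[2] * x[2] * x[2]
x = [1.0, 2.0]

H = ReverseDiff.hessian(f, x)   # a 2×2 matrix
```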

**ReverseDiff.hessian!** (*Function*)

```
ReverseDiff.hessian!(result::AbstractArray, f, input::AbstractArray, cfg::HessianConfig = HessianConfig(input))
ReverseDiff.hessian!(result::DiffResult, f, input::AbstractArray, cfg::HessianConfig = HessianConfig(result, input))
```

Returns `result`. This method is exactly like `ReverseDiff.hessian(f, input, cfg)`, except it stores the resulting Hessian in `result` rather than allocating new memory.

If `result` is a `DiffBase.DiffResult`, the primal value `f(input)` and the gradient `∇f(input)` will be stored in it along with the Hessian `H(f)(input)`.

```
ReverseDiff.hessian!(tape::Union{HessianTape,CompiledHessian}, input)
```

Assuming `tape` represents a function of the form `f(::AbstractArray{<:Real})::Real`, return the Hessian `H(f)(input)`.

```
ReverseDiff.hessian!(result::AbstractArray, tape::Union{HessianTape,CompiledHessian}, input)
ReverseDiff.hessian!(result::DiffResult, tape::Union{HessianTape,CompiledHessian}, input)
```

Returns `result`. This method is exactly like `ReverseDiff.hessian!(tape, input)`, except it stores the resulting Hessian in `result` rather than allocating new memory.

If `result` is a `DiffBase.DiffResult`, the primal value `f(input)` and the gradient `∇f(input)` will be stored in it along with the Hessian `H(f)(input)`.

## The `AbstractTape` API

ReverseDiff works by recording the target function's execution trace to a "tape", then running the tape forwards and backwards to propagate new input values and derivative information.

In many cases, it is the recording phase of this process that consumes the most time and memory, while the forward and reverse execution passes are often fast and non-allocating. Luckily, ReverseDiff provides the `AbstractTape` family of types, which enable the user to *pre-record* a reusable tape for a given function and differentiation operation.

**Note that pre-recording a tape can only capture the execution trace of the target function with the given input values.** Therefore, re-running the tape (even with new input values) will only execute the paths that were recorded using the original input values. In other words, the tape cannot re-enact any branching behavior that depends on the input values. You can guarantee your own safety in this regard by never using the `AbstractTape` API with functions that contain control flow based on the input values.

Similarly to the branching issue, a tape is not guaranteed to capture any side-effects caused or depended on by the target function.
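The branching caveat can be seen in a small sketch. The function `f` below is a contrived assumption whose branch depends on the sign of its first element; the tape is recorded on one side of the branch and replayed with an input from the other side.

```julia
using ReverseDiff

# Control flow depends on the input values, which a tape cannot re-enact.
f(x) = x[1] > 0 ? sum(x .* x) : sum(x)

# Recording with x[1] > 0 captures only the sum(x .* x) path.
tape = ReverseDiff.GradientTape(f, [1.0, 2.0])

x_new = [-1.0, 2.0]

# Replaying the tape follows the recorded path (gradient 2x),
# even though f itself would take the other branch for x_new.
g_tape = ReverseDiff.gradient!(tape, x_new)

# Re-recording evaluates the branch afresh and follows the sum(x) path.
g_fresh = ReverseDiff.gradient(f, x_new)
```

The two results disagree, and no error or warning signals the mismatch, which is why functions with input-dependent control flow are best kept away from pre-recorded tapes.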

**ReverseDiff.GradientTape** (*Type*)

```
ReverseDiff.GradientTape(f, input, cfg::GradientConfig = GradientConfig(input))
```

Return a `GradientTape` instance containing a pre-recorded execution trace of `f` at the given `input`.

This `GradientTape` can then be passed to `ReverseDiff.gradient!` to take gradients of the execution trace with new `input` values. Note that these new values must have the same element type and shape as `input`.

See `ReverseDiff.gradient` for a description of acceptable types for `input`.
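A sketch of the record-once, replay-many pattern, assuming a branch-free target function `f` (an illustrative choice) and inputs of matching shape and element type:

```julia
using ReverseDiff

f(x) = sum(x .* x)

# Record once; the values used for recording only fix the shape and element type.
tape = ReverseDiff.GradientTape(f, rand(3))

out = zeros(3)
for x in ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
    # Replays the recorded trace with new values instead of re-recording f.
    ReverseDiff.gradient!(out, tape, x)
end
```

After the loop, `out` holds the gradient for the last input.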

**ReverseDiff.JacobianTape** (*Type*)

```
ReverseDiff.JacobianTape(f, input, cfg::JacobianConfig = JacobianConfig(input))
```

Return a `JacobianTape` instance containing a pre-recorded execution trace of `f` at the given `input`.

This `JacobianTape` can then be passed to `ReverseDiff.jacobian!` to take Jacobians of the execution trace with new `input` values. Note that these new values must have the same element type and shape as `input`.

See `ReverseDiff.jacobian` for a description of acceptable types for `input`.

```
ReverseDiff.JacobianTape(f!, output, input, cfg::JacobianConfig = JacobianConfig(output, input))
```

Return a `JacobianTape` instance containing a pre-recorded execution trace of `f!` at the given `output` and `input`.

This `JacobianTape` can then be passed to `ReverseDiff.jacobian!` to take Jacobians of the execution trace with new `input` values. Note that these new values must have the same element type and shape as `input`.

See `ReverseDiff.jacobian` for a description of acceptable types for `input`.

**ReverseDiff.HessianTape** (*Type*)

```
ReverseDiff.HessianTape(f, input, cfg::HessianConfig = HessianConfig(input))
```

Return a `HessianTape` instance containing a pre-recorded execution trace of `f` at the given `input`.

This `HessianTape` can then be passed to `ReverseDiff.hessian!` to take Hessians of the execution trace with new `input` values. Note that these new values must have the same element type and shape as `input`.

See `ReverseDiff.hessian` for a description of acceptable types for `input`.

**ReverseDiff.compile** (*Function*)

```
ReverseDiff.compile(t::AbstractTape)
```

Return a fully compiled representation of `t` of type `CompiledTape`. This object can be passed to any API methods that accept `t` (e.g. `gradient!(result, t, input)`).

In many cases, compiling `t` can significantly speed up execution time. Note that the longer the tape, the more time compilation may take. Very long tapes (i.e. when `length(t)` is on the order of 10000 elements) can take a very long time to compile.

Note that this function calls `eval` in the `current_module()` to generate functions from `t`. Thus, the returned `CompiledTape` will only be usable once the world-age counter has caught up with the world-age of the `eval`'d functions (i.e. once the call stack has bubbled up to top level).
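A minimal sketch of compiling a tape at top level and then reusing it. The function `f` is an illustrative assumption, and all calls sit at top level so the world-age caveat above does not come into play.

```julia
using ReverseDiff

f(x) = sum(x .* x)

# Record, then compile. Both steps happen at top level, not inside a function.
tape = ReverseDiff.GradientTape(f, rand(3))
compiled = ReverseDiff.compile(tape)

x = [1.0, 2.0, 3.0]
out = similar(x)

# The compiled tape is accepted wherever the original tape is.
ReverseDiff.gradient!(out, compiled, x)
```

For a tape this short any speedup is likely negligible; compilation is more likely to pay off when the same tape is replayed many times.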

## The `AbstractConfig` API

For the sake of convenience and performance, all "extra" information used by ReverseDiff's API methods is bundled up in the `ReverseDiff.AbstractConfig` family of types. These types allow the user to easily feed several different parameters to ReverseDiff's API methods, such as work buffers and tape configurations.

ReverseDiff's basic API methods will allocate these types automatically by default, but you can reduce memory usage and improve performance if you preallocate them yourself.
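Preallocation might look like the sketch below, which builds one `GradientConfig` and passes it to repeated `gradient!` calls. The function `f`, the buffers, and the loop count are illustrative assumptions.

```julia
using ReverseDiff

f(x) = sum(x .* x)
x = [1.0, 2.0, 3.0]

# Build the config once; x is only inspected for its type and shape.
cfg = ReverseDiff.GradientConfig(x)
out = similar(x)

for _ in 1:3
    # Each call re-records f but reuses the tape and work buffers held by cfg.
    ReverseDiff.gradient!(out, f, x, cfg)
end
```

Unlike a pre-recorded tape, this pattern still re-records `f` on every call, so it remains safe for functions with input-dependent control flow.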

**ReverseDiff.GradientConfig** (*Type*)

```
ReverseDiff.GradientConfig(input, tp::RawTape = RawTape())
```

Return a `GradientConfig` instance containing the preallocated tape and work buffers used by the `ReverseDiff.gradient`/`ReverseDiff.gradient!` methods.

Note that `input` is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of `input` is the same as the element type of the target function's output.

See `ReverseDiff.gradient` for a description of acceptable types for `input`.

```
ReverseDiff.GradientConfig(input, ::Type{D}, tp::RawTape = RawTape())
```

Like `GradientConfig(input, tp)`, except the provided type `D` is assumed to be the element type of the target function's output.

**ReverseDiff.JacobianConfig** (*Type*)

```
ReverseDiff.JacobianConfig(input, tp::RawTape = RawTape())
```

Return a `JacobianConfig` instance containing the preallocated tape and work buffers used by the `ReverseDiff.jacobian`/`ReverseDiff.jacobian!` methods.

Note that `input` is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of `input` is the same as the element type of the target function's output.

See `ReverseDiff.jacobian` for a description of acceptable types for `input`.

```
ReverseDiff.JacobianConfig(input, ::Type{D}, tp::RawTape = RawTape())
```

Like `JacobianConfig(input, tp)`, except the provided type `D` is assumed to be the element type of the target function's output.

```
ReverseDiff.JacobianConfig(output::AbstractArray, input, tp::RawTape = RawTape())
```

Return a `JacobianConfig` instance containing the preallocated tape and work buffers used by the `ReverseDiff.jacobian`/`ReverseDiff.jacobian!` methods. This method assumes the target function has the form `f!(output, input)`.

Note that `input` and `output` are only used for type and shape information; they are not stored or modified in any way.

See `ReverseDiff.jacobian` for a description of acceptable types for `input`.

```
ReverseDiff.JacobianConfig(result::DiffBase.DiffResult, input, tp::RawTape = RawTape())
```

A convenience method for `JacobianConfig(DiffBase.value(result), input, tp)`.

**ReverseDiff.HessianConfig** (*Type*)

```
ReverseDiff.HessianConfig(input::AbstractArray, gtp::RawTape = RawTape(), jtp::RawTape = RawTape())
```

Return a `HessianConfig` instance containing the preallocated tape and work buffers used by the `ReverseDiff.hessian`/`ReverseDiff.hessian!` methods. `gtp` is the tape used for the inner gradient calculation, while `jtp` is used for the outer Jacobian calculation.

Note that `input` is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of `input` is the same as the element type of the target function's output.

```
ReverseDiff.HessianConfig(input::AbstractArray, ::Type{D}, gtp::RawTape = RawTape(), jtp::RawTape = RawTape())
```

Like `HessianConfig(input, gtp, jtp)`, except the provided type `D` is assumed to be the element type of the target function's output.

```
ReverseDiff.HessianConfig(result::DiffBase.DiffResult, input::AbstractArray, gtp::RawTape = RawTape(), jtp::RawTape = RawTape())
```

Like `HessianConfig(input, gtp, jtp)`, but utilize `result` along with `input` to construct work buffers.

Note that `result` and `input` are only used for type and shape information; they are not stored or modified in any way.

## Optimization Annotations

**ReverseDiff.@forward** (*Macro*)

```
ReverseDiff.@forward(f)(args::Real...)
ReverseDiff.@forward f(args::Real...) = ...
ReverseDiff.@forward f = (args::Real...) -> ...
```

Declare that the given function should be differentiated using forward mode automatic differentiation. Note that the macro can be used at either the definition site or at the call site of `f`. Currently, only `length(args) <= 2` is supported. **Note that, if `f` is defined within another function `g`, `f` should not close over any differentiable input of `g`.** By using this macro, you are providing a guarantee that this property holds true.

This macro can be very beneficial for performance when intermediate functions in your computation are low dimensional scalar functions, because it minimizes the number of instructions that must be recorded to the tape. For example, take the function `sigmoid(n) = 1. / (1. + exp(-n))`. Normally, using ReverseDiff to differentiate this function would require recording 4 instructions (`-`, `exp`, `+`, and `/`). However, if we apply the `@forward` macro, only one instruction will be recorded (`sigmoid`). The `sigmoid` function will then be differentiated using ForwardDiff's `Dual` number type.

This is also beneficial for higher-order elementwise function application. ReverseDiff overloads `map`/`broadcast` to dispatch on `@forward`-applied functions. For example, `map(@forward(f), x)` will usually be more performant than `map(f, x)`.
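The call-site form with `map` could be sketched as follows, reusing the `sigmoid` example from above. The wrapper `f` and the input are illustrative assumptions, and the macro is invoked through its module-qualified name in case it is not exported.

```julia
using ReverseDiff

sigmoid(n) = 1.0 / (1.0 + exp(-n))

# Call-site annotation: each elementwise sigmoid application is meant to be
# recorded as a single instruction and differentiated in forward mode.
f(x) = sum(map(ReverseDiff.@forward(sigmoid), x))

x = [-1.0, 0.0, 1.0]
g = ReverseDiff.gradient(f, x)

# Reference values from the closed form sigmoid'(n) = s * (1 - s).
s = sigmoid.(x)
g_ref = s .* (1 .- s)
```

The gradient should match what `map(sigmoid, x)` would give; the annotation is intended to change only how much is recorded to the tape.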

ReverseDiff overloads many Base scalar functions to behave as `@forward` functions by default. A full list is given by `ReverseDiff.FORWARD_UNARY_SCALAR_FUNCS` and `ReverseDiff.FORWARD_BINARY_SCALAR_FUNCS`.

**ReverseDiff.@skip** (*Macro*)

```
ReverseDiff.@skip(f)(args::Real...)
ReverseDiff.@skip f(args::Real...) = ...
ReverseDiff.@skip f = (args::Real...) -> ...
```

Declare that the given function should be skipped during the instruction-recording phase of differentiation. Note that the macro can be used at either the definition site or at the call site of `f`. **Note that, if `f` is defined within another function `g`, `f` should not close over any differentiable input of `g`.** By using this macro, you are providing a guarantee that this property holds true.

ReverseDiff overloads many Base scalar functions to behave as `@skip` functions by default. A full list is given by `ReverseDiff.SKIPPED_UNARY_SCALAR_FUNCS` and `ReverseDiff.SKIPPED_BINARY_SCALAR_FUNCS`.