# API Documentation

`ChainRulesCore.NO_FIELDS`

— Constant.`NO_FIELDS`

Constant for the reverse-mode derivative with respect to a structure that has no fields. The most notable use for this is for the reverse-mode derivative with respect to the function itself, when that function is not a closure.

`ChainRulesCore.AbstractZero`

— Type.`AbstractZero <: AbstractDifferential`

These are zero-like differential types. If an AD system encounters a propagator taking as input only subtypes of `AbstractZero`, then it can stop performing any AD operations, as all propagators are linear functions, and thus the final result will be zero.

All `AbstractZero` subtypes are singleton types. There are two of them: `Zero()` and `DoesNotExist()`.
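The short-circuiting described above can be sketched without the library. This is a minimal illustration only: `ToyAbstractZero`, `ToyZero`, `ToyDoesNotExist`, and `apply_propagator` are hypothetical stand-ins, not ChainRulesCore definitions.

```julia
# Toy stand-ins; these are NOT the ChainRulesCore types.
abstract type ToyAbstractZero end
struct ToyZero <: ToyAbstractZero end
struct ToyDoesNotExist <: ToyAbstractZero end

# A linear propagator maps a zero input to a zero output,
# so for zero-like inputs it does not need to be called at all.
apply_propagator(propagator, Δ) = propagator(Δ)
apply_propagator(propagator, Δ::ToyAbstractZero) = ToyZero()

# Count how often the (pretend) expensive propagator actually runs.
calls = Ref(0)
expensive_propagator(Δ) = (calls[] += 1; 2.0 * Δ)

r1 = apply_propagator(expensive_propagator, 3.0)               # runs the propagator
r2 = apply_propagator(expensive_propagator, ToyDoesNotExist()) # skipped entirely
```

Dispatching on the abstract supertype is what lets a single method cover both zero-like singletons.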

`ChainRulesCore.Composite`

— Type.`Composite{P, T} <: AbstractDifferential`

This type represents the differential for a `struct`/`NamedTuple`, or `Tuple`. `P` is the corresponding primal type that this is a differential for.

`Composite{P}` should have fields (technically properties) that match a subset of the fields of the primal type, and each should be a differential type matching the primal type of that field. Fields of `P` that are not present in the `Composite` are treated as `Zero`.

`T` is an implementation detail representing the backing data structure. For a `Tuple` it will be a `Tuple`, and for everything else it will be a `NamedTuple`. It should not be passed in by the user.

For `Composite`s of `Tuple`s, `iterate` and `getindex` are overloaded to behave similarly to a tuple. For `Composite`s of `struct`s, `getproperty` is overloaded to allow for accessing values via `comp.fieldname`. Any fields not explicitly present in the `Composite` are treated as being set to `Zero()`. To make a `Composite` have all the fields of the primal, the `canonicalize` function is provided.
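The "missing fields read as zero" behaviour and the idea behind canonicalization can be sketched as follows. Everything here is a hypothetical stand-in (`ToyComposite`, `ToyZeroDiff`, `toy_canonicalize`, `ToyPoint`), written under the assumption of a `NamedTuple` backing; it is not the library's implementation.

```julia
# Toy stand-ins; NOT the ChainRulesCore implementation.
struct ToyZeroDiff end

struct ToyComposite{P, T<:NamedTuple}
    backing::T
end

# Build a toy composite for primal type P from keyword arguments.
function ToyComposite{P}(; kwargs...) where {P}
    backing = (; kwargs...)
    return ToyComposite{P, typeof(backing)}(backing)
end

# Property access: fields absent from the backing read as zero.
function Base.getproperty(c::ToyComposite, name::Symbol)
    nt = getfield(c, :backing)
    return haskey(nt, name) ? getfield(nt, name) : ToyZeroDiff()
end

# Fill in every field of the primal type, in the primal's field order.
function toy_canonicalize(c::ToyComposite{P}) where {P}
    names = fieldnames(P)
    vals = map(n -> getproperty(c, n), names)
    full = NamedTuple{names}(vals)
    return ToyComposite{P, typeof(full)}(full)
end

# Hypothetical primal type used only for this illustration.
struct ToyPoint
    x::Float64
    y::Float64
end

c = ToyComposite{ToyPoint}(x = 2.0)
full = toy_canonicalize(c)
```

Here `c.y` reads as the toy zero even though only `x` was stored, and `full` carries both `x` and `y` explicitly.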

`ChainRulesCore.DoesNotExist`

— Type.`DoesNotExist() <: AbstractZero`

This differential indicates that the derivative does not exist. It is the differential for a primal type that is not differentiable, such as an integer or a Boolean (when not being used as a representation of a value that normally would be a floating point). The only valid way to perturb such a value is to not change it at all. As such, `DoesNotExist` is functionally identical to `Zero()`, but provides additional semantic information.

If you are adding this differential to a primal then something is wrong. An optimization package making use of this might like to check for such a case.

!!! note: This does not indicate that the derivative is not implemented, but rather that mathematically it is not defined.

This mostly shows up as the derivative with respect to dimension, index, or size arguments.

```
function rrule(::typeof(fill), x, len::Int)
    y = fill(x, len)
    # `len` is an integer size argument, so its derivative does not exist
    fill_pullback(ȳ) = (NO_FIELDS, @thunk(sum(ȳ)), DoesNotExist())
    return y, fill_pullback
end
```

`ChainRulesCore.InplaceableThunk`

— Type.`InplaceableThunk(val::Thunk, add!::Function)`

A wrapper for a `Thunk` that allows it to define an in-place `add!` function.

`add!` should be defined such that `ithunk.add!(Δ) = Δ .+= ithunk.val`, but it should do this more efficiently than simply doing this directly. (Otherwise one can just use a normal `Thunk`.)

Most operations on an `InplaceableThunk` treat it just like a normal `Thunk`, and destroy its in-placeability.
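As a rough sketch of why an in-place `add!` can be cheaper, consider a differential of the form `a * x`. The struct `ToyInplaceable` below is a hypothetical stand-in, not the library type; the point is only that the in-place path can fuse the scaling into the accumulation instead of allocating `a * x` first.

```julia
# Toy stand-in; NOT ChainRulesCore.InplaceableThunk.
struct ToyInplaceable{V, F}
    val::V    # zero-argument closure producing the value (out-of-place path)
    add!::F   # accumulates the same value into an existing buffer
end

a, x = 2.0, [1.0, 2.0, 3.0]

t = ToyInplaceable(
    () -> a * x,             # allocates a fresh array for a * x
    Δ -> (Δ .+= a .* x),     # fused broadcast, no temporary array
)

acc = zeros(3)
t.add!(acc)                        # in-place accumulation
out_of_place = zeros(3) + t.val()  # same result via the ordinary path
```

Both paths give the same accumulated value; only the allocation behaviour differs.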

`ChainRulesCore.One`

— Type.`One()`

The differential that is the multiplicative identity. Basically, this represents `1`.

`ChainRulesCore.Thunk`

— Type.`Thunk(()->v)`

A thunk is a deferred computation. It wraps a zero-argument closure that, when invoked, returns a differential. `@thunk(v)` is a macro that expands into `Thunk(()->v)`.

Calling a thunk calls the wrapped closure. `extern`ing thunks applies recursively: it also externs the differential that the closure returns. If you do not want that, then simply call the thunk:

```
julia> t = @thunk(@thunk(3))
Thunk(var"##7#9"())
julia> extern(t)
3
julia> t()
Thunk(var"##8#10"())
julia> t()()
3
```

**When to @thunk?**

When writing `rrule`s (and to a lesser extent `frule`s), it is important to `@thunk` appropriately. Propagation rules that return multiple derivatives may not have all derivatives used. By `@thunk`ing the work required for each derivative, they then compute only what is needed.

**How do thunks prevent work?**

If we have `res = pullback(...) = @thunk(f(x)), @thunk(g(x))`, then if we did `dx + res[1]`, only `f(x)` would be evaluated, not `g(x)`. Also, if we did `Zero() * res[1]`, then the result would be `Zero()` and `f(x)` would not be evaluated.
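The behaviour just described can be reproduced with a few lines of plain Julia. This is a sketch under assumptions: `ToyThunk`, `ToyZeroT`, `toy_f`, `toy_g`, and `toy_pullback` are hypothetical names, and only the two operations mentioned in the text are defined.

```julia
# Toy stand-ins; NOT the ChainRulesCore types.
struct ToyThunk{F}
    f::F
end
(t::ToyThunk)() = t.f()   # calling the thunk runs the wrapped closure

struct ToyZeroT end

# Adding forces evaluation; multiplying by zero does not.
Base.:+(a::Number, t::ToyThunk) = a + t()
Base.:*(::ToyZeroT, ::ToyThunk) = ToyZeroT()

# Record which functions were actually evaluated.
evaluated = String[]
toy_f(x) = (push!(evaluated, "f"); 2x)
toy_g(x) = (push!(evaluated, "g"); 3x)

toy_pullback(x) = (ToyThunk(() -> toy_f(x)), ToyThunk(() -> toy_g(x)))
res = toy_pullback(1.0)

z = ToyZeroT() * res[1]             # nothing is evaluated
nothing_ran_yet = isempty(evaluated)

dx = 0.5 + res[1]                   # only toy_f runs
```

After these lines, `evaluated` contains only `"f"`; the second thunk was never forced.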

**So why not thunk everything?**

`@thunk` creates a closure over the expression, which (effectively) creates a `struct` with a field for each variable used in the expression, and call overloaded.

Do not use `@thunk` if this would be equal or more work than actually evaluating the expression itself.

For more details see the manual section on using thunks effectively.

`ChainRulesCore.Zero`

— Type.`Zero() <: AbstractZero`

The additive identity for differentials. This is basically the same as `0`. A derivative of `Zero()` does not propagate through the primal function.

`ChainRulesCore.canonicalize`

— Method.`canonicalize(comp::Composite{P}) -> Composite{P}`

Return the canonical `Composite` for the primal type `P`. The property names of the returned `Composite` match the field names of the primal, and all fields of `P` not present in the input `comp` are explicitly set to `Zero()`.

`ChainRulesCore.extern`

— Method.`extern(x)`

Makes a best-effort attempt to convert a differential into a primal value. This is not always a well-defined operation, for two reasons:

- It may not be possible to determine the primal type for a given differential. For example, `Zero` is a valid differential for any primal.
- The primal type might not be a vector space, and thus might not be a valid differential type. For example, if the primal type is `DateTime`, it's not a valid differential type as two `DateTime`s cannot be added (fun fact: `Millisecond` is a differential for `DateTime`).
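The `DateTime` case can be checked directly with the `Dates` standard library. The snippet below is only an illustration of the primal/differential distinction, not something the library itself does.

```julia
using Dates

t = DateTime(2020, 1, 1)

# primal + differential gives another primal
shifted = t + Millisecond(1500)

# the difference of two primals is a differential
delta = DateTime(2020, 1, 2) - t

# primal + primal is not defined for DateTime
err = try
    t + t
    nothing
catch e
    e
end
```

So `DateTime` values can be shifted by a `Millisecond`, but two `DateTime`s have no sum, which is why `DateTime` cannot serve as its own differential type.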

Where it is defined, the operation of `extern` for a primal type `P` should be `extern(x) = zero(P) + x`.

Because of its limitations, `extern` should only really be used for testing. It can be useful, if you know what you are getting out, as it recursively removes thunks, and otherwise makes outputs more consistent with finite differencing. The more useful action in general is to call `+`, or in the case of thunks: `unthunk`.

Note that `extern` may return an alias (not necessarily a copy) to data wrapped by `x`, such that mutating `extern(x)` might mutate `x` itself.

`ChainRulesCore.frule`

— Method.`frule(f, x..., ṡelf, Δx...)`

Expressing `x` as the tuple `(x₁, x₂, ...)`, `Δx` as the tuple `(Δx₁, Δx₂, ...)`, and the output tuple of `f(x...)` as `Ω`, return the tuple:

`(Ω, (Ω̇₁, Ω̇₂, ...))`

The second return value is the propagation rule, or the pushforward. It takes in differentials corresponding to the inputs (`ẋ₁, ẋ₂, ...`) and `ṡelf`, the internal values of the function (for closures).

If no method matching `frule(f, x..., ṡelf, Δx...)` has been defined, then return `nothing`.

Examples:

unary input, unary output scalar function:

```
julia> dself = Zero()
Zero()
julia> x = rand();
julia> sinx, sin_pushforward = frule(sin, x, dself, 1)
(0.35696518021277485, 0.9341176907197836)
julia> sinx == sin(x)
true
julia> sin_pushforward == cos(x)
true
```

unary input, binary output scalar function:

```
julia> x = rand();
julia> sincosx, sincos_pushforward = frule(sincos, x, dself, 1);
julia> sincosx == sincos(x)
true
julia> sincos_pushforward == (cos(x), -sin(x))
true
```

See also: `rrule`, `@scalar_rule`

`ChainRulesCore.rrule`

— Method.`rrule(f, x...)`

Expressing `x` as the tuple `(x₁, x₂, ...)` and the output tuple of `f(x...)` as `Ω`, return the tuple:

`(Ω, (Ω̄₁, Ω̄₂, ...) -> (s̄elf, x̄₁, x̄₂, ...))`

Where the second return value is the propagation rule, or pullback. It takes in differentials corresponding to the outputs (`Ω̄₁, Ω̄₂, ...`) and returns differentials corresponding to the inputs (`x̄₁, x̄₂, ...`), together with `s̄elf`, the differential for the internal values of the function itself (for closures).

If no method matching `rrule(f, x...)` has been defined, then return `nothing`.

Examples:

unary input, unary output scalar function:

```
julia> x = rand();
julia> sinx, sin_pullback = rrule(sin, x);
julia> sinx == sin(x)
true
julia> sin_pullback(1) == (NO_FIELDS, cos(x))
true
```

binary input, unary output scalar function:

```
julia> x, y = rand(2);
julia> hypotxy, hypot_pullback = rrule(hypot, x, y);
julia> hypotxy == hypot(x, y)
true
julia> hypot_pullback(1) == (NO_FIELDS, (x / hypot(x, y)), (y / hypot(x, y)))
true
```

See also: `frule`, `@scalar_rule`

`ChainRulesCore.unthunk`

— Method.`unthunk(x)`

On `AbstractThunk`s this removes 1 layer of thunking. On any other type, it is the identity operation.

In contrast to `extern`, this is nonrecursive.
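The difference between removing one layer and removing all layers can be sketched with a toy lazy wrapper. `ToyLazy`, `toy_unthunk`, and `toy_extern` are hypothetical names used only for this illustration; they mimic the described behaviour rather than reproduce the library code.

```julia
# Toy stand-in for a thunk; NOT the ChainRulesCore type.
struct ToyLazy{F}
    f::F
end

# Non-recursive: strip exactly one layer; identity on everything else.
toy_unthunk(x) = x
toy_unthunk(t::ToyLazy) = t.f()

# Recursive: keep stripping until a plain value remains.
toy_extern(x) = x
toy_extern(t::ToyLazy) = toy_extern(t.f())

nested = ToyLazy(() -> ToyLazy(() -> 3))

one_layer = toy_unthunk(nested)    # still a ToyLazy
two_layers = toy_unthunk(one_layer)
all_layers = toy_extern(nested)
```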

`ChainRulesCore.@scalar_rule`

— Macro.

```
@scalar_rule(f(x₁, x₂, ...),
             @setup(statement₁, statement₂, ...),
             (∂f₁_∂x₁, ∂f₁_∂x₂, ...),
             (∂f₂_∂x₁, ∂f₂_∂x₂, ...),
             ...)
```

A convenience macro that generates simple scalar forward or reverse rules using the provided partial derivatives. Specifically, it generates the corresponding methods for `frule` and `rrule`:

```
function ChainRulesCore.frule(::typeof(f), x₁::Number, x₂::Number, ...)
    Ω = f(x₁, x₂, ...)
    $(statement₁, statement₂, ...)
    return Ω, (_, Δx₁, Δx₂, ...) -> (
        (∂f₁_∂x₁ * Δx₁ + ∂f₁_∂x₂ * Δx₂ + ...),
        (∂f₂_∂x₁ * Δx₁ + ∂f₂_∂x₂ * Δx₂ + ...),
        ...
    )
end
function ChainRulesCore.rrule(::typeof(f), x₁::Number, x₂::Number, ...)
    Ω = f(x₁, x₂, ...)
    $(statement₁, statement₂, ...)
    return Ω, (ΔΩ₁, ΔΩ₂, ...) -> (
        NO_FIELDS,
        (∂f₁_∂x₁ * ΔΩ₁ + ∂f₂_∂x₁ * ΔΩ₂ + ...),
        (∂f₁_∂x₂ * ΔΩ₁ + ∂f₂_∂x₂ * ΔΩ₂ + ...),
        ...
    )
end
```

If no type constraints in `f(x₁, x₂, ...)` within the call to `@scalar_rule` are provided, each parameter in the resulting `frule`/`rrule` definition is given a type constraint of `Number`. Constraints may also be explicitly provided to override the `Number` constraint, e.g. `f(x₁::Complex, x₂)`, which will constrain `x₁` to `Complex` and `x₂` to `Number`.

At present this does not support defining rules for closures/functors. Thus in reverse-mode, the first returned partial, representing the derivative with respect to the function itself, is always `NO_FIELDS`. And in forward-mode, the first input to the returned propagator is always ignored.

The result of `f(x₁, x₂, ...)` is automatically bound to `Ω`. This allows the primal result to be conveniently referenced (as `Ω`) within the derivative/setup expressions.
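To make the role of `Ω` concrete, here is a hand-written sketch of roughly what a reverse rule for `exp` could look like once expanded, reusing the primal result as the partial derivative. It is an assumption-laden illustration: `toy_rrule` and `ToyNoFields` are hypothetical stand-ins so that the snippet runs without the library, and the macro's actual output may differ.

```julia
# Toy stand-ins; NOT ChainRulesCore.rrule or NO_FIELDS.
struct ToyNoFields end

function toy_rrule(::typeof(exp), x::Number)
    Ω = exp(x)                     # primal result, bound to Ω
    # d/dx exp(x) = exp(x), so the partial is just Ω: no recomputation needed
    exp_pullback(ΔΩ) = (ToyNoFields(), Ω * ΔΩ)
    return Ω, exp_pullback
end

y, pb = toy_rrule(exp, 1.0)
self_bar, x_bar = pb(2.0)
```

Referring to `Ω` inside the partial avoids calling `exp` a second time in the pullback.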

The `@setup` argument can be elided if no setup code is needed. In other words:

```
@scalar_rule(f(x₁, x₂, ...),
(∂f₁_∂x₁, ∂f₁_∂x₂, ...),
(∂f₂_∂x₁, ∂f₂_∂x₂, ...),
...)
```

is equivalent to:

```
@scalar_rule(f(x₁, x₂, ...),
@setup(nothing),
(∂f₁_∂x₁, ∂f₁_∂x₂, ...),
(∂f₂_∂x₁, ∂f₂_∂x₂, ...),
...)
```

For examples, see ChainRules' `rulesets` directory.

`ChainRulesCore.@thunk`

— Macro.`@thunk expr`

Define a `Thunk` wrapping the `expr`, to lazily defer its evaluation.

The subtypes of `AbstractDifferential` define a custom "algebra" for chain rule evaluation that attempts to factor various features like complex derivative support, broadcast fusion, zero-elision, etc. into nicely separated parts.

All subtypes of `AbstractDifferential` implement the following operations:

- `+(a, b)`: linearly combine differential `a` and differential `b`
- `*(a, b)`: multiply the differential `b` by the scaling factor `a`
- `Base.conj(x)`: complex conjugate of the differential `x`
- `Base.zero(x) = Zero()`: a zero.

In general a differential type is the type of a derivative of a value. The type of the value is, for contrast, called the primal type. Differential types correspond to primal types, although the relation is not one-to-one. Subtypes of `AbstractDifferential` are not the only differential types. In fact, for the most common primal types, such as `Real` or `AbstractArray{Real}`, the differential type is the same as the primal type.

In a circular definition: the most important property of a differential is that it should be able to be added (by defining `+`) to another differential of the same primal type. That allows for gradients to be accumulated.

It generally also should be able to be added to a primal to give back another primal, as this facilitates gradient descent.

`_normalize_scalarrules_macro_input(call, maybe_setup, partials)`

returns (in order) the correctly escaped:

- `call`: without any type constraints
- `setup_stmts`: the content of `@setup`, or `nothing` if that is not provided
- `inputs`: with all args having the constraints removed from call, or defaulting to `Number`
- `partials`: which are all `Expr{:tuple,...}`

`ChainRulesCore._zeroed_backing`

— Method.`_zeroed_backing(P)`

Returns a `NamedTuple` with the same fields as `P`, and all values `Zero()`.

`ChainRulesCore.backing`

— Method.`backing(x)`

Accesses the backing field of a `Composite`, or destructures any other composite type into a `NamedTuple`. Identity function on `Tuple`s and `NamedTuple`s.

This is an internal function used to simplify operations between `Composite`s and the primal types.

`ChainRulesCore.construct`

— Method.`construct(::Type{T}, fields::[NamedTuple|Tuple])`

Constructs an object of type `T`, with the given fields. Fields must be correct in name and type, and `T` must have a default constructor.

This is called internally to construct structs of the primal type `T`, after an operation such as the addition of a primal to a composite.

It should be overloaded if `T` does not have a default constructor, or if `T` needs to maintain some invariants between its fields.
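A minimal sketch of the idea, assuming a type with a default (positional) constructor: `toy_construct` and `ToyPair` are hypothetical names, and the real function may be implemented differently.

```julia
# Hypothetical primal type with a default constructor.
struct ToyPair
    a::Int
    b::Float64
end

# Look fields up by name so that the NamedTuple's own order does not matter.
function toy_construct(::Type{T}, fields::NamedTuple) where {T}
    return T(map(n -> getfield(fields, n), fieldnames(T))...)
end

# A plain Tuple is taken to already be in field order.
toy_construct(::Type{T}, fields::Tuple) where {T} = T(fields...)

p1 = toy_construct(ToyPair, (b = 2.0, a = 1))
p2 = toy_construct(ToyPair, (1, 2.0))
```

A type whose constructor enforces invariants (or takes different arguments) would need its own method, which is the overloading case mentioned above.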

`ChainRulesCore.propagation_expr`

— Method.`propagation_expr(Δs, ∂s)`

Returns the expression for the propagation of the input gradient `Δs` through the partials `∂s`.

`ChainRulesCore.propagator_name`

— Method.`propagator_name(f, propname)`

Determines a reasonable name for the propagator function. The name doesn't really matter too much as it is a local function to be returned by `frule` or `rrule`, but a good name makes debugging easier. `f` should be some form of AST representation of the actual function, and `propname` should be either `:pullback` or `:pushforward`.

This is able to deal with fairly complex expressions for `f`:

```
julia> propagator_name(:bar, :pushforward)
:bar_pushforward
julia> propagator_name(esc(:(Base.Random.foo)), :pullback)
:foo_pullback
```