Arguments
General guidelines
Function form
DifferentiationInterface only computes derivatives for functions with one of two specific forms:
y = f(x, contexts...) # out of place, returns `y`
f!(y, x, contexts...) # in place, returns `nothing`
In this notation:
- f (or f!) is the differentiated function
- y is the output
- x is the input, the only "active" argument, which always comes first
- contexts may contain additional, inactive arguments
The quantities returned by the various operators always correspond to (partial) derivatives of y with respect to x.
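As a minimal sketch of the two forms, the snippet below differentiates one out-of-place and one in-place function. It assumes ForwardDiff is installed and used as the backend through AutoForwardDiff(); the function bodies and input values are purely illustrative.

```julia
using DifferentiationInterface
import ForwardDiff  # assumed backend; another supported AD package could be used instead

backend = AutoForwardDiff()

# Out-of-place form: returns the output y
f(x) = sum(abs2, x)

# In-place form: overwrites every entry of y and returns nothing
function f!(y, x)
    y[1] = sum(abs2, x)
    y[2] = prod(x)
    return nothing
end

x = [1.0, 2.0, 3.0]
y = zeros(2)  # storage for the in-place output, its initial value is irrelevant

g = gradient(f, backend, x)       # derivative of the scalar output with respect to x
J = jacobian(f!, y, backend, x)   # derivative of the vector output y with respect to x
```

Note that the in-place operator takes y right after the function, while x still comes after the backend.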
Assumptions
The package makes one central assumption about the behavior and implementation of f (or f!):
Either an argument's provided value matters, or it can be mutated during the function call, but never both.
This rule applies to the arguments as follows:
- The provided value of x matters because we evaluate and differentiate f at point x. Therefore, x cannot be mutated by the function.
- For in-place functions f!, the output y is meant to be overwritten. Hence, its provided (initial) value cannot matter, and it must be entirely overwritten.
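The rule for y can be checked in plain Julia, without any differentiation. In the sketch below, good! and bad! are hypothetical names chosen for illustration: the first overwrites y entirely, while the second accumulates into y, so its result depends on the initial value of an argument that it also mutates.

```julia
# Conforms to the rule: x is only read, every entry of y is overwritten.
function good!(y, x)
    y[1] = sum(abs2, x)
    y[2] = prod(x)
    return nothing
end

# Breaks the rule: `+=` reads the initial value of y and also mutates it.
function bad!(y, x)
    y[1] += sum(abs2, x)
    y[2] += prod(x)
    return nothing
end

x = [1.0, 2.0, 3.0]

# With good!, the result does not depend on what y held beforehand.
y_zeros, y_ones = zeros(2), ones(2)
good!(y_zeros, x)
good!(y_ones, x)
good_agrees = (y_zeros == y_ones)

# With bad!, two different initial values of y lead to two different results.
z_zeros, z_ones = zeros(2), ones(2)
bad!(z_zeros, x)
bad!(z_ones, x)
bad_agrees = (z_zeros == z_ones)
```

Only functions written like good! satisfy the assumption stated above.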
Whether or not the function object itself can be mutated is a tricky question, and support for this varies between backends. When in doubt, try to avoid mutating functions and pass contexts instead. In any case, DifferentiationInterface will assume that the recursive components (fields, subfields, etc.) of f or f! individually satisfy the same mutation rule: whenever the initial value matters, no mutation is allowed.
Contexts
Motivation
As stated, there can be only one active argument, which we call x. However, version 0.6 of the package introduced the possibility of additional "context" arguments, whose derivatives we don't need to compute. Contexts can be useful if you have a function y = f(x, a, b, c, ...) or f!(y, x, a, b, c, ...) and you only want the derivative of y with respect to x. Another option would be creating a closure, but that is sometimes undesirable for performance reasons.
Every context argument must be wrapped in a subtype of Context and come after the active argument x.
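The following sketch compares a Constant context with the closure alternative. It assumes ForwardDiff as the backend; the function f and the parameter a are illustrative only.

```julia
using DifferentiationInterface
import ForwardDiff  # assumed backend

backend = AutoForwardDiff()

# x is the active argument, a is an inactive parameter
f(x, a) = a * sum(abs2, x)

x = [1.0, 2.0, 3.0]
a = 2.0

# Option 1: pass a as a context, wrapped in Constant and placed after x
g_context = gradient(f, backend, x, Constant(a))

# Option 2: capture a in a closure
g_closure = gradient(x -> f(x, a), backend, x)
```

Both calls compute the derivative of y with respect to x only, so they should agree. The context version avoids creating a new closure for each value of a.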
Context types
There are three kinds of context: Constant, Cache and the hybrid ConstantOrCache. Those are also classified based on the mutation rule:
- Constant contexts wrap data that influences the output of the function. Hence they cannot be mutated.
- Cache contexts correspond to scratch spaces that can be mutated at will. Hence their provided value is arbitrary.
- ConstantOrCache is a hybrid, whose recursive components (fields, subfields, etc.) must individually satisfy the assumptions of either Constant or Cache.
Semantically, both of these calls compute the partial gradient of f(x, c) with respect to x, but they consider c differently:
gradient(f, backend, x, Constant(c))
gradient(f, backend, x, Cache(c))
In the first call, c must be kept unchanged throughout the function evaluation. In the second call, c may be mutated with values computed during the function.
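As a hedged illustration, here is a function that uses both kinds of context at once. The sketch assumes ForwardDiff as the backend and assumes that this backend accepts Cache contexts. The names scale and buffer are hypothetical.

```julia
using DifferentiationInterface
import ForwardDiff  # assumed backend, assumed here to support Cache contexts

backend = AutoForwardDiff()

# scale influences the output, so it is a Constant.
# buffer is scratch space that gets overwritten, so it is a Cache.
function f(x, scale, buffer)
    buffer .= scale .* x      # the initial contents of buffer are never read
    return sum(abs2, buffer)
end

x = [1.0, 2.0, 3.0]
scale = 2.0
buffer = similar(x)           # arbitrary initial values are fine for a Cache

g = gradient(f, backend, x, Constant(scale), Cache(buffer))
```

Here f(x) = scale^2 * sum(abs2, x), so the gradient with respect to x should equal 2 * scale^2 .* x.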
Not every backend supports every type of context. See the documentation on backends for more details.