Differentiation tools in Julia. JuliaDiff on GitHub

## Stop approximating derivatives!

Derivatives are required at the core of many numerical algorithms. Unfortunately, they are usually computed inefficiently and approximately by some variant of the finite difference approach

$f'(x) \approx \frac{f(x+h) - f(x)}{h}, \quad h \text{ small}.$

This method is inefficient because, for example, computing the gradient $\nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_n}(x)\right)$ of a function $f : \mathbb{R}^n \to \mathbb{R}$ requires $\Omega(n)$ evaluations of $f$. It is approximate because we have to choose some finite, small value of the step length $h$, balancing floating-point round-off error against mathematical truncation error.
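Both drawbacks can be seen in a small experiment. The following is a minimal sketch in plain Python (chosen only to keep the example self-contained; it uses none of the packages listed on this page). The test function `f`, the helper `forward_difference_gradient`, and the step sizes are illustrative assumptions, not part of any library.

```python
import math


def forward_difference_gradient(f, x, h):
    """Approximate the gradient of f at x with forward differences.

    Returns the approximate gradient and the number of calls made to f.
    """
    calls = 0

    def counted(point):
        nonlocal calls
        calls += 1
        return f(point)

    base = counted(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h  # perturb one coordinate at a time
        grad.append((counted(shifted) - base) / h)
    return grad, calls


def f(x):
    # Hypothetical test function; its exact gradient is cos(x_i) in slot i.
    return sum(math.sin(xi) for xi in x)


x = [0.1 * i for i in range(5)]
exact = [math.cos(xi) for xi in x]

errors = {}
for h in (1e-1, 1e-4, 1e-8, 1e-12):
    grad, calls = forward_difference_gradient(f, x, h)
    errors[h] = max(abs(g - e) for g, e in zip(grad, exact))
    print(f"h={h:.0e}  calls={calls}  max error={errors[h]:.2e}")
```

For an input of length $n = 5$, every gradient costs $n + 1 = 6$ calls to `f`. The printed errors typically shrink as $h$ decreases and then grow again once round-off dominates, which is the balancing act described above.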

#### What can we do instead?

One option is to explicitly write down a function which computes the exact derivatives by using the rules that we know from calculus. However, this quickly becomes an error-prone and tedious exercise. There is another way! The field of automatic differentiation provides methods for automatically computing exact derivatives (up to floating-point error) given only the function $f$ itself. Some methods use many fewer evaluations of $f$ than would be required when using finite differences. In the best case, the exact gradient of $f$ can be evaluated for the cost of $O(1)$ evaluations of $f$ itself. The caveat is that $f$ cannot be considered a black box; instead, we require either access to the source code of $f$ or a way to plug in a special type of number using operator overloading.
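To make the "special type of number" idea concrete, here is a minimal sketch of forward-mode AD with dual numbers, written in plain Python for self-containment. The names `Dual`, `sin`, and `derivative` are hypothetical and are not the API of any package listed on this page; only addition, multiplication, and sine are supported.

```python
import math


class Dual:
    """A number val + der*eps with eps**2 == 0; `der` carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants (derivative zero).
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def sin(x):
    # Chain rule for one elementary function: sin(u)' = cos(u) * u'
    if isinstance(x, Dual):
        return Dual(math.sin(x.val), math.cos(x.val) * x.der)
    return math.sin(x)


def derivative(f, x):
    # Seed the input with derivative 1 and read the derivative off the output.
    return f(Dual(x, 1.0)).der


def f(x):
    # Ordinary-looking code; it never mentions derivatives.
    return 3 * x * x + sin(x) * x + 2


d = derivative(f, 1.5)
print(d)
```

The function `f` is evaluated once on a `Dual` input, and the result agrees with the hand-derived $6x + x\cos x + \sin x$ up to floating-point error, with no step length to tune. This sketch covers only forward mode for a scalar input; the $O(1)$ gradient cost mentioned above refers to reverse mode, which is not shown here.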

JuliaDiff is an informal organization which aims to unify and document packages written in Julia for evaluating derivatives. The technical features of Julia (namely multiple dispatch, access to source code via reflection, JIT compilation, and first-class access to expression parsing) make implementing and using techniques from automatic differentiation easier than ever before (in our biased opinion).

## The Big List

This is a big list of Julia Automatic Differentiation (AD) packages and related tooling. As you can see, there is a lot going on here. As with any such big list, it rapidly becomes outdated. When you notice something that is out of date, or just plain wrong, please submit a PR.

This list aims to be comprehensive in coverage. By necessity, this means it is not comprehensive in detail. It is worth investigating each package yourself to really understand its ins and outs, and its pros and cons relative to its competitors.

### Reverse-mode

• Tracker.jl: Operator-overloading reverse-mode AD. Best known for having been the AD used in earlier versions of the machine learning package Flux.jl. No longer used by Flux.jl, but still used in several places in the Julia ecosystem.

• Zygote.jl: IR-level source-to-source reverse-mode AD. Very widely used. Particularly notable for being the AD used by Flux.jl. Also features a secret experimental source-to-source forward-mode AD.

• Yota.jl: IR-level source-to-source reverse-mode AD.

• XGrad.jl: AST-level source-to-source reverse-mode AD. Not currently in active development.

• ReversePropagation.jl: Scalar, tracing-based source-to-source reverse-mode AD.

• Enzyme.jl: Scalar, LLVM source-to-source reverse-mode AD. Experimental.

• Diffractor.jl: Next-gen IR-level source-to-source reverse-mode (and forward-mode) AD. In development.

### Forward-mode

• Diffractor.jl: Next-gen IR-level source-to-source forward-mode (and reverse-mode) AD. In development.

### Exotic

• TaylorSeries.jl: Computes polynomial expansions, which generalizes forward-mode AD to nth-order derivatives.

• NiLang.jl: Reversible computing DSL, where everything is differentiable by reversing.

• TaylorDiff.jl: An efficient, linear-scaling implementation of higher-order directional derivatives, implemented with operator overloading on statically typed Taylor polynomials. In development.
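The Taylor-polynomial idea behind the first and last entries of this list can be illustrated by carrying several coefficients through a computation instead of a single derivative. The following plain-Python sketch is an assumption-laden toy: the names `Taylor`, `_lift`, and `derivatives` are hypothetical, only `+` and `*` are supported, and it is not how TaylorSeries.jl or TaylorDiff.jl are implemented.

```python
from math import factorial


class Taylor:
    """Truncated Taylor polynomial: coeffs[k] multiplies t**k."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)

    def __add__(self, other):
        other = _lift(other, len(self.coeffs))
        return Taylor(a + b for a, b in zip(self.coeffs, other.coeffs))

    __radd__ = __add__

    def __mul__(self, other):
        n = len(self.coeffs)
        other = _lift(other, n)
        out = [0.0] * n
        # Polynomial product, discarding every term of degree >= n.
        for i, a in enumerate(self.coeffs):
            for j, b in enumerate(other.coeffs[: n - i]):
                out[i + j] += a * b
        return Taylor(out)

    __rmul__ = __mul__


def _lift(x, n):
    # Treat a plain number as a constant polynomial of the same length.
    if isinstance(x, Taylor):
        return x
    return Taylor([x] + [0.0] * (n - 1))


def derivatives(f, x0, order):
    """Return [f(x0), f'(x0), ..., f^(order)(x0)] for order >= 1."""
    seed = Taylor([x0, 1.0] + [0.0] * (order - 1))  # the polynomial x0 + t
    result = f(seed)
    # The k-th Taylor coefficient equals f^(k)(x0) / k!.
    return [factorial(k) * c for k, c in enumerate(result.coeffs)]


def f(x):
    return x * x * x + 2 * x


values = derivatives(f, 2.0, 3)
print(values)
```

Truncating to two coefficients recovers ordinary forward-mode AD, which is the sense in which polynomial expansions generalize it.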

### Finite Differencing

Yes, we said at the start to stop approximating derivatives, but these packages are faster and more accurate than you would expect finite differencing to ever achieve. If you really need finite differencing, use these packages rather than implementing your own.

• FiniteDifferences.jl: High-accuracy finite differencing with support for almost any type (not just arrays and numbers).

• FiniteDiff.jl: High-accuracy finite differencing with support for efficient calculation of sparse Jacobians via coloring vectors.

• Calculus.jl: Largely deprecated, legacy package. New users should look to FiniteDifferences.jl and FiniteDiff.jl instead.

### Rulesets

Packages providing collections of derivatives of functions which can be used in AD packages.

• DiffRules.jl: An earlier set of AD-independent rules for scalar functions. Used as the primary source of rules for ForwardDiff.jl, and in part by other packages.

• ZygoteRules.jl: Lightweight package for defining rules for Zygote.jl. Largely deprecated in favour of the AD-independent ChainRulesCore.jl.

### Sparsity

• SparsityDetection.jl: Automatic Jacobian and Hessian sparsity pattern detection.

• SparseDiffTools.jl: Exploiting sparsity to speed up FiniteDiff.jl and ForwardDiff.jl, as well as other algorithms.
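To show why a known sparsity pattern helps, the sketch below uses column coloring: columns of the Jacobian that never share a nonzero row can be perturbed together, so one function evaluation recovers several columns at once. It is written in plain Python with forward differences. The function `tridiagonal_map`, the helper `colored_jacobian`, and the hand-picked coloring are illustrative assumptions and do not reflect the APIs or algorithms of the packages above.

```python
def tridiagonal_map(x):
    """Hypothetical test function whose Jacobian is tridiagonal."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(left - 2.0 * x[i] ** 2 + 3.0 * right)
    return out


def colored_jacobian(f, x, colors, pattern, h=1e-6):
    """Forward-difference Jacobian using one evaluation of f per color.

    colors[j] is the color of column j; pattern[i] is the set of columns on
    which output i depends. Columns sharing a color must not share a row.
    """
    n = len(x)
    base = f(x)
    jac = [[0.0] * n for _ in base]
    evaluations = 1
    for color in sorted(set(colors)):
        # Perturb every column of this color simultaneously.
        shifted = [xj + h if colors[j] == color else xj
                   for j, xj in enumerate(x)]
        fx = f(shifted)
        evaluations += 1
        for i, cols in enumerate(pattern):
            for j in cols:
                if colors[j] == color:
                    # Only column j of this color touches row i.
                    jac[i][j] = (fx[i] - base[i]) / h
    return jac, evaluations


n = 9
x = [0.5 + 0.1 * j for j in range(n)]
pattern = [{j for j in (i - 1, i, i + 1) if 0 <= j < n} for i in range(n)]
colors = [j % 3 for j in range(n)]  # valid coloring for a tridiagonal pattern
jac, evaluations = colored_jacobian(tridiagonal_map, x, colors, pattern)
print(evaluations)
```

With three colors, the full $9 \times 9$ Jacobian is recovered from 4 evaluations of the function rather than the 10 a dense forward-difference scheme would need, and the count stays at 4 as $n$ grows. The same trick applies when the columns are computed with forward-mode AD instead of finite differences.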

### Interface

• AbstractDifferentiation.jl: AD backend-agnostic interface for algorithms that rely on derivatives, gradients, Jacobians, Hessians, etc.

Discussions on JuliaDiff and its uses may be directed to the Julia Discourse forum. The autodiff.org site serves as a portal for the academic community, though it is often out of date. The ChainRules project maintains a list of recommended reading/watching for those after more information. Finally, automatic differentiation techniques have been implemented in a variety of languages. If you would prefer not to use Julia, see the Wikipedia page for a comprehensive list of available packages.