NumericalFunctions¶
NumericalFunction is the central abstraction in anvil. It wraps a Python function written with tinygrad Tensor operations and turns it into a callable, code-generable unit.
Key mental model: Think of NumericalFunction as a compiled function specification. The Tensor class is only used inside function definitions to describe computations (similar to CasADi's SX/MX). Outside of function definitions, data flows as contiguous numpy arrays (similar to CasADi's DM). Users should never need to manipulate Tensors directly except to:
- Write function bodies
- Capture constant data (matrices, lookup tables, neural network weights) via Tensor(data) in closures
A NumericalFunction can be used in two ways:
- Called from Python: fn(x) accepts and returns numpy arrays, dispatching to JIT-compiled native code
- Exported to C++: av.generate_module("name", [fn]) produces standalone .hpp/.cpp files
Decorator¶
The simplest way to create a NumericalFunction is with the @av.numerical_function decorator:
import anvil as av
@av.numerical_function("my_fn", 1024)
def my_fn(x: av.Tensor) -> av.Tensor:
return 0.5 * x.square()
The arguments after the name are the input shapes; for multiple inputs, pass one shape per input.
Direct construction¶
For more control (e.g. custom dtypes, 2D inputs), construct NumericalFunction directly:
fn = av.NumericalFunction(
"my_function",
lambda x: 0.5 * x.square(),
(av.Arg(1024),), # input shape (float64 by default)
)
The Arg constructor specifies the shape and optionally the dtype of each input:
av.Arg(1024) # 1D, float64 (default)
av.Arg((3, 3)) # 2D, float64
av.Arg(1024, dtype=av.dtypes.float32) # 1D, float32
av.Arg(0) # zero-length (used for unused parameters)
Calling convention¶
NumericalFunction.__call__ performs lazy JIT compilation on first use: the function is traced, C++ source is generated, compiled to a shared library via clang++, and loaded via ctypes. Subsequent calls dispatch directly to native code.
import anvil as av
import numpy as np
fn = av.NumericalFunction("add", lambda x, y: x + y, (av.Arg(3), av.Arg(3)))
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
result = fn(x, y) # array([5., 7., 9.])
# Multiple outputs returned as tuple of numpy arrays
fn2 = av.NumericalFunction("split", lambda x, y: (x + y, x * y), (av.Arg(3), av.Arg(3)))
sum_result, prod_result = fn2(x, y)
# Scalar outputs
fn3 = av.NumericalFunction("norm", lambda x: (x * x).sum(), (av.Arg(3),))
s = fn3(x) # np.float64
Inputs are cast to the declared dtype and made contiguous. Outputs are always C-contiguous numpy arrays.
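The input normalization can be illustrated with plain numpy (this sketch is independent of anvil and only shows the equivalent preprocessing, not the library's internals):

```python
import numpy as np

# A non-contiguous int32 view: a strided slice of an array.
raw = np.arange(6, dtype=np.int32)[::2]
assert not raw.flags["C_CONTIGUOUS"]

# The preprocessing is equivalent to casting to the declared dtype
# and forcing a C-contiguous copy before dispatching to native code:
prepared = np.ascontiguousarray(raw, dtype=np.float64)
assert prepared.flags["C_CONTIGUOUS"]
assert prepared.dtype == np.float64
```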
Capturing constants¶
Constants (matrices, weights, lookup tables) are captured from closures. They are embedded as static constexpr arrays in the generated C++ and copied into workspace buffers at initialization.
import anvil as av
import numpy as np
A = av.Tensor([[1.0, 2.0], [3.0, 4.0]]) # captured constant
fn = av.NumericalFunction("matmul", lambda x: A @ x, (av.Arg(2),))
fn(np.array([1.0, 1.0])) # array([3., 7.])
In the generated C++, the matrix A is embedded as a static constexpr double[] array and copied into the workspace during init_ws().
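The capture semantics are those of ordinary Python closures; here is a plain-numpy sketch of the same computation (no anvil required) that checks the result above:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])  # constant captured by the closure
matmul = lambda x: A @ x                # closes over A, like the Tensor version

result = matmul(np.array([1.0, 1.0]))
assert np.allclose(result, [3.0, 7.0])  # matches array([3., 7.]) above
```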
SparseNumericalFunction¶
For functions with sparse outputs (typically Jacobians and Hessians), SparseNumericalFunction stores the output as a flat array of non-zero values alongside CSC (Compressed Sparse Column) metadata.
You usually don't construct SparseNumericalFunction directly -- instead, use the AD operators:
from anvil.ad import spjacobian, sphessian
fn = av.NumericalFunction("f", my_fn, (av.Arg(n),))
jac_fn = spjacobian(fn) # SparseNumericalFunction
hess_fn = sphessian(fn) # SparseNumericalFunction (upper triangular)
A SparseNumericalFunction has additional attributes:
- shape: (rows, cols) of the full sparse matrix
- nnz: number of non-zero elements
- indices: row indices (CSC innerIndices)
- indptr: column pointers (CSC outerStarts)
When called, it returns a flat np.ndarray of nnz values. To reconstruct the sparse matrix:
import scipy.sparse as sp
values = jac_fn(x)
J = sp.csc_array(
(values, jac_fn.indices, jac_fn.indptr),
shape=jac_fn.shape
)
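To make the CSC layout concrete, here is a self-contained scipy example with a small hand-written matrix (no anvil needed); the roles of the flat value array, indices, and indptr are exactly those described above:

```python
import numpy as np
import scipy.sparse as sp

# The 2x3 matrix [[1, 0, 2],
#                 [0, 3, 0]] in CSC form:
values = np.array([1.0, 3.0, 2.0])  # non-zeros, stored column by column
indices = np.array([0, 1, 0])       # row index of each value (innerIndices)
indptr = np.array([0, 1, 2, 3])     # start of each column in values (outerStarts)

M = sp.csc_array((values, indices, indptr), shape=(2, 3))
assert np.allclose(M.toarray(), [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]])
```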
In the generated C++ header, the sparsity pattern is exposed as static constexpr arrays:
namespace my_fn_jac {
static constexpr int rows = 100;
static constexpr int cols = 50;
static constexpr int nnz = 200;
static constexpr int innerIndices[200] = { /* row indices */ };
static constexpr int outerStarts[51] = { /* column pointers */ };
typedef Buffer<double, 200> OUT0_t; // flat array of nnz values
}
Reconstructing with Eigen (Eigen is not required -- any sparse matrix library works):
using SpMat = Eigen::SparseMatrix<double, Eigen::ColMajor, int>;
Eigen::Map<const SpMat> J(
my_fn_jac::rows, my_fn_jac::cols, my_fn_jac::nnz,
my_fn_jac::outerStarts, my_fn_jac::innerIndices, out.data
);
Generic typing¶
NumericalFunction is a generic class parameterized by its input and output types. The constructor has overloads for 1-4 inputs with single or dual outputs, which means your LSP (e.g. Pylance, Pyright) can:
- Infer the number of inputs: if you create NumericalFunction("f", fn, (Arg(3), Arg(3))), the type is NumericalFunction[tuple[Arg, Arg], Tensor] and the LSP will flag calls with the wrong number of arguments.
- Distinguish single vs tuple returns: fn(x) returns np.ndarray for single-output functions and tuple[np.ndarray, ...] for multi-output functions.
Limitations:
- Overloads cover up to 4 inputs and up to 2 outputs. Functions with more inputs/outputs will have a less precise type (NumericalFunction[tuple[Arg, ...], Tensor | tuple[Tensor, ...]]).
- The @av.numerical_function decorator returns a NumericalFunction, but the LSP may not infer the specific generic parameters from the decorator arguments.
- AD operators like gradient(fn) return a NumericalFunction with the same input type, but the output type may not be precisely tracked.
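The overload mechanism behind this inference can be sketched with a toy generic class (illustrative only, not anvil's actual definition; `Fn` and its overloads are made-up names):

```python
from typing import Generic, TypeVar, overload

In_ = TypeVar("In_")
Out = TypeVar("Out")

class Fn(Generic[In_, Out]):
    """Toy stand-in showing how per-arity __init__ overloads let a type
    checker infer the generic parameters from the constructor call."""

    @overload
    def __init__(self: "Fn[tuple[int], float]", a: int, /) -> None: ...
    @overload
    def __init__(self: "Fn[tuple[int, int], float]", a: int, b: int, /) -> None: ...
    def __init__(self, *shapes: int) -> None:
        self.shapes = shapes

f = Fn(3, 3)  # a checker following the overloads infers Fn[tuple[int, int], float]
assert f.shapes == (3, 3)
```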