Multistage Optimal Control¶

For optimal control problems (OCPs) with a time horizon structure, MultistageProblem exploits the block-tridiagonal sparsity pattern for efficient solving.

Problem structure¶

A multistage OCP has the form:

\[ \begin{align*} \min_{z_0, \ldots, z_N} & \quad \sum_{i=0}^{N} \ell_i(z_i, p_i) \\ \text{s.t.} & \quad g_\mathrm{initial}(z_0, p_0) = 0, \\ & \quad g_\mathrm{dynamics}(z_i, z_{i+1}, p_i) = 0, \quad i = 0, \ldots, N-1, \\ & \quad g_\mathrm{stage}(z_i, p_i) \leq 0, \quad i = 1, \ldots, N-1, \\ & \quad g_\mathrm{terminal}(z_N, p_N) = 0 \end{align*} \]

where \(z_i \in \mathbb{R}^{n_z}\) is the stage variable (typically stacking state and control) and \(p_i\) is the per-stage parameter.

Defining stage functions¶

Each stage function is a regular NumericalFunction. The input conventions depend on the function type:

import anvil as av

nz = 6   # stage dimension (e.g., [x, y, phi, v, T, delta])
np_ = 4  # per-stage parameter dimension
ng = 0   # global parameter dimension

# Initial constraint: (z_0, p_global, p_0) -> R^m
initial_fn = av.NumericalFunction(
    "initial", lambda z, pg, p: z[:4] - p[:4],
    (av.Arg(nz), av.Arg(ng), av.Arg(np_)),
)

# Dynamics: (z_i, z_{i+1}, p_global, p_i) -> R^m
dynamics_fn = av.NumericalFunction(
    "dynamics", lambda z, znext, pg, p: znext[:4] - rk4(z, p),
    (av.Arg(nz), av.Arg(nz), av.Arg(ng), av.Arg(np_)),
)

# Stage constraint: (z_i, p_global, p_i) -> R^m
stage_fn = av.NumericalFunction(
    "stage", lambda z, pg, p: z[4:] - bounds,
    (av.Arg(nz), av.Arg(ng), av.Arg(np_)),
)

# Terminal constraint: (z_N, p_global, p_N) -> R^m
terminal_fn = av.NumericalFunction(
    "terminal", lambda z, pg, p: z[:4] - target,
    (av.Arg(nz), av.Arg(ng), av.Arg(np_)),
)

Creating a MultistageProblem¶

from anvil.opti import MultistageProblem, SparseSQPFunction, SQPSettings

N = 10  # horizon length

problem = MultistageProblem(
    N=N,
    nz=nz,
    cost_initial_fn=cost_initial,
    cost_stage_fn=cost_stage,
    cost_terminal_fn=cost_terminal,
    eq_initial_fn=initial_fn,
    eq_interstage_fn=dynamics_fn,
    eq_stage_fn=None,
    eq_terminal_fn=terminal_fn,
    ineq_stage_fn=stage_fn,
    z_lb=av.Tensor([...]),
    z_ub=av.Tensor([...]),
    param_dim=np_,
    global_param_dim=ng,
)

solver = SparseSQPFunction(
    name="ocp_solver",
    problem=problem,
    settings=SQPSettings(max_iter=50),
)

MultistageProblem extends SparseSQPProblem -- it automatically:

Assembles per-stage constraints into the global constraint vector
Computes the block-tridiagonal sparse Jacobian structure
Generates efficient sparse Hessian via per-block Jacobians

Parameters¶

The parameter vector p passed to the solver is structured as:

p = [p_global, p_0, p_1, ..., p_N]

where p_global has dimension global_param_dim and each p_i has dimension param_dim. The total parameter length is global_param_dim + param_dim * (N + 1).

NLS variant¶

For problems where the cost is a sum of squared residuals, NLSMultistageProblem uses a Gauss-Newton Hessian approximation (\(J^\top J\)) instead of the exact Lagrangian Hessian:

from anvil.opti import NLSMultistageProblem

problem = NLSMultistageProblem(
    N=N, nz=nz,
    residual_initial_fn=res_initial,
    residual_stage_fn=res_stage,
    residual_terminal_fn=res_terminal,
    eq_interstage_fn=dynamics_fn,
    # ... same constraint options as MultistageProblem
)

The NLS variant is typically more efficient when the residual Jacobians are sparse, as the Gauss-Newton Hessian is computed directly from the sparse residual Jacobians without forming the full Lagrangian.

Discretization helpers¶

anvil provides integrators for discretizing continuous-time dynamics:

from anvil.discretization import euler, rk2, rk4

# Continuous dynamics: dx/dt = f(x, u)
continuous_fn = av.NumericalFunction("f", dynamics, (av.Arg(nx), av.Arg(nu)))

# Discrete dynamics: x_{k+1} = F(x_k, u_k)
discrete_fn = rk4(continuous_fn, dt=0.1)

Available integrators: euler, rk2, rk4.

Exporting to C++¶

av.generate_module("ocp_module", [solver])

The generated interface is the same as for regular SQP solvers (see Optimization), with the decision variable being the stacked vector \([z_0, z_1, \ldots, z_N]\) and the parameter being \([p_\mathrm{global}, p_0, \ldots, p_N]\).