Part I · Foundations Week 6 Published

Implicit methods, stiff systems, symplectic integration

When explicit methods fail — backward Euler, BDF, DIRK as A-stable alternatives; exponential integrators as the family of choice for selective SSMs; Hamiltonian mechanics and symplectic integration as the natural geometric language for hidden-state dynamics.

On this page
  1. 6.1 Why explicit methods fail: stiffness
  2. 6.2 Implicit methods: backward Euler, BDF, DIRK
  3. 6.3 Exponential integrators revisited
  4. 6.4 Hamiltonian mechanics, energy conservation, and the symplectic structure
  5. 6.5 Symplectic integrators
  6. 6.6 Toward selective SSMs as Hamiltonian flows
  7. 6.7 What’s next
  8. 6.8 Exercises
  9. Exercise 6.1 (computation)
  10. Exercise 6.2 (computation)
  11. Exercise 6.3 (computation + code)
  12. Exercise 6.4 (theory) — solution in §6.9
  13. Exercise 6.5 (theory) — solution in §6.9
  14. Exercise 6.6 (theory) — solution in §6.9
  15. 6.9 Full solutions to theory exercises
  16. Solution to Exercise 6.4
  17. Solution to Exercise 6.5
  18. Solution to Exercise 6.6
  19. 6.10 Companion code

Implicit methods, stiff systems, symplectic integration

6.1 Why explicit methods fail: stiffness

Chapter 5 ended with the Dahlquist barrier: no explicit Runge–Kutta method is A-stable. The barrier matters because of a phenomenon called stiffness — when the dynamics matrix A\statemat has eigenvalues with very different magnitudes.

A linear ODE h˙=Ah\dot \statevec = \statemat \statevec is stiff if A\statemat has at least one eigenvalue with Re(λi)1/T\operatorname{Re}(\lambda_i) \le -1/T for a time-scale TT much smaller than the simulation horizon, and the eigenvalue is paired with a long-lived (small-λ\abs{\lambda}) mode that one actually wants to track. Stiffness is therefore relative: a system is stiff with respect to a given horizon and given accuracy goal, not in absolute terms. The Robertson chemical-kinetics problem (λ\lambda‘s spanning ten orders of magnitude) is the canonical textbook example; van der Pol at parameter μ1\mu \gg 1 is a popular ML-adjacent stand-in because it generates a nonlinear oscillation whose slow and fast phases differ by μ\mu.

For an explicit method, the step size is bounded by the fastest eigenvalue: Δ2/λmax\stepsize \lesssim 2/\abs{\lambda_{\max}} (forward Euler) or some other small constant times 1/λmax1/\abs{\lambda_{\max}} (RK4). If the slow eigenvalue we want to track is λmin\abs{\lambda_{\min}}, the integrator takes λmax/λmin\abs{\lambda_{\max}}/\abs{\lambda_{\min}} times more steps than the slow time scale would suggest. For Robertson this ratio is 101010^{10}. The simulation never finishes.

For an SSM, stiffness arises naturally for input-dependent dynamics. Mamba’s selective scan generates an effective A(ut)\statemat(u_t) at each token; for inputs that produce eigenvalues with very different magnitudes, the time-discretized recurrence is solving a stiff problem at each step. If the discretization is not A-stable, the recurrence either blows up or requires a step size Δ1\stepsize \ll 1 — at which point you are running the model in an absurd regime. This is the structural reason Mamba-3 switched to exponential-trapezoidal (§4.5), which is A-stable in A\statemat but explicit in uu.

6.2 Implicit methods: backward Euler, BDF, DIRK

The fix for stiffness is to abandon explicit methods. Backward Euler is the simplest implicit method:

hk+1=hk+Δf(hk+1).\statevec_{k+1} = \statevec_k + \stepsize f(\statevec_{k+1}).

Solving for hk+1\statevec_{k+1} requires either a linear solve (for linear ff) or a Newton iteration (for nonlinear ff). The pay-off is the stability function

\stabfnBE(z)=11z,\stabfn_{\text{BE}}(z) = \frac{1}{1 - z},

which has \stabfnBE(z)1\abs{\stabfn_{\text{BE}}(z)} \le 1 for every zz with Re(z)0\operatorname{Re}(z) \le 0 — backward Euler is A-stable. It is also L-stable: \stabfnBE(z)0\stabfn_{\text{BE}}(z) \to 0 as Re(z)\operatorname{Re}(z) \to -\infty, so fast modes get damped aggressively as desired on stiff problems.

.

(Backward Euler is L-stable.) Backward Euler’s stability function \stabfn(z)=1/(1z)\stabfn(z) = 1/(1-z) satisfies \stabfn(z)1\abs{\stabfn(z)} \le 1 for every zCz \in \C with Re(z)0\operatorname{Re}(z) \le 0, and \stabfn(z)0\stabfn(z) \to 0 as Re(z)\operatorname{Re}(z) \to -\infty.

The proof is one line each: 1z2=(1Re(z))2+Im(z)21\abs{1 - z}^2 = (1 - \operatorname{Re}(z))^2 + \operatorname{Im}(z)^2 \ge 1 when Re(z)0\operatorname{Re}(z) \le 0, giving \stabfn(z)1\abs{\stabfn(z)} \le 1. The limit is immediate.

Backward Euler is first-order accurate. Higher-order implicit methods come in several families.

BDFkk (Backward Differentiation Formula, kkth-order) is a multi-step method that approximates h˙(tk+1)\dot \statevec(t_{k+1}) by a kk-point backward difference and equates it to f(hk+1)f(\statevec_{k+1}). BDF1 = backward Euler; BDF2 is second-order and L-stable; BDFkk for k=3,,6k = 3, \ldots, 6 are conditionally stable but widely used in stiff ODE codes. BDF7+ is unstable. The order limit is the second Dahlquist barrier, a structural fact about multi-step methods that we will state without proof: no AA-stable linear multi-step method has order higher than 2.

Diagonally implicit RK (DIRK) sits between explicit RK and fully implicit RK. The AA matrix is lower triangular with non-zero diagonal, so each stage requires solving an independent nonlinear system (cheaper than the coupled system of fully implicit RK, but more expensive than the trivial explicit evaluations). Singly-DIRK (SDIRK) further constrains all diagonal entries to be equal, so the Newton Jacobian factorizes once per step. The Crank–Nicolson scheme is a 2-stage SDIRK of order 2.

Fully implicit RK — the most expensive family — includes Gauss–Legendre methods, which we will return to in §6.5 as the optimal symplectic methods.

The computational trade-off for implicit methods: each step costs O(N3)O(N^3) (linear solve for the Jacobian factor) or O(iterationsN2)O(\text{iterations} \cdot N^2) (Newton iteration for nonlinear ff). For SSMs with N=16,64N = 16, 64 this is acceptable; for NN in the thousands it begins to dominate. The Chapter 4 exponential-trapezoidal scheme has the same A-stability as backward Euler but only costs one matrix-exponential per timestep (vs one Newton solve), and the matrix exponential of a low-rank or diagonalized A\statemat is cheap. Mamba-3’s specific tactical advantage over a backward-Euler implementation comes down to this cost asymmetry.

6.3 Exponential integrators revisited

The exp-trapezoidal scheme of §4.5 is the simplest member of the exponential integrator family. Recall the general idea: given h˙=Ah+Bu(t)\dot \statevec = \statemat \statevec + \inputmat u(t), treat the linear part exactly via eAΔe^{\statemat \stepsize} and approximate only the forcing integral. The φ\varphi-function family φk(z)=(ezj=0k1zj/j!)/zk\varphi_k(z) = (e^z - \sum_{j=0}^{k-1} z^j/j!)/z^k provides building blocks for arbitrarily high order.

Three commonly used schemes:

  • exp-Euler: hk+1=eAΔhk+Δφ1(AΔ)Buk\statevec_{k+1} = e^{\statemat \stepsize} \statevec_k + \stepsize \varphi_1(\statemat \stepsize) \inputmat u_k. Treats the input as piecewise constant (like ZOH); first-order for forced systems, exact on autonomous problems.
  • exp-midpoint: uses the input midpoint instead of uku_k; second-order for symmetric input perturbations.
  • exp-trapezoidal (§4.5): linear interpolation of the input; second-order for C1C^1 inputs.
  • ETDRK4 Hochbruck & Ostermann (2010) : a four-stage exponential RK method that achieves fourth order. The Mamba-3 paper does not (yet) use ETDRK4; its second-order exp-trapezoidal is the empirical sweet spot.

All exponential integrators inherit A-stability from eAΔe^{\statemat \stepsize}. They sidestep the Dahlquist barrier because their stability function is not a polynomial — it is eze^z times a rational-in-zz correction factor — and the polynomial constraint of explicit RK methods does not apply.

The implementation cost: one matrix exponential plus one or two φ\varphi-function evaluations per step. For a low-rank A\statemat (which selective SSMs typically have, since the structured-A\statemat pattern of Chapter 7 carries forward), the matrix exponential is cheap. For dense general A\statemat exponentials cost O(N3)O(N^3) via Padé / scaling-and-squaring, the same as a Newton solve — so the choice between exp-trap and backward Euler often comes down to whether you want autonomous-exactness (exp-trap wins) or L-stability damping of high-frequency modes (backward Euler wins).

6.4 Hamiltonian mechanics, energy conservation, and the symplectic structure

The remainder of this chapter introduces the language of Hamiltonian systems and symplectic integration. This is where the direct-transfer hook from continuum mechanics applies most cleanly. The exposition is brief and standalone; for the full theory see Hairer–Lubich–Wanner Hairer & Wanner (1996) Volume II Chapter VI.

A Hamiltonian system on phase space (q,p)Rn×Rn(q, p) \in \R^n \times \R^n is governed by a scalar function \hamilton(q,p)\hamilton(q, p) — the Hamiltonian, intended to represent total energy — through Hamilton’s equations:

q˙=\hamiltonp,p˙=\hamiltonq.\dot q = \frac{\partial \hamilton}{\partial p}, \qquad \dot p = -\frac{\partial \hamilton}{\partial q}.

The canonical example is the harmonic oscillator \hamilton(q,p)=12(p2+ω2q2)\hamilton(q, p) = \tfrac{1}{2}(p^2 + \omega^2 q^2), giving q˙=p\dot q = p, p˙=ω2q\dot p = -\omega^2 q. The pendulum is \hamilton(q,p)=12p2cos(q)\hamilton(q, p) = \tfrac{1}{2} p^2 - \cos(q); vortex-dynamics flows are higher-dimensional but structurally identical.

Three properties make Hamiltonian systems geometrically special.

Energy conservation. Along any trajectory (q(t),p(t))(q(t), p(t)), ddt\hamilton(q(t),p(t))=\hamiltonqq˙+\hamiltonpp˙=\hamiltonq\hamiltonp\hamiltonp\hamiltonq=0\frac{d}{dt} \hamilton(q(t), p(t)) = \frac{\partial \hamilton}{\partial q} \dot q + \frac{\partial \hamilton}{\partial p} \dot p = \frac{\partial \hamilton}{\partial q} \frac{\partial \hamilton}{\partial p} - \frac{\partial \hamilton}{\partial p} \frac{\partial \hamilton}{\partial q} = 0. The Hamiltonian is a constant of motion.

Phase-space volume preservation (Liouville’s theorem). The flow Φt:(q0,p0)(q(t),p(t))\Phi_t : (q_0, p_0) \mapsto (q(t), p(t)) has Jacobian determinant 1 everywhere: detΦt1\det \nabla \Phi_t \equiv 1. Volumes in phase space are preserved exactly under the dynamics.

Symplectic structure. Define the symplectic 2-form \symform=idqidpi\symform = \sum_i dq_i \wedge dp_i. The flow Φt\Phi_t is a symplectic transformation: it preserves \symform\symform in the sense that Φt\symform=\symform\Phi_t^* \symform = \symform where Φt\Phi_t^* denotes the pullback. Symplecticity is a strict refinement of volume preservation — every symplectic transformation is volume-preserving, but not vice versa.

The reason these properties matter for numerics is that standard order-optimized integrators destroy them. A fourth-order RK method simulates the harmonic oscillator with O(Δ4)O(\stepsize^4) local error per step — but it does not exactly preserve energy. Over T/ΔT/\stepsize steps, the energy error accumulates as O(Δ3)O(\stepsize^3) per period times the number of periods — leading to linear-in-time energy drift that eventually destroys the trajectory’s qualitative behavior. RK4 on a pendulum will, over 1000 periods, gradually inflate the orbit and unbound the system. RK4 on a planetary orbit will (eventually) eject the planet. These are not numerical instabilities in the Chapter 5 sense — the method is A-stable for the small-amplitude regime — they are geometric errors: the wrong qualitative dynamics.

6.5 Symplectic integrators

A symplectic integrator is a numerical map ΨΔ:(q,p)(q,p)\Psi_\stepsize : (q, p) \mapsto (q', p') that preserves the symplectic 2-form: ΨΔ\symform=\symform\Psi_\stepsize^* \symform = \symform. Symplectic integrators do not, in general, preserve \hamilton\hamilton exactly — but they preserve a nearby modified Hamiltonian \hamilton~=\hamilton+O(Δp)\widetilde \hamilton = \hamilton + O(\stepsize^p) for a pp-th order symplectic method. As a consequence, the energy \hamilton(qk,pk)\hamilton(q_k, p_k) along the discrete trajectory oscillates within a bounded band of width O(Δp)O(\stepsize^p) around its initial value — not drifting linearly with tt, but staying near a level set of \hamilton~\widetilde \hamilton. This is the backward error analysis of geometric integrators, and it is the practical reason symplectic methods are mandatory for long-horizon Hamiltonian simulations.

Three canonical schemes.

Symplectic Euler (1st order). For a separable Hamiltonian \hamilton(q,p)=T(p)+V(q)\hamilton(q, p) = T(p) + V(q):

pk+1=pkΔV(qk),qk+1=qk+ΔT(pk+1).p_{k+1} = p_k - \stepsize \, V'(q_k), \qquad q_{k+1} = q_k + \stepsize \, T'(p_{k+1}).

Update momentum first using position-derived force; then update position using the updated momentum. The asymmetry — using pk+1p_{k+1} in the second step rather than pkp_k — is what makes the scheme symplectic. A symmetric version updates qq first, then pp; both are valid.

Störmer–Verlet (2nd order, symmetric). A symmetric version of symplectic Euler that does a half-step in momentum, full step in position, half-step in momentum:

pk+1/2=pkΔ2V(qk),qk+1=qk+ΔT(pk+1/2),pk+1=pk+1/2Δ2V(qk+1).\begin{aligned} p_{k+1/2} &= p_k - \tfrac{\stepsize}{2} V'(q_k), \\ q_{k+1} &= q_k + \stepsize \, T'(p_{k+1/2}), \\ p_{k+1} &= p_{k+1/2} - \tfrac{\stepsize}{2} V'(q_{k+1}). \end{aligned}

This is the algorithm used in essentially every molecular-dynamics simulation. Second-order accurate, symplectic, time-reversible (the same algorithm run with Δ-\stepsize exactly reverses the trajectory). The energy error is bounded by O(Δ2)O(\stepsize^2) over arbitrarily long horizons.

Gauss–Legendre IRK (order 2s2s, A-stable, symplectic). The ss-stage Gauss–Legendre Runge–Kutta method has nodes at the Gauss–Legendre quadrature points and is the unique ss-stage RK method of order 2s2s. The 1-stage method is the implicit midpoint rule (order 2, symplectic, A-stable). The 2-stage method is order 4. Gauss–Legendre IRK is the unique family that is simultaneously A-stable, symplectic, and of optimal order — making it the gold standard for high-accuracy long-horizon Hamiltonian simulation.

Energy of a harmonic oscillator over 100 periods, integrated with classical RK4 (linear drift) versus Störmer-Verlet (bounded oscillation).
Long-horizon energy comparison on the harmonic oscillator $\\dot q = p$, $\\dot p = -q$ over 100 periods at $\\Delta = 0.05$. Classical RK4 (gold) accumulates linear-in-time energy drift (~$10^{-3}$ after 100 periods despite local error $O(\\Delta^4)$). Störmer-Verlet (navy) oscillates within a band of width $O(\\Delta^2)$ but the running average stays at the initial energy. The qualitative difference between linear drift and bounded oscillation is the defining property of symplectic vs non-symplectic integrators. Produced by companions/ch06/jax/symplectic_demo.py.
.

(Modified Hamiltonian for symplectic methods.) For a symplectic integrator ΨΔ\Psi_\stepsize of order pp applied to a Hamiltonian system, there exists a modified Hamiltonian \hamilton~=\hamilton+ΔpHp+Δp+1Hp+1+\widetilde \hamilton = \hamilton + \stepsize^p H_p + \stepsize^{p+1} H_{p+1} + \cdots such that the discrete trajectory exactly preserves \hamilton~\widetilde \hamilton to within exponentially small terms over exponentially long times. The original \hamilton\hamilton therefore oscillates within a bounded O(Δp)O(\stepsize^p) band along the discrete trajectory.

The proof — Hairer–Lubich–Wanner Chapter IX — uses backward error analysis to expand the modified Hamiltonian as an asymptotic series in Δ\stepsize and shows the series can be truncated with controlled remainder. The takeaway for practice: a pp-th order symplectic integrator is exponentially better than a pp-th order non-symplectic one on Hamiltonian problems, even though the local truncation error of both is O(Δp+1)O(\stepsize^{p+1}). Local error is misleading for long-horizon problems; geometric structure is what matters.

6.6 Toward selective SSMs as Hamiltonian flows

The C1 niche pilot proposes the following research program. The hidden-state dynamics of a selective SSM can sometimes be written as a Hamiltonian flow on the hidden-state manifold; when they can, applying a symplectic integrator should reduce long-horizon error in a way that order-optimized integrators cannot match. The first step is to characterize when this Hamiltonian structure exists.

For a linear-time-invariant SSM h˙=Ah\dot \statevec = \statemat \statevec, the dynamics are Hamiltonian iff there exists a symmetric positive-definite PP such that AP+PA=0\statemat^\top P + P \statemat = 0. Equivalently, A\statemat is similar (via P1/2P^{1/2}) to a skew-symmetric matrix. The quadratic Hamiltonian is then \hamilton(h)=12hPh\hamilton(\statevec) = \tfrac{1}{2} \statevec^\top P \statevec, and the symplectic 2-form is \symform=P1\symform = P^{-1} (in the standard symplectic-flat coordinates). For HiPPO-LegS the eigenvalues are all real-negative, not purely imaginary, so HiPPO-LegS dynamics are dissipative, not Hamiltonian. Symplectic methods do not apply. This is consistent with the empirical observation that ZOH and bilinear discretizations of HiPPO-LegS perform well — dissipation handles the long-horizon stability story.

For a selective SSM with input-dependent A(u)\statemat(u), the relevant question becomes: does A(u)\statemat(u) have a Hamiltonian decomposition for some range of inputs, even if not for all? The C1 pilot’s empirical hypothesis is that during certain training phases the learned A(u)\statemat(u) matrices drift toward eigenvalue spectra that are closer to purely imaginary than to the LHP, and that this drift is associated with the long-horizon recall artifacts observed in Mamba’s failure modes Anonymous (2025) . If this hypothesis holds, then symplectic discretization of selective SSMs should reduce those artifacts. The Chapter 17 pilot integration develops this thread; this chapter has supplied the necessary integration-theory vocabulary.

Phase portrait of the pendulum: closed orbits preserved by symplectic Euler, spirals (energy drift) under RK4.
Pendulum $\\hamilton = p^2/2 - \\cos q$ phase portrait over 50 periods at $\\Delta = 0.1$ (large step size, chosen to expose qualitative differences). Symplectic Euler (navy): trajectory stays on a closed orbit (constant modified energy). RK4 (gold): orbit slowly spirals outward (linear energy drift). Both methods are formally stable; the qualitative difference is geometric. Produced by companions/ch06/jax/symplectic_demo.py.

6.7 What’s next

This chapter completes the foundations layer of the book (Part I). The next part — Chapters 7–10 — introduces structured SSMs (S4, S4D, S5, Mamba-1/2, Mamba-3) as discretizations of the continuous systems developed here. Chapter 7 sets up HiPPO theory; Chapter 8 derives the S4 / S4D / S5 architectures and shows where bilinear discretization shines; Chapter 9 introduces selective scans (Mamba-1/2); Chapter 10 brings the exp-trapezoidal scheme of §4.5 to its home in Mamba-3.

The C1 pilot integration in Chapter 17 returns to this chapter’s symplectic story, applying the modified-Hamiltonian framework to learned selective dynamics. Readers focused on the SSM applications can skip to Chapter 7 now and consult §6.5–§6.6 when the symplectic discussion comes back.

6.8 Exercises

Six problems. Short/numerical (6.1–6.3) have inline collapsible solutions; long/proof exercises (6.4–6.6) have full worked solutions in §6.9.

Exercise 6.1 (computation)

Verify that backward Euler’s stability function is \stabfnBE(z)=1/(1z)\stabfn_{\text{BE}}(z) = 1/(1-z) by directly applying the scheme to the test problem y˙=λy\dot y = \lambda y and solving for yk+1y_{k+1}.

Solution

Backward Euler: yk+1=yk+Δλyk+1y_{k+1} = y_k + \stepsize \lambda y_{k+1}. Rearranging, (1Δλ)yk+1=yk(1 - \stepsize \lambda) y_{k+1} = y_k, so yk+1=yk/(1Δλ)y_{k+1} = y_k / (1 - \stepsize \lambda). Setting z=Δλz = \stepsize \lambda, the stability function is \stabfnBE(z)=1/(1z)\stabfn_{\text{BE}}(z) = 1/(1 - z). ∎

Exercise 6.2 (computation)

For the harmonic oscillator q˙=p\dot q = p, p˙=q\dot p = -q (so \hamilton=(p2+q2)/2\hamilton = (p^2 + q^2)/2), compute one step of symplectic Euler (“p first” variant) starting from (q0,p0)=(1,0)(q_0, p_0) = (1, 0) at Δ=0.1\stepsize = 0.1. What are (q1,p1)(q_1, p_1), and what is \hamilton(q1,p1)\hamilton(q_1, p_1)?

Solution

Symplectic Euler (pp first): pk+1=pkΔqkp_{k+1} = p_k - \stepsize \cdot q_k, then qk+1=qk+Δpk+1q_{k+1} = q_k + \stepsize \cdot p_{k+1}.

Starting from (1,0)(1, 0): p1=00.11=0.1p_1 = 0 - 0.1 \cdot 1 = -0.1. Then q1=1+0.1(0.1)=0.99q_1 = 1 + 0.1 \cdot (-0.1) = 0.99. So (q1,p1)=(0.99,0.1)(q_1, p_1) = (0.99, -0.1).

Energy: \hamilton(q1,p1)=(0.992+0.12)/2=(0.9801+0.01)/2=0.49505\hamilton(q_1, p_1) = (0.99^2 + 0.1^2)/2 = (0.9801 + 0.01)/2 = 0.49505. Initial energy was 0.50.5. The energy is not exactly preserved (it changed by 0.00495Δ2/2-0.00495 \approx -\stepsize^2/2), but it is bounded — over many steps it will oscillate, not drift.

Exercise 6.3 (computation + code)

Run companions/ch06/jax/symplectic_demo.py. Verify that classical RK4 produces linear-in-time energy drift on the harmonic oscillator while Störmer–Verlet’s energy stays in a bounded band. What is the drift rate (energy per period) of RK4 at Δ=0.05\stepsize = 0.05?

Solution

The companion’s main output shows two energy-vs-time traces over 100 periods. RK4 drifts approximately linearly; the drift rate is roughly 10510^{-5} per period at Δ=0.05\stepsize = 0.05 — consistent with O(Δ5)O(\stepsize^5) local error compounded over 125\sim 125 steps per period. Störmer–Verlet’s energy stays within a band of width 103=O(Δ2)\sim 10^{-3} = O(\stepsize^2), with no secular drift. The contrast — linear drift vs bounded oscillation — is the key takeaway of §6.5.

Exercise 6.4 (theory) — solution in §6.9

Prove the L-stability of backward Euler (Theorem 6.1): \stabfnBE(z)1\abs{\stabfn_{\text{BE}}(z)} \le 1 for Re(z)0\operatorname{Re}(z) \le 0, and \stabfnBE(z)0\stabfn_{\text{BE}}(z) \to 0 as Re(z)\operatorname{Re}(z) \to -\infty.

Exercise 6.5 (theory) — solution in §6.9

Show that symplectic Euler is symplectic. That is, prove that the Jacobian (qk+1,pk+1)/(qk,pk)\partial(q_{k+1}, p_{k+1})/\partial(q_k, p_k) has determinant exactly 1 for the harmonic oscillator. (Then note: for separable Hamiltonians, the same argument applies via the chain rule.)

Exercise 6.6 (theory) — solution in §6.9

Show that for a linear-time-invariant ODE h˙=Ah\dot \statevec = \statemat \statevec with A\statemat skew-symmetric (A=A\statemat^\top = -\statemat), the function \hamilton(h)=12hh\hamilton(\statevec) = \tfrac{1}{2} \statevec^\top \statevec is a constant of motion. Use this to give a sufficient condition (in terms of A\statemat only) for an SSM to be Hamiltonian.

6.9 Full solutions to theory exercises

Solution to Exercise 6.4

Write z=r+iωz = -r + i\omega with r0r \ge 0. Then 1z2=(1+r)2+ω2\abs{1 - z}^2 = (1 + r)^2 + \omega^2. Since r0r \ge 0, (1+r)21(1 + r)^2 \ge 1, so 1z21\abs{1 - z}^2 \ge 1. Therefore \stabfnBE(z)=1/1z1\abs{\stabfn_{\text{BE}}(z)} = 1/\abs{1 - z} \le 1. ∎

For the L-stability limit: as rr \to \infty with ω\omega bounded, 1z2=(1+r)2+ω2\abs{1 - z}^2 = (1 + r)^2 + \omega^2 \to \infty, so \stabfnBE(z)0\stabfn_{\text{BE}}(z) \to 0. ∎

Solution to Exercise 6.5

For the harmonic oscillator q˙=p\dot q = p, p˙=q\dot p = -q, symplectic Euler (pp first) gives

pk+1=pkΔqk,qk+1=qk+Δpk+1=qk+Δ(pkΔqk)=(1Δ2)qk+Δpk.\begin{aligned} p_{k+1} &= p_k - \stepsize \, q_k, \\ q_{k+1} &= q_k + \stepsize \, p_{k+1} = q_k + \stepsize (p_k - \stepsize \, q_k) = (1 - \stepsize^2) q_k + \stepsize p_k. \end{aligned}

The Jacobian is

J=(qk+1,pk+1)(qk,pk)=(1Δ2ΔΔ1).J = \frac{\partial(q_{k+1}, p_{k+1})}{\partial(q_k, p_k)} = \begin{pmatrix} 1 - \stepsize^2 & \stepsize \\ -\stepsize & 1 \end{pmatrix}.

The determinant: detJ=(1Δ2)(1)Δ(Δ)=1Δ2+Δ2=1\det J = (1 - \stepsize^2)(1) - \stepsize \cdot (-\stepsize) = 1 - \stepsize^2 + \stepsize^2 = 1. ∎

So the discrete map preserves the symplectic 2-form dqdpdq \wedge dp exactly. This is the algebraic content of “symplectic Euler is symplectic” — the determinant being exactly 11, not 1+O(Δp)1 + O(\stepsize^p).

For a general separable Hamiltonian \hamilton=T(p)+V(q)\hamilton = T(p) + V(q), symplectic Euler gives pk+1=pkΔV(qk)p_{k+1} = p_k - \stepsize V'(q_k) and qk+1=qk+ΔT(pk+1)q_{k+1} = q_k + \stepsize T'(p_{k+1}). The Jacobian factorizes as

J=(10ΔV(qk)1)(1ΔT(pk+1)01),J = \begin{pmatrix} 1 & 0 \\ -\stepsize V''(q_k) & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 & \stepsize T''(p_{k+1}) \\ 0 & 1 \end{pmatrix},

each of which has determinant 1 (lower- and upper-triangular with 1s on the diagonal). The product determinant is therefore 1. ∎

This “shear-decomposition” argument generalizes to all symplectic integrators that update qq and pp in alternating steps: each individual sub-step is a shear, with unit determinant. Verlet, leapfrog, Yoshida composition methods — all are products of unit-determinant shears.

Solution to Exercise 6.6

For h˙=Ah\dot \statevec = \statemat \statevec with A=A\statemat^\top = -\statemat, compute

ddt\hamilton(h)=ddt12hh=12(h˙h+hh˙)=12(hAh+hAh)=12h(A+A)h=0,\frac{d}{dt} \hamilton(\statevec) = \frac{d}{dt} \tfrac{1}{2} \statevec^\top \statevec = \tfrac{1}{2} (\dot \statevec^\top \statevec + \statevec^\top \dot \statevec) = \tfrac{1}{2} (\statevec^\top \statemat^\top \statevec + \statevec^\top \statemat \statevec) = \tfrac{1}{2} \statevec^\top (\statemat^\top + \statemat) \statevec = 0,

since A+A=0\statemat^\top + \statemat = 0 by skew-symmetry. So \hamilton(h(t))\hamilton(h0)\hamilton(\statevec(t)) \equiv \hamilton(\statevec_0) — the squared norm is exactly preserved. ∎

Sufficient condition for an SSM to be Hamiltonian. Any SSM whose continuous dynamics matrix A\statemat is skew-symmetric is Hamiltonian with Hamiltonian \hamilton(h)=12hh\hamilton(\statevec) = \tfrac{1}{2} \statevec^\top \statevec. The eigenvalues of a real skew-symmetric matrix are purely imaginary (or zero), so this regime is exactly the purely oscillatory corner of the SSM landscape — the opposite of the LHP-stable corner where HiPPO and S4 live. A more general sufficient condition: A\statemat has a Hamiltonian structure (in the matrix-Lie-algebra sense) — equivalently, there exists a symmetric positive-definite PP with AP+PA=0\statemat^\top P + P \statemat = 0. The C1 pilot’s empirical project is to identify selective-SSM regimes where this condition holds approximately, and to apply symplectic discretization there.

6.10 Companion code

Two JAX companions and two Julia companions for Chapter 6.

JAX (companions/ch06/jax/):

  • stiff_demo.py — van der Pol oscillator at μ=30\mu = 30 (mildly stiff); compares explicit RK4 (blows up at large Δ\stepsize) against backward Euler (stable at every Δ\stepsize); emits stiff_blowup.png.
  • symplectic_demo.py — harmonic oscillator + pendulum; runs classical RK4 vs Störmer–Verlet over 100 periods; emits energy_drift.png and phase_portrait.png.

Julia (companions/ch06/julia/):

  • implicit_methods.jl — backward Euler, BDF2, and DIRK from-scratch implementations on a stiff test problem; uses only LinearAlgebra + Printf.
  • symplectic_methods.jl — symplectic Euler, Störmer–Verlet, and the 2-stage Gauss–Legendre IRK method on a Hamiltonian test; emits empirical order-of-accuracy table.
  • Project.toml / Manifest.toml.

To run from the repo root:

# JAX
PYTHONPATH=. python companions/ch06/jax/stiff_demo.py
PYTHONPATH=. python companions/ch06/jax/symplectic_demo.py

# Julia (lightweight — no DifferentialEquations.jl dependency)
julia --project=companions/ch06/julia companions/ch06/julia/implicit_methods.jl
julia --project=companions/ch06/julia companions/ch06/julia/symplectic_methods.jl

All figures emit to public/figures/ch06/.