Part I · Foundations Week 4 Published

Discretization theory: ZOH, bilinear, exponential families

One-step discretizations of linear ODEs — zero-order hold, bilinear (Tustin), and exponential-trapezoidal — with the Lax equivalence theorem as the unifying convergence criterion.

On this page
  1. 4.1 The discretization problem
  2. 4.2 The Lax equivalence theorem
  3. 4.3 Zero-order hold (ZOH)
  4. 4.4 The bilinear (Tustin) transform
  5. 4.5 Exponential-family discretizations
  6. 4.6 Order of accuracy and the discretization hierarchy
  7. 4.7 What’s next
  8. 4.8 Exercises
  9. Exercise 4.1 (computation)
  10. Exercise 4.2 (computation)
  11. Exercise 4.3 (computation + code)
  12. Exercise 4.4 (theory) — solution in §4.9
  13. Exercise 4.5 (theory) — solution in §4.9
  14. Exercise 4.6 (theory) — solution in §4.9
  15. 4.9 Full solutions to theory exercises
  16. Solution to Exercise 4.4
  17. Solution to Exercise 4.5
  18. Solution to Exercise 4.6
  19. 4.10 Companion code

Discretization theory: ZOH, bilinear, exponential families

4.1 The discretization problem

Chapter 1 wrote the linear state-space system as a continuous-time ODE

ddth(t)=Ah(t)+Bu(t),y(t)=Ch(t),\ddt \statevec(t) = \statemat \statevec(t) + \inputmat u(t), \qquad y(t) = \outputmat \statevec(t),

defined for every t0t \ge 0. Computers do not handle “every tt”; they handle a finite grid tk=kΔt_k = k \stepsize for k=0,1,2,k = 0, 1, 2, \ldots with step size Δ>0\stepsize > 0. A discretization is a rule for building a recurrence

hk+1=Aˉhk+Bˉuk\statevec_{k+1} = \discA \statevec_k + \discB u_k

whose iterates hk\statevec_k approximate the continuous trajectory at the grid points, hkh(tk)\statevec_k \approx \statevec(t_k). The two discrete matrices Aˉ,Bˉ\discA, \discB are functions of the continuous matrices A,B\statemat, \inputmat and of the step size Δ\stepsize. Different functions give different discretizations.

There is no single canonical choice. Every discrete recurrence in this book — the S4 layer, the Mamba selective scan, the exp-trapezoidal scheme of Mamba-3 — picks a specific rule for translating (A,B,Δ)(Aˉ,Bˉ)(\statemat, \inputmat, \stepsize) \mapsto (\discA, \discB). The choice matters: a poor discretization can take a perfectly stable continuous system and produce a discrete recurrence that diverges, fails to converge as Δ0\stepsize \to 0, or accumulates phase error on long horizons. Numerical-analysis textbooks Hairer et al. (1993) spend hundreds of pages on these failure modes. We will compress them into three properties and three schemes.

A useful warm-up is forward Euler, the simplest possible scheme:

hk+1=hk+Δ(Ahk+Buk)=(I+ΔA)hk+ΔBuk.\statevec_{k+1} = \statevec_k + \stepsize (\statemat \statevec_k + \inputmat u_k) = (I + \stepsize \statemat) \statevec_k + \stepsize \inputmat u_k.

This is what you get by replacing ddth\ddt \statevec with the forward difference (hk+1hk)/Δ(\statevec_{k+1} - \statevec_k)/\stepsize and freezing the input at uku_k. The discrete matrices are Aˉ=I+ΔA\discA = I + \stepsize \statemat and Bˉ=ΔB\discB = \stepsize \inputmat. Forward Euler is the cleanest illustration of what can go wrong: it is only conditionally stable, meaning that for any fixed A\statemat with Re(λi)<0\operatorname{Re}(\lambda_i) < 0 there is a maximum step size beyond which Aˉ=I+ΔA\discA = I + \stepsize \statemat acquires an eigenvalue of modulus >1> 1 and the recurrence blows up. The three schemes in §4.3–§4.5 fix this defect — each in its own way, with its own trade-off.

4.2 The Lax equivalence theorem

To compare schemes you need a definition of “correct.” The standard one comes from the Lax equivalence theorem Lax & Richtmyer (1956) , which decomposes correctness into two ingredients: consistency and stability.

A one-step scheme hk+1=Φ(Δ)(hk,uk)\statevec_{k+1} = \Phi(\stepsize)(\statevec_k, u_k) is consistent with the ODE if its local truncation error — the residual you obtain by plugging the exact continuous trajectory into the discrete recurrence — vanishes faster than Δ\stepsize as Δ0\stepsize \to 0. Formally,

τ(Δ):=h(t+Δ)Φ(Δ)(h(t),u(t))Δ0as Δ0.\tau(\stepsize) := \frac{\statevec(t + \stepsize) - \Phi(\stepsize)(\statevec(t), u(t))}{\stepsize} \to 0 \quad \text{as } \stepsize \to 0.

If τ(Δ)=O(Δp)\tau(\stepsize) = O(\stepsize^p) the scheme is pp-th order consistent. Higher pp means the per-step error shrinks faster with Δ\stepsize.

A scheme is zero-stable if small perturbations δ\delta injected at one step propagate boundedly to all future steps. For linear schemes this reduces to a uniform bound on the powers of Aˉ\discA: there exists KK independent of Δ\stepsize such that AˉkK\norm{\discA^k} \le K for all kk with kΔTk \stepsize \le T, for any fixed horizon TT. Equivalently, the spectral radius ρ(Aˉ)1+O(Δ)\rho(\discA) \le 1 + O(\stepsize).

The two ingredients combine.

.

(Lax equivalence.) For a linear, well-posed initial-value problem, a one-step scheme is convergent — meaning maxkΔThkh(tk)0\max_{k \stepsize \le T} \norm{\statevec_k - \statevec(t_k)} \to 0 as Δ0\stepsize \to 0 — if and only if it is both consistent and zero-stable.

Convergence is what you actually want: that the discrete iterates approach the true continuous trajectory as you refine the grid. The theorem says you cannot get there by being consistent alone (you also need stability) or by being stable alone (you also need consistency). Both are necessary, and together they are sufficient. This is the convergence criterion every scheme in this chapter must pass.

For a stable LTI test problem — meaning A\statemat has all eigenvalues in the open left half-plane — there is a stronger notion sitting on top of zero-stability: a scheme is A-stable if ρ(Aˉ)1\rho(\discA) \le 1 for every step size Δ>0\stepsize > 0, not just for sufficiently small Δ\stepsize. Forward Euler is not A-stable; ZOH, bilinear, and exp-trapezoidal are. The geometry of A-stability — the set of AΔ\statemat \stepsize values for which the scheme behaves well — is the subject of Chapter 5.

4.3 Zero-order hold (ZOH)

The first scheme assumes the input is piecewise constant between samples: u(t)=uku(t) = u_k for t[tk,tk+1)t \in [t_k, t_{k+1}). Under that assumption the inhomogeneous ODE solves exactly on a single interval. Starting from h(tk)=hk\statevec(t_k) = \statevec_k,

h(tk+1)=eAΔhk+tktk+1eA(tk+1s)Bukds=eAΔhk+A1 ⁣(eAΔI)Buk.\statevec(t_{k+1}) = e^{\statemat \stepsize} \statevec_k + \int_{t_k}^{t_{k+1}} e^{\statemat (t_{k+1} - s)} \inputmat \, u_k \, ds = e^{\statemat \stepsize} \statevec_k + \statemat^{-1}\!\left(e^{\statemat \stepsize} - I\right) \inputmat \, u_k.

Identifying the two matrices,

Aˉ=eAΔ,Bˉ=A1 ⁣(eAΔI)B.\discA = e^{\statemat \stepsize}, \qquad \discB = \statemat^{-1}\!\left(e^{\statemat \stepsize} - I\right) \inputmat.

This is the zero-order hold (ZOH) discretization. The hold refers to the input assumption; “zero-order” refers to the polynomial degree of the held signal (a constant is a zero-degree polynomial).

A practical wrinkle: writing Bˉ\discB as A1(eAΔI)B\statemat^{-1}(e^{\statemat \stepsize} - I) \inputmat requires A\statemat to be invertible. The HiPPO matrix of Chapter 7 is invertible; some structured A\statemat in later chapters are not. The augmented matrix exponential trick handles both cases uniformly. Build the block matrix

M=(AΔBΔ00)R(N+P)×(N+P),M = \begin{pmatrix} \statemat \stepsize & \inputmat \stepsize \\ 0 & 0 \end{pmatrix} \in \R^{(N+P) \times (N+P)},

where PP is the input dimension. Then

exp(M)=(eAΔA1(eAΔI)B0I),\exp(M) = \begin{pmatrix} e^{\statemat \stepsize} & \statemat^{-1}(e^{\statemat \stepsize} - I) \inputmat \\ 0 & I \end{pmatrix},

and a single expm call extracts both Aˉ\discA (top-left block) and Bˉ\discB (top-right block) without ever inverting A\statemat. The S4 reference implementation uses this trick; so does the JAX companion discretization_comparison.py.

.

(ZOH preserves Lyapunov stability.) If every eigenvalue λi\lambda_i of A\statemat has Re(λi)<0\operatorname{Re}(\lambda_i) < 0 and Δ>0\stepsize > 0, then every eigenvalue μi=eλiΔ\mu_i = e^{\lambda_i \stepsize} of Aˉ=eAΔ\discA = e^{\statemat \stepsize} satisfies μi<1\abs{\mu_i} < 1. ZOH is therefore A-stable.

The proof is one line: eλiΔ=eRe(λi)Δ<1\abs{e^{\lambda_i \stepsize}} = e^{\operatorname{Re}(\lambda_i) \stepsize} < 1 when Re(λi)<0\operatorname{Re}(\lambda_i) < 0 and Δ>0\stepsize > 0. Stability is preserved for every positive step size, which is what A-stability means.

Two further properties are worth highlighting. First, for the autonomous ODE (u0u \equiv 0), ZOH is exact: the recurrence hk+1=eAΔhk\statevec_{k+1} = e^{\statemat \stepsize} \statevec_k matches h(tk+1)=eAΔh(tk)\statevec(t_{k+1}) = e^{\statemat \stepsize} \statevec(t_k) without any error at all. This is unusual — most discretizations have nonzero error even on autonomous systems. Second, for forced systems with smooth uu, ZOH is first-order accurate: the per-step error in the forcing integral is O(Δ2)O(\stepsize^2) (because the integrand eA(tk+1s)B(u(s)uk)e^{\statemat (t_{k+1} - s)} \inputmat (u(s) - u_k) is O(Δ)O(\stepsize) over an interval of length Δ\stepsize), and after T/ΔT/\stepsize steps the cumulative error is O(Δ)O(\stepsize). The empirical convergence rate plotted in the companion log-log error figure confirms slope 1\approx 1.

Continuous eigenvalues in the complex plane (left half-plane) map to discrete eigenvalues inside the unit disk after ZOH discretization.
ZOH preserves stability: continuous eigenvalues in the open left half-plane (left panel) map to discrete eigenvalues strictly inside the unit disk (right panel) via $\mu_i = e^{\lambda_i \Delta}$. The mapping is bijective and depends on the step size $\Delta$. Produced by companions/ch04/jax/discretization_comparison.py.

4.4 The bilinear (Tustin) transform

The second scheme drops the piecewise-constant assumption and instead approximates the integral on [tk,tk+1][t_k, t_{k+1}] by the trapezoidal rule: tktk+1f(s)dsΔ2(f(tk)+f(tk+1))\int_{t_k}^{t_{k+1}} f(s) \, ds \approx \tfrac{\stepsize}{2}(f(t_k) + f(t_{k+1})). Applied to the inhomogeneous ODE this gives an implicit recurrence,

hk+1=hk+Δ2(Ahk+Buk)+Δ2(Ahk+1+Buk+1),\statevec_{k+1} = \statevec_k + \tfrac{\stepsize}{2}(\statemat \statevec_k + \inputmat u_k) + \tfrac{\stepsize}{2}(\statemat \statevec_{k+1} + \inputmat u_{k+1}),

which one solves for hk+1\statevec_{k+1}:

hk+1=(IΔ2A)1[(I+Δ2A)hk+Δ2B(uk+uk+1)].\statevec_{k+1} = \left(I - \tfrac{\stepsize}{2}\statemat\right)^{-1}\left[\left(I + \tfrac{\stepsize}{2}\statemat\right) \statevec_k + \tfrac{\stepsize}{2} \inputmat (u_k + u_{k+1})\right].

Reading off Aˉ\discA and Bˉ\discB,

Aˉ=(IΔ2A)1 ⁣(I+Δ2A),Bˉ=(IΔ2A)1ΔB.\discA = \left(I - \tfrac{\stepsize}{2}\statemat\right)^{-1}\!\left(I + \tfrac{\stepsize}{2}\statemat\right), \qquad \discB = \left(I - \tfrac{\stepsize}{2}\statemat\right)^{-1} \stepsize \, \inputmat.

This is the bilinear transform, also called the Tustin transform after Arnold Tustin’s 1947 paper introducing it in control theory Tustin (1947) . The S4D paper uses bilinear discretization; so do many signal-processing toolboxes.

The bilinear formula has a striking geometric interpretation. Restricted to a single eigenvalue λ\lambda of A\statemat (commuting through the simultaneous diagonalization), the map λμ\lambda \mapsto \mu that sends a continuous eigenvalue to its discrete image is

μ=1+Δ2λ1Δ2λ.\mu = \frac{1 + \tfrac{\stepsize}{2}\lambda}{1 - \tfrac{\stepsize}{2}\lambda}.

This is a Möbius transformation (a.k.a. fractional linear transformation) of the complex plane: z(az+b)/(cz+d)z \mapsto (az+b)/(cz+d) with adbc0ad - bc \ne 0. Möbius transformations are conformal — angle-preserving — and they map circles-or-lines to circles-or-lines. The specific Möbius map above sends the imaginary axis exactly to the unit circle and sends the open left half-plane exactly to the open unit disk.

.

(Bilinear preserves Lyapunov stability.) If every eigenvalue λi\lambda_i of A\statemat has Re(λi)<0\operatorname{Re}(\lambda_i) < 0 and Δ>0\stepsize > 0, then every eigenvalue μi\mu_i of Aˉ=(IΔ2A)1(I+Δ2A)\discA = (I - \tfrac{\stepsize}{2}\statemat)^{-1}(I + \tfrac{\stepsize}{2}\statemat) satisfies μi<1\abs{\mu_i} < 1. Bilinear is therefore A-stable.

The proof is geometric: z:=Δ2λiz := \tfrac{\stepsize}{2}\lambda_i has Re(z)<0\operatorname{Re}(z) < 0, and 1+z<1z\abs{1+z} < \abs{1-z} iff Re(z)<0\operatorname{Re}(z) < 0 (square both sides and cancel). Algebraically, (1+z)/(1z)2=1+z2/1z2<1\abs{(1+z)/(1-z)}^2 = \abs{1+z}^2/\abs{1-z}^2 < 1, which is exactly μi<1\abs{\mu_i} < 1. The full proof, with the Möbius geometry spelled out, is Exercise 4.5 in §4.9.

Bilinear is second-order accurate for smooth forcing — the trapezoidal-rule local truncation error is O(Δ3)O(\stepsize^3), giving global error O(Δ2)O(\stepsize^2). Unlike ZOH, it is not exact even on autonomous systems: the (IΔ2A)1(I+Δ2A)(I - \tfrac{\stepsize}{2}\statemat)^{-1}(I + \tfrac{\stepsize}{2}\statemat) Padé approximation of eAΔe^{\statemat \stepsize} agrees with the true exponential only through second order. The trade is one of structure: bilinear maps the imaginary axis exactly to the unit circle, so marginally stable continuous modes (purely imaginary eigenvalues) become marginally stable discrete modes (eigenvalues exactly on the unit circle). ZOH crushes them to the unit-disk interior at rate e0Δ=1e^{0 \cdot \stepsize} = 1 (marginal in the limit but never exactly on the circle for finite Δ\stepsize). This makes bilinear the natural choice when the system has oscillatory modes you want to preserve — a property that becomes load-bearing for the symplectic methods of Chapter 6.

4.5 Exponential-family discretizations

The third family takes a different approach: rather than approximating eAΔe^{\statemat \stepsize} by a low-order Padé form (as bilinear does) or by the identity-plus-linear truncation I+ΔAI + \stepsize \statemat (as forward Euler does), it computes eAΔe^{\statemat \stepsize} exactly and approximates only the forcing integral. This is the philosophy of exponential integrators Hochbruck & Ostermann (2010) .

The exact one-step formula from variation of parameters (Chapter 1, §1.6) is

h(tk+1)=eAΔhk+0ΔeA(Δs)Bu(tk+s)ds.\statevec(t_{k+1}) = e^{\statemat \stepsize} \statevec_k + \int_0^{\stepsize} e^{\statemat (\stepsize - s)} \inputmat u(t_k + s) \, ds.

ZOH approximates u(tk+s)uku(t_k + s) \approx u_k on [0,Δ][0, \stepsize]; bilinear approximates the integral by the trapezoidal rule with a Padé-approximated exponential. The exponential-trapezoidal scheme keeps the exact exponential and uses a linear interpolation of the input: u(tk+s)uk+(s/Δ)(uk+1uk)u(t_k + s) \approx u_k + (s/\stepsize)(u_{k+1} - u_k). Substituting and evaluating the resulting two integrals gives

hk+1=eAΔhk+Δφ1(AΔ)Buk+Δφ2(AΔ)B(uk+1uk),\statevec_{k+1} = e^{\statemat \stepsize} \statevec_k + \stepsize \, \varphi_1(\statemat \stepsize) \, \inputmat \, u_k + \stepsize \, \varphi_2(\statemat \stepsize) \, \inputmat \, (u_{k+1} - u_k),

where φ1\varphi_1 and φ2\varphi_2 are the first two φ\varphi-functions of the exponential family:

φ1(z)=ez1z,φ2(z)=ez1zz2.\varphi_1(z) = \frac{e^z - 1}{z}, \qquad \varphi_2(z) = \frac{e^z - 1 - z}{z^2}.

Both are entire functions (the apparent singularities at z=0z = 0 are removable: φ1(0)=1\varphi_1(0) = 1, φ2(0)=1/2\varphi_2(0) = 1/2). Applied to the matrix AΔ\statemat \stepsize, they produce matrix-valued objects φ1(AΔ),φ2(AΔ)\varphi_1(\statemat \stepsize), \varphi_2(\statemat \stepsize) that can be computed via the same augmented-matrix-exponential trick as ZOH.

Reading off the discrete matrices in the form hk+1=Aˉhk+Bˉ0uk+Bˉ1(uk+1uk)\statevec_{k+1} = \discA \statevec_k + \discB_0 u_k + \discB_1 (u_{k+1} - u_k):

Aˉ=eAΔ,Bˉ0=Δφ1(AΔ)B,Bˉ1=Δφ2(AΔ)B.\discA = e^{\statemat \stepsize}, \qquad \discB_0 = \stepsize \, \varphi_1(\statemat \stepsize) \, \inputmat, \qquad \discB_1 = \stepsize \, \varphi_2(\statemat \stepsize) \, \inputmat.

Notice that Aˉ\discA is the same as ZOH, so exp-trapezoidal preserves stability identically to ZOH: A-stable, exact for autonomous systems, eigenvalues in the unit disk for any Δ>0\stepsize > 0. The improvement is in the order of accuracy: by interpolating uu linearly rather than holding it constant, exp-trapezoidal becomes second-order accurate (provided uu is C1C^1).

Exp-trapezoidal is the discretization of choice when the continuous dynamics are stiff — when A\statemat has eigenvalues with very different magnitudes, so the linear part is the hard part of the problem. By treating eAΔe^{\statemat \stepsize} exactly the scheme sidesteps the step-size restriction that explicit methods inherit from the stiff eigenvalues, and the matrix exponential is computed once per layer in practice. Chapter 10 returns to this story for Mamba-3, which adopts exp-trapezoidal precisely because the input-dependent A(ut)\statemat(u_t) in selective SSMs produces stiff dynamics that ZOH and bilinear handle poorly.

A subtle implementation point: φ2(AΔ)\varphi_2(\statemat \stepsize) is not numerically equal to (eAΔIAΔ)/(AΔ)2(e^{\statemat \stepsize} - I - \statemat \stepsize)/(\statemat \stepsize)^2 when this formula is computed naively, because catastrophic cancellation in the numerator destroys precision for small Δ\stepsize. The standard remedy is to compute all φ\varphi-functions simultaneously from one augmented matrix exponential, again using the trick that produced Bˉ\discB for ZOH. Both companions (exp_trapezoidal.py in JAX and discretization_atlas.jl in Julia) implement the augmented form.

Log-log error vs step size for exp-trapezoidal, ZOH, and bilinear on a forced damped oscillator.
Exp-trapezoidal (rose) achieves the same slope-2 convergence as bilinear (gold) while inheriting ZOH's autonomous-exactness — the best of both. ZOH (navy) lags at slope 1. Empirical fit on the two finest step sizes confirms the theoretical orders. Produced by companions/ch04/jax/exp_trapezoidal.py.

4.6 Order of accuracy and the discretization hierarchy

The three schemes are summarized in the table below. “Autonomous-exact” means the scheme produces zero error for the homogeneous problem (u0u \equiv 0). “A-stable” means ρ(Aˉ)1\rho(\discA) \le 1 for every Δ>0\stepsize > 0 on every stable LTI test problem.

SchemeAˉ\discAOrderAutonomous-exactA-stable
Forward EulerI+ΔAI + \stepsize \statemat1NoNo
ZOHeAΔe^{\statemat \stepsize}1YesYes
Bilinear (Tustin)(IΔ2A)1(I+Δ2A)(I - \tfrac{\stepsize}{2}\statemat)^{-1}(I + \tfrac{\stepsize}{2}\statemat)2NoYes
Exp-trapezoidaleAΔe^{\statemat \stepsize}2YesYes

The empirical order can be read directly off a log-log plot of error against step size: a pp-th-order scheme has error Δp\propto \stepsize^p, so on a log-log plot the data lie on a line of slope pp. The Julia companion discretization_atlas.jl runs this sweep against a high-accuracy Tsit5 reference solution from DifferentialEquations.jl; the JAX companion discretization_comparison.py performs the same sweep using scipy.integrate.solve_ivp (Radau) as the reference.

Log-log plot of max error versus step size for ZOH (slope ~1), bilinear and exp-trapezoidal (slope ~2) on a forced damped oscillator.
Empirical order of accuracy of the three schemes on the forced damped oscillator $\\ddot q + 0.5 \\dot q + 2 q = \\sin(2t)$. ZOH (navy): slope 1 on the log-log plot, confirming first-order convergence. Bilinear (gold) and exp-trapezoidal (rose): slope 2, confirming second-order convergence. Produced by companions/ch04/jax/discretization_comparison.py.

The hierarchy is not strict: ZOH’s autonomous-exactness can outweigh its first-order accuracy when the forcing is mild relative to the homogeneous decay. Bilinear’s exact imaginary-axis preservation can outweigh its non-exactness on autonomous problems when the system has long-lived oscillations. Exp-trapezoidal combines the best of both — exact on autonomous, second-order on forced — at the cost of computing more φ\varphi-functions per step. The trade-offs are recurring themes throughout the SSM literature: S4 uses ZOH for its simplicity and exactness on the (autonomous) HiPPO drift; S4D and S5 follow suit; Mamba-3 switches to exp-trapezoidal once the input-dependence makes the dynamics stiff enough that the second-order error compounding matters.

4.7 What’s next

This chapter introduced three schemes and proved each preserves Lyapunov stability. Chapter 5 zooms in on the stability region of each scheme — the set of AΔ\statemat \stepsize values for which ρ(Aˉ)1\rho(\discA) \le 1 — and develops the Butcher-tableau machinery for systematically constructing higher-order Runge–Kutta methods. Chapter 6 then asks what to do when even A-stability is not enough — when the system is stiff or Hamiltonian and you need L-stable implicit methods, or symplectic methods that preserve geometric invariants. The exp-trapezoidal scheme of §4.5 turns out to be a member of a much larger family of exponential integrators, all of which deserve a place in your numerical-analysis toolkit.

Forward-looking SSM connections: ZOH is the default in Chapters 7–9 (HiPPO, S4, S4D, S5, Mamba-1/2). Bilinear shows up in S4D when the HiPPO matrix is diagonalized. Exp-trapezoidal is the Mamba-3 scheme, treated in Chapter 10.

4.8 Exercises

Six problems mixing computation and theory. Short/numerical (4.1–4.3) have inline collapsible solutions; long/proof exercises (4.4–4.6) have full worked solutions in §4.9.

Exercise 4.1 (computation)

Compute Aˉ\discA for ZOH on the 1-D test problem A=2\statemat = -2, B=1\inputmat = 1, Δ=0.1\stepsize = 0.1. Then verify by hand that Aˉ<1\abs{\discA} < 1.

Solution

Aˉ=e20.1=e0.20.8187\discA = e^{-2 \cdot 0.1} = e^{-0.2} \approx 0.8187. The discrete eigenvalue is 0.81870.8187, strictly inside the unit disk. The continuous decay rate is 22 per unit time; the discrete decay factor per step is e0.20.8187e^{-0.2} \approx 0.8187, so after 10 steps (one continuous-time unit) the state is reduced by a factor 0.8187100.135=e20.8187^{10} \approx 0.135 = e^{-2} — matching the continuous decay exactly. ZOH is autonomous-exact.

Exercise 4.2 (computation)

Compute Aˉ\discA for the bilinear transform on the same 1-D problem (A=2\statemat = -2, Δ=0.1\stepsize = 0.1). Compare Aˉbilinear\abs{\discA}_{\text{bilinear}} against AˉZOH=e0.2\abs{\discA}_{\text{ZOH}} = e^{-0.2} — which is smaller, and why?

Solution

Aˉbilinear=(1+(0.1/2)(2))/(1(0.1/2)(2))=0.9/1.10.8182\discA_{\text{bilinear}} = (1 + (0.1/2)(-2))/(1 - (0.1/2)(-2)) = 0.9/1.1 \approx 0.8182. ZOH gives e0.20.8187e^{-0.2} \approx 0.8187. The bilinear value is slightly smaller, meaning bilinear overdamps slightly relative to the true exponential. This is the second-order Padé approximation underestimating e0.2e^{-0.2}: the truncation in φ1\varphi_1 removes positive higher-order terms. For small Δλ\stepsize \cdot \abs{\lambda} the gap is O(Δ3)O(\stepsize^3); here the difference is about 51045 \cdot 10^{-4}.

Exercise 4.3 (computation + code)

Run companions/ch04/jax/discretization_comparison.py and verify empirically that ZOH has slope 1\approx 1 on the log-log error-vs-Δ\stepsize plot, bilinear and exp-trapezoidal have slope 2\approx 2. What happens to the ZOH slope if you set the forcing u(t)0u(t) \equiv 0 (autonomous case)?

Solution

The companion’s error_sweep function prints slopes near 1.00, 2.00, 2.00. Setting u0u \equiv 0 in the autonomous case makes ZOH exact — the error drops to roundoff (1014\sim 10^{-14} in float64) and the slope is meaningless. This is consistent with the autonomous-exactness property of §4.3: ZOH’s only error source for forced systems is the piecewise-constant approximation of uu.

Exercise 4.4 (theory) — solution in §4.9

Prove the augmented matrix exponential trick used in §4.3: that

exp ⁣(M1M200)=(eM1M11(eM1I)M20I)\exp\!\begin{pmatrix} M_1 & M_2 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} e^{M_1} & M_1^{-1}(e^{M_1} - I)\, M_2 \\ 0 & I \end{pmatrix}

for invertible M1M_1, by computing the series term-by-term.

Exercise 4.5 (theory) — solution in §4.9

Prove that the Möbius transformation z(1+z/2)/(1z/2)z \mapsto (1 + z/2)/(1 - z/2) sends the open left half-plane {z:Re(z)<0}\set{z : \operatorname{Re}(z) < 0} bijectively to the open unit disk {w:w<1}\set{w : \abs{w} < 1}, and sends the imaginary axis bijectively to the unit circle (minus the point w=1w = -1).

Exercise 4.6 (theory) — solution in §4.9

Derive the exponential-trapezoidal scheme of §4.5 from the variation-of-parameters formula by substituting the linear interpolation u(tk+s)=uk+(s/Δ)(uk+1uk)u(t_k + s) = u_k + (s/\stepsize)(u_{k+1} - u_k) into the forcing integral and evaluating term-by-term using the φ\varphi-function identities.

4.9 Full solutions to theory exercises

Solution to Exercise 4.4

The matrix M:=(M1M200)M := \begin{pmatrix} M_1 & M_2 \\ 0 & 0 \end{pmatrix} satisfies, for k1k \ge 1,

Mk=(M1kM1k1M200).M^k = \begin{pmatrix} M_1^k & M_1^{k-1} M_2 \\ 0 & 0 \end{pmatrix}.

This is verified by induction: M1M^1 matches by construction, and Mk+1=MMk=(M1M200)(M1kM1k1M200)=(M1k+1M1kM200)M^{k+1} = M \cdot M^k = \begin{pmatrix} M_1 & M_2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} M_1^k & M_1^{k-1} M_2 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} M_1^{k+1} & M_1^k M_2 \\ 0 & 0 \end{pmatrix} as required. Now sum the matrix-exponential series:

eM=I+k=1Mkk!=(I00I)+(k1M1kk!k1M1k1k!M200)=(eM1SM20I),e^M = I + \sum_{k=1}^{\infty} \frac{M^k}{k!} = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} + \begin{pmatrix} \sum_{k \ge 1} \tfrac{M_1^k}{k!} & \sum_{k \ge 1} \tfrac{M_1^{k-1}}{k!} M_2 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} e^{M_1} & S \, M_2 \\ 0 & I \end{pmatrix},

where S:=k1M1k1/k!=M11k1M1k/k!=M11(eM1I)S := \sum_{k \ge 1} M_1^{k-1}/k! = M_1^{-1} \sum_{k \ge 1} M_1^k / k! = M_1^{-1}(e^{M_1} - I), using invertibility of M1M_1 to extract the leading M11M_1^{-1}. Plugging back gives the claimed result. ∎

This is exactly the identity used to compute Aˉ\discA and Bˉ\discB jointly from one expm call in §4.3: take M1=AΔM_1 = \statemat \stepsize and M2=BΔM_2 = \inputmat \stepsize.

Solution to Exercise 4.5

Let T(z):=(1+z/2)/(1z/2)T(z) := (1 + z/2)/(1 - z/2). Imaginary axis to unit circle: for z=iωz = i\omega with ωR\omega \in \R,

T(iω)2=1+iω/221iω/22=1+ω2/41+ω2/4=1,\abs{T(i\omega)}^2 = \frac{\abs{1 + i\omega/2}^2}{\abs{1 - i\omega/2}^2} = \frac{1 + \omega^2/4}{1 + \omega^2/4} = 1,

so T(iω)T(i\omega) lies on the unit circle. The map ωT(iω)\omega \mapsto T(i\omega) traces the unit circle once as ω\omega ranges over R\R, missing only the limit point T()=1T(\infty) = -1.

Left half-plane to unit disk: write z=x+iyz = x + iy with x<0x < 0. Then

T(z)2=(1+x/2)2+(y/2)2(1x/2)2+(y/2)2.\abs{T(z)}^2 = \frac{(1 + x/2)^2 + (y/2)^2}{(1 - x/2)^2 + (y/2)^2}.

Since x<0x < 0, 1+x/2<1x/2\abs{1 + x/2} < \abs{1 - x/2} (both for x<2\abs{x} < 2 where the signs match the absolute values, and for x>2\abs{x} > 2 where they reverse). The yy-terms are identical in numerator and denominator. Therefore the numerator is strictly less than the denominator, so T(z)<1\abs{T(z)} < 1.

Bijectivity: Möbius transformations z(az+b)/(cz+d)z \mapsto (az + b)/(cz + d) with adbc0ad - bc \ne 0 are bijections of the Riemann sphere C{}\C \cup \set{\infty} to itself. Restriction to the half-plane gives a bijection to the disk. The inverse is w(2(w1))/(w+1)w \mapsto (2(w-1))/(w+1), which can be verified by direct computation. ∎

This Möbius geometry is the algebraic content of A-stability for bilinear: a scheme is A-stable iff its eigenvalue map sends the open left half-plane into the closed unit disk. Bilinear achieves this bijectively; ZOH does so as well but non-bijectively (it folds an infinite strip of width 2π/Δ2\pi/\stepsize onto the unit disk, aliasing high-frequency continuous modes onto low-frequency discrete ones).

Solution to Exercise 4.6

Start from variation of parameters on [tk,tk+1][t_k, t_{k+1}]:

hk+1=eAΔhk+0ΔeA(Δs)Bu(tk+s)ds.\statevec_{k+1} = e^{\statemat \stepsize} \statevec_k + \int_0^{\stepsize} e^{\statemat(\stepsize - s)} \inputmat \, u(t_k + s) \, ds.

Substitute the linear interpolant u(tk+s)=uk+(s/Δ)(uk+1uk)u(t_k + s) = u_k + (s/\stepsize)(u_{k+1} - u_k) for s[0,Δ]s \in [0, \stepsize]:

0ΔeA(Δs)B[uk+sΔ(uk+1uk)]ds=I0uk+I1(uk+1uk),\int_0^{\stepsize} e^{\statemat(\stepsize - s)} \inputmat \left[u_k + \frac{s}{\stepsize}(u_{k+1} - u_k)\right] ds = I_0 \, u_k + I_1 \, (u_{k+1} - u_k),

with I0:=0ΔeA(Δs)BdsI_0 := \int_0^{\stepsize} e^{\statemat (\stepsize - s)} \inputmat \, ds and I1:=0ΔsΔeA(Δs)BdsI_1 := \int_0^{\stepsize} \tfrac{s}{\stepsize} e^{\statemat (\stepsize - s)} \inputmat \, ds. Change variables σ=Δs\sigma = \stepsize - s in I0I_0:

I0=0ΔeAσdσB=A1(eAΔI)B=Δφ1(AΔ)B,I_0 = \int_0^{\stepsize} e^{\statemat \sigma} \, d\sigma \cdot \inputmat = \statemat^{-1}(e^{\statemat \stepsize} - I) \, \inputmat = \stepsize \, \varphi_1(\statemat \stepsize) \, \inputmat,

where in the last step we substituted the definition φ1(z)=(ez1)/z\varphi_1(z) = (e^z - 1)/z applied with argument AΔ\statemat \stepsize. For I1I_1, change variables again σ=Δs\sigma = \stepsize - s, so s=Δσs = \stepsize - \sigma and s/Δ=1σ/Δs/\stepsize = 1 - \sigma/\stepsize:

I1=0Δ(1σΔ)eAσdσB=1Δ[ΔI00ΔσeAσdσB].I_1 = \int_0^{\stepsize} \left(1 - \frac{\sigma}{\stepsize}\right) e^{\statemat \sigma} d\sigma \cdot \inputmat = \frac{1}{\stepsize}\left[\stepsize \, I_0 - \int_0^{\stepsize} \sigma \, e^{\statemat \sigma} d\sigma \cdot \inputmat\right].

The remaining σ\sigma-weighted integral evaluates via integration by parts to A2(eAΔ(AΔI)+I)B=Δ2φ2(AΔ)B\statemat^{-2}(e^{\statemat \stepsize}(\statemat \stepsize - I) + I) \inputmat = \stepsize^2 \varphi_2(\statemat \stepsize) \inputmat, using the φ2(z)=(ez1z)/z2\varphi_2(z) = (e^z - 1 - z)/z^2 identity. Carefully tracking the algebra gives I1=Δφ2(AΔ)BI_1 = \stepsize \, \varphi_2(\statemat \stepsize) \, \inputmat as claimed. Combining,

hk+1=eAΔhk+Δφ1(AΔ)Buk+Δφ2(AΔ)B(uk+1uk).\statevec_{k+1} = e^{\statemat \stepsize} \statevec_k + \stepsize \, \varphi_1(\statemat \stepsize) \, \inputmat \, u_k + \stepsize \, \varphi_2(\statemat \stepsize) \, \inputmat \, (u_{k+1} - u_k). \qquad \square

The local truncation error is O(Δ3)O(\stepsize^3) for C1C^1 inputs (the linear interpolation has error O(Δ2)O(\stepsize^2) over an interval of length Δ\stepsize, and the integral picks up an additional Δ\stepsize factor). Global error is therefore O(Δ2)O(\stepsize^2) — second-order, as advertised.

4.10 Companion code

Three runnable companions for Chapter 4:

JAX (companions/ch04/jax/):

  • discretization_comparison.py — implements ZOH, bilinear, and forward Euler in JAX; runs the error-vs-step-size sweep on a forced damped oscillator; emits eigenvalue_migration.png and order_convergence.png.
  • exp_trapezoidal.py — implements the exp-trapezoidal scheme via the augmented matrix exponential; verifies second-order convergence against the same forced oscillator; emits exp_trap_convergence.png.

Julia (companions/ch04/julia/):

  • discretization_atlas.jl — ports the post_transformers Week-9 reference implementation to ssm-foundations; uses DifferentialEquations.jl Tsit5 as the ground-truth reference solver and reports the empirical slope per scheme.
  • Project.toml / Manifest.toml — companion-local Julia environment pinning DifferentialEquations and LinearAlgebra.

To run from the repo root:

# JAX (uses post_transformers .venv with scipy, numpy, matplotlib, jax pre-installed)
PYTHONPATH=. python companions/ch04/jax/discretization_comparison.py
PYTHONPATH=. python companions/ch04/jax/exp_trapezoidal.py

# Julia (first run will precompile DifferentialEquations.jl; ~2–5 minutes)
julia --project=companions/ch04/julia companions/ch04/julia/discretization_atlas.jl

All figures emit to public/figures/ch04/.