Prerequisites

This chapter requires IR-03 (swaption pricing). The key insight that motivates this chapter comes directly from IR-03: swaptions have positive observed market prices, which is only possible if rates are random. No stochastic calculus background needed; the Wiener process is built from scratch.

By the end, you can justify stochasticity from observed option prices and write down the Vasicek SDE from scratch.

The problem with deterministic rates

In Chapter 01 we built the bond price curve $P(0,T)$ from market quotes, and in Chapter 03 we priced swaptions using it. The yield curve is deterministic: it is a set of numbers observed today. For instruments whose payoff is linear in the rate (bonds, swaps), this is enough. For instruments whose payoff is nonlinear or has an optionality component (swaptions, caps, callable bonds), a single expected future rate is not enough. You need a distribution of where rates might go.

The key insight

A stochastic model is not trying to predict where rates will go. It is trying to characterise the full distribution of where they might go. The goal is not forecasting; it is pricing instruments whose value depends on that distribution.

The Wiener process

The simplest mathematical object that captures unpredictable, continuous movement is the Wiener process $W_t$ (Brownian motion). Its increments are independent across non-overlapping intervals, and normally distributed:

Wiener process increment $$dW_t \sim \mathcal{N}(0,\, dt)$$

The variance equals $dt$, not $dt^2$: shocks grow as the square root of time. This is why daily vol scales to annual vol by $\sqrt{252}$, and why longer-dated options are more expensive.

From noise to a short rate model

A Wiener process on its own is just noise. A useful model for interest rates adds a drift (where the rate is being pulled) and a diffusion (how much noise surrounds that pull). A stochastic differential equation (SDE) encodes both:

Generic SDE for the short rate $$dr_t = \underbrace{\mu(r_t,\, t)}_{\text{drift}} \, dt + \underbrace{\sigma(r_t,\, t)}_{\text{diffusion}} \, dW_t$$

The choice of $\mu$ and $\sigma$ defines the model. A natural choice is to let rates mean-revert toward a long-run level $b$ with speed $a$ and constant volatility $\sigma$. This is the Vasicek model (see [1], Section 3.2), stated under the risk-neutral measure $\mathbb{Q}$:

Vasicek SDE (under $\mathbb{Q}$) $$dr_t = a\,(b - r_t)\,dt + \sigma\,dW_t$$

When $r_t > b$, the drift is negative (rates are pulled down); when $r_t < b$, the drift is positive (rates are pulled up). The rate wanders randomly but is always drawn back toward the long-run mean. This is the Ornstein-Uhlenbeck process applied to interest rates.

Notation note (Vasicek vs Hull-White)

This chapter uses the canonical Vasicek symbols of Brigo-Mercurio: $a$ for the mean-reversion speed and $b$ for the constant long-run level. Chapter 05 keeps $a$ for the mean-reversion speed but elevates the long-run level to a time-dependent function $\theta(t)$, so that the SDE becomes $dr_t = (\theta(t) - a\,r_t)\,dt + \sigma\,dW_t$. The constant $a\,b$ here plays the role of $\theta(t)$ there, evaluated at any point in time.

The Vasicek bond price formula

A key advantage of Vasicek is that the zero-coupon bond price $P(t,T)$ has a closed-form solution. The risk-neutral pricing formula for a zero-coupon bond reads

Risk-neutral pricing of the ZCB $$P(t,T) = \mathbb{E}^{\mathbb{Q}}_t\!\left[\exp\!\Bigl(-\!\int_t^T r_s\,ds\Bigr)\right]$$

Under the Vasicek dynamics, $r$ is Markov under $\mathbb{Q}$ with constant parameters $(a, b, \sigma)$, $r_T \mid r_t$ is Gaussian, and the integral $\int_t^T r_s\,ds$ is also Gaussian (a linear functional of correlated normals). The result is an exponential-affine formula:

Vasicek zero-coupon bond price $$P(t,T) = A(t,T)\,\exp\!\bigl(-B(t,T)\,r_t\bigr)$$ $$B(t,T) = \frac{1 - e^{-a(T-t)}}{a}, \qquad A(t,T) = \exp\!\left[\left(b - \frac{\sigma^2}{2a^2}\right)\!\bigl(B(t,T) - (T - t)\bigr) - \frac{\sigma^2\,B(t,T)^2}{4a}\right]$$

The formula assumes $r$ is Markov under $\mathbb{Q}$ with constant parameters $(a, b, \sigma)$; it does not hold under the historical measure $\mathbb{P}$ unless those same constants happen to coincide, which they generally do not. See Brigo and Mercurio [1], Proposition 3.2.1 for the full derivation. The formula is the reason the martingale test below works: we can re-price bonds analytically at any simulated $r_s$, without nesting a second Monte Carlo inside the first.

For example, with $a = 0.15$, $b = 4\%$, $\sigma = 0.80\%$, and $r_0 = 4.33\%$, the 5-year bond price is:

$B(0,5) = \frac{1 - e^{-0.15 \times 5}}{0.15} = 3.5175$, and $\ln A(0,5) = \bigl(b - \frac{\sigma^2}{2a^2}\bigr)(B - T) - \frac{\sigma^2}{4a}B^2 = -0.0585$, giving $P(0,5) = e^{-0.0585} \times e^{-3.5175 \times 0.0433} = 0.8100$.

Monte Carlo pricing

Once you have an SDE, you can simulate paths. Start at $r_0$, draw random increments at each time step, accumulate the stochastic discount factor $\exp(-\int_0^T r_s\,ds)$, and compute the payoff. The price of any derivative is the average discounted payoff across all paths:

Monte Carlo price $$V_0 = \mathbb{E}^{\mathbb{Q}}\!\left[\exp\!\Bigl(-\!\int_0^T r_s\,ds\Bigr) \cdot h(r_T)\right] \approx \frac{1}{N} \sum_{i=1}^{N} e^{-\sum_k r_{t_k}^{(i)} \Delta t} \cdot h\!\left(r_T^{(i)}\right)$$

The expectation is taken under the risk-neutral measure $\mathbb{Q}$. The exponential factor is the stochastic (path-wise) discount factor; we keep it in its integral form throughout to avoid any confusion with the deterministic ZCB curve $P(0,T)$. The function $h(r_T)$ is the payoff at maturity. The approximation on the right is what you actually compute in code: $N$ simulated paths, each contributing one discounted payoff.

Short rate simulation: stochastic paths vs long-run mean
30 simulated Vasicek paths from $r_0 = 4.33\%$ (the Chapter 01 overnight rate), $a = 0.15$, $b = 4.00\%$, $\sigma = 0.80\%$. The red curve shows the mean across all simulated paths, converging toward the long-run mean $b = 4.00\%$ over time. The fan of paths shows the distribution the model says rates might follow. Pricing any instrument with optionality requires integrating over all paths, not just using a single deterministic forward.
These parameters are illustrative

The Vasicek model with constant $a$, $b$, $\sigma$ does not fit the bootstrapped curve from Chapter 01 exactly. The Hull-White model (Chapter 05, Draft) extends Vasicek by replacing the constant long-run level $b$ with a time-dependent function $\theta(t)/a$, which forces the model to reproduce today's bond prices exactly. Vasicek is used here as the simplest non-trivial short-rate model to illustrate the concepts; the production simulation engine in Chapters 07-08 and the XVA series is Hull-White, not Vasicek.

The martingale test

A well-implemented simulation must satisfy internal consistency conditions. We test two properties, each progressively more demanding.

Test 1: discount factor alone

The simplest check: can the simulation reproduce the analytical bond price $P(0,T)$ from the stochastic discount factor alone? Under $\mathbb{Q}$, the path-wise stochastic discount factor averages to the deterministic ZCB price:

Discount factor identity $$\mathbb{E}^{\mathbb{Q}}\!\left[\exp\!\Bigl(-\!\int_0^T r_s\,ds\Bigr)\right] = P(0,T)$$

We simulate paths out to each maturity $T \in \{0.5, 1, 2, 3, 5, 7, 10\}$ years, average the stochastic discount factor, and compare to the Vasicek analytical $P(0,T)$. If the simulation is correct, the two curves should overlap everywhere.

Term structure test: simulated stochastic discount factor vs analytical $P(0,T)$
Average stochastic discount factor $\mathbb{E}^{\mathbb{Q}}[\exp(-\!\int_0^T r_s\,ds)]$ across 5,000 simulated paths vs Vasicek analytical bond price $P(0,T)$, for seven maturities. Parameters: $r_0 = 4.33\%$, $a = 0.15$, $b = 4.00\%$, $\sigma = 0.80\%$. The two curves overlap within a fraction of a basis point, confirming the simulation reproduces the model's term structure.

Test 2: the tower property

The stronger check: for any intermediate time $s$, the path-wise stochastic discount factor times the conditional bond price must average back to today's $P(0,T)$. This is a direct consequence of the tower property of conditional expectation applied to the risk-neutral pricing formula:

Tower property (martingale condition) $$\mathbb{E}^{\mathbb{Q}}\!\left[\exp\!\Bigl(-\!\int_0^s r_u\,du\Bigr)\cdot P(s,T)\right] = P(0,T), \qquad \forall\, 0 \leq s \leq T$$

The exponential factor is the stochastic discount factor accumulated along each path to time $s$, and $P(s,T)$ is the Vasicek analytical bond price evaluated at the simulated $r_s$. We fix $T = 5$Y and test six monitoring dates $s \in \{0.5, 1.0, 1.5, 2.0, 3.0, 4.0\}$. Every single one must converge to the same $P(0,5)$.

Tower property: stochastic discount factor times $P(s,5)$ vs $P(0,5)$ at six monitoring dates
Six monitoring dates $s = 0.5, 1.0, 1.5, 2.0, 3.0, 4.0$Y, all targeting $T = 5$Y. As $N$ grows, all six running averages of $\exp(-\!\int_0^s r_u\,du)\cdot P(s,5)$ converge to the analytical $P(0,5)$ (dashed red). Parameters: $r_0 = 4.33\%$, $a = 0.15$, $b = 4.00\%$, $\sigma = 0.80\%$.

Python implementation

The code below runs both martingale tests using the same parameters as the charts above.

Python · PyTorch
import torch
from dataclasses import dataclass

@dataclass
class SimulationResult:
    """Container for Monte Carlo simulation output."""
    times: torch.Tensor       # (n_steps + 1,)
    rates: torch.Tensor       # (n_steps + 1, n_paths)
    integrals: torch.Tensor   # (n_steps + 1, n_paths)

    def stochastic_discount(self, T_idx: int) -> torch.Tensor:
        """Stochastic discount factor exp(-integral_0^T r_s ds) at step T_idx."""
        return torch.exp(-self.integrals[T_idx])


class VasicekModel:
    """
    Vasicek one-factor short rate model (under the risk-neutral measure Q).

    SDE: dr_t = a * (b - r_t) dt + sigma * dW_t

    Parameters (illustrative, not calibrated to Ch.01 curve):
        r0    = 4.33%  (overnight rate from Ch.01)
        a     = 0.15   (mean reversion speed)
        b     = 4.00%  (long-run mean)
        sigma = 0.80%  (volatility)
    """

    def __init__(self, r0=0.0433, a=0.15, b=0.04, sigma=0.008):
        self.r0 = r0
        self.a = a
        self.b = b
        self.sigma = sigma

    def bond_price(self, r_t: torch.Tensor, t: float, T: float) -> torch.Tensor:
        """Analytical Vasicek ZCB price P(t, T | r_t)."""
        tau = T - t
        B = (1.0 - torch.exp(torch.tensor(-self.a * tau))) / self.a
        A = torch.exp(
            (self.b - self.sigma**2 / (2 * self.a**2)) * (B - tau)
            - self.sigma**2 * B**2 / (4 * self.a)
        )
        return A * torch.exp(-B * r_t)

    def simulate(self, T: float, n_steps: int, n_paths: int,
                 seed: int = 137) -> SimulationResult:
        """Simulate paths using exact Vasicek discretisation."""
        torch.manual_seed(seed)
        dt = T / n_steps
        times = torch.linspace(0, T, n_steps + 1)

        # Precompute exact-step constants
        exp_a_dt = torch.exp(torch.tensor(-self.a * dt))
        mean_shift = self.b * (1.0 - exp_a_dt)
        vol_step = self.sigma * torch.sqrt(
            (1.0 - torch.exp(torch.tensor(-2.0 * self.a * dt)))
            / (2.0 * self.a)
        )

        rates = torch.zeros(n_steps + 1, n_paths)
        rates[0] = self.r0

        for i in range(n_steps):
            Z = torch.randn(n_paths)
            rates[i + 1] = rates[i] * exp_a_dt + mean_shift + vol_step * Z

        # Left-Riemann sum for the stochastic discount factor
        integrals = torch.zeros_like(rates)
        integrals[1:] = torch.cumsum(rates[:-1] * dt, dim=0)

        return SimulationResult(times=times, rates=rates, integrals=integrals)


# ── Run both martingale tests ──
model = VasicekModel(r0=0.0433, a=0.15, b=0.04, sigma=0.008)

# Test 1: E[exp(-int r ds)] = P(0,T)
maturities = [0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0]
sim = model.simulate(T=10.0, n_steps=400, n_paths=50_000)

print("Test 1: E[exp(-int r ds)] vs P(0,T)")
for T in maturities:
    step = round(T / (10.0 / 400))
    disc_mean = sim.stochastic_discount(step).mean()
    P_ana = model.bond_price(torch.tensor(model.r0), 0.0, T)
    err_bp = (disc_mean - P_ana).item() * 1e4
    print(f"  T={T:4.1f}Y  sim={disc_mean:.6f}  ana={P_ana:.6f}  err={err_bp:+.2f}bp")

# Test 2: tower property E[exp(-int_0^s r) * P(s,T)] = P(0,T)
T_target = 5.0
monitors = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
P05 = model.bond_price(torch.tensor(model.r0), 0.0, T_target)
print(f"\nTest 2: E[exp(-int_0^s r)*P(s,5)] vs P(0,5) = {P05:.6f}")

for s in monitors:
    step_s = round(s / (10.0 / 400))
    disc_s = sim.stochastic_discount(step_s)             # exp(-int_0^s r du)
    r_s = sim.rates[step_s]                               # r(s)
    P_sT = model.bond_price(r_s, s, T_target)            # P(s, 5)
    avg = (disc_s * P_sT).mean()
    err_bp = (avg - P05).item() * 1e4
    print(f"  s={s:4.1f}Y  E[disc*P]={avg:.6f}  err={err_bp:+.2f}bp")

Why this matters for XVA

In XVA, we simulate the entire evolution of a portfolio's market value through time. At each future date $s$, on each simulated path, we compute what the trade is worth if we had to close it that day. The distribution of those values across paths gives us the expected exposure $\text{EE}(s) = \mathbb{E}^{\mathbb{Q}}[\max(V(s),0)]$ that feeds into CVA:

CVA integral (preview) $$\text{CVA} \approx (1 - R)\int_0^T P(0,s) \cdot \text{EE}(s) \cdot dPD(s)$$

where $R$ is the recovery rate, $P(0,s)$ is the bond price curve from Chapter 01, $\text{EE}(s)$ is the expected exposure at time $s$, and $dPD(s)$ is the counterparty's marginal default probability. The factorisation of the bond price curve out of the expectation requires independence of the short rate $r$, the exposure $V$, and the default time $\tau$; we make that assumption explicit in XVA Chapter 03. In contract notation $\text{EE}(s)$ is the time-varying function and $\text{EPE} = \tfrac{1}{T}\int_0^T \text{EE}(s)\,ds$ is the scalar time-average; we keep them strictly distinct throughout the XVA series. None of this is possible without a stochastic rate model. The model tells us not just where rates are today, but how they might move and how that movement drives changes in portfolio value. A model that underestimates rate volatility will underestimate exposure, and therefore underestimate CVA. The full derivation comes in XVA Chapter 03.

The moral

A stochastic model is not a prediction. It is a disciplined way to enumerate possible futures and average over them. The discipline comes from the SDE: it constrains which futures are consistent with market prices. The averaging is Monte Carlo. Together they give you a price that cannot be arbitraged, as long as the model is calibrated to the instruments you can actually observe in the market.

We now have a formal framework for modelling random rates. The Vasicek model is tractable (it has closed-form bond prices), but its parameters are fixed constants. In IR-05 (draft), we extend this to Hull-White: a model where the drift is time-dependent, allowing it to fit any observed discount curve exactly. Hull-White is the model used throughout the rest of this series.

References

  1. D. Brigo and F. Mercurio, Interest Rate Models: Theory and Practice, 2nd edition, Springer, 2006. Sections 3.2 (Vasicek model), 3.2.1 (analytical bond price), and 3.4 (risk-neutral pricing under short-rate models).
  2. S. E. Shreve, Stochastic Calculus for Finance II: Continuous-Time Models, Springer, 2004. Chapters 3-5 cover Brownian motion, SDEs, and change of measure with full rigour.
  3. O. A. Vasicek, "An Equilibrium Characterization of the Term Structure," Journal of Financial Economics, 5(2), 1977. The original Vasicek model paper.