Learn Merton Jump-Diffusion from Scratch

Section 1

Black-Scholes can't do crashes

Black-Scholes assumes the price moves continuously — small tick by small tick, no teleporting allowed. That is fine 99% of the time. The other 1% is what blows you up.

Markets gap. Earnings announcements, geopolitical shocks, protocol exploits: the price jumps from one level to another with nothing in between. A model that only knows diffusion assigns essentially zero probability to a move that large over an interval that short.

Robert Merton's fix (1976): keep the diffusion, but bolt on a second source of randomness — a Poisson process that fires at random times. When it fires, the price jumps by a random amount drawn from a lognormal distribution.

Merton jump-diffusion SDE

dS/S = (μ − λk)dt + σdW + (J − 1)dN

dW: standard Brownian increment (the usual diffusion).
dN: Poisson counter increment. Usually 0. Occasionally 1 (a jump happens).
J: jump multiplier. ln(J) ~ Normal(μ_J, σ_J²). If J = 0.9, the price drops 10% instantly.
λ: average number of jumps per year.
k = E[J − 1] = exp(μ_J + σ_J²/2) − 1: the mean relative jump size. Subtracting λk from the drift offsets the jumps' average effect, so the expected return stays at μ.
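To make the SDE concrete, here is a minimal simulation sketch using only the Python standard library. It steps ln(S) rather than S, so each step adds a diffusion increment plus the sum of ln(J) for any jumps in that step. The function names, the drift μ = 5%, the daily grid, and the seed are illustrative assumptions, not values from the text; the jump parameters are the slider defaults.

```python
import math
import random


def poisson_draw(rng, mean):
    """Knuth's method for a Poisson draw; fine for the small means used here."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_merton_path(s0, mu, sigma, lam, mu_j, sigma_j,
                         years=1.0, steps=252, seed=0):
    """Return (prices, jump_steps) for one path, stepping ln(S) on a fixed grid."""
    rng = random.Random(seed)
    dt = years / steps
    k = math.exp(mu_j + 0.5 * sigma_j**2) - 1.0      # k = E[J - 1]
    drift = (mu - lam * k - 0.5 * sigma**2) * dt      # compensated log-drift
    log_s = math.log(s0)
    prices, jump_steps = [s0], []
    for step in range(1, steps + 1):
        # Diffusion part: the usual Brownian increment.
        log_s += drift + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Jump part: usually zero jumps in a step, occasionally one.
        n_jumps = poisson_draw(rng, lam * dt)
        if n_jumps > 0:
            # Each jump multiplies S by J, so ln(S) moves by ln(J) ~ Normal(mu_j, sigma_j^2).
            log_s += sum(rng.gauss(mu_j, sigma_j) for _ in range(n_jumps))
            jump_steps.append(step)
        prices.append(math.exp(log_s))
    return prices, jump_steps


prices, jump_steps = simulate_merton_path(
    s0=100.0, mu=0.05, sigma=0.20, lam=1.0, mu_j=-0.08, sigma_j=0.12
)
print(f"final price: {prices[-1]:.2f}, jumps at steps: {jump_steps}")
```

Changing `seed` gives a different path. Raising `lam` or making `mu_j` more negative should reproduce the behaviour described for the sliders.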

Below are three simulated price paths under the Merton model. Most of the time, the path is smooth diffusion. Then a vertical line appears — that is a jump. Crank up λ to see more frequent jumps, or make μ_J more negative to see crash-like behaviour.

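[Interactive figure: Jump-Diffusion Price Paths. Three simulated paths (Path 1, Path 2, Path 3) with jump events marked. Sliders: λ (jump frequency, shown at 1.0/yr), μ_J (mean jump size, shown at −8%), σ_J (jump volatility, shown at 12%).]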

Mental model

Think of diffusion as walking across a room. You take small, continuous steps. Now add trapdoors in the floor. Most steps are normal. But occasionally you fall through a trapdoor and land somewhere unexpected. That is the jump component.

Section 2

Three new parameters

On top of the usual σ (diffusion vol), Merton adds three parameters that together control the shape of the implied vol smile. Each one does a specific job.

λ (lambda) — jump frequency. How many jumps per year, on average. Higher λ means jumps are more common, which lifts both wings of the smile. If λ = 0, you are back in Black-Scholes world.

μ_J (mu-J) — average jump size. If negative, jumps are predominantly downward (crashes). This tilts the smile — the left wing (puts) gets more expensive than the right wing (calls). If zero, jumps are symmetric and the smile is roughly symmetric.

σ_J (sigma-J) — jump size volatility. How variable the jump size is. Even if μ_J = 0, a high σ_J means some jumps are huge and some are tiny. This adds excess kurtosis — fatter tails than normal — which increases wing curvature.
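One way to see what each parameter does, before looking at any smile, is to compute the variance, skewness, and excess kurtosis of the log return ln(S_τ/S_0). The sketch below uses the standard cumulants of a compound Poisson sum of normal jumps. The function name and the 20% diffusion vol are assumptions for illustration, and these moments are only a proxy for smile shape, not implied vols.

```python
def merton_return_moments(sigma, lam, mu_j, sigma_j, tau=1.0):
    """Variance, skewness and excess kurtosis of ln(S_tau / S_0) under Merton."""
    # Cumulants of the compound Poisson jump sum are lam * tau * E[Y^n],
    # with Y ~ Normal(mu_j, sigma_j^2). The diffusion only adds variance.
    var = tau * (sigma**2 + lam * (mu_j**2 + sigma_j**2))
    third = tau * lam * (mu_j**3 + 3.0 * mu_j * sigma_j**2)
    fourth = tau * lam * (mu_j**4 + 6.0 * mu_j**2 * sigma_j**2 + 3.0 * sigma_j**4)
    return var, third / var**1.5, fourth / var**2


# Three illustrative settings: no jumps, crash-biased jumps, symmetric jumps.
settings = {
    "lam = 0 (pure diffusion)": (0.0, -0.08, 0.12),
    "mu_J < 0 (crash bias)": (2.0, -0.15, 0.05),
    "mu_J = 0 (symmetric)": (1.0, 0.0, 0.30),
}
for label, (lam, mu_j, sigma_j) in settings.items():
    var, skew, exkurt = merton_return_moments(0.20, lam, mu_j, sigma_j)
    print(f"{label}: vol={var**0.5:.3f} skew={skew:.3f} excess kurtosis={exkurt:.3f}")
```

With λ = 0 both skewness and excess kurtosis vanish, a negative μ_J produces negative skewness, and μ_J = 0 leaves kurtosis without skew.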

[Interactive figure: Merton Implied Vol Smile vs Black-Scholes. Curves: Merton smile and BS flat vol (20%). Annotations: λ controls overall wing level; μ_J < 0 creates downside skew; σ_J controls wing curvature. Sliders: λ (shown at 1.0/yr), μ_J (shown at −8%), σ_J (shown at 12%).]

Play with the sliders above. Three experiments to try:

1. Set λ = 0. The smile goes flat — pure BS.

2. Set λ = 2, μ_J = −0.15, σ_J = 0.05. You get steep downside skew: the market expects crashes more than rallies.

3. Set μ_J = 0, σ_J = 0.30. Both wings lift roughly symmetrically: fat tails with no directional bias in the jump sizes.

Section 3

The pricing formula

Merton's pricing formula is elegant: the option price is a weighted sum of Black-Scholes prices, one for each possible number of jumps. If you can price vanilla BS calls, you can price Merton.

Merton's series formula

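C = Σ_{n=0}^{∞} [e^(−λ′τ) (λ′τ)^n / n!] · BS(S, K, r_n, σ_n, τ)

Each term asks: “What if exactly n jumps occurred during the option's life?”
λ′ = λ(1 + k): the adjusted jump intensity used in the weights, with k = E[J − 1] as in the SDE.
σ_n² = σ² + nσ_J²/τ: each extra jump adds more effective variance.
r_n = r − λk + n·ln(1 + k)/τ: the rate fed to Black-Scholes, where r is the risk-free rate, adjusted for the compensator and for the mean effect of n jumps.
The weight is a Poisson probability: the chance of exactly n events in time τ at intensity λ′.
In practice, 10–15 terms is enough when λ′τ is of order one or smaller, because the Poisson weights decay factorially.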
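The series is short enough to code directly. The sketch below is one possible implementation, assuming a European call on a non-dividend-paying asset, a constant risk-free rate r, and the standard Merton adjustments λ′ = λ(1 + k) and r_n = r − λk + n·ln(1 + k)/τ. The function names and the example inputs (S = 100, r = 2%, τ = 0.25) are illustrative and are not the settings behind the figures on this page.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, strike, r, sigma, tau):
    """Black-Scholes price of a European call, no dividends."""
    d1 = (math.log(s / strike) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)


def merton_call_terms(s, strike, r, sigma, tau, lam, mu_j, sigma_j, n_terms=15):
    """Return the weighted terms of the series; their sum is the Merton call price."""
    k = math.exp(mu_j + 0.5 * sigma_j**2) - 1.0   # k = E[J - 1]
    lam_p = lam * (1.0 + k)                        # adjusted intensity λ′
    terms = []
    for n in range(n_terms):
        # Poisson probability of exactly n jumps at intensity λ′.
        weight = math.exp(-lam_p * tau) * (lam_p * tau) ** n / math.factorial(n)
        sigma_n = math.sqrt(sigma**2 + n * sigma_j**2 / tau)
        r_n = r - lam * k + n * math.log(1.0 + k) / tau
        terms.append(weight * bs_call(s, strike, r_n, sigma_n, tau))
    return terms


call_terms = merton_call_terms(
    s=100.0, strike=95.0, r=0.02, sigma=0.20, tau=0.25,
    lam=1.0, mu_j=-0.08, sigma_j=0.12,
)
for n, term in enumerate(call_terms[:6]):
    print(f"n={n}: {term:.4f}")
print(f"Merton call: {sum(call_terms):.4f}")
print(f"BS call:     {bs_call(100.0, 95.0, 0.02, 0.20, 0.25):.4f}")
```

Returning the individual terms rather than only their sum makes it easy to see how much each jump count contributes at a given strike.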

The visualization below decomposes the Merton price into its first six terms. The left panel shows bars for each term at your chosen strike. The right panel shows how the terms stack up across all strikes — you can see which terms dominate at-the-money versus the wings.

[Interactive figure: Merton Series Decomposition. Left panel: term contributions at K=95. Right panel: stacked terms across strikes. Slider: Strike (shown at 95). Readout at K=95: BS price 7.86, Merton price 9.67, jump premium 1.81.]

Key observation: the n=0 term (zero jumps) is a Black-Scholes price at the plain diffusion vol σ (with the drift adjusted by the compensator λk), scaled by the probability that no jump occurs. The higher terms add progressively more value to the wings, because each jump raises the effective volatility and makes far strikes reachable.

Move the strike slider to the wings (K=80 or K=120). Watch how the higher-n terms become proportionally more important. At the money, n=0 dominates. In the wings, n=1 and n=2 start doing serious work. That is where the jump premium lives.

Section 4

Jump risk is not hedgeable

In Black-Scholes, delta-hedging eliminates all risk — you rebalance continuously, and the diffusion risk cancels out. With jumps, that breaks. The jump happens instantly; you cannot rebalance fast enough.

Think about it: delta-hedging works by adjusting your stock position in response to small price changes. But a jump is not small — the price teleports. By the time you can react, the damage (or windfall) is done. Your hedge was sized for the pre-jump price, not the post-jump price.

This means the Merton market is incomplete. You cannot replicate every payoff with just the stock and bond. Jump risk is a separate risk factor that the market must price. This is why options in the real world carry a premium above what BS delta-hedging logic would imply.

[Interactive figure: Delta-Hedging PnL: BS vs Jump World. Left panel: BS world (no jumps). Right panel: Merton world (with jumps). A Regenerate button redraws both panels.]

Hit Regenerate a few times and watch the pattern. In the BS panel (left), cumulative PnL wanders but stays relatively contained — the hedge is doing its job. In the Merton panel (right), PnL looks similar most of the time, but then a red vertical bar appears (a jump) and PnL lurches.

A short-gamma hedger loses on a jump in either direction, roughly ½Γ(ΔS)² per jump, because the hedge was sized for the pre-jump price. When μ_J < 0 the shocks are asymmetric: downward jumps are larger on average, so most of the damage comes from crashes. This is the fundamental reason that crash puts carry a premium: someone has to be compensated for bearing this unhedgeable jump risk.
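A rough way to size a jump-induced PnL shock is to hold a delta-hedged short call, let the price gap from S to S·J with no chance to rebalance, and revalue. The sketch below does that, using Black-Scholes values before and after the gap as a simplifying assumption. The function names and the at-the-money 3-month example are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_and_delta(s, strike, r, sigma, tau):
    """Black-Scholes call price and delta, no dividends."""
    d1 = (math.log(s / strike) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    price = s * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)
    return price, norm_cdf(d1)


def jump_pnl_short_call(s, strike, r, sigma, tau, jump_multiplier):
    """PnL of a delta-hedged short call when the price gaps from s to s * J."""
    price_before, delta = bs_call_and_delta(s, strike, r, sigma, tau)
    s_after = s * jump_multiplier
    price_after, _ = bs_call_and_delta(s_after, strike, r, sigma, tau)
    option_leg = -(price_after - price_before)   # short the call
    stock_leg = delta * (s_after - s)            # long delta shares, sized pre-jump
    return option_leg + stock_leg


for j in (0.8, 0.9, 1.1, 1.2):
    pnl = jump_pnl_short_call(100.0, 100.0, 0.0, 0.20, 0.25, j)
    print(f"J = {j:.1f}: hedged PnL = {pnl:.3f}")
```

Because the call value is convex in the spot price, the hedged short position loses on both up and down gaps, and the loss grows with the size of the gap.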

Section 5

Merton vs. Heston vs. reality

Merton is brilliant at short-dated smiles. Heston is brilliant at long-dated smiles. Reality needs both — which is why the Bates model (Heston + jumps) became the industry workhorse.

Here is the key distinction:

Jumps dominate at short maturities. A 1-week option is too short for stochastic vol to “diffuse” meaningfully. But a single jump can still reach a far strike. Merton's jump component is the primary driver of short-dated wing prices.

Stochastic vol dominates at long maturities. Over 6 months, the vol itself wanders up and down enough to generate fat tails on its own. Jump events get “diluted” in the averaging: one jump in 126 trading days matters less than one jump in 5 trading days.

Term structure intuition

Short-dated wings → jump risk → Merton
Long-dated wings → vol-of-vol → Heston
Both → Bates = Heston + Merton jumps

The practical consequence: if you calibrate Merton to 1-month options and then use it to price 1-year options, the long-dated smile will be too flat. The skewness that jumps contribute decays like 1/√τ and the excess kurtosis like 1/τ, but the market smile stays elevated at long tenors because vol itself is uncertain.
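The flattening can be made precise with a short scaling argument. Under Merton, log returns have independent, stationary increments, so every cumulant κ_n of ln(S_τ/S_0) grows linearly in τ. The notation κ_n, skew, and exkurt is introduced here for illustration.

```latex
\kappa_n(\tau) = \tau\,\kappa_n(1) \quad (n \ge 2)
\;\Longrightarrow\;
\mathrm{skew}(\tau) = \frac{\kappa_3(\tau)}{\kappa_2(\tau)^{3/2}} = \frac{\mathrm{skew}(1)}{\sqrt{\tau}},
\qquad
\mathrm{exkurt}(\tau) = \frac{\kappa_4(\tau)}{\kappa_2(\tau)^{2}} = \frac{\mathrm{exkurt}(1)}{\tau}
```

Going from a 1-month to a 1-year horizon multiplies τ by 12, so skewness shrinks by a factor of √12 ≈ 3.5 and excess kurtosis by a factor of 12. The return distribution drifts toward normal, and the model smile flattens with it.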

Conversely, Heston alone underprices short-dated wings. The vol process is too slow to create the extreme short-dated kurtosis that the market demands. You need jumps for that.

The models, compared

Black-Scholes: flat smile. No skew, no wings. The simplest benchmark.

Merton: smile with elevated wings, especially at short maturities. Skew if μ_J < 0. Smile flattens with maturity as jumps get diluted.

Heston: smile from vol-of-vol. Smile persists at long maturities. Generates skew via vol-spot correlation (ρ).

Bates: Heston + Merton jumps. Matches the term structure of the smile from short to long tenors. The standard industry choice for equity and crypto.

Where to go next:

Heston Model — stochastic vol, the other half of the picture

Bates Model — Heston + jumps: the industry workhorse

Kou Jump-Diffusion — asymmetric jumps with double-exponential tails