stepping through an oil put option trade

Back on Oct 17th, I sold Z24 WTI (oil) 67 strike puts unhedged.

I explained my reasoning in this thread back when I did the trade. They were the equivalent to the USO Nov 15th 69 puts (I said Nov 22nd expiry but that was in error.)

I covered the WTI puts on Thursday morning. I published this thread when I covered them. Here’s a mildly edited version:

We’ll use the USO puts to write the post-mortem of a roughly 2 week trade. I hope it’s super educational.

Let’s start with the vibes.

Markets feel a bit, I don’t know, ahead of themselves. Everything but oil popping (til today). Rates & USD higher.

I’m less bullish on oil (and broadly bearish).

I cut oil length and bot t-bills this am.

Let’s get to the specifics of the “cutting length” trade because there’s many ways to do that. But my directional bias coincided with wanting to buy back my short vol (the moontower mantra — don’t touch the options without a vol lens).

Why?

Vol has sufficiently relaxed in oil since I sold the puts. The put skew was at normal levels when I first sold the .30 delta puts but the vol was high. Since spot/vol corr was positive that made them seem extra high!

Today there is almost no skew in those (now) .29d puts
The implied vol is below realized (granted realized is on the high end of the range)
AND the election is getting negligible vol premium in oil

These pics show the negative VRP and negligible event premium assuming 35% fair base vol (which comes from eyeballing the USO vol term structure).

While that explains the how and why of covering the position, let’s understand the p/l attribution of the position while I held it. We do this with the same type of charts I’ve been showing post-mortems with.

I’ll narrate where your eyes should go so it’s easier to learn.

If the puts were hedged the trade was steadily profitable except for 2 days out of 11 when the market popped.

See the yellow boxes on the red line:

Every day we can see the contribution to the hedged p/l from 2 components:

realized vol p/l (tug of war between gamma & theta)
vega p/l (change in IV)

The yellow boxes are examples of the daily decomposition.
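For readers who want to reproduce the decomposition, here is a minimal sketch. It assumes a second-order Taylor approximation of the option's daily price change, and every greek and market move in it is a made-up input, not the USO data behind the charts.

```python
def daily_hedged_pnl(gamma, theta, vega, d_spot, d_iv_points, position=-1):
    """Split one day's delta-hedged option p/l into two buckets.

    gamma, theta and vega are per-option greeks for a LONG option:
    theta is the one-day decay (a negative number) and vega is per vol point.
    position = -1 means short one option. All inputs are hypothetical.
    """
    # Realized vol p/l: gamma earned on the move vs. theta paid for the day
    realized = position * (0.5 * gamma * d_spot ** 2 + theta)
    # Vega p/l: repricing caused by the change in implied vol
    vega_pnl = position * vega * d_iv_points
    return realized, vega_pnl


# Hypothetical big down day: stock drops $2.50 and implied vol falls 3 points
realized, vega_pnl = daily_hedged_pnl(
    gamma=0.06, theta=-0.04, vega=0.08, d_spot=-2.5, d_iv_points=-3.0
)
total = realized + vega_pnl
print(f"realized: {realized:+.4f}  vega: {vega_pnl:+.4f}  total: {total:+.4f}")
```

With these inputs the short option loses on the realized bucket and makes more than that back on vega, which is the same shape as a day where the market moves a lot but vol gets slammed.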

Look what happened on Friday 10/25’s big down move…the hedged p/l was still positive. Yes, you got hammered on the realized p/l but vol got slammed! The put skew was in fact unjustified. The down move was what I call stabilizing to the market.

Down moves aren’t normally stabilizing but my idea was that the Middle East conflict was driving the high vol so in this context a down move would be stabilizing so the total vol was unjustified if oil is lower.

(Ofc I was naked short the puts so that Friday was still a tough p/l day because I experienced a rough delta p/l but overall it was buffered by the puts underperforming.)

Over the life of the trade, on a delta hedged basis you would earn $.45 being short the option from $1.76. On an unhedged basis it was $.78. (I actually made more than that because I sold the options closer to $2 bc the stock was about the same place it is right now, $71.85.)

More importantly let’s look at the cumulative p/l attribution:

Almost all of it came from vega. The option was well-priced from a realized vol point of view!

This all ties in to my general gestalt of “short where she lands, long where she ain’t” bit. If vol is going to relax when it “gets there” then you don’t wanna use the option to bet it’s gonna get there. And if everyone thinks it’s “not going there” then when it does it will destabilize and you’ll wish you owned that option.

As a junior trader I remember selling calls bc “it’ll never get there”. I promise you there are many people who think like that. They don’t understand vol trading.

[An aside: That statement sounds more incendiary than I’d like. It troubles me I can’t fully articulate it. It’s a bit of an ink blot test. It’s understandable if you find that unsatisfying but the raw reality is indifferent to both of our dissatisfaction. In truth, there was a point of separation somewhere along the evolution of trading careers where as things got more competitive from the floor days to today, the traders who were copycatting disappeared into other parts of the business. When trading morphed from time/place advantage edge to positional edge it exposed the copycats’ lack of deep options understanding. Before Hollywood, there was a “me too” era on option pits across the land, where you only had to be savvy enough to identify who was smart and just make sure you yelled “buy’em” at the same time.

There are a lot of people who sound like they get vol trading. In fact I can’t fully imagine how hard it is for someone who’s not super experienced to tell the difference. The problem with codifying a “trading Turing test” is the same one interviewers have with candidates — as professional-grade info gets disseminated it’s hard to know if someone has earned it or parroted it.]

One of the savviest oil options traders I ever knew had a good formulation:

He’d buy those nominally cheap (but vol expensive) weeny puts when he was bullish. Because if the market dropped he just wanted to own what I call “the trap door” to protect what he really wanted to do…get balls long

In my own trading, I wanted to own the trap door. Whatever everyone thinks is impossible is the option I want. Stick’em in my back pocket and if it ever comes into play I’m the only one with 2 hands on the wheel.

I admit this instinct was much stronger when I was trading a big book and I don’t have it as much now (that’s another discussion altogether).

Anyway, I hope this was overall educational. There’s an art to this game called options. If anything, maybe it gets the ole mind bicycle spinnin’ in such a way that even if you don’t trade options, it can mentally upgrade your whole investment decision OS.

“negatively priced lunch”

Markku Kurtti is an engineer in the telecom world. His outsider quant take on portfolio construction is beautifully derived and intuitive.

I strongly recommend his interview with Corey Hoffstein:

🎙️Diversification is a Negatively Priced Lunch (Flirting with Models podcast)

His blog is also outstanding. I’ll point you to this post in particular:

How much skill a concentrated stock picker needs to beat a diversified benchmark? (17 min read)

I summarize key findings below (with the aid of an LLM). The “Moontower highlights” are direct quotes from my kindle.

The central theme:

For a stock picker to successfully manage a concentrated portfolio, they must generate sufficient alpha to overcome the inherent risks and volatility associated with fewer holdings.

Supporting points:

1) The Balance Between Concentration and Diversification

Concentrated portfolios inherently carry more risk due to idiosyncratic variance, or the unique risks associated with individual stocks. To overcome this risk, stock pickers need to generate enough alpha to offset the “variance drag” — the reduction in expected growth rate caused by high volatility.

🟡Moontower highlight: “Portfolio construction of a skilled stock picker is a compromise between enhancing alpha by concentration and mitigating idiosyncratic variance drag by diversification.”
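To get a feel for the size of the drag, here is a rough sketch. It leans on the textbook approximation that growth rate is roughly the arithmetic return minus half the variance, and it assumes equal weights and uncorrelated idiosyncratic returns. The 30% idiosyncratic vol is a hypothetical input, not a number from Kurtti's post.

```python
def idiosyncratic_variance_drag(idio_vol, n_stocks):
    """Approximate annual growth-rate drag from idiosyncratic variance.

    Assumes n equally weighted stocks whose idiosyncratic returns are
    uncorrelated, so portfolio idiosyncratic variance is idio_vol**2 / n.
    The drag is half of that variance.
    """
    return 0.5 * idio_vol ** 2 / n_stocks


idio_vol = 0.30  # hypothetical single-stock idiosyncratic vol
drag_10 = idiosyncratic_variance_drag(idio_vol, 10)    # concentrated book
drag_500 = idiosyncratic_variance_drag(idio_vol, 500)  # diversified benchmark
min_alpha = drag_10 - drag_500  # alpha needed just to offset the extra drag
print(f"10 stocks: {drag_10:.4%}  500 stocks: {drag_500:.4%}  gap: {min_alpha:.4%}")
```

Under these toy assumptions the 10-stock portfolio needs a bit under half a percentage point of annual alpha just to break even with the diversified version.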

2) Importance of Consistent Skill and Alpha Requirements by Stock Size:

Different types of stocks require varying levels of alpha to beat the benchmark. Larger, more stable stocks typically require less alpha than smaller, more volatile stocks. Consistency in skill is crucial, as erratic performance increases the minimum alpha required to compensate for the higher risk.

🟡Moontower highlight: “Assuming perfectly consistent stock picking skill over time, 10-stock big stocks portfolio has historically required roughly 0.5 percentage point (pp) annualized alpha, small stocks ~1pp and micro-caps ~2pp. High E/P, E/B, Mom and B/P styles, in the universe of all stocks, have required roughly ~1pp and low E/P, E/B, Mom and B/P styles north of ~2pp. Low E/P style (smallish growth stocks with low profitability) have required the highest 2.55pp alpha.”

3) Risk of Concentration Without Skill

Concentration magnifies returns but also heightens risks. Without genuine stock-picking skill, a concentrated portfolio becomes increasingly likely to underperform over time. The document cautions against relying too heavily on concentration to boost returns without sufficient alpha.

🟡Moontower highlight: “But concentration is risky. If you concentrate and don’t have genuine stock picking skill, time will be your enemy.”

4) Circle of Competence and Style Diversification

The post emphasizes the value of investing within one’s “circle of competence” — areas where the investor has the most knowledge or advantage. However, it also warns that focusing exclusively on a single style exposes investors to style risk.

5) Predictability of Variance Drag Over Return

Idiosyncratic variance drag, the penalty for concentrating in fewer stocks, is more predictable than expected returns.

🟡Moontower highlight: “Idiosyncratic variance drag differences are easier to predict than expected return differences. It is therefore safer to increase diversification, which reliably decreases minimum alpha requirement, than to increase portfolio concentration to enhance uncertain alpha.”

🟡Moontower reference: The idea that volatility is more predictable than returns is a foundational principle in portfolio management. See Know Nothing Sizing

6) Lottery Preference in High Variance Styles

Some investors are attracted to high-idiosyncratic-variance stocks with potential for lottery-like returns leading to lower forward-looking returns.

🟡Moontower highlight: “Some investors may prefer stocks that may pay off big and this is exactly what idiosyncratic variance delivers: large dispersion of returns among individual stocks.”

🟡Moontower reference: See A Recipe For Overpaying for a succinct explanation by Chris Schindler.

7) Takeaway on Diversification for Risk Management: Diversification not only reduces variance drag but also lessens reliance on unpredictable alpha.

🟡Moontower highlight: “Our take away is that idiosyncratic variance drag is much more predictable than expected return. More generally, it is easier to predict variance than mean return. It is therefore safer to diversify more as it will reliably bring down idiosyncratic variance drag compared to concentrating more in a hope of higher alpha.”

It’s a love letter to diversification mixing words and math. For what it’s worth, at SIG Jeff Yass also called diversification a free lunch.

I’m partial to my Sun/Rain example in You Don’t See The Whole Picture which is an even stronger statement — you are incinerating money by not diversifying but if you evaluate yourself by “resulting” you won’t see it. That’s because the highest bid for risk is the most efficient at absorbing it. This is deeply true in the derivatives world. In the broader investment landscape it’s confounded by info asymmetry, principal-agent conflict, and the comfort of (perceived) safety in herding.

If you want to get deeper into this idea see the back half of the moontower guide:

🟰Understanding Risk-Neutral Probability (link)

But be aware…“diversification always means having to say you’re sorry” since something is always losing.

And sometimes almost everything loses. This was Wednesday. Eww.

Volatility term structure from multiple angles (part 2)

In part 1 of Volatility term structure from multiple angles we opened by discussing how nearer dated implied vols move around more than deferred implieds. Recognizing that dynamic, our net vega position for a time spread can be ambiguous.

Just as stock traders use beta to normalize risk to a benchmark such as SPX, volatility traders will normalize their vega to a fulcrum month. √t scaling corresponds to a model world where time spreads between months remain relatively stable. It’s not reality but it’s a vast improvement over summing raw vegas.
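A small sketch of √t weighting, using M1 as the fulcrum month. The vega amounts are hypothetical and chosen so the time spread is close to flat on a weighted basis.

```python
import math


def weighted_vega(raw_vega, months, fulcrum_months=1.0):
    """Scale raw vega to a fulcrum month using sqrt(t) weighting."""
    return raw_vega * math.sqrt(fulcrum_months / months)


# Hypothetical time spread: (raw vega in $ per vol point, months to expiry)
book = [(-1000.0, 1.0), (2449.0, 6.0)]

raw_net = sum(vega for vega, _ in book)
weighted_net = sum(weighted_vega(vega, months) for vega, months in book)
print(f"raw net vega: {raw_net:,.0f}  sqrt(t)-weighted net vega: {weighted_net:,.1f}")
```

Summing raw vegas says this book is long about $1,449 per vol point. Weighted to M1 it is essentially flat.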

In comparing vols between 2 months, vol ratios are popular. If M1 is 18% vol and M6 is 20%, the vol ratio is 90%. It’s a measure of how steep the term structure is. If you track the ratio for constant maturities then you can get a quick sense of the relative supply/demand for IV. If the ratio is less than 1.0, the term structure is ascending, a shape typical of “it’s quiet now, but we expect mean reversion to typical higher levels of vol”. A downward sloping vol curve is more closely associated with high vol periods or the market’s anticipation of an event such as earnings or the election.

Vol ratios are only one way to measure the slope of the term structure. We saw that implied forward vols are a complementary measure that also describes the relationship between 2 volatilities on the term structure. The computation tells what volatility is baked into the period between the 2 expirations. The logic is that the deferred expiration accounts for all the volatility from now until the option’s last trading date while the near-dated expiry isolates only the early period’s volatility. If you consider an extreme example where the time spread is worth 0, ie the deferred option and the nearer-dated option are the same price, the forward vol is zero.
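Both measures are a few lines of arithmetic. This sketch assumes the usual definition of forward vol, where total variance (vol squared times time) is additive across periods. It reuses the 18%/20% example with 1-month and 6-month expiries.

```python
import math


def vol_ratio(near_vol, far_vol):
    """Near-dated implied vol divided by deferred implied vol."""
    return near_vol / far_vol


def forward_vol(near_vol, near_years, far_vol, far_years):
    """Implied vol for the window between the two expiries."""
    fwd_var = (far_vol ** 2 * far_years - near_vol ** 2 * near_years) / (
        far_years - near_years
    )
    if fwd_var < 0:
        raise ValueError("total variance decreases with time: no real forward vol")
    return math.sqrt(fwd_var)


ratio = vol_ratio(0.18, 0.20)
fwd = forward_vol(0.18, 1 / 12, 0.20, 6 / 12)
print(f"vol ratio: {ratio:.2f}  implied forward vol: {fwd:.2%}")
```

Note the forward vol lands a bit above the 6M vol because the cheap first month has to be made up for in the remaining five.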

So why look at 2 measures, vol ratio and implied forward vol, if they both tell us about the relative price of implied vol on the term structure?

Remember, we were looking at GLD 1M/6M vols for the 1-year range 10/2/23-9/23/24:

We saw:

The forward vol is sometimes high and sometimes low regardless of the ratio!

But look at early March — not only was the vol ratio low, the forward got crushed. If you only look at vol ratio, you missed this.

Implied forwards are an orthogonal or complementary measure of relative volatility that is additive to your perspective.

Today, we will dive further into the relationship between vol scaling, forward vols, ratios. We will come out on the other side with what this all means for finding trades and managing risk.

Off we go…

Constant Straddle Spread vs Forward Vol

Suppose you are long a time spread.

We’ll say a straddle spread because that’s a common trade expression and it also connotes delta-neutrality.

[It’s probably helpful in thinking about these things to not have to process “call”, “put”, and their associated directionality. Depending on where you are in your learning I’m trying to be mindful of cognitive load.]

You’re long a 6-month straddle for 15% vol and short the 1-month straddle at 15% vol. Front month vol suddenly spikes a point to 16% and the 6-month vol increases by 1/√t (ie 1/√6) or .41 vol points to 15.41%

Here’s what we know:

The term structure went from flat to descending.
You lost 1 vol point on your 1-month straddle, gained .41 on your 6-month straddle. Because the 6-month straddle has 2.45x as much vega, you broke even.
The straddle spread is unchanged.

What happened to the forward vol?

You can compute it yourself with the moontower calculator but I’ll just tell you…the implied forward vol increased to 15.3%

Let’s summarize what happened:

You are long raw “click” vega since you own the longer-dated option
You are flat weighted or scaled vega if we use √t scaling with reference to M1
Vol across the curve increased in proportion to √t weights leaving the straddle spread unchanged
The forward vol you are long increased, although your p/l is unchanged.
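You can check the example numerically. This sketch prices the straddles with the rule of thumb straddle ≈ .8 x S x vol x √T and assumes the variance-additive definition of forward vol. The $100 spot is a hypothetical input; the result does not depend on it.

```python
import math


def straddle_price(spot, vol, years):
    """ATM straddle approximation: 0.8 * S * vol * sqrt(T)."""
    return 0.8 * spot * vol * math.sqrt(years)


def forward_vol(near_vol, near_years, far_vol, far_years):
    """Implied vol between two expiries, assuming total variance is additive."""
    fwd_var = (far_vol ** 2 * far_years - near_vol ** 2 * near_years) / (
        far_years - near_years
    )
    return math.sqrt(fwd_var)


spot = 100.0  # hypothetical
t1, t6 = 1 / 12, 6 / 12

# Before: flat 15% term structure
spread_before = straddle_price(spot, 0.15, t6) - straddle_price(spot, 0.15, t1)
fwd_before = forward_vol(0.15, t1, 0.15, t6)

# After: 1M vol up 1 point, 6M vol up 1/sqrt(6) points
vol_1m = 0.16
vol_6m = 0.15 + 0.01 / math.sqrt(6)
spread_after = straddle_price(spot, vol_6m, t6) - straddle_price(spot, vol_1m, t1)
fwd_after = forward_vol(vol_1m, t1, vol_6m, t6)

print(f"straddle spread: {spread_before:.4f} -> {spread_after:.4f}")
print(f"forward vol:     {fwd_before:.2%} -> {fwd_after:.2%}")
```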

The key point to appreciate:

A constant straddle spread price does NOT mean the forward vols are also constant.

Another way to say this:

The same straddle spread price can yield different implied forwards!

Implied Forward Vol: a “many-to-one” relationship

For any implied forward vol there are many pairs of vols that can produce it. This is why the straddle spread and forward are not overlapping measures of term structure.

The table shows:

many combinations of vols that generate a 15% forward
√t or constant straddle spread scaling means the forward is changing
The scaling would need to be sub √t (ie more muted) for the forward to not change. In fact, in a vol increase scenario, the straddle spread price would need to narrow for the forward to be unchanged.
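A sketch of how you could generate such pairs yourself, again assuming forward variance is additive. For a chosen 1M vol it solves for the 6M vol that leaves the 1M-to-6M forward at 15%.

```python
import math


def far_vol_for_forward(near_vol, fwd_vol, near_years=1 / 12, far_years=6 / 12):
    """Deferred vol that implies fwd_vol between the two expiries."""
    total_var = near_vol ** 2 * near_years + fwd_vol ** 2 * (far_years - near_years)
    return math.sqrt(total_var / far_years)


# Many (1M vol, 6M vol) pairs that all produce a 15% forward
pairs = [(v, far_vol_for_forward(v, 0.15)) for v in (0.10, 0.15, 0.20, 0.25)]
for near, far in pairs:
    print(f"1M {near:.0%}  6M {far:.2%}  ratio {near / far:.2f}")

# From a flat 15% curve, how far can 6M move when 1M goes to 16%
# if the forward is to stay at 15%? Compare with sqrt(t) scaling.
constant_fwd_move = far_vol_for_forward(0.16, 0.15) - 0.15
sqrt_t_move = 0.01 / math.sqrt(6)
print(f"6M move for constant forward: {constant_fwd_move:.4f} vs sqrt(t): {sqrt_t_move:.4f}")
```

The last two lines show the "sub √t" point: holding the forward constant requires a much smaller move in the 6M vol than √t scaling implies.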

Observing GLD vol data for the past year we can see the many-to-one relationship between:

a) vol ratio and forward vol

For any given vol ratio there are many forward vols. It’s not a function.

b) vol pairs and forward vol

For any given forward vol, the 6M vol can be a fairly tight range while the 1M vol could be far above or below the 6M vol!

You can see how the forward vol and the 6M vol are positively correlated. This makes sense — the forward vol is driven by 5 out of the 6 months in the tenor.

You can also see that at low (high) 6M vols the 1M vol tends to be even lower (higher).

Linking the scaling relationship to vol of vol

Remember that weighting our vega by month is analogous to beta weighting a stock portfolio. Beta weighting summarizes a portfolio’s market exposure with respect to SPX or some other benchmark. Looking at raw unweighted vega is like ignoring the relative volatilities between stocks in a traditional portfolio. $1mm of risk in SPY is not the same as $1mm of NVDA.

√t weighting is therefore suggesting a vol of vol with respect to the fulcrum month (in our examples M1). If the back month vol is more volatile than √t suggests then our long straddle spread is long raw click vega AND weighted vega. If M1 vol increases by 1 point and M6 increases by .75 points, then the straddle spread and forward are expanding quickly. The beta of the back month vol to the front is high.

We can regress GLD 6M vol vs 1M vol to see the sensitivity. Remember a sensitivity of .41 would be √t or constant straddle spread scaling.

√t scaling seems to underestimate the beta of 6M vol to 1M vol. A long straddle spread would be long weighted vega not just raw vega.
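If you want to run the same test on your own data, a bare-bones version looks like this. It is a sketch that regresses daily changes in 6M vol on daily changes in 1M vol (regressing changes rather than levels is my assumption here), and the numbers are hypothetical, not the GLD series.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


# Hypothetical daily vol changes, in vol points
d_1m = [1.0, -0.5, 2.0, -1.5, 0.5, -1.0]
d_6m = [0.6, -0.2, 1.1, -0.9, 0.4, -0.5]

beta = ols_slope(d_1m, d_6m)
sqrt_t_beta = 1 / 6 ** 0.5  # what constant-straddle-spread scaling implies
print(f"empirical beta: {beta:.2f}  sqrt(t) beta: {sqrt_t_beta:.2f}")
```

A beta above .41 means a straddle spread sized on √t weights is still long vega in practice.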

We can also compute the standard deviation of vol changes to see the vol of vol. You can see that the empirical 6M vol of vol is less than 1M vol of vol (totally expected) but it’s higher than what √t predicts.

If you truly wanted to be flat weighted vega you’d need to ratio the spread to have fewer units of the deferred straddle.
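A sketch of that ratio. It assumes raw ATM vega grows with √t and that the deferred vol moves `beta` points for every point in the front month. The .55 beta is a hypothetical input, not the GLD estimate.

```python
import math


def deferred_units_per_front(beta, front_months=1.0, deferred_months=6.0):
    """Deferred straddles to hold against 1 front straddle so that expected
    vega p/l nets to zero when deferred vol moves `beta` per front vol point."""
    raw_vega_ratio = math.sqrt(deferred_months / front_months)
    return 1.0 / (raw_vega_ratio * beta)


units_sqrt_t = deferred_units_per_front(1 / math.sqrt(6))  # sqrt(t) world
units_empirical = deferred_units_per_front(0.55)           # hypothetical beta
print(f"sqrt(t) beta: {units_sqrt_t:.2f} units  beta .55: {units_empirical:.2f} units")
```

When the back month is more reactive than √t suggests, a one-to-one straddle spread carries too much deferred vega, so you hold fewer than 1 deferred straddle per front straddle.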

Takeaways and Discussion

Vol ratios are common ways to represent the steepness of a term structure.

A tradeable expression of a vol ratio is the price of a vega-neutral straddle spread (using our example from today, if you buy 1 6M straddle and sell 2.45 1M straddles). Its performance mimics the vol ratio chart at the beginning of the post!

Ratios, just like beta-weighting, sterilize a position from the price level to isolate the slope.

You can even track or trade straddle butterflies to bet on the curvature of the term structure.

These are popular ideas from the futures and yield curve worlds:

Methodology – Yield Curve Spreads — https://yieldcurvespreads.com/methodology/

From that picture you can see how you might not have a strong opinion on A vs C (a slope bet that would be expressed in a ratio trade) but on the curvature between A and C (which would be expressed via butterfly or a “spread of spreads”).

The forward vol is an orthogonal measure

Implied forward vol changes even if the straddle spread doesn’t. We saw the many-to-one relationship between forward vol and the pairs they come from as well as the vol ratio.

💡Aside on Pairs trading💡

We’ve been talking about vol ratios along the term structure but we can extend our use of forward implied vols to inter-asset or pair trades.

In moontower.ai we have this vol ratio tool. Here USO vol looks rich compared to XLE

But you could also compare forward vol to forward vol difference in a matrix view.

These are cobbled together from 2 screenshots but you can imagine a UI which shows a matrix of all the individual forwards. Those implied forward ratios could then be compared to a vol cone of realized vol ratios between XLE and USO.

The orthogonal nature of implied forwards gives you another set of data to run through all your conventional views to see if something stands out.

I’ve mentioned many times in my writing how I hate vol trades that start with “skew is cheap or expensive”. My experience is that the “skew knows”. It’s highly self-fulfilling. Plus implied skew doesn’t vary as widely as realized skew, so you’re forcing convergence trades on a compressed implied range that doesn’t compensate for how sloppy vol can get on destabilizing moves.

Term structure trades on the other hand. This is the place to look.

Risk limits

Deferred vols are less volatile than near dated vols. It’s important to re-scale the vega per month. √t is as sensible a choice as any but your survival shouldn’t depend on any particular choice mattering that much since it will be wrong.

Just as you would never trust all of your delta risk management to the concept of beta, vol scaling weights should be taken with a grain of salt both in terms of modeling changes in term structure and in determining risk limits.

As a risk manager, if you constrain net vega without also constraining gross vega (ie the absolute value of vega within each expiry) you are inviting a situation where a book looks flat based on some weights but masks giant time spreads underneath the surface.
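Here is a toy version of that check. The book, the limits, and the choice to measure gross on √t-weighted vega are all hypothetical; the point is only that the two constraints catch different things.

```python
import math


def vega_limits_check(vega_by_month, net_limit, gross_limit, fulcrum_months=1.0):
    """Compare sqrt(t)-weighted net and gross vega against limits.

    vega_by_month maps months-to-expiry -> raw vega in $ per vol point.
    """
    weighted = [
        vega * math.sqrt(fulcrum_months / months)
        for months, vega in vega_by_month.items()
    ]
    net = sum(weighted)
    gross = sum(abs(w) for w in weighted)
    return {
        "net": net,
        "gross": gross,
        "net_ok": abs(net) <= net_limit,
        "gross_ok": gross <= gross_limit,
    }


# Hypothetical book: a giant time spread that nets to roughly zero on weights
book = {1.0: -500_000.0, 6.0: 1_224_745.0}
report = vega_limits_check(book, net_limit=50_000, gross_limit=400_000)
print(report)
```

The net check passes while the gross check flags about $1mm of weighted vega sitting in the time spread.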

Examining vol of vol directly as well as placing term structures into context with vol cones can offer an ensemble view to understanding how extreme time spreads can get.

💡A word on measuring vol of vol💡

In this post, I computed the standard deviation of vol changes. Any experienced trader knows that this is incomplete. Because of spot-vol correlation and skew, vol changes are not pure. They are a mixture of moving along the curve vs the curve shifting.

Always ask yourself — “what measure correlates to how my p/l performs?”. That’s what you care about.

In this case, you want to measure the vol of strike vol not some nebulous concept of floating ATM vol.

Extending the analysis

Gold doesn’t typically see much event vol priced into it. Some macro reports like unemployment or key Fed meetings will get their bump in the term structure but it’s not the same degree as say earnings for a single stock or the annual USDA Prospective Planting report in ags like corn and soybeans. These sorts of events can have 1 or 2 weeks of vol priced into a single day.

Event variance propagates through the term structure in inverse proportion to the DTE. To say it in friendlier terms…events don’t impact the deferred months much relative to the fronts.
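A quick sketch of the dilution. It assumes every ordinary trading day carries the same variance and that the event day is worth some multiple of an ordinary day. The 15% base vol and the 10-day multiple (about 2 weeks) are hypothetical inputs.

```python
import math


def vol_with_event(base_vol, dte, event_day_multiple):
    """Term vol when one of the `dte` trading days carries
    `event_day_multiple` ordinary days' worth of variance."""
    variance_days = (dte - 1) + event_day_multiple
    return base_vol * math.sqrt(variance_days / dte)


base = 0.15
front = vol_with_event(base, dte=21, event_day_multiple=10)   # ~1 month
back = vol_with_event(base, dte=126, event_day_multiple=10)   # ~6 months
print(f"1M vol: {front:.2%}  6M vol: {back:.2%}  (base {base:.0%})")
```

The same event adds close to 3 vol points to the front month but only about half a point to the 6-month.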

I will re-run all these charts for another name (thinking NVDA) for the past year and see what I come up with. I expect a lower vol beta than we saw in GLD but mostly as an artifact of the near-dated vols having a much wider range because of earnings.

Options Riddle

I saw a familiar type of riddle on Twitter that was directed at fundamental PMs. I gave a lazy answer and later improved it with a better answer after my half-assed-ness gnawed enough at me.

I’ll reprint the riddle and the better answer here but spelling out the steps in greater detail than I did on twitter.

Question:

Estimate the price of a $180 call (20% OTM) on a $150 stock with 50% volatility, 3 months to expiry

150 Call Calculation (The ATM option)

We start by estimating the at-the-money (ATM) call value using:

ATM straddle = .8 * stock price * implied vol * √(Time to expiry in years)

ATM Call = .4 * stock price * implied vol * √(Time to expiry in years)

ATM Call = 0.4 × $150 × 50% × √(1/4) = $15
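The same approximation as a pair of helper functions, in case you want to play with other inputs. This is only the mental-math rule of thumb, not a full pricing model.

```python
import math


def atm_straddle(spot, vol, years):
    """Rule of thumb: straddle ~ 0.8 * S * vol * sqrt(T)."""
    return 0.8 * spot * vol * math.sqrt(years)


def atm_call(spot, vol, years):
    """Half the straddle: call ~ 0.4 * S * vol * sqrt(T)."""
    return 0.4 * spot * vol * math.sqrt(years)


call = atm_call(150, 0.50, 0.25)
straddle = atm_straddle(150, 0.50, 0.25)
print(f"ATM call: ${call:.2f}  ATM straddle: ${straddle:.2f}")
```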

180 Call Calculation (The OTM option)

The 150/180 call spread links the 150 call to the 180 call.

Call Spread Value Breakdown

The call spread’s total probability of expiring ITM is around 45%. This is another estimate off the top of my head.

Although you’d expect 50%, option models assume lognormal stock distributions because returns are compounded. Compounded or geometric returns are subject to “volatility drain”— pulling median price expectations lower than the forward price.
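You can sanity check the 45% guess. This sketch assumes a lognormal stock with zero interest rates and no dividends, in which case the probability of finishing above a strike is N(d2).

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def prob_above(spot, strike, vol, years):
    """P(S_T > strike) for a lognormal stock with zero rates and dividends."""
    sd = vol * math.sqrt(years)
    d2 = (math.log(spot / strike) - 0.5 * sd ** 2) / sd
    return norm_cdf(d2)


p_up = prob_above(150, 150, 0.50, 0.25)
print(f"P(stock finishes above $150): {p_up:.1%}")
```

For the ATM strike d2 is just minus half of vol x √T, so the higher the vol, the further the probability sinks below 50%.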

You can think of the expected value of the $150/$180 call spread in two parts:

The probability that it expires worth its maximum value of $30. This is P(S>$180)
The value on average when the stock expires between $150 and $180.
This is 45% – P(S>180)

Computing P(S>180)

Note that the straddle is simply 80% of a standard deviation.

The $180 call is conveniently $30 OTM or .80 standard deviations OTM

We know that 1 standard dev encompasses 68% of a distribution, so at a z-score of +1.0 the one-tailed CDF must be 16%

Spelling that out: 100% – 68% = 32% but we only care about the “up” case when the call is ITM, so we cut that in half to 16%.

Since this exercise is supposed to be all mental math, I’ll guess that a Z-score of 0.80 gives a one-tail CDF of ~ 20%, meaning there’s a 20% chance this call will expire in the money (ITM).

We will assume the 180 strike has P(ITM) = 20%
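If you want to grade the mental math, the one-tailed probabilities for a standard normal take a few lines with only the standard library.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


tail_1sd = upper_tail(1.0)
tail_08sd = upper_tail(0.8)
print(f"P(Z > 1.0) = {tail_1sd:.1%}   P(Z > 0.8) = {tail_08sd:.1%}")
```

The .8 standard deviation tail comes out near 21%, so the 20% guess is close enough for mental math.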

Expected Value Calculation for the 150/180 call spread

The case where stock > 180
E(call spread | S>180) = Max value x P(S>180) = $30 x 20% = $6
Case where S is between $150 and $180
E(call spread | 150<S<180) = Average value of the call spread when S is between the strikes x P(stock between 150 and 180) = $15 x 25% = $3.75

💡Why $15?

The average roll of a die is 3.5

The average roll of a die given that the roll is greater than ‘3’ is 5. This assumes a uniform distribution over that range.

This same style of approximation works well enough for the call spread. Assuming the stock expires between 150 and 180, the call spread is worth $15 on average. The probability it expires between those strikes is the total probability of the stock expiring higher than $150 which I estimated earlier as 45% minus the probability of it roofing above $180 which we estimate at 20%. So the probability of the stock being between 150 and 180 is about 25%

Hence, $15 x 25%

We sum all scenarios where the call spread expires ITM (ie when the stock is above $150):

Call spread estimate: $6 + $3.75 = $9.75

If the 150 call is worth $15 and the 150/180 call spread is worth $9.75, then the 180 call is worth $5.25
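The whole chain of estimates fits in a short script. The 45% and 20% probabilities are the same off-the-cuff guesses used above, and the midpoint value between the strikes is the uniform assumption from the die example.

```python
import math

spot, strike, vol, years = 150.0, 180.0, 0.50, 0.25

# Step 1: ATM call from the 0.4 * S * vol * sqrt(T) rule of thumb
atm_call = 0.4 * spot * vol * math.sqrt(years)

# Step 2: probability guesses from the mental math
p_above_atm = 0.45     # P(S > 150), below 50% because of volatility drain
p_above_strike = 0.20  # P(S > 180), roughly a 0.8 standard deviation tail

# Step 3: value the 150/180 call spread in two pieces
width = strike - spot
value_above = width * p_above_strike                            # worth the full $30
value_between = (width / 2) * (p_above_atm - p_above_strike)    # worth $15 on average
call_spread = value_above + value_between

# Step 4: the OTM call is the ATM call minus the call spread
otm_call = atm_call - call_spread
print(f"call spread: ${call_spread:.2f}  180 call: ${otm_call:.2f}")
```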

Recapping key bits:

Knowing the ATM straddle approximation .8SV√T
Guessing that the probability of an up move of more than .8 standard deviations is ~20%
Estimating that the probability of the stock going up is less than 50% in a Black Scholes price process (and that at 50% vol that probability is lower than say at 16% vol — in fact the drag is proportional to vol squared)

In the twitter discussion, a great link from 2012 emerged:

Calculating option prices in your head (7 min read)

The Hardy Decomposition offers a handy way to estimate OTM option prices in your head. By breaking down an option’s price into intrinsic value and a HardyFactor (which depends on how far you are from the strike, measured in standard deviations), you can quickly approximate the time value of the option.

The following comes from the post:

Option Price = Intrinsic + ATMPrice*HardyFactor

The HardyFactor is:

d1 is just how many standard deviations you are from the strike.

⚠️Looking at a quant forum it looks like the HardyFactor approximation is for options being priced with the ‘normal’ distribution version of the B-S model as opposed to the more commonly used lognormal version

Revisiting the riddle

If we revisit the riddle, we know the 180-strike has a d1 = .8 standard devs

If we linearly interpolate between .5 and 1 we get a HardyFactor = 40%

Option Price = Intrinsic + ATMPrice*HardyFactor

180 call = 0 + $15 * 40% = $6

My call spread method yielded $5.25

The HardyFactor method (quickly) got us to $6.00

Sounds like we have a decent market!

I put it into an option calculator:

https://www.cboe.com/education/tools/options-calculator/

Pretty fun stuff. If the OTM call IV is discounted by 1 vol point (so -2% skew vs the 50% ATM IV @ the .27 delta option) then the theoretical call value would be $5.616 – .2575 (ie the vega) ~ $5.36
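If you don't have a calculator handy, a plain Black-Scholes function gets you in the same neighborhood. This sketch assumes zero interest rates and no dividends, so it will not match the calculator's output to the penny.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def bs_call(spot, strike, vol, years, rate=0.0):
    """Black-Scholes call price and delta for a non-dividend stock."""
    sd = vol * math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / sd
    d2 = d1 - sd
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)
    return price, norm_cdf(d1)


price, delta = bs_call(150, 180, 0.50, 0.25)
print(f"180 call: ${price:.2f}  delta: {delta:.2f}")
```

Both mental-math estimates straddle the model value, which is about as much as you can ask from arithmetic done in your head.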

If you want more reinforcement on this I wrote a thorough twitter thread explaining vertical spread comprehension in detail.

my aversion to trading implied skew

First of all, free subs to moontower.ai can access a few tools and reading materials as well as the community but they cannot post and can’t see analytics.

Here’s a question that was posted in the community this week:

I was reading thru an old tweet of yours on trading skew. The tl;dr of the tweet was don’t trade skew… Given I am in a masochistic mood, how would one go about backtesting a skew trading strategy?

I had 2 ideas, which I’d love to get your thoughts on.

Idea 1:

X asset 25d 3M normalized put skew is in the 100th percentile, sell a 25d put strike, delta hedged
hedge the delta daily or at some discrete interval
check how this strat would have performed assuming the trade is held until expiry

Idea 2:

X asset 25d 3M normalized put skew is in the 100th percentile, sell a 25d put strike, delta hedged
wait until normalized skew returns to some threshold, for example 75th percentile
hedge the delta daily but close out the trade as soon as the threshold is hit

Lots of questions, but the main ones are:

for idea 1, does your pnl depend on implied skew vs realized skew (similar to implied vol vs realized vol). How would you measure this?
for idea 2, does your pnl depend on a combo of realized skew (for as long as the trade is held) as well as surface repricing (ie selling at 100th percentile implied skew and closing out at 75th percentile). The thought of measuring this gives true masochists vibes, but how would you?
I wonder if the juice is worth the squeeze? Meaning, assuming you built the foundation to measure/test all the above, is there really any pnl in it / are you better off focusing on VRP trades?

My response:

As a matter of practicality, I think the test should be more in the vein of idea #1.

If you consider skew percentiles, the difference between the 25th and 75th percentile could be some absolutely small number like 2 vega points. And the level of skew itself measured by percentile is sensitive to the percentile lookback such that the range you are trading over is just quite small. Your interim p/l will be the sum of implied vol change p/l plus realized delta hedging p/l.

But consider this…let’s say you sell the 25d put and it becomes a 50d put but the skew normalizes. That skew metric is no longer referencing your position. You have a floating vs fixed problem. In other words, you can’t really trade implied skew directly.

Your results are basically going to come down to path. Your interim p/l is going to get marked based on the IV of the fixed strike you have on and that in turn is going to influence the delta you hedged on.

The delta you hedge on is going to have a large impact on your final p/l so it’s not just where does the stock go but what deltas were imputed along the way. For example suppose you run a model with spot/vol correlation embedded in the SP500…this will generate higher OTM put deltas.

If the market trends down you will win to this but vice versa. However, if you used B-S deltas you will get hurt as the market goes down and vice versa. And even then, you will def get hurt on the marks, but if the stock expires near the short strike you will probably still win by expiration even though the mark-to-market path is hairy.

I used to work with a big oil options trader that would on a monthly basis stick a hedged 1-month risk reversal in a separate account and hedge it on B-S deltas. My point is that is an active choice that influences the results. Another choice could be to hedge on deltas that don’t incorporate implied skew at all but just use ATM vols.

Overall, testing the idea, even with a Monte Carlo, is a great way to get the shape of the problem and, more importantly, to see how the parameters you choose impact the p/l path.

I’m not kidding when I say skew trading is masochism. If oil is $75 and has massive put skew and the market drifts down to $55 and the skew gets hammered (so say the 40 puts don’t perform) but you sold the 60 put, what skew did is irrelevant. All that will matter is how fast did the stock go to $55 and what deltas were you running on the $60 strike along the way.
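To make that concrete, here is a toy Monte Carlo. It is a sketch under heavy assumptions: constant-vol geometric Brownian motion with no drift, zero rates, daily hedging with no costs, and no skew or spot/vol dynamics at all. Every parameter is hypothetical. All it does is sell one put and hedge it on Black-Scholes deltas computed at different hedge vols.

```python
import math
import random
import statistics


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2))


def put_price_and_delta(spot, strike, vol, years):
    """Black-Scholes put price and delta with zero rates."""
    sd = vol * math.sqrt(years)
    d1 = (math.log(spot / strike) + 0.5 * sd ** 2) / sd
    d2 = d1 - sd
    price = strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return price, norm_cdf(d1) - 1.0


def simulate_path(spot, realized_vol, days, rng):
    """Daily closes from driftless geometric Brownian motion."""
    path = [spot]
    dt = 1 / 252
    for _ in range(days):
        z = rng.gauss(0.0, 1.0)
        step = -0.5 * realized_vol ** 2 * dt + realized_vol * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(step))
    return path


def short_put_hedged_pnl(path, strike, sale_vol, hedge_vol):
    """P/l at expiry from selling 1 put and delta hedging at each close."""
    days = len(path) - 1
    pnl, _ = put_price_and_delta(path[0], strike, sale_vol, days / 252)
    for day in range(days):
        years_left = (days - day) / 252
        _, delta = put_price_and_delta(path[day], strike, hedge_vol, years_left)
        # Short put is long delta, so the hedge holds `delta` shares (a short)
        pnl += delta * (path[day + 1] - path[day])
    return pnl - max(strike - path[-1], 0.0)


rng = random.Random(7)
paths = [simulate_path(75.0, 0.35, 63, rng) for _ in range(400)]

results = {}
for hedge_vol in (0.30, 0.40, 0.50):
    pnls = [short_put_hedged_pnl(p, 70.0, 0.40, hedge_vol) for p in paths]
    results[hedge_vol] = {
        "mean": statistics.mean(pnls),
        "stdev": statistics.pstdev(pnls),
    }
    print(f"hedge vol {hedge_vol:.0%}: {results[hedge_vol]}")
```

In a driftless simulation like this one the hedge vol mostly changes the dispersion of outcomes rather than the average, since the stock trades have no expected p/l. Swap in trending paths or a spot/vol rule and the delta choice starts to drive the average too.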

The weirder the distribution the crazier this is. I’ve seen nat gas option traders blow out being long put skew on a 15% drop in the underlying because they used too high of an implied option delta and they delta hedged several times on the way down.

Had they run a lower vol and delta OR hedged less they might have survived. There’s not much lesson from this other than…sometimes a 15% selloff is interpreted by the market as “stabilizing” and sometimes it’s destabilizing and that is what’s gonna dictate the options behavior.

Volatility term structure from multiple angles (part 1)

The post Dragonfly Eyes served as a broad preamble to our exploration today. Be like a dragonfly: look through multiple lenses.

We’re going to expand our thinking about volatility term structure to see why it’s a diamond with several facets — and most interestingly — why multiple ways of looking at it are not all correlated.

We are going to consider volatility term structure in a few ways. The differences will make the value of multiple lenses self-evident. The source of the differences has highly practical ramifications for 3 tasks:

risk management
surface modeling
trade prospecting

I would be surprised if even an experienced trader didn’t walk away folding the Rubik’s cube known as vol term structure in their head. If anything, I’m sure a seasoned trader can find some interview questions embedded in the concepts to bounce off candidates.

If you are a novice trader, you will still benefit. There’s nothing more than arithmetic in here. The value of hacking ideas from several vantage points will be obvious plus you will learn some basic transformations and measures that the more experienced folk take for granted.

About this post

The post is the first of a 2-parter. It won’t all fit in the single email view plus I’ve been under the weather this week. All the background work is done but I’m running on fumes to write it all up.
It’s a semi-Socratic progression of “show don’t tell” which serves to make the lessons your own.
We will use GLD vol data for the past year which was whimsically chosen. I didn’t snoop at the data first.
We get into implications for your own procedures.*
Finally, I talk about how and why I will extend the analysis.

*The word “your” prompts a reasonable question — who’s this for? I’m imagining a trader or risk manager at a prop shop/asset manager or an extremely sophisticated retail option trader. The material comes from pragmatism and experimentation. A durable way of seeing based on lots of pain. This is the stuff of salt mines. The way traders think.

Where does this intersect with quant and formal risk management?

Everywhere.

Quants may have a different language and set of methods for computation but the concerns are the same. I’ve said it before, but the caricature of the ivory tower theoretical physics PhD without street smarts is foreign to me. All of the gigabrain quants I’ve worked with were both practical and exceptional at asking questions. Their priority was reality.

Their knowledge becomes indispensable as risk management scales and portfolios become far more complex across multiple strategies and asset classes. I have little to add to the code-level minutiae of implementing a large-scale risk OS. I pretty much operated at the frontier of how much I could keep in my head at once but as combinations expand exponentially, well, you’re gonna need a bigger boat.

Let’s open with a “simple” question:

If you buy a 6 month/1 month straddle spread on an equity or ETF, are you long vega?

(To be linguistically clear, you are buying the 6 month straddle and shorting the 1 month)

If I’m asking the question you already know there’s more to this than simply “6 month option vega is > 1 month option vega” so YES.

You can see where I’m going with this if I use an analogy question. If I buy X dollars NVDA and short Y dollars of SPY, am I long the market if X = Y?

The question comes back to beta. Beta is a function of correlation and vol ratio. Just like with equities to a benchmark, the correlation of vol changes across a term structure is usually quite strong. We are mostly concerned with how much the vol moves with respect to another vol.

We don’t expect 2-year implied vol to move as much as 1-week implied vol. In the NVDA question, beta tells us the ratio of X to Y to be “market neutral”.

Going back to the vol question. It’s true that you are long vega if you buy a 6 month straddle and short a 1 month straddle. You are long “click vega”. If the entire term structure parallel shifted higher 2 points you will win 2 x [net vega].

But vols don’t generally move in lockstep across the term structure. Instead, it’s common to weight the vols by their sensitivity to a fulcrum month and then re-scale all your monthly vegas by the weights.

So the answer to the riddle might be YES you are long vega but it’s not necessarily the most helpful answer. If vol parallel shifts up, you will definitely win and the idea that you are long vega was certainly true in this unusual dynamic. It is reasonable to expect 1-month vol to increase faster than the 6-month vol.

This brings us to our next question.

If the front month vol increases by 1 point, how much does the 6-month vol need to increase to merely break even?

Hint: For an ATM (well technically at-the-forward) straddle the only things that affect the vega are spot price and DTE

This is an equity or ETF so the spot price is the same for both months. Note this would not be true for futures which have a curve of different underlyings.

The difference in vega is proportional to sqrt(dte). In this case the sqrt (6 / 1) = 2.45

[If this is not clear, recall the ATF straddle approximation from The MAD Straddle is straddle = .8Sσ√T.

Straddle vega is just change in straddle price per 1 point change in vol. We just re-arrange the formula:

straddle/σ = .8S√T

Since we are comparing 2 months, .8* S cancels out and we are left with the vegas being proportional to √T]

If 1 month vol increases by 1 point, then the back month vol needs to increase by 1/2.45 or .41 to keep the straddle spread price constant (ignoring theta).

If 6-month vol increases by more (less) than .41, we make (lose) money on the vol expansion.
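If you want to check the arithmetic yourself, here is a small sketch built only on the straddle = .8Sσ√T approximation above. The function names and the $100 spot / 20% vol inputs are made up for illustration.

```python
import math


def atf_straddle(spot, vol, t_years):
    """At-the-forward straddle approximation: .8 * S * sigma * sqrt(T)."""
    return 0.8 * spot * vol * math.sqrt(t_years)


def straddle_vega(spot, t_years):
    """Change in the approximate straddle price per 1 vol point (0.01)."""
    return 0.8 * spot * math.sqrt(t_years) * 0.01


def breakeven_back_move(front_months, back_months, front_move_pts=1.0):
    """Vol points the back month must gain to offset front_move_pts in the front month."""
    return front_move_pts * math.sqrt(front_months / back_months)


spot, vol = 100.0, 0.20  # hypothetical inputs
vega_ratio = straddle_vega(spot, 6 / 12) / straddle_vega(spot, 1 / 12)
back_move = breakeven_back_move(1, 6)

# Long the 6-month straddle, short the 1-month straddle.
before = atf_straddle(spot, vol, 6 / 12) - atf_straddle(spot, vol, 1 / 12)
after = (atf_straddle(spot, vol + back_move / 100, 6 / 12)
         - atf_straddle(spot, vol + 0.01, 1 / 12))

print(f"vega ratio 6M/1M: {vega_ratio:.2f}")
print(f"breakeven 6M move for +1 point in 1M: {back_move:.2f} points")
print(f"spread price before {before:.4f}, after {after:.4f}")
```

Spot and the .8 both drop out of the ratio, which is why the breakeven depends only on the two DTEs.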

Whether we are long or short vega is more ambiguous than it appears from our headline “click vega” measure.

√t Vega Scaling

Just like beta-weighting a basket of stocks allows us to group directional exposure into equivalent SPX delta, it’s common to weight vega as a function of √dte. You may choose a “fulcrum” term such as 180 days to anchor the definition of vega and then re-scale each month’s vega by dividing it by √(DTE/180), so a 30-day vega counts for about 2.45x its raw value in 180-day terms.

This is review from Understanding Vega Risk:

This kind of scaling allows us to summarize the position with a statement like:

“If 6-month vol increases (decreases) by 1 point, I expect to lose (make) $13k”

This would not be obvious if you are looking at the sum of raw vega.

√t scaling is not pulled out of a hat. It corresponds to a world where straddle spreads are constant (again, ignoring theta). Armed with that straddle approximation formula, it is simple to prove that to yourself.
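A quick sketch of the re-scaling, assuming a 180-day fulcrum and writing the weight as a multiplier of √(fulcrum/DTE). The position is hypothetical: long a 6-month straddle with $1,000 of vega and short the same quantity of 1-month straddles, which by the √T relationship carries 1,000/√6 of vega.

```python
import math


def sqrt_t_weighted_vega(raw_vega_by_dte, fulcrum_dte=180):
    """Re-express each expiry's raw vega in fulcrum-equivalent terms.

    Assumes constant straddle spreads, i.e. a 1 point move in the fulcrum vol
    goes with a sqrt(fulcrum_dte / dte) point move in every other expiry.
    """
    return {dte: vega * math.sqrt(fulcrum_dte / dte)
            for dte, vega in raw_vega_by_dte.items()}


# Hypothetical 1x1 straddle spread: long 180-day, short 30-day.
raw = {180: 1000.0, 30: -1000.0 / math.sqrt(6)}
weighted = sqrt_t_weighted_vega(raw)

print(f"raw net vega:      {sum(raw.values()):8.1f}")
print(f"weighted net vega: {sum(weighted.values()):8.1f}")
```

The raw sum says you are long vega. The weighted sum says that in a constant straddle spread world you are flat, which is the same statement as the breakeven math from earlier.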

There’s nothing gospel about this weighting scheme. In the moontower.ai app users can view daily vol changes scaled to several choices of tenors. The point is that by normalizing at all, it is easy to see which straddle spreads changed. Implied volatility is a shortcut to get to a price and prices are what your p/l depends on. If you have a time spread on, the change in its price is the thing you care about.

Any vol weighting scheme you choose will not be perfectly accurate (if it were you could literally predict how the vols would change relative to each other, in which case you should have already chucked your phone in the ocean from the hammock of your private island). But it’s a gigantic improvement in risk monitoring from raw vega. Of course, any summary measure is a trade-off between convenience and resolution. The explicit trade-off with √t scaling:

Benefit: Easy to see changes in time spread prices across the term structure. Highly intuitive and interpretable.

Drawback: Inaccurate. Said differently — there’s lots of room to improve the accuracy of the weights from empirical data which will lead to better understanding of vega risk.

Personally, I think dialing in better weights would increase cognitive load. You’ll need to think about what regime or lookback the updating weights are drawn from. The inaccuracies of the constant straddle spread assumption are “the devil I already know”.

[At the risk of wasting digital ink, it should be obvious that designing metrics always depends on user context. Building infra and dashboards is an exercise in being a product manager that serves many clients: traders, risk managers, accounting, and back office.]

√t scaling is a meaningful but rough improvement in measuring vega by weighting each month differently. Constant straddle spread is an implicit dynamic embedded in the calculation. It’s an assumption. So how do vols across a term structure actually move relative to one another?

Let’s see what we can discover by hackin’ on some data.

Viewing term structure

We are used to looking at charts like this:

That’s a snapshot of the SPX IV term structure. It’s an ascending shape. We used the term “steepness” to refer to the ratio of 1M/6M vol. In this case it is less than 1.0 since 6-month vol is premium to 1-month.

It’s a common shape (in oil we referred to this as the “droopy penis”). It’s relatively quiet, near vols are subdued, back months expect mean reversion to longer-term averages.

In early August, this would have been sharply descending with a steepness much greater than 1.0 as the near term stress and high realized vol was baked into the front of the IV term structure and sloping down to vols which suggest the markets will eventually calm down.

Instead of a snapshot, a time series can help us capture all that motion. We are going to use about 1 year of GLD data (10/2/23-9/23/24) for the rest of this post.

This is a daily time series of 10d, 30d, 90d, 6M IV. The darker blue line is the 10d IV and the sparkly blue line is the 6m IV. You can see how the 10d IV is itself quite volatile sometimes sagging below the 6m IV (“the droopy penis”) and sometimes shooting way above it into a backwardated or inverted term structure.

This is an instructive way to see term structure behavior albeit highly zoomed out.

Let’s use another chart for a closer look. We will include 2 time series:

The 1M/6M vol ratio
The 1M/6M implied forward vol

A forward vol represents the implied amount of volatility that exists between the 1M and 6M expirations. (It sounds complicated but 1 month VIX future settles to “what will the VIX, a 30d forward looking measure, be in 1 month”)

You can use the free moontower.ai forward vol calculator to play with the idea and read about the concept.

The central point is that forward vol is another way to consider the difference in relative volatility between 2 expirations. Seeing things in multiple ways is the focus of this post.
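For the arithmetic-minded, here is a sketch of the textbook forward vol calculation. It assumes variance is additive in calendar time and nothing else (no event days, no trading-day weighting), so it may not match the moontower.ai calculator exactly. The 12% and 15% inputs are invented.

```python
import math


def forward_vol(vol_near, t_near, vol_far, t_far):
    """Implied vol for the window between two expiries, assuming additive variance.

    Total variance to the far expiry = variance to the near expiry
    plus forward variance over the gap between them.
    """
    fwd_var = (vol_far ** 2 * t_far - vol_near ** 2 * t_near) / (t_far - t_near)
    if fwd_var < 0:
        raise ValueError("negative forward variance: inputs imply a calendar arbitrage")
    return math.sqrt(fwd_var)


# Hypothetical term structure: 1-month vol at 12%, 6-month vol at 15%.
near, far = 0.12, 0.15
print(f"1M/6M vol ratio: {near / far:.2f}")
print(f"1M-to-6M forward vol: {forward_vol(near, 1 / 12, far, 6 / 12):.4f}")
```

Note the forward comes out above the 6-month vol. When the front is below the back, the remaining window has to "make up" for the cheap first month to average out to the 6-month number.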

What stands out about this chart?

Here’s what I see:

The forward vol is sometimes high and sometimes low regardless of the ratio.

Let’s say you get an 80 on a test, but there’s a prediction market on your final grade in the class that is trading for 90. The market is implying that you are going to be acing the rest of your tests.

Conceptually the math here is similar — when the front month vol is low and the vol ratio is far below 1.0 (steep term structure) the forward vol can and often does look quite high. The market makes it expensive to lock in cheap volatility for a long time even when the vols are low (the expense will manifest as “roll down”).

But look at early March — not only was the vol ratio low, the forward got crushed. If you only look at vol ratio, you missed this.

Implied forwards are an orthogonal or complementary measure of relative volatility that is additive to your perspective.

I’m a big fan of waiting for imperfectly correlated signals lining up to size up.

Remember, dragonfly eyes.

Next week we will continue with part 2 where we:

deconstruct the nature of this relationship further
consider what the differences between lenses means for finding opportunities and measuring risk
substantiate both the how and why of extending the analysis
- I plan to actually publish the extended analysis as well, but it won’t fit in next week’s letter on top of the rest of this

“How did you solve that math problem?”

The last few issues I’ve talked about mathacademy.com (no less than 7 readers are now doing it for themselves and/or for their kids!).

My mother was visiting this week and was doing the diagnostic over my shoulder while I was working on it. It really bugged her to realize how out of practice she was in elementary math so we went through some refreshers.

We reviewed a bunch of exponents stuff, for example, why 1/2 of 2²⁰ is 2¹⁹.

This is apparent when you think about it. But one of the things I noticed about how she and I do math is how methodical she is with trying to find the formula and how that’s not my first instinct at all. My first reach is always “what’s a simpler analogy and then extrapolate”. If that doesn’t work then get the pencil. I mean a lot of my motivation for retaking math ed is because my only mode is ‘trader math’. Formulaically, I reminded her that multiplying by 1/2 is the same as 2⁻¹ which is how she relates to the problem — she knows the rule for multiplying exponents with the same base is to add the exponents.

[My mom reads moontower believe it or not so it’s nice to share this in print even if a bit corny— we’ve always bonded over math. She went back to school in her 50s to get a college degree. She even took Java and C++. She is a determined learner at heart even if formal education took a backseat to more urgent pragmatism. She cut her college days short to work and get married back in the 70s. I was born the week she turned 24. Meanwhile, my eldest was born hours after my 35th birthday. Just acknowledging the change in norms in a single generation makes me feel like a flea in the sweep of time — no need to invoke cosmic proportion or even geographic birth lottery to think of how lucky I am to feel even remotely resourced while my kids are still kids.]

If you want a similar math problem to practice I shared a Barclays quant question back in July:

Lily pad

You start with a single lily pad sitting on an otherwise empty pond. You are told that the surface area of the lily pad doubles every day and that it will take 30 days for the single lily pad to cover the surface of the pond.

If instead of one lily pad you start with eight lily pads (each identical in characteristics to the original single lily pad), how many days will it take for the surface of the pond to become covered?

A thought on the Lily Pad question and more:

[My son Zak solved it just like I did — by realizing the answer is the same as if you started after Day 3. My mother preferred the 2³⁰ / 8 = 2³⁰ / 2³= 2²⁷. The different ways we reason through a problem show up yet again.

I suspect my son is railroaded into my method because it wasn’t natural for him to see that representing 8 as 2³ was desirable for the purpose of doing exponent division (which follows a mechanical rule of subtracting exponents).

But getting to the formulaic version is what my mom searches for first.

Even when I was on the trading floor where you had to do mental math quickly to make markets, I enjoyed asking the people standing next to me how they priced the structure. There was a lot of variation. It’s a fun thing to ask others and, as I discovered, people usually like explaining how they mental math so it’s an all-around feel-good exercise.

One of the things I like about common core math is the emphasis on seeing numbers in different ways. My 8-year-old reflexively turns numbers into “friendly numbers” ie ending in 0s before doing operations, then undoing the adjustments before finalizing his answer. They are taught to do this. People my age usually landed on this method organically. But it’s good to teach it.

That said, Nate Bargatze owns the best common core bit:

Money Angle

Here’s a question I made for my mother to drill the exponent stuff that doubles as an investment problem.

For a fixed tax rate and rate of return is it better to have your return taxed every year or wait to be taxed on the gains all at once at the end?

Knowing the answer to the question is useful in itself but I also want to mention a collateral benefit. The meta-process for approaching the question can help organize your numerical intuition.

Think of what is required to answer:

1) recognition

What kind of problem is this?

Well, it’s a compounding problem.

What does that tell us about the function?

It’s exponential. It takes the form y = abˣ

2) ask yourself where the variable in question (in this case the tax rate) makes the largest impact

Is it as part of the a or the b?

Since the b gets exponentiated (the historical term for this is “involution” or “involuted”) the tax term will have its largest impact there.
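If you want to check your answer with numbers, here is a small sketch. It assumes a constant return and a flat tax rate applied only to gains, starting from $1. The 10% return, 30% tax, and 20 years are arbitrary.

```python
def taxed_annually(rate, tax, years):
    """Terminal wealth per $1 when each year's gain is taxed as it is earned.

    The tax sits inside the base that gets exponentiated.
    """
    return (1 + rate * (1 - tax)) ** years


def taxed_at_end(rate, tax, years):
    """Terminal wealth per $1 when the cumulative gain is taxed once at the end.

    The base compounds untaxed; the tax only scales the final gain.
    """
    growth = (1 + rate) ** years
    return 1 + (growth - 1) * (1 - tax)


rate, tax = 0.10, 0.30  # hypothetical inputs
for years in (1, 5, 20):
    print(years, round(taxed_annually(rate, tax, years), 4),
          round(taxed_at_end(rate, tax, years), 4))
```

Play with the horizon. The gap between the two columns is the cost of letting the tax term into the b.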

I gotta run — I only have hours to secure my spot in mathacademy’s Iron League. I can’t not be gamified.

☮️

a riddle related to American-style options

Friends,

I saw a fun riddle this week. To get you in the right mindset before sharing it I’ll introduce the so-called secretary problem. I first came across this concept when I was a trainee at SIG from John Allen Paulos’ Innumeracy in the context of choosing a mate.

From Wikipedia:

The basic form of the problem is the following: imagine an administrator who wants to hire the best secretary out of n rankable applicants for a position. The applicants are interviewed one by one in random order. A decision about each particular applicant is to be made immediately after the interview. Once rejected, an applicant cannot be recalled. During the interview, the administrator gains information sufficient to rank the applicant among all applicants interviewed so far, but is unaware of the quality of yet unseen applicants.

The question is about the optimal strategy (stopping rule) to maximize the probability of selecting the best applicant. If the decision can be deferred to the end, this can be solved by the simple maximum selection algorithm of tracking the running maximum (and who achieved it), and selecting the overall maximum at the end. The difficulty is that the decision must be made immediately.

The shortest rigorous proof known so far is provided by the odds algorithm. It implies that the optimal win probability is always at least 1/e or about 37%

The reason the secretary problem has received so much attention is that the optimal policy for the problem (the stopping rule) is simple and selects the single best candidate about 37% of the time, irrespective of whether there are 100 or 100 million applicants.
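You don't have to take the 37% on faith. Here is a minimal simulation sketch of the classic stopping rule: look at the first n/e applicants without hiring, then take the first one who beats everyone seen so far. The function names and the choice of 100 applicants are mine.

```python
import math
import random


def secretary_trial(n, rng):
    """Run the look-then-leap rule once; return True if the overall best was hired."""
    ranks = list(range(n))      # higher number = better applicant
    rng.shuffle(ranks)          # random interview order
    cutoff = int(n / math.e)    # size of the look-only phase
    best_seen = max(ranks[:cutoff], default=-1)
    for rank in ranks[cutoff:]:
        if rank > best_seen:
            return rank == n - 1  # hired this one; was it the best overall?
    return False  # nobody beat the look phase, so the best was already rejected


def success_rate(n, trials, seed=1):
    """Fraction of trials in which the rule hires the single best applicant."""
    rng = random.Random(seed)
    return sum(secretary_trial(n, rng) for _ in range(trials)) / trials


print(f"n=100: best applicant hired in {success_rate(100, 20000):.1%} of trials")
```

Try changing n by an order of magnitude and see how little the hit rate moves.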

Key Insights

[These are a mix of my thoughts and Llama 3.1, the LLM you can chat with from Whatsapp]

37% provides sufficient information about the distribution of quality.
Maximizes probability of selecting the best option.
Balances exploration and exploitation
(this should remind you of the multi-armed bandit problem — a problem so diabolical that the Allies considered “dropping” it on German scientists as the ultimate nerdsnipe — to distract them from the more urgent matter of developing weapons. See my notes from Algorithms To Live By author Brian Christian)

Real-World Applications

Job Searching: Interview 37% of candidates before making an offer.
Dating: Meet 37% of potential partners before committing.
Shopping: Research 37% of options before purchasing.
Recruitment: Screen 37% of applicants before inviting for interviews.

Assumptions

Random arrival: Options arrive randomly and independently.
No recall: Previously rejected options cannot be revisited.
No additional information: No new information becomes available after observing an option.

Limitations

Small sample size: With few options, the 37% Rule may not provide accurate results.
Non-uniform distribution: If options are not uniformly distributed (e.g., clustered), the rule may fail.
Correlated options: If options are correlated (e.g., similar), the rule may not account for this.

Practical Considerations

Difficulty in estimating 37%: Real-world applications may make it challenging to determine the exact 37% mark.
Time constraints: The rule assumes unlimited time for observation and decision-making.
Multiple criteria: The rule focuses on a single criterion; real-world decisions often involve multiple factors.

Contextual Limitations

Irreversible decisions: The rule may not apply to irreversible decisions (e.g., marriage).
High-stakes decisions: The rule may not suffice for critical decisions (e.g., life-or-death).
Dynamic environments: The rule assumes a static environment; changing circumstances may require adjustments.

Application to financial options

With that background, you can see how American-style options are a specific instance of “optimal stopping time problems”. That’s because they can be exercised any time before expiration, unlike European options, which can only be exercised at expiration. The holder of the option must decide the best time to exercise, if at all, to maximize their payoff.

This is why American-style options are priced by simulations such as tree methods while European-style options have closed-form equations.

In the simulations, the value of the option is computed by looking at the value at the next time step (i.e., whether to exercise now or wait). A backward induction process unravels from the expiration date back to the present. The model calculates the optimal decision at each point in time based on the payoff of immediate exercise versus the expected value of holding the option.
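Here is a bare-bones sketch of that backward induction using a Cox-Ross-Rubinstein binomial tree. To keep it short it assumes a non-dividend-paying spot underlying with a flat interest rate, not a futures option, and the inputs are invented. The only difference between the American and European versions is the single max() that compares exercising now against holding.

```python
import math


def binomial_put(spot, strike, vol, t_years, rate, steps, american=True):
    """Price a put on a CRR binomial tree by backward induction."""
    dt = t_years / steps
    up = math.exp(vol * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-rate * dt)
    p_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    # Payoffs at expiration; node j has j up-moves and (steps - j) down-moves.
    values = [max(strike - spot * up ** j * down ** (steps - j), 0.0)
              for j in range(steps + 1)]

    # Walk back from expiration to today.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            hold = disc * (p_up * values[j + 1] + (1.0 - p_up) * values[j])
            if american:
                exercise = max(strike - spot * up ** j * down ** (i - j), 0.0)
                values[j] = max(hold, exercise)  # the optimal stopping decision
            else:
                values[j] = hold
    return values[0]


# Hypothetical in-the-money put.
am = binomial_put(100.0, 110.0, 0.20, 1.0, 0.05, 200, american=True)
eu = binomial_put(100.0, 110.0, 0.20, 1.0, 0.05, 200, american=False)
print(f"American {am:.3f}  European {eu:.3f}  early exercise premium {am - eu:.3f}")
```

Set rate to zero and the two prices converge, since there is no longer any carry benefit to collecting the strike early.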

With ALL that said, you are ready for the riddle!

Flip 100 coins, labeled 1 through 100.

Alice checks the coins in order (1, 2, 3, …) while Bob checks the odd-labeled coins, then the even-labeled ones (so 1, 3, 5, …, 99, 2, 4, 6, …)

Who is more likely to see two heads first?

Alice
Bob
Equally likely

The riddle is neat because it works the same muscles as pricing an option. In fact, the riddle doesn’t even require math!

🔓See my reasoning and the original thread

Welcome to 2024…

The cost to learn is COLLAPSING if your eyes are open. Culturally I can sense (and anticipate much more) hand-wringing over what this means for society but right now I cannot emphasize enough how you should at least be taking advantage of all the consumer surplus LLMs are dumping in your lap. Earlier this week I mentioned that I screenshotted a spreadsheet of futures data and asked ChatGPT to write the formulas I’d need to arrange it the way I wanted. It spelled out exactly what helper column I needed and where to place all the formulas. It even stepped through the method so I can understand why its solution works.

Now consider that riddle.

I prompted ChatGPT to give me the Python code to simulate the question many times so I could validate my answer.
I ran the code in Google Colab (cloud-based Jupyter notebook)

This entire process takes seconds not minutes.

Here are the steps you follow upon seeing the riddle on twitter:

open 2 tabs — ChatGPT & Google Colab
ctrl-c from twitter
ctrl-v into ChatGPT
type “what code that simulates this”
ctrl-c the response
ctrl-v into Google Colab
ctrl-enter to run the script

[And yes I use a PC…loving my new Surface laptop btw]

In 5 years, an AI agent implanted in your reading glasses will know you wanted to do that when you scrolled over the tweet and a tooltip will simply be projected over the tweet with the simulation results.

Just kidding.

Twitter will be gone by then.

If interested here’s my Google Colab link:

🔗coin-checking script.ipynb

[Not to get into the weeds but I had to have a couple back-and-forths with the LLM because it made the mistake of thinking that the position that the head is found in determines if Alice or Bob won but it’s actually which ordinal observation that determines the winner. The process took more like 5 minutes as I had to prompt a specific debug and explanation. It doesn’t take away from the point — we are talking about orders of magnitude decreases in the time to code up this simulation for someone whose coding skills are as soft as mine.]
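For anyone who would rather read than click, here is a minimal sketch of what such a simulation can look like. It is not the Colab script above, just my reconstruction, and it reads "two heads" as two heads in total rather than two in a row. The winner is whoever needs fewer observations, which is the ordinal point from the debugging note.

```python
import random

N_COINS = 100
# 0-based indices: coin labeled k lives at index k - 1.
ALICE_ORDER = list(range(N_COINS))
BOB_ORDER = list(range(0, N_COINS, 2)) + list(range(1, N_COINS, 2))  # odd labels, then even


def checks_until_two_heads(coins, order):
    """Number of coins checked when the second head shows up, or None if it never does."""
    heads = 0
    for n_checked, idx in enumerate(order, start=1):
        heads += coins[idx]
        if heads == 2:
            return n_checked
    return None


def simulate(trials, seed=0):
    """Tally who reaches two heads in fewer observations across many flips of 100 coins."""
    rng = random.Random(seed)
    tally = {"alice": 0, "bob": 0, "tie": 0}
    for _ in range(trials):
        coins = [rng.randint(0, 1) for _ in range(N_COINS)]  # 1 = heads
        a = checks_until_two_heads(coins, ALICE_ORDER)
        b = checks_until_two_heads(coins, BOB_ORDER)
        if a == b:  # same observation count (or fewer than two heads in total)
            tally["tie"] += 1
        elif a < b:
            tally["alice"] += 1
        else:
            tally["bob"] += 1
    return tally


print(simulate(50_000))
```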

crossing the commodity chasm

I was planning to publish a follow-up to last week’s derivative “income” bumhunting where I look closer at various ETF performances. I started data-wrangling and then got distracted with yesterday’s trade idea. I will circle back to the ETFs in an upcoming issue.

As far as the trade idea this is what I ended up doing:

Liquidated most of my railroad shares (economic activity play that has held up well in contrast to commods)
Shorted .54 delta calls in November expiry options in WTI. I used Z24 NYMEX crude options which are “Dec” options but expire in November. The nomenclature comes from the commodity delivery window as opposed to the last trading day of the options. Confusing if you are coming from equity land.
Bought Dec2025 WTI futures

On balance, I added overall risk-on length via the total quantity of Dec25 futures.

The oil legs assumed a .50 beta between Z25 and Z24 so I bought 2x as much Z25 as I sold in Z24 oil delta, which brings me to…commodity analytics. Like where did I get .50 beta from?
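Before getting to the analytics, here is the bare arithmetic of a beta like that. One common way to estimate it (an assumption on my part, not necessarily how the .50 above was derived) is the regression slope of the back contract's daily price changes on the front contract's. The data below is synthetic, generated with a built-in beta of .50, purely to show the mechanics.

```python
import random


def beta(y_changes, x_changes):
    """Regression slope of y on x: cov(x, y) / var(x)."""
    n = len(x_changes)
    mean_x = sum(x_changes) / n
    mean_y = sum(y_changes) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_changes, y_changes))
    var = sum((x - mean_x) ** 2 for x in x_changes)
    return cov / var


def back_month_hedge(front_delta, back_to_front_beta):
    """Back-month quantity needed to offset a given front-month delta."""
    return front_delta / back_to_front_beta


# Synthetic daily $ changes, NOT real WTI data: the back month moves half as much plus noise.
rng = random.Random(42)
z24 = [rng.gauss(0.0, 1.5) for _ in range(2000)]
z25 = [0.5 * dx + rng.gauss(0.0, 0.3) for dx in z24]

est = beta(z25, z24)
print(f"estimated beta of Z25 to Z24: {est:.2f}")
print(f"Z25 futures per 100 deltas of Z24: {back_month_hedge(100, est):.0f}")
```

The estimate depends heavily on the lookback and the regime you draw it from, which is the same caveat that applies to any vol weighting scheme.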

For some background, I spent 2005-2021 managing commodity options businesses. Despite moontower.ai being equity focused and the start of my career being in equities, ETFs, and equity options, commodity futures and options are my binkie.

One of my favorite periods of my professional career was building out the analytics for commodities. It was also a clue that I enjoyed building as much as trading. The career is most rewarding when you can not only drive the racecar but drive the one you built.

The analytics were similar to moontower.ai in the sense that they are volatility-centric and ignore fundamental data. But there is a tremendous amount of translation required to think about vol and term structure in the commodity markets.

A non-exhaustive list of features (or bugs) in commodity futures:

Physical delivery and cash delivery nuances
A single order book instead of multiple exchanges
A large, opaque OTC market whose contracts are cleared by exchanges but not traded on them
No insider trading rules
Different rebate mechanisms for volume traders
Fewer restrictions on market makers taking a “show” in voice markets
Even the underlying is highly levered — and exchanges can change margin requirements on a whim
Cross margining of look-alike products (ICE vs NYMEX WTI)
Options on spreads, Asian options, “new crop”, short-dated ag options, and 0DTE have existed in these markets for almost 20 years already (but were not traded electronically)
Close to 24 hour trading
Early exercise of calls and puts in a world without dividends or corporate actions
CFTC not SEC
Different arbitrage bounds: a time spread can trade for a credit because each option expiry is tied to its own underlying. This gets very hairy in commodities that are hard to store (ie nat gas or electricity) or highly seasonal ones such as ags which have “new crop”/“old crop” dynamics that can actually have negative correlations (the performance of an old crop will affect planting decisions for the next season; high prices one year can lead to oversupply the next). Forward vols in commodity markets are a fun topic.

Infrastructure-wise you need a totally different “security master” or database mapping between financial instruments.

Underlyings can be referenced by many types of options in various combinations. Crack spreads, crush spreads, spark spreads. Options on all of them.
Every commodity has different expiration schedules, trading hours, last trading time, settlement procedures, naming conventions.
The markets pool liquidity in bespoke ways — October options in cotton are listed but never traded. If you trade nat gas you care a lot about 2 spreads in particular — H/J (March/April) and V/F (Oct/Jan). In RBOB, the gasoline spec for delivery changes in the North American summer.

Your security master needs to be flexible enough to accommodate all this variation. Your front-end analytics are going in the other direction — tuned for the idiosyncrasies of the markets you’re focused on. And then all the risk needs to be reported to central command in a way that fits with the overall portfolio risk while not losing the details when they matter. It is essential for the risk management layer to understand the drivers in these markets to devise appropriate shock scenarios and aggregations.

My first home in commodity trading was the oil complex. WTI, Brent, heating oil and gasoline (RBOB but I started trading it when HU was listed in parallel as the market was migrating to the RBOB spec). In that world you have American-style options expiries spanning from 1 day to many years into the future. You have calendar spread options (CSOs), crack options (I stood next to an independent trader that was long so many gas crack calls when Katrina hit that it was a notable percentage of a day’s worth of refining capacity in the US), Asian options, and European look-alike options that settled to swaps.

There were active markets in American vs European “switches”. You could exercise an American option for an hour after the market officially closed while the Europeans were cash-settled. If you were long Europeans on a strike, and short the Americans going into a pin…well, the market was gonna help you discover what that switch is worth.

If you trade WTI-Brent “arb” options, you are trading options on the spread between the 2 oil benchmarks. WTI and Brent have their own supply/demand dynamics because they are in different locations and vary by spec. This means different buyers and suppliers. Refineries optimize their throughput based on what end products are demanded and where (diesel, gasoline, jet fuel, bunker oil, etc). Throw in transportation costs, technicalities, and legality/tariff/sanctions on exports and you have a pair of highly liquid individual markets only loosely tethered by the wire of “arbitrage”. The arb options are a way to directly trade the spread. And then nerds can relative value trade the vanilla options on each commodity vs the arb options. There’s no boxed arbitrage embedded in the math but the concept of implied pairwise correlation is a tradeable, albeit messy, parameter. And with the different expiration dates and times for the “same” month you will find yourself long or short a pile of unmatched option greeks for a string of days in between expirations (which also suggests that getting your volatility calendar correct is paramount).

(This implied correlation idea also exists in the context of calendar spread options. In 2007/2008 there were giant discounts in the WTI option forwards. The time spreads were incredibly attractive. The risk was you were inherently short spread vol so if you believed that the spread vol was highest conditional on the front month futures collapsing, then CSO puts were a clever hedge. The trade set-up was caused by the SEM Group blowout. I feel like I owe their desk a thank you for effectively buying my first apartment for me.)

An aside

The natural gas futures and options world is (was?) the king of cruft. It felt like a racket for churning exchange fees and broker commissions.

Expirations were so cumbersome because the liquid underlying was a physical future but the options that referenced them barely traded — instead the cash-settled options were traded but the cash-settled underlying swaps were not. They weren’t even listed, they were only cleared by the exchange! I can still remember being worried that I’d have an error, outtrade, or unreconciled position contaminate the grid I used to project how many futures I needed to buy/sell above/below each strike to replace the cash-settled deltas that were “going away”.

Oh yeah, and the swaps themselves had a different expiry than the futures so there were highly active markets in TAS (“trade at settlement”) futures, pen/LD swaps (“penultimate mini vs last day” swap spread), “futures/pen” (physical futures vs full penultimate swap), and the EFS (“exchange for swap”: last day futures vs last day swap). All of these things are tied together by algebraic relationships so you can triangulate the implied market in one from the legs of the others. Except none of this traded on a screen so you were still doing mock trading math based on quotes you were seeing on AOL IM or YM…in the 2010s!

I don’t remember the full history of how the markets eased into these conventions but I believe it was the marriage of disjointed OTC, exchange-traded, and physical markets. I’m not even getting into how the multipliers on these contracts work — if you trade X mmBTUs in the physical market that’s a size per # of days in the month which is how pipeline operators think and to which the financial players need to adapt.

And just like the oil market, there’s a NYMEX version and ICE version and sometimes you didn’t find out what you got until after the deal was consummated.

Hence venting on my Lost Xtranormal Video (Fonz, remember when we scripted this?)

It remains true that the number of commodities you can trade is much smaller than the number of equities, but there are loads of futures expirations, futures option expirations, strikes, and instruments.

A basic infrastructure starts with the futures chain and term structure. While I don’t currently have a proper infra I did grab several years of simple end-of-day WTI futures prices prompted by my oil trade idea.

I got distracted from writing the ETF post into creating a bunch of charts that demonstrate things I like to look at in futures. I’ll share them below and the reason a vol trader should care.

A quick comment on infrastructures.

You should be able to examine futures data in at least 2 ways:

Fixed expiration (ie a history of the Dec24 contract). This is especially useful for seasonal charts (ie X-axis is Jan thru Dec and the lines are contracts for various years).
Relative expiration (ie M1 or CL1). The ordinal number 1 refers to the first contract listed. This is the basis for “continuous” contracts. It’s also important because of the Samuelson effect which I mention in the interview with Dean: a contract with 12 months until expiry is less volatile than the same contract with 1 month until expiry. Which is another way of saying the M1-M12 spread has volatility and that volatility has its own properties.
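
If it helps to see the two views side by side, here is a minimal pandas sketch. Everything in it is an assumption for illustration: the column names, the contract codes, the expiry dates, and the prices are made up, and it is not how my own spreadsheet is organized.

```python
import pandas as pd

# Hypothetical long-format settlements: one row per (date, contract).
# Column names, expiry dates, and prices are made up for illustration.
raw = pd.DataFrame({
    "date": pd.to_datetime(["2024-10-16"] * 3 + ["2024-10-17"] * 3),
    "contract": ["Z24", "F25", "G25"] * 2,
    "expiry": pd.to_datetime(["2024-11-20", "2024-12-19", "2025-01-21"] * 2),
    "settle": [70.4, 69.9, 69.5, 70.7, 70.1, 69.6],
})

# Fixed-expiration view: one column per contract, so you can follow Z24 through time.
fixed = raw.pivot(index="date", columns="contract", values="settle")

# Relative-expiration view: on each date, number the live contracts by expiry (1 = nearest).
live = raw[raw["expiry"] >= raw["date"]].sort_values(["date", "expiry"])
live = live.assign(ordinal=live.groupby("date").cumcount() + 1)
relative = live.pivot(index="date", columns="ordinal", values="settle").add_prefix("M")

print(fixed)
print(relative)
```

The same long table feeds both pivots, so you only maintain one dataset. Note that the relative view silently changes which contract sits in the M1 column each time the front contract expires.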

The importance of having both views compounds when we layer in option analysis.

We’ll keep to a tight format. A chart and its relevance. All data is WTI futures settlement on the NYMEX.

[LLM note: To organize the data there was one instance where I wasn’t sure of the easiest way to do it in Excel. I figured I’d need a helper column but instead of trial-and-error I just gave ChatGPT a screenshot of the spreadsheet and a description of my desired output. Abracadabra. Even the explanation for how the formula worked was perfect. I don’t know how long until it’s here but I feel like sometime shortly, organizing the data, laying out the charts, and the blog post will all be done by autonomous parallel AI agents. Come to think of it, why would I even prompt it…the LLM will just prompt better questions than I’m bothering to answer just by knowing what data is in front of me and the history of my blog posts.

The movie Fight Club makes more sense to me now than it ever did before.

Nerds being the victim of their own success is the own goal the Colosseum has been waiting for]

Alright, let’s get to it.

We start with a couple high level charts. They are useful when inspecting a commodity for the first time to get a sense of its nature.

Term structure time series chart

To keep it legible, I just chose 3 ordinal months — M1, M8, M15. You can see how steep the contango was in April 2020 when the front month went negative!
You can also see how for the past 5 years, the oil price has been backwardated (descending term structure) whenever the prompt price has been at least $45. I didn’t load the data from the 2010s but the last few years have been a distinct regime, perhaps reflecting the idea that oil demand in the future is uncertain with the focus on alternative fuels. (Personally I keep my core oil length in the deferred futures incentivized by an implied positive roll return and because the mandate also comes with a lowered incentive to invest in supply. In other words, vibes. I don’t know anything about the future but energy is part of my asset allocation. Hedgers sell the back so that’s where there should be a compensation for providing liquidity which manifests in a roll return. If I have to pick a spot on the curve to fill that bucket, that’s where I’m going.)

Term structure cloud chart

I group term structures by month and average them. Looks like a vol cone, doesn’t it?
The back end has a tighter range than the front. It’s less volatile. Supply and demand are more elastic with a year to go than with a month to go.
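
For anyone who wants to rebuild the cloud, a rough sketch of the grouping step follows. It assumes a table of daily settlements with one column per ordinal month. The function name and the synthetic backwardated curve are mine, purely for illustration.

```python
import numpy as np
import pandas as pd

def monthly_term_structures(curve: pd.DataFrame) -> pd.DataFrame:
    """Average each ordinal month's settlement within each calendar month.

    Each row of the result is one averaged term structure (one line of the cloud).
    """
    return curve.groupby(curve.index.to_period("M")).mean()

# Synthetic, made-up curve: a random-walk price level with a fixed backwardation per month.
dates = pd.bdate_range("2024-01-01", "2024-03-29")
rng = np.random.default_rng(1)
level = 75 + np.cumsum(rng.normal(0, 1, len(dates)))
curve = pd.DataFrame({f"M{k}": level - 0.4 * (k - 1) for k in (1, 8, 15)}, index=dates)

cloud = monthly_term_structures(curve)
print(cloud.round(2))
```

Plotting the transpose (`cloud.T.plot()`) puts the ordinal months on the X-axis with one line per calendar month, which is the shape of the chart described above.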

Volatility term structure is one of the most important tradable concepts for option traders. Time spreads, straddle swaps, implied forward vols. So here’s the kind of riddle I might give in an interview:

There’s 3 contracts listed — M1, M2, M3

They each have their own option chains and the ATM implied vols are 30% across the board.

1) Is the option term structure flat, ascending, or descending?

2) Give me a framework for computing the forward vol.

Instead of an answer, I’ll offer a clue:

The vol ratio cloud

This is a chart of 1 month realized volatility for each contract divided by M1’s realized volatility (grouped by month)
Notice how in some periods the back months are far less volatile than the near months. I think of this as volatility in the futures spread absorbing the volatility from the back months as the near month is being driven by some current concern that is not propagating to the backs. May 2020 being the archetypical example as we were running out of near term storage during COVID.
How does this idea inform your thinking about the riddle above?
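
A sketch of how a vol ratio like this can be computed, under assumptions that are mine and not necessarily what sits behind the chart: daily log returns, a 21-day window, annualizing by 252 days, and synthetic prices where M12 returns are exactly 0.6x the M1 returns.

```python
import numpy as np
import pandas as pd

def realized_vol(prices: pd.DataFrame, window: int = 21) -> pd.DataFrame:
    """Rolling annualized realized vol from daily log returns, one column per ordinal month."""
    log_ret = np.log(prices / prices.shift(1))
    return log_ret.rolling(window).std() * np.sqrt(252)

def vol_ratio(prices: pd.DataFrame, front: str = "M1", window: int = 21) -> pd.DataFrame:
    """Each contract's realized vol divided by the front contract's realized vol."""
    rv = realized_vol(prices, window)
    return rv.div(rv[front], axis=0)

# Synthetic prices: the back month moves 0.6x as much as the front, by construction.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2023-01-02", periods=300)
front_ret = rng.normal(0, 0.02, len(dates))
prices = pd.DataFrame({
    "M1": 70 * np.exp(np.cumsum(front_ret)),
    "M12": 68 * np.exp(np.cumsum(0.6 * front_ret)),
}, index=dates)

ratios = vol_ratio(prices).dropna()
monthly = ratios.groupby(ratios.index.to_period("M")).mean()  # grouped by month
print(monthly.round(2))
```

Two caveats with real data. Log returns break when a price is zero or negative (April 2020), and an ordinal series has a jump on each roll date that you would want to remove before measuring vol.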

Zooming in on spread volatility

To reduce the noise from M1 a tad we focus on the M2/M12 futures behavior from Jan2021 until last week.

In the top panel we see:

Time series of both futures prices. This is not a continuous roll return display so at each expiry as M2 inherits M3’s price there’s a small bump in futures prices that you would adjust for if you were working in a returns context.
The green time series of the futures spread defined as M2 – M12 (front – back is a commodity convention but the equity market defines it the opposite way. Another source of confusion and “Texas” hedging for traders who cross the chasm from commods to equities or vice versa). Notice that the spread has been positive (a backwardated market) for almost the entire period.
The silver time series just normalizes the spread value by the M2 price into percent terms.

In the bottom panel:

The red line is the rolling 21d standard deviation of the spread price changes. The spike is the Ukraine invasion where the near-dated futures skyrocketed relative to the backs. Inelastic demand in the front meets supply concerns. We don’t compute percent volatility (what if a spread price is zero or negative) but instead measure the price volatility directly.
The white line is the MAD or “mean absolute deviation”. This measure of volatility tends to be lower since we don’t amplify large moves by squaring them as we do with standard deviation.
The army green line (right axis) is the MAD/St Dev ratio. A ratio less than .80 (roughly the value for a normal distribution) typically signifies a skewed or fat-tailed distribution of moves. See the [👿MAD Straddle for more color on this idea.]
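
Here is a hedged sketch of the bottom-panel measures. I am assuming MAD means the mean absolute deviation around the window mean and that the standard deviation is the sample version. The function names and the simulated draws are made up for illustration.

```python
import numpy as np
import pandas as pd

def mad_std_ratio(changes) -> float:
    """Mean absolute deviation (around the mean) divided by sample standard deviation."""
    x = np.asarray(changes, dtype=float)
    mad = np.mean(np.abs(x - x.mean()))
    return mad / x.std(ddof=1)

def rolling_spread_vol(m2: pd.Series, m12: pd.Series, window: int = 21) -> pd.DataFrame:
    """Rolling St Dev, MAD, and MAD/St Dev of daily changes in the M2 - M12 spread price."""
    chg = (m2 - m12).diff()  # price changes, not percent returns
    roll = chg.rolling(window)
    return pd.DataFrame({
        "std": roll.std(),
        "mad": roll.apply(lambda x: np.mean(np.abs(x - x.mean())), raw=True),
        "mad_over_std": roll.apply(mad_std_ratio, raw=True),
    })

# Benchmark the ratio on simulated draws: normal vs a fatter-tailed (Laplace) distribution.
rng = np.random.default_rng(0)
normal_ratio = mad_std_ratio(rng.normal(size=200_000))
fat_ratio = mad_std_ratio(rng.laplace(size=200_000))
print(round(normal_ratio, 3), round(fat_ratio, 3))
```

For a normal distribution the ratio is sqrt(2/pi), about .80, which is where the threshold comes from. The Laplace draws come in lower (the theoretical value is 1/sqrt(2), about .71) because their larger tail moves inflate the standard deviation more than the MAD.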

Spread “delta”

This is a scatterplot of spread price vs M2 price.

The slope of the regression tells us the “spread delta”. For example in 2022 (yellow), a $1 change in M2 meant a $.44 change in the spread.
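
A minimal sketch of that regression, assuming a plain least-squares fit of spread level on M2 level. The data is synthetic and rigged so the answer comes out near .44. The function name is hypothetical.

```python
import numpy as np

def spread_delta(m2, m12):
    """Least-squares fit of the spread (M2 - M12) on M2. Returns (slope, r-squared)."""
    m2 = np.asarray(m2, dtype=float)
    spread = m2 - np.asarray(m12, dtype=float)
    slope, _intercept = np.polyfit(m2, spread, 1)
    r2 = np.corrcoef(m2, spread)[0, 1] ** 2
    return slope, r2

# Synthetic year: M12 moves $.56 per $1 in M2 plus noise, so the spread delta is about .44.
rng = np.random.default_rng(0)
m2 = rng.uniform(65, 95, 250)
m12 = 30 + 0.56 * m2 + rng.normal(0, 0.5, 250)

slope, r2 = spread_delta(m2, m12)
print(f"spread delta {slope:.2f}, r2 {r2:.2f}")
```

Because the spread is M2 minus M12, regressing M12 itself on M2 gives a slope of exactly one minus the spread delta. The two numbers are the same information viewed from opposite sides.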

If you are long the futures spread (long M2 and short M12), you are inherently long the market. If you want to isolate your trade to just betting on the slope between the 2 prices you must weight the position by the spread delta. A $.44 spread delta means M12 moves about $.56 for each $1 move in M2, so for each M12 you sell, you buy only .56 M2.

This comes in handy if you are trading a large options book with deltas in each month. You normalize them all to M1 deltas as a quick, liquid hedge. The r2 of the regressions gives you a sense of how volatile that delta estimate is. A lower r2, a worse fit. If the fit is relatively poor, you will likely delta hedge your spreads directly and more often to reduce the noise.

A few observations:

2021 & 2022: the underlying had a wide range of M2 prices
2021 and 2023 had the lowest r2 indicating more variation in the spread “delta”
Spread delta is highest in 2024 ($.57 change per $1 move in M2)

This observation brings us to the last chart. A high spread delta typically means a wider divergence between contracts as M2 moves around. The spread is volatile. If the front month goes up a dollar, the back month “lags” more than if the spread delta were lower.

Say it however you like:

More volatile spreads mean that the volatility ratio between the back and front is lower.
The beta is lower.
The back is not “keeping” up with the front.

We can observe this directly by looking at each of those years’ M12/M2 realized vol ratio.

Notice that the high spread delta of 2024 corresponds to a low M12/M2 vol ratio.

In 2021, when the spread delta is a mere .18, the vol ratio between the 2 months spends plenty of time above 80% and even goes above 100% as the futures curve moved in parallel rather than flattening/steepening.

Can you see how spread volatility and vol ratio would have profound influence on how to interpret the options term structure on these contracts?

If the spread volatility is high, meaning the realized vol ratio of M12 to M2 is low, then if you are long time spreads you are going to be short options on the thing that’s moving a lot, and long options on the thing that’s lagging. You want to make sure you are getting the appropriate vol discount to hold that! Measuring what that is will get you to a proper understanding of the vol term structure and implied forward vol.

If you are coming from equity vol land, where the options are struck on the same underlying, this is a new frontier for you.

We close with the bottom panel which simply shows the rolling correlation and beta (beta is vol ratio * correlation). This is the correct number to use for weighting your futures positions, not just the aforementioned vol ratio. The beta can change due to the vol ratio or the correlation and it’s worth decomposing it to see what’s driving the inevitable mismatch between your hedge ratios and realized p/l.
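
The decomposition is easy to check numerically. This is a sketch under my own assumptions: a 63-day window, daily changes as inputs, and simulated series where the back month has a true beta of 0.6 to the front. The function name is made up.

```python
import numpy as np
import pandas as pd

def beta_decomposition(front: pd.Series, back: pd.Series, window: int = 63) -> pd.DataFrame:
    """Rolling vol ratio, correlation, and regression beta of back-month on front-month changes."""
    corr = back.rolling(window).corr(front)
    vol_ratio = back.rolling(window).std() / front.rolling(window).std()
    beta = back.rolling(window).cov(front) / front.rolling(window).var()
    return pd.DataFrame({"vol_ratio": vol_ratio, "corr": corr, "beta": beta})

# Simulated daily changes: back = 0.6 * front + idiosyncratic noise.
rng = np.random.default_rng(0)
front = pd.Series(rng.normal(0, 0.02, 500))
back = 0.6 * front + pd.Series(rng.normal(0, 0.008, 500))

out = beta_decomposition(front, back).dropna()
print(out.tail(3).round(3))
print(np.allclose(out["beta"], out["vol_ratio"] * out["corr"]))
```

In this simulated data the vol ratio runs above the beta because the idiosyncratic noise adds to the back month's vol without adding to its co-movement with the front. That gap is exactly what the correlation term accounts for.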

Trading commodity vol after trading equity vol can feel like a foreign world at first. But trading equity vol from trading delta one is an even bigger leap. Once you get used to commodities they actually feel cleaner. I’m super rusty on thinking about early exercise, dividends, rev/cons, and merger-arbish math because all those muscles atrophied from a life in commods.

Even the strange commodity trades I talk about with Dean in the interview revolve around USO and UNG — commodity trades that got tangled with SEC wrappers.

Commodities have their own language and their own grammar. But they are globally pertinent and tell their own gripping stories of history. I’ve recommended it before, but Javier Blas and Jack Farchy’s book World For Sale is an absolute banger. The best book I’ve read in my last 20.

Let’s leave it there.