years worth of option education in under 90 minutes

A few days ago I got the idea to do a screencast where I use an option chain and greeks to explain a bunch of vol trading concepts.

None of my front-ends really look like what I had in mind so I spent Wednesday building a minimal viable version to allow viewers to look over-my-shoulder as I explain some stuff.

On Friday, I just turned on the camera and started blabbing. No prep. I had an open afternoon so no time constraint. I just let it rip. On a Twitter livestream.

I hear it was helpful. I decided to call it Years worth of option education in under 90 minutes. That was the most click-baity title I could give it and still live with myself.

I re-watched it to chronicle what you actually can learn. Turns out it’s a lot of stuff that’s pretty hard to come across if you haven’t spent time on a prop desk.

Give it a gander. Love to know what else can help.

Modeling a vol curve

  • Computing a forward
  • Specifying a vol curve with standard deviation gridpoints
  • Computing the gridpoints
  • Inputting skew parameters at the points to fit the market
  • Using Excel’s linest function to get the coefficients of an n-order polynomial
  • Using the curve to estimate IV for any strike
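A rough sketch of the curve-fitting steps in Python instead of Excel. Here `numpy.polyfit` stands in for `linest`. The forward, ATM vol, and gridpoint vols are made-up numbers for illustration, not the ones from the screencast.

```python
import numpy as np

# Hypothetical inputs (not from the screencast)
forward = 100.0   # forward price
atm_vol = 0.30    # ATM implied vol used to space the gridpoints
t = 30 / 365      # time to expiry in years

# Gridpoints measured in standard deviations away from the forward
std_devs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
grid_strikes = forward * np.exp(std_devs * atm_vol * np.sqrt(t))

# Vols entered at each gridpoint to fit the market (made up)
grid_vols = np.array([0.38, 0.33, 0.30, 0.29, 0.30])

# Coefficients of a 2nd-order polynomial of vol vs. standard deviations
coeffs = np.polyfit(std_devs, grid_vols, deg=2)

def curve_vol(strike):
    """Estimate IV for any strike from the fitted polynomial."""
    x = np.log(strike / forward) / (atm_vol * np.sqrt(t))
    return float(np.polyval(coeffs, x))
```

Calling `curve_vol(95.0)` then returns an interpolated vol for a strike that sits between gridpoints.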

Option valuation

  • Implementing Black Scholes for European-exercise style options
  • Includes greeks and N(d1) and N(d2)
  • Numerical methods for estimating gamma and theta
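A minimal Black Scholes sketch, assuming a European call with no dividends. The function names and bump sizes are my own choices for illustration. Gamma and theta are estimated numerically by re-pricing with a bumped spot and a shortened time to expiry.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, t, vol, r=0.0):
    """European call, no dividends. Returns (price, N(d1), N(d2))."""
    d1 = (log(S / K) + (r + 0.5 * vol ** 2) * t) / (vol * sqrt(t))
    d2 = d1 - vol * sqrt(t)
    price = S * norm_cdf(d1) - K * exp(-r * t) * norm_cdf(d2)
    return price, norm_cdf(d1), norm_cdf(d2)

def numerical_gamma(S, K, t, vol, r=0.0, bump=0.01):
    """Second difference of price with respect to spot."""
    up = bs_call(S + bump, K, t, vol, r)[0]
    mid = bs_call(S, K, t, vol, r)[0]
    down = bs_call(S - bump, K, t, vol, r)[0]
    return (up - 2.0 * mid + down) / bump ** 2

def numerical_theta(S, K, t, vol, r=0.0, day=1 / 365):
    """Price change from letting one calendar day pass."""
    return bs_call(S, K, t - day, vol, r)[0] - bs_call(S, K, t, vol, r)[0]

price, n_d1, n_d2 = bs_call(100.0, 100.0, 1.0, 0.20)
```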

Interpreting skew

  • How large skew values lead to counterintuitive probabilities as the implied distribution balances probability with magnitude
  • Using vertical spreads to see the implied distribution
  • Changing skew parameters to watch the spread prices change and the distribution shift
  • How skew “corrects” the Black Scholes distribution to match empirical distributions
  • Comparing implied distributions to “flat sheet” distributions
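One way to see the implied distribution from vertical spreads: the price of a tight call spread divided by its width approximates the probability of finishing above the strike. The sketch below assumes zero rates and a made-up linear skew, and compares the skewed probability with the "flat sheet" one.

```python
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def call_price(S, K, t, vol):
    """Black Scholes call with zero rates and dividends."""
    d1 = (log(S / K) + 0.5 * vol ** 2 * t) / (vol * sqrt(t))
    d2 = d1 - vol * sqrt(t)
    return S * norm_cdf(d1) - K * norm_cdf(d2)

def prob_above(S, K, t, vol_fn, width=1.0):
    """Tight call spread price / width approximates P(S_T > K)."""
    lo, hi = K - width / 2, K + width / 2
    spread = call_price(S, lo, t, vol_fn(lo)) - call_price(S, hi, t, vol_fn(hi))
    return spread / width

def flat_vol(K):
    return 0.30

def skewed_vol(K):
    # Hypothetical skew: lower strikes carry higher vol
    return 0.30 - 0.002 * (K - 100.0)

p_flat = prob_above(100.0, 90.0, 0.25, flat_vol)
p_skew = prob_above(100.0, 90.0, 0.25, skewed_vol)
```

With these inputs the skewed curve assigns a higher probability of finishing above 90 than the flat sheet does, even though the 90 strike has a higher vol. The downside is priced as less likely but larger in magnitude.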

Understanding vol changes day over day

  • The difference between fixed strike and “floating” strike vol changes
  • How fixed strike vol changes arise from the interaction of spot moves and skew parameter changes
  • Why fixed strike vol changes drive your p/l

Dissection

  • How market makers actually use classic option structures and synthetic relationships
  • Option traders “chunk” their positions to understand them just as seasoned chess players don’t see random configurations of pieces but see “mini-themes” that they understand deeply. For option traders these themes are structures like butterflies and condors
  • How market makers “take structures out of the position” to minimize hedging costs

Decomposing vol p/l from greeks

  • Learn how to use your gamma and theta to estimate the realized vol portion of your p/l
  • Learn how to use your vega to estimate the implied vol portion of your p/l
  • See how delta p/l comes from options and share positions
  • Understand how the tug-of-war between gamma and theta relates to the stock’s move on the day
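A back-of-the-envelope version of the attribution, assuming the usual second-order approximation (gamma p/l = ½ × gamma × move²). The position greeks below are hypothetical and the function names are mine.

```python
from math import sqrt

def attribute_pnl(delta, gamma, theta, vega, dS, d_iv_points, days=1.0):
    """Split one day's p/l into delta, realized vol, and implied vol pieces.

    Units assumed: delta in shares, gamma in shares per $1 move,
    theta in $ per day (negative when long options), vega in $ per vol point.
    """
    gamma_pnl = 0.5 * gamma * dS ** 2
    theta_pnl = theta * days
    return {
        "delta": delta * dS,
        "realized_vol": gamma_pnl + theta_pnl,   # the gamma vs theta tug-of-war
        "implied_vol": vega * d_iv_points,
    }

def breakeven_move(gamma, theta, days=1.0):
    """Stock move at which gamma p/l exactly offsets the theta bill."""
    return sqrt(2.0 * abs(theta) * days / gamma)

# Hypothetical delta-neutral long gamma position: stock moves $3, IV falls half a point
pnl = attribute_pnl(delta=0.0, gamma=500.0, theta=-2000.0, vega=5000.0,
                    dS=3.0, d_iv_points=-0.5)
```

In this example the stock moved more than the breakeven, so the realized vol piece is positive, while the drop in implied vol costs more than that on the vega line.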

Uncategorized

  • Pulling market data into Excel
  • Why the late 90s tech bubble was not irrational and how option markets understood that
  • Bubble distributions from the lens of the option market
  • Put-call parity
  • An intuitive way to estimate gamma p/l from middle school physics math: delta = velocity, gamma = acceleration, price change = time passage, and distance = p/l
  • This shows why p/l is a function of the stock move squared
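The physics analogy can be checked with a small simulation, assuming constant gamma and a position that starts delta-neutral. Delta (velocity) builds as the price moves, and p/l (distance) accumulates from the delta carried over each small step.

```python
def pnl_by_steps(gamma, move, steps=10_000):
    """Accumulate p/l over small price steps as delta builds from zero."""
    step = move / steps
    delta, pnl = 0.0, 0.0
    for _ in range(steps):
        avg_delta = delta + 0.5 * gamma * step   # average "velocity" over the step
        pnl += avg_delta * step
        delta += gamma * step                    # gamma "accelerates" delta
    return pnl

small = pnl_by_steps(gamma=100.0, move=2.0)   # matches 0.5 * 100 * 2**2
large = pnl_by_steps(gamma=100.0, move=4.0)   # double the move
```

Doubling the move quadruples the p/l, which is the "function of the stock move squared" point.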

how an option trader extracts earnings from a vol term structure

Earnings are a highly concentrated source of volatility for public companies because besides reporting results they give guidance on the future, discuss what they are seeing across business lines, as well as risks and opportunities for growth. Earnings reports are a rich source of information and in the Claude Shannon sense of the word, information is volatility.

As expected, option prices that include the earnings date command a premium implied volatility as the market expects the stocks to move on the burst of new information. The observation of a premium earnings IV leads investors and traders to important questions.

  • How much is the premium? In other words, how do I disentangle the amount of volatility that is “normal” vs the amount coming from the market’s expectation of how much the stock will move?
  • If I am a volatility trader focused on the relative value of options between names or I am a dispersion trader who cares about the relative vol levels between an index and its components, how do I compare the volatility between a name with earnings (or an event specific to the name) to other names?

Our task beckons. We must extract earnings from the vol surface.

That probably sounds like a tedious, quanty operation. But it’s not. It’s actually a pretty simple procedure once you understand the building blocks. In fact, the procedure is an implicit review of 2 main topics. Because this topic encompasses* the prior topics it acts as a test of your knowledge as well as a step forward.

Prerequisite Building Blocks

I won’t review the building blocks here but I’ll point you directly to the relevant calculators which document the procedures.

1) Implied forward volatility

Given 2 expirations we can effectively subtract the volatility of the near dated expiry from the later dated expiry to imply a forward volatility or the amount of volatility implied in between the 2 expirations.
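The "subtraction" happens in variance space, not vol space. A minimal sketch, assuming both vols are annualized on the same day count. The example vols are made up.

```python
from math import sqrt

def forward_vol(iv_near, dte_near, iv_far, dte_far):
    """Vol implied between two expiries by subtracting total variances."""
    fwd_var = (iv_far ** 2 * dte_far - iv_near ** 2 * dte_near) / (dte_far - dte_near)
    if fwd_var < 0:
        raise ValueError("negative forward variance: the far expiry is too cheap")
    return sqrt(fwd_var)

# Hypothetical: 30% vol for 30 days, 32% vol for 60 days
fwd = forward_vol(0.30, 30, 0.32, 60)
```

Here the 30-to-60 day forward vol comes out near 33.9%, higher than either quoted vol, because the far expiry has to "make up" for the cheaper front month.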

2) Event Volatility Extractor

When the market anticipates events like a stock’s earnings date, it often factors increased volatility into the affected option expirations.

Traders analyze this implied volatility by separating it into the volatility for the event day itself and the typical daily volatility.

To do this, a trader estimates an expected move size for the event.

The unintuitive impact of events

It’s worth emphasizing how important events are to understanding an option surface. It’s one of those things for which intuition is a poor guide. The arithmetic is worthwhile.

Consider this situation.

A straddle has 40 business days until expiry. The name typically moves 1.5% per day. We’ll just use trader math to estimate a fair annualized volatility of 24% (1.5% x 16 because 16 is approximately √251).

However we get 2 new pieces of info.

  1. The IV is actually 36%
  2. Earnings occur in 35 business days.

We can estimate an earnings vol by acknowledging that term vol includes 39 “regular” days and 1 “event” day.

We presume that a regular day has 24% annualized vol. So what “event vol” makes the term vol worth 36%?

We are basically solving for what event vol reconciles these facts given that we know the average vol (the term vol) and the “regular” vol.

[Keep in mind variances are additive but not volatility. Variance is simply vol squared.]

Term variance = regular variance + event variance

.36² * 40 days = .24² * 39 days + X² * 1 day

Solve for X.

X = event vol ≈ 171%

The event is a 171% vol event for a single day but this is in units of annualized volatility.

Convert back to daily volatility by going in reverse — divide by 16. (I’m resisting a reference to the Spaceballs vacuum scene).

171%/16 = 10.7%

Remember that’s now a daily vol (aka standard deviation). We should convert it to a straddle as a percent of the underlying because that corresponds to what people actually talk about: “expected move size” on earnings.

Just multiply by .8 since a straddle is the same as the mean absolute deviation.

.8 * 10.7% = 8.6%

[To review, see 😈The MAD Straddle]
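The arithmetic above in a few lines of Python. It uses the same trader-math shortcuts as the text (16 for annualizing, .8 for the straddle). The function and constant names are mine.

```python
from math import sqrt

ROOT_DAYS = 16.0    # trader-math annualization factor used in the text
MAD_FACTOR = 0.8    # straddle ~ 0.8 x standard deviation

def implied_event(term_iv, base_iv, total_days):
    """Back out a one-day event from a term vol and an assumed regular-day vol."""
    event_var = term_iv ** 2 * total_days - base_iv ** 2 * (total_days - 1)
    event_vol = sqrt(event_var)           # annualized units
    daily_vol = event_vol / ROOT_DAYS     # one-day standard deviation
    straddle = MAD_FACTOR * daily_vol     # expected move size
    return event_vol, daily_vol, straddle

event_vol, daily_vol, straddle = implied_event(0.36, 0.24, 40)
print(f"{event_vol:.0%} event vol, {daily_vol:.1%} daily vol, {straddle:.1%} straddle")
# prints: 171% event vol, 10.7% daily vol, 8.6% straddle
```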

Let’s take inventory.

  • The stock moves 1.5% per day which would correspond to a 24% vol name.
  • However, the vol is 36% implying that on earnings it’s expected to move 8.6% on that single day.

The variance coming from all regular days is 39 * .24² ~ 2.25 (unitless, unintuitive number)

Event variance is 1 * 1.71² ~ 2.94

Despite earnings being 1/40 or 2.5% of the weight in day terms, it’s 2.94/(40 x .36²) ~ 57% of the total variance until expiry. That day has more option premium associated with it than all the other days combined. The bulk of the straddle decay occurs on that day.

This also means the theta of the preceding days is lower than you think. In practice, what happens is the vol creeps up every day, offsetting some of the model theta. You can think of a glide path where as you get closer to earnings the average vol per day increases as “low vol days” peel off and the earnings day drives the bulk of the straddle. This same mental image can help you understand why an event very far in the future doesn’t show up so strongly in the term structure: its impact is diluted by the sheer quantity of regular days before it.
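The glide path can be sketched by holding the regular-day vol and the event variance fixed and letting days peel off. This assumes the event is still ahead of us and that nothing else about the market's expectations changes. The constants come from the example in the text.

```python
from math import sqrt

BASE_IV = 0.24
EVENT_VAR = 0.36 ** 2 * 40 - BASE_IV ** 2 * 39   # variance of the single event day

def term_vol(days_left):
    """Fair term vol when days_left business days remain, one of them the event."""
    total_var = BASE_IV ** 2 * (days_left - 1) + EVENT_VAR
    return sqrt(total_var / days_left)

def event_share(days_left):
    """Fraction of the remaining variance that comes from the event day."""
    return EVENT_VAR / (BASE_IV ** 2 * (days_left - 1) + EVENT_VAR)

for n in (40, 30, 20, 10, 6):
    print(n, f"{term_vol(n):.1%}", f"{event_share(n):.0%}")
```

The IV climbs as the window shrinks even though neither the base vol nor the expected earnings move has changed. Run it in reverse, with hundreds of regular days in front of the event, and the event share shrinks toward nothing.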

[These concepts underpin the trading strategy known as Renting the Straddle.]

* See educator and MathAcademy architect Justin Skycak’s explanation of encompassing vs prerequisite graphs as well as Principles of Learning Fast


Now you are convinced that this is some part important, some part interesting and you already have a taste of the most complicated math it requires (6th grade). We just need to pull it together.

Extracting earnings from a term structure of implied volatility (as opposed to a single expiry) requires using our building blocks in conjunction. The same technique can be extended to multiple earnings as well as any kind of event.

This is a good time to remind you that much of trading is about making apples-to-apples comparisons. Normalizing data so that the comparisons are relevant is so much of the work to be done. It’s more grindy than sexy. But it also shifts the focus from what novices think investing is about to the work that actually needs to be done: measurement not prediction, or “seeing the present clearly”.

As we step through an earnings extraction, I will point to real-life examples of what I mean by measurement not prediction.

A few selling points on this post:

  1. The building blocks do the heavy lifting so this won’t take long.
  2. The yield is insane — this is one of those topics that opens lots of mental doors.
  3. I provide a link to a spreadsheet so you can play with the ideas yourself or extend them as desired.

An “ugly” term structure

I fetched .50d IV’s for NVDA at end-of-day 1/17/25.

 

This is an ugly term structure.

There are 2 primary reasons.

  1. Market widths and leans in option bid-ask will show up as artifacts in your surface fits.
  2. There are events in the option surface. Most notably there is an event embedded in the 2/28/25 expiry. We know that because the IV jumps 10 points from the prior week.

Option market makers are like blue-collar household help. Their job is to “iron out the kinks”. Buy the cheap IV and sell the expensive IV when they see a wrinkled term-structure.

But if you did this by looking at the graph, you’d be selling the following expiries:

  • 2/28/2025*
  • 6/20/2025*
  • 9/19/2025*
  • 1/31/2025

You’d be buying:

  • 2/21/2025
  • 1/24/2025

Here’s the term structure again but we simply change the x-axis to DTE instead of expiry and annotate our naive buy and sell axes.

The forward vol matrix is a granular way to get the same idea.

I’m obviously using leading words like “naive” to indicate something important is missing which is creating all these kinks.

We are going to address the most glaring kink, the 2/28 expiry which jumps 10 vols from the prior expiry and shows up as a 1 week forward vol of 80%. The other kinks are naturally handled from the transitive logic of how we handle the big kink. Always handle your big kinks first ;-P

Accounting for events

Your reflex when you see a bump in the term structure is “what known event is happening between the expiry dates?”

In this case, NVDA reports earnings on Wednesday, 2/26/2025. The 2/28 expiry “captures” earnings most acutely. The earnings vol is embedded in every expiry from 2/28 onward, but its impact is attenuated as DTE grows. That earnings event becomes a smaller percentage of the total variance in the term. If you are looking at a 5 year option, the earnings vol will be invisible whereas the 2/28/2025 expiry has the bulk of its variance coming from that single day.

Here’s the plan.

1) We use the event volatility extraction formula which takes in DTE, IV and a guess for the event straddle (ie move size) to translate all the vols with earnings into “ex-earnings” vols. I’ll use the terms “ex-earnings vols” and “base vols” interchangeably.


💡A note on nomenclature

I’ve heard “ex-earnings vols” called:

  • clean vol
  • base vol
  • regular day vol
  • non-event vol
  • non-earnings vol

The point is that you are looking at a surface where known events have been removed. This allows higher fidelity comparisons between names. The event that is extracted is an implied or consensus move size. If you buy a vol with an event inside it you might be betting on the base vol being too low OR you believe the consensus move size is “too cheap”. Isolating what you’re betting on comes down to trade structuring. Maybe it’s a calendar spread, maybe it’s a “rent the straddle” glide path trade.

Depending on your trading lineage, even this nomenclature can be confusing. In my background I also referred to vols using a 365-day tenor as “dirty vols” and vols implied from a custom tenor as “clean”. For example, if you treat holidays and weekends as a 25% variance day your tenor will be about 280 days. I don’t want this post to encompass too much, but if you’re a glutton see Understanding Variance Time.


2) We can now chart the term structure of the base vols and feed them back into the forward vol matrix!

The goal of the guess is to see a smoother chart and matrix. Smoother. Real science-y stuff here.

Based on simple guess-and-test (which is easy to do with the spreadsheet I’ll provide) I came up with an earnings straddle of 6.5%.

The matrix looks much better (the smaller kinks are still present since we only dealt with one, albeit large event, but I’ll address that later. It’s easier to focus on one major thing at a time.)

Notice how the 2/28 expiry, stripped of earnings, now follows a gentle up-sloping term structure instead of stickin’ up like a sword from a stone.

[Bonus observation: power law functions handle vol term structures well. Remember a power function can be converted to a line using a log-log transformation where your variable Y is vol and X is DTE so you can fit a linear regression. You can start to imagine a wider infra where you have a well-defined event calendar, extract implied event sizes everywhere, and fit base vol term structures to identify kinks, ie buy and sell signals. This is where it dawns on the reader what relative value vol trading looks like. Throw in layers of execution topics and you can see the basic truth: there isn’t any magic sauce, it’s just fastening a thousand submarine doors before the thing can go anywhere. And every day the state of the art of little door details inches up.]
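A sketch of the log-log idea in the bonus observation. The term structure here is synthetic: I generate a clean power law, plant one kink, and let the regression residuals flag it. None of these numbers are NVDA's.

```python
import numpy as np

# Synthetic base vol term structure (not market data)
dte = np.array([14, 35, 63, 154, 245, 364], dtype=float)
base_vols = 0.40 * dte ** 0.05     # clean power law: vol = a * DTE^b
base_vols[2] += 0.02               # plant a 2-point kink at 63 DTE

# A power law is a straight line in log-log space, so fit a linear regression
slope, intercept = np.polyfit(np.log(dte), np.log(base_vols), 1)
fitted = np.exp(intercept) * dte ** slope

# Positive residual = rich vs. the fit, negative = cheap
residuals = base_vols - fitted
richest = int(np.argmax(residuals))
```

Whether a residual is a signal or a missing event in your calendar is the judgment call. The regression only tells you where to look.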

Let’s see the chart if we try 5% and 8% earnings moves respectively. The first chart keeps the dirty earnings curve and ex-6.5% earnings curves for reference.

The second chart zooms in.

And the matrices:

If you assume an 8% earnings move, the 2/28 expiry looks cheap. If you assume 5%, it looks rich. The first step is to find what makes the curve look well-behaved and then based on your view on base vol, earnings vol, or both you can isolate how you should trade it.
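A minimal sketch of the guess-and-test step. It assumes business-day counts and the same 16 and .8 shortcuts as the single-expiry example. I don't know the exact conventions inside the spreadsheet, so treat the day count as an assumption. The 55% IV and 28 days are placeholder inputs, not the actual NVDA snapshot.

```python
from math import sqrt

def base_vol(iv, days, event_straddle, root_days=16.0):
    """Strip a one-day event, given as a straddle (move size), out of an expiry's IV."""
    event_daily_vol = event_straddle / 0.8            # straddle back to standard deviation
    event_var = (event_daily_vol * root_days) ** 2    # annualized variance of the event day
    base_var = (iv ** 2 * days - event_var) / (days - 1)
    if base_var < 0:
        raise ValueError("event guess is too large for this expiry")
    return sqrt(base_var)

# Placeholder expiry that contains earnings
iv, days = 0.55, 28
guesses = {move: base_vol(iv, days, move) for move in (0.05, 0.065, 0.08)}
```

The larger the assumed earnings move, the lower the base vol that is left over, which is why the same expiry can look cheap under one guess and rich under another.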

Beyond a single event

Even with the adjustment, I’d readily admit the term structure is pretty kinky. But the ugliness is useful because it’s an opportunity to step-through what actually happens in practice. Let’s talk about how to iron the kinks that remain and see what’s left over.

Let’s step through some notable points on this chart. I’ll be clear in my explanation but I’ll do them a bit out of order.

Point #1: This doesn’t stand out as being too cheap relative to the rest of the curve because there’s no reason to assume that upward-sloping term structures are “wrong”. But there’s a technical reason it might appear so low…these vols use a 365 day model. This snapshot is taken on Friday. Vols “appear” to go down on Fridays and shoot up on Mondays but it’s a sawtooth artifact of a model which treats every day equally. This is explained thoroughly in Understanding Variance Time.

Point #2, a thru c: The 1/31/2025 expiry covers a busy macro week: inflation data, jobless claims AND an FOMC meeting. If we extract event straddles from this expiration the base vol will fall to line up much better with the power function. The expiries behind it (b and c) also contain that busy week but its effect will be diluted while a contains the brunt of it. So b will be higher than a but less than c which will fall the least. See how it’s creating upward steps!

We’ll come back to #3 and #4.

Points #5 and #6: June and September expiries. Notice June has a bigger bump than September. Have a guess?

Earnings! Although the earnings dates are in May and August they fall AFTER those expiries and are captured by the following month!

via Wall Street Horizon

September has a smaller bump than June because it’s further out in time. The impact of a single day move is proportionally smaller for a longer-dated option than a shorter-dated one.

Back to #3: This is the 2/21 expiry. This one is interesting. If we impose a lump of variance for FOMC week while it definitively has a larger impact on #2 a thru c, it will still have some impact in the 34 DTE 2/21 expiry which means as low as it looks, it’s even cheaper than it looks. If the snapshot is accurate, 2/21 looks like a candidate to buy vol. If you thought earnings vol looked “expensive”, you could sell 2/28, buy 2/21, then cover your 2/28 short as 2/21 expired. When I say “you” this is pros who’d kick their grandma down a flight of stairs for a tick of edge. You can throw trades like this onto the pile of tiny edges. I doubt the juice is worth the squeeze for retail. However, if you are looking to buy or sell options outright, then understanding this can help you on the margin. It’s a “I’d go for it on 4th and 3 but not 4th and 5” type of knowledge.

[An alternative thought…does the week before earnings structurally deserve a lower vol because the chance of the company saying anything material is close to zero? It’s good hygiene to wonder what you’re missing whenever something looks cheap or expensive. If you cleaned up all surfaces for events would you find the week preceding earnings to look cheap across the board?]

And finally #4: The 3/21 expiry looks expensive even after adjusting for earnings. Again, you’d want to doublecheck the vol returned by your snapshot.

If you thought earnings were expensive, a more oblique way to express it would be to buy #3 (2/21 expiry) and sell #4 (3/21 expiry) which corresponds to a 51.5% forward vol NET of adjusting for earnings. Most of that “value” does seem to be driven by 2/21’s cheapness rather than 3/21’s expensiveness (something you can notice by observing how the 2/28-2/21 forward is more stretched than the 3/21-2/28 forward, even when we anoint 2/28 as fair). This can inform your weighting of a calendar: if you sell 3/21, maybe you buy twice as much 2/21 if you think that’s the best leg. This is where having additional info about the flows that are pushing options around and the general “art” of trading becomes apparent.

Pretty pictures

If you clean every event and iron out all the kinks, you might just find a well-behaved curve. “Listen” to the market carefully. It’s a call and response:

You: “Look a kinky opportunity”

Market: “Nah, there’s something coming up. This is what people expect.”

You: “Ah, thanks for the heads up, I’ll incorporate it”

Market: “Aren’t you gonna cast your vote?”

The response is up to you, but see the present clearly. Remember, measurement not prediction.

I’ll leave you with a spreadsheet so you can play with an event size and see how it propagates through the term structure. Smooth curves = smooth forward vol matrix.

💾Moontower Event Vol Matrix

Spreadsheet screenshots:

 

Final thoughts

Trading vol around events is a major topic.

At scale, quants will have more “proper” methods for doing this but I can tell you that a significant portion of my career earnings have come from understanding this stuff. (It was 20 years ago, about 2005, that I was starting to build this infra. All in Excel by the way.)

The techniques improve. I’m not a quant as I’ve said many times. I don’t know the state-of-the-art but with some simple math and, yea, a lot of endurance, observing, and noticing you can go quite far.

Is this gonna turn you into SIG or Jane? Hell no, but these are the ant trails that take you to the questions. To a frame of mind that measures for and seeks contradiction. Notice how little broad opinions matter. Instead, you are trying to turn market prices into mini-hypotheses. Trades are tests against hypotheses.

But it starts with measurement.

Here are a few questions that option traders are asking every day.

How does the surface/consensus synthesize knowledge about:

  • prior earnings moves?
  • seasonal earnings moves?
  • time-series of implied earnings moves?
  • how earnings vol is cross-sectionally priced broadly and by sector?
  • how implied correlation is priced during earnings season?
  • how VRP’s look ahead of earnings? after earnings? if we clean implieds to get to base vols, do we clean realized vols after known events have passed to have base realized vols?

These are all active areas of inquiry. They are not solved problems. They will be eventually, but the conditions that solve them will give rise to another set of opportunities.

If you are an aspiring pro, insert yourself somewhere and just start chopping wood.

how to become the main character on voltwit

A reliable way to have Nassim Taleb come through your window to call you an iiiiidiot is to see a stock crash and say something like “That was a 10-standard deviation move!”

On #voltwit, the corner of fintwit with an option fetish, a surefire way to get QVR’s Benn Eifert to QT you is to quote VIX changes in percent terms.

The topic resurfaces every time VIX makes a large move and voltwit predictably reacts, so I figured Grok (X’s LLM chatbot) must know about this.

I asked Grok some questions and sure enough it seems like it’s been paying attention.

“Is it wise to quote VIX changes in percent terms?”

  • Volatility of Volatility: The VIX itself is inherently volatile, so small percentage changes might not always reflect significant shifts in market sentiment or conditions.
  • Misinterpretation: Large percentage changes from a low base might exaggerate perceived market fear or complacency. Conversely, small percentage changes at high levels might understate the market’s condition.
  • Contextual Information: Percentages without context (like the baseline level of the VIX) can be misleading. A 5% increase in VIX might mean different things at different times.

“Do vol traders dunk on tourists who quote VIX percent changes?”

In the trading community on platforms like X, there’s often a playful or critical banter between experienced volatility (vol) traders and those perceived as “tourists” or less experienced traders.

  • Misinterpretation Risks: Vol traders often emphasize that quoting VIX in percentage terms without additional context can lead to misinterpretations. For example, a 10% increase from a very low VIX level might not be as significant as the same percentage increase from an already high level.

     

Benn’s primary gripe with VIX percents is that the behavior of VIX is level-dependent. Its distribution is not congruent at high and low levels of vol.

Notice how the Y-axis is VIX vol points not percents.

In chatting with Benn about this article he pointed out a basic mechanic that makes vol level-dependent:

Volatility is inherently about squared returns, so you can have a very low base level of realized vol but all it takes is one big-ish sized return and because we’re squaring it (along with all the other little returns in the window) it’s going to have a massively outsized impact on window realized volatility. That makes vol very jumpy from low levels.
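Benn's mechanic is easy to reproduce. The sketch below assumes a 20-day window, zero-mean realized vol, and made-up return sizes. It drops the same 3% return into a quiet window and into an already volatile one.

```python
from math import sqrt

def realized_vol(returns, days_per_year=252):
    """Annualized realized vol from daily returns, assuming a zero mean."""
    return sqrt(sum(r * r for r in returns) / len(returns) * days_per_year)

def jump_in_points(quiet_return, shock=0.03, window=20):
    """Change in window realized vol (in vol points) when one day becomes a shock."""
    before = realized_vol([quiet_return] * window)
    after = realized_vol([quiet_return] * (window - 1) + [shock])
    return (after - before) * 100

low_base = jump_in_points(0.005)   # window starts near 8 vol
high_base = jump_in_points(0.02)   # window starts near 32 vol
```

From the low base the single 3% day adds about 5 vol points. From the high base the same return adds about 1.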

Another vol manager, Kris Sidial of Ambrus, explains it simply. Note my response below it.

There are multiple contexts in which it is quite useful to measure percent changes in volatility. There are tradeoffs, as you’d expect with any measure. But I’ve always been forceful about the need to slice things from different angles. It’s a healthy way to identify mixed signals, but it’s also affirming when sufficiently different angles agree.

A good example of this “multiple angles” idea is the 2 part series:

Let’s get into a few reasons to measure vol in percent changes.

Cross-sectional comparison

As a relative value options trader, I would typically have an “axe list”. These are vols in various names in various parts of the surface I thought were relatively cheap or expensive.

[The idea of an “axe list” is covered in the Moontower Mission Plan]

Armed with my opinions, I would then buy the options I thought were cheap on days when their strike vols were underperforming, and sell the expensive ones on days the strike vols were outperforming.

Because I’m looking at vols cross-sectionally it makes sense to look at the percent changes in the vol. A one-point move in SPY vol is much larger than a one-point move in TSLA vol.

[See Understanding The Vol Scanner for a full explanation.]

Notes and caveats

1) Measuring percent changes in vols works well “locally”.

For example, it was common in modeling spot-vol correlation in oil to assume that as oil futures went up 1%, that vol declined 1%.

[This dynamic corresponds to a “constant ATM straddle regime”. It is easily visible from the straddle approximation formula.]

But nobody believes that doubling the oil price will suddenly lead to a halving in vol. The model only works “locally.”
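The "constant ATM straddle" point falls straight out of the straddle approximation (straddle ≈ .8 × S × vol × √t). A small check with hypothetical oil numbers:

```python
from math import sqrt

def atm_straddle(S, vol, t):
    """Straddle approximation: 0.8 x S x vol x sqrt(t)."""
    return 0.8 * S * vol * sqrt(t)

# Hypothetical: $80 oil, 40% vol, 3 months to expiry
base = atm_straddle(80.0, 0.40, 0.25)

# Futures up 1%, vol down 1% (of its level, not 1 point)
local = atm_straddle(80.0 * 1.01, 0.40 * 0.99, 0.25)
```

The straddle price is unchanged to within a penny, since 1.01 × .99 ≈ 1. The formula would say the same for a doubling and a halving, which is exactly where the local rule stops being believable.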

2) Percent vol changes can be further refined by normalizing for “vol of vol”

If SPY vol changes from 20% to 21%, a 5% change in vol level might still be more significant than TSLA vol changing from 60% to 63%, also a 5% change, because TSLA vol of vol might be higher. After all, it might be common for TSLA vol to move 5% per day.

The analogy to regular investing would be the difference between a dollar-neutral position and a beta-weighted position. If you are long $100 of TSLA and short $100 of SPY, your portfolio will act like it’s long even though in dollars it’s flat. You are long beta because TSLA is more volatile.

[I’m ignoring the correlation aspect of beta because it’s not central to the argument.]
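One simple way to normalize, sketched below: divide the percent change in vol by how much that name's vol typically moves in a day. The 2% and 5% "typical" figures are assumptions for illustration, and the function name is mine.

```python
def vol_move_score(iv_before, iv_after, typical_daily_pct_move):
    """Percent change in vol, in units of the name's typical daily percent vol move."""
    pct_change = iv_after / iv_before - 1.0
    return pct_change / typical_daily_pct_move

# Both vols are up 5%, but SPY vol is assumed to move 2% on a typical day vs 5% for TSLA
spy_score = vol_move_score(0.20, 0.21, 0.02)
tsla_score = vol_move_score(0.60, 0.63, 0.05)
```

On these assumptions the SPY move is 2.5 "typical days" large while the TSLA move is 1, the vol equivalent of beta-weighting.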

3) An extra note on “vol of vol”

If you measure vol of vol based on changes in ATM vol you are getting a confounding reading. Like if you measured your pulse with your thumb.

Why?

ATM vols are “floating” strike vols. If SPY drops 1% and ATM vol increases by 1 point, that might just be movement along the vol curve. The vol on the 99% strike might have simply been 1 point higher than the prior day’s 100% strike. On a fixed strike basis, the vol didn’t change. In this case, the appearance of a vol change merely reflected a change in the underlying.

For vol trading purposes, you usually care about fixed strike changes (ie curve shifts not movement along the curve) because that’s what drives the vega p/l of the attribution.
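The SPY example in code, using a hypothetical linear skew of 1 vol point per 1% of strike and a spot of 100 so that strikes read as percentages.

```python
def strike_vol(strike, curve_shift=0.0):
    """Hypothetical skew: vol rises 1 point for each 1% drop in strike from 100."""
    return 0.20 - 0.01 * (strike - 100.0) + curve_shift

def vol_changes(spot_before, spot_after, curve_shift):
    """Return (floating strike change, fixed strike change) for today's ATM strike."""
    today_atm = strike_vol(spot_after, curve_shift)
    floating = today_atm - strike_vol(spot_before)   # vs yesterday's ATM vol
    fixed = today_atm - strike_vol(spot_after)       # vs the same strike yesterday
    return floating, fixed

# Spot drops 1% and the curve itself does not move
floating, fixed = vol_changes(100.0, 99.0, curve_shift=0.0)
```

ATM vol reads 1 point higher while the fixed strike change, the one that feeds vega p/l, is zero.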

Risk and P/L measurement

The second reason to care about percent changes in vol only applies to vol traders, defined here as traders who run a delta-neutral book and make their edge from isolating cheap and expensive vol.

That said, the discussion should be highly educational for anyone trying to learn options or as a useful self-test for traders who might be interviewing and expected to talk about managing a book.

Let’s back up to consider vol risk. Specifically, vega, the sensitivity of your option p/l due to changes in implied vol.

We start with a scenario. Assume the ATM and at-the-forward (ATF) strikes are the same.

You buy 100 December ATM straddles in Stock A and short 100 December ATM straddles in Stock B.

Stock A and Stock B trade for the same price.

Stock B has 2x the implied vol of Stock A.

Are you vega-neutral?

Are you theta-neutral?

[You can look at the greeks from an option calculator to help but if you are an experienced option trader you shouldn’t need to.]

Ok, let’s get to the answers.

You are vega-neutral. Recall the straddle approximation:

straddle ≈ .8 * S * σ * √t, where σ is the implied vol

Since vega is just change in option (in this case straddle) price per 1 point change in vol, then:

vega = .8 * S * √t

Look at the formula — ATF vega has no dependence on vol level!

Since S and t are equal then your long and short vega perfectly offset.

[Note: OTM option vega DOES depend on vol level. They have volga or “vol gamma” which is what fuels vol convexity.]

Ok, you’re vega-neutral.

Are you theta-neutral?

Again we don’t need an option model. If Stock B has 2x the vol of Stock A, its straddle is 2x the price. If both stocks don’t move until expiry, all options go to zero. Necessarily, Stock B experiences 2x the decay.

If you are short the straddle in Stock B, your portfolio collects theta. It is NOT theta-neutral.
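If you'd rather check the riddle with a model, here is a sketch that prices an at-the-forward straddle with Black Scholes, assuming zero rates and dividends. The $100 stock price and 3-month expiry are made up. Vega is measured by bumping vol one point.

```python
from math import sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def atf_straddle(S, vol, t):
    """Black Scholes straddle struck at the forward, zero rates and dividends."""
    return 2.0 * S * (2.0 * norm_cdf(0.5 * vol * sqrt(t)) - 1.0)

def vega_per_point(S, vol, t, bump=0.01):
    """Straddle price change for a 1 point increase in vol."""
    return atf_straddle(S, vol + bump, t) - atf_straddle(S, vol, t)

S, t = 100.0, 0.25
vega_a, vega_b = vega_per_point(S, 0.25, t), vega_per_point(S, 0.50, t)
price_a, price_b = atf_straddle(S, 0.25, t), atf_straddle(S, 0.50, t)
```

The two vegas agree to within about 1% and both sit close to .8 × S × √t per point, while Stock B's straddle costs almost exactly twice Stock A's. Twice the premium has to decay over the same time.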

Vol traders will often think in terms of vega. “I bought $50k vega in ABC today”.

At the same time, they often try to run a roughly theta-neutral book.

[See Weighting An Option Pair Trade for a discussion about vega and theta weighting and how the weighting should be matched to the expression of your bet — proportional vs spread].

In the riddle above, being vega-neutral did not mean theta-neutral. But we can actually transform vega so that a vega-neutral position is correlated to a theta-neutral position!

Another way to measure vega: “vega per 1%”

Let’s say the vega per straddle was $.50

If you buy 100 straddles your vega is 100 x $.50 x (100 multiplier) = $5,000

If vol increases by 1 point you make $5,000 from the change in implied vol.

Assume that the implied vol of the straddle is 25%

Multiply the vega by the vol:

$5,000 x 25% = $1,250

Watch what happens if we raise the vol by 1% or .25 points instead of 1 point:

Vega p/l: $5,000 x .25 points = $1,250

Remember, when we raised vol by a full point from 25% to 26% (a 4% change in the vol) you made $5,000 (or $1,250 x 4).

By multiplying the vega by the vol itself we have created a new measure:

Vega per 1% measures the vega p/l per 1% change in the vol.

Let’s return to the original riddle.

We now assign implied vols to the straddle. You are long 100 straddles of Stock A at 25% vol and short 100 straddles of Stock B at 50% vol.

While this is vega neutral, it is NOT vega per 1%-neutral

Stock A “vega per 1%”: +$5,000 x 25% is +$1,250

Stock B “vega per 1%”: -$5,000 x 50% is -$2,500

You are net short $1,250 vega per 1%
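The same bookkeeping in code, using the riddle's numbers. The function name is mine. Vols are expressed in points so that 1% of a 25 vol is .25 points.

```python
def vega_per_1pct(vega_per_point, iv_points):
    """P/l for a 1% relative change in vol: vega per point x 1% of the vol level."""
    return vega_per_point * iv_points * 0.01

# Long 100 straddles in A at 25 vol, short 100 straddles in B at 50 vol
stock_a = vega_per_1pct(+5000.0, 25.0)
stock_b = vega_per_1pct(-5000.0, 50.0)

net_vega = 5000.0 - 5000.0               # flat in conventional vega
net_vega_per_1pct = stock_a + stock_b    # short in vega per 1%
```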

This perspective is useful for a few reasons:

1) Linear estimate of p/l with respect to percent changes in vol

If vol is up 3% your p/l is simply 3 x vega per 1%. If you are using a view like “vol scanner” to see all the percent changes in vol cross-sectionally, the changes will map easily to your vega per 1% risk.

2) Vega per 1% proxies a theta-weighted position which is how vol traders often think about their risk and the idea that they are betting on relative proportional vol changes.

If you are short vega per 1% you are collecting theta

Multiple angles

Looking at vega in both the conventional way (p/l sensitivity per 1 point change in vol) and vega per 1% reveals features of a position.

If you are long vega per 1% but short vega, what does that mean?

Any combination of the following:

  • You are short time spreads,
  • You are long high vol options and short lower vol options. Owning skew or vol convexity are both examples of this.
  • Cross-sectionally you are long high vol names and short low vol.

[Note in all these cases it’s possible to be paying theta and short gamma locally. But if you shocked the position in a scenario analysis you likely make a ton of money. The relationships between Greeks are all clues as to what is lurking in a complex portfolio.]

In the riddle scenario, to be flat vega per 1%, you must ratio the trade and be short 1/2 as many high vol straddles. Note you will be net long vega. You will win if all the vols parallel shift higher (ie they all go up 10 points), but if they maintain their .5 relationship the p/l will be flat, consistent with the meaning of flat vega per 1%.
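Here is a rough sketch of that ratio'd trade under two shocks. It assumes $50 of vega per straddle (inferred from 100 straddles carrying $5,000 of vega) and treats vega as constant, which is only a linear approximation; the names are illustrative.

```python
# ASSUMPTION: $50 of vega per straddle, inferred from the riddle (100 straddles = $5,000 vega).
VEGA_PER_STRADDLE = 50.0

def vega_pnl(qty, vol_change_points):
    """Linear estimate only: ignores vega changing as vol and spot move."""
    return qty * VEGA_PER_STRADDLE * vol_change_points

qty_a, vol_a = 100, 25.0   # long 100 Stock A straddles at 25 vol
qty_b, vol_b = -50, 50.0   # short half as many Stock B straddles at 50 vol

net_vega = (qty_a + qty_b) * VEGA_PER_STRADDLE
net_vega_per_1pct = (qty_a * vol_a + qty_b * vol_b) * VEGA_PER_STRADDLE / 100

# Parallel shift: every vol up 10 points.
parallel_pnl = vega_pnl(qty_a, 10) + vega_pnl(qty_b, 10)
# Proportional shift: every vol up 10% of itself (2.5 and 5 points).
proportional_pnl = vega_pnl(qty_a, vol_a * 0.10) + vega_pnl(qty_b, vol_b * 0.10)

print(net_vega, net_vega_per_1pct, parallel_pnl, proportional_pnl)
```

The position is long vega, flat vega per 1%, wins on the parallel shift and is flat on the proportional one.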

Understanding your greeks means understanding what you’re rooting for. You’d be surprised to know that sometimes option traders don’t even know what they’re rooting for.

When you get down to it, any large percentage change in vol is going to require multiple angles to understand. Your p/l isn’t going to line up because vega itself will change as the underlying changes and vol changes interact. Measuring percent changes on small numbers is usually a bad idea and requires transformations to divine anything worth mentioning.

Does it make sense to talk about a 75% change in VIX from a base vol of 8%? Of course not. One of the reasons you know that is because it can’t fall 75% from 8%. That’s a clue right there that “standard deviation”, a concept we learned about from symmetrical pictures in HS math texts, is not in charge.


In sum, percent changes in vol can be useful measures but you have to know how to wield them and where they break down.

Unless you want to be the main character on voltwit for a day (and have to fix your broken window). But if you’re ok with that at least go the extra mile and do some technical analysis on VIX.

Final caveat

When you use vega per 1% you implicitly assume that both assets have the same vol of vol. In other words, if a 15% vol name’s IV bounces around 1 vol point per day, then a 30 vol name bounces around 2 vol points per day. This may or may not be true but it’s a better guess than raw vega weighting, which would show being long 100 straddles in A (25% IV) and short 100 straddles in B (50% IV) as flat.

arbitrage is a hall of mirrors

Given where markets are these days, there are a lot of investors, often former or current employees and execs of Mag7 names, who are sitting in large, concentrated positions at a low cost basis.

In English, they’re as rich as celebrities but standing right next to you giving out Pocky on snack duty for 3rd grade soccer.

They are reluctant to sell because the tax hit is immediate. One possible solution to “have their cake and eat it too” is to stay long but collar the stock. This is typically presented as buying a put option financed by a covered call.

Here’s an example based on closing TSLA option prices on 1/28/2025 for the Jan 15th 2027 expiry (ie 717 DTE).

The stock closed around $396.65.

We can just round numbers, call it $400.

You can buy the 25% out-of-the-money put, the 300 strike, for about $56 and sell the 25% out-of-the-money call, the 500 strike, for about $108.

To be perfectly clear — you can buy the put for protection, sell the upside call and COLLECT about $52 or about 13% premium.

Think about the risk/reward for a moment.

If the stock drops $100 in 2 years you are stopped out at $300 but you collected $52 so your net loss is only $48 or about 12%.

If the stock climbs to $500, you will get assigned on the short call so you’ll make $100 on the long stock position but still get to keep the $52 premium for a total gain of $152 or about 38%.

In other words, you can stay long the stock but you get paid 3x what you lose on a $100 up move vs $100 down move.
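A minimal sketch of the collar's expiry p/l, using the rounded numbers from the example ($400 stock, 300 put, 500 call, $52 net credit). It ignores interest earned on the premium and anything that happens before expiry; the function name is just for illustration.

```python
def collar_pnl(final_price, spot=400.0, put_strike=300.0,
               call_strike=500.0, net_credit=52.0):
    """Expiry p/l per share of long stock + long put + short call."""
    # The put floors the sale price, the short call caps it.
    sale_price = min(max(final_price, put_strike), call_strike)
    return sale_price - spot + net_credit

for price in (200, 300, 400, 500, 600):
    pnl = collar_pnl(price)
    print(price, pnl, f"{pnl / 400:.0%}")
```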

It sounds like free money.

The prices come from option theory’s arbitrage-free (ie risk neutral) pricing.

This is a checklist of forces that seem to create the illusion.

✔️The forward price is actually $430

We know that because if you look at the option chain, despite the $430 call being ~ $33 out-of-the-money, it’s the same price as the 430 put which is in-the-money.

The reason is that if the two prices differed, you could put on a reversal or conversion trade to arbitrage the funding rate on the stock.

Think of it this way, if the 430 call cost $20 more than the $430 put you could sell the call, buy the put and collect $20. At expiry, since you are short the $430 synthetic stock you are guaranteed to sell TSLA at $430 (either you exercise the 430 put if it’s ITM or get assigned on the 430 call if it’s ITM). So you can buy the stock today for say $397 which would be a (mostly) riskless position since you are long the stock and short the synthetic. The cashflows would be:

  • Collect $20 on the synthetic (remember you sold the call for $20 more than the put)
  • Ensure a profit of $33 by expiry (you bought TSLA for $397 and will sell it at expiry at $430)
  • Forgo ~$32 interest on $397 for 2 years (assume 4% rfr)

Net arbitrage profit: +$21 in excess of funding costs!

If the 430 call traded $20 UNDER the put you would do the arbitrage in reverse. You’d buy the call, sell the put and be guaranteed to buy the stock for $430 at expiry. To hedge you would short it today at $397 and collect ~$32 of interest on the cash in your account.

So at expiry you are buying the stock for $430 that you shorted at $397. Cash flows:

  • -$33 on buying TSLA synthetically and shorting it today
  • +$32 in interest on cash proceeds from the short
  • +$20 in option premium (remember, you sold the put $20 higher than the call you bought)

Net arbitrage profit: +$19 in excess of funding costs!

If the RFR is 4% (which it approximately is) then the 430 call and put must trade around the same price for there to be no arbitrage.

Therefore $430 is the 2 year at-the-forward strike.
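The two cash-flow tallies can be sketched as follows. This uses simple (non-compounded) interest at 4% for 2 years, as in the bullets, and ignores dividends, borrow costs and early exercise; the function names are illustrative.

```python
def funding_cost(spot=397.0, rate=0.04, years=2.0):
    """Simple interest on the cash tied up in (or freed by) the stock."""
    return spot * rate * years

def conversion_pnl(call_minus_put, spot=397.0, strike=430.0):
    """Sell call, buy put (short synthetic at the strike), buy stock today."""
    return call_minus_put + (strike - spot) - funding_cost(spot)

def reversal_pnl(put_minus_call, spot=397.0, strike=430.0):
    """Buy call, sell put (long synthetic at the strike), short stock today."""
    return put_minus_call - (strike - spot) + funding_cost(spot)

print(round(conversion_pnl(20)))   # call $20 over the put
print(round(reversal_pnl(20)))     # call $20 under the put
print(conversion_pnl(0), reversal_pnl(0))  # call = put: roughly nothing left
```

When the call and put trade at the same price, both trades net out to about a dollar either way, which is what pins the forward near $430.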

✔️Despite both option strikes being $100 or 25% away from the spot price, the call is much “closer”

Part of this has to do with the forward being $430. Referencing the 430 strike the 500 strike is only 16% OTM while the 300 strike is now 30% OTM.

The option that is “closer” has a higher delta and is worth more due to moneyness.

But the other reason comes from the fact that Black-Scholes assumes a lognormal distribution of returns (which is a positive skew distribution).

Why? If a stock is bounded by zero but has infinite upside the OTM call will be worth more than the equidistant OTM put. The distribution is balanced around a median stock expectation that is dragged lower by volatility (if you make 25% then lose 25% you are net down over 6%).

In TSLA’s case the 300 put has a -.20 delta while the 500 call has a .60 delta!

(TSLA also has an inverted skew — the call IV is a touch higher than the put IV but that has a minor effect on the cost of the collar in the context of this discussion.)
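To see how a forward and a lognormal distribution produce deltas like these, here is a rough Black-Scholes sketch. The flat 60% implied vol is my assumption for illustration (the actual IVs are not quoted above), as are the 4% rate and zero dividends.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_delta(spot, strike, vol, years, rate, is_call):
    """Black-Scholes delta for a European option on a non-dividend stock."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt(years))
    return norm_cdf(d1) if is_call else norm_cdf(d1) - 1.0

spot, rate, years = 396.65, 0.04, 717 / 365
vol = 0.60  # ASSUMPTION: flat vol for both strikes, not a quoted market IV

forward = spot * exp(rate * years)
put_delta = bs_delta(spot, 300, vol, years, rate, is_call=False)
call_delta = bs_delta(spot, 500, vol, years, rate, is_call=True)

print(round(forward, 1), round(put_delta, 2), round(call_delta, 2))
```

Under these assumptions the forward lands near $430 and the two "equidistant" strikes come out with very different deltas, in line with the figures above.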

Here’s a summary table including the collar price if the IV was the same for the 300 put and 500 call:

💡What this post “encompassed”

If you understand this post you have implicitly reviewed:


I called this post “arbitrage is a hall of mirrors” because no-arbitrage pricing theory created this situation where the risk/reward of the collar looks incredibly attractive.

Part of that is that theory explicitly incorporates the opportunity cost (the risk-free rate) while our intuition tends to gloss over it. Opportunity cost is an easy topic to understand when someone explains it to you, but it’s trickier to apply in live decision-making scenarios. Look no further than rich people who clip coupons or drive 10 miles out of their way for Costco gas.

The output of arbitrage-pricing can be dissonant to our eyeball tests. It was one of my favorite topics to write about because it does feel so warped.

🟰Understanding Risk-Neutral Probability

This is my guide to the subject. It’s full of nested problems, Socratic method, and even financial theory as philosophy. I’ll re-print one of the nested sections:

👽Real World vs Risk-Neutral Worlds

No-arbitrage probabilities allow us to price options by replication

The insight embedded in Black-Scholes is that, under a certain set of assumptions, the fair price of an option must be the cost of replicating its payoff under many scenarios. Any other price offers the opportunity for a risk-free profit. Have you ever wondered why the Black-Scholes “drift” term for a stock is the risk-free rate and not an equity risk premium (like you’d expect from another type of pricing model, such as CAPM) or the stock’s WACC? A position in a derivative and an opposing position in its replication is a riskless portfolio. Therefore that portfolio only needs to be discounted by the risk-free rate. Option pricing derived from a no-arbitrage replication strategy means we should use the risk-free rate to model a stock’s return.

‼️What seasoned option traders get wrong: Outside of the option pricing context, the risk-free rate is the wrong assumption for drift!

From Philip Maymin’s Financial Hacking:

One of the most common mistakes that even highly experienced practitioners make is to act as if the assumptions of Black-Scholes (lognormal, continuous distribution of returns, no transactions costs, etc.) mean that we can always arbitrarily assume the underlying grows at the riskfree rate r instead of a subjective guess as to its real drift μ. But this is not quite accurate. The insight from the Black-Scholes PDE is that the price of a hedged derivative does not depend on the drift of the underlying. The price of an unhedged derivative, for example, a naked long call, most certainly does depend on the drift of the underlying. Let’s say you are naked long an at-the-money one-year call on Apple, and you will never hedge. And suppose Apple has very low volatility. Then the only way you will profit is if Apple’s drift is positive…if it drifts down, your option expires worthless. But if you hedge the option with Apple shares, then you no longer care what the drift is. You only make money on a long option if volatility is higher than the initial price of the option predicted. The drift term of the underlying only disappears when your net delta is zero. In other words, an unhedged option cannot be priced with no-arbitrage methods.

💡Takeaway: Arbitrage Pricing Theory

Sometimes called the Law of One Price, the idea contends that the fair price of a derivative must be equal to the cost of replicating its cash flows. If the derivative and cost to replicate are different then there is free money by shorting one and buying the other. This approach is how arbitrageurs and market-makers price a wide range of financial derivatives in every asset class including:

  • Futures/Forwards
  • Options
  • ETFs and Indexes

These derivatives are the legos from which more exotic derivatives are constructed.

A Source of Opportunity

Let’s recap the logic:

  1. Arbitrage ensures that the price of a derivative trades in line with the cost to replicate it.
  2. Consider a master portfolio comprising:
    1. a position in a derivative
    2. an offsetting position in its replicating portfolio
    This master portfolio is riskless.
  3. A riskless portfolio will be discounted to present value by a risk-free rate otherwise there is free money to be made.
  4. The prevailing prices of derivatives imply probabilities.
  5. Those probabilities are risk-neutral arbitrage-free probabilities.

But those probabilities don’t need to reflect real-world probabilities. They are simply an artifact of a riskless arbitrage if it exists.

This can lead to a difference in opinion where the arbitrageur and the speculator are happy to trade with each other.

  • The arbitrageur likely has a short time horizon, bounded by the nature of the riskless arbitrage.
  • The speculator, while not engaging in an arbitrage, believes they are being overpaid to warehouse risk.

Examples

1) Warren Buffett selling puts

The Oracle of Omaha engages in oracular activity — not arbitrage. Warren is well-known for his insurance businesses which earn a return by underwriting various actuarial risks. Warren is less famous for his derivatives trades. [The fact that he rails against derivatives as WMDs might be the most ironic hypocrisy in all of high finance but as I always say — we are multitudes.] Like his insurance business, the put-selling strategy hinges on an assessment of actuarial probabilities. In other words, he believes that real-world probabilities suggest a vastly different value for the puts than risk-neutral probabilities. The major source of the discrepancy comes from the drift term in Black Scholes. Warren is pricing his trade with an equity risk premium in excess of the risk-free rate that a replicator who delta hedges would use.

The option traders who trade against him can be right by hedging the option, effectively replicating an offsetting option position at a better price than the one they trade with Berkshire. Warren is happy because he thinks the price of the option is “absurd”. In Warren Buffett is Wrong About Options, we see this excerpt from a Berkshire letter during the GFC:

[Image: excerpt from the Berkshire letter]

Jon Seed writes:

Warren’s assumptions aren’t crazy. In fact, they seem to be pretty accurate. As Robert McDonald derives in the 22nd Chapter of his 3rd edition of Derivatives Markets, a 100 year put for $1bn assuming 20% volatility, a long-term risk free rate of 4.4% and a dividend rate of 1.5% implies a Black-Scholes put price of $2,412,997, close to Buffett’s $2.5 million. But Warren isn’t discussing risk-neutral probabilities, those assumed in Black-Scholes and imputed by volatility assumptions. He’s evaluating the model’s probabilities as if they were real, actual probabilities. If we (really Robert McDonald) evaluate Black-Scholes using real probabilities by also incorporating our best guess of real equity discount rates, we see that the model is consistent with Warren’s common sense approach. Assuming stock prices are lognormally distributed and that the equity index risk premium is 4%, we would substitute 8.4% for the 4.4% risk-free rate, obtaining a probability of less than 1% that the market ends below where it started in 100 years. Buffett also assumes that the expected loss on the index, conditional on the index under-performing bonds, will be 50%. This again is a statement about the real world, not a risk-neutral world, distribution. With an 8.4% expected return on the market, the implied expectation of $1 billion of the index conditional on the market ending below where it started is $596,452,294, or 59.6% of the current index value. Again, this is close to Buffett’s assumption of 50%.

2) FX Carry

FX futures are derivatives. Their pricing is a straightforward output of the covered interest parity formula. I think I learned this concept on the first day of trader class back in the day.

The key to the formula is recognizing that the value of the future is just the arbitrage-free price arising from the difference in deposit rates between the 2 currencies in the pair.

If a foreign currency offers a higher interest rate than a domestic currency, you expect its future to trade at a discount. We won’t bother with the math since the intuition is sufficient:

If you borrowed the domestic currency today to buy the foreign currency so you could earn the yield spread for say 1 year, you’d have a risky trade — you’d be exposed to the foreign currency, and its associated interest income, devaluing when you try to convert it back to the domestic currency.

Therefore, to make the trade riskless, you need to lock in the forward rate today by selling the future. You know what that means: you expect that forward rate to trade at the no-arbitrage price.

The higher-yielding currency must therefore have a lower forward FX rate.
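For anyone who does want the math, a minimal sketch of covered interest parity with simple interest looks like this. The spot rate and the two deposit rates are hypothetical numbers picked for illustration, and the quote convention (domestic units per 1 unit of foreign currency) is an assumption of the example.

```python
def cip_forward(spot, domestic_rate, foreign_rate, years=1.0):
    """No-arbitrage forward FX rate, quoted as domestic units per 1 foreign unit.

    Borrow domestic, convert at spot, earn the foreign rate, and sell the
    proceeds forward: the forward that leaves no free money is returned.
    """
    return spot * (1 + domestic_rate * years) / (1 + foreign_rate * years)

# Hypothetical inputs: spot of 1.00, domestic deposits at 2%, foreign at 6%.
spot = 1.00
forward = cip_forward(spot, domestic_rate=0.02, foreign_rate=0.06)
print(round(forward, 4))  # below spot: the high yielder trades at a forward discount
```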

The carry trade is basically a speculator saying:

“I know the future FX rate should trade at a discount to the spot rate but I’ve noticed that the future rate rolls up to converge to the spot rate, rather than that lower rate being a predictor of the spot FX rate in one year.

So I’m going to buy that FX future and hold it for a profit.”

The carry trade is not an arbitrage or riskless profit. It’s a risky profit. But the opportunity arises because the futures contract would present an arbitrage at any price other than the risk-neutral price.

another XIV brewing in crypto?

If you don’t know what MicroStrategy (MSTR) is then congrats, you have won life. Close this tab and go back to sliding down rainbows and swimming with otters.

For those who remain, you likely know that Saylor has been financing his BTC purchases from the sale of convertible bonds.

I have nothing to add to that conversation but I have a trade idea. It’s gonna take some background to build up to it.

First, there are 2 required reads. They aren’t long and they’re excellent. The best combination. I will highlight some key points from them.

Convert of Doom: Microstrategy and the dark arts of ‘volatility arbitrage’ (6 min read)
by Alexander Campbell

This post explains how Saylor is effectively arbitraging MSTR’s volatility by issuing converts that pay zero interest. This works because a convert is just a bond with an embedded call option. By delta hedging the embedded option, dealers or investors can earn a return if the realized vol exceeds the implied vol. The expected return presumably compares favorably, or at least similarly, to what they would earn if MSTR just issued interest-bearing debt, but Saylor is effectively transmuting volatility into interest payments.

[In general, when a convert is first issued it’s common for both the stock and vol to decline as dealers hedge both the delta and often the implied vol by selling long-dated options to offset some of the vega they’ve bought at a discount.]

Campbell is both educational and insightful showing how:

1) the Merton model can be used to understand why MSTR is so much more volatile than BTC — MSTR’s premium to NAV is positively correlated to BTC!

(In Battle Scars As A Call Option, I explained how one of my most painful trades occurred when I was long UNG vol when it went premium to NAV. In that case, the sizeable premium was inversely correlated with the price of gas. The exact opposite scenario of MSTR’s juiced vol today!)

2) this is a regulatory arbitrage.

Quoting Campbell:

Result: Retail buys MSTR shares at 150% premium while sophisticated investors arbitrage vol differentials and MSTR books the diff between all these trades as profitable transactions.

Here’s the irony: We require hedge funds to register with the SEC, spend $50-500k annually on compliance, and limit themselves to accredited investors with millions in the bank. Yet retail investors can freely buy MSTR shares through Robinhood.

And therein lies all the difference. There’s nothing wrong with what MSTR is doing, but it’s a good example of the law of unintended consequences.

Regulators block retail from ‘risky’ hedge funds while inadvertently pushing them into something potentially more dangerous.

By restricting crypto access for years, regulators left retail investors few options. Bitcoin futures required $300k contracts with 50-100% margin. ETFs were obscure or nonexistent. So people bought MSTR instead – a far more complex and potentially risky vehicle.

In trying to protect retail investors, the SEC has inadvertently funneled them into a potentially much riskier product.

Which brings us to the next required reading:

Moonshot or Shooting Star? A Volatile Mix of MicroStrategy, 2x Leveraged ETFs and Bitcoin (7 min read)
by Elm Wealth

Oh how I love the existence of levered ETFs on concentrated ideas. This post echoes a very real possibility of XIV’s “volmageddon”.

Something we’ve discussed ad nauseam in this letter is volatility drag and how geometric returns diverge much lower from arithmetic returns as we increase volatility. The divergence is proportional to variance or volatility squared.
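A toy illustration of the drag, using a made-up path that alternates up and down days so the arithmetic average return is exactly zero. The 5% daily move and the frictionless 2x version (which simply doubles each daily return) are assumptions for the example, not a model of any real ETF.

```python
from math import log

def compound(returns):
    """Terminal wealth from $1 compounded through a list of simple returns."""
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth

# Up 25% then down 25%: arithmetic average is zero, but you are down 6.25%.
print(compound([0.25, -0.25]))

# 252 days alternating +5% / -5% (1x) versus +10% / -10% (2x levered).
one_x = compound([0.05, -0.05] * 126)
two_x = compound([0.10, -0.10] * 126)
print(round(one_x, 3), round(two_x, 3))

# Doubling the vol roughly quadruples the drag in log terms (drag ~ variance).
drag_ratio = log(two_x) / log(one_x)
print(round(drag_ratio, 2))
```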

The article links to a neat calculator which offers a hands-on lesson in volatility drag.


💡Learn more💡

And linking these to options which is where we are heading:


Exponents are good, wholesome fun. And this post was certainly that, inspiring the trade idea we’re building towards.

The long quote below (emphasis mine) cuts to the heart of the matter.

Now let’s use some data to look at the probability of going bust just from a single really bad day. The price of a 2x leveraged ETF should go to zero if the price of the stock underlying the ETF goes down by 50% or more in a single day. The probability of such an event is a function of the variability of the MSTR stock price. If we assume the volatility of MSTR will be about 90% (or 5.6% per day), then we could think of a 50% decline in the stock price in one day as being a roughly 9x daily volatility move. A natural question is how often do stocks with very elevated variability, like MSTR, experience days when they decline by 9x their daily variability in returns?

We looked at about 1500 US stocks over the past 50 years, chosen so that at some point they were within the top 1000 stocks by market-cap. We found that the annual probability of such stocks experiencing a one-day price decline of 9x daily volatility was about 6%.

[Kris: The fatness of the tails should swipe you like a dragon. In Mediocristan, 9 standard dev moves don’t happen.]

This isn’t quite the final answer though, as we need the probability of a stock dropping by that much some time during the day, rather than just close-to-close. The usual estimate for the probability of touching a level over some time interval is to simply double the probability of being below that level at the end.

[The explanation of this is the same logic we’ve discussed whereby we estimate the probability of a one-touch by doubling the delta. Here’s Elm Wealth explaining:

To see why this is true in a simple random walk without drift, note that for every path that finishes below the level at the end of the period, there is another path where it hit the level and then followed a path that was a mirror of the path that finished below the level. So, for every path that finished below the relevant level (here a 50% drop), there’s another path that touched the level but then reflected and wound up above the level at the end.]

So, assuming MSTR volatility of 90% per annum, the probability of a down 50% intra-day move occurring at least once over the next year is about 12%.

If we use the MSTR volatility implied by the options market of 160%, then down 50% is only 5x daily volatility. The same data as above yields a close-to-close annual probability of about 30%, which we estimate as about a 60% probability of an intra-day drop that would send the ETF to 0.

There are a number of alternative perspectives one could take in trying to estimate this probability: for instance, trying to estimate the probability of a large one-day drop in Bitcoin and how that might impact the MSTR premium to BTC. For example, a 25% one day drop in BTC and a 33% collapse of the MSTR premium would imply a 50% drop in the MSTR share price.

[Kris: This hints at the MSTR premium vs BTC correlation Campbell wrote about]

A more complex analysis might try to estimate whether it is possible for these leveraged ETFs to become large enough that their daily rebalancing trades could themselves drive the price down 50% in one day. For example, imagine that MSTR rapidly triples in price due to some combination of BTC rally and an increase in MSTR’s premium to the BTC it owns, and the assets in the MSTR leveraged ETFs go from $5 billion to $30 billion. The market capitalization of MSTR could be about $270 billion and the leveraged ETFs would be owning $60 billion, or 22%, of MSTR stock outstanding.

Now imagine for some reason, MSTR stock drops 15% during the day – which, given MSTR volatility, would not be unusual. The leveraged ETFs would need to sell $9 billion of MSTR stock at the closing price. Recently, MSTR daily average trading volume at the close of the day has been about $2 billion, so this would be quite an impactful amount of MSTR to sell at the end of the day. For every 1% the price declines further than the 15%, the ETFs will need to sell another $500 million of MSTR, and if that pushes the price down by another 1%…well, you can see this doesn’t have a happy ending for owners of the leveraged ETF or MSTR.

[Kris: see The Gamma of Levered ETFs]

Bottom line, we think there’s a pretty decent probability – somewhere in the range of 15% to 50% – that these 2x leveraged MSTR ETFs are effectively wiped out in any given year if they are not voluntarily deleveraged or otherwise de-risked sooner.
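The rebalance arithmetic in the quoted passage can be sketched like this. It is a stylized model of a daily-reset levered fund (no fees, financing or swap mechanics), and the function names are my own; the $30 billion and 15% inputs come from the Elm Wealth example.

```python
def aum_after_move(aum, leverage, underlying_return):
    """Fund equity after the underlying moves, before any rebalance."""
    return aum + leverage * aum * underlying_return

def rebalance_trade(aum, leverage, underlying_return):
    """Dollars of underlying to trade at the close to restore target leverage.

    Negative means the fund must sell.
    """
    exposure_after_move = leverage * aum * (1 + underlying_return)
    target_exposure = leverage * aum_after_move(aum, leverage, underlying_return)
    return target_exposure - exposure_after_move

# $30bn in 2x funds, MSTR down 15% on the day (figures in $bn).
trade = rebalance_trade(30.0, 2.0, -0.15)
print(round(trade, 2))                    # about -9, i.e. sell $9bn

# Down 50% in a day wipes out the 2x fund's equity.
print(aum_after_move(30.0, 2.0, -0.50))
```

Algebraically the trade is leverage x (leverage - 1) x AUM x return, which is why the flows always chase the move.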


Towards a trade idea

The 2x ETF is MSTU and the 2x inverse ETF is MSTZ. Unless these are delevered, if MSTR [falls/rises] by 50% in one day [MSTU/MSTZ] goes to zero.

I’m going to walk you through my stream of consciousness as I reached the end of the article.

1) I’ll accept Elm Wealth’s logic. My first question is…um, are there options listed on the levered ETFs?!

Checkmark✔️

MSTZ is thin but MSTU has over 350k contracts of OI.

2) We are not in Kansas anymore. The distribution is extremely discontinuous.

On a hellacious down day in BTC coinciding with premium compression (that positive correlation that Saylor has been monetizing being his undoing is the kind of poetry markets like to write), the telegraphed, reflexive ETF rebalance flows can take MSTU straight to zero, XIV style.

And this can happen on any day.

3) So the next thought in the chain was to consider buying 0DTE puts. Like every morning before brushing my teeth.

This is a non-starter for 2 reasons.

i. 0DTEs are not listed on MSTU

ii. 0DTEs don’t capture the overnight vol so you don’t own “all” the risk. This is especially important in BTC since it’s a 24 hr market.

4) We’ll come back to the question of expiry. I’m just adhering to the sequence of my thinking for better or worse (feel free to debug my mental compiler).

So what strike do I want? The bet only hinges on a Boolean outcome — did MSTR fall 50% or not?

If the thrust of the trade is so starkly binary then the put I want is the lowest strike on the board that you can pay a penny for. I only care about maximum odds. The strike is the payoff, the premium is the outlay. So if I buy the $5.00 put for a penny I get 499-1 odds.

[Since we are thinking in a risk-budgeted binary way rather than in continuous option terms, a parallel framing would be the $5/$0 put spread]

It’s worth noting that this is a bit weird compared to typical investment scenarios. You really only care about the distance of the strike from 0 which determines the payoff and the premium. The price of the stock doesn’t matter because your payoff depends on a certain percent move happening. No matter what nominal price MSTU trades for, if MSTR goes down 50% MSTU gets wiped out.


Let’s start by thinking aloud about constructing a bet and work from concrete to abstract before we bring it back to concrete again.

Buying a 2-week put

Suppose you spend $1,000 2x per month to buy puts that cost $.01.

(Because of the 100x multiplier this translates to 1,000 option contracts)

If you are buying the $2.50 strike, you will get paid $250,000 if MSTU hits zero.

Over the course of the year, following this strategy will cost $24,000 ($1000 twice a month).

Say it hits on the last trial, your net profit is $226,000 (payoff – cumulative outlay). Call it 9-1 odds.

If MSTU has a 10% chance of hitting zero within the year, this is a fair bet. If the probability is higher, you have positive edge, lower you have negative edge.
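The bookkeeping for this bet, as a quick sketch. It takes the worst case from the text (the wipeout arrives on the last of the 24 trials) and ignores commissions; variable names are illustrative.

```python
premium = 0.01          # price per option
multiplier = 100        # shares per contract
budget_per_trade = 1_000
trades_per_year = 24    # twice a month
strike = 2.50

contracts = round(budget_per_trade / (premium * multiplier))
payoff_if_zero = contracts * strike * multiplier
annual_cost = budget_per_trade * trades_per_year
net_profit = payoff_if_zero - annual_cost

odds = net_profit / annual_cost                 # profit per dollar risked
breakeven_probability = annual_cost / payoff_if_zero

print(contracts, payoff_if_zero, annual_cost, net_profit)
print(round(odds, 1), f"{breakeven_probability:.1%}")
```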

This a good place to pause for birthday problem math. It allows us to convert into a useful unit of probability per day.

MSTU trades 251 days a year. If we think it’s 10% to hit 0 one of the days we can compute the probability of it NOT hitting zero on any given day like this:

(1 – p²⁵¹) = 10%

p = 99.958%

Converting to odds:

.99958 / ( 1- .99958) ~ 2382

The odds against MSTU hitting zero on a random day are 2382-1.

If there were 0DTEs and you could buy, say, any strike of $24 or higher for a penny, you would have edge relative to your model probability of “There’s a 10% chance that MSTU hits zero this year”.

We can use the daily probability to compute the chance of MSTU going to zero in 10 business days (roughly what a 2-week option encompasses):

.99958¹⁰ = 99.581% chance of surviving, so the chance of hitting zero is 1 – .99581 = 0.419%, or 238-1 odds against.

If you can buy a 2-week $3 strike put for a penny you’d have edge to this probability.
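The birthday math can be reproduced in a few lines. Like the text, this sketch assumes 251 trading days and independent, identical daily probabilities; the function names are made up for the example.

```python
def daily_survival(annual_prob, trading_days=251):
    """Daily probability of NOT hitting zero, given the annual wipeout probability."""
    return (1 - annual_prob) ** (1 / trading_days)

def odds_against(prob_event):
    """Fair odds (x-to-1) against an event with the given probability."""
    return (1 - prob_event) / prob_event

p = daily_survival(0.10)
one_day_odds = odds_against(1 - p)

ten_day_prob = 1 - p ** 10          # chance of hitting zero within 10 trading days
ten_day_odds = odds_against(ten_day_prob)

print(f"{p:.5%}", round(one_day_odds), f"{ten_day_prob:.3%}", round(ten_day_odds))
```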

We can extend the reasoning above to construct useful tables based on a range of assumptions about the “probability of MSTU going to zero within a year”.

 

Let’s step through an example.

If you believe MSTU has a 20% chance of going to zero in a year, then you need 56-1 odds on a 1-month put for it to be fairly priced (assuming the ETF getting zero’d is the only way to win).

[To compute the payoff ratio: (strike – premium) / premium]

If you could buy the 1-month $.57 strike (yes, a very low strike!) for a penny you would get the 56-1 odds.
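A sketch that ties the required odds to a strike. It takes the payoff ratio as net profit over premium, which reproduces the 499-1 and 56-1 examples, and it treats a 1-month put as 20 trading days, which is my assumption (it happens to land on roughly 56-1).

```python
def required_odds(annual_prob, horizon_days, trading_days=251):
    """Fair odds against a wipeout within the horizon, given the annual probability."""
    survive = (1 - annual_prob) ** (horizon_days / trading_days)
    return survive / (1 - survive)

def payoff_ratio(strike, premium):
    """Net profit per dollar of premium if the ETF goes to zero."""
    return (strike - premium) / premium

# ASSUMPTION: a 1-month option covers about 20 trading days.
odds_1m = required_odds(0.20, horizon_days=20)
print(round(odds_1m, 1))

print(round(payoff_ratio(5.00, 0.01)))   # the $5 put for a penny
print(round(payoff_ratio(0.57, 0.01)))   # the $.57 put for a penny

# Lowest strike that is fair at a given premium: premium x (odds + 1).
fair_strike = 0.01 * (odds_1m + 1)
print(round(fair_strike, 2))
```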

I started with this whole “what option can I pay a penny for” reasoning because my intuition told me that for a trade like this you will want a strategy that trades a very near-dated option for a teeny price because that’s probably where you are going to find the best odds in this framework.

But I should not get too anchored to either this penny idea or the notion that the near dated is absolutely the right way to play this.

At this point, it’s time to look at some data to see if:

1) the prices are ever attractive

2) we can narrow down an expiry range

Market prices

The first thing I did was pull up an option chain for the regular monthly expiry, Jan2025. Good news. While the far OTM put markets are sometimes wide, several strikes are 0 bid, offered at $.05, but critically the last sale is $.01. There’s someone who sells these things for a penny.

[This is not the case for the puts in the less liquid MSTZ double short ETF. The markets are also much wider. What does this tell you?]

I fetched fitted end-of-day put prices for MSTU options from 9/24/24 to 1/6/25, filtering for all puts below 1.5 delta. These are FAR out-of-the-money puts (the puts that correspond to the 98.5 delta call).

  • I computed the payoff ratios for all puts less than 1.5 delta by comparing strike and premiums as explained earlier.
  • The color coding corresponds to expiry buckets in calendar days (ie 1-10, 11-20, etc)
  • I added a penny to all the premiums. So an option fitted to be $0 is marked at $.01

Right away 2 things stand out:

  1. The chart has trash scaling because of point #2
  2. As expected, you are going to get much better payout ratios on near-dated options. If there’s 2 DTE and the stock is $100, buying the $50 put for a penny offers 4999-1 odds.

So the intuition about the near-dated being better bang for the buck seems correct but the scaling is obscuring the picture and there’s another problem (we’re going to get to it but if you feel up to it, try to guess what it is. One hint is it’s not about transaction costs. That’s important and I’ll say a word about it later as well but that’s not the angle here.)

Let’s fix the scaling. Log base 2 compresses this range nicely and is easy to interpret (every tick mark doubles the payoff).

Ahh, much better. Now we see a smooth descent of payout ratios. To be clear, a y-value of 10 is 2¹⁰ or a payoff ratio of 1024-1. An 11 is 2048-1, and so forth.

Here’s the payoff table reproduced in log base 2 terms:

You can see how the puts suggest the market thinks the probability of MSTU disappearing is somewhere in the 10-15% range (but probably less since you can win on the puts without the ETF zero’ing).

The risk/reward on these near-dated puts is much higher than the deferred puts which is expected since we require much higher odds. Remember if we thought that there’s a 10% chance of MSTU zero’ing in a year, a $10 strike put can trade for $1 (ie 9-1 odds) and be fairly priced. But these short-dated options need to offer much better odds to compensate for a much smaller probability window of MSTU going under.

We need to compare the payoff ratios with the probabilities we inferred from the annual probability (the birthday math) for the stock zero’ing in 1 week, 1 month, etc.

But before that we can address the mystery problem.

By comparing the strike & premiums we can identify if an option is cheap or expensive compared to our model probability but we can’t assess the validity at the strategy level. In other words, we can’t answer whether it’s better to spend $24k on long-dated puts or $2k a month on 1-month puts.

To handicap that we need to adjust our payoff ratios by how often we need to trade so that we can now compare all the strategies on the same measuring stick — “if my annual probability of MSTU zero’ing is X, what’s the best approach”.

So we divide the payoffs by the number of times you must trade per year.

[Used some simple rules, ie for 1 dte, we divide by 251, for 30 dte by 12]

We don’t need to use log scaling for the strategy level chart.

The way to read this: if you think the annual probability is greater than, say, 20% (see the horizontal dashed lines), then all opportunities above the line are positive expectancy. There’s a lot more opportunity in the nearer-dated puts, confirming the original intuition, but every now and again a 2-3 month put gets fairly cheap.

The median payout ratio normalized to annual odds is 7-1 or 12.5% implied probability of MSTU offing itself.
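The normalization and the odds-to-probability step can be sketched as follows. It assumes the strategy-level payoff is simply the raw payoff ratio divided by the number of trades per year, as described above; the function names are illustrative.

```python
def annualized_payoff(payoff_ratio: float, trades_per_year: float) -> float:
    """Spread a single-trade payoff ratio over the bets needed to cover a year."""
    return payoff_ratio / trades_per_year


def implied_probability(odds_to_one: float) -> float:
    """Breakeven probability for a bet paying `odds_to_one`-1."""
    return 1.0 / (odds_to_one + 1.0)


print(implied_probability(7))         # the median 7-1 annualized odds
print(implied_probability(9))         # the $10 strike put trading for $1
print(annualized_payoff(4999, 251))   # a 4999-1 daily put, bought every day
```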

 

In summary:

  1. MSTR is highly volatile
  2. If it moves 5-10x its daily standard deviation in one day to the downside, MSTU can go to zero.
  3. Those size moves historically (via Elm’s article) happen about as often as someone rolls a 10 with 2 dice if we say MSTR is 100 vol. If we use its implied vol, which is more forward looking, it’s more like rolling a 7.
  4. The market seems to price the puts in-between those possibilities but we see that the price moves around quite a bit so you can scoop some when they get offered cheap.
  5. The more AUM MSTU gathers the larger the end of day rebalance trade. Something to keep in mind.
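The dice comparison in point 3 is easy to check by brute force. A quick sketch that enumerates all 36 rolls:

```python
from fractions import Fraction
from itertools import product


def two_dice_probability(total: int) -> Fraction:
    """Exact probability that two fair six-sided dice sum to `total`."""
    rolls = list(product(range(1, 7), repeat=2))
    hits = sum(1 for a, b in rolls if a + b == total)
    return Fraction(hits, len(rolls))


for total in (10, 7):
    p = two_dice_probability(total)
    print(total, p, f"{float(p):.1%}")
```

Rolling a 10 is roughly an 8% event and rolling a 7 is roughly 17%, which brackets the 10-15% range the puts were implying.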

Keeping a close eye on this, perhaps building a monitor around this idea is a nice way to grab a convex outcome. Especially one that I suspect has reflexive properties that are conservatively ignored in this independent events “birthday math”.

Endnote on execution

I assumed $.01 slippage on these options. If you pay $.02 for an option that we computed the payoff based on a penny, you’re getting half the odds. So when talking about really long odds and teeny probabilities and option prices, costs matter. Regardless, you have all the knowledge you need to compare max payoff to your own execution prices to bridge this fully to reality.

Path, VIX, & Hit Rates vs Expectancy

The CBOE’s VIX index interpolates 30-day implied volatility based on options struck on the SPX index.

A VIX future settlement price is based on the prevailing VIX index at the future expiry date. It’s a bit of a confusing concept. A future that expires to a VIX index level that looks ahead 30 days.

There are ETFs and ETNs that reference VIX futures (VIXY, VXX). They also come in levered and inverse forms (UVXY, SVXY).

Despite the abstract nature of trading “a level of volatility”, these are popular products. There is 2-way interest in them. SPX returns are inversely correlated to implied volatility making long VIX positions a natural hedge. At the same time, the upward-sloping term structure of SPX implied volatility means implied volatility in the future trades at a premium to volatility today. Many traders will short VIX futures and ETFs to capture the downward drift they expect if the market remains calm as the futures will “roll down” to converge with spot VIX by expiration.

Quant finance geeks out about these volatility term premiums. Term structures recognize that volatility is mean-reverting. Historically, SPX realized volatility bounces around 15% give or take a couple percent over long periods. Implied volatility typically trades at a risk premium. The premium also bounces around but 16% IV on a 15% realized vol (ie 1% premium) is in the right ballpark.

The averages mask the distribution. VIX is bounded by zero. It’s rare for it to get to about half its average. It’s rare to double but less rare than halving from 15%. But it’s even possible to triple or quadruple (Covid and Aug 5th 2024 for recent examples). It’s also more common for VIX to go to 12 than 18, at least in recent years.

This low-res farmers almanac description paints a picture of a lognormally distributed index. VIX futures will drift lower frequently but occasionally spike and sometimes those spikes are very high (and fast).

It’s natural for us to think in terms of averages. This habit persists despite witnessing price moves that would be impossible if normal curves were in charge (and despite the warnings from cranky Lebanese deadlifters). The nasty side effect of Gaussian-brain is when it creates the illusion that something is massively mispriced when prices are just properly reflecting a skewed or fat-tailed distribution.

In the 2 min read, The Benefit Of Betting Culture, you can see how the price of a futures-style bet vs an over-under style focuses your attention on the distinction between probability and expectancy. This is the heart of the matter. Investors confuse hit ratios with expectancy constantly.

I field emails and calls too often that are basically retail traders saying “I was doing great selling options for 6 months then I lost it all in month 7”. The reasons for these mini-blow-ups vary from oversizing because they’ve been winning to naive pricing but the universal mistake is in the epistemology.

Some traders are executing without understanding the nature of the proposition. It’s not that selling options is a mistake (there’s a price for everything). It’s that you shouldn’t be surprised by the shape of the payoff. Roughly speaking, if I sell a 10-delta option every month and I win 6 months in a row, I haven’t learned anything about whether my strategy has an edge. I should expect to win most of the time. That says nothing about the expectancy. The person is thinking in terms of 50/50 averages, ie win or lose. But the proposition if it’s fair is more like win $1 9 times and lose $9 one time. If you have an edge, then you either win more often for the same payouts or the payouts are not as far apart but the hit ratio is the same. But most retail traders don’t have large enough sample sizes to infer anything from such skewed results. The track record is nothing but a statistically underpowered study.
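To make the hit rate vs expectancy gap concrete, here is a small sketch using the fair proposition above (win $1 nine times, lose $9 once). Treating a 10-delta option as a 10% chance of losing is a rough simplification for illustration only.

```python
def expectancy(hit_rate: float, win: float, loss: float) -> float:
    """Average P&L per trade: win `win` with prob `hit_rate`, else lose `loss`."""
    return hit_rate * win - (1.0 - hit_rate) * loss


def streak_probability(hit_rate: float, n: int) -> float:
    """Chance of winning n independent trades in a row."""
    return hit_rate ** n


print(expectancy(0.9, win=1.0, loss=9.0))  # fair bet: roughly zero
print(streak_probability(0.9, 6))          # six straight wins with no edge
```

With zero edge, six wins in a row still happens more than half the time, which is why the streak says nothing about expectancy.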

Unlike rolling dice or flipping coins, it’s hard to learn anything about the distribution of prices from direct experience. Historicals help but you only have to look at acute incidents in markets over the past 5 years alone to appreciate the challenge of calibrating what’s improbable.

But we can strengthen our conceptual understanding to hopefully be a little less blind to hit rate vs expectancy (or median vs mean) illusions. Option surfaces themselves are great teachers in this regard. In a deeper understanding of vertical spreads, we’ve seen how call and put spreads are a rich source of information about a distribution.

In the remainder of this discussion, we’ll get some mileage towards internalizing the difference between hit rate and expectancy from a non-technical discussion about the price of a VIX future.


Pricing a VIX future (via arbitrage)

If you are a professional trader who just heard me say “price a VIX future” and “non-technical” in the same sentence, you feel like you’re at a Houdini act… “How’s this mf gonna pull this off?”

[cracks knuckles, bends neck side-to-side, deep breath]

Ok, a little background for the uninitiated.

The VIX complex of futures, ETFs, vanilla options and VIX options is one of the more technical areas of options trading. There are arbitrage triangles between these things.

They’re not exactly clean though.

Replicating a variance swap also isn’t clean (not every strike exists and even for the ones that do transacting entire strips is not economical). Neither is dispersion. Neither is isolating forward vol.

But all of these things lend themselves to a fair value that can be F9’d in Excel if you ingest the real-time bid/asks for the building blocks. Every large vol desk has a group that computes a fair value for VIX futures that is derived from SPX options and VIX options. You can trade around that fair value by being better bid on the building blocks that are relatively cheap and vice versa. Manage the residual risks and over time you make money.

I’ve never worked out a model for VIX futures fair value myself as I’ve never traded the SPX complex. But we can still step through it conceptually.

Imagine you are short 100,000 shares of VXX at $16.

*VXX references VIX futures. Just to avoid computing position ratios let’s just pretend VXX and the VIX futures trade for the same price.

You are short 100,000 vega because your position vega by definition is “change in p/l per 1 point change in volatility”.

If vol (ie the VIX future) drops by 1 point, you will make $100,000.

Arbitrage pricing comes from replication. If I can construct a portfolio with a cash flow of 0 in every state of the world, then I have a risk free position (and if I get paid to hold that portfolio I have an arbitrage profit).

To offset my VIX futures risk, I must therefore buy 100,000 vega via SPX options.

[This is conceptual, so we are hand-waving important details like what strikes, expiries, weights and managing the deltas.]

At this point we are vega-neutral. Long SPX options, short VXX.

What happens if vol suddenly doubles?

You’re going to make a lot of money.

Why?

Because you lose linearly on your VXX short (-$1.6mm or 16 vol points on 100k shares) but you win more on your SPX option longs.

The reason: you are long not just ATM options but OTM options too. OTM options pick up more vega as vol increases. It’s like being long “vol gamma” (it’s literally called volga). Remember how a long gamma position gets longer delta as a stock goes up and shorter delta as a stock goes down. Well, this is the same effect but for vega via vol.
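The effect shows up in plain Black-Scholes vega. The sketch below uses a hypothetical $100 underlying, 30 days to expiry and zero rates (none of these numbers come from the example above) and compares ATM and OTM vega at 16 vol and 32 vol.

```python
import math


def bs_vega(spot: float, strike: float, vol: float, t_years: float,
            rate: float = 0.0) -> float:
    """Black-Scholes vega per 1 vol point (European option, no dividends)."""
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t_years) / (vol * sqrt_t)
    pdf_d1 = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return spot * pdf_d1 * sqrt_t / 100.0


t = 30 / 365  # hypothetical expiry
for name, strike in [("ATM 100 strike", 100.0), ("OTM 120 strike", 120.0)]:
    low, high = bs_vega(100.0, strike, 0.16, t), bs_vega(100.0, strike, 0.32, t)
    print(f"{name}: vega {low:.5f} at 16 vol, {high:.5f} at 32 vol")
```

ATM vega barely moves while the OTM vega grows by orders of magnitude, so a book that started vega-neutral against a linear short ends up long vega as vol rises.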

💡See Finding Vol Convexity for a full explanation.

The fact that you make money because you are long “vol of vol” means you aren’t quite replicating the VIX future though. That’s a problem.

[There’s a cost to being long “vol of vol”, so we can deduce that if vol never changed and expiration arrived, this so-called hedged position would have lost. There has to be a flip-side to the fact that the portfolio wins if vol makes a large move.]

Conveniently there is an instrument that’s a pure expression of “vol of vol”. You guessed it — VIX options.

The conceptual algebra:

VIX future = SPX options – VIX options

In our example, you can short VIX future, buy SPX options, and then sell VIX options to neutralize this long volga exposure.

This identity is loaded with insight.

  • If I’m long VIX futures and short SPX options, I’m synthetically shorting “vol of vol”.
  • If I’m short VIX futures and long SPX options, I’m synthetically long “vol of vol”.
  • If I’m short VIX futures and long VIX options, I’m short vol but long vol of vol, which is similar to being short an SPX straddle but long strangles.

You can envision how looking at the VIX complex you can see which leg stands out as cheap or expensive relative to the others. Layer in implied correlation which relates index vols to single stock vols and suddenly you’re Neo in the matrix.

A day in the life of a vol arb desk is market-making all the flows with an axe. Based on the price of the various parameters like vol, skew, convexity, term structure and correlation you might be:

  • Selling VXX
  • Buying 1-month VIX calls
  • Selling 1-month single stock OTM calls
  • Buying weekly SPX calls
  • Selling SPX 6-month straddles
  • Buying 9-month single stock downside puts

Like a chess player chunking their position, you look at this and think:

“I’m short SPX call spreads and vol near-dated, long upside implied correlation near-dated, long a 6-month/9-month time spread with a dispersion kicker.

I’m long gamma, short vega, long tons of volga, paying theta”

[Note: the greeks will vary based on the ratio of position sizes. If you’re playing along at home you can try to map the positions to the first line of the summary. And for the greeks you can try to imagine what position sizes are required to make the sign of the greeks make sense]

You do this not because these positions are inherently right or line up with some macro view. You do this because the prices are “right”.

You take what the market gives you. Everyone who’s out there trading on their opinions is moving the price of these parameters around. You are agnostic. Pick up the edge, manage the risk.

All you care about is others having strong enough opinions to move prices around and that you can find contradictions in the matrix.

To quote the closing line in Pacino’s speech in Any Given Sunday:

“That’s football folks. That’s all it is.”

Pricing a VIX future (like an option)

The fact that a VIX instrument has a fair value in a similar manner to how an ETF has a NAV has always kept me away from it. Just like I wouldn’t trade an ETF if I didn’t know its premium/discount. If a box has a dozen donuts I don’t want to buy it for a price that implies a baker’s dozen. Negative edge.

That said, lots of people trade VIX products with a belief that they have an edge based on a relative value lens rather than an arbitrage framework. I’m guessing this leads them to selling VIX futures (which is probably the right side from the arbitrage perspective as well.)

[I’ve often thought that if I were to build a VIX or SPX suite in moontower.ai I’d want to “do it right” which is to use the arbitrage lens rather than extending the in-place moontower analytics to VIX as if it applied. I’ll leave it to you to decide if other platforms do it right or if they’re like children playing house pretending to be grown ups. By the way I have similar opinions about 0DTE. I’d use a totally different framework than the one we currently use in moontower.ai to deploy a 0DTE suite.]

Since a proper VIX complex treatment is prohibitively scarce for retail, it’s additive to think about another way to price VIX. I think it’s intuitive to consider VIX itself an option.

(Again, we’ll stay conceptual. Working out the details is out of scope for this post).

I got the idea for “vix as an option” while answering a reader who emailed me. I’ll share my response so as to not expose the question explicitly.

I wrote (this is edited and expanded):

How do you model option prices and even the underlying price itself if it’s a future that is trading for $1 that will likely expire at 0 but can surge to $10 sometime before that? It’s basically a bubble pricing problem because all known bubbles start to have that distribution and even tech stocks themselves in 1999 were pure extrinsic values themselves. It’s also the distribution that governs the H/J nat gas futures spread.

First let’s discuss pricing an option on this asset. Like what’s a reasonable vol for the $10 call?

I understand your temptation to think strike vol is “what IV will be when it gets there” but this is like saying life expectancy is 85 if you survive the first year of life. The option needs to balance the price of many states of the world, not just the conditional case. In other words, it’s more like what is your life expectancy at conception.

[For the technical, non-metaphor version see the “local vol” discussion in Chapter 7: Skew Trading of Colin Bennett’s Trading Volatility. The book is a free pdf.]

Another approach might be bootstrapping a discrete model. For example, you could use the price of vertical spreads to compute the implied distribution. Then you can use those probabilities as your p and then fit various levels of vol to the call options in various states to see which vols are reasonable. I’d guess you’d end up with something that made the market look pretty rational. Like that call option might have a 10% chance of being 200 vol contributing 20 vol points to its IV and the remaining vol points are some sumproduct of the non spike scenarios.

One thing that’s a bit hairy is that implied probabilities are “terminal” probabilities.

It’s easy to understand the distinction when you think of VIX. You have a 9m future that’s trading 18 but will probably expire at 12 or 13. But if I told you it’s 75% to touch 30 during its life how does that affect your intuition of value?
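A toy version of that sumproduct, to show the shape of the calculation. Only the 10% chance of 200 vol comes from the text; the other scenarios are invented for illustration, and linearly weighting vols is a simplification rather than a real pricing model.

```python
def scenario_weighted_vol(scenarios: list[tuple[float, float]]) -> float:
    """Probability-weighted vol across (probability, vol_points) scenarios."""
    total_prob = sum(p for p, _ in scenarios)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * vol for p, vol in scenarios)


scenarios = [
    (0.10, 200.0),  # spike state from the text: contributes 20 vol points
    (0.60, 60.0),   # hypothetical base case
    (0.30, 40.0),   # hypothetical quiet case
]
print(scenario_weighted_vol(scenarios))
```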

If you use VIX call spreads to assess the probability you miss this because they will assign very little probability to touching 30.

Instead you can use the deltas (delta x 2 is a useful guess for a one touch probability). The one-touch probability is much higher because it respects path.

 

That was the end of my response. But a skeleton for pricing VIX as an option is there.

Think of the lognormal distribution (bounded by zero, positive skew, fatter right tail). As you increase volatility, the distribution “squishes” to the left.

wikipedia commons

A look at March 2025 VIX Implied Distribution from the futures options

Here’s a condensed view of the Mar2025 options chain using mid-market for calls and puts.

Things to note:

The extreme IV skew

  • The 20% OTM call (~21 strike) is 3x the price of the 20% OTM put (~14 strike)
  • The delta of that 21 call is 2x the delta of the 14 put

The distribution

Remember:

the price of a put [call] spread divided by the distance between the strikes estimates the probability that the underlying expires below [above] the midpoint of the strikes.

By looking at the spread of adjacent spreads (ie the butterfly) we estimate the probability density at the midpoint. If we do this across strike we have the implied PDF.

[It’s a bit noisy because of market widths and strike distances not being uniform but I normalized in a reasonable way for these artifacts]

Even though the future is 17.35, the put spreads are expensive and the call spreads are very cheap telling us that March VIX is most likely to expire between 12 and 14.

If you are betting on roll down, it’s already priced in. The 16/14 put spread is $1.03 but the most it can be worth is $2. So despite the fact that the future is 17.35, you get slightly less than even money on the future expiring below 15. In other words, the future falling 14% is already baked in as the median outcome.
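The spread-to-probability rule can be sketched directly. The put spread number is the 16/14 example above; the three put prices fed to the butterfly are hypothetical, chosen only to be consistent with a $1.03 spread. Discounting is ignored.

```python
def spread_implied_probability(spread_price: float, lower_strike: float,
                               upper_strike: float) -> float:
    """Vertical spread price / strike width: probability of expiring beyond
    the midpoint of the strikes (below it for puts, above it for calls)."""
    return spread_price / (upper_strike - lower_strike)


def butterfly_density(price_low: float, price_mid: float, price_high: float,
                      strike_step: float) -> float:
    """Spread of adjacent spreads (the butterfly), scaled by the strike step
    squared, as an estimate of probability density at the middle strike."""
    return (price_low - 2.0 * price_mid + price_high) / strike_step ** 2


print(spread_implied_probability(1.03, 14.0, 16.0))  # chance of expiring below 15

# Hypothetical 14/15/16 put prices (16 put minus 14 put = 1.03)
print(butterfly_density(0.50, 0.95, 1.53, strike_step=1.0))
```

Running the butterfly across every strike in the chain is what produces the implied PDF, subject to the noise from market widths mentioned above.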

Implied distributions like this tell you the market expects the price to fall but it must still balance the chance that in the meantime it can double, triple, or more. It’s the kind of distribution you expect in bubble names where “the market can remain irrational longer than you can remain solvent” but everyone knows the asset is eventually going to be much lower.

Next time VIX spikes watch what happens. The VIX vols will pop, but the put skew will get smashed. The net effect is the put spreads get very expensive because VIX looks like a rubberband to investors…the higher it rips the more distance it has to snap back down which it eventually does. On a VIX rip everyone wants to buy put spreads to have a risk-contained way to capture that reversion, but the surface is too smart for that. You might end up with a VIX future at 25 and all the spreads say… “meh, it’s going back to 15”. The contrarian bet would be to bet that it’s NOT going back home. The options market will give you that bet all day, for good reason. But the trade you want to do is priced like a Chiefs point spread. Sure you’ll probably win, but the risk/reward is a priori not an “excess return”. It’s consensus so you’re flipping coins for fair.

This is how the SLV surface repriced in 2021 when the WSB apes tried to squeeze it higher GME style. I was a very active silver options trader then and just found myself frustrated about how smart the surface was in adjusting. Speaking of GME, this type of extreme lognormal distribution took hold when Kitty roared. The cheap call spreads beg you to buy them because nobody thinks GME is actually going to expire higher even though it might touch a high price. That it will touch a price is baked into the expensive outright calls and their deltas.

Look at the VIX chain again. The 30 call has a .24 delta. This implies that there’s a 48% chance that VIX will touch 30 at some point before expiry. With such a framework, you can start to see how the VIX future option vols and therefore deltas inform what the price of VIX futures should be. You might draw opinions about a VIX call being expensive relative to VIX itself (notice this is exactly equivalent to saying the implied volatility is expensive).
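The delta-times-two heuristic is trivial to code. It is a rule of thumb (it leans on delta as a stand-in for the probability of finishing in the money, plus a reflection-style argument), not an exact one-touch price.

```python
def one_touch_estimate(delta: float) -> float:
    """Rough probability of touching the strike before expiry: 2 x |delta|,
    capped at 100%."""
    return min(1.0, 2.0 * abs(delta))


print(one_touch_estimate(0.24))  # the 30 call from the chain above
```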

To be honest, this is all dancing around the fact that just pricing the VIX futures, SPX option, VIX option triangle is the final boss. But the point of this is to give you exercise in noticing that the fact that a VIX future is probably going from 18 to 13 doesn’t mean selling it is necessarily edge. There can be a better leg out there but focusing on “what will probably happen” is a form of probability myopia that distracts from expectancy thinking.

[The difference between positive expectancy and probability is the fertile soil of investing charlatanism. If you were to start a scam strategy from scratch you’d start with trades that have a high hit rate and just hope you collect enough profits before you see the whole distribution. Ideally with someone else’s money.]

We’ll leave it there.


Related reading:

What Equity Option Traders Can Learn From Commodity Options

Bubbles: Knowing You’re In One Is Not Even Half The Battle

Guest Post: Market Impact and Strategic Execution

Quant @imotw2 published Market Impact and Strategic Execution as an article on X.

With permission, I’m cross-posting it here. It turns execution intuition into quantitative market models. I’ll let the quants debate the models. My interest is in how well this post explains the actual dynamics of accessing liquidity, minimizing slippage, and leaking information with trade/bid/offer behavior.

Enjoy! 


The dirty secret of market impact? It’s messier than anyone admits. This article challenges conventional models by looking at how information actually flows through markets. Here’s the real problem: impact hits differently when you’re getting in versus getting out of positions – a fundamental asymmetry that most models miss entirely. Add in the fact that every major player’s algos are scrapping for the same liquidity, and traditional execution models fall apart.

By diving into the weeds of price formation and what drives liquidity providers, we crack open why impact behaves this way. The result? A framework that bridges pure theory and trading reality, giving you both the mathematical firepower and street-smart tools to optimize execution in markets.

1. Introduction

It’s the ultimate catch-22: you’re playing both sides of the impact game whether you like it or not. Every time you trade, you’re leaving footprints in the market while simultaneously trying to read everyone else’s tracks. Consider a large institutional trader executing a significant position. Each clip they trade isn’t just soaking up liquidity – it’s sending smoke signals to every shop watching the tape. The market sniffs out these patterns and adapts, forcing the trader to navigate the mess they’ve created for themselves. Talk about trading against your own shadow.

Impact also isn’t just about what you’re doing now – it’s about what the market thinks you’ll do next. Planning to unwind a monster position? Just prepping for it changes how you trade today.

You’re not the only one playing this game. Every decent-sized player is running their own book, trying to figure out your next move while hiding their own. Your optimal execution strategy? It depends on guessing their guesses about your guesses. Welcome to the hall of mirrors that is modern market microstructure.

2. Foundations of Impact

2.1 Beyond the Square Root Law

Everyone knows the square root law of impact – trade twice the size, eat about 1.4 times the slippage. But that’s not the whole story. Actually, it’s not even close. Markets aren’t this clean. Impact comes from a brawl between volatility feeding on itself, how fast the market digests your trades, and dealers trying to figure out if you know something they don’t. This changes how we need to think about moving size in markets.

At the heart of our framework lies the recognition that trading activity both responds to and generates volatility. Consider a large institutional trader executing a significant position. Their trading consumes liquidity, reveals information, and increases local volatility. That same volatility changes how the market reacts to their next trade. It’s like throwing stones in a pond where each splash affects how the next one ripples. Classic impact models completely miss this self-reinforcing cycle.
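As a baseline before the framework departs from it, the square root scaling itself is one line. This sketch only encodes the proportionality (impact scales with the square root of size) and leaves out any volatility or liquidity coefficients.

```python
import math


def relative_impact(size_multiple: float) -> float:
    """Slippage multiple under the square root law when trade size is
    scaled by `size_multiple`."""
    return math.sqrt(size_multiple)


for multiple in (2, 4, 10):
    print(f"{multiple}x size -> {relative_impact(multiple):.2f}x slippage")
```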

Sure, cramming all this into one model is like trying to catch lightning in a bottle. But we’ve built something that tries to capture what we see in live markets.


This model implements the volatility feedback term:

But we have to acknowledge some limitations. Trading isn’t as clean as the math suggests. That linear relationship between impact and vol feedback? Markets are messier than that. And β and γ(t)? They jump around – and good luck predicting when they’ll shift. And the volume-volatility interaction f(v)? Probably more complicated than we’re letting on. Plus, markets don’t slide smoothly between states – they snap.

So why bother with this framework? Because it gives us something concrete to work with. It nails the basic idea that impact and vol are stuck in this toxic relationship, and it actually works well in normal markets – which is most of the time. Run this during the 70-80% of trading days when markets behave, and you’ll get solid signals.

No model’s perfect. This one’s useful not because it nails every market move, but because it gives us a solid framework for thinking about how our trading pushes markets around and how markets push back. Use it to shape your execution strategy, but keep your eyes open – markets have a way of humbling anyone who thinks they’ve got them figured out.

2.2 Ready, Set, Go!

Impact hits the market in three waves, each in its own way. Even in those first few milliseconds, when you’d think simple orderbook mechanics would rule everything, volatility’s already messing with you. Start working a big order, and the vol spike from your first fills immediately changes how the market handles your next batch.

Then things get interesting. Market makers aren’t just looking at your flow anymore – they’re betting on where vol’s heading because of it. You need to get your trade done, but you’re walking on eggshells trying not to kick off a vol spiral that’ll jack up your costs.

Here’s how it plays out in real time: You start hitting bids with size. Not only does price take a hit, but vol spikes. The market makers see this and think, ‘Great, vol’s about to rip – better back off these quotes.’ Now you’re stuck paying up even more for your next fills. Classic feedback loop that can turn a tough trade into a nightmare.

Long-term is where it gets really twisted. Trades that set off these vol feedback bombs? They tend to leave permanent marks on price. Why? Because the market’s reading these vol spikes as smoke signals – where there’s smoke, there must be fire. Throws the whole temporary versus permanent impact debate out the window. Instead, think of it like this: the bigger the vol feedback, the more likely the market thinks you know something they don’t. Price discovery through chaos, basically.

Our framework also reveals the strategic behavior of market makers and other liquidity providers. Their quote adjustment process now explicitly incorporates volatility feedback through a response function:

That λ(σ,t) term? That’s market makers getting twitchy when vol spikes. They’re playing a delicate game – dodge getting picked off versus making bank on those juicy vol-driven dislocations.

This flips everything we thought we knew about smart execution. You might actually want to take a bigger hit upfront if it keeps vol from spinning out of control. Sometimes you’ve got to pay the toll to avoid the avalanche.

We tested it briefly across equities, futures, and crypto. The model crushes it, especially during those nasty regime shifts when traditional models fall flat on their face. It nails the vol feedback effects, giving you a much better read on your true execution costs and how to optimize around them.

Bottom line? Impact isn’t just some tax you pay to trade. Every move you make reshapes the book for your next play. Once you start thinking about impact this way, whole new strategies open up. You’re not just minimizing cost anymore – you’re actively managing how your trades shape the market you’re trading in.

3. Information Theory of Execution

3.1 Order Flow Toxicity and Adverse Selection

The elephant in the room: toxic flow. Your algo’s (seemingly) crushing it, getting fills left and right – but is that actually good news? Could be perfect timing, or could be you’re getting picked off on stale quotes. Classic execution dilemma: go aggressive and risk getting fleeced, or sit back and watch alpha decay while everyone figures out what you’re up to.

We’re looking at how trades cluster in time and how order flow ripples across related products. Because let’s face it, when someone’s running a smart toxic arb, they’re hitting everything in the complex.

Here’s the math behind the magic – for a given order flow sequence, we compute the toxicity score T as:

Where V_i(t) represents the normalized volume in bucket i, C(t,i) captures the temporal correlation structure, S(t,i) measures cross-sectional spread dynamics and ω are weights that reflect market conditions.

This metric provides a measure of order flow toxicity that can be used to adjust execution algorithms’ aggression levels and venue selection strategies.

3.2 Learning and Adaptation

Modern execution algos are constantly learning – but there’s a catch. How do you know when the market’s actually changed versus when it’s just noise? Every trader’s been burned by overreacting to a head fake.

We tackled this by measuring what signals actually matter for execution. Forget trying to predict where prices are headed or cooking up the perfect schedule. Instead, we built a framework that attempts to cut through the noise to find what really drives execution quality.

Our framework splits market signals into three components:

  1. Structural information about liquidity conditions and market dynamics
  2. Temporary effects from specific order flow patterns
  3. Noise terms that should not influence execution decisions

For each signal, we measure how much juice (information ratio) it’s got like so:

where I(S; O) represents the mutual information between the signal and execution outcomes, and H(S) is the entropy of the signal. This tells us how much actual information we’re getting about execution outcomes versus how noisy the signal is. Translation? We can spot real market changes worth adapting to and ignore the head fakes that would otherwise trip us up.
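A minimal sketch of how such a measure could be estimated from data. It assumes the ratio form I(S; O) / H(S), which is one natural reading of the definitions above, and it assumes signals and outcomes have already been bucketed into discrete labels; both are assumptions of this example rather than details confirmed by the article.

```python
from collections import Counter
from math import log2


def entropy(xs: list) -> float:
    """Shannon entropy (bits) of a sequence of discrete labels."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs: list, ys: list) -> float:
    """Mutual information (bits) between two paired discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())


def information_ratio(signal: list, outcome: list) -> float:
    """Assumed form: I(S; O) / H(S). Returns 0 for a constant signal."""
    h = entropy(signal)
    return mutual_information(signal, outcome) / h if h > 0 else 0.0


# Toy data: a signal that fully determines the outcome vs one unrelated to it
print(information_ratio([0, 1, 0, 1], [0, 1, 0, 1]))
print(information_ratio([0, 0, 1, 1], [0, 1, 0, 1]))
```

A ratio near 1 means almost all of the signal's variation tells you something about execution outcomes; a ratio near 0 marks it as a head fake.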

3.3 High-Frequency Market Making and Latency Arbitrage

Evolving market structure has introduced a new layer in the interaction between execution algorithms and high-frequency traders. Your execution algo sees a fat stack of liquidity on the book – but good luck actually hitting it. Modern markets aren’t your grandfather’s NYSE floor where a handshake meant something. That liquidity? In a way, it both exists and doesn’t exist until you try to trade it.

Imagine this: you spot 100,000 shares posted at the offer. Traditional models would tell you to size your child orders assuming that liquidity is actually there. Rookie mistake. Those quotes vanish faster than you can even start thinking about your next move. Market makers are running smart playbooks, adjusting their quotes every microsecond based on everything from correlated ETF moves to the temperature in New Jersey. Yeah, seriously.

What you see isn’t what you get. The odds of filling against posted liquidity aren’t just about time priority. This isn’t even academic hair-splitting – it changes how we need to think about modeling impact and designing execution strategies.

The displayed liquidity in the order book represents a conditional commitment that is fundamentally probabilistic in nature. This probability structure manifests through several key mechanisms:

Firstly, quotes don’t just sit there waiting to be hit. They dance. And not randomly – they move in patterns. When a big order hits, market makers don’t just pull their quotes at the affected price – they scatter like cockroaches under a kitchen light, repricing their entire book before you can blink. Why? Because they’re not idiots – they know one big trade usually means more are coming.

Secondly, what looks like a simple limit order is actually a war zone. That 100k share quote? It’s really multiple market makers playing chicken with each other, each one’s algo trying to figure out if the others know something they don’t. Meanwhile, every other trader with a similar strategy is trying to get to the same party.

Finally, there’s the speed game. By the time your order hits the exchange, that juicy quote you saw might be ancient history. We’re talking microseconds here, but that’s an eternity. Between your network latency, the exchange’s matching engine having its morning coffee, and Jump Crypto cooking you with their radio towers – well, good luck. Even if you’re fast, queue position is usually everything. Being second in line might as well be last when the music stops. The order book isn’t a menu, and if you’re modeling it that way, you’re bringing a knife to a gunfight.

We find that the likelihood of a quote remaining available for execution follows a more complex distribution than previously recognized:

Where λ(t,s) represents the base cancellation intensity that varies with both time and market state, τ(v,σ) captures the reaction time that varies with volume and volatility, and f(OFI, IMB) adjusts for order flow imbalance and book pressure. This creates an effective liquidity profile that differs markedly from the observed order book, forcing execution algorithms to operate in a probabilistic framework.

In English? The faster you need to trade, the bigger your order, or the more the market’s moving – the more likely that liquidity is gonna vanish before you get there. And if the order flow’s getting lopsided or the book’s getting thin? Forget about it.
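A toy version of that intuition, as a sketch only. It assumes an exponential survival form, exp(-λ·τ·f), which is my assumption for illustration and not necessarily the functional form used here; the function names and every number are hypothetical.

```python
import math

def quote_survival_prob(cancel_intensity, reaction_time, flow_adjustment=1.0):
    """Probability a displayed quote is still there when you arrive.

    ASSUMED form: exp(-lambda * tau * f), where
      cancel_intensity (lambda): cancellations per second,
      reaction_time (tau): seconds it takes you to reach the quote,
      flow_adjustment (f): multiplier above 1 when flow is lopsided or the book is thin.
    """
    return math.exp(-cancel_intensity * reaction_time * flow_adjustment)

def effective_liquidity(displayed_size, cancel_intensity, reaction_time, flow_adjustment=1.0):
    """Displayed size scaled by the chance it survives until you can hit it."""
    return displayed_size * quote_survival_prob(cancel_intensity, reaction_time, flow_adjustment)

# Hypothetical inputs: 100,000 shares displayed, 1,000 cancels per second, 200 microseconds to react.
calm = effective_liquidity(100_000, cancel_intensity=1_000, reaction_time=0.0002)
lopsided = effective_liquidity(100_000, cancel_intensity=1_000, reaction_time=0.0002, flow_adjustment=2.0)
print(round(calm), round(lopsided))
```

Whatever the true functional form, the direction is the same: slower reaction, higher cancellation intensity, or more lopsided flow all shrink the size you should actually plan around.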

The implications of this extend beyond simple quote availability. Market makers aren’t just watching your stock. They’re watching everything. Hit them too hard in SPY? Watch their quotes disappear in IWM before you can even think about going there. These guys are hunting for any sign that you’re about to unload size.

The presence of latency arbitrage opportunities introduces systematic biases in observed market impact that must be considered. Every microsecond of delay between exchanges is a gold mine for the right setup. While you’re still seeing stale quotes in Jersey City, someone’s already cleaned up in Mahwah. The really fun part? This isn’t just noise – it creates predictable patterns in how markets react and impact spreads. Those theoretical arb profits (α_cross)? They’re like a fingerprint showing exactly how quotes are gonna move and where the impact’s gonna hit first.

Market manipulation remains a massive concern in markets, especially in crypto where it’s not just an edge case – it’s a daily reality you need to deal with. Quote stuffing, layering, and other manipulation schemes regularly distort impact measurements. By tracking specific order book patterns – quote stuffing intensity, layer depth stability, and price oscillations – we can build real-time manipulation indicators. This lets execution algorithms adapt dynamically, threading the needle between adverse selection and efficient trading paths.

Speed management is make-or-break in this environment. Every algo faces a core tradeoff: execute faster to dodge adverse selection, or slower to minimize information leakage. It’s an optimization problem where the sweet spot shifts constantly with market conditions, participant behavior, and execution objectives.

The data shows static speed strategies consistently underperform. Effective execution requires continuous adaptation based on market toxicity metrics, participant patterns, and market state. But nailing this in practice? That’s where things get interesting.

Implementation demands extreme precision at the infrastructure level. Microsecond-level clock sync isn’t a nice-to-have – it’s essential. One timing slip and your execution quality falls off a cliff. Network stacks need to handle the market data and order flow and how they interact, seamlessly. Risk controls need to maintain deterministic latency while actually protecting you. Memory management has to deliver rock-solid performance at the microsecond level.

4. Advanced Impact Modeling

4.1 Cross-Asset Impact Propagation

Trading ETFs or index futures? Your impact bleeds everywhere, and not in the neat way most risk models predict. Markets are connected through a nasty web that shifts based on regime and time. Worse still, these relationships aren’t symmetric – the way impact flows from futures to stocks isn’t the same as stocks to futures, and good luck if volatility is spiking.

Here’s how we try to model this mess:

Where:

  • K_ij(t) represents the kernel function capturing lead-lag relationships (how moves in one market spill into another over time)
  • α_i(t) models time-varying direct impact sensitivity (how much bang for your buck you get in each market)
  • β_ij(S(t)) are state-dependent coupling coefficients (how strongly markets are linked, which changes with market conditions)
  • S(t) encodes market state including volatility regimes, basis levels, and order book conditions
  • φ(L_i, L_j, t) captures non-linear liquidity interaction effects (how liquidity dynamics in one market screw with impact in another)

The coupling coefficients β_ij(S(t)) follow a regime-switching process:

Where R(t) indicates the market regime based on a multivariate state classification incorporating relative volatility levels, basis spread dynamics, order book imbalances, trading volume ratios and market stress indicators.

The liquidity interaction term φ(L_i, L_j, t) models how the availability and cost of liquidity in one market affects impact propagation in related markets, which essentially ties it all together:

This captures several features absent from traditional correlation-based approaches:

  1. Temporal lead-lag relationships through the integral terms
  2. State-dependent coupling strength
  3. Non-linear feedback effects
  4. Regime-switching behavior
  5. Cross-market liquidity interactions

If you’re trading across multiple markets, you need a model that deals with reality, not just correlation matrices and linear spillovers. This framework gives you the tools to handle the mess.

4.2 Queue Position Game Theory

The management of queue position in modern markets creates a game-theoretic problem. When placing passive orders, traders have to consider their position within the queue and the information content of their queue placement decisions. Every time you join a queue, you’re not just picking a spot in line, you’re sending signals to every other algo watching the tape. And trust me, they’re all watching.

Think your queue position is worth what your transaction cost model says? Think again. That premium spot at the top of the book might be fool’s gold when everyone’s running the same playbook. Or it could be pure alpha when the market’s choppy and other players are gun-shy.

You’ve got to price in the option value of that queue spot, factor in what your queue placement tells the market about your book, and figure out what everyone else’s queue dancing means.

5. Implementation

5.1 Engineering in High-Frequency

Developing execution algorithms where latency matters creates significant engineering challenges that, if you get them wrong, will crush trading performance. First up: timestamp management. Sounds boring, right? It is, but it could be the difference between making money and getting run over.

Here’s the thing about timestamps in modern markets – they’re a mess. That pristine timestamp you think you have? It’s about as accurate as a sundial here in Norway. When you’re building impact-aware algos, you need to know exactly when things happen, but that’s harder than it looks. Here’s what you’re really dealing with:

Where:

  • Δ_network_send represents outbound network latency including TCP/IP stack and route-specific jitter. Hope you like randomness.
  • Δ_exchange captures exchange gateway processing time and order validation delays. Different by exchange, time of day, etc.
  • Δ_matching accounts for matching engine queueing and processing time. Where your order sits in the queue. Spoiler: not first.
  • Δ_network_receive represents inbound network latency and potential packet loss recovery (So fun, said no one ever).
  • Δ_market_data includes feed handler latency and order book reconstruction time
  • Σ ε_i represents a composite error term incorporating:
    • Systematic variations in processing times
    • Random jitter components because why not
    • Regime-dependent uncertainty factors
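As a rough sketch of how those pieces stack up, assuming the components simply add to the true event time (the additive form, the class name, and all the microsecond figures below are my assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class LatencyBudget:
    """Latency components in microseconds, mirroring the terms listed above."""
    network_send_us: float
    exchange_us: float
    matching_us: float
    network_receive_us: float
    market_data_us: float
    error_terms_us: tuple = ()  # the epsilon_i terms: systematic drift, jitter, regime effects

    def total_us(self):
        """Sum of every delay plus the composite error terms."""
        return (
            self.network_send_us
            + self.exchange_us
            + self.matching_us
            + self.network_receive_us
            + self.market_data_us
            + sum(self.error_terms_us)
        )

def observed_timestamp_us(true_event_us, budget):
    """ASSUMED additive model: observed time = true time + all delays."""
    return true_event_us + budget.total_us()

# Hypothetical numbers for one order round trip.
budget = LatencyBudget(
    network_send_us=35.0,
    exchange_us=20.0,
    matching_us=50.0,
    network_receive_us=35.0,
    market_data_us=15.0,
    error_terms_us=(2.0, -1.5, 4.0),
)
print(budget.total_us())
print(observed_timestamp_us(1_000_000.0, budget))
```

Keeping the error terms separate from the named delays makes it easier to see how much of the gap between "when it happened" and "when you saw it" is explainable versus plain uncertainty.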

This model creates option-like properties in order management. For example, the uncertainty in matching engine response times during high-volatility periods effectively grants a free option to other market participants, who can react to price movements before our timestamp uncertainty is resolved.

5.2 Risk Management Under Impact Constraints

Here’s where VaR gets interesting – and by interesting, I mean breaks down completely. You can’t just run standard VaR when your exit price depends on how fast you’re trying to get out. It’s a nasty circular problem: your risk depends on your liquidation strategy, but your strategy depends on your risk limits. Fun times…

We develop a modified risk framework that tries to account for these dependencies:

where E[I(Q, t) | Q, M] represents the expected impact given position Q and market conditions M. This allows for more realistic risk assessment and better integration of risk constraints into execution algorithms. Now your risk numbers might actually mean something.

6. So What? And Where Next?

We’ve built something that tries to bridge the gap between ivory tower theory and reality. By thinking about execution through the lens of information theory, we’re getting at how markets work.

Here’s the big takeaway, TL;DR if you will: stop thinking about impact as just a cost to minimize. It’s actually telling you something about how the market reads your flow and how other players are positioning against you. Once you get that, everything about execution strategy starts looking different.

It’s about making money and, more importantly, not losing it when things get ugly.

Tax-Loss Harvesting On Levered Long/Short

Real estate people understand the value of accounting losses in service of deferring taxes while an asset’s returns compound.

In the institutional investing world many investors such as endowments are tax-exempt.

Retail investors in public stocks have fewer places to hide outside of tax-advantaged accounts, which are hard to jam lots of assets into in the first place.

The rise of ETFs has come with some relief on the tax side as you decide when to pay taxes because you decide when to sell even as the holdings are rebalanced. Mutual funds can leave you footing a prorated portion of the pool’s taxes regardless of how long you’ve been an investor.

While the ETF advantage is real it’s relatively minor compared to the ability to tax-loss harvest. By owning the individual components of a stock index you can sell losers, rebalance into peer stocks, and accumulate short-term losses to offset long-term capital gains on the subset of names that moon.

I say minor because of the “brain damage” (more effort, slippage, and tracking error, although if the tracking error is random it only matters if you’re managing money for others) and higher management fees associated with TLH. See Alpha Architect’s The Costs and Benefits of Tax-Loss-Harvesting (TLH) Versus an ETF.

Another restraint on TLH enthusiasm is the $3,000 per year limit on deducting net capital losses against ordinary income. Losses are more valuable in an NPV sense if you can use them to offset significant capital gains when diversifying out of a large gain in a concentrated position. With markets where they are, especially the Mag 7 and BTC, this is a common high-class problem.

Still, the fintech world with the rise of robo-advisors and software is enabling both retail and advisors to “direct index” making TLH both easier and less costly.

Getting a sense of proportion

Let’s do some simplistic hand-wavey math to get a sense of proportion for how TLH might work if you were simply long a $1mm basket of stocks.

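Assuming individual stocks were i.i.d. (“independent and identically distributed”) with an expected monthly return of 0 and a standard deviation of 10% (10% *√12 ~ 35% as an annual estimate of single stock volatility), then conditional on a stock being down its typical loss is about 6.75%. (For a normal distribution, 6.75% is the median loss of the down half; the mean is closer to 8%. The proportions below hold either way.)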

This is symmetrical. Given that a stock is up the average expected return is +6.75%. We are just using arithmetic returns.
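A quick check of the conditional-loss figure under the stated normal assumption, showing both the mean and the median of the losing half. This is just a sketch; the variable names are mine.

```python
import math
from statistics import NormalDist

sigma = 0.10  # monthly standard deviation from the text, mean return of 0

# Mean loss given the stock is down: E[|X| | X < 0] = sigma * sqrt(2 / pi) for X ~ N(0, sigma^2)
mean_loss_given_down = sigma * math.sqrt(2 / math.pi)

# Median loss given the stock is down: the 25th percentile of the full distribution
median_loss_given_down = -NormalDist(mu=0.0, sigma=sigma).inv_cdf(0.25)

print(f"mean loss given down:   {mean_loss_given_down:.2%}")
print(f"median loss given down: {median_loss_given_down:.2%}")
```

The gap between the two numbers is the usual mean-versus-median story for a skewed (half-normal) distribution.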

In mathematical expectation you expect the portfolio p/l to be 0, with half the stocks up and half down. Thinking about the portfolio as 2 halves, you expect $500k to earn 6.75% or $33,750 and $500k to lose $33,750.

Now suppose you rebalance the losers into a Wario basket of names that have the same exact characteristics as the ones we sold. You have now crystallized $33,750 of short-term tax losses but our exposure is the same. You have gained an asset to offset future tax liabilities.

Just staying simplistic, if all the stocks proceed to go up by 5% over the next 365 days and you sell on day 366 to get LTCG treatment (assume 23.8% — which is just Federal!) then what is your after tax return?

Winning stocks:

$533,750 * 1.05 = $560,437.50 or $60,437.50 total profit on a basis of $500k

Losing Stocks:

$466,250 * 1.05 = $489,562.50 or $23,312.50 profit on a basis of $466,250

resulting in:

+$83,750 LTCG (5% on $1mm exposure plus the winners’ earlier $33,750 gain)

-$33,750 short term losses

= taxable gain of $50,000

tax bill = 23.8% x $50,000 = $11,900

After cutting the check what do you have in your account?

$1,050,000 – $11,900 = $1,038,100 or a 3.81% after-tax return.
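The same bookkeeping as a small Python sketch, so you can change the rate, the move size, or the growth assumption. The variable names are mine. Computed without rounding, the harvested loss exactly cancels the extra gain created by the reset basis, leaving a taxable gain equal to the 5% growth on $1mm.

```python
LTCG_RATE = 0.238        # federal long-term rate assumed in the text
START_VALUE = 1_000_000
FIRST_MOVE = 0.0675      # winners up 6.75%, losers down 6.75%
GROWTH = 0.05            # every stock then rises 5% over the next year

half = START_VALUE / 2
winners_value = half * (1 + FIRST_MOVE)   # basis stays at $500k
losers_value = half * (1 - FIRST_MOVE)    # sold and rebalanced, so this becomes the new basis
harvested_loss = half - losers_value      # short-term loss crystallized by the swap

winners_gain = winners_value * (1 + GROWTH) - half
replacement_gain = losers_value * GROWTH
ltcg = winners_gain + replacement_gain

taxable_gain = ltcg - harvested_loss
tax_bill = LTCG_RATE * taxable_gain
end_value = (winners_value + losers_value) * (1 + GROWTH)
after_tax_return = (end_value - tax_bill) / START_VALUE - 1

print(f"LTCG: {ltcg:,.2f}  harvested loss: {harvested_loss:,.2f}")
print(f"taxable gain: {taxable_gain:,.2f}  tax: {tax_bill:,.2f}")
print(f"after-tax return: {after_tax_return:.2%}")
```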

 

💡The tax benefit

You’ll notice that if you bought $1mm of an ETF that went up 5% in a year and sold on day 366 your $50,000 profit less 23.8% taxes would net you about the same after-tax return.

So where’s the benefit?

It’s in the optionality of this pool of short term losses that you have control over. You could just let this $1,050,000 portfolio grow and keep the short-term losses as an asset in your back pocket to use against future tax liabilities, some of which are going to be taxed higher than LTCG rate. Alternatively if you need to divest a large chunk of a profitable position to raise cash, you’ll have a large pool of losses to offset the gain.

The real value of TLH emerges when:

  • Offsetting Higher Tax-Rate Gains
    • Short-term capital gains (STCG) or ordinary income are taxed at higher rates than LTCG. If your harvested losses offset STCG or ordinary income, you reduce your taxes significantly.
  • Perpetual Deferral
    • Step-Up in Basis
      • If you hold the portfolio until death, the heirs may receive a step-up in basis, erasing deferred taxes entirely.
    • Charitable Contributions:
      • Gains on low-basis positions can be avoided by donating appreciated securities to charity.

This toy example is compelling enough to realize it’s important. But there’s also another flavor of TLH on the scene with the potential to generate significant short-term accounting losses while the overall value of the portfolio grows.

TLH on levered long-short portfolios held in a separately managed account (SMA)

Using portfolio margin and a quant framework (this can range from fairly basic to factor-intensive), an investor can run the same beta they desired in a typical long-only ETF but generate significant short-term losses by using their stocks as collateral to overlay a long-short portfolio.

This is typically done with an advisor who will in turn be using a sub-advisor whose infrastructure allows them to scale portfolio adjustments across thousands of custom portfolios held in SMAs.

I’ve heard some claims of how much more impactful this can be but again it’s critical to sanity check with actual numbers to make sure the sense of proportion is reasonable. From there you start layering common caveats which are easier to handicap in terms of bps per year.

In this case, the sanity checks called for simulation.

A little foreshadowing — if you are in a high tax bracket or trying to work out of a concentrated position with a low cost basis you are going to want to see this.

You get the simulation code, you can run it in your browser and it will even download the full output. I’ll show you a few manipulations for the output so you can get a strong grasp on the mechanics. This is one of those concepts that you can’t unsee once you see it. Multiple bulbs going on at the same time.

(It’s also quite depressing that so much time is spent on taxes and that the ROI on that time is validated by the math. Both taxes and the time spent on their minimization are deadweight loss. I like markets, I hate structuring and law and tax and basically all the crap that’s probably higher yield to understand. And just going through this exercise depressed me even further because it confirmed how important it is.)

Onwards…

This Jupyter notebook can be run directly in the Google Colab environment.

🔗TLH.ipynb

Open the link and press “play”.

The output will:

  • Return a summary of the simulation results in the browser
  • Download a CSV to your browser

 

Stepping through the tax-loss harvesting (TLH) simulation

This simulation models a tax-loss harvesting strategy applied to a hypothetical stock portfolio over a 12-month period. The objective is to demonstrate how TLH can potentially reduce taxes by systematically harvesting losses on individual stocks while maintaining the portfolio’s market exposure.

Key Steps and Mechanics of the Simulation

  1. Portfolio Setup: $1mm long equity portfolio
    • The portfolio consists of two parts:
      • Long Positions: $1,300,000 allocated across 100 individual stocks.
      • Short Positions: $300,000 allocated across another 100 individual stocks.
    • Each stock in the long portfolio has a starting price of $100 and is equally weighted. Each stock in the short portfolio also has a starting price of $100 and is equally weighted.
  2. Return Simulation:
    • Every month, the returns for each stock are randomly and independently generated based on a normal distribution with:
      • Mean Return: .80% per month (approximately 10% annualized compound return).
      • Volatility: 10% per month
  3. Monthly Rebalancing for Tax-Loss Harvesting:
    • Harvesting Criteria: At the end of each month, the simulation checks each position to see if it meets the tax-loss harvesting criteria.
    • Long Positions: If the price of a long stock falls below its cost basis ($100), the position is “harvested.”
      • Harvesting Process:
        • The position is closed, and the realized loss is calculated based on the difference between the cost basis and the current price.
        • This loss is recorded as a crystallized loss, and the realized loss amount is added to the cumulative short-term losses for the portfolio.
        • A new stock is bought in its place with a fresh cost basis of $100 using the harvested amount. Since the original position suffered a loss, the proceeds will be less than $13,000 ($100 * 130) worth of shares. Since the new stock is also $100, the share quantity must be less than 130.
    • Short Positions: If the price of a short stock rises above its cost basis ($100), the position is similarly harvested.
      • Harvesting Process:
        • The position is closed, realizing a loss based on the difference between the cost basis and the current price.
        • This loss is recorded as a crystallized loss, and the amount is added to the cumulative short-term losses.
        • A short position in a new name is established to match the notional amount of the covered position. A fresh cost basis of $100, but the short share quantity will necessarily be less than 30 shares.
    • No Harvest for Profitable Positions: Positions that remain above (long) or below (short) their cost basis are not harvested and continue with their updated prices and fixed share quantities into the next month.
  4. Tracking Results:
    • For each month, the simulation tracks:
      • Monthly Short-Term Losses: The sum of all realized losses from harvested positions within the month.
      • Cumulative Short-Term Losses: The running total of all realized losses harvested up to that point in the year.
      • Monthly Tax Benefit: Calculated as the monthly short-term losses multiplied by a specified long-term capital gains tax rate (assumed to be 23.8%) since that’s what they will be used to offset.
      • Cumulative Tax Benefit: The running total of tax savings from all harvested short-term losses over the year.
  5. Assumptions:
    • Consistent Cost Basis: Each new position, whether long or short, always has a fresh cost basis of $100, regardless of the prior stock’s price at liquidation.
    • Monthly Frequency: The portfolio is evaluated and rebalanced for TLH at the end of each month, meaning opportunities to harvest losses are considered 12 times over the year.
    • Independent Stock Movements: Each stock’s returns are generated independently of others, with no correlation among stock prices.
    • Equal Allocation and Reinvestment: Both the long and short portfolios are equally allocated across the stocks, and any harvested amount is fully reinvested into a new position with the same initial investment amount.
    • Static Portfolio Size: The portfolio maintains 100 long and 100 short positions, with new stocks replacing harvested ones to keep the portfolio composition stable.
  6. Output:
    • At the end of the simulation, the following information is displayed:
      • Detailed Monthly Summary Table: Includes individual stock performance, crystallized losses, and other details for each stock every month.
      • Month-End Summary: Shows monthly and cumulative short-term losses and tax benefits. This provides insights into how the strategy’s tax benefits accumulate over the year.
      • Overall Portfolio Statistics: Total portfolio gain/loss, gross return, accumulated short-term losses, and the final tax benefit as a percentage of the initial portfolio value.
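The mechanics above can be sketched in a few lines of NumPy. This is not the notebook's code, just a stripped-down illustration that tracks only the harvested losses. The function name and defaults are mine, and the sizing of a replacement short (original notional less the realized loss, which gives fewer than 30 shares) is my reading of the description, so treat it as an assumption.

```python
import numpy as np

def simulate_tlh(n_months=12, n_names=100, long_notional=1_300_000, short_notional=300_000,
                 mu=0.008, sigma=0.10, tax_rate=0.238, seed=0):
    """Monthly tax-loss harvesting on a 130/30 book of i.i.d. stocks (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    basis = 100.0
    long_px = np.full(n_names, basis)
    short_px = np.full(n_names, basis)
    long_sh = np.full(n_names, long_notional / n_names / basis)    # 130 shares each
    short_sh = np.full(n_names, short_notional / n_names / basis)  # 30 shares each
    monthly_losses = []

    for _ in range(n_months):
        # Independent normal monthly returns for every name
        long_px *= 1 + rng.normal(mu, sigma, n_names)
        short_px *= 1 + rng.normal(mu, sigma, n_names)

        # Harvest longs below basis and shorts above basis
        long_hit = long_px < basis
        short_hit = short_px > basis
        loss = (long_sh[long_hit] * (basis - long_px[long_hit])).sum()
        loss += (short_sh[short_hit] * (short_px[short_hit] - basis)).sum()
        monthly_losses.append(loss)

        # Replace harvested longs: reinvest the sale proceeds in a new $100 name
        long_sh[long_hit] = long_sh[long_hit] * long_px[long_hit] / basis
        long_px[long_hit] = basis

        # Replace harvested shorts (ASSUMPTION: original notional less the realized loss)
        resized = short_sh[short_hit] * (2 * basis - short_px[short_hit]) / basis
        short_sh[short_hit] = np.maximum(resized, 0.0)
        short_px[short_hit] = basis

    monthly_losses = np.array(monthly_losses)
    cumulative = monthly_losses.cumsum()
    return {
        "monthly_losses": monthly_losses,
        "cumulative_losses": cumulative,
        "cumulative_tax_benefit": cumulative * tax_rate,
    }

result = simulate_tlh()
print(result["cumulative_losses"][-1], result["cumulative_tax_benefit"][-1])
```

Changing `short_notional` to 0 is a quick way to compare against a long-only harvest, and re-running with different seeds gives a feel for how much the totals move around.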

Summary

I address a few of the real-world considerations further below.

But to put the value of this concisely:

Making a $100k capital gain on an investment is not as useful as making $300k with $200k of short-term losses even though the net is the same.

[Notice how you might not have enough capital gains to take advantage of all these short-term losses. Which is why a strategy like this is especially useful for investors sitting on concentrated profits: they can work out of it with much smaller tax impact. Holdings can be used to collateralize shorts with portfolio margin!]

Real World Considerations

  • In practice, a TLH strategy would seek to rebalance into names with similar characteristics (whether by factor, sector etc) to avoid wash-sale rules. In the simulation each stock has the same vol but this proxies using equal-vol weighting in the real world.
  • There is a cost of leverage although it is partially offset by the short-stock rebate on shorts.
  • Note that as the market rises, the number of positions that are underwater declines. Names you recently rebalanced into will have a better chance to experience loss to harvest than a name that you have been holding for years of a bull market. However the levered version of TLH which includes shorts and longs offers far more opportunity to harvest than a long only portfolio which might have very few losers after several years.
    • As time passes the surface area for loss harvesting stabilizes towards something of a steady state.
  • The strategy is meant to maintain a 100% long exposure (130 long vs 30 short) so if volatility increases it is still bad news. But the tax loss harvesting portion can actually benefit from higher volatility so there’s a natural buffer.
  • Monthly turnover means slippage costs. Those will increase with volatility.

Wrapping up

I leave it to you to decide how interesting this all might be.

(I find it compelling but still taking it apart.)

I’m not a tax expert. I’m not a simulation expert. Hell, it was a long battle with ChatGPT to get that code to a place that felt right. (It was about 50 iterations of “run code”, “pivot table the CSV data”, “see if the lifecycle of trade/rebalance/accounting made sense”, “tell ChatGPT how the desired behavior of the code diverged from the actual behavior”, “repeat”).

Example of stitched together images of pivot tables to investigate:

  • This one allowed me to see crystallized losses by month. You can see how they decline over time. That’s because as the market rises, stocks sit further above their cost basis, which means less opportunity for harvesting. This would be much more pronounced if you did not employ shorting.

This table lets you see price paths

 

Position and portfolio values over time

I hope some of you will get your hands dirty with this as well. I want to know what I’m missing or flat-out misunderstanding. Even placing sane error bars around the real-world considerations would be helpful.

Option Spellbooks

When you learn a new language you ascend levels of competence. You start with everyday words and basic grammar with a goal of at least being comprehensible. This will unlock simple conversations, a milestone that enables and encourages regular practice. Like getting through the first month of guitar when you just wrestle with physical finger placement and learning your A, E, G, C, D open chords. The moment you can fluidly switch between the chords you can play the majority of music you hear on the radio. That’s when your learning takes off.

With enough practice in language or music you can think outside your native tongue or improvise in real-time. In skill acquisition terms, you have achieved “unconscious competence”.

Options are the language of risk

You start by mapping basic vocabulary to real-world sensation. “Long call” = “happy if stock go up”. With practice, you develop taste in how to use this vocabulary. In English, “I know how you feel” expresses empathy but you don’t say that to someone who just lost a loved one both because it’s not true (you don’t know how they feel) and because this comment awkwardly makes the moment about you. Well-intentioned commiseration clumsily executed.

Trade structuring can be clumsily executed too. You think something could double in the next 6 months and you…sold puts? If you are right, congratulations on making a tiny bit of cash on direction but opportunity costing yourself a small fortune by expressing the trade with a short vol position when your entire thesis insinuates orgasmic levels of vol!

You touched options without understanding that they are always about vol. Like knowing the words but not how to use them. You’re still stuck at “unconscious incompetence”.

Before you go further it will take 4 minutes to review this idea directly…see Translating to “option surface” language

Wizardry

You have a book of spells. You desire stealth so you look up “cloak of invisibility” and see you need a salamander tail, a runestone and peroxide to conjure it. Common ingredients in a sorcerer’s home.

In investing you have a sense that a stock is “probably going higher but if I’m wrong I’m really wrong.” That’s a natural language description of a somewhat routine scenario. You need to translate that to code. Option code.

  • What instrument is levered to the “probability that a stock goes up”?
    • call spread
  • What instrument is levered to “if down, magnitude is yuuuge”?
    • far OTM put

So the package of “probably going up, but if down, then down big” can be represented in option language as “long a call spread plus a teeny put”.

That’s the name of the spell.
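To see the shape of the spell, here is a minimal sketch of its payoff at expiry. The strikes and premium are made up for illustration; the function name is mine.

```python
def package_payoff(spot_at_expiry, call_long_k, call_short_k, put_k, premium=0.0):
    """Expiry P/L of: long call spread (long call_long_k, short call_short_k) + long put at put_k."""
    call_spread = max(spot_at_expiry - call_long_k, 0) - max(spot_at_expiry - call_short_k, 0)
    far_otm_put = max(put_k - spot_at_expiry, 0)
    return call_spread + far_otm_put - premium

# Hypothetical structure on a $100 stock: 105/110 call spread plus an 80 put, paying 1.5 in total.
for final_price in (60, 80, 100, 107, 112, 130):
    print(final_price, package_payoff(final_price, 105, 110, 80, premium=1.5))
```

The call spread pays a capped amount if the stock drifts up (a bet on probability), while the put pays more the further the stock collapses (a bet on magnitude). In between, you lose the premium.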

Now, as a wizard you can adjust the size, ratio, and particular strike to both

a) suit your taste and

b) take “what the option market is giving you”.

What the option market is giving you

This ties right back to the most underappreciated concept in trading….is it bid or offered?

If you believe that an asset’s upcoming moves are distributed so it probably goes up but if not it crashes, you know what spell to cast. That’s half the battle.

Next question:

Is the option market offering you the ingredients for the spell at an attractive price? Or is every wizard trying to cast the same spell and bidding for the same ingredients?

We are going to look at a couple spells and their cost of ingredients.

You’ll learn how to measure price, decompose recipes, and if you read between the lines you’ll notice that even when no trade is indicated you receive a valuable update to your assumptions or an opportunity to express the bet in a way that exploits the price anomaly. There’s a beautiful yin and yang to options. If the cost of one spell gets expensive there’s a counterspell that gets cheaper with respect to your view!

Let’s get to some data.

Setup

We are going to look at 2 spells that can make sense for betting on “probably going up, but if down, then down big”. An interesting thing about this distribution is how it’s kinda the base case distribution for SPY, right? Both history and the options market agree that the SPY proposition looks like “hey you’ll make some money on average and every now and then there will be a big drawdown. also, don’t expect the market to crash up overnight”.

[💡That fact in itself is useful because if you think another market conforms to that distribution you know how the SPY option market prices that spell! You may even adopt a prior that the SPY price for that spell could be an upper-bound or be considered “expensive” if another asset started trading that way. If that’s hard to understand, don’t sweat it, it’s a slightly more advanced inference that a relative value trader might make.]

We are going to price the spells and look at charts.

  • Our universe will be: SPY, TLT, GLD, USO and BITO
  • Our date range is almost 4 years (Jan 2021 – Nov 15, 2024…except for BITO which starts 10/21)
  • We are looking at the closest listed expiry to the 90-day maturity (based on what option chains are listed there is about 15% tolerance on DTE around 90 but BITO chains are more sparse so DTE can vary by as much as 30 days or 33% tolerance)
  • We are choosing strikes closest to breakpoints that are 1 or 2 standard devs OTM depending on the spell. If we measure strike distance with standard deviation we are using the ATM vol to define the SD. So if 2 assets are both $100 a 1 standard deviation OTM call will be further away (ie a higher strike) in the higher vol asset. This concept is detailed in this series of posts:

There will be lots of little observations along the way so in the spirit of “enough with the rulebook, let’s just play”, we’ll proceed.

Learning the method via SPY

The first spell we’ll examine is a package of:

1-s.d. / 2-s.d. call spread

+

2-s.d. OTM put

measured as a percentage of the spot price.

For example, on 1/12/2021 the details were:

The package cost 1.16% of the spot price on that day.
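As a sketch of the arithmetic behind that number: the package is the call spread (long the 1-s.d. call, short the 2-s.d. call) plus the 2-s.d. put, all divided by spot. The prices below are hypothetical placeholders chosen to land near the same percentage. They are not the actual 1/12/2021 quotes.

```python
def package_pct_of_spot(near_call, far_call, far_put, spot):
    """Long call spread plus long far OTM put, expressed as a fraction of spot."""
    call_spread = near_call - far_call  # long the 1-s.d. call, short the 2-s.d. call
    return (call_spread + far_put) / spot

# Hypothetical mid-market prices, for illustration only
pct = package_pct_of_spot(near_call=3.10, far_call=0.60, far_put=1.90, spot=380.0)
print(f"{pct:.2%}")
```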

Exploring the long call spread + long far OTM put spell

Let’s move on to charts. I’ll point out the notable features and bits of learning in each one.

Chart 1

🌙The red line is the price of the package. It ranges from about 70-150 bps for a 3-month expiry

🌙The blue line is the ATM call as a percentage of spot. If you remember the straddle approximation formula you know that by dividing the spot price out this measure will track implied volatility.

🌙The package is long volatility (you lay out premium for it) so it’s not surprising that its price is correlated with the vol level. You also see that the package price itself varies between something like 20% and 50% of the ATM call value. When vol is low, it’s a higher percentage of the call price. Ponder that for a moment. It means OTM options are a bit “stickier”. Skew as a percentage of ATM vol (aka “normalized skew”) flattens as vol increases and steepens as vol softens. We’ve talked about this before in the context of conditioning skew percentiles on vol levels by using scatterplots to see that skew has a curved relationship to vol.
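If you want to convince yourself that the ATM call as a percentage of spot tracks implied vol, here is a small sketch. It assumes Black-Scholes with zero rates and dividends and a strike equal to spot, and it compares the exact value to half of the straddle approximation (straddle ≈ 0.8 × vol × sqrt(T) as a fraction of spot). The function names are mine.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def atm_call_pct_of_spot(vol, t_years):
    """Black-Scholes call / spot when strike = spot and rates = dividends = 0."""
    d1 = 0.5 * vol * math.sqrt(t_years)
    d2 = -d1
    return norm_cdf(d1) - norm_cdf(d2)

def approx_call_pct_of_spot(vol, t_years):
    """Half of the straddle approximation: call ~ 0.4 * vol * sqrt(T)."""
    return 0.4 * vol * math.sqrt(t_years)

t = 0.25  # roughly a 3-month expiry
for vol in (0.10, 0.20, 0.40):
    exact = atm_call_pct_of_spot(vol, t)
    approx = approx_call_pct_of_spot(vol, t)
    print(f"vol={vol:.0%}  exact={exact:.4f}  approx={approx:.4f}")
```

Because the call as a fraction of spot is nearly linear in vol for a fixed expiry, plotting it over time is close to plotting a rescaled IV series.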

Chart 2

🌙Same data in scatterplot form. You can see the correlation clearly (r = .76). Red dot is the recent observation (11/15/24). The package price is high compared to the vol level. I don’t like to anthropomorphize markets and say “the vol surface agrees with the statement that the market is likely to go up but if not crash” but the price of that expression or spell is high. The market is not offering it cheaply.

Chart 3

🌙Lots of smoothing. The dark red line is the 200d moving avg of the package. The dark pink is the rolling 200d standard deviation of the package price. Dashed pink is the 200d MAD (mean absolute deviation, just another measure of deviation). If the ratio of MAD to standard deviation is close to .80 then the distribution of the stat is close to normal; if it’s less, it’s fatter tailed. It costs nothing to add this metric so I tend to do it whenever I look at a standard dev just in case there’s a disparity from .80. The dashed light blue line shows that the ratio is stable at .80 even if it’s hard to see on that axis.

🌙The dark blue dashed line is the z-score of the observation vs a 200 day lookback. You can see the spike on 8/5/24
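Here is a rough sketch of the two stats on this chart: the MAD-to-standard-deviation ratio and the z-score of the latest observation. Whether the lookback window should include the latest observation is a judgment call. This version excludes it, which is my assumption and not necessarily how the chart was built. Function names are hypothetical.

```python
import math
import random

def mean_and_sd(values):
    """Mean and population standard deviation."""
    n = len(values)
    mean = math.fsum(values) / n
    sd = math.sqrt(math.fsum([(v - mean) ** 2 for v in values]) / n)
    return mean, sd

def mad_to_sd_ratio(values):
    """Mean absolute deviation divided by standard deviation."""
    mean, sd = mean_and_sd(values)
    mad = math.fsum([abs(v - mean) for v in values]) / len(values)
    return mad / sd

def latest_zscore(values, lookback=200):
    """Z-score of the last value vs the `lookback` values that precede it."""
    window = values[-lookback - 1:-1]
    mean, sd = mean_and_sd(window)
    return (values[-1] - mean) / sd

# Sanity check on simulated normal data: the ratio should sit near sqrt(2/pi) ~ 0.80
random.seed(0)
normal_sample = [random.gauss(0, 1) for _ in range(100_000)]
ratio = mad_to_sd_ratio(normal_sample)
print(round(ratio, 3), round(math.sqrt(2 / math.pi), 3))
```

For a normal distribution the ratio is exactly sqrt(2/π), which is where the .80 rule of thumb comes from.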

Chart 4

🌙I was just curious so I threw up a scatter of the package price vs the trailing 1-week return. It gets cheaper when the market rallies but that’s just the spot-vol effect. On large moves either way it does look like there’s some modest smile effect where even on a large up move the package is more likely to be on the more expensive side. Maybe the most interesting part of the chart is the X-axis itself. You can see the left skew of weekly returns. There are more large up moves than down moves but the 2 largest moves are negative and out of bounds compared to the rest of the blob.

Decomposing the spell: the price of the ingredients

Let’s look at the building blocks of the spell to see what’s driving the value of the package.

🌙In 2022, the bulk of the package price was driven by the far OTM put. But you can see some small stretches like October 2022 when equity markets bottomed for the year (lows that haven’t been touched since) where the call spread was actually more expensive than the puts! We’ll chart the ratio of the package to the ATM call price directly to see that the package got cheap relative to the vol — again the skew flattening (shaded region was Oct 2022 lows). In this case, it was the put skew especially that got hammered.

 

[I remember 2022 in vol markets as the year tail and defensive strategies massively underperformed because funds had bought puts at the end of 2021, a frothy year from which a sell-off was actually consensus. If a sell-off is stabilizing, and restores order to the force, the skew is not gonna perform.

Markets are biology not physics he shouts into the void. It’s poker all the way down. Notice how I don’t try to create rules from what has happened or play technical analysis. I just look for prices that are anomalous, which doesn’t mean wrong, it just means it deserves attention. From there you have to like think and stuff. Get used to sitting in paradox. There is no closure. Markets are utterly indifferent to your need for coherence.

For what it’s worth a similar setup happened in late 2018 — vols ripped into the Q4 sell-off but skew got crushed because of a large overhang of recycled downside vega from the autocallable structured products. There was an inflection point in the open interest though so a clever cat would realize you could load up on the hated, supplied region of the skew because the market’s outstanding vega profile would flip on a second leg down (ie well supplied to flat).

So what do you do there? One thing that you can do (and it worked) was buy the puts and hedge on a heavy delta in expectation of a rally. If it rallies, you win on the outsize delta plus the puts sliding up the skew curve as they become further OTM. If the market sells off into the abyss where the open interest and market-wide greeks quickly decay you are long vega into an exploding vol market. It was one of the easier risk-reward setups that comes along every now and then. I know I’m gonna get emails from retail traders asking about how you can plug into this and I’ll just preempt it now. You can’t. It’s an information game that propagates out from the OTC market and all the risk reshuffling that gets farmed out in chunks to institutional vol desks. At some level it gets “into” the listed market but the scent has to be picked up higher in the chain of issuance to understand the arrows, dependencies, and inflections.]


Exploring the 1 sd collar spell

The next spell is the 1 sd collar (also known as a risk reversal or god help you if you trade commodities, “a fence”)

Long 1-s.d. OTM put

−

1-s.d. OTM call

Notice how this is put “minus” call. That means if the collar trades for a positive value then the put price > call price.

Again we will divide the package price by the spot price to normalize and we are still using 90 day maturity.

Let’s use USO, the oil etf. Instead of pointing out what’s interesting in each of the charts I will cherry-pick questions so you can practice.

USO:

Chart 1

🌙What does it mean that the collar price sometimes goes negative? Do you think you can spot the Ukraine invasion in the chart?

Chart 2

🌙 The collar price can get extreme in favor of the put OR the call at high volatility. Why might that make sense for oil?

Chart 3

🌙The chart starts in January 2021. The puts started the year high relative to the calls, declined over the course of the year, then flipped in 2022. What’s the story behind the change?

Chart 4

🌙Compared to the SPY chart’s x-axis earlier what do you notice about oil’s weekly returns?

 

The decomposition of the collar into the separate call and put legs will let you see the drivers.

In early 2022, you can see the put leg was relatively stable after peaking at the end of 2021 but the collapse in collar pricing was driven by the calls exploding higher.

🌙Can you see the Oct 7, 2023 Hamas attacks in the chart?

SPY collars

I’ll just point out the animal spirits in the stock market. SPY collars are relatively cheap these days.

The contribution of the legs to the cheapness looks balanced. Puts are a bit low and calls are a bit high.

BITO collars

BITO is an ETF that tracks BTC futures. Its performance is marred for the same reason VXX stinks: negative roll returns as the prompt future must be rolled into a premium 2nd month in a contango (upward-sloping) futures market.

You can see the effect clearly in the BITO cumulative return chart but it’s still useful to see the point-to-point changes. Even though the cumulative return since its inception is -35% you can see that in the past 2 months it’s surged from -60% to -35%, mirroring BTC’s giant rally.

Interestingly, it looks like when they were first listed, the calls were premium to the puts (negative collar prices) before establishing the familiar regime we see in many assets — the puts are premium.

[More story time: In the mid-aughts during the commodity “supercycle” there was a lot of noise about commodities as an asset class, financialization, yada yada. I was on the NYMEX floor when WTI first breached $100. IIRC a trader overpaid for a one-lot in the pit to own the memorialized print.

Anyway, call skew in the oil markets got annihilated with financialization. Why? Because to passive investors, calls are free money to be sold. So as something becomes financialized you have some prior about what will happen to the call skew. Something to consider for all those bulled up on IBIT calls these days. I’m not exactly sure what happened in BITO when it got listed, but maybe the first option quoter was still wearing footie pajamas back when commodities became an asset class.

As you can see from the USO charts, the skew in oil is fascinating because barrel prices can crash down like during COVID or up. From a long history of trading oil, owning the call skew has been positive expectancy because hedgers are a steady supply of upside-vol yet the true distribution includes crash-up potential.

This was also true in heating oil and gasoline. Gasoline is a smaller market than crude oil and less liquid. My approach to OTM calls in RBOB — buy em’ when they’re cheap, only sell them closing. Never open on the short side “because they screen high.” Same attitude in nat gas or anything that gets real sloppy on the upside as liquidity evaporates. There’s probably edge in being short because that’s where you’d expect the risk premia to live but the risk/reward and survival imperatives mean only those messing with other people’s money get wide-eyed at that stuff. I guess you could always trade it small but also maybe find other ways to deal with your boredom.]

Spellbooks

I’m not going to show charts for each asset. Instead, I’ll just show this self-explanatory spellbook.

You could imagine columns for decomposing the legs and their stats. On and on. You could see how one might want 10 screens.

[We don’t have this in moontower.ai but we’re excited about all the wood we’re gonna chop in 2025 😉]

Actually on the 10 screens bit — a normal human thinks that sounds like a nightmarish work environment. But you have to have some faith that this works just like walking and chewing gum. You get used to it. Plus all the dashboards are the survivors of the evolutionary screen-space tournament. What’s left is a bunch of tables and charts that your eyes easily scan and process into a gestalt. It’s like the market is having conversations all around you but a well-constructed cockpit will simulate the cocktail party effect — you’ll identify the most prominent features of what market parameters are changing today. This is the “science” part. The actions are the art.

Wrapping up

Some repetition and a smattering of observations that might not be obvious to learners

  • Using 90 day options means we need to multiply the typical cost of these structures by 4 to annualize
  • Assets vary in what a “typical” vol surface looks like. SPY always has a pronounced put skew. Coffee and VIX have inverted skews. Gold is usually a true smile with both far OTM puts and calls trading at premium vols to ATM. The same spell therefore varies in cost across markets.
  • Decomposing spells will let you see what legs are driving the cheapness or expensiveness of the package.
  • In keeping with the “options as a language” metaphor, volatility metrics like “implied vol” and “skew” can be seen as abbreviations. Like if the put skew is high, the price of the collar will be high. But when you trade you don’t trade “skew” or “vol” directly; you trade contracts with actual prices. The vol metrics are handy ways to normalize in the same way that P/E is an attempt to normalize for comparison. But you trade actual instruments not metrics. As an option user, you open an option chain but it’s gibberish. Vol surfaces and metrics help you make sense of “all the numbers” on the screen with curves and historical context. We use the metrics to zero-in on what we want to do, but for execution we shift gears back into prices and strikes not vols and deltas. These spells are like a transformer layer between vol lens thinking and price thinking.
  • You have an opinion about the market, but is your opinion bid or offered? The options market can tell you. Remember stock and asset prices in general are low resolution. Options are surgical. Just like parlays, the relative prices between the propositions imply tradeable ideas. Options are the coding language in which ideas are scripted into trades.

the option market’s point spread (part 2)

In part 1, the option market’s point spread, we introduced the idea of the VRP or volatility risk premium which sets the line on whether an option buyer or seller will win. Usually the sellers win but that statement is uselessly low res.

💡When we say “win” we mean even in expectancy terms. In frequency terms it’s even more true. From Straddles, Volatility, and Win Rates:

Expectancy and win rate are not the same. Remember that the most you can lose is 12% but since there is no upper bound on the stock, your win is theoretically infinite. So the expectancy of the straddle is balanced by the odds of it paying off. You should expect to lose more often than you win for your expectancy to be zero since your wins are larger than your losses.

So how often do you theoretically win?

A fairly priced straddle quoted as percent of spot costs 80% of the volatility. We know that a 1-standard deviation range encompasses about 68% of a distribution. How about a .8 standard deviation range?

Fire up Excel. NORMDIST(.8,0,1,TRUE) for a cumulative distribution function. You get 78.8% which means 21.2% of the time the SPX goes up more than .8 standard deviations. Double that because there are 2 tails and voila…you win about 42% of the time.

So in Black-Scholes world, if you buy a straddle for correctly priced vol your expectancy is zero, but you expect to lose 58% of the time!
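If you don’t have Excel handy, the same win-rate arithmetic can be sketched in a few lines. This assumes a standard normal distribution of moves measured in standard deviations and a breakeven of .8 standard deviations, as in the quoted passage. The function names are mine.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, the equivalent of NORMDIST(x, 0, 1, TRUE)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def straddle_win_rate(breakeven_in_sd=0.8):
    """Probability that the absolute move exceeds the breakeven (both tails)."""
    one_tail = 1.0 - norm_cdf(breakeven_in_sd)
    return 2.0 * one_tail

win = straddle_win_rate(0.8)
print(f"win {win:.1%}, lose {1 - win:.1%}")
```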

My teaser for this week’s post claimed:

[This week] we will go a bit deeper to appreciate how you can manipulate the inputs into VRPs to identify potential vol trades. I said VRP is the option market’s point spread.

Except for a tiny wrinkle.

There’s no single line.

The VRP computation (IV/RV) is just one measure of relative value. There’s no single point spread.

It’s easy to demonstrate this by casting doubt on the denominator. Here’s a thought exercise:

Stock ABC has a VRP of 10%.

Its IV is 17.6%

Its realized vol based on the last 20 days of daily log returns is 16%

📅16% annual vol is approx 1% per day…16%/sqrt(251)

17.6/16 = 1.10 – 1 = 10% VRP

…but

Now I tell you that the stock went up every single day by 1%.

💡If you compute the standard deviation the standard way by subtracting the sample mean (ie x̄) you’ll actually get a standard deviation of zero. If an asset has exhibited a steady trend this is obviously misleading since calling a stock that went up 1% per day for 20 days “zero vol” drains all semantic meaning from “vol”. The fix is easy…when you compute realized vol just don’t subtract the mean or use mean = 0 before squaring the individual returns. This caveat wasn’t the point of this exercise but if your antennae went up, fair play to you.
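A quick sketch of that caveat, using the 20 days of +1% from the thought exercise. It uses simple returns and a divide-by-n variance for readability, and the 251-day annualization from above. The function name and the `subtract_mean` flag are just for illustration.

```python
import math

TRADING_DAYS = 251  # annualization factor used in the text

def realized_vol(returns, subtract_mean=True, periods_per_year=TRADING_DAYS):
    """Annualized realized vol from per-period returns (population variance)."""
    mean = math.fsum(returns) / len(returns) if subtract_mean else 0.0
    variance = math.fsum([(r - mean) ** 2 for r in returns]) / len(returns)
    return math.sqrt(variance * periods_per_year)

daily_returns = [0.01] * 20  # up 1% every single day

vol_demeaned = realized_vol(daily_returns, subtract_mean=True)
vol_zero_mean = realized_vol(daily_returns, subtract_mean=False)
print(f"mean-subtracted: {vol_demeaned:.1%}   zero-mean: {vol_zero_mean:.1%}")
```

Subtracting the sample mean reports a vol of essentially zero, while the zero-mean version recovers the roughly 16% figure.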

What’s the volatility of this stock?

Sampling daily returns we get 16%. But the stock is up 22% in 20 days.

We can annualize that point-to-point return:

22% * sqrt(251/20) ~ 78% realized vol

We can annualize the weekly (ie every 5 day returns):

5% * sqrt(251/5) ~ 35.4% realized vol

You get the picture.

What is the right vol for the option?!

16% seems way too cheap given what just happened.

So depending on what vol you pick your VRP ranges from 10% on the high end to 17.6/78 – 1 = -77% on the low end.
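Here is the same range of point spreads as a small sketch. It reuses the rounded returns from the example (1% daily, 5% weekly, 22% over 20 days), so the daily figure comes out at about 15.8% instead of the rounded 16%, and the daily VRP lands near 11% instead of 10%. The helper names are hypothetical.

```python
import math

TRADING_DAYS = 251

def annualize(period_return, days_in_period):
    """Scale a single-period move to an annualized vol by the square root of time."""
    return period_return * math.sqrt(TRADING_DAYS / days_in_period)

def vrp(implied_vol, realized_vol):
    """VRP as a premium: 0.10 means IV is 10% above realized."""
    return implied_vol / realized_vol - 1.0

iv = 0.176
realized = {
    "daily": annualize(0.01, 1),
    "weekly": annualize(0.05, 5),
    "20-day": annualize(0.22, 20),
}
vrps = {label: vrp(iv, rv) for label, rv in realized.items()}

for label in realized:
    print(f"{label:>7}: realized {realized[label]:.1%}, VRP {vrps[label]:+.0%}")
```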

There’s not one point spread!

Realized vol is sensitive to your sampling periods. And I’m not even getting into super-fast updating vols (computing realized vols from tick data, a fun rabbit hole of its own).


On the desk, I was always on alert for highly divergent vol readings based on sampling periods.

[This is hardly a silver bullet. The implied vol on the hypothetical stock above is likely to be higher than 16%. The market’s not stupid.]

Honestly, I didn’t screen for those scenarios explicitly. I traded commodity options. The universe was relatively small so whenever I looked at realized vol numbers, which I did often, I had a feel for whether they made sense. If the realized vol (sampled daily) in gold is 12% but the metal is up 7% in 2 weeks I know the realized vol is misleading.

7% * √(52/2) ~ 36%

That’s like a 36-vol move. If near-dated options are trading at a 10% or even 20% VRP to 12% realized (ie 13.2% to 14.4% implied vol) I’m out there accumulating a long gamma position.

I didn’t have a tool to necessarily flag these scenarios but if you trade a relatively short list of names all the time you build a mental history. I’d know which brokers were selling vol in the name, I might ask them where their customers are offering, or I might even go to brokers who were buying at cheaper vol levels before the move and see if they want to sell at the higher IVs and “take profits”.

[The game is a mix of what the cards are (examples include implied and realized vols) and human behavior —who has an axe to buy or sell and how do those axes change with the cards.]

The point is there’s a bunch of tacit knowledge that I don’t bother displaying on yet another monitor.

[There’s a wiseguys-trying-to-outdo-wiseguys snark about having lots of monitors. Like real Gs need nothing more than a laptop. You know something, being Warren Buffett or a VC is nice work if you can get it…but if you get the chance to visit the office of a market-making group you’re gonna see a LOT of screens. My set-up had 6 24s, a tablet for Cloud9, and windows into several virtual machines. What do you want me to say…professional trading is a video game. Put a price on 100k vega in comp, you have about as much time as it takes to move your eyes while chatting them up about last night’s World Series game.

I can see how a normal business person might think this is crazy. Lucky for them, most jobs are civilized. If you want to look down on the animals who need a wall of monitors you’re only soothing yourself — the feral don’t give a f what tool you use to convert time to cash.]

Using the data and background on VRP in last week’s post we can examine my tacit hunches more closely.

The goal of the post is to:

  1. inspire seasoned traders to explore fresh inputs into pricing volatility
  2. for novice traders, to have each of the building blocks in the exposition expand their frontier of knowledge a little further.

Ratio of realized vols sampled at different frequencies

In Risk Depends On The Resolution, we see that volatility depends on the sampling frequency. In general, more frequent sampling results in higher levels of measured volatility. This is a relevant observation for all investors not just option traders. It means that simply looking at an asset’s annual or even monthly returns smooths (if you are a long-term investor) or masks (if you run a strategy whose stakeholders are shorter-term oriented) the path. It’s a warm blanket for the patient and dragon for the churner. Whether the observation is reassuring or a warning depends on the context.

(I leave it to the reader to spot the asset management marketing departments who use this observation to flatter themselves by invoking it inappropriately).

For our purposes today we will measure 1-month realized volatility at 2 sampling frequencies:

  • daily (day-over-day log returns)
  • weekly (5-day point-to-point log returns)

Our monthly window will constitute 20 business days, therefore 1-month vol sampled:

daily means 20 returns or data points

weekly means 4 returns or data points
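A minimal sketch of that measurement on a single 20-business-day window: 21 closes give 20 daily log returns and 4 non-overlapping weekly log returns. It assumes a zero-mean vol calculation and 251-day annualization, which may differ from how the numbers in this post were actually computed. The function names and the two toy price paths are made up.

```python
import math

TRADING_DAYS = 251

def log_returns(prices, step):
    """Non-overlapping log returns sampled every `step` closes."""
    sampled = prices[::step]
    return [math.log(b / a) for a, b in zip(sampled, sampled[1:])]

def zero_mean_vol(returns, days_per_return):
    """Annualized vol, assuming the mean return is zero."""
    variance = sum(r * r for r in returns) / len(returns)
    return math.sqrt(variance * TRADING_DAYS / days_per_return)

def weekly_to_daily_vol_ratio(prices):
    """`prices` holds 21 closes, i.e. one 20-business-day window."""
    daily = zero_mean_vol(log_returns(prices, 1), 1)    # 20 data points
    weekly = zero_mean_vol(log_returns(prices, 5), 5)   # 4 data points
    return weekly / daily

trending = [100 * 1.01 ** i for i in range(21)]                 # up 1% every day
choppy = [100 * (1.01 if i % 2 else 1.0) for i in range(21)]    # up 1%, back down, repeat

print(round(weekly_to_daily_vol_ratio(trending), 2))
print(round(weekly_to_daily_vol_ratio(choppy), 2))
```

The steadily rising path gives a ratio well above 1 and the ping-ponging path gives a ratio well below 1, even though both have the same daily-sampled vol.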

🗓️As a reminder we are using data from the past year (10/11/23 to 10/16/24).

 

Across our 43 names, the average ratio of volatility sampled weekly vs volatility sampled daily is 94%.

By example, that means a stock that realized 10% volatility using daily returns for the past month, would have realized 9.4% volatility if we sampled weekly instead.

Only 5 names had a realized vol that was higher when sampled weekly instead of daily.

This is in keeping with the general empirical principle — volatility sampled less frequently tends to be lower.

However, there is tremendous variation in this ratio even if it averaged 94% over the full sample. The standard deviation of the ratio is a whopping 35% meaning 2/3 of the time the ratio was between:

  • 59%: the daily sampled vol was about 69% higher than the weekly sampled vol (1/0.59 ≈ 1.69)!
  • 129%: the weekly sampled vol was 29% higher than the daily sampled vol.

The lower measured vol effect from less frequent sampling holds generally, but it’s very noisy.

A word on trending vs mean-reversion

Going back to our introductory puzzle, if a stock goes up 1% per day for a month its vol, if we sample weekly, is much larger than if we simply annualize that typical daily volatility.

The ratio of vol sampled weekly / vol sampled daily is much greater than 100%. This scenario corresponds to our colloquial understanding of the word “trend”. The stock “trended” higher. A quant might say the drift dominated the volatility. As far as I know, this ratio being > 1 is not an accepted definition of “trend”. But even if it is not formally defining, I suspect it’s a common characteristic of a market that is labeled “trending”. (I have a post in the queue that will unpack this further so we will put a pin in it for now.)

Regardless of how the wider quant community views trend, the ratio and its suggestion of trend is deeply relevant to option traders.

If the ratio is greater than 1, the long option holder will have wished “they let their gamma run” while the short option trader will have wished to hedge more frequently.

I suspect any option trader reading this will drink to that since the memory of how they hedged is inseparable from large p/l events, positive or negative.


Ya know what, let’s take a breath and acknowledge something…”ratio of realized vols sampled at different frequencies” is a miserable mouthful. Just take a moment to digest it. It refers to the same window of time, it’s just that the numerator (weekly sampling) is less frequent.

Another way to think of it: it takes longer to converge to an estimate of the volatility.

If we computed volatility based on 10-year returns you’d die before you felt like you had a reasonable guess of the asset’s volatility. A long sampling period is a slow-moving measure of variation. Higher frequency sampling gets us to a reliable measure of volatility much faster.

While we’re at it, I have another simplification.

Sacrificing formality for ease of readability, let’s call the “ratio of realized vols sampled at different frequencies” the trend ratio. If the weekly sampled vol exceeds the daily sampled vol we are trending, if it’s lower there’s mean-reversion. (Again, not officially, and don’t tell the quant police or the publishers over at Wiley.)

From trend ratio to VRP

We typically measure VRP as the ratio of implied vol to realized vol sampled daily. But there’s no single VRP. We could make the denominator realized vol sampled weekly or any other interval.

Let’s consider a VRP using the weekly sampled vol.

If the weekly sampled vol is greater than the daily sampled vol (a trend ratio greater than 1), the VRP is algebraically pulled lower. Options appear cheaper.

We expect the options market to correct for this when the trend ratio is much greater than 1 by bidding the implied vol higher.

We expect that a trend ratio greater than 100% will coincide with elevated VRPs when the VRP is computed traditionally (ie with a daily sampled realized vol denominator).
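The algebra is simple enough to sketch. With VRP expressed as a ratio (IV divided by realized vol), swapping the daily-sampled denominator for the weekly-sampled one is the same as dividing the traditional VRP by the trend ratio. The inputs below are hypothetical.

```python
def vrp_ratio(implied_vol, realized_vol):
    """VRP as a ratio: 1.10 means IV is 10% above realized."""
    return implied_vol / realized_vol

def weekly_vrp_from_daily(daily_vrp, trend_ratio):
    """trend_ratio = weekly-sampled vol / daily-sampled vol."""
    return daily_vrp / trend_ratio

# Hypothetical inputs, for illustration only
iv, rv_daily, trend_ratio = 0.20, 0.16, 1.5
rv_weekly = rv_daily * trend_ratio

daily_vrp = vrp_ratio(iv, rv_daily)     # options look rich vs daily-sampled vol
weekly_vrp = vrp_ratio(iv, rv_weekly)   # the same options look cheap vs weekly-sampled vol

print(round(daily_vrp, 2), round(weekly_vrp, 2))
```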

Just looking at the bulk data across names (we limit the x-axis to 2 standard deviations on either side of mean of .94), there’s no relationship.

Let’s look by name.

This is a table of VRPs partitioned by trend ratio. The names that are mostly green have low VRPs and the red ones have had lots of volatility risk premium.

My hunch was that high trend ratios (where weekly sampled vol is much higher than daily sampled vol) would correspond to higher VRPs as the market understands that the traditional measure of realized vol is understating the variance. It’s like the variance is smuggled into a steady trend.

It’s a noisy table. At best maybe BITO and MSFT conform to my expectation. In fact, the broad indices (SPY, QQQ, IWM) and SPY sector indices (the “XL’s”) seem to have an even lower VRP when the trend ratio is well above 1. Considering that the market is up substantially from a year ago, the trend has been positive, which tends to correspond with vol-dampening option selling. This can push down the VRP via the numerator. Algebraically, it implies IV is well below realized vols that are computed using weekly sampling.

I did not expect this. My intuition is mostly tuned on commodities. I don’t see my hunch turned on its head there, but I don’t see a relationship either.

I have another idea. Since each name has its own mean VRP, let’s redo the table where each cell is a diff from the name’s own mean VRP.
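For anyone who wants to build this kind of table themselves, here is a rough sketch of the bookkeeping: bucket each observation by trend ratio, average the VRP within each bucket, then subtract the name’s own mean VRP. The 0.1 bucket width, the function names, and the tickers and numbers are all assumptions for illustration. They are not the data behind the table.

```python
import math
from collections import defaultdict

def bucket(trend_ratio, width=0.1):
    """Lower edge of the trend-ratio bucket, e.g. 1.53 -> 1.5."""
    k = math.floor(round(trend_ratio / width, 9))  # round first to dodge float noise at edges
    return round(k * width, 9)

def vrp_diff_table(observations, width=0.1):
    """observations: iterable of (name, trend_ratio, vrp).
    Returns {name: {bucket: mean VRP in bucket minus the name's overall mean VRP}}."""
    by_name = defaultdict(list)
    for name, ratio, vrp in observations:
        by_name[name].append((bucket(ratio, width), vrp))

    table = {}
    for name, rows in by_name.items():
        name_mean = sum(v for _, v in rows) / len(rows)
        cells = defaultdict(list)
        for b, v in rows:
            cells[b].append(v)
        table[name] = {b: sum(vs) / len(vs) - name_mean for b, vs in sorted(cells.items())}
    return table

# Made-up observations for two made-up tickers
obs = [
    ("AAA", 0.85, 1.20), ("AAA", 0.88, 1.10), ("AAA", 1.52, 0.90), ("AAA", 1.58, 1.00),
    ("BBB", 0.95, 1.05), ("BBB", 1.21, 1.15),
]
table = vrp_diff_table(obs)
for name, cells in table.items():
    print(name, {b: round(d, 2) for b, d in cells.items()})
```

It is also worth keeping a count per cell next to each average, since a cell can easily be built from a single observation.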

I’m getting the same feeling from this table especially in those broad indices. When the trend ratio has been much greater than 1, the VRP nosedives below its average. My suspicion is the vol is getting trashed on those steady “frog-in-the-pan” rallies.

Since the trend ratio is greater than 1, the VRP based on the weekly sampled vol is even lower still!

This begs the question…were the options therefore cheap?

Trend ratios and lagged VRP

One way we can assess if the options were cheap is to look ahead in time to see if the realized VRP was less than 100%. In other words, did the realized vol that prevailed the subsequent month outperform the IV? We call this realized VRP a lagged VRP.
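As a sketch of what that look-ahead involves: take the IV observed on a given day and divide it by the vol realized over the following 20 business days. The exact definition lives in last week’s post, so treat the zero-mean vol, the 251-day annualization, the 20-day window, and the function names here as my assumptions.

```python
import math

TRADING_DAYS = 251

def realized_vol(closes):
    """Annualized zero-mean vol from the daily log returns of `closes`."""
    rets = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    return math.sqrt(sum(r * r for r in rets) / len(rets) * TRADING_DAYS)

def lagged_vrp(iv_by_day, closes, day, window=20):
    """IV observed on `day` divided by the vol realized over the next `window` days.
    Returns None when the forward window is not complete yet."""
    future = closes[day:day + window + 1]
    if len(future) < window + 1:
        return None
    return iv_by_day[day] / realized_vol(future)

# Toy data: a price that ping-pongs 1% a day and a flat 17.6% IV
closes = [100 * (1.01 if i % 2 else 1.0) for i in range(41)]
ivs = [0.176] * 41

print(round(lagged_vrp(ivs, closes, day=0), 2))   # above 1: IV beat the subsequent realized vol
print(lagged_vrp(ivs, closes, day=30))            # None: not enough future data yet
```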

[Lagged VRP was a core topic in last week’s the option market’s point spread]

This table once again partitions by trend ratio but now it displays the subsequent lagged VRP.

We do notice a preponderance of green amongst the XL sector indices suggesting the realized vols did in fact perform well vs the seemingly cheap IVs we spotted earlier.

But we did not see this hold for the broader indices. It also doesn’t mean that the IVs were absolutely cheap — after all, the lagged VRP’s on average are still higher than 100%…these options turned out to be fairly priced vs the subsequent realized vol.

Overall nothing stands out as blatantly interesting. At the same time, the non-finding is also not a dead end. This exploration is incapable of being conclusive. It’s a year of data in 43 names with overlapping windows which means it’s not much data at all. Furthermore, the sample sizes in these individual cells are also small. Sure, when the XLU trend ratio was between 1.5 and 1.6 the options turned out to be cheap the following month, but how often did that happen? That could be a sample size of 1. FXI and URNM never even experienced a trend ratio greater than 1.6.

Wrapping up

Like the last post, this was a demonstration of how to explore a vol idea. We started with the premise that realized vol measures can be a poor reflection of what an asset’s future volatility might be because they’re sensitive to the sampling period.

By choosing a different sampling period (which is what we effectively did by partitioning by trend ratio) we change the VRP which means changing how cheap or expensive the options appear. Then we see if our adjustments singled out vols that did in fact turn out to be mispriced.

There are so many parameters to play with. These were just a few from this post but you can imagine so many more:

  • Realized vols from different sampling periods
  • Different ways to compute realized vol (close-to-close vs range aware computations)
  • Any number of implied vols you can choose from the surface.
  • Asset classes, sectors, individual tickers

Personally, I start with ideas that make sense. If I measure realized volatility in 2 different ways and get vastly different numbers, it seems possible that the market might blend them incorrectly. Seems like a good place to look.

[💡General observation: The downside of an interpretable hypothesis is you’ll probably have company. There are quant funds that generate signals they don’t even understand. The downside there is that when it goes wrong, it’s harder to troubleshoot. But at least nobody else is likely to find the edge while it persists.

In any case, that style of trading sounds like alien hunting. I am incapable of putting myself into the mind of an alien to understand what it might do so it’s not a sport I’d think to play.]