stepping through an oil put option trade

Back on Oct 17th, I sold Z24 WTI (oil) 67 strike puts unhedged.

I explained my reasoning in this thread back when I did the trade. They were the equivalent to the USO Nov 15th 69 puts (I said Nov 22nd expiry but that was in error.)

I covered the WTI puts on Thursday morning. I published this thread when I covered them. Here’s a mildly edited version:

We’ll use the USO puts to write the post-mortem of a roughly 2 week trade. I hope it’s super educational.

Let’s start with the vibes.

Markets feel a bit, I don’t know, ahead of themselves. Everything but oil popping (til today). Rates & USD higher.

I’m less bullish on oil (and broadly bearish).

I cut oil length and bot t-bills this am.

Let’s get to the specifics of the “cutting length” trade because there are many ways to do that. But my directional bias coincided with wanting to buy back my short vol (the moontower mantra: don’t touch the options without a vol lens).

Why?

Vol has sufficiently relaxed in oil since I sold the puts. The put skew was at normal levels when I first sold the 30 delta puts but the vol was high. Since spot/vol corr was positive that made them seem extra high!

Today there is almost no skew in those (now) .29d puts
The implied vol is below realized (granted realized is on the high end of the range)
AND the election is getting negligible vol premium in oil

These pics show the negative VRP and negligible event premium assuming 35% fair base vol (which comes from eyeballing the USO vol term structure).

While that explains the how and why of covering the position, let’s understand the p/l attribution of the position while I held it. We do this with the same type of charts I’ve been showing post-mortems with.

I’ll narrate where your eyes should go so it’s easier to learn.

If the puts were hedged the trade was steadily profitable except for 2 days out of 11 when the market popped.

See the yellow boxes on the red line:

Every day we can see the contribution to the hedged p/l from 2 components:

realized vol p/l (tug of war between gamma & theta)
vega p/l (change in IV)

The yellow boxes are examples of the daily decomposition.
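
For readers who want the mechanics, here is a minimal sketch of how one day's hedged p/l can be split into those two components. It uses the usual second-order approximation (gamma, theta, vega). The function name and every Greek and market move below are hypothetical illustration values, not the actual numbers from this trade.

```python
def daily_hedged_pnl(position, gamma, theta, vega, dS, dIV_points, days=1.0):
    """Approximate one day's delta-hedged p/l for an option position.

    position   : +1 for long one option, -1 for short one option
    gamma      : option gamma per $1 move in the underlying
    theta      : option theta per day (negative for a long option)
    vega       : option vega per 1 vol point
    dS         : change in the underlying price over the day
    dIV_points : change in implied vol, in vol points
    """
    # Realized vol p/l: the tug of war between gamma and theta.
    realized = position * (0.5 * gamma * dS ** 2 + theta * days)
    # Vega p/l: the change in implied vol.
    vega_pnl = position * vega * dIV_points
    return {"realized": realized, "vega": vega_pnl, "total": realized + vega_pnl}


# Hypothetical short put on a big down day where implied vol also drops.
day = daily_hedged_pnl(position=-1, gamma=0.06, theta=-0.04, vega=0.07,
                       dS=-3.0, dIV_points=-5.0)
print(day)
```

With these made-up inputs the short option loses on the realized component and gains more than that on vega, which is the shape of day described in the next paragraph.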

Look what happened on Friday 10/25’s big down move…the hedged p/l was still positive. Yes, you got hammered on the realized p/l but vol got slammed! The put skew was in fact unjustified. The down move was what I call stabilizing to the market.

Down moves aren’t normally stabilizing, but my idea was that the Middle East conflict was driving the high vol. In this context a down move would be stabilizing, so the total vol was unjustified if oil went lower.

(Ofc I was naked short the puts so that Friday was still a tough p/l day because I experienced a rough delta p/l but overall it was buffered by the puts underperforming)

Over the life of the trade, on a delta hedged basis you would earn $.45 being short the option from $1.76. On an unhedged basis it was $.78. (I actually made more than that because I sold the options closer to $2 bc the stock was about the same place it is right now, $71.85)

More importantly let’s look at the cumulative p/l attribution:

Almost all of it came from vega. The option was well-priced from a realized vol point of view!

This all ties in to my general “short where she lands, long where she ain’t” gestalt. If vol is going to relax when it “gets there” then you don’t wanna use the option to bet it’s gonna get there. And if everyone thinks it’s “not going there” then when it does it will destabilize and you’ll wish you owned that option.

As a junior trader I remember selling calls bc “it’ll never get there”. I promise you there are many people who think like that. They don’t understand vol trading.

[An aside: That statement sounds more incendiary than I’d like. It troubles me I can’t fully articulate it. It’s a bit of an ink blot test. It’s understandable if you find that unsatisfying but the raw reality is indifferent to both of our dissatisfaction. In truth, there was a point of separation somewhere along the evolution of trading careers where as things got more competitive from the floor days to today, the traders who were copycatting disappeared into other parts of the business. When trading morphed from time/place advantage edge to positional edge it exposed the copycats’ lack of deep options understanding. Before Hollywood, there was a “me too” era in option pits across the land, where you only had to be savvy enough to identify who was smart and just make sure you yelled “buy’em” at the same time.

There are a lot of people who sound like they get vol trading. In fact I can’t fully imagine how hard it is for someone who’s not super experienced to tell the difference. The problem with codifying a “trading Turing test” is the same one interviewers have with candidates — as professional-grade info gets disseminated it’s hard to know if someone has earned it or parroted it.]

One of the savviest oil options traders I ever knew had a good formulation:

He’d buy those nominally cheap (but vol expensive) weeny puts when he was bullish. Because if the market dropped he just wanted to own what I call “the trap door” to protect what he really wanted to do…get balls long

In my own trading, I wanted to own the trap door. Whatever everyone thinks is impossible is the option I want. Stick’em in my back pocket and if it ever comes into play I’m the only one with 2 hands on the wheel.

I admit this instinct was much stronger when I was trading a big book and I don’t have it as much now (that’s another discussion altogether).

Anyway, I hope this was overall educational. There’s an art to this game called options. If anything, maybe it gets the ole mind bicycle spinnin’ in such a way that even if you don’t trade options, it can mentally upgrade your whole investment decision OS.

“negatively priced lunch”

Markku Kurtti is an engineer in the telecom world. His outsider quant take on portfolio construction is beautifully derived and intuitive.

I strongly recommend his interview with Corey Hoffstein:

🎙️Diversification is a Negatively Priced Lunch (Flirting with Models podcast)

His blog is also outstanding. I’ll point you to this post in particular:

How much skill a concentrated stock picker needs to beat a diversified benchmark? (17 min read)

I summarize key findings below (with the aid of an LLM). The “Moontower highlights” are direct quotes from my kindle.

The central theme:

For a stock picker to successfully manage a concentrated portfolio, they must generate sufficient alpha to overcome the inherent risks and volatility associated with fewer holdings.

Supporting points:

1) The Balance Between Concentration and Diversification

Concentrated portfolios inherently carry more risk due to idiosyncratic variance, or the unique risks associated with individual stocks. To overcome this risk, stock pickers need to generate enough alpha to offset the “variance drag” — the reduction in expected growth rate caused by high volatility.

🟡Moontower highlight: “Portfolio construction of a skilled stock picker is a compromise between enhancing alpha by concentration and mitigating idiosyncratic variance drag by diversification.”
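
A rough way to see the size of that drag is the common approximation that compounded growth is about the arithmetic return minus half the variance. The sketch below assumes an equal-weighted portfolio of N stocks with identical, uncorrelated idiosyncratic vol. That is a simplification for illustration, not necessarily the model in Kurtti's post, and the 30% idiosyncratic vol is a made-up input.

```python
def idio_variance_drag(idio_vol, n_stocks):
    """Approximate annual growth drag from idiosyncratic variance.

    Assumes equal weights and uncorrelated idiosyncratic returns, so the
    portfolio's idiosyncratic variance is idio_vol**2 / n_stocks, and uses
    the approximation: drag ~= variance / 2.
    """
    return 0.5 * idio_vol ** 2 / n_stocks


# Hypothetical 30% idiosyncratic vol per stock.
for n in (1, 10, 50, 500):
    drag = idio_variance_drag(idio_vol=0.30, n_stocks=n)
    print(f"{n:>4} stocks: drag ~ {drag:.2%} per year")
```

Under these assumptions the drag falls in proportion to 1/N, which is the hurdle a concentrated picker's alpha has to clear.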

2) Importance of Consistent Skill and Alpha Requirements by Stock Size:

Different types of stocks require varying levels of alpha to beat the benchmark. Larger, more stable stocks typically require less alpha than smaller, more volatile stocks. Consistency in skill is crucial, as erratic performance increases the minimum alpha required to compensate for the higher risk.

🟡Moontower highlight: “Assuming perfectly consistent stock picking skill over time, 10-stock big stocks portfolio has historically required roughly 0.5 percentage point (pp) annualized alpha, small stocks ~1pp and micro-caps ~2pp. High E/P, E/B, Mom and B/P styles, in the universe of all stocks, have required roughly ~1pp and low E/P, E/B, Mom and B/P styles north of ~2pp. Low E/P style (smallish growth stocks with low profitability) have required the highest 2.55pp alpha.”

3) Risk of Concentration Without Skill

Concentration magnifies returns but also heightens risks. Without genuine stock-picking skill, a concentrated portfolio becomes increasingly likely to underperform over time. The post cautions against relying too heavily on concentration to boost returns without sufficient alpha.

🟡Moontower highlight: “But concentration is risky. If you concentrate and don’t have genuine stock picking skill, time will be your enemy.”

4) Circle of Competence and Style Diversification

The post emphasizes the value of investing within one’s “circle of competence” — areas where the investor has the most knowledge or advantage. However, it also warns that focusing exclusively on a single style exposes investors to style risk.

5) Predictability of Variance Drag Over Return

Idiosyncratic variance drag, the penalty for concentrating in fewer stocks, is more predictable than expected returns.

🟡Moontower highlight: “Idiosyncratic variance drag differences are easier to predict than expected return differences. It is therefore safer to increase diversification, which reliably decreases minimum alpha requirement, than to increase portfolio concentration to enhance uncertain alpha.”

🟡Moontower reference: The idea that volatility is more predictable than returns is a foundational principle in portfolio management. See Know Nothing Sizing

6) Lottery Preference in High Variance Styles

Some investors are attracted to high-idiosyncratic-variance stocks with potential for lottery-like returns leading to lower forward-looking returns.

🟡Moontower highlight: “Some investors may prefer stocks that may pay off big and this is exactly what idiosyncratic variance delivers: large dispersion of returns among individual stocks.”

🟡Moontower reference: See A Recipe For Overpaying for a succinct explanation by Chris Schindler.

7) Takeaway on Diversification for Risk Management: Diversification not only reduces variance drag but also lessens reliance on unpredictable alpha.

🟡Moontower highlight: “Our take away is that idiosyncratic variance drag is much more predictable than expected return. More generally, it is easier to predict variance than mean return. It is therefore safer to diversify more as it will reliably bring down idiosyncratic variance drag compared to concentrating more in a hope of higher alpha.”

It’s a love letter to diversification mixing words and math. For what it’s worth, at SIG Jeff Yass also called diversification a free lunch.

I’m partial to my Sun/Rain example in You Don’t See The Whole Picture which is an even stronger statement — you are incinerating money by not diversifying but if you evaluate yourself by “resulting” you won’t see it. That’s because the highest bid for risk is the most efficient at absorbing it. This is deeply true in the derivatives world. In the broader investment landscape it’s confounded by info asymmetry, principal-agent conflict, and the comfort of (perceived) safety in herding.

If you want to get deeper into this idea see the back half of the moontower guide:

🟰Understanding Risk-Neutral Probability (link)

But be aware…“diversification always means having to say you’re sorry” since something is always losing.

And sometimes almost everything loses. This was Wednesday. Eww.

ETF slop

On the investing front there is an absolute explosion of new ETFs being listed every month.

Dave Nadig gave a presentation for Kitces.com and summarized the key points in:

The ETF Market: A Zine (14 min read)

A few notable takeaways:

ETFs have become a behemoth of $10 Trillion in assets across some 4,000 products.
That growth has come largely at the expense of traditional active equity mutual funds, although the worst of that outflow seems to have abated a little. As every asset manager on the planet finds a way into the ETF market, the “horse race” between mutual funds and ETFs matters less and less.
Traditional Mutual Funds will exist forever thanks to 401ks, or until someone rewrites the entire US retirement system.
The industry is on a massive product development binge, launching 650 ETFs this year so far with an open/close ratio of 3:1.
Over 40% of industry revenue comes from products that aren’t cheap beta.
There are more ETF Brands now than there were ETF Tickers 20 years ago

The post is directed at financial advisors but hands-on individual investors should certainly read it.

And if you’re interested, there is even a “how to launch your own ETF” discussion including a link to Corey Hoffstein’s tutorial.

One of the comments describes the post well:

A wonderfully-written, comprehensive, and refreshing time piece about the real story of ETFs for all – pro’s or not!

All this financial, umm, innovation does get a little chuckle from me (levered exposure to individual stocks? Really? It’s like the ghost of single stock futures haunting your watchlist).

The chuckle:

I’m not the only wiseguy feeling this way. This wiserobot is less lazy than me in its skepticism:

The thread continues…

Shorting all this nonsense (uncle nonsense reporting for duty) vs going long whatever it’s trying to replicate directly is a labor-intensive way to effectively pay yourself the embedded management fees. But the feasibility is predictably undermined by borrow costs.

But as a trader, it’s a useful reflex to:

Observe the growth of “product” incentivized by fees and lowered barriers to entry
Expect a bunch of trash to be launched with the logic of “it’s a call option on asset gathering”

It can inspire trade ideas from a place of maximal interpretability — you can’t launch all this stuff and expect none of it to be steaming hot turds.

Dave even warns you about what’s coming to the crap carousel:

I have been asked about getting private equity and credit into ETFs every single week this year so far. I’ll just put the marker down here again: this is a bad idea. YES, it is the case that we have broken market capitalism so badly that the majority of what we would recognize as actual capital allocation and risk taking happens privately. NO, that is not a good thing, and it does not mean we should shove all that private capital into daily-liquidity structures like ETFs.

The money currently trapped in private markets is desperate for liquidity so it can invest back into greener deals where there’s more profit runway. That money will push, and push, and push until it finds a new pile of money to sell to. Don’t fall for it. Be super skeptical.

The “world’s worst time traveler” investing style

[Embedded video: Nate Bargatze stand-up clip via @lastoneslaughing]

I’m going to share 2 studies, one new and one old that say something that is counterintuitive to most people but probably not to traders:

Even with perfect foresight of market movements, there’s no guarantee you’ll be a better investor.

The old one first:

Even God would get fired as an Active Investor (7 min read)
via Alpha Architect

This post is from 2016. It demonstrates how a hypothetical portfolio built with perfect knowledge of the top-performing stocks over the next 5 years yielded impressive returns (29% CAGR) but also experienced significant volatility and a 76% drawdown.

Even a ‘perfect’ long portfolio can bring a long-only investor a ton of pain.

The hypothetical long/short portfolio, again with perfect foresight, achieved a remarkable 46% CAGR but still faced a 47%+ drawdown.

For investors who are benchmarked the news is still tougher — The god portfolio still underperformed SPY for extended periods that make it hard to stick with.

When a Crystal Ball Isn’t Enough to Make You Rich (20 min read)
Elm Wealth

Victor Haghani and his team discuss an experiment where participants were given historical front pages of the Wall Street Journal and tasked with trading stocks and bonds based on the news.

Hijinx ensue.

The majority of participants, despite having access to “future” news, failed to generate substantial profits. Many even went bust. This highlights the difficulty of translating information into profitable trading decisions. (It’s why opinions are worthless. The question is always “ok, what’s the trade?”)

Notable findings:

Trade-Sizing Crucial: The disappointing results stem primarily from poor trade-sizing decisions. Participants often overleveraged, leading to significant losses when their predictions were wrong.
Experience Matters: Seasoned traders fared significantly better (they even got a senior Jane Street trader to try it), demonstrating the importance of experience in interpreting information and managing risk.

It’s not shocking that Haghani, one of the principals of LTCM back in the day, reminds us that there is little value in the crystal ball without sensible trade-sizing.

You can try the game yourself:

🔮Crystal Ball Challenge

Haghani is also half the duo behind the famous Haghani-Dewey study where economists and investment folks, many with graduate degrees, embarrass themselves with their inability to size bets on a coin weighted in their favor.

You can play that game too:

🪙Elm Wealth Coin Flip Challenge

My synopsis of it was a very popular post when I published it 2 years ago (see Bet Sizing Is Not Intuitive) because the conclusion is profound:

Like these crystal ball/god studies, prediction is just not enough. Betting and trading require a far richer set of practices than just having an edge. Edge is a necessary but insufficient criterion for sustained success.

In the spirit of these games, I’ll remind you of this riddle from Philip Maymin’s Financial Hacking (GOAT-tier trading book; if I’m ever tasked with developing a firm or education department, this is required reading).

This is excerpted from my extensive guide to the book:

🧩How much would you pay to know the closing price of SP500 in one month?

I can tell you where the SP500 will settle in one month. How much would you pay for this information? (And then, what would you do with it?)

Let’s say you give a number like $10 million, and I accept it. The S&P 500 is currently at 1000. I gaze deeply into your eyes and tell you the truth: in one month, the S&P 500 will close that day’s trading at a level of…. 1000. Oops! Now what? How are you going to make money? You owe me $10 million in a month, and I will collect. There is no point in buying or selling futures at the same price at which you expect them to expire. So what can you do? [He doesn’t mention the strategy of announcing your shot on social media and using it to gain followers. The value of this will depend on whether you have something to monetize or follow it up with…and if you do not already have a following it’s likely you don’t have skill in monetizing one so again the value of the follower windfall depends on its beneficiary]

All you can do is hope the market moves in the meantime, and it really is a hope, because you have no other information about what is going to happen over the course of the next month, not the volatility, nor the volume, nor the highs and lows. All you know is that it will be at 1000 again a month from now.

So how do you time your entry points? Say you have $1 million of liquid assets and say that this much money would let you support up to $10 million in notional, because futures have a haircut of about 10 percent.

Suppose you are very lucky and the S&P 500 jumps down to 900 before you even have a chance to put in your order. Now you would want to buy. But how much? Do you put your entire amount on the line, such that even a single tick against you triggers a margin call?

Ultimately you can perhaps do best if you are able to buy and sell options, but there won’t always be a liquid options market at every strike you need at the asset that you want to trade, and besides, we haven’t really discussed options yet. [Kris: This is actually the key — you could use options to structure a bet on terminal value but this riddle in general is insightful because it shows just how much you are missing if you don’t understand options]

This is the exclamation point on the matter:

These kinds of practical issues are ignored in standard textbook discussions of riskless profit opportunities but they are precisely the issues that financial hackers worry about most. And you will almost surely never experience anything with this level of certainty at any time in your career. There will always be doubts about your model, your inputs, and your forecasts…According to standard theoretical concepts of arbitrage, none of those questions matters. According to real-world practical experience, you can’t even begin to trade until you have answered all of them.

Volatility term structure from multiple angles (part 2)

In part 1 of Volatility term structure from multiple angles we opened by discussing how nearer dated implied vols move around more than deferred implieds. Recognizing that dynamic, our net vega position for a time spread can be ambiguous.

Just as stock traders use beta to normalize risk to a benchmark such as SPX, volatility traders will normalize their vega to a fulcrum month. √t scaling corresponds to a model world where time spreads between months remain relatively stable. It’s not reality but it’s a vast improvement over summing raw vegas.

In comparing vols between 2 months, vol ratios are popular. If M1 is 18% vol and M6 is 20%, the vol ratio is 90%. It’s a measure of how steep the term structure is. If you track the ratio for constant maturities then you can get a quick sense of the relative supply/demand for IV. If the ratio is less than 1.0, the term structure is ascending, a shape typical of “it’s quiet now, but we expect mean reversion to typical higher levels of vol”. A downward sloping vol curve is more closely associated with high vol periods or the market’s anticipation of an event such as earnings or the election.

Vol ratios are only one way to measure the slope of the term structure. We saw that implied forward vols are a complementary measure that also describes the relationship between 2 volatilities on the term structure. The computation tells what volatility is baked into the period between the 2 expirations. The logic is that the deferred expiration accounts for all the volatility from now until the option’s last trading date while the near-dated expiry isolates the early period’s expiration. If you consider an extreme example where the time spread is worth 0, ie the deferred option and the nearer-dated option are the same price, the forward vol is zero.

So why look at 2 measures, vol ratio and implied forward vol, if they both tell us about the relative price of implied vol on the term structure?

Remember, we were looking at GLD 1M/6M vols for the 1-year range 10/2/23-9/23/24:

We saw:

The forward vol is sometimes high and sometimes low regardless of the ratio!

But look at early March — not only was the vol ratio low, the forward got crushed. If you only look at vol ratio, you missed this.

Implied forwards are an orthogonal or complementary measure of relative volatility that is additive to your perspective.

Today, we will dive further into the relationship between vol scaling, forward vols, and ratios. We will come out on the other side with what this all means for finding trades and managing risk.

Off we go…

Constant Straddle Spread vs Forward Vol

Suppose you are long a time spread.

We’ll say a straddle spread because that’s a common trade expression and it also connotes delta-neutrality.

[It’s probably helpful in thinking about these things to not have to process “call”, “put”, and their associated directionality. Depending on where you are in your learning I’m trying to be mindful of cognitive load.]

You’re long a 6-month straddle for 15% vol and short the 1-month straddle at 15% vol. Front month vol suddenly spikes a point to 16% and the 6-month vol increases by 1/√t (ie 1/√6) or .41 vol points to 15.41%.

Here’s what we know:

The term structure went from flat to descending.
You lost 1 vol point on your 1-month straddle, gained .41 on your 6-month straddle. Because the 6-month straddle has 2.45x as much vega, you broke even.
The straddle spread is unchanged.

What happened to the forward vol?

You can compute it yourself with the moontower calculator but I’ll just tell you…the implied forward vol increased to 15.3%
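
If you'd rather check that number than take it on faith, here is a minimal sketch. It assumes the standard additive-variance definition of forward vol (total variance to the far expiry minus total variance to the near expiry, spread over the gap). The function name is hypothetical and this is not necessarily how the moontower calculator is implemented.

```python
import math


def implied_forward_vol(near_vol, near_t, far_vol, far_t):
    """Forward vol between two expiries, assuming total variance is additive.

    Vols are annualized decimals. Times can be in any unit (months here)
    as long as both use the same one.
    """
    fwd_var = (far_vol ** 2 * far_t - near_vol ** 2 * near_t) / (far_t - near_t)
    if fwd_var < 0:
        raise ValueError("negative forward variance: inputs are inconsistent")
    return math.sqrt(fwd_var)


# Flat 15% curve, then M1 up 1 point and M6 up 1/sqrt(6) points.
before = implied_forward_vol(0.15, 1, 0.15, 6)
after = implied_forward_vol(0.16, 1, 0.15 + 0.01 / math.sqrt(6), 6)
print(f"{before:.2%} -> {after:.2%}")
```

The second number rounds to the 15.3% quoted above.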

Let’s summarize what happened:

You are long raw “click” vega since you own the longer-dated option
You are flat weighted or scaled vega if we use √t scaling with reference to M1
Vol across the curve increased in proportion to √t weights leaving the straddle spread unchanged
The forward vol you are long increased, although your p/l is unchanged.
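
Here is a small sketch of the "straddle spread unchanged" part. It prices each straddle with the rule of thumb 0.8 × S × vol × √T, which makes vega proportional to √T. The $100 underlying is a hypothetical input and the helper names are made up for illustration.

```python
import math

S = 100.0  # hypothetical underlying price


def straddle(vol, months):
    """ATM straddle via the rule of thumb 0.8 * S * vol * sqrt(T in years)."""
    return 0.8 * S * vol * math.sqrt(months / 12)


def spread(front_vol, back_vol):
    """Long one 6-month straddle, short one 1-month straddle."""
    return straddle(back_vol, 6) - straddle(front_vol, 1)


bump = 0.01  # front month up 1 vol point
before = spread(0.15, 0.15)
after = spread(0.15 + bump, 0.15 + bump / math.sqrt(6))

# Under this rule of thumb the price is linear in vol, so the price
# difference per unit of vol is proportional to vega.
raw_vega_ratio = straddle(0.01, 6) / straddle(0.01, 1)

print(f"spread before {before:.4f}, after {after:.4f}")
print(f"6M vega / 1M vega = {raw_vega_ratio:.2f}")
```

The spread price is the same before and after the shift because the 6-month leg has √6 times the vega but moves only 1/√6 as much.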

The key point to appreciate:

A constant straddle spread price does NOT mean the forward vols are also constant.

Another way to say this:

The same straddle spread price can yield different implied forwards!

Implied Forward Vol: a “many-to-one” relationship

For any implied forward vol there are many pairs of vols that can produce it. This is why the straddle spread and forward are not overlapping measures of term structure.

The table shows:

many combinations of vols that generate a 15% forward
√t or constant straddle spread scaling means the forward is changing
The scaling would need to be sub √t (ie more muted) for the forward to not change. In fact, in a vol increase scenario, the straddle spread price would need to narrow for the forward to be unchanged.
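
A table like that can be rebuilt by running the forward vol formula in reverse. The sketch below assumes additive total variance and solves for the 6M vol that keeps the forward at 15% for a few hypothetical 1M vols. The function name and the chosen 1M vols are illustrative, not the values in the table.

```python
import math


def far_vol_for_forward(near_vol, fwd_vol, near_t=1, far_t=6):
    """Far-dated vol implied by a near vol and a target forward vol."""
    total_var = near_vol ** 2 * near_t + fwd_vol ** 2 * (far_t - near_t)
    return math.sqrt(total_var / far_t)


for m1 in (0.10, 0.15, 0.20, 0.25):
    m6 = far_vol_for_forward(m1, fwd_vol=0.15)
    # Straddle spread in "vol * sqrt(months)" units (price up to a constant).
    spread = m6 * math.sqrt(6) - m1
    print(f"1M {m1:.1%}  6M {m6:.2%}  spread {spread:.4f}")
```

Holding the forward fixed, the spread shrinks as the 1M vol rises, consistent with the last bullet.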

Observing GLD vol data for the past year we can see the many-to-one relationship between:

a) vol ratio and forward vol

For any given vol ratio there are many forward vols. It’s not a function.

b) vol pairs and forward vol

For any given forward vol, the 6M vol can be a fairly tight range while the 1M vol could be far above or below the 6M vol!

You can see how the forward vol and the 6M vol are positively correlated. This makes sense — the forward vol is driven by 5 out of the 6 months in the tenor.

You can also see that at low (high) 6M vols the 1M vol tends to be even lower (higher).

Linking the scaling relationship to vol of vol

Remember that weighting our vega by month is analogous to beta weighting a stock portfolio. Beta weighting summarizes a portfolio’s market exposure with respect to SPX or some other benchmark. Looking at raw unweighted vega is like ignoring the relative volatilities between stocks in a traditional portfolio. $1mm of risk in SPY is not the same as $1mm of NVDA.

√t weighting is therefore suggesting a vol of vol with respect to the fulcrum month (in our examples M1). If the back month vol is more volatile than √t suggests then our long straddle spread is long raw click vega AND weighted vega. If M1 vol increases by 1 point and M6 increases by .75 points, then the straddle spread and forward are expanding quickly. The beta of the back month vol to the front is high.

We can regress GLD 6M vol vs 1M vol to see the sensitivity. Remember a sensitivity of .41 would be √t or constant straddle spread scaling.

√t scaling seems to underestimate the beta of 6M vol to 1M vol. A long straddle spread would be long weighted vega not just raw vega.

We can also compute the standard deviation of vol changes to see the vol of vol. You can see that the empirical 6M vol of vol is less than 1M vol of vol (totally expected) but it’s higher than what √t predicts.

If you truly wanted to be flat weighted vega you’d need to ratio the spread to have fewer units of the deferred straddle.
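
A minimal sketch of that adjustment: estimate the beta of 6M vol changes to 1M vol changes with a least-squares slope, then size the deferred leg off that beta instead of √t. The vol changes below are made-up numbers, NOT real GLD data, and regressing daily changes (rather than levels) is an assumption of this sketch.

```python
import math


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


# Made-up daily vol changes in vol points (NOT real GLD data).
d_vol_1m = [1.0, -0.5, 2.0, -1.5, 0.5, -1.0]
d_vol_6m = [0.6, -0.2, 1.1, -0.8, 0.2, -0.6]

beta = ols_slope(d_vol_1m, d_vol_6m)
sqrt_t_beta = 1 / math.sqrt(6)  # ~0.41, the constant straddle spread case

# Units of 6M straddle per 1 short 1M straddle for flat weighted vega,
# taking raw 6M vega as sqrt(6) times raw 1M vega.
units_6m = 1 / (math.sqrt(6) * beta)
print(f"beta {beta:.2f} vs sqrt-t {sqrt_t_beta:.2f}; hold {units_6m:.2f} units of 6M")
```

When the estimated beta is above 0.41, the flat-weighted-vega spread holds less than one deferred straddle per front straddle.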

Takeaways and Discussion

Vol ratios are common ways to represent the steepness of a term structure.

A tradeable expression of a vol ratio is the price of a vega-neutral straddle spread (using our example from today, if you buy 1 6M straddle and sell 2.45 1M straddles). Its performance mimics the vol ratio chart at the beginning of the post!

Ratios, just like beta-weighting, sterilize a position from the price level to isolate the slope.

You can even track or trade straddle butterflies to bet on the curvature of the term structure.

These are popular ideas from the futures and yield curve worlds:

Methodology – Yield Curve Spreads — https://yieldcurvespreads.com/methodology/

From that picture you can see how you might not have a strong opinion on A vs C (a slope bet that would be expressed in a ratio trade) but on the curvature between A and C (which would be expressed via butterfly or a “spread of spreads”).

The forward vol is an orthogonal measure

Implied forward vol changes even if the straddle spread doesn’t. We saw the many-to-one relationship between forward vol and the pairs they come from as well as the vol ratio.

💡Aside on Pairs trading💡

We’ve been talking about vol ratios along the term structure but we can extend our use of forward implied vols to inter-asset or pair trades.

In moontower.ai we have this vol ratio tool. Here USO vol looks rich compared to XLE.

But you could also compare forward vol to forward vol difference in a matrix view.

These are cobbled together from 2 screenshots but you can imagine a UI which shows a matrix of all the individual forwards. Those implied forward ratios could then be compared to a vol cone of realized vol ratios between XLE and USO.

The orthogonal nature of implied forwards gives you another set of data to run through all your conventional views to see if something stands out.

I’ve mentioned many times in my writing how I hate vol trades that start with “skew is cheap or expensive”. My experience is that the “skew knows”. It’s highly self-fulfilling. Plus implied skew doesn’t vary as widely as realized skew, so you’re forcing convergence trades on a compressed implied range that doesn’t compensate for how sloppy vol can get on destabilizing moves.

Term structure trades on the other hand. This is the place to look.

Risk limits

Deferred vols are less volatile than near dated vols. It’s important to re-scale the vega per month. √t is as sensible a choice as any but your survival shouldn’t depend on any particular choice mattering that much since it will be wrong.

Just as you would never trust all of your delta risk management to the concept of beta, vol scaling weights should be taken with a grain of salt both in terms of modeling changes in term structure and in determining risk limits.

As a risk manager, if you constrain net vega without also constraining gross vega (ie the absolute value of vega within each expiry) you are inviting a situation where a book looks flat based on some weights but masks giant time spreads underneath the surface.
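
To make that concrete, here is a toy book that looks flat on √t-weighted net vega while carrying a large time spread. The vega numbers are hypothetical and the weighting function is just the √t convention from this post with M1 as the fulcrum.

```python
import math

# Hypothetical book: raw vega ($ per vol point) keyed by months to expiry.
book = {1: -100_000, 6: 245_000}


def weight(months, fulcrum=1):
    """sqrt-t weight relative to the fulcrum month."""
    return math.sqrt(fulcrum / months)


weighted = {m: v * weight(m) for m, v in book.items()}
net_weighted = sum(weighted.values())
gross_weighted = sum(abs(v) for v in weighted.values())
print(f"net {net_weighted:,.0f}  gross {gross_weighted:,.0f}")
```

A net-only limit would wave this book through. A gross limit per expiry would flag it.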

Examining vol of vol directly as well as placing term structures into context with vol cones can offer an ensemble view to understanding how extreme time spreads can get.

💡A word on measuring vol of vol💡

In this post, I computed the standard deviation of vol changes. Any experienced trader knows that this is incomplete. Because of spot-vol correlation and skew, vol changes are not pure. They are a mixture of moving along the curve vs the curve shifting.

Always ask yourself — “what measure correlates to how my p/l performs?”. That’s what you care about.

In this case, you want to measure the vol of strike vol not some nebulous concept of floating ATM vol.

Extending the analysis

Gold doesn’t typically see much event vol priced into it. Some macro reports like unemployment or key Fed meetings will get their bump in the term structure but it’s not the same degree as say earnings for a single stock or the annual USDA Prospective Plantings report in ags like corn and soybeans. These sorts of events can have 1 or 2 weeks of vol priced into a single day.

Event variance propagates through the term structure in inverse proportion to the DTE. To say it in friendlier terms…they don’t impact the deferred months much relative to the fronts.
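
A quick sketch of why the deferred months barely notice. It treats the event day as worth some number of ordinary days of variance and spreads the extra variance over the days to expiry. The 20% base vol, the 5-day event weight, and the function name are all hypothetical.

```python
import math


def vol_with_event(base_vol, dte, event_days=5.0):
    """Implied vol for an expiry containing one event day.

    Treats the event day as worth `event_days` ordinary days of variance
    (a hypothetical figure), so total variance is that of
    (dte - 1 + event_days) ordinary days, annualized back over dte days.
    """
    return base_vol * math.sqrt((dte - 1 + event_days) / dte)


for dte in (5, 21, 63, 126):
    iv = vol_with_event(base_vol=0.20, dte=dte)
    print(f"{dte:>3} DTE: {iv:.1%} (+{(iv - 0.20) * 100:.1f} vol points)")
```

The extra variance is fixed, so its share of total variance shrinks as DTE grows.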

I will re-run all these charts for another name (thinking NVDA) for the past year and see what I come up with. I expect a lower vol beta than we saw in GLD but mostly as an artifact of the near-dated vols having a much wider range because of earnings.

Options Riddle

I saw a familiar type of riddle on Twitter that was directed at fundamental PMs. I gave a lazy answer and later improved it with a better answer after my half-assed-ness gnawed enough at me.

I’ll reprint the riddle and the better answer here but spelling out the steps in greater detail than I did on twitter.

Question:

Estimate the price of a $180 call (20% OTM) on a $150 stock with 50% volatility, 3 months to expiry

150 Call Calculation (The ATM option)

We start by estimating the at-the-money (ATM) call value using:

ATM straddle = .8 * stock price * implied vol * √(Time to expiry in years)

ATM Call = .4 * stock price * implied vol * √(Time to expiry in years)

ATM Call = 0.4 × $150 × 50% × √(1/4) = $15

180 Call Calculation (The OTM option)

The 150/180 call spread links the 150 call to the 180 call.

Call Spread Value Breakdown

The call spread’s total probability of expiring ITM is around 45%. This is another estimate off the top of my head.

Although you’d expect 50%, option models assume lognormal stock distributions because returns are compounded. Compounded or geometric returns are subject to “volatility drain”— pulling median price expectations lower than the forward price.
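For anyone who wants to check the 45% guess, here is a small sketch. It assumes a Black-Scholes lognormal with zero rates and dividends (the riddle doesn't specify them), where the probability of finishing above a strike is N(d2). The 16% vol case is just a lower-vol comparison.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def prob_finish_above(spot, strike, vol, years):
    """Risk-neutral P(S_T > strike) for a lognormal with zero rates: N(d2)."""
    sd = vol * math.sqrt(years)
    d2 = (math.log(spot / strike) - 0.5 * sd * sd) / sd
    return norm_cdf(d2)

p_up_50 = prob_finish_above(150, 150, 0.50, 0.25)  # the riddle's vol
p_up_16 = prob_finish_above(150, 150, 0.16, 0.25)  # lower vol for comparison

print(f"P(S > 150) at 50% vol: {p_up_50:.3f}")
print(f"P(S > 150) at 16% vol: {p_up_16:.3f}")
```

At 50% vol the probability lands right around 45%, and the drag gets smaller as vol falls.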

You can think of the expected value of the $150/$180 call spread in two parts:

The probability that it expires worth its maximum value of $30. This is P(S>$180)
The value on average when the stock expires between $150 and $180.
The probability of this case is 45% – P(S>180)

Computing P(S>180)

Note that the straddle is simply 80% of a standard deviation.

The $180 call is conveniently $30 OTM or .80 standard deviations OTM

We know that 1 standard dev encompasses 68% of a normal distribution, so at a z-score of +1.0 the one-tailed probability must be 16%

Spelling that out: 100% – 68% = 32% but we only care about the “up” case when the call is ITM, so we cut that in half to 16%.

Since this exercise is supposed to be all mental math, I’ll guess that a Z-score of 0.80 gives a one-tail probability of ~ 20%, meaning there’s a 20% chance this call will expire in the money (ITM).
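A quick check of those mental estimates, treating the move as a plain z-score on a standard normal, which is the same simplification the mental math makes:

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

tail_at_1 = upper_tail(1.0)   # the "16%" rule of thumb
tail_at_08 = upper_tail(0.8)  # the "~20%" guess

print(f"P(Z > 1.0) = {tail_at_1:.3f}")
print(f"P(Z > 0.8) = {tail_at_08:.3f}")
```

Both guesses are within about a point of the exact normal tail.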

We will assume the 180 strike has P(ITM) = 20%

Expected Value Calculation for the 150/180 call spread

The case where stock > 180
E(call spread | S>180) = Max value x P(S>180) = $30 x 20% = $6
Case where S is between $150 and $180
E(call spread | 150<S<180) = Average value of the call spread when S is between the strikes x P(stock between 150 and 180) =

$15 x 25% = $3.75

💡Why $15?

The average roll of a die is 3.5

The average roll of a die given that the roll is greater than ‘3’ is 5. This assumes a uniform distribution over that range.

This same style of approximation works well enough for the call spread. Assuming the stock expires between 150 and 180, the call spread is worth $15 on average. The probability it expires between those strikes is the total probability of the stock expiring higher than $150 which I estimated earlier as 45% minus the probability of it roofing above $180 which we estimate at 20%. So the probability of the stock being between 150 and 180 is about 25%

Hence, $15 x 25%

We sum all scenarios where the call spread expires ITM (ie when the stock is above $150):

Call spread estimate: $6 + $3.75 = $9.75

If the 150 call is worth $15 and the 150/180 call spread is worth $9.75, then the 180 call is worth $5.25
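The whole chain of estimates fits in a few lines. This sketch just replays the arithmetic above. The two probabilities are the mental guesses from the text, not model outputs, and the variable names are mine.

```python
import math

spot, strike, vol, years = 150.0, 180.0, 0.50, 0.25

# Step 1: ATM call from the straddle approximation (.4 * S * vol * sqrt(T)).
atm_call = 0.4 * spot * vol * math.sqrt(years)

# Step 2: mental probability estimates from the text.
p_above_150 = 0.45
p_above_180 = 0.20

# Step 3: expected value of the 150/180 call spread in two pieces.
width = strike - spot
max_value_piece = width * p_above_180                      # stock finishes above 180
between_piece = (width / 2) * (p_above_150 - p_above_180)  # stock finishes in between
call_spread = max_value_piece + between_piece

# Step 4: back out the 180 call.
otm_call = atm_call - call_spread

print(f"ATM call {atm_call:.2f}, call spread {call_spread:.2f}, 180 call {otm_call:.2f}")
```

Laying it out this way makes it easy to see how sensitive the answer is to each guess. Nudging either probability by a couple of points moves the 180 call by well under a dollar.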

Recapping key bits:

Knowing the ATM straddle approximation .8SV√T
Guessing that the probability of a move greater than .8 standard deviations is ~ 20%
Estimating that the probability of the stock going up is less than 50% in a Black Scholes price process (and that at 50% vol that probability is lower than say at 16% vol — in fact the drag is proportional to vol squared)

In the twitter discussion, a great link from 2012 emerged:

Calculating option prices in your head (7 min read)

The Hardy Decomposition offers a handy way to estimate OTM option prices in your head. By breaking down an option’s price into intrinsic value and a HardyFactor (which depends on how far you are from the strike, measured in standard deviations), you can quickly approximate the time value of the option.

The following comes from the post:

Option Price = Intrinsic + ATMPrice*HardyFactor

The HardyFactor is:

d1 is just how many standard deviations you are from the strike.

⚠️Looking at a quant forum it looks like the HardyFactor approximation is for options being priced with the ‘normal’ distribution version of the B-S model as opposed to the more commonly used lognormal version

Revisiting the riddle

If we revisit the riddle, we know the 180-strike has a d1 = .8 standard devs

If we linearly interpolate between .5 and 1 we get a HardyFactor = 40%

Option Price = Intrinsic + ATMPrice*HardyFactor

180 call = 0 + $15 * 40% = $6

My call spread method yielded $5.25

The HardyFactor method (quickly) got us to $6.00

Sounds like we have a decent market!

I put it into an option calculator:

https://www.cboe.com/education/tools/options-calculator/

Pretty fun stuff. If the OTM call IV is discounted by 1 vol point (so -2% skew vs the 50% ATM IV @ the .27 delta option) then the theoretical call value would be $5.616 – .2575 (ie the vega) ~ $5.36
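If you'd rather check this without the web calculator, here is a bare-bones Black-Scholes sketch. The calculator's rate and dividend inputs aren't shown, so this assumes zero for both, which means the output will differ a little from the $5.616 quoted above.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_call(spot, strike, vol, years, rate=0.0):
    """Black-Scholes call price and delta (no dividends)."""
    sd = vol * math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / sd
    d2 = d1 - sd
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)
    return price, norm_cdf(d1)

price_50, delta_50 = bs_call(150, 180, 0.50, 0.25)
price_49, _ = bs_call(150, 180, 0.49, 0.25)  # OTM vol marked 1 point lower

print(f"180 call at 50 vol: {price_50:.2f} (delta {delta_50:.2f})")
print(f"180 call at 49 vol: {price_49:.2f}")
print(f"value of 1 vol point: {price_50 - price_49:.3f}")
```

Repricing at 49 vol rather than subtracting the vega gives nearly the same answer, since one vol point is a small bump.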

If you want more reinforcement on this I wrote a thorough twitter thread explaining vertical spread comprehension in detail.

learning options from scratch

A shortcoming I feel in my writing is none of my posts make for a good calling card. There’s no obvious “banger”. A friend described moontower as a “slow burn”. If you read it consistently you feel like you know me (and you do…I don’t have the energy or Huberman-esque levels of productivity to get me in trouble so yea WYSIWYG).

To me, it feels, in a good way, like one long conversation. (I hope the more you read it the more you’re attached to it. A literary stuffie.)

But because I often feel like I’m picking up where we left off, I realize I don’t address some basic info. For example, we cover a lot about options and trading. But unless you are part of the 40 people who first started reading moontower 5+ years ago you’d never know that I haven’t covered the absolute basics of options.

The closest thing to it is a non-exhaustive list of Arbitrage Identities. Even this is not hockey-stick diagram basic.

My recommendation for basics is to check out all the free educational stuff on the Option Industry Council website:

www.optionseducation.org

Within, you shall find the OIC Academy

You’ll need to create a login and password to access the educational materials, but it’s free and excellent.

The OIC education initiative is led by Mat Cashman. We’ve had several conversations recently because I’ve been looking for a place to point others for a good foundational education.

Mat has 20 years of experience in options as a trader and market-maker. (We started the same year).

He started his career on the trading floor of the Chicago Board Options Exchange in 2000 and has since traded multiple asset classes across a wide array of exchanges including the CME, CBOT, and the Eurex Exchange. In 2005, Mat helped launch the London trading desk of DRW, a Chicago-based options trading firm, and was instrumental in building the DRW presence in the London trading community. After his time in London, Mat returned to Chicago to join Toro Trading, and was quickly named a partner, overseeing all aspects of their growing U.S. Index Options trading business. During Mat’s time in the options industry, he has always been heavily involved in training and overseeing new traders and has created multiple options education programs along the way to share his knowledge of trading and optionality with people new to the industry. At OCC, Mat is responsible for providing support to a comprehensive options resource center that provides information and education about options. In addition to his responsibilities with OCC, Mat also serves as an instructor of The Options Industry Council (OIC), conducting option seminars and presenting online webinars to all segments of the investing community, including registered representatives and advisors as well as individual investors.

He knows options inside-out. And his mandate is to be a resource for the option industry. You can hit him up on LinkedIn. You can tell him I told you to bug him.

my aversion to trading implied skew

First of all, free subs to moontower.ai can access a few tools and reading materials as well as the community but they cannot post and can’t see analytics.

Here’s a question that was posted in the community this week:

I was reading thru an old tweet of yours on trading skew. The tl;dr of the tweet was don’t trade skew… Given I am in a masochistic mood, how would one go about backtesting a skew trading strategy?

I had 2 ideas, which I’d love to get your thoughts on.

Idea 1:

X asset 25d 3M normalized put skew is in the 100th percentile, sell a 25d put strike, delta hedged
hedge the delta daily or at some discrete interval
check how this strat would have performed assuming the trade is held until expiry

Idea 2:

X asset 25d 3M normalized put skew is in the 100th percentile, sell a 25d put strike, delta hedged
wait until normalized skew returns to some threshold, for example 75th percentile
hedge the delta daily but close out the trade as soon as the threshold is hit

Lots of questions, but the main ones are:

for idea 1, does your pnl depend on implied skew vs realized skew (similar to implied vol vs realized vol). How would you measure this?
for idea 2, does your pnl depend on a combo of realized skew (for as long as the trade is held) as well as surface repricing (ie selling at 100th percentile implied skew and closing out at 75th percentile). The thought of measuring this gives true masochists vibes, but how would you?
I wonder if the juice is worth the squeeze? Meaning, assuming you built the foundation to measure/test all the above, is there really any pnl in it / are you better off focusing on VRP trades?

My response:

As a matter of practicality, I think the test should be more in the vein of idea #1.

If you consider skew percentiles, the difference between the 25th and 75th percentile could be some absolutely small number like 2 vega points. And the level of skew itself measured by percentile is sensitive to the percentile lookback such that the range you are trading over is just quite small. Your interim p/l will be the sum of the implied vol change p/l plus the realized delta hedging p/l.

But consider this…let’s say you sell the 25d put and it becomes a 50d put but the skew normalizes. That skew metric is no longer referencing your position. You have a floating vs fixed problem. In other words, you can’t really trade implied skew directly.

Your results are basically going to come down to path. Your interim p/l is going to get marked based on the IV of the fixed strike you have on and that in turn is going to influence the delta you hedged on.

The delta you hedge on is going to have a large impact on your final p/l so it’s not just where does the stock go but what deltas were imputed along the way. For example suppose you run a model with spot/vol correlation embedded in the SP500…this will generate higher OTM put deltas.

If the market trends down you will win to this but vice versa. However, if you used B-S deltas you will get hurt as the market goes down and vice versa. And even then, you will def get hurt on the marks, but if the stock expires near the short strike you will probably still win by expiration even though the mark-to-market path is hairy.

I used to work with a big oil options trader that would on a monthly basis stick a hedged 1-month risk reversal in a separate account and hedge it on B-S deltas. My point is that is an active choice that influences the results. Another choice could be to hedge on deltas that don’t incorporate implied skew at all but just use ATM vols.

Overall, testing the idea, even with a monte carlo, is a great way to get a shape of the problem and, more importantly, to see how the parameters you choose impact the p/l path.
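As a starting point, here is a deliberately stripped-down monte carlo sketch. It does not model skew or spot/vol correlation at all. It sells a put at one implied vol, lets the underlying realize a different vol, and delta hedges daily on Black-Scholes deltas computed from a vol you choose. Every input (spot 75, strike 70, 40 implied, 30 realized, 21 days, zero rates, the seed) is a hypothetical choice for illustration.

```python
import math
import random
import statistics

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_put(spot, strike, vol, years):
    """Black-Scholes put price and delta, zero rates assumed."""
    sd = vol * math.sqrt(years)
    d1 = (math.log(spot / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    price = strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return price, norm_cdf(d1) - 1.0

def short_put_pnl(rng, spot, strike, implied_vol, realized_vol, hedge_vol, days=21):
    """Sell one put at implied_vol, hedge daily on hedge_vol deltas, hold to expiry."""
    dt = 1.0 / 252.0
    premium, _ = bs_put(spot, strike, implied_vol, days * dt)
    cash, shares = premium, 0.0
    for day in range(days):
        # Short a put is long delta, so the hedge holds put_delta shares (a short).
        _, put_delta = bs_put(spot, strike, hedge_vol, (days - day) * dt)
        cash -= (put_delta - shares) * spot
        shares = put_delta
        # Driftless lognormal step at the realized vol.
        z = rng.gauss(0.0, 1.0)
        spot *= math.exp(-0.5 * realized_vol**2 * dt + realized_vol * math.sqrt(dt) * z)
    cash += shares * spot             # unwind the hedge at expiry
    cash -= max(strike - spot, 0.0)   # pay out the put
    return cash

def run(hedge_vol, n_paths=500, seed=7):
    rng = random.Random(seed)  # same seed, so every hedge_vol sees the same paths
    pnls = [short_put_pnl(rng, 75.0, 70.0, 0.40, 0.30, hedge_vol) for _ in range(n_paths)]
    return statistics.mean(pnls), statistics.stdev(pnls)

results = {hv: run(hv) for hv in (0.30, 0.40, 0.50)}
for hv, (mean_pnl, sd_pnl) in results.items():
    print(f"hedge vol {hv:.0%}: mean {mean_pnl:.2f}, stdev {sd_pnl:.2f}")
```

The paths are identical across the three runs, so any difference in the p/l distribution comes purely from the choice of hedge delta. From here you could swap in a skewed surface, a spot/vol correlation, or less frequent hedging and watch how the path changes.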

I’m not kidding when I say skew trading is masochism. If oil is $75 and has massive put skew and the market drifts down to $55 and the skew gets hammered (so say the 40 puts don’t perform) but you sold the 60 put, what skew did is irrelevant. All that will matter is how fast oil went to $55 and what deltas you were running on the $60 strike along the way.

The weirder the distribution the crazier this is. I’ve seen nat gas option traders blow out being long put skew on a 15% drop in the underlying because they used too high of an implied option delta and they delta hedged several times on the way down.

Had they run a lower vol and delta OR hedged less they might have survived. There’s not much lesson from this other than…sometimes a 15% selloff is interpreted by the market as “stabilizing” and sometimes it’s destabilizing and that is what’s gonna dictate the options behavior.

Volatility term structure from multiple angles (part 1)

The post Dragonfly Eyes served as a broad preamble to our exploration today. Be like a dragonfly: look through multiple lenses.

We’re going to expand our thinking about volatility term structure to see why it’s a diamond with several facets — and most interestingly — why multiple ways of looking at it are not all correlated.

We are going to consider volatility term structure in a few ways. The differences will make the value of multiple lenses self-evident. The source of the differences has highly practical ramifications for 3 tasks:

risk management
surface modeling
trade prospecting

I would be surprised if even an experienced trader didn’t walk away folding the Rubik’s cube known as vol term structure in their head. If anything, I’m sure a seasoned trader can find some interview questions embedded in the concepts to bounce off candidates.

If you are a novice trader, you will still benefit. There’s nothing more than arithmetic in here. The value of hacking ideas from several vantage points will be obvious plus you will learn some basic transformations and measures that the more experienced folk take for granted.

About this post

The post is the first of a 2-parter. It won’t all fit in the single email view plus I’ve been under the weather this week. All the background work is done but I’m running on fumes to write it all up.
It’s a semi-Socratic progression of “show don’t tell” which serves to make the lessons your own.
We will use GLD vol data for the past year which was whimsically chosen. I didn’t snoop at the data first.
We get into implications for your own procedures.*
Finally, I talk about how and why I will extend the analysis.

*The word “your” prompts a reasonable question — who’s this for? I’m imagining a trader or risk manager at a prop shop/asset manager or an extremely sophisticated retail option trader. The material comes from pragmatism and experimentation. A durable way of seeing based on lots of pain. This is the stuff of salt mines. The way traders think.

Where does this intersect with quant and formal risk management?

Everywhere.

Quants may have a different language and set of methods for computation but the concerns are the same. I’ve said it before, but the caricature of the ivory tower theoretical physics PhD without street smarts is foreign to me. All of the gigabrain quants I’ve worked with were both practical and exceptional at asking questions. Their priority was reality.

Their knowledge becomes indispensable as risk management scales and portfolios become far more complex across multiple strategies and asset classes. I have little to add to the code-level minutiae of implementing a large-scale risk OS. I pretty much operated at the frontier of how much I could keep in my head at once but as combinations expand exponentially, well, you’re gonna need a bigger boat.

Let’s open with a “simple” question:

If you buy a 6 month/1 month straddle spread on an equity or ETF, are you long vega?

(To be linguistically clear, you are buying the 6 month straddle and shorting the 1 month)

If I’m asking the question you already know there’s more to this than simply “6 month option vega is > 1 month option vega” so YES.

You can see where I’m going with this if I use an analogy question. If I buy X dollars NVDA and short Y dollars of SPY, am I long the market if X = Y?

The question comes back to beta. Beta is a function of correlation and vol ratio. Just like with equities to a benchmark, the correlation of vol changes across a term structure is usually quite strong. We are mostly concerned with how much the vol moves with respect to another vol.

We don’t expect 2-year implied vol to move as much as 1-week implied vol. In the NVDA question, beta tells us the ratio of X to Y to be “market neutral”.

Going back to the vol question. It’s true that you are long vega if you buy a 6 month straddle and short a 1 month straddle. You are long “click vega”. If the entire term structure parallel shifted higher 2 points you will win 2 x [net vega].

But vols don’t generally move in lockstep across the term structure. Instead, it’s common to weight the vols by their sensitivity to a fulcrum month and then re-scale all your monthly vegas by the weights.

So the answer to the riddle might be YES you are long vega but it’s not necessarily the most helpful answer. If vol parallel shifts up, you will definitely win and the idea that you are long vega was certainly true in this unusual dynamic. It is reasonable to expect 1-month vol to increase faster than the 6-month vol.

This brings us to our next question.

If the front month vol increases by 1 point, how much does the 6-month vol need to increase to merely break even?

Hint: For an ATM (well technically at-the-forward) straddle the only things that affect the vega are spot price and DTE

This is an equity or ETF so the spot price is the same for both months. Note this would not be true for futures which have a curve of different underlyings.

The vegas are proportional to sqrt(dte). In this case the ratio of 6-month to 1-month vega is sqrt(6 / 1) = 2.45

[If this is not clear, recall the ATF straddle approximation from The MAD Straddle: straddle = .8Sσ√T.

Straddle vega is just change in straddle price per 1 point change in vol. We just re-arrange the formula:

straddle/σ = .8S√T

Since we are comparing 2 months, .8* S cancels out and we are left with the vegas being proportional to √T]

If 1-month vol increases by 1 point, then the back month vol needs to increase by 1/2.45 or .41 points to keep the straddle spread price constant (ignoring theta).

If 6-month vol increases by more (less) than .41, we make (lose) money on the vol expansion.
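The break-even can be checked directly from the straddle approximation. A small sketch, with a hypothetical $100 spot (it cancels out of the ratio anyway):

```python
import math

def straddle(spot, vol, years):
    """ATF straddle approximation: .8 * S * sigma * sqrt(T)."""
    return 0.8 * spot * vol * math.sqrt(years)

spot = 100.0  # hypothetical; drops out of the ratio
t_front, t_back = 1 / 12, 6 / 12

# Vega = straddle price change per 1 vol point (0.01).
vega_front = straddle(spot, 0.01, t_front)
vega_back = straddle(spot, 0.01, t_back)

vega_ratio = vega_back / vega_front     # sqrt(6)
breakeven_back_move = 1.0 / vega_ratio  # back-month points per 1 front-month point

# Long the 6-month straddle, short the 1-month: p/l if front vol rises 1 point
# and back vol rises by exactly the break-even amount.
spread_pnl = vega_back * breakeven_back_move - vega_front * 1.0

print(f"vega ratio {vega_ratio:.2f}, break-even back-month move {breakeven_back_move:.2f}")
print(f"straddle spread p/l at break-even: {spread_pnl:.6f}")
```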

Whether we are long or short vega is more ambiguous than it appears from our headline “click vega” measure.

√t Vega Scaling

Just like beta-weighting a basket of stocks allows us to group directional exposure into equivalent SPX delta, it’s common to weight vega as a function of √dte. You may choose a “fulcrum” term such as 180 days to anchor the definition of vega and then re-scale each month’s vega by √(180/DTE), so that near-dated vega counts for more and deferred vega for less, consistent with the 2.45 ratio above.

This is review from Understanding Vega Risk:

This kind of scaling allows us to summarize the position with a statement like:

“If 6-month vol increases (decreases) by 1 point, I expect to lose (make) $13k”

This would not be obvious if you are looking at the sum of raw vega.

√t scaling is not pulled out of a hat. It corresponds to a world where straddle spreads are constant (again, ignoring theta). Armed with that straddle approximation formula, it is simple to prove that to yourself.
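Here is one way to prove it to yourself numerically. The sketch assumes each expiry's vol moves by the fulcrum move times √(180/DTE) and uses a hypothetical spot and term structure. The only thing being tested is whether every straddle changes by the same dollar amount.

```python
import math

def straddle(spot, vol, years):
    """ATF straddle approximation: .8 * S * sigma * sqrt(T)."""
    return 0.8 * spot * vol * math.sqrt(years)

spot, fulcrum_dte = 100.0, 180
fulcrum_move = 0.01                     # 6-month vol up 1 point
vols = {30: 0.22, 90: 0.20, 180: 0.19}  # hypothetical term structure

price_changes = {}
for dte, vol in vols.items():
    # sqrt-t dynamic: shorter expiries move more than the fulcrum month.
    move = fulcrum_move * math.sqrt(fulcrum_dte / dte)
    years = dte / 365
    price_changes[dte] = straddle(spot, vol + move, years) - straddle(spot, vol, years)

for dte, change in price_changes.items():
    print(f"{dte:>3}d straddle change: {change:.4f}")
```

Every straddle gains the same amount, so every straddle spread is unchanged. The √T in the vega cancels the 1/√T in the vol move.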

There’s nothing gospel about this weighting scheme. In the moontower.ai app users can view daily vol changes scaled to several choices of tenors. The point is that by normalizing at all, it is easy to see which straddle spreads changed. Implied volatility is a shortcut to get to a price and prices are what your p/l depends on. If you have a time spread on, the change in its price is the thing you care about.

Any vol weighting scheme you choose will not be perfectly accurate (if it were you could literally predict how the vols would change relative to each other, in which case you should have already chucked your phone in the ocean from the hammock of your private island). But it’s a gigantic improvement in risk monitoring from raw vega. Of course, any summary measure is a trade-off between convenience and resolution. The explicit trade-off with √t scaling:

Benefit: Easy to see changes in time spread prices across the term structure. Highly intuitive and interpretable.

Drawback: Inaccurate. Said differently — there’s lots of room to improve the accuracy of the weights from empirical data which will lead to better understanding of vega risk.

Personally, I think dialing in better weights would increase cognitive load. You’ll need to think about what regime or lookback the updating weights are drawn from. The inaccuracies of the constant straddle spread assumption are “the devil I already know”.

[At the risk of wasting digital ink, it should be obvious that designing metrics always depends on user context. Building infra and dashboards is an exercise in being a product manager that serves many clients: traders, risk managers, accounting, and back office.]

√t scaling is a meaningful but rough improvement in measuring vega by weighting each month differently. Constant straddle spread is an implicit dynamic embedded in the calculation. It’s an assumption. So how do vols across a term structure actually move relative to one another?

Let’s see what we can discover by hackin’ on some data.

Viewing term structure

We are used to looking at charts like this:

That’s a snapshot of the SPX IV term structure. It’s an ascending shape. We used the term “steepness” to refer to the ratio of 1M/6M vol. In this case it is less than 1.0 since 6-month vol is premium to 1-month.

It’s a common shape (in oil we referred to this as the “droopy penis”). It’s relatively quiet, near vols are subdued, back months expect mean reversion to longer-term averages.

In early August, this would have been sharply descending with a steepness much greater than 1.0 as the near term stress and high realized vol was baked into the front of the IV term structure and sloping down to vols which suggest the markets will eventually calm down.

Instead of a snapshot, a time series can help us capture all that motion. We are going to use about 1 year of GLD data (10/2/23-9/23/24) for the rest of this post.

This is a daily time series of 10d, 30d, 90d, 6M IV. The darker blue line is the 10d IV and the sparkly blue line is the 6m IV. You can see how the 10d IV is itself quite volatile sometimes sagging below the 6m IV (“the droopy penis”) and sometimes shooting way above it into a backwardated or inverted term structure.

This is an instructive way to see term structure behavior albeit highly zoomed out.

Let’s use another chart for a closer look. We will include 2 time series:

The 1M/6M vol ratio
The 1M/6M implied forward vol

A forward vol represents the implied amount of volatility that exists between the 1M and 6M expirations. (It sounds complicated but a 1-month VIX future settles to “what will the VIX, a 30d forward looking measure, be in 1 month”)

You can use the free moontower.ai forward vol calculator to play with the idea and read about the concept.

The central point is that forward vol is another way to consider the difference in relative volatility between 2 expirations. Seeing things in multiple ways is the focus of this post.
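For reference, a minimal sketch of the usual forward vol calculation, which treats variance as additive in time. The vols are hypothetical, and the calculator mentioned above may handle day counts differently, so take this as the textbook version rather than a description of that tool.

```python
import math

def forward_vol(vol_near, t_near, vol_far, t_far):
    """Implied vol between t_near and t_far, assuming variance adds over time."""
    fwd_var = (vol_far**2 * t_far - vol_near**2 * t_near) / (t_far - t_near)
    if fwd_var < 0:
        raise ValueError("term structure implies negative forward variance")
    return math.sqrt(fwd_var)

vol_1m, vol_6m = 0.12, 0.16  # hypothetical ascending term structure
ratio = vol_1m / vol_6m
fwd = forward_vol(vol_1m, 1 / 12, vol_6m, 6 / 12)

print(f"1M/6M ratio {ratio:.2f}, 1M-to-6M forward vol {fwd:.1%}")
```

With the front month below the back month, the forward vol has to sit above the 6-month vol to make the averages work.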

What stands out about this chart?

Here’s what I see:

The forward vol is sometimes high and sometimes low regardless of the ratio.

Let’s say you get an 80 on a test, but there’s a prediction market on your final grade in the class that is trading for 90. The market is implying that you are going to be acing the rest of your tests.

Conceptually the math here is similar — when the front month vol is low and the vol ratio is far below 1.0 (steep term structure) the forward vol can and often does look quite high. The market makes it expensive to lock in cheap volatility for a long time even when the vols are low (the expense will manifest as “roll down”).

But look at early March — not only was the vol ratio low, the forward got crushed. If you only look at vol ratio, you missed this.

Implied forwards are an orthogonal or complementary measure of relative volatility that is additive to your perspective.

I’m a big fan of waiting for imperfectly correlated signals to line up before sizing up.

Remember, dragonfly eyes.

Next week we will continue with part 2 where we:

deconstruct the nature of this relationship further
consider what the differences between lenses means for finding opportunities and measuring risk
substantiate both the how and why of extending the analysis
- I plan to actually publish the extended analysis as well, but it won’t fit in next week’s letter on top of the rest of this