I can’t remember which of the 3 Todd Simkin interviews I summarized on my blog includes this, but Todd was asked if SIG’s secretiveness has been an advantage. He said in trading, it’s been good, but when it comes to recruiting technologists or researchers, it’s been a hindrance. The FAANG companies are household names and since trading firms compete for some of the same talent, you’d want more people to know what SIG is.
I figure this recognition is behind their increased public outreach. Like this awesome video that recently dropped from the lecture series where Professor Costa teaches their trainees about the GFC.
It starts assuming you don’t even know what a bond is and proceeds to cover an unbelievable amount of distance in one hour. The narrative and history going back to the 80s is fantastic and I even learned (or reviewed) a lot of basic market knowledge.
#teaching_goals
While this video is loaded, here’s 5 bits that stood out for me. There’s also a very SIG-esque lesson in there about anchoring bias.
Diversification has literal monetary value – Great demonstration of how portfolio theory translates directly into pricing and risk management
Reflexivity in credit markets – Default rates weren’t actuarial constants but depended on loan originators’ incentives. Once originators became divorced from risk while retaining pricing/underwriting control, the system became unstable. A systems thinker would have spotted this disconnect.
Misaligned incentives drove market distortion – Traders focused narrowly on derivatives markets where the CDS market dwarfed the underlying bond market. Unlike bond issuance (limited by actual capital formation needs), derivative trading appetite was essentially unlimited.
Good ideas taken too far become dangerous – Diversification through low correlation assets is sound in principle, but this conceptual acceptance prevented people from asking the critical follow-up: “To what degree is this still safe?” (The opposite is hormesis – sometimes a little of a bad thing is actually beneficial. As the old saying goes “the poison is in the dose”.)
“This would turn out to be a fateful decision” – The final section on implied correlation reveals how trading desks completely inverted their hedge ratios between tranches, fundamentally misunderstanding how correlation affects different credits.
I got long IBIT (BTC etf) when my April calls expired ITM. IBIT continued rallying, I was happy to stay long. However, the vol screened cheap.
I decided to “replace” — I sold my long shares and bought calls.
Deciding which calls to buy was tricky. So tricky it inspired this post.
The heart of the conundrum lives in this snapshot from 4/23/25:
That is the constant maturity IV term structure, visually depicting a steep term structure. The implied forwards table below it gives a more granular view.
You can see the problem.
I think vol is cheap…but that was confined to the front of the curve. If we just look at the first 2 monthly expirations, May 16th and June, the vols are 45.8% and 51.1%.
The premium from May to June is so steep that the May/June forward vol was 54.3%, a full 10 points over the weekly IV!
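The forward vol quoted above can be reproduced from the two monthly IVs, since total variance adds across time. Here is a minimal sketch; the function name and the day counts (23 and 58 calendar days from 4/23/25, assuming the June monthly expires on the 20th) are my assumptions, not values read off the snapshot.

```python
import math

def forward_vol(iv_near, days_near, iv_far, days_far):
    """Implied forward vol between two expiries, using additivity of total variance."""
    var_near = iv_near ** 2 * days_near      # total variance to the near expiry
    var_far = iv_far ** 2 * days_far         # total variance to the far expiry
    fwd_var = (var_far - var_near) / (days_far - days_near)
    return math.sqrt(fwd_var)

# Assumed day counts: 4/23 -> May 16 is 23 days, 4/23 -> June 20 is 58 days.
may_june_fwd = forward_vol(0.458, 23, 0.511, 58)
print(f"May/June forward vol: {may_june_fwd:.1%}")
```

Because only the ratio of day counts matters, the choice of calendar vs trading days changes the answer only slightly.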
Vol might be cheap but you can only “lock” it in for about a month. Then it doesn’t look so cheap anymore. No easy trades, right? The market is pricing mean reversion in the vol.
I chose to go with what the market gave me — the cheap vol in the front. I sold my shares and bought the May 60 calls.
Well, the stock went up about 15% that month…but expired just below my long strike.
A 15% move in a month corresponds to a 52% vol move (.15 x √12) but I picked the wrong strike. Path is Anton Chigurh’s coin.
I chose the shorter expiry because that’s where vol was cheap. June was 5 vol points richer, more than 10% premium over a single month. Is that too much to pay? Does that nullify the “vol is cheap” thrust behind the trade idea?
There’s no straight answer to such a question.
The term structure balances a tension between the supply and demand in each month. Those flows are made up of both natural needs (directional speculation, hedging) and vol trader positioning insofar as they have opinions on the relative pricing.
The term structure is the jurors’ debate of the evidence. The truth behind the case, just like in any trial, is not perfectly known. In the market’s case, the ensuing price action represents the truth.
Discerning between months and trading time spreads is a significant portion of vol trading. Understanding events is a mix of computation and research labor. There’s room for alpha in the act of normalizing term structures to account for idiosyncrasies in a calendar.
But let’s put that aside and assume a calendar with no known events.
We want to build some intuition of expensiveness or cheapness in a time spread. We can use a toy calculation to do this in the context of a common scenario:
“I am buying a time spread to harvest the front-month VRP”
If you sell the front month and buy the back, it’s a hedged way to capture the premium of implied to realized.
Note this premium, despite being empirical, is constant in neither magnitude nor sign; it only holds “on average”. Still, we can use this type of trade to explore the tradeoffs in month selection.
When we buy say, an ATM time spread, we are going to be long vega, long time premium, short gamma, and generally collecting theta.
Wait.
Why am I bolding “generally”?
It’s a clue to the tradeoffs involved.
Let’s just jump right into the toy model to see what we can learn about time spreads. This is stuff that excites, transfixes, and haunts every vol trader.
“What needs to be true for this time spread to be cheap or expensive?”
This will be the first in a few posts as I try to eventually turn this into a webapp.
We need to build a simulation.
We are angling towards an appreciation for how a time spread behaves by decomposing how it performs until the first month’s expiration as a function of:
implied vs realized
change in M2 vol if it “rolls” towards M1 vol by the time it becomes M1
We understand that buying a time spread should win if the realized vol underperforms the M1 vol you are short, but you are often paying a premium vol for M2 (contango term structure) and we want to see for various combinations of steepness between the 2 months and VRPs how attractive it looks to buy/sell time spreads.
The toy simulation
Scenario:
Consider a $100 stock that moves at 16% vol.
You buy a 40 DTE / 20 DTE 100-strike calendar spread delta-neutral: you buy 19.2% vol in the back month and sell 19.2% vol in the front month. Hedge daily at the close.
It’s a flat vol time spread.
You are trying to capture a VRP by being short 19.2% vol when the moves will be sampled from a 16% vol return distribution. Since you are short the nearer-dated option you should win if implied vol > realized vol since the moves are not large enough to cover the option theta.
How can we model this?
We can look at the daily delta-hedged p/l of a time spread assuming M2 vol converges to M1 vol by the time M1 expires. In our current example, M2 vol = M1 vol so there’s no “roll down”.
Today, we will simply zoom in on what a single simulation looks like.
Step 1: Stock process
We pull each day’s random return from a Brownian motion diffusion process with RFR = 0% and a daily vol of 16%/sqrt(251), or 1.01%.
This is simple to implement. The details are here:
While the simulation has randomness based on the draw of daily moves, I’d like to minimize the dependency on path. If you trade the $100 strike and the stock trends away from your strike, your greeks “go away”. To get rid of this noise, we will reset the stock to $100 each day so that the options are once again at-the-money.
We will also assume the stock only goes up. We get the random return and assume the stock goes up by that amount. We are resetting the strike each day so whether the move is up or down makes no difference.
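The two steps above can be sketched in a few lines. This is only an illustration under stated assumptions: a zero-drift lognormal draw, 251 trading days, a 20-day window, and a fixed seed; the function and variable names are hypothetical and this is not the implementation behind the draws shown in this post.

```python
import math
import random

ANNUAL_VOL = 0.16
TRADING_DAYS = 251
N_DAYS = 20  # life of the front-month option

def draw_daily_moves(n_days=N_DAYS, annual_vol=ANNUAL_VOL, seed=None):
    """Absolute daily returns from a zero-rate GBM.

    The sign is dropped because the stock resets to $100 each day and
    every move is treated as an up move.
    """
    rng = random.Random(seed)
    daily_vol = annual_vol / math.sqrt(TRADING_DAYS)
    moves = []
    for _ in range(n_days):
        # With RFR = 0 the log-return drift is just the -0.5 * variance term.
        log_ret = rng.gauss(-0.5 * daily_vol ** 2, daily_vol)
        moves.append(abs(math.expm1(log_ret)))
    return moves

moves = draw_daily_moves(seed=42)
closes = [100.0 * (1.0 + m) for m in moves]  # each day opens at $100, closes here, then resets
```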
Bringing step 1 and step 2 together is an example of a single draw:
💡Quiz question
I computed a realized vol of 16.1% from that return stream. Do you get that same value? If not, what do you think I did and why? The answer is one of those things that makes you realize just how subtle choices in measures can have surprisingly large impacts.
Step 3: Compute daily p/l
We assume a constant implied vol for M1.
If M2 differs from M1 we assume a linear glide path for M2 to approach M1 by the time M1 expires. For example, if M1 has 20 dte and is 20% while M2 is 22%, then M2 vol will fall by .10 per day (2 vol points / 20 days).
Each day we start with a delta-hedged position on the $100 strike. We are short M1 and we assume the stock goes up by the randomly drawn amount from the diffusion process. So we will lose on the short call and win on the long stock. For M2, the opposite will happen — we win on the long call and lose on the short stock.
(I broke them out separately because it will be handy for decomposition but in reality, the long and short stock legs would be close to canceling out since it’s an ATM call spread.)
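Here is a hedged sketch of the daily p/l loop just described: reprice both legs each day with a zero-rate Black-Scholes formula, let M2 vol glide linearly toward M1, and reset the stock to $100. The helper names and the 251-day year are assumptions for illustration, not the exact spreadsheet behind the screenshots.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf
DAYS = 251

def call_price_delta(spot, strike, vol, dte):
    """Zero-rate Black-Scholes call value and delta; intrinsic value at expiry."""
    if dte <= 0:
        return max(spot - strike, 0.0), float(spot > strike)
    s = vol * math.sqrt(dte / DAYS)
    d1 = math.log(spot / strike) / s + 0.5 * s
    return spot * N(d1) - strike * N(d1 - s), N(d1)

def spread_daily_pnl(moves, m1_vol, m2_vol, m1_dte=20, m2_dte=40):
    """Per-day (short M1, long M2) delta-hedged p/l with a daily reset to $100."""
    glide = (m2_vol - m1_vol) / m1_dte  # M2 vol change per day toward M1
    rows = []
    for day, move in enumerate(moves):
        s0, s1 = 100.0, 100.0 * (1.0 + move)
        v2_open = m2_vol - glide * day
        v2_close = m2_vol - glide * (day + 1)
        c1_open, delta1 = call_price_delta(s0, 100.0, m1_vol, m1_dte - day)
        c1_close, _ = call_price_delta(s1, 100.0, m1_vol, m1_dte - day - 1)
        c2_open, delta2 = call_price_delta(s0, 100.0, v2_open, m2_dte - day)
        c2_close, _ = call_price_delta(s1, 100.0, v2_close, m2_dte - day - 1)
        short_m1 = -(c1_close - c1_open) + delta1 * (s1 - s0)  # short call, long stock
        long_m2 = (c2_close - c2_open) - delta2 * (s1 - s0)    # long call, short stock
        rows.append((short_m1, long_m2))
    return rows

# Sanity check with an unchanged stock and flat 19.2% vol in both months.
flat = spread_daily_pnl([0.0] * 20, m1_vol=0.192, m2_vol=0.192)
short_total = sum(r[0] for r in flat)
long_total = sum(r[1] for r in flat)
```

With no moves at all, the short leg should collect exactly the M1 premium and the long leg should lose only its 20 days of decay, which is a quick way to confirm the bookkeeping before feeding in random draws.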
We can understand how this works for both long and short legs by just zooming in on M1 as an example.
Being short M1 at 19.2% IV while it realized 16.1% (20% vrp) yielded a profit of about 11 cents. You can see the messy p/l path and this is with the strike resetting to ATM each day which sterilized the stock path!
Doing the same procedure on M2, we can see the breakdown:
The long leg broke even while the short won. Although the win was driven by very small moves towards expiration.
I can’t overemphasize how noisy this is. The stock can sit unchanged for 19 days then on day 20 your short ATM call, which starts the day worth $.57, can blow out on a $4 rip, leaving you a net loser on the whole trade. You were short vol at 19.2% and that hypothetical return stream is about 14% vol. Bad beat on the river.
(Yes, $4 is extreme but so is being unchanged for 19 days.)
The stock returns of those last few days before expiry get averaged into the realized vol calcs the same as any other day, betraying lived experience — your dollar gamma is so high near expiry that those returns have a disproportionate impact.
Which is exactly why nobody gets out alive: being short where the stock lands (and vice versa) will dominate everything else…hence my IBIT calls, which were directionally right AND priced cheaper than the vol that realized, yet still experienced a total loss.
Here’s another draw, this one snatches defeat from the jaws of victory…
Healthy VRP. Sitting pretty with a cumulative profit 1 day before expiry and boom a 2.3% pop in the stock. A bit over 2 st devs. On that day you make 1/2 the move in share p/l because you do have the .50 delta hedge but the call you are short goes from $.57 to $2.30. Your M2 options have far less firepower so they only mitigate about 1/4 of the day’s loss.
So far we’ve stuck to the case where the time spread has flat vol (ie M1 IV = M2 IV). We’ll change that next week, when we get into “shadow theta” as we push ahead to understand calendar spreads.
Give Palantir the same $100 billion valuation that Databricks just earned. Where does that put the stock? $40. The exact same math we saw when comparing Palantir to OpenAI.
Tyler pinged me before the stock dove on 8/18:
walk through of put spread LEAP payoffs on PLTR pls
I pulled up moontower.ai to see what odds you can find to bet on $40:
Jan’27
67.5/50 put pays 8-1 if the stock hits 50 or lower ~11% chance of stock dropping 2/3
Tyler was drawn to a similar one before asking me:
my gut brought me to 65/55
I looked it up. That one is similar, pays 7.5-1
Put spreads, like any vertical spreads, are straightforward, risk-budgeted, ways to bet on an outcome by a specified expiration. Tyler followed a proper instinct — “Hey someone made a prediction, what odds are being offered by market prices?”
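Translating quoted odds into a rough probability is simple arithmetic. The sketch below is a back-of-the-envelope conversion that ignores discounting and the partial payoffs between the strikes; the function names are hypothetical, and the cost it backs out is implied by the quoted odds rather than an actual market price.

```python
def implied_probability(odds_to_one):
    """Break-even win probability for a bet that pays `odds_to_one`-to-1."""
    return 1.0 / (odds_to_one + 1.0)

def cost_for_odds(width, odds_to_one):
    """Spread cost consistent with the quoted odds if it pays its full width."""
    return width / (odds_to_one + 1.0)

p_first = implied_probability(8.0)            # 67.5/50 put spread quoted at 8-1
p_second = implied_probability(7.5)           # 65/55 put spread quoted at 7.5-1
cost_first = cost_for_odds(67.5 - 50.0, 8.0)  # implied cost of the 17.5-wide spread

print(f"{p_first:.1%}, {p_second:.1%}, ${cost_first:.2f}")
```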
Since stocks obviously only go up in America, those odds don’t sound too exciting for something that is obviously impossible. But putting aside the laws of patriotic stock market anti-gravity, the mathematics of variance drag are the true reason why the odds feel underwhelming — PLTR is a high vol stock so a high likelihood of a negative return is baked into the surface:
In a risk-neutral pricing world with an RFR of 4%, the median one-year log return for a 50% vol name is the RFR minus half the variance: 4% − 12.5% = −8.5%.
Yesterday, we (yet again) discussed the topic of how vol drag affects the skewness of a return distribution. In particular it shifts the median return lower.
The thread I wrote bounced all around the internet but like a nerdy game of telephone the message is suffering from major info loss as it gets passed on.
Vol drag does NOT change the mean or expected return. It affects the return you are most likely to experience.
Imagine 2 lotto tickets — let’s call them WGMI and NGMI.
WGMI has a 50% chance of paying off $20k
NGMI has a 1% chance of paying off $1mm.
They have the same expectancy. But NGMI is far more volatile — your most likely result is a zero.
Those meme stock put spreads look expensive because they are bets on distribution not expectancy! Spreads are bets on pure probability NOT the magnitude whereas outright options are about probability x distance just like expected value is about probability x outcome.
If I have an asset with an expected return of 10% the 2x version of it (without fees) has an expected return of 20%. Full stop.
However, the 2x version is more volatile, therefore its median expected return is less than 2x the median expected return of the asset itself. That’s vol drag. It impacts the median. The skewness. The distribution. NOT THE EXPECTANCY.
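To make the mean-vs-median distinction concrete, here is a small sketch in rate terms. It assumes lognormal returns, continuously rebalanced leverage, and no fees or financing costs; the 20% vol is a made-up input and the function name is hypothetical.

```python
def expected_and_median_rates(mu, sigma, leverage=1.0):
    """Annualized expected drift and median log-growth rate for a levered asset."""
    expected_rate = leverage * mu                              # scales linearly with leverage
    median_rate = expected_rate - 0.5 * (leverage * sigma) ** 2  # drag scales with leverage squared
    return expected_rate, median_rate

mean_1x, median_1x = expected_and_median_rates(0.10, 0.20)
mean_2x, median_2x = expected_and_median_rates(0.10, 0.20, leverage=2.0)

print(mean_1x, median_1x)   # expected drift and median growth for the asset
print(mean_2x, median_2x)   # same for the 2x version
```

The expected rate doubles, but the drag term quadruples, so the 2x median growth rate lands below twice the 1x median.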
If that clarification was a lightbulb moment, feel free to stop right there. It’s already a good day. But if you want euphoria bursting from your skin, know that it gets better.
The following discussion includes:
“vol bonus” the alter-ego of the drag
the worst return I ever had in options
a simulator you can use right now to see vol drag and vol bonus
Vol drag predicts a median return over a set of branching paths. To understand that first think of a binomial tree where some paths recombine — for example “stock goes up $1 then down $1 is unchanged”.
Vol drag represents the median return being pulled down when we follow such a path in return space. So up 1% then down 1% doesn’t recombine back to $100 but $99.99. The more we raise the vol, the lower the median return gets pulled down. But again the expectancy isn’t changing. What’s happening is the drag of the recombining steps, the cases where a stock is “chopping” drive the median lower.
To imagine the ratio of chopping paths to straight trending consider flipping a coin twice. There’s 1 way to get 2 heads [h,h] and 1 way to get zero heads [t,t] but 2 ways to get exactly 1 head [(h,t), (t,h)], which is the expected outcome of flipping a coin twice. The mean outcome is comprised of chop! An extreme outcome, one which is unlikely, is characterized by trend.
When we swap out coin flips and use returns we expect to see something similar. The expected (ie mean) return is comprised of more toggling between up and down causing vol drag and pulling median return down. But for the expected return to not itself be lower it must be counterbalanced somewhere, right?
It is.
In the extremes, where the outcomes are part of the distribution but unlikely. The ones where you trend relentlessly up or down. It doesn’t happen often, but this is where you see a vol bonus in return space!
If a stock goes up 1% per day for 251 trading days it will be up 12x
Its double-levered version will be up ~144x not just 24x
I have covered this before in Path: How Compounding Alters Return Distributions in my own Pascal’s Triangle kinda-way. Part of being undereducated in ze maths means I like to look at pictures so for better or worse depending on your leaning there’s no formal notation in that post. Arithmetic and crayons.
But I want to hone in a bit more on this vol bonus thing. I’m acutely aware of it because I was on the wrong end of it — the worst loss I ever experienced on a return basis was being short options on an inverse 2x ETF — SCO. (7 figures on a few hundred contracts 🤮)
For the uninitiated, SCO is the inverse, levered ETF on oil. I was short calls on SCO when oil went straight down on a rope in 2014 as OPEC lifted drilling quotas to squeeze the US shale complex.
[You can’t just use 2x the IV of oil to price options on the 2x levered ETF…but I’ll give you a hint: if an ETF and its 2x levered are both $100, how do you find which strike on 2x levered corresponds to the 110 strike on the reference ETF?]
If the path is a trend, the return of the double-levered ETF is going to be more than 2x the return of the reference asset. The owner of the calls I was short enjoyed a “vol bonus” that, from the macro perspective, came right out of my year-end bonus.
Vol drag vs Vol bonus…ready, fight?!
Let’s walk through another example.
Scenario: 5 Days Up, 1 Day Down
Imagine a stock that gains 2% daily for 5 days, then drops 10% on day 6.
Days 1-5: (1.02)⁵ = 1.1041 → +10.41%
Day 6: 1.1041 × 0.90 = 0.9937 → Final: -0.63%
2x Leveraged ETF Performance:
Days 1-5: Each day gains 4% (double the 2%)
(1.04)⁵= 1.2167 → +21.67% (more than 2x the 10.41%!)
Day 6: Loses 20% (double the 10%)
1.2167 × 0.80 = 0.9733 → Final: -2.67%
Notice what happened: bonus turned to drag
After 5 days, the 2x ETF had a bonus (+21.67% vs expected +20.82%)
After the volatile day 6, it shows drag (-2.67% vs expected -1.26%)
The final return is -2.67%, which is worse than 2x the stock’s -0.63%
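The arithmetic in this scenario, and in the earlier 1%-a-day trend example, can be checked with a few lines. This is a minimal sketch that assumes the levered product delivers exactly 2x each day's return with no fees or slippage; `compound` is a hypothetical helper name.

```python
def compound(daily_returns, leverage=1.0):
    """Total return from compounding daily returns with leverage reset each day."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + leverage * r
    return growth - 1.0

# 5 days up 2%, then 1 day down 10%
chop_path = [0.02] * 5 + [-0.10]
stock_ret = compound(chop_path)
levered_ret = compound(chop_path, leverage=2.0)

# Relentless trend: up 1% per day for 251 trading days
trend_path = [0.01] * 251
stock_multiple = 1.0 + compound(trend_path)
levered_multiple = 1.0 + compound(trend_path, leverage=2.0)

print(f"chop:  stock {stock_ret:.2%}, 2x {levered_ret:.2%}")
print(f"trend: stock {stock_multiple:.1f}x, 2x {levered_multiple:.1f}x")
```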
Bonus is rare and fragile. Today, all levered ETF prospectuses (it should be prospecti out of respect for 🐙but whatever) disclose the risk of vol drag and they should. It describes the disappointment you are most likely to experience if you hang on to these things, but for those of you who care about not being faulty amplifiers in the telephone game of high-finance literacy just realize that their actual mean expectancy isn’t lower. But like a royal flush in video poker, you are unlikely to get the hand that contributes to that mean expectancy.
Interactive Simulator
Want to see this effect in action?
I vibe-coded an interactive 2x leveraged ETF simulator that demonstrates how different volatility levels and market paths create bonus or drag. You should really take note of how destructive vol is to your actual investing experience.
Vol drag does NOT change the mean or expected return. It affects the return you are most likely to experience.
Levered ETFs are more volatile, therefore their median expected return is less than 2x the median expected return of the asset itself. That’s vol drag. It impacts the median. The skewness. The distribution. NOT THE EXPECTANCY.
In the real world, levered products have direct fees as well as slippage fees because the exposure needs to be rebalanced daily. If the product has garnered lots of AUM it’s like a polar bear in the savannah — an easy mark for trading firms anticipating flow.
σ²_implied = implied variance at the start of the period
Δt = time period (typically 1/252 for daily)
This formula comes from the fact that options are essentially a bet on variance, and the P&L approximately equals half the vega times the difference between realized and implied variance.
For a more precise version that includes gamma effects:
This relationship is fundamental to volatility trading and shows why:
If realized variance > implied variance → long volatility positions profit
If realized variance < implied variance → short volatility positions profit
The factor of ½ appears because the gamma P&L from hedging accumulates continuously and represents the average exposure over the period.
KA: let’s do an example with a 20 day delta hedged .50 delta call…implied vol is .16 and the realized move is 2%…assume gamma is constant across the move
Let’s work through this step-by-step with a 20-day delta-hedged 0.50 delta call option.
Given:
Time to expiration: 20 days = 20/252 years
Initial delta: 0.50 (ATM call)
Implied volatility: 0.16 (16%)
Realized move: 2% over 1 day
Assume constant gamma across the move
Let’s assume S = $100 for simplicity
Step 1: Calculate the realized variance
Daily realized volatility = 2% = 0.02
Daily realized variance = (0.02)² = 0.0004
Step 2: Calculate the implied variance (daily)
Annual implied variance = (0.16)² = 0.0256
Daily implied variance = 0.0256 / 252 = 0.0001016
Step 3: Estimate gamma For an ATM option with 20 days to expiry:
Gamma ≈ 1/(S × σ × √(2π × T))
Gamma ≈ 1/(100 × 0.16 × √(2π × 20/252))
Gamma ≈ 1/(100 × 0.16 × 0.223) ≈ 0.028
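Step 4: Calculate the P&L Using the formula: P&L = ½ × S² × Gamma × (σ²_realized – σ²_implied) × Δt
P&L = ½ × (100)² × 0.028 × (0.0004 – 0.0001016) ≈ +$0.042 (the variances from Steps 1 and 2 are already daily, so Δt is absorbed)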
Interpretation: The position made about 4.2 cents per share because the realized variance (4 basis points) was much higher than the implied variance (1.016 basis points). The 2% move in one day represents about 32% annualized volatility, which is double the 16% implied volatility, leading to a profit for the long volatility position.
KA: compare that p/l to if you computed the p/l by computing gamma p/l and theta p/l separately
Let’s compute the P&L by breaking it down into gamma P&L and theta P&L separately.
Gamma P&L Calculation: For a delta-hedged position, gamma P&L = ½ × Gamma × S² × (ΔS/S)²
Stock move: ΔS/S = 2% = 0.02
Gamma P&L = ½ × 0.028 × (100)² × (0.02)²
Gamma P&L = ½ × 0.028 × 10,000 × 0.0004
Gamma P&L = 0.014 × 4
Gamma P&L = +$0.056
Theta P&L Calculation: For an ATM option with 20 days to expiry:
Theta ≈ -(S × σ × φ(d₁))/(2√T) where φ is the standard normal PDF
For ATM: d₁ ≈ 0, so φ(d₁) ≈ 0.3989
Theta ≈ -(100 × 0.16 × 0.3989)/(2√(20/252))
Theta ≈ -6.38/(2 × 0.282)
Theta ≈ -6.38/0.564
Theta ≈ -$11.31 per year
Daily theta = -11.31/252 ≈ -$0.045
Total P&L = Gamma P&L + Theta P&L
Total P&L = $0.056 – $0.045
Total P&L = +$0.011
Comparison:
Variance-based formula: +$0.042
Gamma + Theta approach: +$0.011
Why the difference? The discrepancy of ~$0.031 comes from:
Approximation errors: The variance formula assumes continuous rehedging and uses average gamma exposure, while the discrete calculation uses end-of-period values
Cross-effects: The variance formula better captures the relationship between realized and implied vol, while the separate calculation doesn’t fully account for how gamma P&L scales with variance
Higher-order terms: With a 2% move, we’re beyond the range where gamma can truly be considered constant
The variance-based formula is generally more accurate for volatility trading analysis because it directly captures the essence of the trade: you’re long/short variance, not just gamma and theta separately.
With a 1% daily move (≈16% annualized vol), the realized variance exactly matches the implied variance, so the variance formula correctly shows near-zero P&L
The gamma + theta approach shows a loss because it doesn’t recognize that the 1% move represents exactly the implied volatility – it just mechanically adds gamma gains to theta losses
The variance formula is more intuitive here: when realized vol = implied vol, a delta-hedged option position should roughly break even
This demonstrates why the variance-based approach is superior for understanding volatility trading – it directly answers “did volatility exceed expectations?”
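For anyone who wants to rerun this comparison independently, here is a minimal sketch that takes both Greeks from the same zero-rate Black-Scholes model. The function names are hypothetical and the 252-day year follows the chat. Under these assumptions theta equals −½σ²S²Γ, so the two decompositions are the same quantity, and the gamma it produces (about 0.088) differs from the 0.028 used in the chat, as do the resulting p/l figures.

```python
import math
from statistics import NormalDist

def atm_greeks(spot, vol, dte, days_per_year=252):
    """Zero-rate Black-Scholes gamma and per-day theta for an ATM option."""
    t = dte / days_per_year
    d1 = 0.5 * vol * math.sqrt(t)          # ATM: log-moneyness term is zero
    pdf = NormalDist().pdf(d1)
    gamma = pdf / (spot * vol * math.sqrt(t))
    theta_per_day = -spot * pdf * vol / (2.0 * math.sqrt(t)) / days_per_year
    return gamma, theta_per_day

def one_day_pnl(spot, vol, dte, move, days_per_year=252):
    """Long delta-hedged option p/l for one day, computed two ways."""
    gamma, theta = atm_greeks(spot, vol, dte, days_per_year)
    gamma_theta = 0.5 * gamma * spot ** 2 * move ** 2 + theta
    variance_form = 0.5 * gamma * spot ** 2 * (move ** 2 - vol ** 2 / days_per_year)
    return gamma_theta, variance_form

gt_2pct, var_2pct = one_day_pnl(100.0, 0.16, 20, 0.02)
gt_1pct, var_1pct = one_day_pnl(100.0, 0.16, 20, 0.01)
print(f"2% move: gamma+theta {gt_2pct:.3f}, variance form {var_2pct:.3f}")
print(f"1% move: gamma+theta {gt_1pct:.4f}, variance form {var_1pct:.4f}")
```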
KA: the gamma theta approach is the discrete approach yea?
Yes, exactly! The gamma + theta approach is the discrete approach.
Discrete Approach (Gamma + Theta):
Calculates P&L at discrete points in time
Gamma P&L = ½ × Gamma × S² × (ΔS/S)²
Theta P&L = Theta × Δt
Assumes gamma and theta are constant over the period
Treats each Greek’s contribution separately
Continuous Approach (Variance-based):
Based on continuous rehedging theory
Integrates the gamma P&L over the path
Recognizes that gamma P&L accumulation IS the variance realization
More accurately captures the relationship between implied and realized vol
The key difference is that the variance formula emerges from the continuous hedging limit, where you’re constantly rebalancing your delta hedge. In this limit:
The accumulated gamma P&L converges to: ½ × Vega × (σ²_realized – σ²_implied) × Δt
Theta is implicitly “baked in” through the implied variance term
Why the variance formula is more accurate:
It captures path dependency better
It correctly shows breakeven when realized vol = implied vol
It’s derived from the theoretical foundation of option pricing (Black-Scholes PDE)
It avoids double-counting or missing cross-effects between Greeks
The discrete approach is useful for understanding the mechanics, but the variance approach better represents the true economics of volatility trading.
KA: What is the expression for option p/l summation from 20 dte til expiry for the discrete method of adding gamma and theta?
For the discrete method, the total option P&L from 20 DTE until expiry is the summation:
Total P&L = Σᵢ₌₁²⁰ [½ × Gammaᵢ × Sᵢ² × (ΔSᵢ/Sᵢ)² + Thetaᵢ × Δt]
Where Tᵢ = (20-i+1)/252 is the time to expiry at step i.
For an initially ATM option, the Greeks evolve roughly as:
Gamma ∝ 1/√T (increases as expiry approaches)
Theta ∝ -1/√T (becomes more negative as expiry approaches)
At expiry: The final P&L also includes the intrinsic value:
If call: max(Sₙ – K, 0) – Premium paid
If put: max(K – Sₙ, 0) – Premium paid
This discrete summation becomes increasingly inaccurate for:
Large moves (gamma changes significantly)
Longer time periods between rehedges
Options near expiry (Greeks change rapidly)
KA: So let’s do a pretend scenario where a stock starts at 100, moves 1%, we compute the p/l, the stock resets to 100 and we roll the day forward and do it again and so forth until we’ve run this for 20 days to estimate the cumulative p/l
Let’s run this scenario where the stock resets to $100 each day after a 1% move, rolling forward for 20 days.
Setup:
Stock always starts at $100 each day
Moves 1% daily, then resets
Option starts ATM with 20 DTE
IV = 16%
We recompute Greeks each day as time decays
Daily Calculations:
For each day i (where i goes from 20 DTE down to 1 DTE):
Total Cumulative P&L ≈ Σ(daily P&L) ≈ -$0.90 to -$1.00
Key Insights:
Even though realized vol (1% daily = 16% annualized) equals implied vol, we lose money due to the discrete hedging
Losses accelerate as we approach expiry (higher theta)
The stock resetting to $100 means we never capture the path-dependent benefits
This demonstrates why the variance formula (which would show ~breakeven) better represents continuous hedging reality
This artificial scenario highlights the limitation of discrete hedging analysis!
KA: Actually on the last day doesn’t the stock move 1% which will be more than the call value hedged
You’re absolutely right! This is a critical point. Let me recalculate the last day.
Day 20 (1 DTE → Expiry):
Starting: Stock at $100, option has 1 day left
T = 1/252 ≈ 0.00397
Gamma ≈ 1/(100 × 0.16 × √(2π × 1/252)) ≈ 0.126
Theta ≈ -$0.201 (huge!)
But here’s the key issue:
The ATM call with 1 DTE is worth approximately: S × N(d₁) ≈ $100 × 0.5 × (σ√T) ≈ $100 × 0.5 × 0.16 × √(1/252) ≈ $0.50
Stock moves 1% = $1.00
The call expires worth $1.00 (since stock ends at $101)
Final day P&L:
Gamma P&L from hedging = ½ × 0.126 × 10,000 × 0.0001 = $0.063
Theta decay = -$0.201
BUT: We were short the call (from delta hedging), and it expires worth $1.00 vs initial value of ~$0.50
Loss on short call = -$0.50
Total final day P&L ≈ -$0.64 (much worse than other days!)
This illustrates a crucial point about gamma trading near expiry:
With very little time left, the option value can’t keep up with large moves
A 1% move overwhelms the remaining option premium
This is why gamma positions become extremely risky near expiry
The “pin risk” is real – discrete moves can create discontinuous P&L
This makes the total 20-day cumulative loss even larger than my initial estimate – probably closer to -$1.40 to -$1.50 total.
KA: But if the stock moves $1 and the call goes from .50 to $1 then the call makes .50 and you lose .50 on the stock since you are short 1/2 share and it moves $1 (assuming .50d)
You’re absolutely correct! I made an error in my logic. Let me recalculate properly.
Day 20 (1 DTE → Expiry):
Starting position: Long 1 call (delta ≈ 0.50), Short 0.50 shares (delta hedge)
This makes much more sense! The delta hedge is doing exactly what it should – protecting against directional moves. We only lose the time decay.
Revised total 20-day cumulative P&L: Since we lose roughly the daily theta each day, and theta increases as we approach expiry, the total loss would be approximately the initial option premium (around -$1.00 to -$1.20).
This aligns with the intuition that if realized vol = implied vol, we should lose approximately the option premium over the life of the trade through theta decay, with gamma gains offsetting the acceleration of theta near expiry.
Thanks for catching that error!
KA: but there is no decay…the option expires
You’re absolutely right! On the final day, there’s no theta decay because the option expires that day.
There’s no theta on expiration day – the option just settles at intrinsic value!
This actually makes the scenario even more interesting:
Days 1-19: We lose money each day (gamma gains < theta decay)
Day 20: We break even (perfect hedge, no theta)
The cumulative P&L would be the sum of losses from days 1-19 only, which would be somewhat less than my previous estimate – probably around -$0.80 to -$0.90 total.
This highlights another key concept: on expiration day, options become pure directional instruments with no time value, so a delta hedge works perfectly with no slippage from theta.
KA: i’m suspicious of this method…maybe we recalculate the option and the hedge p/l every day
You’re right to be suspicious! Let’s carefully track the option value and hedge P&L each day. The key is that we need to track:
We lost ~7 cents despite realized vol = implied vol
This loss comes from:
Discrete hedging (we hedged at initial delta 0.509 but average delta during move was ~0.61)
One day of theta decay embedded in the valuation change
After the reset to $100 for next day, we’d rehedge to the new 19 DTE delta of ~0.51.
This shows why the discrete hedging approach systematically loses money even when realized = implied!
KA: Now run thru the whole series in this way for 20 days where you reprice, reset, repeat
I’ll run through all 20 days, repricing the option each day. Let me calculate key milestone days:
Day 1: 20 DTE → 19 DTE
Start: S=$100, C≈$1.43, Δ≈0.509
After move: S=$101, C≈$1.87
P&L: +$0.44 – $0.509 = -$0.069
Day 5: 16 DTE → 15 DTE
Start: S=$100, C≈$1.28, Δ≈0.507
After move: S=$101, C≈$1.69
P&L: +$0.41 – $0.507 = -$0.097
Day 10: 11 DTE → 10 DTE
Start: S=$100, C≈$1.06, Δ≈0.505
After move: S=$101, C≈$1.43
P&L: +$0.37 – $0.505 = -$0.135
Day 15: 6 DTE → 5 DTE
Start: S=$100, C≈$0.78, Δ≈0.503
After move: S=$101, C≈$1.09
P&L: +$0.31 – $0.503 = -$0.193
Day 19: 2 DTE → 1 DTE
Start: S=$100, C≈$0.45, Δ≈0.502
After move: S=$101, C≈$0.68
P&L: +$0.23 – $0.502 = -$0.272
Day 20: 1 DTE → Expiry
Start: S=$100, C≈$0.32, Δ≈0.501
After move: S=$101, Option expires at $1.00
P&L: +$0.68 – $0.501 = +$0.179
Total Cumulative P&L: Summing all 20 days ≈ -$2.20
Key observations:
Daily losses increase as we approach expiry (from -$0.07 to -$0.27)
Only the final day shows a profit because the option expires at full intrinsic value
The total loss ($2.20) is notably higher than the initial option premium ($1.43)
Why so much loss?
The daily reset to $100 means we constantly re-enter at ATM where gamma is highest
We suffer maximum “whipsaw” from discrete hedging
This artificial scenario maximizes the hedging slippage
In reality, stock prices follow continuous paths, which is why the variance formula (showing near breakeven) better represents actual trading outcomes.
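The reprice, reset, repeat loop is easy to run directly. Below is a minimal sketch assuming a zero-rate Black-Scholes pricer, a 252-day year, a long call hedged with short delta shares, and a +1% move every day; the names are hypothetical. Its outputs depend on those assumptions and need not match the figures quoted in the chat.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf

def bs_call(spot, strike, vol, dte, days_per_year=252):
    """Zero-rate Black-Scholes call value and delta; intrinsic value at expiry."""
    if dte <= 0:
        return max(spot - strike, 0.0), float(spot > strike)
    s = vol * math.sqrt(dte / days_per_year)
    d1 = math.log(spot / strike) / s + 0.5 * s
    return spot * N(d1) - strike * N(d1 - s), N(d1)

def reprice_reset_repeat(vol=0.16, start_dte=20, move=0.01):
    """Daily p/l of a long ATM call hedged at the open; stock resets to $100 daily."""
    pnl = []
    for dte in range(start_dte, 0, -1):
        c_open, delta = bs_call(100.0, 100.0, vol, dte)
        c_close, _ = bs_call(100.0 * (1.0 + move), 100.0, vol, dte - 1)
        option_pnl = c_close - c_open
        hedge_pnl = -delta * 100.0 * move   # short delta shares lose on an up move
        pnl.append(option_pnl + hedge_pnl)
    return pnl

daily = reprice_reset_repeat()
total = sum(daily)
for dte, p in zip(range(20, 0, -1), daily):
    print(f"{dte:>2} DTE: {p:+.4f}")
print(f"total: {total:+.4f}")
```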
KA: in the continuous hedging p/l formula how is that derived
The continuous hedging P&L formula is derived from the Black-Scholes PDE and Ito’s lemma. Here’s the key derivation:
Starting point: Black-Scholes PDE For a hedged portfolio (long option, short Δ shares):
∂V/∂t + ½σ²S²(∂²V/∂S²) + rS(∂V/∂S) - rV = 0
Key insight: The P&L from continuous delta hedging comes from two sources:
The option value change
The hedge P&L
Step-by-step derivation:
Apply Ito’s lemma to the option value:
dV = (∂V/∂t)dt + (∂V/∂S)dS + ½(∂²V/∂S²)(dS)²
Substitute dS = μSdt + σSdW and (dS)² = σ²S²dt:
dV = (∂V/∂t)dt + ΔSμdt + ΔσSdW + ½Γσ²S²dt
P&L of hedged portfolio (long option, short Δ shares):
dΠ = dV - ΔdS
= (∂V/∂t)dt + ½Γσ²S²dt
From Black-Scholes PDE, we know:
∂V/∂t = -½σ²S²Γ - rSΔ + rV
For a delta-hedged portfolio with zero interest rates:
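One way to finish the thought, offered as a sketch rather than the transcript's own wording: set r = 0 and let the stock realize a volatility σ_r that may differ from the implied volatility σ_i used to price and hedge the option (both subscripts are labels introduced here). The PDE then pins theta to implied vol, while the (dS)² term runs on realized vol:

```latex
\begin{aligned}
\frac{\partial V}{\partial t} &= -\tfrac{1}{2}\,\sigma_i^{2} S^{2}\,\Gamma
  && \text{(Black-Scholes PDE with } r = 0\text{)} \\
d\Pi &= \frac{\partial V}{\partial t}\,dt + \tfrac{1}{2}\,\Gamma\,\sigma_r^{2} S^{2}\,dt \\
     &= \tfrac{1}{2}\,\Gamma S^{2}\left(\sigma_r^{2} - \sigma_i^{2}\right)dt
\end{aligned}
```

If realized equals implied, the gamma gain offsets theta at every instant and the continuously hedged P&L is zero.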
I have a few words today that tie back, practically, to Sunday’s Money Angle about the relationship of vol to stock distributions. It’s very relevant to anyone investing in a publicly traded asset that they, with a straight face, expect to double in the next year.
Why meme stock put spreads look expensive
[This is based on a chat I responded to but I won’t doxx the stock.]
The stock is $20. The 1-year 20/10 put spread is $6.
The market is offering you odds on the proposition that this stock doesn’t tank. Said otherwise… The market thinks the most likely scenario is the stock tanks.
If you think the stock is a coin flip this looks juicy. And the nice thing about vertical spreads is if you hold to expiry, they really act like a “model-free” bar bet. Like an over/under style bet. Their value depends mostly on the probability of the stock expiring in some range and is not nearly as sensitive to magnitude as an outright option.
Anyway, you look at this put spread and think it sounds really fat.
But the pricing makes sense for an extremely risky stock.
Why?
Because of vol drag.
Remember this guy:
Think of it this way…suppose the stock has 150% annualized vol and owning it offers a risk/reward similar to many other assets.
Say a .50 Sharpe (you can decide if this is generous).
Therefore, you estimate an annual return of 75%.
Option math rests on continuous compounding ie log returns.
So what’s the median expected stock price in 1 year?
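The step is short enough to sketch. Assuming the stock is lognormal and the 75% is plugged in as the annual drift μ (an interpretive assumption, and the variable names are placeholders), the median sits at S·e^(μ − σ²/2):

```python
import math

S0 = 20.0      # stock price from the example
SIGMA = 1.50   # 150% annualized vol
SHARPE = 0.50  # assumed risk/reward
MU = SHARPE * SIGMA  # 0.75, treated here as the drift of a lognormal model (assumption)

# Vol drag: the median log return is the drift minus half the variance.
median_log_return = MU - 0.5 * SIGMA ** 2      # 0.75 - 1.125 = -0.375
median_price = S0 * math.exp(median_log_return)


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Probability the stock finishes below today's price after one year.
prob_below_start = norm_cdf(-median_log_return / SIGMA)

print(f"median price in 1 year: ${median_price:.2f}")     # about $13.75
print(f"P(finish below $20):    {prob_below_start:.1%}")  # about 59.9%
```

Under those assumptions the median outcome is a stock in the low teens, and it finishes below $20 about 60% of the time despite the large expected return.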
Really expensive put spreads go hand-in-hand with high volatility.
Whether that’s an opportunity is up to you.
(This pricing is very common and smart, ie hard to beat, in commodity markets during short squeezes. I’ve covered this in cotton, silver, and this one in nat gas).
High vol names tend to be distributed like the orange not the blue.
Do you see how this all ties together? I’m explaining why the pricing might look like it does. You can say that the model is academic, so its discrepancy is an opportunity. Yet, the quotes you see don’t come from an ivory tower but a marketplace.
The option skew shifts in response to the buying and selling, adjusting theory for what the gamblers and sharps, anchored in the reality of betting to make money, think.*
It might be a coincidence that the pricing lines up pretty well with how vol influences return math or maybe these put spreads are positive expectancy sitting on the sidewalk.
Either way there is a sound explanation for why those put spreads “look high with the naked eye”. Alas, the only way to credibly say “that’s stupid” is to trade against it. TAFO.
*The more advanced audience knows skew is complicated because it’s the moderator of a dialogue between “what’s terminal value at expiry” and spot/vol correlation in the meantime. But that topic is yak shaving for most.
In options land, you constantly hear the returns are “non-linear”, “convex”, or maybe even “have curvature”.
It comes from the fact that there’s a squared term in the p/l function which comes from gamma. Mechanically, it comes from the fact that your position size isn’t constant. It accelerates. If your position starts as 1,000 shares and the asset moves $2 but your position at the end of the move is say 3,000 shares, then your p/l is going to be more than $2,000. At some point you were bigger than when you started.
Your delta, also known as the sensitivity of your p/l to the stock, changed.
But I thought a fun way to show this would be to see how it’s exactly the same math as what you learned in middle school when you learned about speed and distance.
There are 2 graphs in particular that every kid learns that both show:
Distance = rate x time
A time-distance graph
The slope is rise/run, or in this chart, distance/time. 400 miles in 10 hours = a speed of 40 mph
I know it’s hard to remember but there’s another chart they are subjected to.
A speed-time graph
There’s no acceleration as the speed is a constant 40 mph. The area of the rectangle under the curve is base x height or 400 miles. It represents the distance. Base x height maps to time x rate.
Let’s extend this to a situation where there is acceleration. Instead of a constant 40 mph, let’s average 40 mph by starting at 30mph and ending at 50mph over the 10-hour journey.
We estimate acceleration with: (ending speed – starting speed) / time
(50-30) mph / 10 hours = 2 mph per hour
Averaging 40 mph for 10 hours covers the same 400 miles of distance as traveling 40 mph at constant speed but the graphs are different.
Constant Speed vs Acceleration
What to notice:
🌙The Time-Distance chart with acceleration is “curved”
🌙The Time-Speed chart with acceleration decomposes the distance covered into a baseline of 300 miles (30mph starting speed x 10 hours) plus the additional 100 miles due to the 2 mph increase in pace every hour. The area of a triangle is just 1/2 the area of a rectangle.
Another way to think of that “triangle” is to just reason out the gain in distance.
With 2 mph of acceleration per hour for 10 hours, your total gain in speed is 20 mph. The average boost over the trip is just half that, or 10 mph.
avg boost x hours
where:
avg boost = 2mph * 10 hours/2 = 10 mph
10 mph x 10 hours = 100 miles of distance covered due to acceleration
Or generally:
(acceleration per hour) * change in time/2 * change in time
distance due to acceleration = 1/2 * acceleration per hour * (change in time)²
[You may remember the function from back in the day: d = v₀t + ½at²
Again, the ½at² term is the triangle area of the Time-Speed Chart ]
From acceleration to gamma
Here’s a straight substitution:
speed ➡️ delta def: share equivalent position
acceleration ➡️ gamma def: how many deltas you gain per $1 change in the stock
distance ➡️ P/L def: profit or loss
change in time ➡️ change in stock price
Assume:
you have shares of a stock that starts at 0 (or any starting price)
your option position gives you a constant gamma of 200 shares therefore every $1 the stock goes up you get longer 200 shares
The graphs are exactly the same we just changed the words.
Time-Distance chart ➡️Stock vs Profit chart
Time-Speed chart ➡️ Stock vs Delta Position
In the zero gamma scenario, you start with 4,000 shares and the stock goes up $10.
In the 200 gamma scenario, you start with 3,000 shares and the stock goes up $10. That means you start with 3,000 shares but end with 5,000.
In the speed example:
distance due to acceleration = 1/2 * acceleration per hour * (change in time)²
The analog for options:
p/l due to gamma = 1/2 * gamma per $1 * (change in stock)²
Which explains why option p/l is a squared function of the stock price change. Hence the non-linear or “curved” p/l.
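To make the "same math, different words" point concrete, here is a small sketch in which one function (the name is made up for illustration) computes both the road trip and the share position. It is just the rectangle plus the triangle under a straight line.

```python
def area_under_line(start_level, slope, run):
    """Rectangle plus triangle under a straight line: start*run + 1/2*slope*run^2."""
    return start_level * run + 0.5 * slope * run ** 2


# Speed version: level = mph, slope = mph gained per hour, run = hours.
miles_constant = area_under_line(40, 0, 10)  # steady 40 mph
miles_accel = area_under_line(30, 2, 10)     # 30 mph rising to 50 mph

# Options version: level = share delta, slope = gamma (shares per $1), run = $ move.
pnl_zero_gamma = area_under_line(4_000, 0, 10)    # flat 4,000 shares
pnl_with_gamma = area_under_line(3_000, 200, 10)  # 3,000 shares growing to 5,000

gamma_part = 0.5 * 200 * 10 ** 2  # the "triangle": p/l due to gamma alone

print(miles_constant, miles_accel)     # 400.0 400.0
print(pnl_zero_gamma, pnl_with_gamma)  # 40000.0 40000.0
print(gamma_part)                      # 10000.0
```

Both trips cover 400 miles and both positions make $40,000. In the gamma case, $10,000 of that comes from the squared term.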
Practice
If you own an ATM straddle on a stock that has zero delta and your brokerage model says your gamma is .20 per straddle (so .10 for each of the call and the put) then for every $1 the stock goes up you get longer 20 deltas or “shares” (remember options have a 100 multiplier)
If the stock goes up $3, then you estimate your p/l due to gamma as:
1/2 * 20 * $3² ~ $90
If the theta or cost of the straddle is .10, or $10 with the multiplier, then your net p/l due to the stock’s movement is $80.
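The practice numbers can be sketched in a few lines. This assumes gamma and theta stay constant over the move, which is only an approximation, and the breakeven line is an extra step not stated above: it solves 1/2 * gamma * move² = theta for the move.

```python
import math

GAMMA_PER_STRADDLE = 0.20  # from the brokerage model in the example
MULTIPLIER = 100           # shares per option contract
THETA_DOLLARS = 10.0       # .10 of theta times the multiplier

share_gamma = GAMMA_PER_STRADDLE * MULTIPLIER  # 20 shares gained per $1 move


def gamma_pnl(gamma_in_shares, move):
    """P/L from gamma alone, holding gamma constant over the move."""
    return 0.5 * gamma_in_shares * move ** 2


move = 3.0
gross = gamma_pnl(share_gamma, move)  # 1/2 * 20 * 9 = 90
net = gross - THETA_DOLLARS           # 90 - 10 = 80

# Move size at which the gamma p/l just pays for the theta.
breakeven_move = math.sqrt(2 * THETA_DOLLARS / share_gamma)

print(f"gross gamma p/l: ${gross:.0f}, net of theta: ${net:.0f}")
print(f"breakeven move:  ${breakeven_move:.2f}")
```

With these inputs the stock needs to move about $1 for the gamma to cover the day's decay. Anything bigger is profit that grows with the square of the move.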
Notes
1. Your gamma and theta are themselves not constant so this is an estimate. The remainder of your p/l will mostly be driven by the change in implied vol of the straddle times the vega
2. The general formula for delta-hedged option p/l is very accessible if you understand the above
It’s the tug-of-war between how much we moved vs what was expected. If positive, we moved more than implied and if it’s negative the move couldn’t cover the theta.
For long options, theta is a cost. Similarly, if you are short options you only get paid the “full theta” if the stock doesn’t move at all. How much you collect is reduced by how much we move, but the move function has squared exposure.
Hope that helps the “theta is an edge” crowd who say you sell options because they decay. Well, yes there’s always a theta but it’s only half the ledger.
June expiry in USO: vol change on the 3% rally. OPEC agreed over the weekend to hike production…but you saw the Ukraine-Russia developments. Competing bullish/bearish effects. Vol is lower today if you look at the June surface.
I wouldn’t share Tina’s answer key even if I had it. And I don’t. But understanding how to answer my “interview” question is definitely a prerequisite to Tina’s process.
And we can discuss that because the answer is available from putting together a few key concepts that I’ve already written about in detail. The concepts are conveniently self-contained within just 2 previous articles.
To try to pack even more learning we’ll add some spice for those of you who want to grab a pen and paper to give it a go — I will pose the question with some data to you and Google’s NotebookLM to see how far you each get.
If you want to just read along through the solution without burning glucose nobody is gonna judge you.
Decomposing the vol change on Monday after the event
Let’s start with the 2 articles that hold the keys:
OPEC nations are having a meeting on the weekend of May 31–June 1.
At the close of May 30, near-dated options have an “event premium” baked into the implied vols (note: these vols are based on a standard 365-day model).
In other words, the quoted vol includes the upcoming event, as opposed to having the event premium removed. If we had the event premium removed, we’d call that the “clean” vol.
On Monday, June 2, the implied vol is actually up, and the USO (oil) ETF is up 4%.
Question:
Net of two opposing forces —
The weekend effect and
The fact that the event premium comes out of the surface —
is the clean vol up or down on Monday morning, and by how much?
We’re focused on the June 13th expiry.
Vol at the close of May 30: 36.2%
Vol on the morning of June 2: 37.3%
Action items:
Can you conceptually set up the solution to the question?
Can you compute the clean vol change (i.e., back out the effect of weekend decay and event premium)?
This is your time to try #1. Include the assumptions you’d need to make. (This would be a good interview question for an option trader.)
I’ll confess, there’s not enough info to solve #2. I posed it to the LLM to see what it would say (all of the LLM output will be shared below).
I’ll handle #2 in the solution but there’s enough here to at least identify the data you need.
“dirty vs clean vols”
Dirty vols are the IVs generated by an option model. Whenever you look at an IV from a vendor or broker you are looking at a dirty vol.
The most common models assume a 365-day year. They assume every day is created equal. There are also 251-day models (these subtract weekends and holidays). A 251-day model assumes time only passes on business days. Regardless of which model you use (as long as you don’t mix and match), if you are comparing vols between assets on the same calendar this assumption cancels out. Not a concern.
“Clean” vols are an attempt to smooth vol changes rather than accept artifacts from the market understanding that vol time doesn’t pass uniformly. Not every day that rolls off a calendar is 1/365th in vol time. And weekends aren’t just zero time.
Vols can be cleaned by adjusting the DTE in a standard model to the DTE in a more accurately specified calendar.
But we can clean vols for events as well.
Clean vols remove or extract events so they can be compared both along an asset’s own term structure AND across assets. If NVDA has high vol because of earnings it doesn’t mean it is expensive compared to QQQ. You want to compare event-normalized or clean vols.
cleaning the USO vol for both the OPEC meeting and the weekend
Let’s focus on what we observed from our dirty 365-day model for the June 13th expiry.
Vol at the close of May 30: 36.2%
Vol on the morning of June 2: 37.3%
In that model how many days to expiration (DTE) are there at the close of May 30th and the open of June 2nd?
Close of May 30th = 14 DTE or 14/365 ~ .03836 of a year
Open on June 2nd = 12 DTE or 12/365 ~ .03288 of a year
Because of weekend theta effect, an artifact of treating weekend days the same as business days, we expect dirty vols to go up on Monday just visually if the clean vol is unchanged. To reiterate from that post, a 365-day model treats every day the same, but the market understands that “vol time passes more slowly when the market is closed” — which means in a properly specified calendar there is relatively less DTE transpiring over the weekend than what you see in a 365-day calendar. On Monday morning the 365-day model thinks there is relatively less DTE compared to a better-specified model so for a given option price the dirty vol (ie 365-day model vol) must adjust higher.
In the dirty model, .00548 of the year elapsed [ie .03836 – .03288]
We’ll use a calendar specification from that weekend theta post where we say that non-business days count as 50% of a business day for vol time purposes.
If we continue to denominate our basic unit, a full trading day, as 1.0 and weekend days or holidays as .5 we get the following tenor:
251 x 1.0 + 114 * .5 = 308 day calendar.
Using this calendar let’s see what the DTE is on the close of May 30th.
Close of May 30th = 10 regular days + 4 weekend days which get a weight of .5, for a total of 12. We divide that by the 308-day ruler. 12/308 ~ .03896 of a year
Note there is more time to expiration as a percent of a year than the 365-day model which only had .03836!
To convert the dirty vol to the cleaned 308-day model vol we multiply by the root of the relative DTEs
36.2% x √(.03836/ .03896) = 35.9%
Note that we expect a lower IV in the clean model because for the same option price we had more time ‘til expiry, so the vol must be lower in the model with more DTE. This is a good double-check on numerator/denominator confusion when converting.
Where are we?
At the close of May 30th, we observe a dirty vol of 36.2%. When we clean it with a 308-day model, which assumes variance passes half as fast on the weekend, we compute a clean vol of 35.9%.
So for the vol to be “unchanged on the morning of June 2nd” we need to observe a dirty vol that translates to a 35.9% clean vol.
That’s a simple algebra problem using the relative DTEs on the morning of June 2.
We already calculated the dirty DTE that morning to be .03288 as a fraction of the year.
What is the clean DTE according to a 308-day model?
10 business days + 2 weekend days which get 50% weight = 11 days divided by 308-day year:
11/308 = .03571 = clean DTE as a fraction of a year
Recall:
Dirty DTE on the June 2nd open is only .03288 of a year
Note that the gap between clean DTE and dirty DTE has grown even wider which is a clue that the dirty vol the model implies must go higher for a given option price.
Here’s the algebra:
Clean IV * √(clean DTE/Dirty DTE) = Dirty IV
⚖️You may have noticed that this is just a re-arranged balancing identity where:
Clean Variance * Clean Days = Dirty Variance * Dirty Days
It might be easier to remember that and conjure the algebra as needed.
Again the question we care about is what does the dirty IV have to be such that we believe the clean vol of 35.9% is unchanged?
35.9% * √(.03571/.03288) = 37.4%
The dirty vol must go up to 37.4% for the clean vol to be unchanged!
The dirty vol went up to 37.3%…pretty close to unchanged clean vol even though any vendor tool will tell you that IV is up.
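The Friday and Monday-morning conversions above can be sketched as one small helper built on the balancing identity (variance × time is the same in both calendars). The function names are placeholders, and the 0.5 weekend weight and 251/114 day counts are the calendar assumptions from the text.

```python
import math

DIRTY_YEAR = 365.0
CLEAN_YEAR = 251 * 1.0 + 114 * 0.5  # 308-day calendar: non-business days count half


def clean_dte(business_days, non_business_days, weight=0.5):
    """Time to expiry as a fraction of the 308-day clean year."""
    return (business_days + weight * non_business_days) / CLEAN_YEAR


def convert_vol(vol, dte_from, dte_to):
    """Rescale a vol between calendars so that variance * time is preserved."""
    return vol * math.sqrt(dte_from / dte_to)


# Close of Friday May 30: 14 calendar days = 10 business + 4 weekend days.
fri_dirty_dte = 14 / DIRTY_YEAR
fri_clean_dte = clean_dte(10, 4)
fri_clean_vol = convert_vol(0.362, fri_dirty_dte, fri_clean_dte)

# Open of Monday June 2: 12 calendar days = 10 business + 2 weekend days.
mon_dirty_dte = 12 / DIRTY_YEAR
mon_clean_dte = clean_dte(10, 2)
dirty_vol_if_unchanged = convert_vol(fri_clean_vol, mon_clean_dte, mon_dirty_dte)

print(f"Friday clean vol:               {fri_clean_vol:.1%}")           # about 35.9%
print(f"Monday dirty vol, clean unch'd: {dirty_vol_if_unchanged:.1%}")  # about 37.4%
```

Swapping in a different weekend weight changes `CLEAN_YEAR` and both DTEs together, which is a quick way to see how sensitive the "unchanged" level is to the calendar you believe in.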
I suspect Tina’s dashboard said vol was much closer to unchanged than up over 1 click.
…except there’s this giant elephant in the room.
How much vol should come out of the surface after the meeting?
It looks like there was a small amount of event vol baked in. I’d estimate about a 2.50% straddle for the move which equates to a 1-day 50% annualized vol.
The picture below is the result of changing the move size until you get a relatively smooth vol term structure.
The smooth vol term structure imputes a dirty, event-extracted or base vol of ~ 35%
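For the straddle-to-vol conversion, here is a rough sketch using the common rule of thumb that an at-the-money straddle is worth about √(2/π) ≈ 0.8 of a standard deviation. Both that rule and the 252-day annualization are assumptions here. The text doesn't say which convention produced the 50%.

```python
import math

STRADDLE_PCT = 0.025                        # estimated event straddle, as a % of spot
STRADDLE_TO_SD = math.sqrt(2.0 / math.pi)   # ~0.80: mean absolute move per 1 std dev
TRADING_DAYS = 252                          # annualization convention (assumption)

one_day_sd = STRADDLE_PCT / STRADDLE_TO_SD           # implied 1-day standard deviation
annualized_event_vol = one_day_sd * math.sqrt(TRADING_DAYS)

print(f"1-day std dev:        {one_day_sd:.2%}")
print(f"annualized event vol: {annualized_event_vol:.1%}")
```

That lands within a point of the 50% quoted above.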
Let’s recap our expectations recalling that dirty vol closed on Friday at 36.2%
We expected dirty vol to go up to 37.4% to keep the clean vol unchanged net of the weekend effect. +1.2% expectation
We expected dirty vol to fall to 35% once the OPEC meeting has passed. -1.2% expectation.
They cancel out!
On balance we expected the dirty vol to be unchanged on Monday at 36.2% to reflect a clean vol falling from 35.9% to the event-free 35% net of calendar effects.
We observe that the dirty vol actually increased to 37.3% so we can say that oil clean vol was up on the day.
If there was no event baked into the surface, our baseline would not be “clean vol falls on Monday” and we would therefore have said Monday’s vol was unchanged despite the dirty vols optically being higher.
💬Inserting my own comment
It’s not unusual to find this very confusing. Vol cleaning is a necessary step for professional vol trading. It’s the epitome of my measurement not prediction idea where much of the battle is just seeing the present clearly. “Is vol actually up or down?” is a hard question.
To get more grimy — think about earnings season where lots of single-name vols are higher because of upcoming announcements. If you simply tracked implied correlation across the year without normalizing base vols it would appear that they became extremely low 4x a year as stock vols increased relative to the index.
If you don’t normalize for events, then you need to compartmentalize what normal levels of implied correlation are inside and outside of earnings season.
You can start to understand why Tina is quiet about her own process. These processes vary across professional vol shops but what they all have in common is they have a process — you can’t trade high volumes for slivers of edge while being wrong about whether the vol is 25.1 vs 25.4.
The good news is for anyone else using options these differences are just brain damage. If you’re trying to trade for .3 vols of edge you’re already cooked by your execution costs and broker funding rates. Long/short funds don’t try to trade for bps of edge any more than option users should be trading for a quarter click.
Monday, June 2
Back to the main narrative.
On Monday, June 2nd we saw that the dirty vol opened up 1.1 points, which coincidentally matched how much we think the clean vol was up. The ETF itself opened 4% higher. It wasn’t that the OPEC production hikes were bullish (as in they were less than expected); that was the weekend the Ukraine drones surprised parked Russian bombers.
By the end of that Monday, oil relaxed to being up only 3% on the day and dirty vol for the June 13 expiry dropped to 35%. What clean vol is that by the end of June 2nd?
Dirty DTE = 11/365 ~ .03014
Clean DTE = 10/308 ~ .03247
First, look what happened: after June 2nd elapses, the gap between the two DTEs narrows. It makes sense. When one business day elapses, 1/308 rolls off the board in a clean model but only 1/365 elapses in a calendar day model. This happens all week as the dirty days increase relative to the clean days (and therefore the dirty vol declines steadily thru the week to keep the clean vol unchanged before sawtoothing up over the weekend again).
For a dirty vol of 35%, we know the clean vol must still be lower because for a given option price the clean model still sees more DTE.
The algebra:
Clean vol = Dirty Vol * √(Dirty DTE/Clean DTE)
Clean vol = 35% * √(.03014/.03247)
Clean vol = 33.7%
Clean vol finished the day at 33.7%, down 2.2 points from 35.9% on Friday’s close and 1.3 points below the 35% we expected it to settle at after the meeting. The morning’s bid in clean vol was gone by the close: the dirties are down and the cleans are down even more.
NotebookLM
I fed NotebookLM both articles and that prompt from earlier.
It’s a pretty amazing synthesis tool, auto-generating a study guide, briefing, FAQ, mind map and even a podcast so you can learn on the go (my interest in this was renewed as I’ve been listening to an auto-generated podcast by a professor friend on the topic of business education).
We watched Ocean’s Eleven with my older son Friday night (we’ve recently established a Friday night ritual where we rotate who picks the movie and who picks the pizza. This past Friday we did a double feature — Dodgeball, which the kids loved and O11 once the little guy went to bed).
Clooney gives a tiny speech in response to Pitt’s skepticism about his motive behind the heist.
Rusty: I need the reason. And don’t say money. Why do this?
Danny: Why not do it?
[Rusty shakes his head]
Danny: ‘Cause yesterday I walked out of the joint after losing four years of my life and you’re cold-decking “Teen Beat” cover boys. [pause]‘Cause the house always wins. Play long enough, you never change the stakes, the house takes you. Unless, when that perfect hand comes along, you bet big, then you take the house.
I emphasized that part because it’s a catchy encapsulation of what Mason Malmuth writes in Gambling Theory & Other Topics, a book I read as a trainee. The most important principle in gambling is to employ a non-self-weighting strategy. In other words, vary your bet size with the opportunity.
I don’t want to get too hung up on whether this is the “most important” as Malmuth contends (you can certainly make the case for “having an edge” in the first place), but it might be the most underappreciated with respect to how we port it to real life. Varying your bet size in blackjack is well-understood, but Malmuth argues for more obscure examples like the brevity of the Gettysburg Address, a masterful bet on the right words and quantity of words in which Lincoln varied his rhetoric for maximal payoff.
It’s a provocative reminder to be careful where you enable life-decision autopilot. If you need inspiration to find areas of your life where you can vary your metaphorical bets, paste this whole section into an LLM and prompt it to give examples in your real life.
[There’s probably an interesting essay to be written about the tension between the value of habits vs the punchy payoff of straying from them in deliberate ways.]
Money Angle For Masochists
🔘Interview riddle
You press a button that gives you a randomly uniformly distributed number between $0 and $1
Giant swaths of the investment industry are doing nothing but selling soothing balms. Placebos. It is the vitamin industry at best and a penis-pill pop-up banner when it “democratizes private investment.”
But it could be no other way.
If you can statistically prove your strategy has alpha you also know exactly how to price it to take all the surplus. The market for making alpha is like any market — it equilibrates based on supply and demand. There’s more wealth out there in search of a return than the capacity to absorb which is why pod shop bosses feast.
I’m not knocking this. Their CAGR, ie return net of vol drag, through insane market environments after fees is perfectly fine. In fact, it seems that the HF market is more efficient compared to the amount of over-earning for beta that went on in the early 2000s between the dot-com meltdown and the GFC. People made several lifetimes of loot telling just-so stories to allocators who didn’t notice how Moneyball applied to their field.
The GFC disillusionment revealed many stories to be no more than fairy tales. There was an opening for a new story that perfectly complemented the spread of technological capacity and its rider, technical skill.
Evidence-based investing.
With large data sets and faster computers we could solve investing like a physics problem. Engineers aren’t fooled by steak dinners and silver-tongues. The softest stuff they read is Kahneman, who supplies the why behind why their factors work. All of it has the sheen of the scientific method.
Except there’s one lingering inconvenience. It’s an inconvenience that’s obvious to gamblers. I think most investors can feel it in their bones. Not surprising, we are all natural gamblers to some extent (we eat hot dogs and let strangers drive us around).
The inconvenience is the uselessly long feedback loops. Which we are going to discuss. But it’s worth mentioning that the feedback loops double as a defense for asset managers. They get to say “this works over time and if it worked all the time it wouldn’t work”. That’s true but it doesn’t solve my problem — I STILL HAVE NO FEEDBACK — plus the defense IS convenient to the fee collector.
So we’re left with “keep buying my pills because they might work”. Good luck getting a refund if the fish oil doesn’t make you live longer. The whole arrangement is irreducibly uncomfortable. That’s why it’s called the Paradox of Provable Alpha (and why I need to one day finish writing moontowermoney).
All of these thoughts were stirred up again as I listened to Adam Butler on Excess Returns.
An early quote in the interview:
The size of these edges is so small relative to the noise we encounter daily — especially compared to the gyrations of the underlying indices — that it’s very difficult to make high-confidence, informed choices in advance. In other words, it’s hard to know which edges or strategies to allocate to in a portfolio with any certainty that they’ll outperform a random selection of other possible strategies over the next 10, 20, or 30 years of your investment horizon…your skill in selecting strategies in advance based on even very long histories of performance is pretty close to zero.
I should clarify — Adam distinguishes investment or factor style edges from trading or niche forms of investment that rely on some form of information advantages that have built over time. I think Adam would agree with me when I say trading is like any other business but for superficial reasons gets confused with investing.
I agree with his understanding of pod shops:
Pod shops are really looking for people that genuinely have alpha, so I think it’s useful to kind of distinguish between what we might call sort of systematic factor strategies and alpha.
Alpha comes from somebody who has very particular niche insight or information or experience within a fairly narrow domain of the market. So, for example, we have a client who allocates to a municipal bond manager. Now, this manager has a hard cap at about a billion dollars. The team that runs it spun out of what used to be the largest muni market-making desk — worked there for 20–30 years.
What did that give them? Well, it gave them access to knowledge of where all of the flows from muni bonds — all of the issuance from the muni bond sector — are coming from, the different state governments, who the decision-makers are there, how they can get inside information on what type of issuance is coming down the pipe. And then, being at the center of flows in the muni market — which is a very niche segment of the market — right?
I think that’s just one example, but there are many. For example, somebody who worked for 20 years in the electricity markets — and electricity is a very nuanced pricing market with a very small number of key players, and is largely driven by changes in regulations at the state level and the county level. So, having very specialized knowledge of that, from having worked and gained experience inside the sector, gives you a real edge.
Right, so these are the types of strategies and people that the pod shops are looking for right now. These typically tend to be fairly illiquid strategies — right? You can’t have Elliott Management, a $70 billion firm, running just a niche electricity strategy or a niche muni strategy. But the goal is to find hundreds of people who are all running these niche little strategies, that will all require liquidity to take advantage of opportunities at completely different times from one another, and putting them all together in a diversified basket.
Now, I’m sure there are also very scalable strategies in there as well, that maybe are running more liquid equity strategies or option strategies or whatever. What I fundamentally believe — and my insight from knowing people at those shops — is that the majority of the alpha that you can’t get anywhere else at scale comes from the assembly of many different, less liquid, small niche players that are all operating together in an ensemble.
That last sentence harkens right back to the idea of combining multiple strategies into a portfolio that has a higher Sharpe than any of the constituents by letting the zigs neutralize zags to shrink the denominator.
So I’m nodding along when the host, Jack, drops a tight line:
I think Corey [Hoffstein] showed in Factor Fimbulwinter that the amount of time we would need to show that is longer than our investing lifetime.
So this does become about faith.
The verb “showed” following the subject “Corey” is a clue that I get to learn something cool today.
I’m going to jump to the end because I think Corey stages why he takes the approach he does in the article (emphasis mine):
The question we must answer, then, is, “when does statistically significant apply and when does it not?” How can we use it as a justification in one place and completely ignore it in others?
Furthermore, if we are going to rely on hundreds of years of data to establish significance, how can we determine when something is “broken” if the statistical evidence does not support it?
Price-to-book may very well be broken. But that is not the point of this commentary. The point is simply that the same tools we use to establish and defend factors may prevent us from tearing them down.
Corey uses fire to fight fire.
This is where the learning begins. Let’s see what he does.
We ran the following experiment:
1. Take the full history for the factor and calculate prior estimates for mean annualized return and standard error of the mean.
2. De-mean the time-series.
3. Randomly select a 12-month chunk of returns from the time series and use the data to perform a Bayesian update to our mean annualized return.
4. Repeat step 3 until the annualized return is no longer statistically non-zero at a 99% confidence threshold.
For each factor, we ran this test 10,000 times, creating a distribution that tells us how many years into the future we would have to wait until we were certain, from a statistical perspective, that the factor is no longer significant.
Sixty-seven years.
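The experiment's steps can be sketched in code, with heavy caveats. The history below is synthetic (a made-up factor with a 0.4% monthly mean and 3.5% monthly vol), the update is a textbook normal-normal conjugate update with the annual variance treated as known, and significance is a two-sided z-test at 2.576. The article doesn't spell out any of those details, so treat every one of them as an assumption.

```python
import math
import random
import statistics


def years_until_insignificant(monthly_returns, rng, z_crit=2.576, max_years=2_000):
    """Count the yearly updates needed before the posterior mean stops being significant."""
    n_years = len(monthly_returns) // 12
    annual = [sum(monthly_returns[12 * i:12 * i + 12]) for i in range(n_years)]
    obs_var = statistics.variance(annual)  # annual return variance, treated as known
    post_mean = statistics.mean(annual)    # prior mean = historical mean
    post_var = obs_var / n_years           # prior variance = (standard error)^2

    monthly_mean = statistics.mean(monthly_returns)
    flat = [r - monthly_mean for r in monthly_returns]  # de-meaned: the edge is gone

    for year in range(1, max_years + 1):
        start = rng.randrange(len(flat) - 11)
        chunk_return = sum(flat[start:start + 12])  # one random 12-month chunk
        precision = 1 / post_var + 1 / obs_var      # normal-normal conjugate update
        post_mean = (post_mean / post_var + chunk_return / obs_var) / precision
        post_var = 1 / precision
        if abs(post_mean) / math.sqrt(post_var) < z_crit:
            return year
    return max_years


rng = random.Random(42)
# Hypothetical factor history: 90 years of monthly returns.
history = [rng.gauss(0.004, 0.035) for _ in range(90 * 12)]
waits = [years_until_insignificant(history, rng) for _ in range(200)]
print("median years until no longer significant:", statistics.median(waits))
```

The number it prints belongs to the synthetic inputs, not to any real factor. The point is the mechanism: a long, favorable history is a heavy prior, and a year of flat returns barely moves it.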
Ok, I’m going to raise my hand in class.
I didn’t really understand the method.
Awesome I get to learn something new…which means you do too! This is pretty cool.
Luckily, there’s a tireless teacher known as ChatGPT. I wrangled with this professor at office hours until I was able to have it teach me in words I or a middle-schooler can understand.
🧪 How They “Repeat with New 12-Month Chunks” — Using a Coin Flip Example
We’re walking through the logic of how a statistical test updates with each new batch of data — using a coin flip as our analogy.
🎯 The Goal
We want to detect whether a strategy (like a stock factor or a biased coin) has stopped working — i.e., its returns have gone flat.
Think of it like this:
A coin used to land heads 60% of the time.
But now it’s just a fair coin (50/50), we just don’t know it yet.
So we flip it 12 times (like one year of monthly returns), check what it shows, and keep flipping until we’re statistically convinced it’s no longer special.
🧪 The Setup
The coin is now fair (true heads probability = 50%).
We start with a belief: “maybe it’s still biased to 60%.”
We flip the coin in 12-flip chunks, and after each chunk, we update our belief.
This mimics what the researchers did — taking random 12-month samples from a flat return series and asking: “Does this still look like a working strategy?”
🔁 Step-by-Step Walkthrough (One Simulation)
Let’s pretend we have a long list of coin flips (each 1 = heads, 0 = tails), all drawn from a fair coin.
We’ll take this in chunks of 12 flips, like 12 months of flat returns.
1️⃣ Year 1 — First 12 flips
[1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
Number of heads: 6
Sample mean: 6 / 12 = 0.50
We compare this to our hypothesis that the coin is biased to 60%:
“Is 50% close enough to 60% that we still believe the coin is special?”
Yes — this result is plausible from a 60% coin, so we keep going.
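To put a number on "plausible", one option is the exact binomial tail: how often would a genuine 60% coin show 6 or fewer heads in 12 flips? A minimal sketch (the helper name `binom_cdf` is mine, not part of the original walkthrough):

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Chance that a 60% coin gives 6 or fewer heads in one 12-flip "year"
p_six_or_fewer = binom_cdf(6, 12, 0.60)
print(f"P(<= 6 heads in 12 flips | 60% coin) = {p_six_or_fewer:.3f}")
```

That works out to about one time in three, so a single 6-of-12 year tells us very little either way.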
2️⃣ Year 2 — Next 12 flips
[1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
Heads: 5
Mean: 5 / 12 = 0.4167
Now combine both years:
24 total flips
11 total heads
Cumulative mean: 11 / 24 = 0.458
Still not far enough from 0.60 to be statistically confident the coin isn’t biased.
3️⃣ Year 3 — Another 12 flips
[0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
Heads: 6
Cumulative now: 36 flips, 17 heads
Cumulative mean: 17 / 36 = 0.472
Now let’s check this against our original belief (that the coin is 60%).
We run a z-test:
Expected mean under H₀ (the hypothesis): 0.60
Standard error: sqrt(0.6 * 0.4 / 36) ≈ 0.0816
Z = (0.472 - 0.60) / 0.0816 ≈ -1.57
Since -1.57 is not extreme enough (we need z < -2.58 to reject at 99% confidence, two-sided), we can’t yet reject the idea that the coin is still biased to 60%.
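The arithmetic above is easy to reproduce. A small sketch using the three years of heads counts from the walkthrough (variable names are my own):

```python
from math import sqrt

heads_by_year = [6, 5, 6]   # heads in each 12-flip chunk above
flips_per_year = 12
p0 = 0.60                   # hypothesis under test: the coin is still 60% heads
z_crit = 2.58               # 99% threshold used in the text

total_heads = sum(heads_by_year)
total_flips = flips_per_year * len(heads_by_year)

p_hat = total_heads / total_flips        # cumulative sample mean
se = sqrt(p0 * (1 - p0) / total_flips)   # standard error under the hypothesis
z = (p_hat - p0) / se
reject = z < -z_crit                     # True once we stop believing in 60%

print(f"mean={p_hat:.3f}  se={se:.4f}  z={z:.2f}  reject={reject}")
```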
🔁 Repeat the Process
Each year:
Add 12 more flips
Update the total number of heads and flips
Recalculate the cumulative mean
Re-test: “Are we now confident the coin isn’t biased?”
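As the flips pile up, the sample mean settles near 0.50 while the standard error keeps shrinking. Eventually the gap to 0.60 is more than 2.58 standard errors wide and the test crosses the 99% threshold. At that point, we’d say: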
“I’m now 99% confident this coin is no longer special.”
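Here is how one full run of that loop might look in code. It is a sketch of the coin analogy only, using the same z-test as above; the function name `years_until_rejected` and the `max_years` safety cap are my own additions.

```python
import random
from math import sqrt

def years_until_rejected(rng, p_true=0.50, p0=0.60, flips_per_year=12,
                         z_crit=2.58, max_years=1000):
    """Flip a coin with true heads probability p_true in yearly chunks.
    Return the first year the z-test rejects 'heads probability is p0',
    or None if that never happens within max_years."""
    heads = 0
    flips = 0
    for year in range(1, max_years + 1):
        heads += sum(rng.random() < p_true for _ in range(flips_per_year))
        flips += flips_per_year
        z = (heads / flips - p0) / sqrt(p0 * (1 - p0) / flips)
        if z < -z_crit:
            return year
    return None

rng = random.Random(0)  # fixed seed so the run is repeatable
print("Years until the test gives up on the coin:", years_until_rejected(rng))
```

Change the seed and the answer moves around, which is exactly why one run is not enough.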
💡 Why This Works
Even if the coin is fair, short sequences can look biased just by chance. You might see 8 heads out of 12 once in a while — that doesn’t mean the coin works.
So the researchers repeat this full process — from scratch — 10,000 times, each with a new random sequence of fair flips.
Then they record:
“How many years did it take before the test figured out the coin was dead?”
On average, the answer was:
67 years.
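To see the "repeat 10,000 times" step in action, the single-run loop can be wrapped in a simulation. Note the 67-year figure comes from the factor return data in the study, not from a coin, so this sketch of the analogy will not reproduce it. Names and the `max_years` cap are assumptions of mine.

```python
import random
from math import sqrt
from statistics import mean, median

def years_until_rejected(rng, p_true=0.50, p0=0.60, flips_per_year=12,
                         z_crit=2.58, max_years=1000):
    """First year the z-test rejects 'heads probability is p0', else None."""
    heads = 0
    flips = 0
    for year in range(1, max_years + 1):
        heads += sum(rng.random() < p_true for _ in range(flips_per_year))
        flips += flips_per_year
        z = (heads / flips - p0) / sqrt(p0 * (1 - p0) / flips)
        if z < -z_crit:
            return year
    return None

rng = random.Random(42)
n_sims = 10_000
results = [years_until_rejected(rng) for _ in range(n_sims)]
finished = [y for y in results if y is not None]  # runs that hit the threshold

print("runs that reached a verdict:", len(finished), "of", n_sims)
print("mean years:  ", round(mean(finished), 1))
print("median years:", median(finished))
print("longest wait:", max(finished))
```

The coin's answer should come out well under 67 years. A 60% coin versus a fair one is a loud signal next to the noise in 12 flips; a factor premium next to market volatility is a much quieter one, which is what stretches the wait into decades.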
🧮 What This Means for Investing
If a strategy (or “factor”) stops working but still produces noisy returns — it might take decades before we’re statistically confident it no longer works.
The noise in the short term can mask the truth for a long time.
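Mapping the analogy back to the four quoted steps, here is one way the factor version could look. This is a sketch under loud assumptions, not Corey's actual code: the return history is synthetic (a made-up 0.3% monthly mean and 3% monthly volatility), the "Bayesian update" is taken to be a textbook normal-normal conjugate update with the annual variance treated as known, a "12-month chunk" is a contiguous block, and the annual return is the simple sum of monthly returns. The function name `years_until_insignificant` is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, median, stdev

Z_99 = 2.576  # two-sided 99% threshold

def years_until_insignificant(monthly, rng, max_years=500):
    """Years of 'flat' data needed before the posterior mean annual return
    is no longer significantly different from zero (ASSUMED update rule)."""
    n_years = len(monthly) / 12
    monthly_mean = mean(monthly)
    ann_var = stdev(monthly) ** 2 * 12           # variance of one year's return
    post_mean = monthly_mean * 12                # step 1: prior mean
    post_var = ann_var / n_years                 # step 1: prior standard error, squared
    flat = [r - monthly_mean for r in monthly]   # step 2: de-mean the series
    years = 0
    while abs(post_mean) / sqrt(post_var) > Z_99 and years < max_years:
        start = rng.randrange(len(flat) - 11)        # step 3: random 12-month block
        year_return = sum(flat[start:start + 12])
        new_var = 1 / (1 / post_var + 1 / ann_var)   # normal-normal update
        post_mean = new_var * (post_mean / post_var + year_return / ann_var)
        post_var = new_var
        years += 1                                   # step 4: repeat
    return years

rng = random.Random(7)
history = [rng.gauss(0.003, 0.03) for _ in range(90 * 12)]  # synthetic, NOT real factor data
waits = [years_until_insignificant(history, rng) for _ in range(1000)]
print("median years until no longer significant:", median(waits))
```

If the synthetic history happens not to be significant to begin with, every run returns 0. The point is the mechanics: a long, strong prior takes a lot of flat years to wash out.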
I enjoyed Corey’s technique because it gives you a sense of proportion between signal and noise. In so many domains where an assertion is made, that proportion is absent. I always think about how a CRO I worked with would reflexively try to put error bars on any metrics presented in a chart. It’s good epistemological hygiene. It automatically triggers awareness of base rates and outside views. It’s not a panacea for truth, but it rules out obvious bullshit: randomness sold to you as signal. That will save you time and money in life. It may not increase your top line, but it will save your bottom line.