risk rules that ignore p/l memory

I wrote this tweet a while back that bears repeating because I’m not sure if there’s any topic that seems to come up more when I’m asked about risk management.

Risk management continuum very bluntly stated:

1. Rules for cutting risk when you lose (P/L memory)

2. Rules for how big you can be constrained by aggressive portfolio shock assumptions (ie no P/L memory but positions that can lose X% AUM not allowed)

I’ll just say from option trading context #2 is preferable because the best opportunities likely occur when everyone else is constrained by #1

But that framework is not typical, harder to implement and will often make you feel like you are leaving $ on the table

But you don’t lose your business on an idio risk. There’s an irreducible amount of systematic risk already. Don’t make idio something that can take you out.

how to become the main character on voltwit

A reliable way to have Nassim Taleb come through your window to call you an iiiiidiot is to see a stock crash and say something like “That was a 10-standard deviation move!”

On #voltwit, the corner of fintwit with an option fetish, a surefire way to get QVR’s Benn Eifert to QT you is to quote VIX changes in percent terms.

The topic resurfaces every time VIX makes a large move and voltwit predictably reacts, so I figured Grok (X’s LLM chatbot) must know about this.

I asked Grok some questions and sure enough it seems like it’s been paying attention.

“Is it wise to quote VIX changes in percent terms?”

  • Volatility of Volatility: The VIX itself is inherently volatile, so small percentage changes might not always reflect significant shifts in market sentiment or conditions.
  • Misinterpretation: Large percentage changes from a low base might exaggerate perceived market fear or complacency. Conversely, small percentage changes at high levels might understate the market’s condition.
  • Contextual Information: Percentages without context (like the baseline level of the VIX) can be misleading. A 5% increase in VIX might mean different things at different times.

“Do vol traders dunk on tourists who quote VIX percent changes?”

In the trading community on platforms like X, there’s often a playful or critical banter between experienced volatility (vol) traders and those perceived as “tourists” or less experienced traders.

  • Misinterpretation Risks: Vol traders often emphasize that quoting VIX in percentage terms without additional context can lead to misinterpretations. For example, a 10% increase from a very low VIX level might not be as significant as the same percentage increase from an already high level.

     

Benn’s primary gripe with VIX percents is that the behavior of VIX is level-dependent. Its distribution is not congruent at high and low levels of vol.

Notice how the Y-axis is VIX vol points not percents.

In chatting with Benn about this article he pointed out a basic mechanic that makes vol level-dependent:

Volatility is inherently about squared returns, so you can have a very low base level of realized vol but all it takes is one big-ish sized return and because we’re squaring it (along with all the other little returns in the window) it’s going to have a massively outsized impact on window realized volatility. That makes vol very jumpy from low levels.
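Here is a minimal sketch of that squaring mechanic. The 20-day window, the typical daily move sizes, and the -3% shock are all numbers I made up for illustration, and I use the zero-mean convention for realized vol.

```python
import math

def realized_vol(returns, periods_per_year=252):
    """Annualized realized vol from daily returns (zero-mean convention)."""
    mean_sq = sum(r * r for r in returns) / len(returns)
    return math.sqrt(mean_sq * periods_per_year)

def shock_impact(typical_daily_move, shock, window=20):
    """Realized vol of a calm window vs. the same window with one shock day."""
    calm = [typical_daily_move] * window
    jolted = calm[:-1] + [shock]  # swap the last day for the shock
    return realized_vol(calm), realized_vol(jolted)

shock = -0.03  # hypothetical "big-ish" one-day return

low_before, low_after = shock_impact(0.005, shock)   # quiet name
high_before, high_after = shock_impact(0.02, shock)  # already-volatile name

for label, before, after in (("low base", low_before, low_after),
                             ("high base", high_before, high_after)):
    print(f"{label}: {before:.1%} -> {after:.1%}, "
          f"{(after - before) * 100:+.1f} vol points, {after / before - 1:+.0%}")
```

The same single return moves the low-base window far more than the high-base window, both in vol points and in percent.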

Another vol manager, Kris Sidial of Ambrus, explains it simply. Note my response below it.

There are multiple contexts in which it is quite useful to measure percent changes in volatility. There are tradeoffs, as you’d expect with any measure. But I’ve always been forceful about the need to slice things from different angles. It’s a healthy way to identify mixed signals, but it’s also affirming when sufficiently different angles agree.

A good example of this “multiple angles” idea is the 2 part series:

Let’s get into a few reasons to measure vol in percent changes.

Cross-sectional comparison

As a relative value options trader, I would typically have an “axe list”. These are vols in various names in various parts of the surface I thought were relatively cheap or expensive.

[The idea of an “axe list” is covered in the Moontower Mission Plan]

Armed with my opinions, I would then buy the options I thought were cheap on days when their strike vols were underperforming, and sell the expensive ones on days the strike vols were outperforming.

Because I’m looking at vols cross-sectionally it makes sense to look at the percent changes in the vol. A one-point move in SPY vol is proportionally much larger than a one-point move in TSLA vol.

[See Understanding The Vol Scanner for a full explanation.]

Notes and caveats

1) Measuring percent changes in vols works well “locally”.

For example, it was common in modeling spot-vol correlation in oil to assume that as oil futures went up 1%, that vol declined 1%.

[This dynamic corresponds to a “constant ATM straddle regime”. It is easily visible from the straddle approximation formula.]

But nobody believes that doubling the oil price will suddenly lead to a halving in vol. The model only works “locally.”
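A quick sketch of why the "up 1%, vol down 1%" assumption maps to a constant straddle, using the straddle approximation of .8 * S * vol * √t. The futures price, vol, and time to expiry are hypothetical.

```python
import math

def straddle_approx(S, vol, t):
    """ATM straddle approximation: 0.8 * S * vol * sqrt(t)."""
    return 0.8 * S * vol * math.sqrt(t)

# Hypothetical oil future: $70, 35% vol, 3 months to expiry
S, vol, t = 70.0, 0.35, 0.25

base = straddle_approx(S, vol, t)
bumped = straddle_approx(S * 1.01, vol * 0.99, t)  # future +1%, vol -1%

print(f"straddle before: {base:.4f}, after: {bumped:.4f}")
```

Locally the two effects offset almost exactly (1.01 x 0.99 = 0.9999). The formula would say the same for a doubling and a halving, which is exactly where you stop trusting the model.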

2) Percent vol changes can be further refined by normalizing for “vol of vol”

If SPY vol changes from 20% to 21%, a 5% change in vol level might still be more significant than TSLA vol changing from 60% to 63%, also a 5% change, because TSLA vol of vol might be higher. After all, it might be common for TSLA vol to move 5% per day.

The analogy to regular investing would be the difference between a dollar-neutral position and a beta-weighted position. If you are long $100 of TSLA and short $100 of SPY, your portfolio will act like it’s long even though in dollars it’s flat. You are long beta because TSLA is more volatile.

[I’m ignoring the correlation aspect of beta because it’s not central to the argument.]
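One way to sketch the normalization: divide the percent change in vol by the name's typical daily percent move in vol. The vol-of-vol inputs (3% per day for SPY, 5% for TSLA) are assumptions for illustration, not measurements.

```python
def vol_move_zscore(old_iv, new_iv, daily_vol_of_vol):
    """Percent change in IV expressed in units of the name's typical daily percent move."""
    pct_change = new_iv / old_iv - 1
    return pct_change / daily_vol_of_vol

# Both names are up 5% in vol terms
spy_score = vol_move_zscore(0.20, 0.21, daily_vol_of_vol=0.03)   # assumed SPY vol of vol
tsla_score = vol_move_zscore(0.60, 0.63, daily_vol_of_vol=0.05)  # assumed TSLA vol of vol

print(f"SPY: {spy_score:.2f} typical moves, TSLA: {tsla_score:.2f} typical moves")
```

Under these assumptions the SPY move is the more notable one even though the percent changes are identical.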

3) An extra note on “vol of vol”

If you measure vol of vol based on changes in ATM vol you are getting a confounding reading. Like if you measured your pulse with your thumb.

Why?

ATM vols are “floating” strike vols. If SPY drops 1% and ATM vol increases by 1 point, that might just be movement along the vol curve. The vol on the 99% strike might have simply been 1 point higher than the prior day’s 100% strike. On a fixed strike basis, the vol didn’t change. In this case, the appearance of a vol change merely reflected a change in the underlying.

For vol trading purposes, you usually care about fixed strike changes (ie curve shifts not movement along the curve) because that’s what drives the vega p/l of the attribution.
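To make the floating vs fixed strike distinction concrete, here is a toy sketch. The linear skew (1 vol point per 1% of strike) is a hypothetical curve chosen to match the example above, and the curve is held still from one day to the next.

```python
def skew_curve(strike, ref_spot=100.0, ref_vol=0.20, slope=-1.0):
    """Hypothetical sticky-strike curve: IV rises 1 point for each 1% the
    strike sits below ref_spot."""
    return ref_vol + slope * (strike / ref_spot - 1.0)

# The curve does not move between the two days
curve_yesterday = skew_curve
curve_today = skew_curve

spot_yesterday, spot_today = 100.0, 99.0

# Floating-strike ("ATM") change: read the curve at wherever spot is
atm_change = curve_today(spot_today) - curve_yesterday(spot_yesterday)

# Fixed-strike changes: read both curves at the same strikes
fixed_strike_changes = {k: curve_today(k) - curve_yesterday(k)
                        for k in (95.0, 99.0, 100.0, 105.0)}

print(f"ATM vol change: {atm_change * 100:+.1f} points")
print("fixed strike changes:", fixed_strike_changes)
```

ATM vol "rose" a point while every fixed strike is unchanged, so a vol-of-vol estimate built on ATM changes would partly be measuring spot moves.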

Risk and P/L measurement

The second reason to care about percent changes in vol only applies to vol traders, defined here as traders who run a delta-neutral book and make their edge from isolating cheap and expensive vol.

That said, the discussion should be highly educational for anyone trying to learn options or as a useful self-test for traders who might be interviewing and expected to talk about managing a book.

Let’s back up to consider vol risk. Specifically, vega, the sensitivity of your option p/l due to changes in implied vol.

We start with a scenario. Assume the ATM and at-the-forward (ATF) strikes are the same.

You buy 100 December ATM straddles in stock A and short 100 December ATM straddles in Stock B.

Stock A and Stock B trade for the same price.

Stock B has 2x the implied vol of Stock A.

Are you vega-neutral?

Are you theta-neutral?

[You can look at the greeks from an option calculator to help but if you are an experienced option trader you shouldn’t need to.]

Ok, let’s get to the answers.

You are vega-neutral. Recall the straddle approximation:

straddle = .8 * S * σ * √t

Since vega is just change in option (in this case straddle) price per 1 point change in vol, then:

vega = .8 * S * √t

Look at the formula — ATF vega has no dependence on vol level!

Since S and t are equal then your long and short vega perfectly offset.

[Note: OTM option vega DOES depend on vol level. They have volga or “vol gamma” which is what fuels vol convexity.]
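If you want to check the approximation against a model, here is a sketch using the Black-Scholes vega of an at-the-forward straddle. I assume zero rates and dividends so ATM and ATF coincide. The stock price and time to expiry are hypothetical.

```python
import math

def atf_straddle_vega(S, vol, t):
    """Black-Scholes vega of an at-the-forward straddle, per 1 vol point.
    Zero rates and dividends assumed, so d1 = vol * sqrt(t) / 2."""
    d1 = 0.5 * vol * math.sqrt(t)
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2 * math.pi)
    return 2 * S * pdf * math.sqrt(t) / 100  # call vega + put vega

S, t = 100.0, 0.5  # hypothetical: both stocks at $100, 6 months out

vega_a = atf_straddle_vega(S, 0.25, t)   # Stock A at 25 vol
vega_b = atf_straddle_vega(S, 0.50, t)   # Stock B at 50 vol
approx = 0.8 * S * math.sqrt(t) / 100    # .8 * S * sqrt(t), per vol point

print(f"vega A: {vega_a:.4f}, vega B: {vega_b:.4f}, approximation: {approx:.4f}")
```

The model vegas are not literally identical because d1 depends weakly on vol, but they agree with each other and with the approximation to within a couple of percent.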

Ok, you’re vega-neutral.

Are you theta-neutral?

Again we don’t need an option model. If Stock B is 2x the vol as Stock A its straddle is 2x the price. If both stocks don’t move until expiry, all options go to zero. Necessarily, Stock B experienced 2x the decay.

If you are short the straddle in Stock B, your portfolio collects theta. It is NOT theta-neutral.
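The decay argument in numbers, as a sketch with the straddle approximation. The $100 stock price and 6-month expiry are hypothetical.

```python
import math

def straddle_approx(S, vol, t):
    """ATM straddle approximation: 0.8 * S * vol * sqrt(t)."""
    return 0.8 * S * vol * math.sqrt(t)

S, t = 100.0, 0.5
price_a = straddle_approx(S, 0.25, t)  # long 100 of these
price_b = straddle_approx(S, 0.50, t)  # short 100 of these

contracts, multiplier = 100, 100
# If neither stock moves, every option expires worthless and the
# position keeps the premium difference as net decay collected.
net_decay_collected = (price_b - price_a) * contracts * multiplier

print(f"A: {price_a:.2f}, B: {price_b:.2f}, net decay: ${net_decay_collected:,.0f}")
```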

Vol traders will often think in terms of vega. “I bought $50k vega in ABC today”.

At the same time, they often try to run a roughly theta-neutral book.

[See Weighting An Option Pair Trade for a discussion about vega and theta weighting and how the weighting should be matched to the expression of your bet — proportional vs spread].

In the riddle above, being vega-neutral did not mean theta-neutral. But we can actually transform vega so that a vega-neutral position is correlated to a theta-neutral position!

Another way to measure vega: “vega per 1%”

Let’s say the vega per straddle was $.50

If you buy 100 straddles your vega is 100 x $.50 x (100 multiplier) =

$5,000

If vol increases by 1 point you make $5,000 from the change in implied vol.

Assume that the implied vol of the straddle is 25%

Multiply the vega by the vol:

$5,000 x 25% = $1,250

Watch what happens if we raise the vol by 1% or .25 points instead of 1 point:

Vega p/l: $5,000 x .25 points = $1,250

Remember, when we raised vol by a full point from 25% to 26% (a 4% change in the vol) you made $5,000, or $1,250 x 4.

By multiplying the vega by the vol itself we have created a new measure:

Vega per 1% measures the vega p/l per 1% change in the vol.
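The arithmetic above as a small sketch. The helper name `vega_per_1pct` is mine, not a standard library function.

```python
def vega_per_1pct(dollar_vega_per_point, implied_vol_points):
    """Vega P/L for a 1% relative change in vol.
    implied_vol_points is the vol level in points, e.g. 25 for 25% vol."""
    return dollar_vega_per_point * implied_vol_points / 100

# 100 straddles x $0.50 vega per straddle x 100 multiplier
position_vega = 100 * 0.50 * 100
per_1pct = vega_per_1pct(position_vega, 25)

pnl_full_point = position_vega * 1.0   # vol 25 -> 26, a 4% change in vol
pnl_via_per_1pct = per_1pct * 4        # same move expressed as 4 x 1%

print(position_vega, per_1pct, pnl_full_point, pnl_via_per_1pct)
```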

Let’s return to the original riddle.

We now assign implied vols to the straddle. You are long 100 straddles of Stock A at 25% vol and short 100 straddles of Stock B at 50% vol.

While this is vega neutral, it is NOT vega per 1%-neutral

Stock A “vega per 1%”: +$5,000 x 25% is +$1,250

Stock B “vega per 1%”: -$5,000 x 50% is -$2,500

You are net short $1,250 vega per 1%

This perspective is useful for a few reasons:

1) Linear estimate of p/l with respect to percent changes in vol

If vol is up 3% your p/l is simply 3 x vega per 1%. If you are using a view like “vol scanner” to see all the percent changes in vol cross-sectionally the changes will map easily to your vega per 1% risk

2) Vega per 1% proxies a theta-weighted position which is how vol traders often think about their risk and the idea that they are betting on relative proportional vol changes.

If you are short vega per 1% you are collecting theta

Multiple angles

Looking at vega in both the conventional way (p/l sensitivity per 1 point change in vol) and vega per 1% reveals features of a position.

If you are long vega per 1% but short vega, what does that mean?

Any combination of the following:

  • You are short time spreads,
  • You are long high vol options and short lower vol options. Owning skew or vol convexity are both examples of this.
  • Cross-sectionally you are long high vol names and short low vol.

[Note in all these cases it’s possible to be paying theta and short gamma locally. But if you shocked the position in a scenario analysis you likely make a ton of money. The relationships between Greeks are all clues as to what is lurking in a complex portfolio.]

In the riddle scenario, to be flat vega per 1%, you must ratio the trade and be short 1/2 as many high vol straddles. Note you will be net long vega. You will win if all the vols parallel shift higher (ie they all go up 10 points), but if they maintain their .5 relationship the p/l will be flat, consistent with the meaning of flat vega per 1%.
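A sketch of the ratioed trade under the two scenarios. It uses a linear vega estimate, which is an assumption: it ignores how vega itself changes as vol and the underlying move.

```python
def vega_pnl(book, new_vols):
    """Linear vega P/L. book is a list of (signed $ vega per point, current vol in points)."""
    return sum(vega * (new - vol) for (vega, vol), new in zip(book, new_vols))

# Long 100 Stock A straddles at 25 vol, short 50 Stock B straddles at 50 vol,
# each straddle carrying $50 of vega per point ($0.50 x 100 multiplier)
ratio_book = [(+5000.0, 25.0), (-2500.0, 50.0)]

net_vega = sum(vega for vega, _ in ratio_book)
net_vega_per_1pct = sum(vega * vol / 100 for vega, vol in ratio_book)

parallel_shift = vega_pnl(ratio_book, [35.0, 60.0])      # both vols +10 points
proportional_shift = vega_pnl(ratio_book, [27.5, 55.0])  # both vols +10%

print(f"net vega: {net_vega:+,.0f}, net vega per 1%: {net_vega_per_1pct:+,.0f}")
print(f"parallel +10 pts: {parallel_shift:+,.0f}, proportional +10%: {proportional_shift:+,.0f}")
```

The book is long vega, flat vega per 1%, wins on the parallel shift, and is flat when the vols keep their ratio.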

Understanding your greeks means understanding what you’re rooting for. You’d be surprised to know that sometimes option traders don’t even know what they’re rooting for.

When you get down to it, any large percentage change in vol is going to require multiple angles to understand. Your p/l isn’t going to line up because vega itself will change as the underlying changes and vol changes interact. Measuring percent changes on small numbers is usually a bad idea and requires transformations to divine anything worth mentioning.

Does it make sense to talk about a 75% change in VIX from a base vol of 8%? Of course not. One of the reasons you know that is because it can’t fall 75% from 8%. That’s a clue right there that “standard deviation”, a concept we learned about from symmetrical pictures in HS math texts, is not in charge.


In sum, percent changes in vol can be useful measures but you have to know how to wield them and where they break down.

Unless you want to be the main character on voltwit for a day (and have to fix your broken window). But if you’re ok with that at least go the extra mile and do some technical analysis on VIX.

Final caveat

When you use vega per 1% you implicitly assume that both assets have the same vol of vol. In other words, if a 15% vol name’s IV bounces around 1 vol point per day, then a 30 vol name bounces around 2 vol points per day. This may or may not be true but it’s a better guess than raw vega weighting, which would show being long 100 straddles in A (25% IV) and short 100 straddles in B (50% IV) as flat.

arbitrage is a hall of mirrors

Given where markets are these days, there are a lot of investors, often former or current employees and execs of Mag7 names, who are sitting on large, concentrated positions at a low cost basis.

In English, they’re as rich as celebrities but standing right next to you giving out Pocky on snack duty for 3rd grade soccer.

They are reluctant to sell because the tax hit is immediate. One possible solution to “have their cake and eat it too” is to stay long but collar the stock. This is typically presented as buying a put option financed by a covered call.

Here’s an example based on closing TSLA option prices on 1/28/2025 for the Jan 15th 2027 expiry (ie 717 DTE).

The stock closed around $396.65.

We can just round numbers, call it $400.

You can buy the 25% out-of-the-money put, the 300 strike, for about $56 and sell the 25% out-of-the-money call, the 500 strike, for about $108.

To be perfectly clear — you can buy the put for protection, sell the upside call and COLLECT about $52 or about 13% premium.

Think about the risk/reward for a moment.

If the stock drops $100 in 2 years you are stopped out at $300 but you collected $52 so your net loss is only $48 or about 12%.

If the stock climbs to $500, you will get assigned on the short call so you’ll make $100 on the long stock position but still get to keep the $52 premium for a total gain of $152 or about 38%.

In other words, you can stay long the stock but you get paid 3x what you lose on a $100 up move vs $100 down move.
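The collar's expiry P/L per share can be sketched in a few lines using the rounded numbers above. This deliberately ignores interest on the premium, dividends, taxes, and early assignment, which is part of why the picture looks so good.

```python
def collar_pnl(final_price, spot=400.0, put_strike=300.0,
               call_strike=500.0, net_credit=52.0):
    """Expiry P/L per share of long stock + long put + short call."""
    # The put floors the exit price, the short call caps it
    exit_price = min(max(final_price, put_strike), call_strike)
    return exit_price - spot + net_credit

pnl_down = collar_pnl(250.0)  # stock falls through the put strike
pnl_up = collar_pnl(600.0)    # stock rallies through the call strike

print(f"worst case: {pnl_down:+.0f}, best case: {pnl_up:+.0f}, "
      f"ratio: {pnl_up / -pnl_down:.1f}x")
```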

It sounds like free money.

The prices come from option theory’s arbitrage-free (ie risk neutral) pricing.

This is a checklist of forces that seem to create the illusion.

✔️The forward price is actually $430

We know that because if you look at the option chain, despite the $430 call being ~ $33 out-of-the-money, it’s the same price as the 430 put which is in-the-money.

The reason is that if they were not the same price you could put on a reversal or conversion trade to arbitrage the funding rate on the stock.

Think of it this way, if the 430 call cost $20 more than the $430 put you could sell the call, buy the put and collect $20. At expiry, since you are short the $430 synthetic stock you are guaranteed to sell TSLA at $430 (either you exercise the 430 put if it’s ITM or get assigned on the 430 call if it’s ITM). So you can buy the stock today for say $397 which would be a (mostly) riskless position since you are long the stock and short the synthetic. The cashflows would be:

  • Collect $20 on the synthetic (remember you sold the call for $20 more than the put)
  • Ensure a profit of $33 by expiry (you bought TSLA for $397 and will sell it at expiry at $430)
  • Forgo ~$32 interest on $397 for 2 years (assume 4% rfr)

Net arbitrage profit: +$21 in excess of funding costs!

If the 430 call traded $20 UNDER the put you would do the arbitrage in reverse. You’d buy the call, sell the put and be guaranteed to buy the stock for $430 at expiry. To hedge you would short it today at $397 and collect ~$32 of interest on the cash in your account.

So at expiry you are buying the stock for $430 that you shorted at $397. Cash flows:

  • -$33 on buying TSLA synthetically and shorting it today
  • +$32 in interest on cash proceeds from the short
  • +$20 in option premium (remember, you sold the put $20 higher than the call you bought)

Net arbitrage profit: +$19 in excess of funding costs!

If the RFR is 4% (which it approximately is) then the 430 call and put must trade around the same price for there to be no arbitrage.

Therefore $430 is the 2 year at-the-forward strike.
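The conversion and reversal cash flows can be sketched as below. As in the text, I round the holding period to 2 years, use annual compounding at 4%, charge interest only on the stock price, and ignore dividends and borrow costs. The function names are mine.

```python
def forward_price(spot, rate, years):
    """Forward implied by funding the stock at the risk-free rate."""
    return spot * (1 + rate) ** years

def conversion_profit(spot, strike, call_minus_put, rate, years):
    """Buy stock, sell call, buy put (all at `strike`).
    Profit at expiry net of interest forgone on the stock purchase."""
    interest = spot * ((1 + rate) ** years - 1)
    return call_minus_put + (strike - spot) - interest

spot, strike, rate, years = 397.0, 430.0, 0.04, 2

fwd = forward_price(spot, rate, years)
rich_call = conversion_profit(spot, strike, +20.0, rate, years)    # call $20 over put
cheap_call = -conversion_profit(spot, strike, -20.0, rate, years)  # reversal: flip every leg
fair = conversion_profit(spot, strike, 0.0, rate, years)           # call = put

print(f"forward: {fwd:.2f}")
print(f"conversion: {rich_call:+.1f}, reversal: {cheap_call:+.1f}, at parity: {fair:+.1f}")
```

When the call and put trade at the same price, the leftover is under a dollar, which is just rounding between the $430 strike and the computed forward.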

✔️Despite both option strikes being $100 or 25% away from the spot price, the call is much “closer”

Part of this has to do with the forward being $430. Referencing the 430 strike the 500 strike is only 16% OTM while the 300 strike is now 30% OTM.

The option that is “closer” has a higher delta and is worth more due to moneyness.

But the other reason comes from the fact that Black-Scholes assumes a lognormal distribution of returns (which is a positive skew distribution).

Why? If a stock is bounded by zero but has infinite upside the OTM call will be worth more than the equidistant OTM put. The distribution is balanced around a median stock expectation that is dragged lower by volatility (if you make 25% then lose 25% you are net down about 6%).

In TSLA’s case the 300 put has a -.20 delta while the 500 call has a .60 delta!

(TSLA also has an inverted skew: the call IV is a touch higher than the put IV but that has a minor effect on the cost of the collar in the context of this discussion.)
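To see how much of the asymmetry comes from the forward and the lognormal assumption alone, here is a Black-Scholes sketch with one flat vol for both strikes. The 60% vol and 4% continuously compounded rate are my assumptions, not the actual market inputs, and dividends are ignored.

```python
import math

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2))

def bs_price_delta(S, K, T, r, vol, kind):
    """Black-Scholes price and delta for a European call or put, no dividends."""
    sd = vol * math.sqrt(T)
    d1 = (math.log(S / K) + (r + 0.5 * vol * vol) * T) / sd
    d2 = d1 - sd
    disc = math.exp(-r * T)
    if kind == "call":
        return S * norm_cdf(d1) - K * disc * norm_cdf(d2), norm_cdf(d1)
    return K * disc * norm_cdf(-d2) - S * norm_cdf(-d1), norm_cdf(d1) - 1.0

S, T, r = 400.0, 717 / 365, 0.04
flat_vol = 0.60  # assumed, same vol for both strikes

call_price, call_delta = bs_price_delta(S, 500.0, T, r, flat_vol, "call")
put_price, put_delta = bs_price_delta(S, 300.0, T, r, flat_vol, "put")

print(f"500 call: {call_price:.1f} (delta {call_delta:+.2f})")
print(f"300 put:  {put_price:.1f} (delta {put_delta:+.2f})")
print(f"collar credit at flat vol: {call_price - put_price:.1f}")
```

Even with identical vols, the call is worth far more than the put and carries roughly three times the absolute delta.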

Here’s a summary table including the collar price if the IV was the same for the 300 put and 500 call:

💡What this post “encompassed”

If you understand this post you have implicitly reviewed:


I called this post “arbitrage is a hall of mirrors” because no-arbitrage pricing theory created this situation where the risk/reward of the collar looks incredibly attractive.

Part of that is theory explicitly incorporates the opportunity cost (the risk-free rate) while our intuition tends to gloss over it. Opportunity cost is an easy topic to understand when someone explains it to you, but it’s trickier to apply in live decision-making scenarios. Look no further than rich people who clip coupons or drive 10 miles out of their way for Costco gas.

The output of arbitrage-pricing can be dissonant to our eyeball tests. It was one of my favorite topics to write about because it does feel so warped.

🟰Understanding Risk-Neutral Probability

This is my guide to the subject. It’s full of nested problems, Socratic method, and even financial theory as philosophy. I’ll re-print one of the nested sections:

👽Real World vs Risk-Neutral Worlds

No-arbitrage probabilities allow us to price options by replication

The insight embedded in Black-Scholes is that, under a certain set of assumptions, the fair price of an option must be the cost of replicating its payoff under many scenarios. Any other price offers the opportunity for a risk-free profit. Have you ever wondered why the Black-Scholes “drift” term for a stock is the risk-free rate and not an equity risk premium (like you’d expect from another type of pricing model, such as CAPM) or the stock’s WACC? A position in a derivative and an opposing position in its replication is a riskless portfolio. Therefore that portfolio only needs to be discounted by the risk-free rate. Option pricing derived from a no-arbitrage replication strategy means we should use the risk-free rate to model a stock’s return.

‼️What seasoned option traders get wrong: Outside of the option pricing context, the risk-free rate is the wrong assumption for drift!

From Philip Maymin’s Financial Hacking:

One of the most common mistakes that even highly experienced practitioners make is to act as if the assumptions of Black-Scholes (lognormal, continuous distribution of returns, no transactions costs, etc.) mean that we can always arbitrarily assume the underlying grows at the riskfree rate r instead of a subjective guess as to its real drift μ. But this is not quite accurate. The insight from the Black-Scholes PDE is that the price of a hedged derivative does not depend on the drift of the underlying. The price of an unhedged derivative, for example, a naked long call, most certainly does depend on the drift of the underlying. Let’s say you are naked long an at-the-money one-year call on Apple, and you will never hedge. And suppose Apple has very low volatility. Then the only way you will profit is if Apple’s drift is positive…if it drifts down, your option expires worthless. But if you hedge the option with Apple shares, then you no longer care what the drift is. You only make money on a long option if volatility is higher than the initial price of the option predicted. The drift term of the underlying only disappears when your net delta is zero. In other words, an unhedged option cannot be priced with no-arbitrage methods.

💡Takeaway: Arbitrage Pricing Theory

Sometimes called the Law of One Price, the idea contends that the fair price of a derivative must be equal to the cost of replicating its cash flows. If the derivative and cost to replicate are different then there is free money by shorting one and buying the other. This approach is how arbitrageurs and market-makers price a wide range of financial derivatives in every asset class including:

  • Futures/Forwards
  • Options
  • ETFs and Indexes

These derivatives are the legos from which more exotic derivatives are constructed.

A Source of Opportunity

Let’s recap the logic:

  1. Arbitrage ensures that the price of a derivative trades in line with the cost to replicate it.
  2. A master portfolio is riskless when it comprises:
    1. a position in a derivative
    2. an offsetting position in its replicating portfolio
  3. A riskless portfolio will be discounted to present value by a risk-free rate otherwise there is free money to be made.
  4. The prevailing prices of derivatives imply probabilities.
  5. Those probabilities are risk-neutral arbitrage-free probabilities.

But those probabilities don’t need to reflect real-world probabilities. They are simply an artifact of a riskless arbitrage if it exists.

This can lead to a difference in opinion where the arbitrageur and the speculator are happy to trade with each other.

  • The arbitrageur likely has a short time horizon, bounded by the nature of the riskless arbitrage.
  • The speculator, while not engaging in an arbitrage, believes they are being overpaid to warehouse risk.

Examples

1) Warren Buffett selling puts

The Oracle of Omaha engages in oracular activity, not arbitrage. Warren is well-known for his insurance businesses which earn a return by underwriting various actuarial risks. Warren is less famous for his derivatives trades. [The fact that he rails against derivatives as WMDs might be the most ironic hypocrisy in all of high finance but as I always say, we are multitudes.] Like his insurance business, the put-selling strategy hinges on an assessment of actuarial probabilities. In other words, he believes that real-world probabilities suggest a vastly different value for the puts than risk-neutral probabilities. The major source of the discrepancy comes from the drift term in Black-Scholes. Warren is pricing his trade with an equity risk premium in excess of the risk-free rate that a replicator who delta hedges would use.

The option traders who trade against him can be right by hedging the option, effectively replicating an offsetting option position at a better price than the one they trade with Berkshire. Warren is happy because he thinks the price of the option is “absurd”. In Warren Buffett is Wrong About Options, we see this excerpt from a Berkshire letter during the GFC:

[Image: excerpt from the Berkshire letter]

Jon Seed writes:

Warren’s assumptions aren’t crazy. In fact, they seem to be pretty accurate. As Robert McDonald derives in the 22nd Chapter of his 3rd edition of Derivatives Markets, a 100 year put for $1bn assuming 20% volatility, a long-term risk free rate of 4.4% and a dividend rate of 1.5% implies a Black-Scholes put price of $2,412,997, close to Buffett’s $2.5 million. But Warren isn’t discussing risk-neutral probabilities, those assumed in Black-Scholes and imputed by volatility assumptions. He’s evaluating the model’s probabilities as if they were real, actual probabilities. If we (really Robert McDonald) evaluate Black-Scholes using real probabilities by also incorporating our best guess of real equity discount rates, we see that the model is consistent with Warren’s common sense approach. Assuming stock prices are lognormally distributed and that the equity index risk premium is 4%, we would substitute 8.4% for the 4.4% risk-free rate, obtaining a probability of less than 1% that the market ends below where it started in 100 years. Buffett also assumes that the expected loss on the index, conditional on the index under-performing bonds, will be 50%. This again is a statement about the real world, not a risk-neutral world, distribution. With an 8.4% expected return on the market, the implied expectation of $1 billion of the index conditional on the market ending below where it started is $596,452,294, or 59.6% of the current index value. Again, this is close to Buffett’s assumption of 50%.
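The numbers in that excerpt can be roughly reproduced with a short sketch. I assume continuous compounding, an at-the-money strike, and lognormal prices, and the real-world drift is the 4.4% rate plus the 4% premium quoted above. Small differences from the quoted dollar figures are rounding.

```python
import math

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2))

S = K = 1e9                       # $1bn notional, struck at the money
vol, r, q, T = 0.20, 0.044, 0.015, 100
risk_premium = 0.04
sd = vol * math.sqrt(T)

def d2(drift):
    return (math.log(S / K) + (drift - q - 0.5 * vol * vol) * T) / sd

# Risk-neutral world: drift is the risk-free rate
d2_rn = d2(r)
d1_rn = d2_rn + sd
put_price = (K * math.exp(-r * T) * norm_cdf(-d2_rn)
             - S * math.exp(-q * T) * norm_cdf(-d1_rn))

# "Real" world: drift is the risk-free rate plus an equity risk premium
mu = r + risk_premium
d2_rw = d2(mu)
d1_rw = d2_rw + sd
prob_below_start = norm_cdf(-d2_rw)
mean_if_below = S * math.exp((mu - q) * T) * norm_cdf(-d1_rw) / prob_below_start

print(f"Black-Scholes put: ${put_price:,.0f}")
print(f"real-world P(index below start): {prob_below_start:.2%}")
print(f"expected index value if below: ${mean_if_below:,.0f}")
```

The only input that changes between the two halves is the drift, which is the whole disagreement between the replicator and Warren.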

2) FX Carry

FX futures are derivatives. Their pricing is a straightforward output of the covered interest parity formula. I think I learned this concept on the first day of trader class back in the day.

The key to the formula is recognizing that the value of the future is just the arbitrage-free price arising from the difference in deposit rates between the 2 currencies in the pair.

If a foreign currency offers a higher interest rate than a domestic currency, you expect its future to trade at a discount. We won’t bother with the math since the intuition is sufficient:

If you borrowed the domestic currency today to buy the foreign currency so you could earn the yield spread for say 1 year, you’d have a risky trade — you’d be exposed to the foreign currency, and its associated interest income, devaluing when you try to convert it back to the domestic currency.

Therefore, to make the trade riskless, you need to lock in the forward rate today by selling the future. You know what that means: you expect that forward rate to trade at the no-arbitrage price.

The higher-yielding currency must therefore have a lower forward FX rate.

The carry trade is basically a speculator saying:

“I know the future FX rate should trade at a discount to the spot rate but I’ve noticed that the future rate rolls up to converge to the spot rate, rather than that lower rate being a predictor of the spot FX rate in one year.

So I’m going to buy that FX future and hold it for a profit.”

The carry trade is not an arbitrage or riskless profit. It’s a risky profit. But the opportunity arises because the futures contract would present an arbitrage at any price other than the risk-neutral price.
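For those who do want the math, a minimal sketch of covered interest parity and the carry trade's payoff. The spot rate and the two deposit rates are hypothetical, and FX is quoted as domestic units per 1 unit of foreign currency.

```python
def fx_forward(spot, domestic_rate, foreign_rate, years=1.0):
    """No-arbitrage forward under covered interest parity."""
    return spot * ((1 + domestic_rate) / (1 + foreign_rate)) ** years

spot = 1.00
r_dom, r_for = 0.02, 0.06   # foreign currency yields more

fwd = fx_forward(spot, r_dom, r_for)

# Hedged round trip: borrow 1 domestic, convert, deposit abroad, sell forward
borrowed = 1.0
hedged_proceeds = borrowed / spot * (1 + r_for) * fwd
owed = borrowed * (1 + r_dom)

# Carry trade: buy the future at fwd and hope spot is unchanged at expiry
carry_pnl_if_spot_unchanged = spot - fwd

print(f"forward: {fwd:.4f} (discount to spot of {1 - fwd / spot:.2%})")
print(f"hedged proceeds {hedged_proceeds:.4f} vs owed {owed:.4f}")
print(f"carry P/L if spot unchanged: {carry_pnl_if_spot_unchanged:+.4f}")
```

The hedged trade earns nothing beyond the domestic rate. The carry P/L only shows up if you skip the hedge and spot cooperates, which is the risk.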

another XIV brewing in crypto?

If you don’t know what MicroStrategy (MSTR) is then congrats, you have won life. Close this tab and go back to sliding down rainbows and swimming with otters.

For those who remain you likely know that Saylor has been financing his BTC purchases from the sale of convertible bonds.

I have nothing to add to that conversation but I have a trade idea. It’s gonna take some background to build up to it.

First, there are 2 required reads. They aren’t long and they’re excellent. The best combination. I will highlight some key points from them.

Convert of Doom: Microstrategy and the dark arts of ‘volatility arbitrage’ (6 min read)
by Alexander Campbell

This post explains how Saylor is effectively arbitraging MSTR’s volatility by issuing converts that pay zero interest. This works because a convert is just a bond with an embedded call option. By delta hedging the embedded option, dealers or investors can earn a return if the realized vol exceeds the implied vol. The expected return presumably compares favorably, or at least similarly, to what they would earn if MSTR just issued interest-bearing debt, but Saylor is effectively transmuting volatility into interest payments.

[In general, when a convert is first issued it’s common for both the stock and vol to decline as dealers hedge both the delta and often the implied vol by selling long-dated options to offset some of the vega they’ve bought at a discount.]

Campbell is both educational and insightful showing how:

1) the Merton model can be used to understand why MSTR is so much more volatile than BTC: MSTR’s premium to NAV is positively correlated to BTC!

(In Battle Scars As A Call Option, I explained how one of my most painful trades occurred when I was long UNG vol when it went premium to NAV. In that case, the sizeable premium was inversely correlated with the price of gas. The exact opposite scenario of MSTR’s juiced vol today!)

2) this is a regulatory arbitrage.

Quoting Campbell:

Result: Retail buys MSTR shares at 150% premium while sophisticated investors arbitrage vol differentials and MSTR books the diff between all these trades as profitable transactions.

Here’s the irony: We require hedge funds to register with the SEC, spend $50-500k annually on compliance, and limit themselves to accredited investors with millions in the bank. Yet retail investors can freely buy MSTR shares through Robinhood.

And therein lies all the difference. There’s nothing wrong with what MSTR is doing, but it’s a good example of the law of unintended consequences.

Regulators block retail from ‘risky’ hedge funds while inadvertently pushing them into something potentially more dangerous.

By restricting crypto access for years, regulators left retail investors few options. Bitcoin futures required $300k contracts with 50-100% margin. ETFs were obscure or nonexistent. So people bought MSTR instead – a far more complex and potentially risky vehicle.

In trying to protect retail investors, the SEC has inadvertently funneled them into a potentially much riskier product.

Which brings us to the next required reading:

Moonshot or Shooting Star? A Volatile Mix of MicroStrategy, 2x Leveraged ETFs and Bitcoin (7 min read)
by Elm Wealth

Oh how I love the existence of levered ETFs on concentrated ideas. This post echoes a very real possibility of XIV’s “volmageddon”.

Something we’ve discussed ad nauseam in this letter is volatility drag and how geometric returns diverge much lower from arithmetic returns as we increase volatility. The divergence is proportional to variance or volatility squared.

The article links to a neat calculator which offers a hands-on lesson in volatility drag.
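If you don't have the calculator handy, the drag can be sketched with the common approximation that the geometric return is about the arithmetic return minus half the variance. The 10% arithmetic return and 90% vol are hypothetical inputs, and the 2x case assumes leverage simply doubles both.

```python
def compound(returns):
    """Total return from compounding a sequence of simple returns."""
    total = 1.0
    for r in returns:
        total *= 1 + r
    return total - 1

def approx_geometric_return(arithmetic_return, vol):
    """Approximation: geometric ~ arithmetic - vol^2 / 2."""
    return arithmetic_return - 0.5 * vol ** 2

up_then_down = compound([0.25, -0.25])  # average return is zero, result is not

arith, vol = 0.10, 0.90
drag_1x = arith - approx_geometric_return(arith, vol)
drag_2x = 2 * arith - approx_geometric_return(2 * arith, 2 * vol)

print(f"+25% then -25%: {up_then_down:+.2%}")
print(f"drag at 1x: {drag_1x:.1%}, drag at 2x: {drag_2x:.1%}")
```

Doubling the leverage doubles the vol but quadruples the drag, which is the exponent at work.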


💡Learn more💡

And linking these to options which is where we are heading:


Exponents are good, wholesome fun. And this post was certainly that, inspiring the trade idea we’re building towards.

The long quote below (emphasis mine) cuts to the heart of the matter.

Now let’s use some data to look at the probability of going bust just from a single really bad day. The price of a 2x leveraged ETF should go to zero if the price of the stock underlying the ETF goes down by 50% or more in a single day. The probability of such an event is a function of the variability of the MSTR stock price. If we assume the volatility of MSTR will be about 90% (or 5.6% per day), then we could think of a 50% decline in the stock price in one day as being a roughly 9x daily volatility move. A natural question is how often do stocks with very elevated variability, like MSTR, experience days when they decline by 9x their daily variability in returns?

We looked at about 1500 US stocks over the past 50 years, chosen so that at some point they were within the top 1000 stocks by market-cap. We found that the annual probability of such stocks experiencing a one-day price decline of 9x daily volatility was about 6%.

[Kris: The fatness of the tails should swipe you like a dragon. In Mediocristan, 9 standard dev moves don’t happen.]

This isn’t quite the final answer though, as we need the probability of a stock dropping by that much some time during the day, rather than just close-to-close. The usual estimate for the probability of touching a level over some time interval is to simply double the probability of being below that level at the end.

[The explanation of this is the same logic we’ve discussed whereby we estimate the probability of a one-touch by doubling the delta. Here’s Elm Wealth explaining:

To see why this is true in a simple random walk without drift, note that for every path that finishes below the level at the end of the period, there is another path where it hit the level and then followed a path that was a mirror of the path that finished below the level. So, for every path that finished below the relevant level (here a 50% drop), there’s another path that touched the level but then reflected and wound up above the level at the end.]

So, assuming MSTR volatility of 90% per annum, the probability of a down 50% intra-day move occurring at least once over the next year is about 12%.

If we use the MSTR volatility implied by the options market of 160%, then down 50% is only 5x daily volatility. The same data as above yields a close-to-close annual probability of about 30%, which we estimate as about a 60% probability of an intra-day drop that would send the ETF to 0.

There are a number of alternative perspectives one could take in trying to estimate this probability: for instance, trying to estimate the probability of a large one-day drop in Bitcoin and how that might impact the MSTR premium to BTC. For example, a 25% one day drop in BTC and a 33% collapse of the MSTR premium would imply a 50% drop in the MSTR share price.

[Kris: This hints at the MSTR premium vs BTC correlation Campbell wrote about]

A more complex analysis might try to estimate whether it is possible for these leveraged ETFs to become large enough that their daily rebalancing trades could themselves drive the price down 50% in one day. For example, imagine that MSTR rapidly triples in price due to some combination of BTC rally and an increase in MSTR’s premium to the BTC it owns, and the assets in the MSTR leveraged ETFs go from $5 billion to $30 billion. The market capitalization of MSTR could be about $270 billion and the leveraged ETFs would be owning $60 billion, or 22%, of MSTR stock outstanding.

Now imagine for some reason, MSTR stock drops 15% during the day – which, given MSTR volatility, would not be unusual. The leveraged ETFs would need to sell $9 billion of MSTR stock at the closing price. Recently, MSTR daily average trading volume at the close of the day has been about $2 billion, so this would be quite an impactful amount of MSTR to sell at the end of the day. For every 1% the price declines further than the 15%, the ETFs will need to sell another $500 million of MSTR, and if that pushes the price down by another 1%…well, you can see this doesn’t have a happy ending for owners of the leveraged ETF or MSTR.

[Kris: see The Gamma of Levered ETFs]

Bottom line, we think there’s a pretty decent probability – somewhere in the range of 15% to 50% – that these 2x leveraged MSTR ETFs are effectively wiped out in any given year if they are not voluntarily deleveraged or otherwise de-risked sooner.
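The rebalancing arithmetic in the passage above can be sketched in a few lines. This is a simplified model that assumes a daily-reset fund with no fees, financing, or market impact, and the function names are hypothetical. The $30 billion and 15% inputs come from the quoted scenario.

```python
def fund_value_after_move(aum: float, leverage: float, underlying_return: float) -> float:
    """Fund assets after a one-day move in the underlying, floored at zero."""
    return max(aum * (1 + leverage * underlying_return), 0.0)


def rebalance_trade(aum: float, leverage: float, underlying_return: float) -> float:
    """Dollars of underlying the fund must trade at the close to reset its leverage.

    Positive means buy, negative means sell.
    """
    exposure_after_move = leverage * aum * (1 + underlying_return)
    target_exposure = leverage * fund_value_after_move(aum, leverage, underlying_return)
    return target_exposure - exposure_after_move


# Quoted scenario: $30 billion in 2x funds, MSTR down 15% on the day.
trade = rebalance_trade(aum=30e9, leverage=2.0, underlying_return=-0.15)
print(f"Rebalance trade: {trade / 1e9:.1f} billion dollars (negative = sell)")

# A 50% one-day drop leaves nothing in a 2x long fund.
print(fund_value_after_move(aum=30e9, leverage=2.0, underlying_return=-0.50))
```

In this model the trade works out to leverage × (leverage − 1) × AUM × return, so the flow is always in the direction of the day's move, whether the fund is 2x long or 2x inverse.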


Towards a trade idea

The 2x ETF is MSTU and the 2x inverse ETF is MSTZ. Unless these are delevered, if MSTR [falls/rises] by 50% in one day [MSTU/MSTZ] goes to zero.

I’m going to walk you through my stream of consciousness as I reached the end of the article.

1) I’ll accept Elm Wealth’s logic. My first question is…um, are there options listed on the levered ETFs?!

Checkmark✔️

MSTZ is thin but MSTU has over 350k contracts of OI.

2) We are not in Kansas anymore. The distribution is extremely discontinuous.

A hellacious down day in BTC coinciding with premium compression (that positive correlation that Saylor has been monetizing being his undoing is the kind of poetry markets like to write), plus the telegraphed, reflexive ETF rebalance flows, can take MSTU straight to zero, XIV style.

And this can happen on any day.

3) So the next thought in the chain was to consider buying 0DTE puts. Like every morning before brushing my teeth.

This is a non-starter for 2 reasons.

i. 0DTEs are not listed on MSTU

ii. 0DTEs don’t capture the overnight vol so you don’t own “all” the risk. This is especially important in BTC since it’s a 24 hr market.

4) We’ll come back to the question of expiry. I’m just adhering to the sequence of my thinking for better or worse (feel free to debug my mental compiler).

So what strike do I want? The bet only hinges on a Boolean outcome — did MSTR fall 50% or not?

If the thrust of the trade is so starkly binary then the put I want is the lowest strike on the board that you can pay a penny for. I only care about maximum odds. The strike is the payoff, the premium is the outlay. So if I buy the $5.00 put for a penny I get 499-1 odds.

[Since we are thinking in a risk-budgeted binary way rather than in continuous option terms, a parallel framing would be the $5/$0 put spread]

It’s worth noting that this is a bit weird compared to typical investment scenarios. You really only care about the distance of the strike from 0 which determines the payoff and the premium. The price of the stock doesn’t matter because your payoff depends on a certain percent move happening. No matter what nominal price MSTU trades for, if MSTR goes down 50% MSTU gets wiped out.
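To make the odds arithmetic concrete, here is a small sketch that treats the put as a pure binary bet on the ETF going to zero. The function names are mine, and the $5.00 strike for a penny is the example from above.

```python
def payoff_odds(strike: float, premium: float) -> float:
    """Net profit per dollar risked if the underlying goes to zero."""
    return (strike - premium) / premium


def breakeven_probability(odds: float) -> float:
    """Probability at which a bet paying odds-to-1 has zero expected value."""
    return 1.0 / (odds + 1.0)


odds = payoff_odds(strike=5.00, premium=0.01)
print(f"{odds:.0f}-1 odds, breakeven probability {breakeven_probability(odds):.3%}")
```

Note that the stock price never appears as an input. Only the strike and the premium do.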


Let’s start by thinking aloud about constructing a bet and work from concrete to abstract before we bring it back to concrete again.

Buying a 2-week put

Suppose you spend $1,000 2x per month to buy puts that cost $.01.

(Because of the 100x multiplier this translates to 1,000 option contracts)

If you are buying the $2.50 strike, you will get paid $250,000 if MSTU hits zero.

Over the course of the year, following this strategy will cost $24,000 ($1000 twice a month).

Say it hits on the last trial, your net profit is $226,000 (payoff – cumulative outlay). Call it 9-1 odds.

If MSTU has a 10% chance of hitting zero within the year, this is a fair bet. If the probability is higher, you have positive edge, lower you have negative edge.
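The bookkeeping for this bet can be sketched as follows. It uses the numbers above ($1,000 per purchase, 24 purchases a year, a penny premium, the $2.50 strike) and assumes the worst-case timing for a winner, where the hit comes on the last purchase of the year. The function and variable names are illustrative.

```python
CONTRACT_MULTIPLIER = 100  # shares per option contract


def strategy_summary(budget_per_trade, trades_per_year, premium, strike):
    """Cost, payoff, and breakeven probability for repeatedly buying cheap puts."""
    contracts = budget_per_trade / (premium * CONTRACT_MULTIPLIER)
    annual_cost = budget_per_trade * trades_per_year
    payoff_at_zero = contracts * CONTRACT_MULTIPLIER * strike
    net_profit = payoff_at_zero - annual_cost  # hit arrives on the final trade
    return {
        "contracts": contracts,
        "annual_cost": annual_cost,
        "payoff_at_zero": payoff_at_zero,
        "net_profit": net_profit,
        "odds": net_profit / annual_cost,
        "breakeven_probability": annual_cost / payoff_at_zero,
    }


summary = strategy_summary(budget_per_trade=1_000, trades_per_year=24,
                           premium=0.01, strike=2.50)
for name, value in summary.items():
    print(f"{name}: {value:,.3f}")
```

The breakeven probability comes out a touch under 10%, consistent with rounding the odds down to 9-1.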

This a good place to pause for birthday problem math. It allows us to convert into a useful unit of probability per day.

MSTU trades 251 days a year. If we think it’s 10% to hit 0 on one of those days, we can compute the probability p of it NOT hitting zero on any given day like this:

(1 – p²⁵¹) = 10%

p = 99.958%

Converting to odds:

.99958 / ( 1- .99958) ~ 2382

The odds against MSTU hitting zero on a random day are 2382-1.

If there were 0DTEs and you could buy, say, any strike of $24 or higher for a penny, you would have edge relative to your model probability of “There’s a 10% chance that MSTU hits zero this year”.
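A short sketch of the birthday math, assuming each trading day is an independent trial with the same probability (the simplification the text relies on). The helper names are hypothetical.

```python
TRADING_DAYS = 251  # trading days per year, as used in the text


def daily_survival_probability(annual_zero_probability: float,
                               days: int = TRADING_DAYS) -> float:
    """Solve 1 - p**days = annual probability for the daily survival probability p."""
    return (1.0 - annual_zero_probability) ** (1.0 / days)


def odds_against(event_probability: float) -> float:
    """Convert the probability of an event into odds against it."""
    return (1.0 - event_probability) / event_probability


p = daily_survival_probability(0.10)
daily_odds = odds_against(1.0 - p)
# Strike at which a one-penny, one-day put is fairly priced: (strike - 0.01) / 0.01 = odds
fair_strike = 0.01 * (daily_odds + 1.0)

print(f"daily survival probability: {p:.5%}")
print(f"odds against a zero on any given day: {daily_odds:,.0f}-1")
print(f"breakeven strike for a one-penny 0DTE put: ${fair_strike:.2f}")
```

Any strike above the breakeven level, bought for a penny, offers better than fair odds under the 10% annual assumption.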

We can use the daily probability to compute the chance of MSTU going to zero in 10 business days (roughly what a 2-week option encompasses).

1 – .99958¹⁰ = 0.419% (a 99.581% chance of surviving), or 238-1 odds.

If you can buy a 2-week $3 strike put for a penny you’d have edge to this probability.

We can extend the reasoning above to construct useful tables based on a range of assumptions about the “probability of MSTU going to zero within a year”.

 

Let’s step through an example.

If you believe MSTU has a 20% chance of going to zero in a year, then you need 56-1 odds on a 1-month put for it to be fairly priced (assuming the ETF getting zero’d is the only way to win).

[To compute the payoff ratio: (strike – premium) / premium]

If you could buy the 1-month $.57 strike (yes, a very low strike!) for a penny you would get the 56-1 odds.
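A table along these lines can be rebuilt with a few lines of code. This sketch assumes independent daily trials, 251 trading days a year, and my own trading-day counts for each horizon (10 for 2 weeks, 20 for 1 month). The horizon mapping is an assumption, chosen because 20 days reproduces the 56-1 figure above.

```python
TRADING_DAYS = 251


def required_odds(annual_zero_probability: float, horizon_days: int,
                  days_per_year: int = TRADING_DAYS) -> float:
    """Fair odds on a bet that the ETF goes to zero within horizon_days."""
    survive = (1.0 - annual_zero_probability) ** (horizon_days / days_per_year)
    return survive / (1.0 - survive)


def fair_penny_strike(odds: float, premium: float = 0.01) -> float:
    """Strike at which a put bought for `premium` pays exactly the fair odds."""
    return premium * (odds + 1.0)


HORIZONS = {"1 day": 1, "2 weeks": 10, "1 month": 20}  # assumed trading-day counts

for annual_p in (0.10, 0.20, 0.30):
    cells = []
    for label, days in HORIZONS.items():
        odds = required_odds(annual_p, days)
        cells.append(f"{label}: {odds:,.0f}-1 (${fair_penny_strike(odds):.2f})")
    print(f"annual {annual_p:.0%} -> " + " | ".join(cells))
```

Each cell shows the fair odds and, in parentheses, the lowest strike that delivers those odds for a penny.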

I started with this whole “what option can I pay a penny for” reasoning because my intuition told me that for a trade like this you will want a strategy that trades a very near-dated option for a teeny price because that’s probably where you are going to find the best odds in this framework.

But I should not get too anchored to either this penny idea or the notion that the near-dated is absolutely the right way to play this.

At this point, it’s time to look at some data to see if:

1) the prices are ever attractive

2) we can narrow down an expiry range

Market prices

The first thing I did was pull up an option chain for the regular monthly expiry, Jan2025. Good news. While the far OTM put markets are sometimes wide, several strikes are 0 bid, offered at $.05 but critically last sale is $.01. There’s someone who sells these things for a penny.

[This is not the case for the puts in the less liquid MSTZ double short ETF. The markets are also much wider. What does this tell you?]

I fetched fitted end-of-day put prices for MSTU options from 9/24/24 to 1/6/25, filtering for all puts below 1.5 delta: FAR out-of-the-money puts (the puts that correspond to the 98.5 delta call).

  • I computed the payoff ratios for all puts less than 1.5 delta by comparing strike and premiums as explained earlier.
  • The color coding corresponds to expiry buckets in calendar days (ie 1-10, 11-20, etc)
  • I added a penny to all the premiums. So an option fitted to be $0 is marked at $.01

Right away 2 things stand out:

  1. The chart has trash scaling because of point #2
  2. As expected, you are going to get much better payout ratios on near-dated options. If there’s 2 DTE and the stock is $100, buying the $50 put for a penny offers 4999-1 odds.

So the intuition about the near-dated being better bang for the buck seems correct but the scaling is obscuring the picture and there’s another problem (we’re going to get to it but if you feel up to it, try to guess what it is. One hint is it’s not about transaction costs. That’s important and I’ll say a word about it later as well but that’s not the angle here.)

Let’s fix the scaling. Log base 2 compresses this range nicely and is easy to interpret (every tick mark doubles the payoff).

Ahh, much better. Now we see a smooth descent of payout ratios. To be clear, a y-value of 10 is 2¹⁰ or a payoff ratio of 1024-1. An 11 is 2048-1, and so forth.
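For reading the log scale, a tiny conversion sketch helps. The function names are mine and the odds values are ones that appear in this post.

```python
import math


def log2_odds(odds: float) -> float:
    """Payoff ratio expressed on the log base 2 axis."""
    return math.log2(odds)


def odds_from_log2(y: float) -> float:
    """Invert the axis value back into an odds-to-1 payoff ratio."""
    return 2.0 ** y


for odds in (56, 238, 1024, 2048, 4999):
    print(f"{odds:>5}-1 odds -> y = {log2_odds(odds):.2f}")
```

Each step of 1 on the axis is a doubling, so a put plotted at 12 pays four times the odds of one plotted at 10.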

Here’s the payoff table reproduced in log base 2 terms:

You can see how the puts suggest the market thinks the probability of MSTU disappearing is somewhere in the 10-15% range (but probably less since you can win on the puts without the ETF zero’ing).

The risk/reward on these near-dated puts is much higher than the deferred puts which is expected since we require much higher odds. Remember if we thought that there’s a 10% chance of MSTU zero’ing in a year, a $10 strike put can trade for $1 (ie 9-1 odds) and be fairly priced. But these short-dated options need to offer much better odds to compensate for a much smaller probability window of MSTU going under.

We need to compare the payoff ratios with the probabilities we inferred from the annual probability (the birthday math) for the stock zero’ing in 1 week, 1 month, etc.

But before that we can address the mystery problem.

By comparing the strike & premiums we can identify if an option is cheap or expensive compared to our model probability but we can’t assess the validity at the strategy level. In other words, we can’t answer whether it’s better to spend $24k on long-dated puts or $2k a month on 1-month puts.

To handicap that we need to adjust our payoff ratios by how often we need to trade so that we can now compare all the strategies on the same measuring stick — “if my annual probability of MSTU zero’ing is X, what’s the best approach”.

So we divide the payoffs by the number of times you must trade per year.

[Used some simple rules, ie for 1 dte, we divide by 251, for 30 dte by 12]
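A rough sketch of that normalization follows. The only trades-per-year rules included are the two stated above (1 DTE and 30 DTE), so any other expiry would need its own assumption. The function names are hypothetical, and the sample puts reuse strikes mentioned earlier in the post.

```python
def raw_odds(strike: float, premium: float) -> float:
    """Payoff ratio of a single put if the ETF goes to zero."""
    return (strike - premium) / premium


def annualized_odds(strike: float, premium: float, trades_per_year: int) -> float:
    """Payoff ratio divided by how many times a year the put must be re-bought."""
    return raw_odds(strike, premium) / trades_per_year


def implied_annual_probability(odds: float) -> float:
    """Breakeven annual probability for a strategy paying odds-to-1."""
    return 1.0 / (odds + 1.0)


TRADES_PER_YEAR = {1: 251, 30: 12}  # days to expiry -> trades per year, per the text

examples = [
    (1, 24.00, 0.01),   # a one-day put, $24 strike for a penny
    (30, 0.57, 0.01),   # the 1-month $.57 strike for a penny
]
for dte, strike, premium in examples:
    odds = annualized_odds(strike, premium, TRADES_PER_YEAR[dte])
    prob = implied_annual_probability(odds)
    print(f"{dte:>2} DTE ${strike:.2f} put: {odds:.1f}-1 annualized, "
          f"implied annual probability {prob:.1%}")
```

Dividing by the trade count is a simple linear approximation rather than a compounding calculation, so the implied probabilities will not line up exactly with the birthday math.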

We don’t need to use log scaling for the strategy level chart.

The way to read this is if you think the annual probability is greater than say 20% (see the horizontal dashed lines) then all opportunities above the line are positive expectancy. There’s a lot more opportunity in the nearer-dated, confirming the original intuition, but every now and again it looks like a 2-3 month put gets fairly cheap.

The median payout ratio normalized to annual odds is 7-1 or 12.5% implied probability of MSTU offing itself.

 

In summary:

  1. MSTR is highly volatile
  2. If it moves 5-10x its daily standard deviation in one day to the downside, MSTU can go to zero.
  3. Those size moves historically (via Elm’s article) happen about as often as someone rolls a 10 with 2 dice if we say MSTR is 100 vol. If we use its implied vol, which is more forward looking, it’s more like rolling a 7.
  4. The market seems to price the puts in-between those possibilities but we see that the price moves around quite a bit so you can scoop some when they get offered cheap.
  5. The more aum MSTU gathers the larger the end of day rebalance trade. Something to keep in mind.

Keeping a close eye on this, perhaps by building a monitor around the idea, is a nice way to grab a convex outcome. Especially one that I suspect has reflexive properties that are conservatively ignored in this independent events “birthday math”.

Endnote on execution

I assumed $.01 slippage on these options. If you pay $.02 for an option whose payoff we computed based on a penny, you’re getting half the odds. So when talking about really long odds, teeny probabilities and teeny option prices, costs matter. Regardless, you have all the knowledge you need to compare max payoff to your own execution prices to bridge this fully to reality.

anchor yourself

Trump launched a memecoin on Friday night. One wallet (presumably his) owns 80% of it. At the time I’m writing this, he’s $24B richer.

You read that correctly. “B” like 🐝

Anyway, there’s a vesting period and no obvious way he can monetize this without crushing the price. Maybe he can force a bank or another country to accept it as loan collateral. Or maybe he can demand that the Treasury buy it to diversify strategic reserves like its gold.

I’m kidding. I don’t think he can do such things. But also, if there’s a will there’s a way and I suppose a man named “Trump” ridin’ the mother of all heaters is some kind of cosmic onomatopoeia.

Anyway, this brings me to 2 tweets I saw pretty close together on the timeline.

In the early 2000s, I was in a fantasy football league that didn’t have a waiver system. Free agency 24/7 all week. The rules rewarded crackheads who followed football every second, ready to jump online to secure Denver’s backup RB when the starter’s knee exploded on live TV. In other words, derelicts like myself who didn’t have a family in their 20s.

Today, I would never be in such a league. Not my speed anymore.

In fact, one of the reasons I left full-time trading was because it’s out of phase with how I want to live. I was never a news junkie. Having a job that required you to be on top of the news became an energy-suck. Playing the video game called Trading for a Living has a particular cadence.

Cadence. Rhythm. These are important dimensions in matching yourself to what you do.

Trading is different than a lot of desk careers. It’s a bell-to-bell job. Not a lot of homework. No deadlines.

But the best trade of the day can happen at 10:04 in the morning and if you were in the bathroom, you might as well have stayed home. Need to run an errand midday or meet someone for lunch? That can wait. For 30 years. You might make millions but you’re chained to a desk like a 9-year-old who has to raise her hand to go to the bathroom.

The point isn’t to say what’s better or worse. It’s just that trading has a pace and if you like to read peacefully, deliberate decisions slowly, and avoid paranoia you will find the environment stressful. Not to mention the boredom. A trader is like an EMT or firefighter on a slow day. Waiting but ready. Boredom is a major problem for exactly the kind of people who think trading would be a great way to be in the action. You fold a lot of hands. But that takes discipline. Lapses in willpower or even a lack of sleep can seduce you into “loosening” your starting requirements to see the next card.

Those tweets above combined with FOMO and the proliferation of “if you can’t beat em, join’em” rationalization is gonna lure people towards spending their brain cycles on things that will feel deeply unfulfilling and that they are poorly matched to.

If you’re a financial thrillseeker these times are for you. If you are a builder or craftsperson, technology tools are accelerating. More power at your fingertips.

Either way…let your focus anchor you.

Why do you do anything? Maybe go a few whys deep. The alternative will be being battered by the waves. Adrift. And angry. There’s going to be a lot of games happening in front of your eyes. Some say there always have been, at least now it’s out in the open. Touché. But there are consequences to that too.

Are you better off or worse off?

It’s a deeply personal question. I am increasingly of the belief that within a decade, your own whys will be the only questions. We are leaving a world where people (at least people with the luxury to read substacks) just compile their parents’ script. Doing things because it just seems like the next thing to do.

It won’t make sense to do that in the same way it doesn’t make sense to describe a color as round or a chord as wet.

A musing by a reader

I am sharing this with permission from a professional option mm who sent it to me. I deleted any identifying info. The person has been reading moontower for a long time and was graciously sharing.

Thought I’d share a few things I’ve learned. Most will be obvious to you, but maybe a nugget in here for one of your posts.

1) Starting with the obvious: market making is the hardest way to make an easy living. You can grind it out every single day scraping away ticks for edge, and at the end of the day your outcomes are decided by liquidity and volatility. They’re the two things you can’t control, and they’re the two things that determine your fate. This isn’t true for everyone, but it’s certainly true for [redacted].

2) I firmly believe that actively trading is not sustainable for sane human beings. Managing an options portfolio is like taking care of a baby. It’s a living organism that constantly needs to be tended to. If you neglect it, it will die. The amount of mindshare it takes up and context-switching is just simply unmanageable over long periods of time. The intensity it takes to perform at your best during market hours takes a beating on the human body. The guys on my team are [ages redacted] and look like they’re [redacted]. Besides a few partners, I don’t know a person here that doesn’t have a drinking problem. This is anecdotal, but I see it from other groups too.

3) Trading vol is easy. Managing a position is hard. I’m convinced you need to be borderline OCD to manage a book. Between the pruning of positions and the fine-tuning of the model, you have to have an insane attention to detail to get an acceptable slide.

4) You’re always underpaid until you’re a partner or PM. I won’t go into detail on this one because you did a whole blog post on it (maybe it was just a tweet idk). I’ll just say that you nailed it.

[Kris: See Getting Less Screwed On Compensation and adverse selection in the option job market]

5) Like most things, luck is the difference between 0 or hero in this industry and it doesn’t matter if you trade long or short. I’ve seen groups blow up in a matter of days. I’ve seen groups that should’ve blown up and then came back to make a fukin stupid amount of money from continuing to hold short.

“over-the-shoulder” substance

When I worked at Parallax I used to carpool with 3 good friends I worked with. On our rides back and forth, we frequently talked about trading situations that came up on the desk, “Yesterday a broker showed me this, I responded with that, how would you have handled it?”

It was the equivalent of game film for trading. I used to joke that if we recorded the calls on Periscope (remember Periscope?) it would be must-see tv for the fintwit crowd.

Given privacy and compliance, it was a non-starter idea. More broadly I thought that what I call “over-the-shoulder” educational material would be popular. The same way you watch Twitch to see how someone plays a game. Watching their mind work in real-time. Or watching somebody learn to play a song on the guitar by ear especially if they haven’t heard the song before. You pick up a bunch of stuff from what’s typically left out of a scripted tutorial. The mistake they make and how they correct it. Or how they started in the first place with the equivalent of a blank sheet.

I want to hear their reasoning out loud. I want it to be raw.

A few years ago, I watched this guy solve an insane sudoku as he narrated his thinking. It’s one of the most delightful YouTube videos I’ve ever watched.

On the heels of the Moontower Discord Voice chat I hosted a week ago, I thought heck let me try my hand at an over-the-shoulder video. I planned to talk about how to think about synthetic futures in the options market. I opened up my Interactive Brokers account option chains and a spreadsheet and just started riffing.

Since I did no prep, I figured instead of just recording a video I’d just livestream on X/Twitter. From one thought to the next, I ended up covering almost an hour of info.

The feedback to what I saw as merely an experiment was tremendous. It got nearly 13k views and lots of love. I guess I’ll be doing more of this kind of thing.

I uploaded it to YT. I hope you find it useful.

 

An unscripted tour of key option concepts as I look at live data including:

  • Synthetic futures
  • Implied interest rates
  • Reversal/Conversions
  • Box rates
  • Tracking structures in IB
  • P/L attribution of a strangle
  • Tickers used: TSLA IBIT SPX

The Moontower Subscriber Hoot

On Friday I tried a little experiment.

I opened a voice channel in our Discord called The Moontower Subscriber Hoot.

[Back in my floor days it was common for traders on the floor to wear a headset to be on continuous conference with the traders streaming quotes upstairs in the office. At the fund, we also had an open conference line on our turrets to be able to talk to other traders on the team. I don’t know the origin of the term in the industry, but these conferences are called “hoots”.]

I wasn’t writing or working on something that needed deep attention so I figured why not open the hoot and chat while I tried to sell some TLT puts. [I didn’t get filled because I was cheeky with a limit a penny above the bid.]

I narrated what I was doing which I thought could be a nice “over-the-shoulder” way to explain what I’m thinking as I do it plus enjoy the banter, something I miss from the old job.

Regarding TLT, I decided to toe into some bond delta with TLT being down more than 2 st devs over the past month, more than any other liquid name in the Cockpit view.

I decided to express the delta through short vol as TLT vols in the belly of the term structure look attractive enough to sell on both our DASHBOARD and REAL view.

I chose a 35d put…skew is a bit elevated and the strike vol on the options was up a lot that morning.

ATM vol relationship to realized vol history

We talked about several other trades on the hoot and I got to explain a juicy rev/con trade that turned out to not be a real opportunity but people came away with a better understanding of how synthetic futures work!

In narration, the group was able to see how I use greeks to make sense of what’s going on in real-time. It makes a topic that seems abstract super practical and useful. For example, if I sell .30d puts at $1.85 vs stock at $65 and buy them back at $1.88 when the stock drops to $64.50, the greeks show why that’s a big winner if I traded them delta-neutral.

You can also use greeks to discuss vol changes from tick to tick.

“See how the option is offered at $1.80 again 20 minutes later even though stock is a dime lower? The option only has a penny of theta, so it wasn’t erosion. It has a .30 delta and .15 vega so that means IV is down .2 clicks”
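Both examples can be checked with first-order greeks. The sketch below is per share, ignores gamma and re-hedging, and treats the 20 minutes of theta as zero. The function names are hypothetical and the inputs are the numbers from the two examples above.

```python
def delta_hedged_short_put_pnl(sell_price, buyback_price, put_delta,
                               stock_start, stock_end):
    """Per-share P/L of a short put hedged once with stock at the initial delta."""
    option_pnl = sell_price - buyback_price
    # A short put is long delta, so the hedge is short |delta| shares per option share.
    hedge_shares = -abs(put_delta)
    hedge_pnl = hedge_shares * (stock_end - stock_start)
    return option_pnl + hedge_pnl


def implied_vol_change(option_price_change, put_delta, stock_change, vega,
                       theta_decay=0.0):
    """Vol-point change that explains the price move left over after delta and theta."""
    explained_by_delta = put_delta * stock_change  # put delta is negative
    unexplained = option_price_change - explained_by_delta + theta_decay
    return unexplained / vega


pnl = delta_hedged_short_put_pnl(sell_price=1.85, buyback_price=1.88,
                                 put_delta=-0.30, stock_start=65.00, stock_end=64.50)
print(f"delta-neutral P/L per share: {pnl:+.2f}")

vol_move = implied_vol_change(option_price_change=0.0, put_delta=-0.30,
                              stock_change=-0.10, vega=0.15)
print(f"implied vol change: {vol_move:+.1f} vol points")
```

In the first case the option leg loses 3 cents while the stock hedge makes 15. In the second, the put should have gained 3 cents on the stock move alone, so an unchanged price means the 3 cents came out of vol.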

Overall a great experiment that I’ll make a trend. Not committing to a schedule but when I host them it will be on a Thursday and/or Friday.


A few weeks ago Nick Pardini had me on his podcast Analyzing Finance.

I know Nick from his days as a researcher for Parallax, we sat a few desks from one another.

It was fun to catch up. We cover options, volatility, and how options theory principles are found in all kinds of life or business decisions. The options stuff is really perfect for people trying to learn, it’s not heavy, and touches on a lot of practical questions around when you should consider using them or not and why volatility behaves the way it does.

Stay Groovy

you can ONLY eat risk-adjusted returns

Last month my friend Khe published a letter pointing to Money With Katie’s adjustment to the 4% rule for avoiding “lifestyle creep”. The 4% thing is a rule-of-thumb for spending so you don’t outlive your assets in retirement.

My Rule for Avoiding Lifestyle Creep: Don’t Live Beyond Your Assets (2 min read)

Katie’s adjustment is a simple formula that marries both assets and income to come up with a spending rule that balances the desire to spend more as you make more while still saving enough for your future.

The formula = (4% of net worth + post-tax income) / 2
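As a quick sketch of the formula, with a function name and sample household numbers that are purely hypothetical:

```python
def spending_guideline(net_worth: float, post_tax_income: float,
                       withdrawal_rate: float = 0.04) -> float:
    """Average of the withdrawal-rate share of net worth and post-tax income."""
    return (withdrawal_rate * net_worth + post_tax_income) / 2


# Hypothetical household: $1,000,000 net worth and $100,000 of post-tax income.
budget = spending_guideline(net_worth=1_000_000, post_tax_income=100_000)
print(f"Annual spending guideline: ${budget:,.0f}")
```

A raise moves the spending number by only half the increase, which is how the rule leans against lifestyle creep.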

I think of a rule-of-thumb like this as an API to a complex code base known as “retirement finance”. Nobel Prize winner William Sharpe said knowing how much to save and spend for retirement is “the nastiest, hardest problem” in finance.

That 4% rule abstracts away the interaction between investment returns, inflation rates, longevity, and taxes because, well because, we can’t let ourselves be paralyzed by this problem before brushing our teeth every morning.

But that 4% rule and any other “rule” is obviously a guideline sitting on assumptions with generation-long lookback windows. And as we learned in Sunday’s letter, we don’t have a lot of samples of generation-sized windows to generate any confidence-inspiring inferences.

The entire output of the personal finance industry on the topic of investment history is a streetlight problem. But forgivable. Remember “hardest, nastiest problem” and we still gotta find the toothpaste.

Despite a career in options trading, replete with greek letters and complex financial instruments, I find myself in total agreement with Corey Hoffstein’s recent podcast guest Victor Haghani who, despite being a super-quant (sometimes super means flying too close to the sun as he was also a partner in LTCM), admits that when he left high finance he realized he didn’t know how to invest.

From Last of the Tactical Allocators:

After LTCM, I woke up, and it wasn’t a dream: I realized that I needed to focus on managing my family savings. Up until that time, I had worked at an investment bank that took a lot of my compensation and put it into the stock of the company, Salomon. At LTCM, it was pretty natural to invest a lot of my savings in our fund that we managed.

It was a shocking realization to see that I had been working in finance for about 17 years, alongside some really brilliant minds — practitioners and academics — yet I had never really thought much about investing for myself. All of my focus had been on research, to begin with, and then proprietary trading.

[Kris: I wrote a post about my own wake-up call: My Investing Shame Is Your Gain]

My situation was typical: you come out of college, go to Wall Street, and get trained in everything related to Wall Street. Unless you’re going into private wealth, which Salomon didn’t even have at the time, you don’t get trained at all in personal finance. So, there I was, looking around to see what my friends and respected colleagues were doing. Everyone was following the Yale Endowment Model that David Swensen had made so popular. Yale’s returns were incredibly attractive.

I knew people in private equity, hedge funds, venture capital, and distressed investments, so I started investing as though I were a one-man Yale Endowment. Meanwhile, I was on sabbatical from working, spending time with my young kids. After four or five years, I had this realization that what I was doing made no sense — from a life perspective or a risk-and-fee-paying perspective.

The final straw was when I sat down with my accountant, David, to review my tax return for one of those years. I asked him, David, why am I paying all this tax? I haven’t had that much income this year. He explained, Well, you have this income here as short-term capital gains, and you have these expenses over here that you can’t deduct because they’re miscellaneous itemized deductions.

It was like an epiphany: Geez, what am I doing? I realized I had to go back to basics.

A category of basics that I think are fundamental to investing is what I broadly call return math. It matters because investing is really a nesting of many re-investments. And compounding is the realm of multiplication. While it’s true that multiplication is just addition on ‘roids, a failure to understand it can mean the difference between the prefix “ste-” or “hemor-”.

We’ll go to Victor for some education.

Eating risk-adjusted returns

Corey, playing devil’s advocate, confronts Victor with a common charge leveled against quants:

“you can’t eat Sharpe ratios”

Victor: Risk-adjusted returns are the ultimate thing that we care about. I think investors should be trying to maximize risk-adjusted returns. And what is a risk-adjusted return? Well, a risk-adjusted return is the return that you expect to get or that you did get minus a cost for risk. And the cost for risk comes from the fact that typically we have a decreasing marginal utility of wealth or consumption that makes us risk-averse.

We could write down a formula in an idealized world for what risk-adjusted return is, but let’s just think about what it is qualitatively. I mean qualitatively what it is is that I come to you and you have your optimal portfolio of equities and safe assets and whatever, and I say to you all right, that’s it, I’m going to take away that portfolio from you and I’m going to give you in its place a 100% safe portfolio. But you can’t have the portfolio that you have right now that has all these risky assets in it. What is the lowest return that you would accept on a totally safe portfolio so that you would be not happier or less happy than you were with this risky portfolio that had this positive extra expected return?

Well, when you answer that question you’ve just answered the question of what is the risk-adjusted return on your portfolio. The risk-adjusted return could also be termed the certainty equivalent return of your portfolio. It’s basically what would be the 100% safe annuity that you could turn your wealth into without taking any more market risk. That is what you eat. That’s what you’re going to spend on your food. That’s the only thing that you have to eat — that’s the annuitized risk-adjusted stream of consumption that your wealth will support. And so that is what you eat with the appropriate inflation adjustment.

[Kris: There’s a question bandied around Twitter every now and then cutting to the heart of this: what would the TIPS yield need to be for you to plow all your savings into it and not concern yourself with investing anymore? In this interview, Corey and Victor frequently speak in terms of real returns and what sticks out to me is how much higher people think equity real returns are above TIPS but in reality that number over long periods is ~ 3% give or take 2%, maybe 3%. If the TIPS yield were 4% you could really live by the 4% rule without worrying. Except for taxes of course. But if we are going to talk about taxes, that’s the muscle movement that makes any realistic form of alpha look like 10lb dumbbell curls as far as impact. Reclassifying your entire income to a friendlier tax code is a better use of time than trying to outsmart markets. Unless your answer to a calendar with no meetings is afternoon delight with the solar credits section of the IRS code.]

Sharpe ratio is not what you eat. Sharpe ratio is we’re looking for portfolios that have the highest Sharpe ratio, but we’re not trying to maximize the Sharpe ratio of our portfolios. We’re trying to maximize the risk-adjusted return of our portfolio, and there’s going to be all kinds of cases where you’re going to prefer a lower Sharpe ratio portfolio to a higher one depending on what your constraints are and other things. But it’s the risk-adjusted return that you’re eating. You’re not eating the Sharpe ratio, and nobody’s claiming that you’re eating Sharpe ratio.

[Kris: In this next part, Victor doesn’t use the words “arithmetic” and “geometric”. That’s ok. He’s a gentleman, you buy someone a drink before you go there.]

What you’re not eating is you’re not eating expected return. If you start eating expected return, bad things happen. Let’s take a toy example where somebody tries to eat their expected return. Say you have your wealth, you’re retiring, and you look at your portfolio. You construct this portfolio, and you believe that this portfolio has a 5% real return after cost of living adjustments. So, you got this portfolio. The only problem is that you had to build this portfolio with a bunch of risky things, unfortunately, because maybe you were a pharmaceutical person. So, you’ve built it with different pharmaceutical companies, and this is a pretty risky portfolio, but it has a 5% average annual real return.

And let’s just say the volatility of this thing is much higher than the market volatility, that it’s got 30% annual volatility. You got $10 million, and you say, “Oh, well, the expected annual return of this portfolio is $500,000 a year. I’m going to spend $400,000 a year adjusted for inflation for the rest of my life, and I should be fine because my portfolio has a 5% average annual return. So, I’m just going to spend $400,000 — you know, that’s the 4% rule — I’m going to spend $400,000 adjusted for cost of living for the rest of my life, and everything should be fine.”

What’s your most likely amount of wealth in 25 years? Your most likely amount of wealth in 25 years is zero. You’re going to be wiped out. Why are you going to be wiped out? Because the up 30, down 30 is killing you. The first year you went up 35%, so you went up to $13.5 million. You spent $400,000. Beautiful. It’s all great. But the next year you went down 25%. You went down 30%, but there’s the 5% expected return. The next year you go down 25%. Uh-oh. Now after you spent the $400,000, you have less wealth than you started off with. It’s volatility drag, and that volatility drag means that your median portfolio has gone to zero before 25 years. Once it starts going down, it really starts going down fast.

That’s what happens when you eat expected return. So, you have to eat risk-adjusted return. If you eat your expected return, it doesn’t end well. Maybe that’s where we get all these missing billionaires from — they were eating expected return. People get rich, and they think the expected returns are high, and then they try to eat a fraction of the expected return, but they’re not eating risk-adjusted return.
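Victor's toy example is easy to poke at yourself. The sketch below is a minimal Monte Carlo under assumptions that are not in the interview: annual returns are lognormal and independent, matched to the stated 5% arithmetic mean and 30% volatility, and spending is a flat $400,000 in real terms. The function names, seed, and path count are hypothetical. Whether the median lands at exactly zero depends on the return model you choose, so treat this as a way to see the gap between the smooth 5% path and the typical risky path rather than as a confirmation of any specific number.

```python
import math
import random
import statistics


def simulate_spending(start_wealth, mean_return, vol, spend, years, n_paths, seed=0):
    """Return (median terminal wealth, fraction of paths that hit zero)."""
    rng = random.Random(seed)
    # Lognormal gross returns matched to the arithmetic mean and volatility.
    sigma_log = math.sqrt(math.log(1.0 + (vol / (1.0 + mean_return)) ** 2))
    mu_log = math.log(1.0 + mean_return) - 0.5 * sigma_log ** 2
    terminal = []
    for _ in range(n_paths):
        wealth = start_wealth
        for _ in range(years):
            wealth *= rng.lognormvariate(mu_log, sigma_log)
            wealth -= spend  # spending is in real terms, like the returns
            if wealth <= 0:
                wealth = 0.0
                break
        terminal.append(wealth)
    ruined = sum(1 for w in terminal if w == 0.0) / n_paths
    return statistics.median(terminal), ruined


def steady_path(start_wealth, mean_return, spend, years):
    """Terminal wealth if you earned the expected return every single year."""
    wealth = start_wealth
    for _ in range(years):
        wealth = wealth * (1.0 + mean_return) - spend
    return wealth


median_wealth, ruin_rate = simulate_spending(10_000_000, 0.05, 0.30, 400_000, 25, 10_000)
no_vol_wealth = steady_path(10_000_000, 0.05, 400_000, 25)
print(f"no-volatility path after 25y: ${no_vol_wealth:,.0f}")
print(f"median simulated path:        ${median_wealth:,.0f}")
print(f"share of paths wiped out:     {ruin_rate:.1%}")
```

The no-volatility path is what "eating expected return" implicitly assumes. The median simulated path is the one you are more likely to live.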


I will leave you related ideas to chew on.

Dr. Philip Maymin was recently interviewed on the CFA Institute podcast. You might recognize his name because he’s the author of a book I constantly recommend, Financial Hacking.

About Philip:

Dr. Philip Maymin is Portfolio Manager and Director of Asset Allocation Strategies at Janus Henderson. He is also the Endowed Schramm Chair of Analytics and the MSBA Program Director at Fairfield Dolan, the CTO for Swipe.bet, and an instructor at Analytics.Bet.

In the past, he has been a portfolio manager at Long-Term Capital Management, Ellington Management Group, and his own hedge fund. He was Assistant Professor of Finance and Risk Engineering at the NYU School of Engineering, as well as an analytics consultant with several NBA teams and the Chief Analytics Officer for Vantage Sports.

Maymin co-founded the journals Algorithmic Finance and the Journal of Sports Analytics. Additionally, he was a policy scholar for a free market think tank, a Justice of the Peace, a Congressional candidate, and an award-winning journalist.

I recommend listening to it for several threads.

There’s a lot about AI. He’s very much in the weeds, so much so that it’s the topic of his next book.

Next, there’s some nice reinforcement of some of Victor’s ideas (maybe not an accident, Maymin was also at LTCM). Both Victor and Philip talk about dynamic vs static allocations. Victor’s firm helps you dynamically size your portfolio according to your own risk function as well as expected return (I call it “Kelly aware”). Philip emphasizes tail risk management (not in a financial product sense necessarily, he’s speaking generally) because, like many things in life, it’s a few moments that have most of the impact. Or, if Wu-Tang Financial went quant: “power law rules everything around me”.

For Maymin, the focus should be on risk management since the forces of competition make it hard to win big on alpha (alpha being defined as capturing excess return without paying the risk cost), but those same forces do not inhibit you from avoiding disaster, which is a nice asymmetry for the individual.

That conclusion flows easily from his articulation of the efficient markets hypothesis. His coverage of that idea, what it actually means, and its copious shortcomings is the best I’ve heard. I also remember him covering it in his book to a poetic degree and extending well beyond markets iirc.

Postscripts

  • Philip suggests de-risking during higher vol periods because, if you don’t, those periods will account for a much larger share of your performance than their share of calendar time. That’s another way of saying that a fixed position size is a bigger part of a risk budget when times are volatile. That’s fairly obvious, but the open question I have is whether higher volatility periods also coincide with higher returns. I presume they do in arithmetic terms but not in geometric terms, which is what matters, so my unverified take is that you want to be smaller when vol is high even if the expected returns are higher. My own investing process doesn’t switch gears hard with the vol level so this is an area I need to do some work on.
  • I thought about including ideas from Mason Malmuth’s book Gambling Theory and Other Topics, because it emphasizes that the most important concept in gambling is to follow a “non-self-weighting” strategy, which is another way to say “vary your size with the edge”. Such an observation would be banal to this audience but he points to several common strategies that are counterexamples that people generally approve of. He’s a pretty incendiary character, arguing against diversification (he admits this is the largest area of pushback he receives) and claiming that “money management” is a stupid if not moot topic. He also gives the example of nuclear MAD as a self-weighting strategy and the brevity of the Gettysburg address as a non-self-weighting strategy. I gotta admit, reading it again 25 years later, I feel the dude’s a bit of a crank. It was required reading at SIG but that must be in spite of the diversification thing. Jeff Yass has repeatedly emphasized that diversification is a free lunch.
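One way to frame the open question in the first postscript (should you be smaller when vol is high even if expected returns are higher?) is the continuous-time Kelly fraction, which sizes a position as excess return divided by variance. This is a textbook idealization brought in for illustration, not Philip's or Victor's actual method, and the two regimes below are made-up numbers.

```python
def kelly_fraction(excess_return, vol):
    """Growth-optimal fraction of wealth under the idealized formula mu / sigma^2."""
    return excess_return / vol ** 2


def growth_rate(fraction, excess_return, vol):
    """Approximate geometric growth: arithmetic return minus half the variance."""
    return fraction * excess_return - 0.5 * (fraction * vol) ** 2


# Hypothetical regimes: the stressed one has double the vol AND double the return.
calm = kelly_fraction(0.05, 0.15)
stressed = kelly_fraction(0.10, 0.30)
print(f"calm regime size:     {calm:.2f}x")
print(f"stressed regime size: {stressed:.2f}x")
print(f"growth at calm size in stressed regime: {growth_rate(calm, 0.10, 0.30):.2%}")
print(f"growth at stressed size in stressed regime: {growth_rate(stressed, 0.10, 0.30):.2%}")
```

Because variance scales with the square of vol, expected return would have to quadruple to justify the same size when vol doubles. Under these assumptions, the higher-return, higher-vol regime still calls for a smaller position.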

Inoculate yourself against “persuasive” charts

The original moontower blog is https://moontowermeta.com/. The “meta” is an important word. Important enough that Facebook stole my language and turned it into their ticker. F’n Zuck. Leave some for the little people bruh.

The reason I used “meta” (besides the fact that moontower.com wasn’t available) was because a lot of things I think about are fairly meta. Knowledge is the object but how we acquire knowledge is the meta. Trading is a very meta discipline because games with counterparties require a solid “theory of mind”.

In the spirit of meta, I really enjoyed a recent post by Robot James titled:

Valuation Timing with Excel (6 min read)

It’s meta because it’s really about arming yourself with data analysis to confront a narrative or chart. It’s worth stepping through the article together to appreciate just how many meta-nuggets it contains.

First, we start with an object-level observation that you’ve likely encountered. I’ll quote freely from the post but all bold is mine:

You may have seen a lot of charts like this recently:

The conclusions people tend to draw from this chart are:

  • there is an obvious and strong relationship between valuation and expected future returns (cheap = good, expensive = bad)
  • valuation estimates are currently historically high; therefore, expected returns of the S&P 500 are historically low.

We should always be wary of drawing strong conclusions from stuff people share on the internet or in sell-side research.

There are a few reasons to be skeptical of the strong conclusions people tend to make on seeing this:

  • the chart might just be wrong (people screw up financial data analysis all the time)
  • 10 years is a really long time horizon
  • all of the 10-year total returns are actually positive
  • why are there so many points? How many 10-year periods has the index even existed for?!

The good news is that, with a few simple skills, we don’t have to believe what randos on the internet say.

Even if we can’t write code, we can use Microsoft Excel and free online tools to investigate these things ourselves.

James shows how simple it is to grab the data that would feed such a chart so we can manipulate it ourselves. One of the first manipulations addresses the fact that such a chart is really derived from an extremely small sample because each data point overlaps heavily with its neighbors. A rolling 10-year return comprises 120 months, so each new “sample” overlaps with the prior one by 119/120.
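The overlap arithmetic is worth seeing in numbers. The sketch below assumes roughly 150 years of monthly data; that sample length is an assumption for illustration, and the helper names are hypothetical.

```python
WINDOW_MONTHS = 120  # a 10-year window of monthly returns


def overlap_fraction(window):
    """Share of months that two consecutive rolling windows have in common."""
    return (window - 1) / window


def window_counts(total_months, window):
    """Return (number of rolling windows, number of non-overlapping windows)."""
    rolling = total_months - window + 1
    independent = total_months // window
    return rolling, independent


total_months = 150 * 12  # assumed sample length
rolling, independent = window_counts(total_months, WINDOW_MONTHS)
print(f"overlap between neighbors: {overlap_fraction(WINDOW_MONTHS):.1%}")
print(f"dots on the scatterplot:   {rolling}")
print(f"independent decades:       {independent}")
```

A chart can show well over a thousand dots while containing only a handful of independent observations.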

James starts the exploration by looking at monthly returns (instead of 10-year returns) vs CAPE.

Let’s turn back to James for interpretation.

Unsurprisingly, that looks like a big blob. (Anything with monthly returns on the y-axis will look like a big blob.)

[Kris: that bold statement is a useful bit of knowledge that comes from looking at financial data frequently]

What does James do next?

We can look at longer non-overlapping periods. Let’s keep with the 10-year forward window and look at decades.

The problem is that we now only have 15 observations! Ten years is a long time, and we simply don’t have that many unique non-overlapping ten-year periods. And we certainly don’t have many unique non-overlapping ten-year periods that are similar to the current market structure and competitive environment.

[Kris: that bold bit is an evergreen problem in finance because investing is biology not physics. Markets learn so outputs become inputs. What does that mean? Markets are more likely to fall AFTER everyone starts believing they can only go up. The “only goes up” is the output or observation that then becomes an input into how much risk investors take. There is always some price that peers back at history and says “not this time”.]

So James slices the data another way.

Plot the valuation metric itself…

whenever we see an effect, we should ask what other than our pet theory might be causing that effect to appear. In particular a lot has changed over that time period. The market looks nothing like what it did in 1900 today.

And, indeed, if we plot a time series of our valuation metric, it looks kinda drifty.

It’s not really reasonable, I don’t think, to assume that CAPE 20 would “mean” the same thing in 2024 as it did in 1900.

He tries another manipulation:

One cheap and dirty way we can make that metric a bit less drifty and more comparable over time is to standardize it by its values over a recent rolling window.

For example, here I’ve standardized it as a 10yr rolling score. (Not necessarily cos I think that’s the right thing to do – I just want to make a point).

Now it looks a lot more stationary. It stays in the same range. It doesn’t drift off. This is unsurprising cos we forced it to look like that.

[Kris: the bold is another little bit of fingertip knowledge that you acquire from frequent contact with data.]
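To see why the standardized series looks stationary "cos we forced it to", here is a minimal rolling z-score in plain Python. The details are assumptions rather than James's actual spreadsheet: the window includes the current observation, it uses the sample standard deviation, and the input is a made-up series that drifts upward forever. On monthly data a 10-year window would be 120 points; a window of 5 keeps the example readable.

```python
import statistics


def rolling_zscore(values, window):
    """Z-score of each point against the trailing window that ends at that point."""
    scores = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1 : i + 1]
        mean = statistics.mean(chunk)
        stdev = statistics.stdev(chunk)
        scores.append((values[i] - mean) / stdev if stdev > 0 else 0.0)
    return scores


drifting = [float(x) for x in range(10, 40)]  # made-up series with a steady upward drift
z = rolling_zscore(drifting, window=5)
print(f"raw series range:   {min(drifting):.1f} to {max(drifting):.1f}")
print(f"z-score range:      {min(z):.3f} to {max(z):.3f}")
```

The raw series triples while the z-score never moves. The transformation throws away the level by construction, which is exactly why a flat-looking result should not surprise anyone.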

Yet, another manipulation:

Now, we can plot our next year’s returns vs this standardized z-score.

If we still see an effect when we do this, it would make us more confident in the valuation effect. If we don’t, it won’t destroy our confidence because we’ve made some pretty arbitrary and dubious scaling choices here.

Indeed, at least with this scaling choice, we don’t see the effect we are looking for.

That’s ok. That’s the nature of work like this. We’re just exploring, trying to break things. We try to look at things from as many different angles as we can and see how much of the limited evidence lines up.

[Kris: I just want to pause for beauty as my wife likes to say. James is spoon-feeding serum against chart crimes and charlatans who read “How To Lie With Statistics” as a manual].

James’ Conclusion

I think the evidence (and economic sense) supports the idea that high valuations are correlated with lower expected returns. But it’s nowhere near as clear-cut as the initial scatterplot suggests. We simply don’t have enough data, and the market is constantly changing underneath us, making it hard for us to draw strong inferences.

My conclusion

This points to an uncomfortable reality. If a data analysis was conclusive then everyone would do the thing prescribed until the data exhaust from the behavior was no longer conclusive. This is deeply reminiscent of what I call the Paradox of Provable Alpha.

Notice what James did.

He recognized that the data proves nothing. It’s simply too underpowered to accept or reject any claims. His prior barely gets updated: “I think the evidence (and economic sense) supports the idea that high valuations are correlated with lower expected returns.”

He goes to bed at night with judgment as his best guess much like a farmer’s almanac will do better at predicting the weather in a month vs some meteorological model.

 

Thanks again to Robot James for the heavy lifting on the original article. I was just narrating alongside it to highlight what stood out to me and how it related to other topics we discuss here.

seat arbitrage

If a tree falls in a forest, does it have a delta impact?

I didn’t feel like writing so you get the answer I’d give over a beer in Roppongi if you gave me a few minutes to collect my thoughts (and if I drank).

Dark arts. Microstructure. Options.

Enjoy…

 

Another story with some powerful lessons…

Let’s start with one of the best stories I got to be a part of at SIG.

Tina recounted it on Twitter, I’ll offer more color below.

Tina:

Ok, seat arb story. One day, ICE announced that they wanted to buy the NYBOT. Jeff Yass runs into me when he came into the NY office one day when this started, and asked if we should be buying these seats for edge – ICE stock in exchange for seat. I was Head of the NYBOT for SIG at the time with traders in coffee, sugar, cocoa, as well as the Russell 1k,2k,3k.

I had talked to many traders beforehand and overwhelmingly, they were against the buyout. So I told Jeff, no it won’t pass and we would lose buying these seats. Then I dug around more and realized that the vast majority of people against the buyout were leasing the seats and that owners with votes were for.

Called Jeff back and told him I changed my mind. Jeff green lights it. This became such a fun crazy time because, I would be trading during the day, watching ICE stock, watching the seat tape - seat prices on the ticker on the boards, and then when the seats offered were at a sufficient discount, I would stop trading, send my clerk to run to the membership office and bring me docs to sign. The edge from these seats became more than the edge from trading so I would literally stop trading during the day at times to do this. Of course then I had to call Jeff’s right hand man Shawn, and then the COO to free cash up.

We had to put all these seats under individual traders, since technically the traders were the members. So myself, Kris Abdelmessih, etc had many many seats in our names which was also funny. The seats were maybe $650k and I bought maybe 30-40 for SIG.

In the meantime, the CME was also bidding for the NYMEX which was in the same building btw as the NYBOT. Somehow the head of energy for SIG was out for a bit and so I was the most senior in the building. I saw Jeff during that time and he said “I really really want to buy some NYMEX seats”.

So one day, this guy I knew who owned a clearing company is alone w me at the elevator and asked if I would buy his seat. Jeff had given me a $10m top when the seat was maybe worth $10.8m. Think the displayed market for seats was $8.5 at $9.3m or something. Guy is like, “I will sell you my seat for $8.8m.”

I call Bala [Cynwyd], get his admin, he’s in a meeting but I tell her I needed him. Jeff picks me up, approves it, tells me to call the COO to free margin up and wire the money. Was pretty exhilarating that one trade to get so much edge I must say. The best part was also that, SIG got awarded the CME specialist post on the NYSE so we were the only ones who could sell CME short, setting up for a real arbitrage.

All of this happened when I was pretty young, so you can imagine this was all super cool, the trust Jeff had in me to manage so much of his money. I am forever grateful for the opportunity.

Pretty neat.

I’ll add a bit more.

The head of energy on the NYMEX oversaw nat gas trading as well as me (I oversaw oil trading). Before we came to the NYMEX, he was also my boss on the NYSE. I remember being at the NYSE member meeting when then CEO John Thain (after the Grasso departure) started explaining what would become Reg NMS!

SIG also bought NYSE seats before it went public. By the time the NYMEX and NYBOT were ready to demutualize, they understood this particular style of special sit quite well.

Aside on the NYMEX deal

Before the NYMEX was acquired, it was member-owned. The member owned a “seat” which gave them voting rights as a matter of exchange governance plus the right to trade on the floor.

The CME offered to buy the NYMEX in stock. A member would receive some amount of CME shares for each seat they owned. To value a seat you had 2 primary inputs:

  1. The amount of CME shares you get x the price of the shares during some fixing window (I don’t remember the details)
  2. The value of the permit which allowed you to trade on the floor

The permit could be valued by a simple DCF based on how much you could lease a seat to a trader or broker on a monthly basis. Forecasting lease rates could be tricky since the life of the trading floor was already in question.
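The two-input valuation above can be sketched as a stock leg plus a discounted lease stream. Every number here is invented for illustration (the share count, share price, lease rate, discount rate, and remaining floor life are not the actual deal terms), and the function names are hypothetical.

```python
def permit_value(monthly_lease, annual_discount_rate, months):
    """Present value of leasing the trading permit out for a fixed number of months."""
    monthly_rate = (1.0 + annual_discount_rate) ** (1.0 / 12.0) - 1.0
    return sum(monthly_lease / (1.0 + monthly_rate) ** m for m in range(1, months + 1))


def seat_value(shares_per_seat, share_price, monthly_lease, annual_discount_rate, months):
    """Seat value = acquirer shares received per seat + simple DCF of the permit."""
    stock_leg = shares_per_seat * share_price
    return stock_leg + permit_value(monthly_lease, annual_discount_rate, months)


# Hypothetical inputs: 10,000 shares at $500, leased at $10,000/month for 36 months at 10%.
permit = permit_value(10_000, 0.10, 36)
total = seat_value(10_000, 500.0, 10_000, 0.10, 36)
print(f"permit DCF: ${permit:,.0f}")
print(f"seat value: ${total:,.0f}")
print(f"permit share of total: {permit / total:.1%}")
```

The `months` argument is where the uncertainty about the floor's remaining life shows up. Shortening it shrinks the permit leg while leaving the stock leg untouched.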

In fact, this is why lessees were so against the deal. They owned no equity in the deal and their livelihood was at risk if the floor’s days were numbered. The seat owners had their golden ticket. In the time leading up to the sale, seats more than 10x’d in value with many seat owners buying even more. That deal spawned a lot of generationally rich Sals from Staten Island.

The trading permit however was a small portion of the overall seat value so the DCF exercise was fairly inconsequential. The main risk was the CME’s stock price but as you saw from Tina’s story, there was a lot of edge. If CME stock had 30% vol, with the trading permit, you were basically buying the shares at a 1 standard deviation discount (and that’s if you had to hold for a year). With SIG able to short CME too, it was a good trade to plow size into.
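"A 1 standard deviation discount" is just the percentage discount divided by the stock's volatility over the holding period. A minimal sketch, with hypothetical prices chosen only to reproduce that statement (they are not the actual seat prices) and a hypothetical function name:

```python
import math


def discount_in_sigmas(price_paid, fair_value, annual_vol, years_held):
    """Express a purchase discount in units of the stock's standard deviation."""
    discount = (fair_value - price_paid) / fair_value
    return discount / (annual_vol * math.sqrt(years_held))


# Hypothetical: paying 70 for something worth 100 in stock, with 30% annual vol.
one_year = discount_in_sigmas(70.0, 100.0, 0.30, 1.0)
six_months = discount_in_sigmas(70.0, 100.0, 0.30, 0.5)
print(f"cushion if held a year:     {one_year:.2f} sigma")
print(f"cushion if held six months: {six_months:.2f} sigma")
```

The same discount is worth more sigmas the sooner the deal closes, and the ability to short the stock against the seat removes most of the remaining price risk.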

I know 9 figs sounds quaint, but it was a lot of dough in a pre-GFC, pre-Fartcoin world.

 

Being Nimble

You could relate this story back to my video above. There was some opacity to the market because the seat bid/offer prices were maintained by a small group of office workers employed by NYMEX. Our trading assistant would frequently go up to the membership office bearing coffee or treats to chat them up for color on who’s been poking around the order book. Know the chokepoints.

It reminds me of someone who knows their local RE duplex/multi-family market cold. Occasionally a listing comes up and they will know the exact block and layout so they immediately notice that while it says 3 bedroom, it’s really a 4 with a minor change plus it’s on a side of the street worth a 5% bump. Call the broker, offer 50k thru ask if they take the listing down immediately (and this is in the subset of cases where you didn’t get the look before it hit MLS).

Usually this type of fingertip knowledge in dark corners doesn’t scale, but the seat arb was a rare exception. A bunch of jabronis just made their grandkids rich out of the blue and didn’t want to gamble on the closing of the last 20%. Edge.

I think my favorite part of the story though was the moment. We were all in our 20s and Jeff trusted us. SIG was very entrepreneurial. I got to be a member of every exchange in NY except the NYFE in under a decade. You have enough social aptitude plus lots of training in how to think about risk…“go break into that pit”.

Trading firms, at least ones with a floor heritage, have a fairly flat org structure which is strongly on display in Tina’s story. Empowering employees and limiting bureaucracy seems to be a real edge but requires the right culture and alignment. I recently highlighted the flat structure of Valve, but like SIG they don’t answer to any outside investors.

Going from this real-life example up to the level of lesson, this is SIG’s Todd Simkin explaining the advantage in his interview with Ted Seides.

Ted: Over the 30 plus years you’ve been at SIG, you’ve seen in the hedge fund world this growth of multi-manager platforms. How do you view yourselves competitively to some of the bigger people that you see in the markets?

Todd: We have been in the fortunate position of having the most patient capital of all. One of the challenges with hedge funds is their need to frequently manage not just quarterly reports but monthly, weekly, or even daily reports. They must demonstrate adherence to their outlined strategy and deliver consistent returns.

In contrast, our investors are the principals of the firm.

They understand the risks we take, including outsized risks, and they are the ones driving these decisions. If I decide to put on a $100 million insurance risk tied to the winner of the Super Bowl, I’m not worried about explaining losses to a multitude of stakeholders. Instead, I have a single conversation with the relevant decision-maker, outlining the edge I perceive and the terms of the deal. Their involvement includes monitoring the situation, such as checking the health of the quarterback throughout the season.

This patient approach has enabled us to stay in and grow businesses during downturns while shutting down exposures when needed. Unlike others who must adhere to predefined strategies, such as maintaining a certain percentage of long-short equity exposure, we can dynamically allocate capital.

We benefit from the large capital base while retaining the flexibility and focus of having a small number of decision-makers. These decision-makers avoid imposing artificial rules that might constrain our strategies, a common issue when managing external money.

 

Ted: When it comes to trading, even though long-duration capital is an advantage, your focus often remains on relatively shorter time frames. What sets the traders at SIG apart that allows you to stay successful in an extremely competitive market?

Todd: I think there are a few things.

One is that we focus a lot on the decision process—the information available, how we used that information, and then what trade we made—all of that way before we discuss the results. I think a lot of other people have that upside down. They say, “How did you do? If you made money, great, keep doing what you’re doing. If you lost money, that means that you took too much risk, and that’s a bad thing.”

Whereas our traders are focused on the decision process and the expected value first, and because of that, we don’t do things that I’ve seen some of our competitors do that we would think would bleed away some of those profits.

For example, say you do all your work, there’s no selection bias, there’s no reason to think that you’ve gained new information by being able to enter a trade and you get to buy an asset for $10 that you think is worth $20. Seems great, and then somebody comes along and they say they’ll buy it back from you for $19.

Do you want to sell it?

A lot of people at that point would say, “Well, that’s great. I bought it for $10. I sell it at $19. I make $9. I put it in my pocket, and I go away pretty happy, and I sleep well tonight. Nothing bad can happen tomorrow with my position. I’m out of it, and I’ve just made my money.”

And we say, “No.” If anything, if we’re able to buy more at $19 and we still think it’s worth $20, then we would. The fact that we got to buy it for $10 is great, sort of confirmed now by the fact that someone’s willing to pay $19, but that doesn’t mean we want to sell it and lock in this profit just because you have an opportunity to close a position.

That is part of the culture of the firm. We’re not going to give something up just to feel better in our small individual portfolio, which is part of this much, much bigger firm-wide portfolio. If the whole firm had the opportunity to do that and gave up 10% of our profits every time we had a profit-making opportunity, that would be really costly. Somebody else is on the other side of that trade picking up all that extra money that we’d be giving away.

[Kris: That’s exactly the NYMEX/CME example!]

So part of the culture of the firm is one in which we are finding edges wherever we can find them but then capturing all of it by either holding to maturity or holding to expiration or closing at an appropriate rate when we have either new information or where the markets have changed.
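Todd's $10 / $20 / $19 example reduces to a couple of lines of arithmetic. The prices are his; the function names are hypothetical, and the sketch assumes the $20 fair value is still your best estimate when the $19 bid shows up.

```python
def share_of_edge_given_up(entry_price, fair_value, exit_price):
    """Fraction of the original expected profit surrendered by exiting early."""
    return (fair_value - exit_price) / (fair_value - entry_price)


def remaining_edge(fair_value, current_price):
    """Expected profit per unit for a fresh buyer at the current price."""
    return fair_value - current_price


given_up = share_of_edge_given_up(10.0, 20.0, 19.0)
still_available = remaining_edge(20.0, 19.0)
print(f"edge given up by selling at $19: {given_up:.0%}")
print(f"edge left for a buyer at $19:    ${still_available:.2f}")
```

The entry price only appears in the first calculation. The decision at $19 depends on the second one, which is why the answer is "buy more" rather than "lock it in" as long as the fair value estimate has not changed.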