Greeks Are Everywhere

The option greeks everyone starts with are delta and gamma. Delta is the sensitivity of the option price with respect to changes in the underlying. Gamma is the change in that delta with respect to changes in the underlying.

If you have a call option that is 25% out-of-the-money (OTM) and the stock doubles in value, you would observe the option graduating from a low delta (when the option is 25% OTM a 1% change in the stock isn’t going to affect the option much) to having a delta near 100%. Then it moves dollar for dollar with the stock.

If the option’s delta changed from approximately 0 to 100% then gamma is self-evident. The option delta (not just the option price) changed as the stock rallied. Sometimes we can even compute a delta without the help of an option model by reasoning about it from the definition of “delta”. Consider this example from Lessons From The .50 Delta Option where we establish that delta is best thought of as a hedge ratio:

Stock is trading for $1. It’s a biotech and tomorrow there is a ruling:

  • 90% of the time the stock goes to zero
  • 10% of the time the stock goes to $10

First take note, the stock is correctly priced at $1 based on expected value (.90 x $0 + .10 x $10). So here are my questions.

What is the $5 call worth?

  • Back to expected value:
    90% of the time the call expires worthless.
    10% of the time the call is worth $5.

.9 x $0 + .10 x $5 = $.50

The call is worth $.50

Now, what is the delta of the $5 call?

$5 strike call = $.50

Delta = (change in option price) / (change in stock price)

  • In the down case, the call goes from $.50 to zero as the stock goes from $1 to zero.
    Delta = $.50 / $1.00 = .50
  • In the up case, the call goes from $.50 to $5 while the stock goes from $1 to $10.
    Delta = $4.50 / $9.00 = .50

The call has a .50 delta

Using The Delta As a Hedge Ratio

Let’s suppose you sell the $5 call to a punter for $.50 and to hedge you buy 50 shares of stock. Each option contract corresponds to a 100 share deliverable.

  • Down scenario P/L:
    Short Call P/L = $.50 x 100 = $50
    Long Stock P/L = -$1.00 x 50 = -$50

    Total P/L = $0

  • Up scenario P/L:
    Short Call P/L = -$4.50 x 100 = -$450
    Long Stock P/L = $9.00 x 50 = $450

    Total P/L = $0

Eureka, it works! If you hedge your option position on a .50 delta your p/l in both cases is zero.
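If you want to check the arithmetic yourself, here is a minimal sketch of the binary example. The variable and function names are mine, not from the original post, and it assumes the two-outcome world described above with nothing else going on (no interest, no financing).

```python
# Binary outcome from the text: stock at $1 goes to $0 (90%) or $10 (10%).
p_down, p_up = 0.90, 0.10
s0, s_down, s_up = 1.0, 0.0, 10.0
strike = 5.0
multiplier = 100  # shares deliverable per option contract


def call_payoff(stock, k):
    """Value of a call at expiry."""
    return max(stock - k, 0.0)


# Expected values under the stated probabilities.
stock_value = p_down * s_down + p_up * s_up
call_value = p_down * call_payoff(s_down, strike) + p_up * call_payoff(s_up, strike)

# Delta as "rise over run" in each scenario.
delta_down = (call_payoff(s_down, strike) - call_value) / (s_down - s0)
delta_up = (call_payoff(s_up, strike) - call_value) / (s_up - s0)

# Hedge: short 1 contract, long (multiplier x delta) shares.
hedge_shares = multiplier * delta_up


def hedged_pnl(s_final):
    """P/L of short 1 call contract plus the long stock hedge."""
    short_call = -(call_payoff(s_final, strike) - call_value) * multiplier
    long_stock = (s_final - s0) * hedge_shares
    return short_call + long_stock


print("stock value:", stock_value, "call value:", call_value)
print("delta (down, up):", delta_down, delta_up)
print("hedged P/L (down, up):", hedged_pnl(s_down), hedged_pnl(s_up))
```

Changing `hedge_shares` to anything other than 50 makes the two scenario P/Ls diverge, which is the sense in which delta is a hedge ratio.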

But if you recall, the probability of the $5 call finishing in the money was just 10%. It’s worth restating. In this binary example, the 400% OTM call has a 50% delta despite only having a 10% chance of finishing in the money.

The Concept of Delta Is Not Limited To Options


Futures have deltas too. If the SPX cash index increases by 1%, the SP500 futures go up 1%. They have a delta of 100%.

But let’s look closer.

The fair value of a future is given by:

Future = Seʳᵗ


S = stock price

r = interest rate

t = time to expiry in years

This formula comes straight from no-arbitrage pricing. If the cash index is trading for $100 and 1-year interest rates are 5% then the future must trade for $105.13:

100e^(5% * 1) = $105.13

What if it traded for $103?

  • Then you buy the future, short the cash index at $100
  • Earn $5.13 interest on the $100 you collect when you short the stocks in the index.
  • For simplicity imagine the index doesn’t move all year. It doesn’t matter if it did move since your market risk is hedged — you are short the index in the cash market and long the index via futures.
  • At expiration, your short stock position washes with the expiring future which will have decayed to par with the index or $100.
  • [Warning: don’t trade this at home. I’m handwaving details. Operationally, the pricing is more intricate but conceptually it works just like this.]
  • P/L computation:
    You lost $3 on your futures position (bought for $103 and sold at $100).
    You broke even on the cash index (shorted and bought for $100)
    You earned $5.13 in interest

    Net P/L: $2.13 of riskless profit!

You can walk through the example of selling an overpriced future and buying the cash index. The point is to recognize that the future must be priced as Seʳᵗ to ensure no arbitrage. That’s the definition of fair value.
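Here is a small sketch of that arbitrage, with the same handwaving as above (no dividends, no margin, and you earn the full interest rate on short-sale proceeds). The function names are my own. It shows that the profit from buying the $103 future does not depend on where the index ends up.

```python
import math


def futures_fair_value(spot, rate, years):
    """No-arbitrage futures price with continuous compounding: S * e^(r*t)."""
    return spot * math.exp(rate * years)


def reverse_cash_and_carry_pnl(futures_price, spot, rate, years, index_at_expiry):
    """Buy the cheap future, short the index, earn interest on the proceeds."""
    futures_pnl = index_at_expiry - futures_price      # future converges to the index
    short_index_pnl = spot - index_at_expiry           # shorted at spot, covered at expiry
    interest = spot * (math.exp(rate * years) - 1.0)   # interest on the short-sale proceeds
    return futures_pnl + short_index_pnl + interest


spot, rate, years = 100.0, 0.05, 1.0
fair = futures_fair_value(spot, rate, years)
print("fair value:", round(fair, 2))

# The locked-in profit is the same whether the index halves, sits still, or doubles.
for final_index in (50.0, 100.0, 200.0):
    pnl = reverse_cash_and_carry_pnl(103.0, spot, rate, years, final_index)
    print("index at expiry:", final_index, "P/L:", round(pnl, 2))
```

Passing a futures price above fair value makes the P/L negative, which is your cue to flip the trade: sell the future and buy the index.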

You may have noticed that a future must have several greeks. Let’s list them:

  • Theta: the future decays as time passes. If it was a 1-day future it would only incorporate a single day’s interest in its fair value. In our example, the future was $103 and decayed to $100 over the course of the year as the index was unchanged. The daily theta is exactly worth 1 day’s interest.
  • Rho: The future’s fair value changes with interest rates. If the rate was 6% the future would be worth $106.18. So the future has roughly $1.06 of sensitivity per 100 bps change in rates.
  • Delta: Yes, the future even has a delta with respect to the underlying! Imagine the index doubled from $100 to $200. The new future fair value assuming 5% interest rates would be $210.25. Invoking “rise over run” from middle school:
    delta = change in future / change in index
    delta = (210.25 – 105.13) / (200 – 100)
    delta = 105%

    That holds for small moves too. If the index increases by $1 (1%), the future increases by about $1.05, which is also 1% of the future’s own price.

  • Gamma: 0. There is no gamma. The delta doesn’t change as the stock moves.
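One way to see these greeks is to bump each input to the fair value formula and measure the change. This is a rough finite-difference sketch; the bump sizes ($1 for the index, 100 bps for rates, 1 calendar day for time) are my own choices for illustration.

```python
import math


def futures_price(spot, rate, years):
    """Fair value of the future: S * e^(r*t)."""
    return spot * math.exp(rate * years)


spot, rate, years = 100.0, 0.05, 1.0
base = futures_price(spot, rate, years)

# Delta: dollar change in the future for a $1 change in the index.
delta = futures_price(spot + 1.0, rate, years) - base

# Gamma: how much the delta itself changes after another $1 move in the index.
delta_after_move = futures_price(spot + 2.0, rate, years) - futures_price(spot + 1.0, rate, years)
gamma = delta_after_move - delta

# Rho: dollar change for a 100 bps increase in the interest rate.
rho = futures_price(spot, rate + 0.01, years) - base

# Theta: dollar change as one calendar day passes with the index unchanged.
theta = futures_price(spot, rate, years - 1.0 / 365.0) - base

print("delta per $1:", round(delta, 4))
print("gamma:", round(gamma, 10))
print("rho per 100 bps:", round(rho, 4))
print("theta per day:", round(theta, 4))
```

The delta is just e^(rt), a constant that does not depend on the index level, which is why the gamma comes out as zero.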

Levered ETFs

Levered and inverse ETFs have both delta and gamma! My latest post dives into how we compute them.

✍️The Gamma Of Levered ETFs (8 min read)

This is an evergreen reference that includes:

  • the mechanics of levered ETFs
  • a simple and elegant expression for their gamma
  • an explanation of the asymmetry between long and short ETFs
  • insight into why shorting is especially difficult
  • the application of gamma to real-world trading strategies
  • a warning about levered ETFs
  • an appendix that shows how to use deltas to combine related instruments

And here’s some extra fun since I mentioned the challenge of short positions:


Bonds have delta and gamma. They are called “duration” and “convexity”. The duration is the sensitivity of the bond price with respect to interest rates. Borrowing from my older post Where Does Convexity Come From?:

Consider the present value of a note with the following terms:

Face value: $1000
Coupon: 5%
Schedule: Semi-Annual
Maturity: 10 years

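Suppose you buy the bond at par when prevailing interest rates are 5%. If interest rates go to 0, the bond is worth the sum of its undiscounted cash flows, $1,500, so you make a 50% return. If interest rates blow out to 10%, the price falls to about $688 and you lose only about 31%.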

It turns out that as interest rates fall, you actually make money at an increasing rate. As rates rise, you lose money at a decreasing rate. So again, your delta with respect to interest rates changes. In bond world, the equivalent of delta is duration. It’s the answer to the question “how much does my bond change in value for a 1% change in rates?”

So where does the curvature in bond payoff come from? The fact that the bond duration changes as interest rates change. This is reminiscent of how the option call delta changed as the stock price rallied.

The red line shows the bond duration when yields are 10%. But as interest rates fall we can see the bond duration increases, making the bond even more sensitive to further rate declines. The payoff curvature is a product of your position becoming increasingly sensitive to rates. Again, contrast with stocks where your position sensitivity to the price stays constant.
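To see the duration move for yourself, here is a minimal sketch that reprices the note above at several yields. It assumes plain semi-annual discounting and estimates duration by bumping the yield 1 bp each way; the function names and the bump size are mine.

```python
def bond_price(face, coupon_rate, years, yield_rate, freq=2):
    """Present value of a fixed-coupon bond with `freq` payments per year."""
    n = int(years * freq)
    coupon = face * coupon_rate / freq
    y = yield_rate / freq
    pv_coupons = sum(coupon / (1 + y) ** k for k in range(1, n + 1))
    pv_face = face / (1 + y) ** n
    return pv_coupons + pv_face


def duration(face, coupon_rate, years, yield_rate, bump=0.0001):
    """Approximate % price change per 1 percentage point change in yield."""
    up = bond_price(face, coupon_rate, years, yield_rate + bump)
    down = bond_price(face, coupon_rate, years, yield_rate - bump)
    mid = bond_price(face, coupon_rate, years, yield_rate)
    return (down - up) / (2 * bump * mid)


for y in (0.0, 0.025, 0.05, 0.075, 0.10):
    price = bond_price(1000, 0.05, 10, y)
    dur = duration(1000, 0.05, 10, y)
    print(f"yield {y:.1%}  price {price:8.2f}  duration {dur:5.2f}")
```

The duration column rises as yields fall. That changing sensitivity is the bond’s gamma, or convexity.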


Companies have all kinds of greeks. A company at the seed stage is pure optionality. Its value is pure extrinsic premium to its assets (or book value). In fact, you can think of any corporation as the premium of the zero strike call.

[See a fuller discussion of the Merton model on Lily’s Substack which is a must-follow. We talk about similar stuff but she’s a genius and I’m just old.]

Oil drillers are an easy example. If a driller can pull oil out of the ground at a cost of $50 a barrel but oil is trading for $25 it has the option to not drill. The company has theta in the form of cash burn but it still has value because oil could shoot higher than $50 one day. The oil company’s profits will be highly levered to the oil price. With oil bouncing around $20-$30 the stock has a small delta. If oil is $75, the stock will have a high delta. This implies the presence of gamma since the delta is changing.


One of the reasons I like boardgames is they are filled with greeks. There are underlying economic or mathematical sensitivities that are obscured by a theme. Chess has a thin veneer of a war theme stretched over its abstraction. Other games like Settlers of Catan or Bohnanza (a trading game hiding under a bean farming theme) have more pronounced stories but as with any game, when you sit down you are trying to reduce the game to its hidden abstractions and mechanics.

The objective is to use the least resources (whether those are turns/actions, physical resources, money, etc) to maximize the value of your decisions. Mapping those values to a strategy to satisfy the win conditions is similar to investing or building a successful business as an entrepreneur. You allocate constrained resources to generate the highest return, best-risk adjusted return, smallest loss…whatever your objective is.

Games have a variety of mechanics (awesome list here) just as there are many types of business models. Both game mechanics and business models ebb and flow in popularity. With games, it’s often just chasing the fashion of a recent hit that has captivated the nerds. With businesses, the popularity of models will oscillate (or be born) in the context of new technology or legal environments.

In both business and games, you are constructing mental accounting frameworks to understand how a dollar or point flows through the system. On the surface, Monopoly is about real estate, but un-skinned it’s a dice game with expected values that derive from probabilities of landing on certain spaces times the payoffs associated with the spaces. The highest value properties in this accounting system are the orange properties (ie Tennessee Ave) and red properties (ie Kentucky). Why? Because the jail space is a sink in an “attractor landscape” while the rents are high enough to kneecap opponents. Throw in cards like “advance to nearest utility”, “advance to St. Charles Place”, and “Illinois Ave” and the chance to land on those spaces over the course of a game more than offsets the Boardwalk haymaker even with the Boardwalk card in the deck.

In deck-building games like Dominion, you are reducing the problem to “create a high-velocity deck of synergistic combos”. Until you recognize this, the opponent who burns their single coin cards looks like a kamikaze pilot. But as the game progresses, the compounding effects of the short, efficient deck create runaway value. You will give up before the game is over, eager to start again with X-ray vision to see through the theme and into the underlying greeks.

[If the link between games and business raises an antenna, you have to listen to Reid Hoffman explain it to Tyler Cowen!]

Wrapping Up

Option greeks are just an instance of a wider concept — sensitivity to one variable as we hold the rest constant. Being tuned to estimating greeks in business and life is a useful lens for comprehending “how does this work?”. Armed with that knowledge, you can create dashboards that measure the KPIs in whatever you care about, reason about multi-order effects, and serve the ultimate purpose — make better decisions.

The Gamma Of Levered ETFs

Levered ETFs use derivatives to amplify the return of an underlying index. Here’s a list of 2x levered ETFs. For example, QLD gives you 2x the return of QQQ (Nasdaq 100). In this post, we will compute the delta and gamma of levered ETFs and explain what that means for investors and traders.

Levered ETF Delta

In options, delta is the sensitivity of the option premium to a change in the underlying stock. If you own a 50% delta call and the stock price goes up by $1, you make $.50. If the stock goes down $1, you lose $.50. Delta, generally speaking, is a rate of change of p/l with respect to how some asset moves. I like to say it’s the slope of your p/l based on how the reference asset changes.

For levered ETFs, the delta is simply the leverage factor. If you buy QLD, the 2x version of QQQ, you get 2x the return of QQQ. So if QQQ is up 1%, you earn 2%. If QQQ is down 1%, you lose 2%. If you invest $1,000 in QLD your p/l acts as if you had invested $2,000.

$100 worth of QLD is the equivalent exposure of $200 of QQQ.

Your dollar delta is $200 with respect to QQQ. If QQQ goes up 1%, you make 1% * $200 QQQ deltas = $2

The extra exposure cuts both ways. On down days, you will lose 2x what the underlying QQQ index returns.

The takeaway is that your position or delta is 2x the underlying exposure.

Dollar delta of levered ETF = Exposure x Leverage Factor

In this case, QLD dollar delta is $200 ($100 x 2).
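The same bookkeeping as a tiny sketch. The function names are mine, and it ignores fees and financing.

```python
def dollar_delta(exposure, leverage):
    """Dollar delta to the underlying = dollars invested x leverage factor."""
    return exposure * leverage


def pnl(exposure, leverage, underlying_return):
    """P/L for a given percent move in the underlying, before fees and financing."""
    return dollar_delta(exposure, leverage) * underlying_return


print(dollar_delta(100, 2))   # $100 of QLD expressed in QQQ dollar deltas
print(pnl(100, 2, 0.01))      # QQQ up 1%
print(pnl(100, 2, -0.01))     # QQQ down 1%
```

A negative leverage factor covers inverse funds: `dollar_delta(100, -1)` is a short exposure of $100.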

Note that QLD is a derivative with a QQQ underlyer.

Levered ETF Gamma

QLD is a derivative because it “derives” its value from QQQ. $100 exposure to QLD represents a $200 exposure to QQQ. In practice, the ETF’s manager offers this levered exposure by engaging in a swap with a bank that guarantees the ETF’s assets will return the underlying index times the leverage factor. For the bank to offer such a swap, it must be able to manufacture that return in its own portfolio. So in the case of QLD, the bank simply buys QQQ exposure with a notional of 2x the NAV of QLD so that its delta or slope of p/l matches the ETF’s promise.

So if the ETF has a NAV of $1B, the bank must maintain exposure of $2B QQQ deltas. That way, if QQQ goes up 10%, the bank makes $200mm which it contributes to the ETF’s assets so the new NAV would be $1.2B.

Notice what happened:

  • QQQ rallied 10% (the reference index)
  • QLD rallies 20% (the levered ETF’s NAV goes from $1B –> $1.2B)
  • The bank’s initial QQQ delta of $2B has increased to $2.2B.

Uh oh.

To continue delivering 2x returns, the bank’s delta needs to be 2x the ETF’s assets or $2.4B, but it’s only $2.2B! The bank must buy $200mm worth of QQQ deltas (either via QQQs, Nasdaq futures, or the basket of stocks).

If we recall from options, gamma is the change in delta due to a change in stock price. The bank’s delta went from 2 (ie $2B/$1B) to 1.833 ($2.2B/$1.2B). So it got shorter deltas, in a rising market –> negative gamma!

The bank must dynamically rebalance its delta each day to maintain a delta of 2x the ETF’s assets. And the adjustment means it must buy deltas at the close of an up day in the market or sell deltas at the close of a down day. Levered ETFs, therefore, amplify price moves. The larger the daily move, the larger the rebalancing trades need to be!
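Here is the brute-force version of that walkthrough as a sketch. The function name and return layout are mine, and it assumes a single end-of-day rebalance with no fees, financing, or creations/redemptions.

```python
def rebalance_step(nav, leverage, index_return):
    """One day for the bank hedging a levered ETF swap.

    Returns (new_nav, delta_held, delta_target, trade), all in dollars.
    A positive trade means the bank buys the index; negative means it sells.
    """
    delta_before = leverage * nav                      # hedge held coming into the day
    delta_held = delta_before * (1 + index_return)     # hedge marks to market with the index
    new_nav = nav * (1 + leverage * index_return)      # fund earns leverage x index return
    delta_target = leverage * new_nav                  # hedge needed going forward
    trade = delta_target - delta_held
    return new_nav, delta_held, delta_target, trade


nav, held, target, trade = rebalance_step(1_000_000_000, 2, 0.10)
print(f"new NAV {nav:,.0f}  held {held:,.0f}  target {target:,.0f}  trade {trade:,.0f}")

# Same fund on a down day: the bank has to sell.
_, _, _, down_trade = rebalance_step(1_000_000_000, 2, -0.10)
print(f"down-day trade {down_trade:,.0f}")
```

Feeding each day's new NAV back into the function gives you the path of rebalancing trades over several days.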

I’ve covered this before in Levered ETF/ETN tool, where I give you this spreadsheet to compute the rebalancing trades:

From Brute Force To Symbols

There was confusion on Twitter about how levered ETFs worked recently and professor @quantian stepped up:

Junior PM interview question: An X-times leveraged fund tracks an underlying asset S. After time T, S have moved to ST = (1+dS)S0. The initial delta is of course X. What is the portfolio gamma, defined as (dDelta)/(dS), as a function of X?

Despite correctly understanding how levered and inverse ETFs work I struggled to answer this question with a general solution (ie convert the computations we brute-forced above into math symbols). It turns out the solution is a short expression and worth deriving to find an elegant insight.

@quantian responded to my difficulty with the derivation.

I’ll walk you through that slowly.
Mapping variables to @quantian’s question:
  • NAV =1

You are investing in a levered ETF that starts with a NAV of 1

  • X = The leverage factor

The bank needs to have a delta of X to deliver the levered exposure. For a 2x ETF, the bank’s initial delta will be 2 * NAV = 2

  • S = the underlying reference index

The dynamic:

  • When S moves, the bank’s delta will no longer be exactly X times the NAV. Its delta changed as S changed. That’s the definition of gamma.
  • When S moves, the bank needs to rebalance (buy or sell) units of S to maintain the desired delta of X. The rebalancing amount is therefore the change in delta or gamma.

Let’s find the general formula for the gamma (ie change in delta) in terms of X. Remember X is the leverage factor and therefore the bank’s desired delta.

The general formula for the gamma as a function of the change in the underlying index is, therefore:
X (X – 1)
where X = leverage factor
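A quick numerical check of that expression, in lieu of the algebra. This is my own sketch with NAV normalized to 1: the brute-force imbalance is the delta the bank needs after the move (X times the new NAV) minus the delta it already holds (the old hedge marked to market), and it should equal X (X – 1) times the move.

```python
def brute_force_imbalance(leverage, move, nav=1.0):
    """Delta the bank must trade after the index moves by `move` (a fraction)."""
    held = leverage * nav * (1 + move)                 # existing hedge, marked to market
    needed = leverage * nav * (1 + leverage * move)    # leverage x the new NAV
    return needed - held


def closed_form_gamma(leverage):
    """The closed-form expression: X (X - 1)."""
    return leverage * (leverage - 1)


move = 0.01
for x in (-3, -2, -1, 1, 2, 3):
    brute = brute_force_imbalance(x, move)
    formula = closed_form_gamma(x) * move
    assert abs(brute - formula) < 1e-12
    print(f"X = {x:+d}  gamma = {closed_form_gamma(x):2d}  imbalance for a 1% move = {brute:.4f}")
```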

There are 2 key insights when we look at this elegant expression:

  1. The gamma, or imbalance in delta due to the move, grows roughly with the square of the leverage factor, since X (X – 1) = X² – X. The more levered the ETF, the larger the delta adjustment required. If there was no leverage (like SPY to the SPX index), X = 1 and the gamma is 0 because 1 (1 – 1) = 0

  2. The asymmetry of inverse ETFs — they require larger rebalances for the same size move! Imagine a simple inverse ETF with no leverage.

-1 (-1 – 1) = 2

A simple inverse ETF has the same gamma as a double long ETF.

Consider how a double short ETF has a gamma of 6!:

-2 (-2 -1) = 6 

When I admitted that I had only figured out the rebalancing quantities by working out the mechanics by brute force in Excel, @quantian had a neat observation:

I originally found this by doing the brute force Excel approach! Then I plotted it and was like “hm, that’s just a parabola, I bet I could simplify this”

X² – X shows us that the gamma of an inverse ETF is equivalent to the gamma of its counterpart long of one degree higher. For example, a triple-short ETF has the same gamma as a 4x long. Or a simple inverse ETF has the gamma of a double long. The fact that a 1x inverse ETF has gamma at all is a clue to the difficulty of running a short book…when you win, your position size shrinks and the effect is compounded by the fact that your position is shrinking even faster relative to your growing AUM as your shorts profit!
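The “one degree higher” pairing is easy to tabulate. This is a throwaway sketch, and the function name is mine.

```python
def gamma(x):
    """Gamma of an X-times levered fund: X^2 - X."""
    return x * x - x


# An n-times inverse fund lines up with an (n + 1)-times long fund.
pairs = [(-n, n + 1) for n in range(1, 5)]
for inverse, long_ in pairs:
    print(f"{inverse}x inverse: {gamma(inverse):2d}    {long_}x long: {gamma(long_):2d}")
```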

I’ve explained this asymmetry before in The difficulty with shorting and inverse positions as well as the asymmetry of redemptions:

  • As the reference asset rallies, position size gets bigger and AUM drops due to losses. As the reference asset falls, position size shrinks while AUM increases due to profits.
  • Redemptions can stabilize rebalance requirements in declines and exacerbate them in rallies. Redemptions reduce shares outstanding and, in turn, AUM, and in both cases they trigger the fund’s need to buy the reference asset. That buying is stabilizing after declines but not after rallies. In other words, profit-taking is stabilizing while puking is de-stabilizing.

Rebalancing In Real Life

The amount of the rebalance from our derivation is:

X(1 + X ΔS) – X (1+ ΔS)


X = leverage factor

ΔS = percent change in underlying index

Another way to write that is:

X (X-1) (ΔS)

In our example, 2 * (2-1) * 10% = $.2 or an imbalance of 20% of the original NAV!

In practice, the size of the rebalance trade is of practical use. If an index is up or down a lot as you approach the end of a trading day then you can expect flows that exacerbate the move as levered ETFs must buy on up days and sell on down days to rebalance. It doesn’t matter if the ETF is long or inverse, the imbalance is always destabilizing in that it trades in the same direction as the move. The size of flows depends on how much AUM levered ETFs are holding but they can possibly be mitigated by profit-taking redemptions.
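A rough sketch of how you might size the closing flow for a single fund. The $1B of AUM is a made-up round number, the function name is mine, and this ignores redemptions and any intraday hedging the bank has already done.

```python
def rebalance_flow(aum, leverage, index_return):
    """Dollars of the underlying to buy (+) or sell (-) at the close."""
    return aum * leverage * (leverage - 1) * index_return


aum = 1_000_000_000  # hypothetical fund size, for illustration only
flows = {}
for leverage in (2, 3, -1, -2):
    for move in (0.05, -0.05):
        flows[(leverage, move)] = rebalance_flow(aum, leverage, move)
        side = "buy" if flows[(leverage, move)] > 0 else "sell"
        print(f"{leverage:+d}x fund, index {move:+.0%}: {side} ${abs(flows[(leverage, move)]):,.0f}")
```

Long or inverse, every row buys on the up move and sells on the down move.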

During the GFC, levered financial ETFs had large rebalance trades amidst all the volatility in bank stocks. Estimating, frontrunning, and trading against the rebalance to close was a popular game for traders who understood this dynamic. Years later levered mining ETFs saw similar behavior as precious metals came in focus in the aftermath of GFC stimulus. Levered energy ETFs, both in oil and natural gas, have ebbed and flowed in popularity. When they are in vogue, you can try to estimate the closing buy/sell imbalances that accompany highly volatile days.

Warning Label

Levered ETFs are trading tools that are not suitable for investing. They do a good job of matching the levered return of an underlying index intraday. The sum of all the negative gamma trading is expensive as the mechanical re-balancing gets front-run and “arbed” by traders. This creates significant drag on the levered ETF’s assets. In fact, if the borrowing costs to short levered ETFs were not punitive, a popular strategy would be to short both the long and short versions of the same ETF, allowing the neutral arbitrageur to harvest both the expense ratios and negative gamma costs from tracking the index!

ETFs such as USO or VXX which hold futures are famous for bleeding over time. That blood comes from periods when the underlying futures term structure is in contango and the corresponding negative “roll” returns (Campbell has a timeless paper decomposing spot and roll returns titled Deconstructing Futures Returns: The Role of Roll Yield). This is a separate issue from the negative gamma effect of levered or inverse ETFs.

Some ETFs combine all the misery into one simple ticker. SCO is a 2x-levered, inverse ETF referencing oil futures. These do not belong in buy-and-hold portfolios. Meth heads only please.

[The amount of variance drag that comes from levered ETFs depends on the path which makes the options especially tricky. I don’t explain how to price options on levered ETFs but this post is a clue to complication — Path: How Compounding Alters Return Distributions]
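To get a feel for the path dependence, here is a toy two-day round trip. It assumes a fund that earns exactly its leverage factor times each day's index return, with no fees, financing, or trading costs; the names are mine.

```python
def compound(leverage, index_returns, start=100.0):
    """NAV of a daily-rebalanced fund earning leverage x each day's index return."""
    nav = start
    for r in index_returns:
        nav *= 1 + leverage * r
    return nav


# Up 10%, then down 1/11 (about 9.09%), which brings the index back to its start.
path = [0.10, -1 / 11]

index_end = compound(1, path)
long_2x_end = compound(2, path)
short_2x_end = compound(-2, path)

print("index:   ", round(index_end, 2))
print("2x long: ", round(long_2x_end, 2))
print("2x short:", round(short_2x_end, 2))
```

The index goes nowhere, yet both the 2x long and the 2x inverse fund finish below where they started. That is the intuition behind wanting to be short both sides.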

Key Takeaways

  • Levered ETFs are derivatives. Their delta changes as the underlying index moves. This change in delta is the definition of gamma. 

  • Levered and inverse ETFs have “negative gamma” in that they must always rebalance in a destabilizing manner — in the direction of the underlying move.
  • The required rebalance in terms of the fund’s NAV is:

X (X-1) (ΔS)

  • The size of the rebalance grows roughly with the square of the leverage factor. The higher the leverage factor the larger the rebalance. For a given leverage factor, inverse ETFs have larger gammas. 

  • The drag that comes from levered ETFs means they will fail to track the desired exposure on long horizons. They are better suited to trading or short-term risk management.

Appendix: Using Delta To Summarize Exposures

We can see that delta is not limited to options, but is a useful way to denote exposures in derivatives generally. It allows you to sum deltas that reference the same underlying to compute a net exposure to that underlying.

Consider a portfolio:

  • Short 2000 shares of QQQ
  • Long 1000 shares of QLD
  • Long 50 1-month 53% delta calls

By transforming exposures into deltas then collapsing them into a single number we can answer the question, “what’s my p/l if QQQ goes up 1%?”

We want to know the slope of our portfolio vis a vis QQQ.

A few observations:

  • I computed net returns for the portfolio based on the gross exposure (the sum of the absolute values of the exposures)
  • The option exposure is just the premium, but what we really care about is the delta coming from the options. Even though the total premium is <$37k, the largest delta is coming from the options position.
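Here is a sketch of how the collapse into a single number works. The prices are placeholders I picked for illustration, not the figures behind the observations above, and I am assuming the calls are QQQ calls with the standard 100-share multiplier.

```python
# Placeholder prices: assumptions for illustration only, not the original figures.
QQQ_PRICE = 350.0
QLD_PRICE = 70.0


def share_dollar_delta(shares, price, leverage=1.0):
    """Dollar delta to QQQ from a share position (leverage=2 for QLD)."""
    return shares * price * leverage


def option_dollar_delta(contracts, delta, underlying_price, multiplier=100):
    """Dollar delta to QQQ from an option position."""
    return contracts * multiplier * delta * underlying_price


positions = {
    "short 2000 QQQ": share_dollar_delta(-2000, QQQ_PRICE),
    "long 1000 QLD": share_dollar_delta(1000, QLD_PRICE, leverage=2.0),
    "long 50 calls (53% delta)": option_dollar_delta(50, 0.53, QQQ_PRICE),
}

net_dollar_delta = sum(positions.values())
pnl_if_qqq_up_1pct = net_dollar_delta * 0.01

for name, dd in positions.items():
    print(f"{name:28s} {dd:12,.0f}")
print(f"{'net dollar delta':28s} {net_dollar_delta:12,.0f}")
print(f"{'p/l if QQQ +1%':28s} {pnl_if_qqq_up_1pct:12,.0f}")
```

This is only the slope. The calls and QLD both carry gamma, so the estimate is good for small moves and drifts for large ones.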

3-Card Macro

Here’s a rarity for this letter. Let’s play macroeconomics.

I don’t usually write about “macro” because it feels like astrology. You can look at any bit of data (the “current thing” as they say on the internet is inflation) and see it molded to be the cause or result of whatever axe a speaker is trying to grind. People who just learned what inflation was on Tuesday have incorporated it into their pre-existing worldview so seamlessly that by Friday their updated narrative is more coherent than ever.

Macro is the raw material for story-telling. For marketing. It’s a political battering ram for both sides. But macro is a ball of yarn. The discourse that’s consumable is reduced so much to aid absorption that the logic, by necessity, ends up sounding unassailable. It must. The logic is solving for convenience. Not understanding.

So I don’t write about it because I don’t think it’s especially useful. It’s more likely to give you brainworms by beefing up your priors. It will calcify hunches into commandments when the evidence only merits “things to learn more about”.

“Buy Potatoes”

While I don’t write about macro, I’m hardly immune to the animal urge to read about it and pretend I understand how the world works. My mind’s sentry just keeps me from taking it too seriously. [I suspect the sentry has saved me many times and cost me many times and I can’t tell if it’s worth the free rent it gets in my head. The question is moot, since I can’t evict it anyway. It’s like SF up in there.]

My first macro boner happened (was searching for the right verb here and “happens” is the most fitting action word to apply to boners) while reading Michael Lewis’ Liar’s Poker when I was 21. When Michael was a junior salesperson on the Salomon trading desk, he was taken under the wing of a senior trader named Alexander:

The second pattern to Alexander’s thought was that in the event of a major dislocation, such as a stock market crash, a natural disaster, the breakdown of OPEC’s production agreements, he would look away from the initial focus of investor interest and seek secondary and tertiary effects.

Remember Chernobyl? When news broke that the Soviet nuclear reactor had exploded, Alexander called. Only minutes before, confirmation of the disaster had blipped across our Quotron machines, yet Alexander had already bought the equivalent of two supertankers of crude oil. The focus of investor attention was on the New York Stock Exchange, he said. In particular it was on any company involved in nuclear power. The stocks of those companies were plummeting. Never mind that, he said. He had just purchased, on behalf of his clients, oil futures. Instantly in his mind less supply of nuclear power equaled more demand for oil, and he was right. His investors made a large killing. Mine made a small killing. Minutes after I had persuaded a few clients to buy some oil, Alexander called back.

“Buy potatoes,” he said. “Gotta hop.” Then he hung up. Of course. A cloud of fallout would threaten European food and water supplies, including the potato crop, placing a premium on uncontaminated American substitutes. Perhaps a few folks other than potato farmers think of the price of potatoes in America minutes after the explosion of a nuclear reactor in Russia, but I have never met them.

But Chernobyl and oil are a comparatively straightforward example. There was a game we played called What if? All sorts of complications can be introduced into What if? Imagine, for example, you are an institutional investor managing several billion dollars. What if there is a massive earthquake in Tokyo? Tokyo is reduced to rubble. Investors in Japan panic. They are selling yen and trying to get their money out of the Japanese stock market. What do you do?

Well, along the lines of pattern number one, what Alexander would do is put money into Japan on the assumption that since everyone was trying to get out, there must be some bargains. He would buy precisely those securities in Japan that appeared the least desirable to others. First, the stocks of Japanese insurance companies. The world would probably assume that ordinary insurance companies had a great deal of exposure…

If you are 21 years old today, how can you not hear “buy potatoes” and not think of trading as Settlers of Catan? Trading is the ultimate boardgame. It’s just that now, an algo sweeps all the call offers on Brent futures before Alexander finishes reading the headline. [Actually, the market-makers’ streaming call offers have a “panic” setting that gets triggered if they get hit on more than a couple of related strikes at a time and pull their co-located quotes before they get picked off by the news-reading algo. So prices can gap to something closer to fair value on very little trading volume. The graveyard of backtesting signals that don’t appreciate this would occupy every blade of grass on the planet if it were a physical place].

The blessing and curse of our frontal lobes is a desire to understand how the complex world works. Macro brings the romance of chess to investing. It lures mandate-dodging pros and tourists alike with scalable p/l if you can:

a) make accurate predictions


b) identify mispriced lines with respect to those predictions.

In reality, it’s more like 3-card Monte where you have no chance of guessing where the card is and if you get lucky the reward is the false confidence to wager more next time. [I highly recommend the Wikipedia for 3-card Monte. The analogy of “marks” and “shills” to the finance marketing machine writes itself].

My personal relationship with macro is as follows:

  1. I’ve got some mental model of how things work
  2. Stuff happens — none of it was predicted by that model
  3. Backfit new models to explain the strange stuff
  4. Repeat

Seriously, the meme never stops giving.

With that, in this week’s Money Angle, I’m going to go full-Tobias [narrator: you never go full Tobias] and share some macro takes I found resonant in explaining the past several decades.

Money Angle

Since this is macro story-telling I’d consider this entertainment. I’m just picking a story that feels right. These are the re-factored views of Lyn Alden and Cem Karsan.

In Lyn’s March newsletter, we start by rewinding the clock.

  1. The globalized labor arbitrage begins
    Starting in the early 1980s, China began to open its economy to the rest of the world. And then starting in the early 1990s, the Soviet Union collapsed and its various former states also began to open their economies to the world. This combination brought a massive amount of untapped labor into global markets within a rather short period of time, which allowed corporations to geographically arbitrage their operations (a.k.a. offshore a big chunk of their labor force and various facilities) to take advantage of this. This was disadvantageous to laborers and tradespeople in developed markets, and advantageous to executives and shareholders, particularly in the US where we shifted towards massive trade deficits in the 1990s. But it did also help hundreds of millions of people rise out of abject poverty in these developing countries, and created hundreds of millions of new global consumers for those global brands as their wealth grew. China experienced a massive increase in the average standard of living, and so did many former Soviet states.

  2. Bean counters then optimized on the back of this arbitrage
    All sorts of management approaches regarding “lean manufacturing” and “just in time delivery” became popular among corporations and MBA programs during this era. Some of these had their roots in the early 20th-century manufacturing revolution (via Ford, Toyota, and others), but they were basically rediscovered, expounded upon, and brought to a new level in the 1980s, 1990s, and 2000s across the entire manufacturing sector.

    Moontower readers will recognize the MBA mindset of selling options. Engineers build beyond spec, or “overengineer”. Biologists know our redundant kidney is an insurance policy.

    We constantly trade slack for efficiency as Moloch whistles by. The balance between efficiency and slack (or efficiency and fairness for that matter) is hard to find. So we can count on overshooting until the “gotchas” show themselves.

    Of course, this was somewhat of an illusion. Companies basically traded away resilience in favor of efficiency, while pretending that there was minimal downside, and yet this type of approach only works under a benign global environment. Outside of the Middle East and a few localized regions around the world, the 1980s through the 2010s was generally a period of limited war as far as supply chains were concerned, with significant global openness and cooperation. Extremely efficient and highly complex supply chains, with limited redundancy or inventory, could thrive in this stars-aligned macro environment. Any company not playing that game would be less efficient in this environment, and thus would be out-competed.

  3. As we enter a new regime, comparisons to recent history are breaking

    Going forward, any back-tests about inflation or disinflation that only go back twenty or thirty years are practically useless. This whole 1980s-through-2010s disinflationary period (with one substantial cyclical inflationary burst in the 2000s) was during a backdrop of structurally falling interest rates and increasing globalization, with the sacrifice of resiliency for more efficiency. The world is now looking at the need to duplicate many parts of the supply chain, find and develop potentially redundant sources of commodities, hold higher inventories of everything, and in general boost resilience at the cost of efficiency.

  4. The 1940s as the reference point

    I’ve been making a macroeconomic comparison between the 2020s and the 1940s for nearly two years now, and the similarities unfortunately continue to stack up. For the most part, I was referring to monetary and fiscal policy and the long-term debt cycle for that comparison, with charts like this that my readers are quite familiar with by now.

    This unusually wide gap between inflation and interest rates is one of the key reasons I regularly compare the 2020s to the 1940s (rather than primarily the 1970s, despite some other similarities there), and I have been making that comparison for nearly two years before the gap became as wide as it is now.

    Since debt was so high in the 1940s (unlike the 1970s where it was low), and the inflation was driven by fiscal spending and commodity shortages in the 1940s (rather than a demographic boom and commodity shortages as in the 1970s), the Fed held interest rates low even as inflation ran hot in the 1940s (unlike the 1970s where they raised rates to double-digit levels).

  5. Will Russia’s latest adventure hasten the broader movement to diversify away from USD reserves?

    Diversification of global reserves and payment channels into a more multi-polar reserve currency world, with a renewed emphasis on neutral reserve assets. Much like how COVID-19 accelerated the practice of remote work, I think Russia’s war with Ukraine and the associated sanction response by the West will accelerate that diversification of global reserves and payment channels… In a world where official reserves can be frozen, some degree of reserve diversification would be rational for most countries to consider, and as investors, we should probably expect this to occur over time. This is especially true for countries that are not strongly aligned with the United States and western Europe.

  6. The conundrum facing US policymakers

    Unfortunately for the Fed, the US economic growth rate is already decelerating, and basically the only way to reduce supply-driven inflation with monetary policy is to reduce demand for goods, which is recessionary.

    Credit markets are already weakening, the Treasury market is becoming rather volatile and illiquid, and the Fed has ended quantitative easing. The Fed is likely to continue monetary tightening until financial markets get truly messy, at which point they may reverse course to the dovish side yet again.

    Stagflationary economic conditions are inherently hard for central banks to deal with; stagflation is somewhat outside of their expected models. In fact, the Fed might end up being forced to tighten liquidity with one hand and loosen liquidity with the other hand.

    This is another reason why countries may shift towards gold and other commodities for a portion of their official reserves. Not only can fiat reserves (bonds and deposits) be frozen by foreign countries that issue those liabilities, they also keep getting devalued with interest rates that are far below the prevailing inflation rate because debt levels are too high to raise rates above the inflation level.

We now shift to Cem Karsan. In an interview with Hari Krishnan, Cem discusses how policy in the aftermath of the GFC is misunderstood. This is important because policy is changing, and if you failed to interpret the effect of policy in the last decade, you may be caught flat-footed as the policy changes.

I don’t want to be subtle about the potential problem.

Popular investment strategies have been fit to recent decades. Roboadvisors, 60/40, “model portfolios” and target-date funds are now the default. Once you tell an advisor your age and risk tolerance they strap you into an off-the-shelf glide path and go looking for their next client. And it’s hard to fault them. They have no edge in the alpha game. 60/40 and similar approaches are low-cost, commoditized solutions that allow an advisor to (correctly) not spend time stock-picking. With fee compression in advisory, FAs can defend their net profits by outsourcing the investing portion of the job while focusing on planning and sales. To further cement their incentives, the prisoner’s dilemma of advisory means they can’t stray too far from popular asset allocation prescriptions because the advisors are the ones short tracking error volatility.

Lyn believes the macro world is changing sharply. Cem has been beating this drum for the past year at least. Let’s see what he says.

  1. Monetary policy came to dominate because it’s relatively free from political stand-offs

    My mental model of macro involves essentially monitoring the Fed (ie monetary policy) and then fiscal policy. We’ve essentially had one of those two major pipes sealed shut for 42 years. Our founding fathers in the US created a system that was purposely made to not change laws quickly or easily. That was fine until the economy became more dynamic and quicker. Congress decided they couldn’t act quickly enough to economic crises. So they created the Federal Reserve but they wanted to control its mandate, to not give it broad latitude. So they created a clear mandate of price stability and maximum employment, and only gave them one tool: monetary policy. Essentially, there’s only been one game in town because the only way things would ever get passed from a fiscal perspective is in a crisis. The monetary solution was faster, so monetary policy has been the only game in town for 42 years.

  2. The nature of loose monetary policy is to encourage investment

    Monetary policy is free-market economics, right? It is empowering nature to go about and create kind of optimal outcomes. From a growth perspective, that is GDP-maximizing. We have created a technological revolution, almost unintentionally, but by being monetary policy supply-side dominant. We’ve created the Ubers and Amazons and Teslas of the world, companies that never would have existed in previous periods because they wouldn’t have had the cash flow to survive. But infinite cash flow ultimately led to longer duration bets. And this is why growth has outperformed value because cash flows haven’t mattered when money is free. If there’s no need to make money, the need is to capture market share and get bigger, to ultimately make money in the long run. You send money to corporations, corporations make more money. Ultimately, that leads to more globalization. If you send money to corporations, what is the corporation’s mandate? By definition, they have to maximize profitability. Maximizing profitability means lowering costs of their goods, right, and capturing more market share. So that’s the power of competition.

    [This echoes Lyn’s discussion of globalization]

    Remember Moloch’s main trick for fanning unhealthy competition, is to reduce our values to narrow optimizations. When an institution is highly specialized its incentives become perverse from a society-level purview.

    Let’s see why.

  3. Monetary policy didn’t appear to have side effects because the Fed’s narrow mandate failed to consider wider signs of economic vitality

    If you look at incentives, you’ll see the results. The incentives have been to these two simple ideas of price stability and maximum employment. Greenspan realized that the economy had somewhat changed. And that more monetary policy wasn’t causing inflation. It took the natural rate of unemployment down from 6% to 4%. And kept doing more monetary policy, which led to the tech bubble. Without having to worry about inflation their mandate was basically maximum employment. If that’s the case, right? Why wouldn’t you just do more, it’s a free lunch, right. And so the world has had a free lunch now, for 40 years, interest rates have gone lower and lower. Maximum employment has been more and more sticky at the lower end.

    But there was a catch. And it wasn’t the Fed’s job to address it. (In fact, you could argue that thru the “wealth effect” the catch was intended.)

  4. It’s not really the Fed’s fault. The problem is they didn’t have a mandate for inequality, or a lot of other issues.

    The Fed is permitted to neglect growing gaps in equality. These gaps finally caught up to us during Covid, but this time the government responded. We ran a giant fiscal deficit with PPP loans, extended unemployment benefits, direct transfer payments, rent moratoriums, and general forgiveness. Support for these measures was broad enough to get them passed.

    But perhaps the most important result was the recognition that inequality itself is stark. The keyboard class just hummed along on Zoom, often getting paid more, while having fewer opportunities to spend. They built up massive savings which, if I didn’t know better, seem to be conspicuously spent on house overbids and jerking the ladder up with renewed and unprecedented force.

    Let’s turn to Cem’s framing of inequality and why it’s a value we cannot ignore.

    Ultimately this goes back to Socrates. Do you give the best violin players the best violins? Or do you give the worst violin players the best violins? At the end of the day, we’ve given the best violin players the best violins. And Socrates would argue that that’s what you should do, because it creates infinitely beautiful music. But there are a bunch of violin players that don’t get to make music anymore. So we start talking about inequality about 10 years ago, and it’s really built up in five years, and COVID accelerated that trend. Again, all of a sudden, COVID happens. We get that populist kind of reaction, which had been building, what created Donald Trump and created Bernie Sanders. This is not a political statement. The world has become more populist, because of this inequality that’s essentially been created by monetary policy for two generations. And so now the fiscal response is where we are.

    This policy shift is noteworthy for investors. Especially if you have mistaken beliefs about how loose monetary policy affects supply and demand. In the aftermath of the GFC, with the monetary spigot open, the consensus was it would lead to broad inflation.

    Cem offers a counterintuitive explanation that fits what we actually witnessed since the GFC.

  5. The important difference between fiscal and monetary policy: fiscal is inflationary. Monetary, counterintuitively, is not.

    This whole thing is important in terms of the pipes and how everything works. That fiscal policy piece that’s been sealed shut for 40 years now has $12 trillion in fiscal policy. $12 trillion in fiscal policy is an order of magnitude in real terms bigger than the New Deal. It is about the same size as the New Deal when adjusted for the size of the economy. The New Deal filled a hole over a decade, which was called the Great Depression. This is not the Great Depression. We spent about one and a half trillion of that $12 trillion, there’s about $10 trillion still in the pipe to come, and we’re about to reopen.

    So it is not a surprise that we are having inflation. Fiscal policy has a velocity of one, it goes directly into people’s pockets, sometimes even more with things like infrastructure spending. Monetary policy has a velocity of almost zero, it goes directly to “Planet Palo Alto”. And Palo Alto creates new technologies. They’re sophisticated, futuristic people. They provide new self-driving cars and things getting delivered to your doorstep. They create supply. That’s the thing that people don’t understand — monetary policy actually increases supply, it does not increase demand. And so it is deflationary.

  6. The role of the Fed today

    When the Fed was created, the economy was very different. It was dependent on labor. The trickle-down effects of a laborer getting paid more was enough to counteract those inflationary supply effects. That is no longer the case. So ultimately, the Fed has a mandate which is completely unreasonable: to control price stability. With supply-side economics, the only way that they can control this ultimately is to pull back. And slow capital markets decrease via the wealth effect. Ultimately, there’s a significant lag, so they are not in a position to ultimately control inflation without bringing down markets.

Cem is saying that raising rates is a blunt tool. It’s a monetary solution to a fundamentally non-monetary problem because it only works on one side of the ledger. Demand. Rate hikes can only reasonably be expected to slow the economy by decreasing demand. They don’t address the main problem, which is a lack of supply to absorb the demand. In fact, they aggravate it. If you believe inflation is a purely monetary phenomenon this is a belated prompt to unlearn that.

How I relate this to MMT

MMT discourse gets lambasted because it appears irresponsibly profligate. Consider the Investopedia definition of modern monetary theory:

Modern Monetary Theory (MMT) is a heterodox macroeconomic framework that says monetarily sovereign countries like the U.S., U.K., Japan, and Canada, which spend, tax, and borrow in a fiat currency that they fully control, are not operationally constrained by revenues when it comes to federal government spending.

Critics of MMT read that as “these crazy MMTers think you can print as much money as you want and spend it”.

This is a strawman. MMT supporters think it’s not useful to think a government is like a household that has to pay its debt back. This isn’t because they are irresponsible. It’s because they recognize that any discussion of whether a certain amount of debt is reasonable depends on what it is backed by. Debt is neither good nor bad. Its merit depends on what productive assets back it.

So an MMTer believes you can run a deficit (so government expenditures do not need to be matched with revenues) as long as the expenditures lead to investments in productive capacity. In other words, is the spending creating projects and jobs that will generate a real return? This is hardly unfamiliar logic. When students enroll in medical school, their loans are collateralized by the expectation of increased earnings power that comes with getting “M.D.” after your name. Constraining their current ability to spend by their current earnings would be a horrible loss of economic efficiency.

MMT is deeply focused on inflation

In the MMT world, inflation is a serious topic because a highly inflationary environment is evidence that the spending was not wise. Inflation is a test.

I direct you to my notes on Jesse Livermore’s Upside Down Markets paper on the mechanics of inflation:

Jesse, in a nod to Adam Smith’s invisible hand, calls inflation the “invisible fist”. The requirement to return principal and interest to a lender constrains the expansion of credit and therefore spending power. Unproductive spending will lead to the destruction of financial wealth. If I borrowed money for a lemonade stand but then spent it on a vacation, my deficit spending will have created new financial wealth for the system, but it won’t have created any new real wealth. I won’t receive future cash flows to repay the loan.

The first place where the invisible fist will destroy financial wealth will be on my personal balance sheet. I originally accounted for the business as a new financial asset that offset the new liability that I had taken on. If that new financial asset never comes into existence, or if it turns out to be worthless, then it’s going to get written off. I’m going to end up with a new liability and nothing else—a negative cumulative hit to my net worth. The next place where the invisible fist will destroy financial wealth will be on the lender’s balance sheet. The loan will get defaulted on. In a full default, the lender will suffer a hit to his financial wealth equivalent in size to the financial wealth that I added to the balance sheets of the people that I bought the vacations from. The lender will experience an associated decrease in his spending power, compensating for the increase in spending power that my unproductive vacation expenditures will have conferred onto those people. In the end, the total financial wealth and spending power in the system will be conserved. The invisible fist will not allow them to enjoy sustained increases, since the real wealth in the system—its capacity to fulfill spending—did not increase. In this way, the invisible fist will prevent an inflationary outcome in which the supply of financial wealth overwhelms the supply of real wealth.
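The bookkeeping in that passage can be traced with a tiny ledger. This is only a sketch with made-up numbers: the $100 loan size, the three stage names, and the `net_worths` helper are assumptions for illustration, not anything taken from Jesse’s paper.

```python
# Hypothetical figure: a $100 loan meant for a lemonade stand, spent on a vacation.
LOAN = 100.0

def net_worths(stage):
    """Change in each party's net financial worth at a given stage of the story.

    Stages (names are illustrative): "spend", "write_off", "default".
    """
    borrower = {"business_asset": LOAN, "cash": 0.0, "loan_owed": LOAN}
    lender = {"cash": -LOAN, "loan_asset": LOAN}  # cash handed over, loan booked as an asset
    vendor = {"cash": LOAN}                       # the vacation seller receives the spending

    if stage in ("write_off", "default"):
        borrower["business_asset"] = 0.0          # the business never came into existence
    if stage == "default":
        borrower["loan_owed"] = 0.0               # liability wiped out by the default
        lender["loan_asset"] = 0.0                # the lender absorbs the loss

    return {
        "borrower": borrower["business_asset"] + borrower["cash"] - borrower["loan_owed"],
        "lender": lender["cash"] + lender["loan_asset"],
        "vendor": vendor["cash"],
    }

for stage in ("spend", "write_off", "default"):
    worths = net_worths(stage)
    print(stage, worths, "system total:", sum(worths.values()))
```

Right after the spending the system total is +100 (new financial wealth with no new real wealth). After the write-off the hit sits with the borrower, and after the default it sits with the lender. In both cases the system total is back to zero, which is the conservation the passage describes.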

One of the leading MMT theorists, Stephanie Kelton, explains that, if anything, the MMT crowd takes inflation more seriously than mainstream economics. This makes sense if inflation, not the size of the deficit, is the true cause for concern.

In We Need to Think Harder About Inflation, she writes:

It’s the typically cavalier way of thinking about inflation that has come to dominate mainstream economics. Keeping a lid on inflation is the central bank’s job, not something Congress, the White House, or anyone else really needs to waste time thinking about. If inflation accelerates above some desired target, the Fed will knock it back down by tightening monetary policy. Easy peasy. (Unless, of course, the Fed “falls behind the curve,” allowing “inflation expectations to become unanchored” and other mumbo-jumbo.)

All you really need is an “independent” central bank that is deemed “credible” by market participants, and you can sit back and relax. There’s a one-size-fits-all way to deal with any inflation problem. To dial inflation down, simply dial up the overnight interest rate. You might throw in some “forward guidance” to help shape “inflation expectations” but that’s really still about managing inflation via adjustments in the short-term interest rate…

It’s this sort of cavalier attitude and reverence for monetary policy that troubles me. We’re supposed to accept—as a matter of faith—that the central bank can always handle any inflation problem because mainstream economics says so?

The “invisible fist” today

So have we plowed too much money into unproductive projects? Have we overpaid for the projects and startups? The market tries to sort the question out every day.

In keeping any sense of proportion we should recognize that crypto is a tiny portion of the overall economy. But as a metaphor, it poses an interesting question. Have we dumped too much money into cat gifs, figuratively speaking? Are a few becoming insanely rich while society holds the bag?

I’m partial to Michael Pettis’ idea of the bezzle. In Minsky Moments in Venture Capital, Abraham Thomas explains how bezzle conditions emerge. The key insight is that high prices create a positive feedback loop because prices themselves tell you something about risk. High prices signify safety. This is a paradox because a high price is also an asymmetric risk to reward. The paradox tends to resolve itself abruptly:

One way to understand Minsky cycles is that they’re driven by the gap between ‘measured risk’ and ‘true risk’.

When you lend money, the ‘true risk’ you take is that the borrower defaults. But you can’t know this directly; instead you measure it by proxy, using credit spreads. Credit spreads reflect default probabilities, but they also reflect investor demand for credit products. A subprime credit trading at a tight spread doesn’t necessarily imply that subprime loans have become less risky (though that could be true); the tight spread may also be driven by demand for subprime loans. Measured risk has deviated from true risk.

Similarly, when you invest in a startup, the ‘true risk’ that you take is that the startup fails. But you can’t know this directly; instead you measure it by proxy, using markups. Markups reflect inverse failure probabilities (the higher and faster the markup, the more successful the company, and hence the less likely it is to fail — at least, so one hopes). But markups also reflect investor demand for startup equity. Once again, measured risk has deviated from true risk.

During Minsky booms, measured risks decline. During Minsky busts, measured risks increase. The flip from boom to bust occurs when the market realizes that true risks haven’t gone away.

Squaring all of this with my own priors

We started with Lyn and Cem’s analysis of how we got to today. If the world de-globalizes, many of the deflationary forces that combined with loose monetary policy will reverse. The new regime would be inflationary. How inflationary is anyone’s guess. Is the floor for the foreseeable future 2%, 4%, higher?

If bubbles pop and bezzles recede, would that make inflation worse as we discover that spending was wasted and economic supply did not grow where it needed to (ahem housing and energy)?

Or will deflation come roaring back as drawdowns push wealth effects in reverse and higher borrowing costs on huge loan balances crowd out future growth?

My prior is torn between:

a) inflation will fall in a way that surprises people. Low inflation is actually the default because wealth inequality acts as what I call an “inflation heat sink”. Here’s my explanation using the boardgame Monopoly:

Unfortunately, if you think inflation is going to fall the trade is probably not to buy treasuries since real rates are already quite negative. The related insight is more concerning. Bonds can keep falling as inflation falls and nothing would be glaringly mispriced. Ouch, 60/40.


b) Stimulative fiscal policy is inflationary in the short-run (and in the long-run if the spending is unproductive) and while fiscal policy is highly political, neither party is afraid to run big deficits anymore.

If inflation accelerates (or at least fails to abate) it’s not clear what investments it would be good for. People like to promote real estate as an inflation hedge. Given the low affordability already built into prices, outpacing real rates is hardly a given. Maybe that’s where you lose the least? What inflation expectations are already embedded in commodity prices? How to invest for inflation from current prices is a hard problem.

Unwinding imbalances

I tend to believe Lyn and Cem’s story about how the forces that brought us to today are unwinding. If we have spent the past 40 years building a giant imbalance between capital and labor this reversal is ultimately a good thing. But it’s not going to make capital happy. If you have been an outsize winner for the past decade, you’re probably going to moan about it when the imbalance narrows (hey it’s understandable that we respond to marginal changes to our situation, but it’s not reasonable to miss the big picture that prior wins have been out of proportion to contribution). What was given to you easily by the Fed’s support of high duration bets, can be taken away. Don’t expect sympathy.

The most direct way to correct the imbalance would be to heavily tax high earners and the rich but it’s not politically viable. But there’s a backdoor. Fiscal stimulus, especially if done in a progressive way, will bolster the economy while we raise interest rates. The net effect will offset labor’s pain in an economic slowdown. But it will still be inflationary since we are supporting demand (by giving $$ to those who actually spend it) while constraining supply (by raising hurdle rates on capital expenditures). To a rich person invested in growth and expensive real estate, this will feel like stagflation as labor’s slice of the pie increases relatively while demand for the rich person’s assets slows (opposite of how loose monetary conditions created inflation for homes and stocks but not labor and importable goods). It would be a shadow progressive tax using inflation to take back financial wealth while creating conditions for lower-wage earners to keep up with the price of goods and services.

No matter how the economic picture unfolds, the theme that feels alive under the surface is imbalance. It doesn’t matter that the average American lives better than a 16th-century king. We relentlessly compare and that’s never going to stop. The imbalance matters. We have accumulated massive amounts of debt. If the debt isn’t truly backed by the collective wealth of the individuals who make up the economy, eventually a catalyst will shatter the illusion that we can continue rolling it over.

Mechanically, that debt is of course backed by the sum of our assets. But when push comes to shove, how is it apportioned? What is the fair attribution? Do our anti-trust laws and tax policy divide the risk and rewards fairly (whatever that means from an equitable and efficiency perspective)? It’s not as easy as saying “the market gets this right”. The rules are political. Power is not accorded solely due to merit (whatever that means). So sure, the debt is backed by our assets, but good luck calling the loans back in.

I recently re-watched Ray Dalio’s How The Economic Machine Works. I assigned it to some young teens who are trying to learn about the economy. The video shows how the macroeconomy is built up from everyday transactions. A loan (ie credit) is one such transaction. The credit cycle is endemic to the economy itself. Dalio overlays the short-term credit cycle (small squiggles) on the long-run cycle.


The ends of long-term debt cycles are times of massive upheaval. Historically this has meant violence, currency devaluations, and victors dividing the spoils regardless of who was previously listed as a creditor.

Dalio uses a highly understated word for this adjustment. Deleveraging. The default process (whether outright or via inflation) is redistributive. Politics determines the winners and losers. It’s always been give and take. But we need to keep talking.

If we cannot find a way to cooperate, the problem of imbalance and feelings of injustice don’t just disappear.

Redistribution finds a way.


I realize MMT is a polarizing topic. I have a cursory understanding of it, so feel free to correct me: I think the core insight that a government doesn’t need to run like a household is correct mechanically. The problem is how the debt is allocated to the citizens in the form of taxes and transfer payments. So from a policy point of view, I don’t know if MMT frameworks lead to effective practical policy. Effective is always a matter of debate and depends on your constituency’s perspective. In that sense, MMT is no different than any other framework that claims to have good reasons for how it grows and splits the pie.

The great personal dividend of MMT is how it popularized the sectoral balances approach to understanding the economy. Similar to Dalio’s video, it views the economy as a collection of small transactions. Since every buy is someone else’s sell, we can use a giant T account of credits and debits to understand the economy from basic accounting. Those credits and debits flow through the household, government, corporate and “foreign sectors”. Economists associated with this approach include Godley, Pettis, Kalecki, and Levy.

To demonstrate the power of this lens, I encourage you to read Jesse Livermore’s Upside Down Markets paper. It completely reshaped my understanding of macro into something that I believe is closer to reality. Whenever I hear a macro argument, I at least try to place it within the sectoral balances framework to see if it is at least self-consistent. The advantage of basic accounting identities is they are identities. They are tautologically true and that’s a useful razor for initially evaluating an argument.
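To make the “identity” point concrete, here is a minimal sketch of a sectoral ledger. The flows and dollar amounts are hypothetical, invented only to show the mechanics: because every payment is recorded once as a debit and once as a credit, the sector balances must sum to zero no matter what numbers you plug in.

```python
from collections import defaultdict

# Hypothetical flows for illustration: (payer, receiver, amount).
flows = [
    ("government", "household", 500.0),  # transfer payments
    ("household", "government", 300.0),  # taxes
    ("household", "corporate", 150.0),   # consumer spending
    ("corporate", "household", 100.0),   # wages
    ("household", "foreign", 40.0),      # imports
]

def sector_balances(flows):
    """Net financial balance per sector: money received minus money paid."""
    balances = defaultdict(float)
    for payer, receiver, amount in flows:
        balances[payer] -= amount      # every payment is a debit for the payer...
        balances[receiver] += amount   # ...and an equal credit for the receiver
    return dict(balances)

balances = sector_balances(flows)
for sector, balance in balances.items():
    print(f"{sector:>10}: {balance:+.0f}")
print("sum of all sector balances:", sum(balances.values()))
```

In this toy ledger the government’s deficit of 200 shows up, dollar for dollar, as the combined surplus of the other three sectors. A macro argument that implies every sector can run a surplus at once fails this check before you even get to the economics.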

Jesse’s paper is a beast. Many people won’t read a 40k word paper. I encourage everyone to read this one. While it’s a dense exploration of timely macroeconomic ideas, Jesse’s rare ability to tackle the complexity in an approachable, step-by-step progression is an amazing opportunity to learn.

Having said that, I realize many people still won’t read the paper. So I decided to try my hand at creating this explainer. I completely re-factored it to turn it into a personal reference. You can use it too:

✍️Moontower Guide To Jesse Livermore’s Upside Down Markets (link)

Moontower on Gamma

The first option greek people learn after delta is gamma. Recall that delta represents how much an option’s price changes with respect to share price. That makes it a convenient hedge ratio. It tells you the share equivalent position of your option position. So if an option has a .50 delta, its price changes by $.50 for a $1.00 change in the stock price. Calls have positive deltas and puts have negative deltas (ie puts go down in value as the stock price increases). If you are long a .50 delta call option and want to be hedged, you must be short 50 shares of the stock (options refer to 100 shares of underlying stock). For small moves in the stock, your call and share position p/l’s will offset because you are “delta neutral”.

This is true for small moves only. “Small” is a bit wishy-washy because small depends on volatility and this post is staying away from that much complexity. Instead, we want to focus on how your delta changes as the stock moves. This is vital because if your option delta changes then your equivalent share position changes. If your position size changes, then your p/l is no longer constant for every $1 move in the stock. If I’m long 50 shares of a stock, I make the same amount of money for each $1 change. But if I’m long 50 shares equivalent by owning a .50 delta option, then as the stock increases my delta increases, say to .60, as the option becomes more in-the-money. That means the next $1 change in the stock produces $60 of p/l instead of just $50. We know that deep in-the-money options have a 1.00 delta, meaning they act just like the stock. Imagine a 10 strike call expiring tomorrow when the stock is trading for $40: the option price and stock price will move in lockstep because the option has 100% sensitivity to the change.

A call option can go from .50 delta to 1.00 delta. Gamma is the change in delta for the change in stock. Suppose you own a .50 delta call and the stock goes up by $1. The call is solidly in-the-money and perhaps its new delta is .60. That change in delta from .50 to .60 for a $1 move is known as gamma. In this case, we say the option has .10 gamma per $1. So if the stock goes up $1, the delta goes up by .10.
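The arithmetic above is small enough to write down directly. This is just a sketch: the function names are mine, and it assumes gamma stays constant over the $1 move, which real options only approximate.

```python
CONTRACT_MULTIPLIER = 100  # shares per option contract, as in the text

def new_delta(delta, gamma, stock_move):
    """Delta after a stock move, assuming gamma is constant over the move."""
    return delta + gamma * stock_move

def share_equivalent(delta, contracts=1):
    """The share position that the option position currently behaves like."""
    return delta * CONTRACT_MULTIPLIER * contracts

delta_before = 0.50
delta_after = new_delta(delta_before, gamma=0.10, stock_move=1.0)

print(round(share_equivalent(delta_before)))  # 50
print(round(share_equivalent(delta_after)))   # 60
```

So a hedger who was short 50 shares against the call is under-hedged by 10 shares after the $1 rally. Gamma is what tells you how quickly that hedge goes stale.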

While this is mechanically straightforward, some of the lingo around gamma is confusing. People spout phrases like “a squared term”, “curvature”, “convexity”. I’ve written about what convexity is and isn’t because I’ve seen it trip up people who should know better (see Where Does Convexity Come From?). In this post, we will demystify the relationship of these words to “gamma”. In the process, you will deeply improve your understanding of options’ non-linear nature.

How the post is laid out:


  • Acceleration
  • The squared aspect of gamma
  • Dollar gamma
  • Constant gamma
  • Strikeless products
  • How gamma scales with price and volatility
  • Gamma weighting relative value trades



You already understand “curvature”. I’ll prove it to you.

You wake up tomorrow morning and see a bizarre invention in your driveway. An automobile with an unrivaled top speed. You take it on an abandoned road to test it out. Weirdly, it accelerates slowly for a racecar. Conveniently for me, it makes the charts I’m about to show you easy to read.

You are traveling at 60 mph.

Imagine 2 scenarios:

  1. You maintain that constant speed.
  2. You accelerate such that after 1 minute you are now traveling at 80 mph. Assume your acceleration is smooth. That means over the 60 seconds it takes to reach 80 mph, your speed increases equally every second. So after 3 seconds, you are traveling 61 mph, at 6 seconds you are moving 62 mph. Eventually at 60 seconds, you are traveling 80 mph.


In the acceleration case, what was your average speed or velocity during that minute?

Since the acceleration was smooth, the answer is 70 mph.

How far did you travel in each case?

Constant velocity: 60 mph x 1/60 hour = 1 mile

Accelerate at 20 mph per minute: 70 mph average x 1/60 hour ≈ 1.17 miles

If the acceleration is smooth, we can take the average velocity over the duration and multiply it by the duration to compute the distance traveled.

Let’s now keep accelerating this supercar at the same 20 mph per minute for a total of 15 minutes and see how far we travel. Compare this to a vehicle that maintains 60 mph for the whole trip. The table uses the same logic: the average speed for each minute assumes a constant acceleration rate.

Let’s zoom in on the cumulative distance traveled at each minute:

We found it! Curvature.
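If you want to rebuild the minute-by-minute comparison yourself, here is a minimal Python sketch. The function and variable names are made up for illustration; the only inputs are the numbers from the story (60 mph start, 20 mph of acceleration per minute, 15 minutes).

```python
# Rebuild the comparison: a car holding 60 mph vs. one that starts at 60 mph
# and accelerates smoothly by 20 mph every minute.
START_MPH = 60.0
ACCEL_MPH_PER_MIN = 20.0
MINUTES = 15

def distance_table(minutes=MINUTES):
    """Return rows of (minute, avg speed that minute, cumulative miles constant, cumulative miles accelerating)."""
    rows = []
    cum_const = 0.0
    cum_accel = 0.0
    for minute in range(1, minutes + 1):
        speed_start = START_MPH + ACCEL_MPH_PER_MIN * (minute - 1)
        speed_end = START_MPH + ACCEL_MPH_PER_MIN * minute
        avg_speed = (speed_start + speed_end) / 2  # valid because acceleration is smooth
        cum_const += START_MPH / 60                # miles covered in one minute at 60 mph
        cum_accel += avg_speed / 60                # miles covered in one minute at the average speed
        rows.append((minute, avg_speed, cum_const, cum_accel))
    return rows

table = distance_table()
for minute, avg_speed, cum_const, cum_accel in table:
    curvature = cum_accel - cum_const  # gap vs. the linear (constant speed) estimate
    print(f"min {minute:>2}  avg {avg_speed:>4.0f} mph  "
          f"constant {cum_const:5.2f} mi  accelerating {cum_accel:5.2f} mi  "
          f"curvature {curvature:5.2f} mi")
```

The last column is the curvature: the gap between the accelerating car and the straight-line estimate, which widens every minute.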

Curvature is the adjustment to the linear estimate of distance traveled that we would have presumed if we assumed our initial speed was constant. Let’s map this analogy to options.

  • Time –> stock price

    How much time has elapsed from T₀ maps to “how far has the stock moved from our entry?”

  • Velocity –> delta

    Delta is the instantaneous slope of the p/l with respect to stock price, just as velocity is the instantaneous speed of the car.

  • Acceleration –> gamma

    Acceleration is the change in our velocity just as gamma is the change in delta.

  • Cumulative distance traveled –> cumulative p/l

    Distance = velocity x time. Since the velocity changes, multiply the average velocity by time. In this case, we can double-check our answer by looking at the table. We traveled 52.5 miles in 15 minutes or 210 mph on average. That corresponds to our speed at the midpoint of the journey — minute 8 out of 15.
    P/l = average position size x change in stock price. Just as our speed was changing, our position size was changing!

Delta is the slope of your p/l. That’s how I think about position sizes. Convexity is non-linear p/l that results from your position size varying. Gamma mechanically alters your position size as the stock moves around.

The calculus that people associate with options is simply the continuous expression of these same ideas. We just worked through them step-wise, minute by minute taking discrete averages for discrete periods.

Intuition For the Squared Aspect Of Gamma

Delta is familiar to everyone because it exists in all linear instruments. A stock is a linear instrument. If you own 100 shares and it goes up $1, you make $100. If it goes up $10, you make $1,000. The position size is weighted by 1.00 delta (in fact bank desks that trade ETFs and stocks without options are known as “Delta 1 desks”).  Since you just multiply by 1, the position size is the delta. If you’re long 1,000 shares of BP, I say “you’re long 1,000 BP deltas”. This allows you to combine share positions and option positions with a common language. If any of the deltas come from options that’s critical information since we know gamma will change the delta as the stock moves.

If your 1,000 BP deltas come from:

500 shares of stock


10 .50 delta calls

that’s important to know. Still, for a quick summary of your position you often just want your net delta to have an idea of what your p/l will be for small moves.

If you have options, that delta will not predict your p/l accurately for larger moves. We saw that acceleration curved the total distance traveled. The longer you travel the larger the “curvature adjustment” from a linear extrapolation of the initial speed. Likewise, the gamma from options will curve your p/l from your initial net delta, and that curvature grows the further the stock moves.

If you have 1,000 BP deltas all coming from shares, estimating p/l for a $2 rally is easy — you expect to make $2,000.

What if your 1,000 BP deltas all come from options? We need to estimate a non-linear p/l because we have gamma.

Let’s take an example from the OIC calculator.

The stock is $28.35

This is the 28.5 strike call with 23 days to expiry. It’s basically at-the-money.

It has a .50 delta and .12 of gamma. Let’s accept the call value of $1.28 as fair value.

Here’s the setup:

Initial position = 20 call options.

What are your “greeks”?

    • Delta  =  1,000

      .50 x 20 contracts x 100 share multiplier

    • Gamma =  240

      .12 x 20 contracts x 100 share multiplier

(the other greeks are not in focus for this post)

The greeks describe your exposures. If you simply owned 1,000 shares of BP you know the slope of your p/l per $1 move…it’s $1,000. That slope won’t change.

But what about this option exposure? What happens if the stock increases by $1, what is your new delta and what is your p/l?

After $1 rally:

    • New delta = 1,240 deltas

      .62 x 20 contracts x 100 share multiplier

      Remember that gamma is the change in delta per $1 move. That tells us if the stock goes up $1, this call will increase .12 deltas, taking it from a .50 delta call to a .62 delta call.

That’s fun. As the stock went up, your share equivalent position went from 1,000 to 1,240.

Can you see how to compute your p/l by analogizing from the accelerating car example?

[It’s worth trying on your own before continuing]

Computing P/L When You Have Gamma 

Your initial delta is 1,000. Your terminal delta is 1,240.

(It’s ok to assume gamma is smooth over this move just as we said the acceleration was smooth for the car.)

Your average delta over the move = 1,120

1,120 x $1 = $1,120

You earned an extra $120 vs a basic share position for the same $1 move. That $120 of extra profit is curvature from a simple extrapolation of delta p/l. Since that curvature is due to gamma it’s best to decompose the p/l into a delta portion and a gamma portion.

  • The delta portion is the linear estimate of p/l = initial delta of 1,000 x $1 = $1,000
  • The gamma portion of the p/l is the same computation as the acceleration example:

Your gamma represents the change in delta over the whole move. That’s 240 deltas of change per $1. So on average, your delta was higher by 120 over the move. So we scale the gamma by the move size and divide by 2. That represents our average change in delta which we multiply by the move size to compute a “gamma p/l”.

Gamma p/l = (Γ x △S / 2) x △S


Γ = position weighted gamma = gamma per contract  x  qty of contracts  x  100 multiplier

△S = change in stock price

We can re-write this to make the non-linearity obvious — gamma p/l is proportional to the square of the stock move!

Gamma p/l = ½ x Γ x △S²
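To check the arithmetic, here is a small Python sketch that computes the p/l both ways: the “average delta” route from the car analogy and the delta-plus-gamma decomposition. The function name is hypothetical, and it assumes gamma is constant over the move, as the text does.

```python
def option_position_pnl(delta, gamma, move):
    """P/l for a stock move, assuming gamma is constant over the move.

    delta: position delta in shares; gamma: change in position delta per $1;
    move: change in the stock price in dollars.
    """
    end_delta = delta + gamma * move
    avg_delta = (delta + end_delta) / 2
    total_pnl = avg_delta * move         # the "average speed x time" logic
    delta_pnl = delta * move             # linear estimate from the initial delta
    gamma_pnl = 0.5 * gamma * move ** 2  # curvature adjustment
    return total_pnl, delta_pnl, gamma_pnl

contracts, multiplier = 20, 100
position_delta = 0.50 * contracts * multiplier  # 1,000 deltas
position_gamma = 0.12 * contracts * multiplier  # 240 deltas per $1

for move in (1.0, 2.0):
    total, d_pnl, g_pnl = option_position_pnl(position_delta, position_gamma, move)
    print(f"${move:.0f} rally: total {total:,.0f} = delta {d_pnl:,.0f} + gamma {g_pnl:,.0f}")
```

Doubling the move from $1 to $2 doubles the delta p/l but quadruples the gamma p/l, which is the squared term at work.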

Generalizing Gamma: Dollar Gamma

In investing, we normally don’t speak about our delta or equivalent share position. If I own 1,000 shares of a $500 stock that is very different than 1,000 shares of a $20 stock. Instead, we speak about dollar notional. Those would be $500,000 vs $20,000 respectively. Dollar notional or gross exposures are common ways to denote position size. Option and derivative traders do the same thing. Instead of just referring to their delta or share equivalent position, they refer to their “dollar delta”. It’s identical to dollar notional, but preserves the “delta” vocabulary.

It is natural to compute a “delta 1%” which describes our p/l per 1% move in the underlying.

For the BP example:

  • Initial dollar delta = delta x stock price = 1,000 x $28.35 = $28,350 dollar deltas
  • Δ1% = $28,350/100 = $283.50

    You earn $283.50 for every 1% BP goes up.

Gamma has analogous concepts. Thus far we have defined gamma in the way option models define it — change in delta per $1 move. We want to generalize gamma calculations to also deal in percentages. Let’s derive dollar gamma continuing with the BP example.

  1. Gamma 1%

    Gamma per $1 = 240

    Of course, a $1 move in BP is over 3.5% ($1/$28.35). To scale this to “gamma per 1%” we multiply the gamma by 28.35/100 which is intuitive.

    Gamma 1% = 240 * .2835 = 68.04

    So for a 1% increase in BP, your delta gets longer by 68.04 shares.

  2. Dollar gamma

    Converting gamma 1% to dollar gamma is simple. Just multiply by the share price (S).

    $Gamma = Gamma 1% x S

    By substituting for gamma 1% from the above step, we arrive at the classic dollar gamma formula:

    $Gamma = Γ x S² / 100

Let’s use BP numbers.

$Gamma = 240 * 28.35² / 100 = $1,929

The interpretation:

A 1% rally in BP, leads to an increase of 1,929 notional dollars of BP due to gamma. 

Instead of speaking of how much our delta (equivalent share position) changes, you can multiply dollar gamma by percent changes to compute changes in our dollar delta.
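The conversions in this section fit in a few lines of Python. This is just a sketch with hypothetical helper names, plugged with the BP numbers from the text.

```python
def dollar_delta(delta, spot):
    """Share-equivalent position expressed as dollar notional."""
    return delta * spot

def delta_1pct(delta, spot):
    """P/l for a 1% move in the underlying."""
    return dollar_delta(delta, spot) / 100

def gamma_1pct(gamma, spot):
    """Change in share delta for a 1% move (gamma is per $1)."""
    return gamma * spot / 100

def dollar_gamma(gamma, spot):
    """Change in dollar delta for a 1% move: gamma x spot^2 / 100."""
    return gamma_1pct(gamma, spot) * spot

spot = 28.35
position_delta = 1000  # deltas (shares)
position_gamma = 240   # deltas per $1

print(f"dollar delta: ${dollar_delta(position_delta, spot):,.2f}")
print(f"delta 1%:     ${delta_1pct(position_delta, spot):,.2f}")
print(f"gamma 1%:     {gamma_1pct(position_gamma, spot):.2f} shares")
print(f"dollar gamma: ${dollar_gamma(position_gamma, spot):,.2f}")
```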

Generalizing Gamma P/L For Percent Changes

In this section, we will estimate gamma p/l for percent changes instead of $1 changes. Let’s look at 2 ways.

The Accelerating Car Method

The logic flows as follows (again, using the BP example):

  • If a 1% rally leads to an increase of $1,929 of BP exposure then, assuming gamma is smooth, a 3.5% rally (or $1) will lead to an increase of $6,751 of BP length because 3.5%/1% * $1,929
  • Therefore the average length over the move is $3,375 (ie .5 * $6,751) due to gamma
  • $3,375 * 3.5% = $118 (This is very close to the $120 estimate we computed with the original gamma p/l formula; the small gap comes from rounding the move to 3.5%. This makes sense since we followed the same logic…multiply the average position size due to gamma times the move size.)

The Algebraic Method

We can adapt the original gamma p/l formula for percent changes.

We start with a simple identity. To turn a price change into a percent we simply divide by the stock price. If a $50 stock increased $1 it increased 2%:

%△S = △S / S

If we substitute the percent change in the stock for the change in the stock we must balance the identity by multiplying by S²:

Gamma p/l = ½ x Γ x S² x (%△S)²

We can double-check that this works with our BP example. Recall that the initial stock price is $28.35, so a $1 move is a 3.527% move:

Gamma p/l = ½ x 240 x 28.35² x (.03527)² ≈ $120

This also checks out with the gamma p/l we computed earlier.
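Both routes can be checked with a short Python sketch. The function names are made up for illustration, and percent moves are passed as decimals (.035 for 3.5%).

```python
def gamma_pnl_car_method(gamma, spot, pct_move):
    """Average extra dollar length due to gamma, times the percent move."""
    dollar_gamma = gamma * spot ** 2 / 100             # change in dollar delta per 1% move
    extra_length = dollar_gamma * (pct_move * 100)     # dollar delta gained over the whole move
    average_extra_length = extra_length / 2            # assumes gamma is smooth over the move
    return average_extra_length * pct_move

def gamma_pnl_algebraic(gamma, spot, pct_move):
    """Gamma p/l = 1/2 x gamma x spot^2 x (percent move)^2."""
    return 0.5 * gamma * spot ** 2 * pct_move ** 2

gamma, spot = 240, 28.35
exact_pct = 1.0 / spot  # a $1 move, roughly 3.527%

print(f"car method, exact $1 move:  {gamma_pnl_car_method(gamma, spot, exact_pct):.2f}")
print(f"algebraic,  exact $1 move:  {gamma_pnl_algebraic(gamma, spot, exact_pct):.2f}")
print(f"algebraic,  rounded 3.5%:   {gamma_pnl_algebraic(gamma, spot, 0.035):.2f}")
```

With the unrounded percent move both methods land on the same $120; rounding the move to 3.5% is what produces the $118 figure.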


Constant Gamma 

In all the explanations, we assume gamma is smooth or constant over a range of prices. This is not true in practice. Option gammas peak near the ATM strike. Gamma falls to zero as the option goes deep ITM or deep OTM. When you manage an option book, you can sum your positive or negative gammas across all your inventory to arrive at a cumulative gamma. The gamma of your net position falls as you move away from your longs and can flip negative as you approach shorts. This means gamma p/l estimates are rarely correct, because gamma calculations themselves are instantaneous. As soon as the stock moves, time passes, or vols change your gamma is growing or shrinking.

This is one of the most underappreciated aspects of vol trading for novices. Vanilla options, despite being called vanilla, are diabolical because of path dependence. If you buy a straddle for 24% vol and vol realizes 30% there’s no guarantee you make money. If the stock makes large moves with a lot of time to expiration or when the straddle is not ATM then those moves will get multiplied by relatively low amounts of dollar gamma. If the underlying grinds to a halt as you approach expiration, especially if it’s near your long strikes, you will erode quickly with little hope of scalping your deltas.

Skew and the correlation of realized vol with spot introduce distributional effects to vol trading and may give clues to the nature of path dependence. As a trader gains more experience, they move from thinking simply in terms of comparing implied to realized vol to trying to understand what the flows tell us about the path and distribution. The wisdom that emerges after years of trading a dynamically hedged book is that the bulk of your position-based p/l (as opposed to trading or market-making) will come from a simple observation: were you short options where the stock expired and long where it didn’t?

That’s why “it’ll never get there” is not a reason to sell options. If you hold deltas against positions, you often want to own the options where the stock ain’t going and vice versa. This starts to really sink in around year 10 of options trading.

Strikeless Products

The path-dependent nature of vanilla options makes speculating on realized vol frustrating. Variance swaps are the most popular form of “strikeless” derivatives that have emerged to give investors a way to bet on realized vols without worrying about path dependence. Conceptually, they are straightforward. If you buy a 1-year variance swap implying 22% vol, then any day that the realized vol exceeds 22% you accrue profits and vice versa (sort of). The details are not important for our purpose, but we can use what we learned about gamma to appreciate their construction.

A derivative market cannot typically thrive if there is no replicating portfolio of vanilla products that the manufacturer of the derivative can acquire to hedge its risk. So if variance swaps exist, it must mean there is a replicating portfolio that gives a user a pure exposure to realized vol. The key insight is that the product must maintain a fairly constant gamma over a wide range of strikes to deliver that exposure. Let’s look at the dollar gamma formula once again.

$Gamma = Γ x S² / 100

We can see that gamma is proportional to the square of the stock price. While the gamma of an option depends on volatility and time to expiration, the higher the strike price the higher the “peak gamma”. Variance swaps weight a strip of options across a wide range of strikes in an attempt to maintain a smooth exposure to realized variance. Because higher strike options have a larger “peak” gamma, a common way to replicate the variance swap is to overweight lower strikes to compensate for their smaller peak gammas. The following demonstrates the smoothness of the gamma profile under different weighting schemes.

Examples of weightings:

Constant = The replicating strip holds the same number of 50 strike and 100 strike options

1/K = The replicating strip holds 2x as many 50 strike options vs 100 strike

1/K² = The replicating strip holds 4x as many 50 strike options vs 100 strike

Note that the common 1/K² weighting means variance swap pricing is highly sensitive to skew since the hedger’s portfolio weights downside puts so heavily. This is also why the strike of variance swaps can be much higher than the ATM vol of the same term. It reflects the cost of having constant gamma even as the market sells off. That is expensive because it requires owning beefy, sought-after low delta puts.
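To see why 1/K² is the weighting that flattens the profile, here is a rough Python sketch. It is not how a dealer would price a variance swap: it assumes plain Black-Scholes gamma with a flat 30% vol (no skew), zero rates, three months to expiry, and a hypothetical strip of $1-wide strikes from 10 to 400. It compares the strip’s total dollar gamma with the stock at 120 versus 80.

```python
import math

def bs_gamma(spot, strike, vol, t):
    """Black-Scholes gamma per share with zero rates and dividends (an assumption)."""
    d1 = (math.log(spot / strike) + 0.5 * vol ** 2 * t) / (vol * math.sqrt(t))
    pdf = math.exp(-0.5 * d1 ** 2) / math.sqrt(2 * math.pi)
    return pdf / (spot * vol * math.sqrt(t))

# Quantity held at each strike K under the three weighting schemes.
WEIGHTS = {
    "constant": lambda k: 1.0,
    "1/K": lambda k: 1.0 / k,
    "1/K^2": lambda k: 1.0 / k ** 2,
}

def strip_dollar_gamma(spot, weight, strikes, vol=0.30, t=0.25):
    """Dollar gamma of the whole strip: sum of weighted gammas x spot^2 / 100."""
    total_gamma = sum(weight(k) * bs_gamma(spot, k, vol, t) for k in strikes)
    return total_gamma * spot ** 2 / 100

strikes = range(10, 401)  # hypothetical strip, wide enough that the edges do not matter
ratios = {
    name: strip_dollar_gamma(120, w, strikes) / strip_dollar_gamma(80, w, strikes)
    for name, w in WEIGHTS.items()
}
for name, ratio in ratios.items():
    print(f"{name:>8}: dollar gamma at 120 is {ratio:.2f}x the dollar gamma at 80")
```

Under these assumptions the constant strip’s dollar gamma grows with the square of the spot move (roughly 2.25x), the 1/K strip grows linearly (roughly 1.5x), and the 1/K² strip stays roughly flat.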

How Gamma Scales With Price, Volatility, and Time

Having an intuition for how gamma scales is useful when projecting how your portfolio will behave as market conditions or parameters change. A great way to get a feel for this is to tinker with an option calculator.  To demonstrate the effects of time, vol, and price, we hold 2 of the 3 constant and vary the 3rd.

Assume the strike is ATM for each example.

Here are a few rules of thumb for how price, vol, and time affect gamma.

  • If one ATM price is 1/2 the other, the lower price will also have 1/2 the dollar gamma. Linear effect as the higher gamma per option is offset by the dollar gamma’s extra weight to higher-priced stocks.
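  • If one ATM volatility is 1/2 the other, the lower vol will have roughly 2x the dollar gamma. Dollar gamma is inversely proportional to vol (ie 1/vol ratio).
  • If an option has 1/2 as much time until expiration, it will have roughly √2 (about 1.4x) as much gamma. Gamma scales with the square root of the inverse time ratio.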

Stated differently:

  • Spot prices have linearly proportional differences in gamma. The lower price has less dollar gamma.
  • Volatility has inverse proportionality in gamma. The higher vol has less dollar gamma.
  • Time is inversely proportional to gamma with a square root scaling. More time means less dollar gamma.
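If you don’t have an option calculator handy, the rules of thumb can be sanity-checked with a few lines of Python. This sketch assumes Black-Scholes with zero rates and treats “ATM” as strike equal to spot; the helper name is hypothetical.

```python
import math

def atm_dollar_gamma(spot, vol, t):
    """Dollar gamma per share of an ATM option (strike = spot, zero rates assumed)."""
    d1 = 0.5 * vol * math.sqrt(t)  # log(spot/strike) is zero when strike = spot
    pdf = math.exp(-0.5 * d1 ** 2) / math.sqrt(2 * math.pi)
    gamma = pdf / (spot * vol * math.sqrt(t))
    return gamma * spot ** 2 / 100

base = atm_dollar_gamma(spot=100, vol=0.40, t=0.25)

price_ratio = atm_dollar_gamma(spot=50, vol=0.40, t=0.25) / base   # half the price
vol_ratio = atm_dollar_gamma(spot=100, vol=0.20, t=0.25) / base    # half the vol
time_ratio = atm_dollar_gamma(spot=100, vol=0.40, t=0.125) / base  # half the time

print(f"half the price -> {price_ratio:.2f}x the dollar gamma")
print(f"half the vol   -> {vol_ratio:.2f}x the dollar gamma")
print(f"half the time  -> {time_ratio:.2f}x the dollar gamma")
```

The price relationship is exact in this setup, while the vol and time ratios come out close to 2 and √2 rather than exactly on them.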

Gamma-weighting Relative Value Trades

As you weight relative value trades these heuristics are handy (it’s also the type of thing interviewers test your intuition for).

Some considerations that pop out if you choose to run a gamma-neutral book:

  • Time spreads are tricky. You need to overweight the deferred months and since vega is positively proportional to root time, you will have large net vega exposures if you try to trade term structure gamma-neutral.
  • Stocks with different vols. You need to overweight the higher vol stocks to be gamma-neutral, but their higher volatility comes with higher theta. Your gamma-neutral position will have an unbalanced theta profile. This will be the case for inter-asset vol spreads but also intra-asset. Think of risk reversals that have large vol differences between the strikes.
  • Overweighting lower-priced stocks to maintain gamma neutrality does not tend to create large imbalances because spot prices are positively proportional to other greeks (higher spot –> higher vega, higher dollar gamma, higher theta all else equal).
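The time spread point can be made concrete with a quick Python sketch. It assumes Black-Scholes with zero rates, the same 30% vol in both months, and ATM strikes; the 3-month and 1-year expiries and the helper name are hypothetical choices for illustration.

```python
import math

def atm_gamma_and_vega(spot, vol, t):
    """ATM (strike = spot) gamma and vega per share, zero rates assumed.

    Vega is quoted per 1 vol point, hence the division by 100.
    """
    d1 = 0.5 * vol * math.sqrt(t)
    pdf = math.exp(-0.5 * d1 ** 2) / math.sqrt(2 * math.pi)
    gamma = pdf / (spot * vol * math.sqrt(t))
    vega = spot * pdf * math.sqrt(t) / 100
    return gamma, vega

spot, vol = 100.0, 0.30
near_gamma, near_vega = atm_gamma_and_vega(spot, vol, 0.25)  # 3-month option
far_gamma, far_vega = atm_gamma_and_vega(spot, vol, 1.00)    # 1-year option

# Short 1 near-dated option, long enough deferred options to flatten gamma.
far_per_near = near_gamma / far_gamma
net_vega = far_per_near * far_vega - near_vega

print(f"deferred options needed per near option: {far_per_near:.2f}")
print(f"net vega in units of one near option's vega: {net_vega / near_vega:.2f}")
```

With 4x the time, you need about 2 deferred options per near-dated option to be gamma-neutral, and you end up long about 3 near-month options’ worth of vega.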

Weighting trades can be a difficult topic. How you weight trades really depends on what convergence you are betting on. If you believe vols move as a fixed spread against each other then you can vega-weight trades. If you believe vols move relative to each other (ie fixed ratio — all vols double together) then you’d prefer theta weighting.

I’ve summarized some of Colin Bennett’s discussion on weighting here. The context is dispersion, but the intuitions hold.

Finally, this is an example of a top-of-funnel tool to spot interesting surfaces. The notations on it tie in nicely with the topic of weightings. The data is ancient and besides the point.

Wrapping Up

Gamma is the first higher-order greek people are exposed to. Like most of my posts, I try to approach it intuitively. I have always felt velocity and acceleration are the easiest bridges to understanding p/l curvature. While the first half of the post is intended for a broad audience, the second half is likely too advanced for novices and too rudimentary for veterans. If it helps novices who are trying to break into the professional world, I’ll consider that a win. I should add that in Finding Vol Convexity I apply the concept of convexity to implied volatility. You can think of that as the “gamma of vega”. In other words, how does an option’s vega change as volatility changes?

I realize I wrote that post which is more advanced than this one in the wrong order. Shrug.

Honest Mirrors

I liked this post by Morgan Housel:

✍️How People Think (29 min read)

He explains:

This article describes 17 of what I think are the most common and influential aspects of how people think.

It’s a long post, but each point can be read individually. Skip the ones you don’t agree with and reread the ones you do – that itself is a common way people think.

My obsessive need to consolidate and refactor required transposing his list to this one (I re-titled them all for compression):

1) Tribalism

2) We only see the tip of icebergs

3) All probability gets represented as yes or no

4 and 5) We expect trees to grow to the sky (which leads us to overreaction)

6) We are surprised when geniuses disappoint us

7) Unhealthy competition makes us short-sighted. The antidote is extending the horizon to create space.

8) Stories FTW

9) Complexity sells

10) Motivated reasoning is the rule

11) Experience is the raw material for empathy

12) Heisenberg makes us poor self-evaluators. Seek others’ input

13 and 16) Innumerate about extremes [compounding & inevitability of rare occurrences ie the birthday problem]

14) Simple but not easy

15) When imagining change we fail to consider the full context [for example when you are younger you imagine being older as your current life with grey hair but you don’t consider the mental and emotional evolution that comes with aging]

17) Idealism is seductive but counterproductive often leading to isolated demands for rigor

I want to zoom in on #7. It’s our child-eating friend Moloch again. I covered him several times this year:

Recall how Moloch symbolizes the tendency to overoptimize on a single value to the detriment of all others, swallowing everyone in its unhealthy path.

I tend to be pretty laid back in general. For better or worse, I tend to be a satisficer rather than a maximizer. A charitable interpretation of that trait (there are non-charitable ones too but I’m the host of this here party at the Moontower) is I appreciate ergodicity. See Luca Dellanna’s What Is Ergodicity? for a quick explanation of that word.

But it turns out, humans likely grok the idea in their DNA. In fact, this appreciation forms the basis of pushback against some of the cognitive bias research, especially loss aversion. Contrived behavioral economics experiments assume agents maximize single-trial expected value instead of median expectancy. What behavioral economists label as design flaws are more of Chesterton’s Fence to protect you from self-destruction in the name of maximization. The expected value of saved seconds from jayrunning across the street might be positive. But you only need one ill-timed fall to negate the sum of those optimized moments.

So when I say that “slack” is the answer to Moloch, it has nothing to do with being lazy. It’s appreciating that any one trial is just a single draw in a repeated strategy and the merits of the strategy cannot be graded on isolated outcomes.

Since we are on the topic of behavioral economics, there is another common knock against cognitive bias research.

Via Notes From Todd Simkin On The Knowledge Project:

Shane points out a paradox in cognitive science. Knowing our biases doesn’t seem to help us overcome them.

Todd concurs:

It is definitely true that it is sort of descriptive of the past. A lot of these heuristics and biases are things that we can see when we after we’ve already identified that a mistake has been made. And we say, Okay, well, why was the mistake made? Say, oh, because I was anchored, or because of the way the question was framed, or whatever it might be, we have a really hard time seeing it in ourselves.

But we know the cure for this. I wrote:

This is a topic the brilliant Ced Chin has studied in depth. Ced told me that the literature suggests the only way cognitive bias inoculation works is via group reinforcement. I told him that was exactly the cultural DNA when I was at SIG which makes me believe there is a lot of value in being aware of bias. Anytime you replayed your decision process, it was a cultural norm to point out where in the process you were prone to bias.

Todd reinforces Ced’s conclusions:

We have a really easy time seeing when someone else is making that type of stupid mistake. A big part of our approach to education is to teach people to talk through their decisions, and to end to talk about why they’re doing what they’re doing with their peers, the other people on their team. If we can do that real-time, that’s great. Often in trading, you don’t have that opportunity, because things are just too immediate. But certainly, anytime things have changed. If you’re doing things differently, it’s a really good time to turn to the traders around you. And the quantitative researchers around you and the assistant traders and your team and say, Hmm, it looks like all the sudden Gamestop is a whole lot more volatile than it was a week ago. Here’s how I’m positioning for this trading. What do you guys think? And have someone say, oh, it seems like you’re really anchored to last week’s volatility. If things have changed that much, you need to move much more quickly than you’re moving right now. So you don’t realize that you’re anchored, that’s the whole nature of being anchored, is that you don’t recognize the outsized importance that the anchor has on your decision, but somebody else who’s a little bit more distant from it can. So if we’re good at encouraging communication, then we’re going to be really good at getting other people to help improve your decision process.

I add:

There it is. The key — communication. It’s not some magic formula. Even after I left SIG I spent my whole career working with SIG alum. This culture and these types of communications happen all day on the desk. Despite the common perceptions of “trading”, I have always found it to be a team game and communication skills are paramount.

Todd expands:

I know that you are fond of pointing out that you are the sum of the five people that you spend the most time with. So if the people that you’re spending the most time with are your co-workers who are thinking about trading the same way you are, then maybe you’re going to combine the same types of errors, it’s certainly better than then trying to act on your own. But even better is if you have a culture that rewards truth-finding, as opposed to rewarding action. If nobody feels personally attacked, because of somebody else pointing out their error, but instead feels like we together have now done more to get closer to, to some truth to the better way to act or the you know, the more accurate, fair value of this asset that we’re trading, then everybody feels like it’s a win. And they will therefore encourage the involvement of the people around them.

If you work in a Molochian, credit-stealing environment you face a prisoner’s dilemma as to whether you even want to correct others’ biases. (I suspect this gets worse as the fiefdoms that emerge in large hierarchies rot the spirit from the inside). Teamwork and its antecedent, alignment, are devilishly hard, but critical because they hold the key to improving decisions.

When Shane asks what the most important variables are for being a better decision-maker, he expects Todd might say “probabilistic thinking”. But Todd did not hesitate with his answer:

Talk more is number one, that beats probabilistic thinking. That beats sort of anything else. Truth-finding is being able to bring in other people in the decision process in a constructive way. So finding good ways to communicate, to improve the input from others. Thinking probabilistically I think is definitely a very, very important piece of trying to diagnose what works by trying to think of where where things fall apart, where people fail. The other place that people fail is falling in love with their decision process and not being open to being wrong. So an openness to feedback to finding disconfirming information to actively seeking out disconfirming information, which is really uncomfortable. But that I think is the other piece that is super important for being a good trader.

If I were to try to be a prop trader from my pajamas, I’d form a Discord channel of sharp, open-minded, truth-seeking, humble, teachable teammates before I even opened a brokerage account.

Trading is not a single-player game.

You need honest mirrors. Not the ones you find in fancy dressing rooms.

Stay groovy squad

Can Your Manager Solve Betting Games With Known Solutions?

One of the best threads I’ve seen in a while. It’s important because it shows how betting strategies vary based on your goals.

In the basic version, the “Devil’s Card Game” is constrained by the rule that you must bet your entire stack each time.

You can maximize:

  1. expectation
  2. utility (in the real world Kelly sizing is the instance of this when utility follows a log function)
  3. the chance of a particular outcome.

At the end of the thread, we relax the bet sizing rules and allow the player to bet any fraction of the bankroll they’d like. This is a key change.

It leads to a very interesting strategy called backward induction. In markets, the payoffs are not well-defined. But this game features a memory because it is a card game without replacement. Like blackjack. You can count the possibilities.

The thread shows how the backward induction strategy blows every other strategy out of the water.

If we generalize this, you come upon a provocative and possibly jarring insight:

The range of expectations simply based on betting strategies is extremely wide.

That means a good proposition can be ruined by an incompetent bettor. Likewise, a poor proposition can be somewhat salvaged by astute betting.

I leave you with musings.

  1. Is it better to pair a skilled gambler with a solid analyst or the best analyst with a mid-brow portfolio manager?
  2. How confident are you that the people who manage your money would pick the right betting strategy for a game with a known solution? Maybe allocators and portfolio managers should have to take gambling tests. If analytic superiority is a source of edge, the lack of it is not simply an absence of one type of edge. It’s actually damning because it nullifies any other edge over enough trials assuming markets are competitive (last I checked that was their defining feature).

Finance Guilt


I’ve said several times that finance is really just code. Like software, it’s an abstraction skin pulled over physical features. One can feel a bit disembodied if their formulation of the world for 8-12 hours a day is prices. Prices that collapse all of human enterprise, from the dirt under its fingernails to the sunrises and sunsets between now and some expiration date, into some Excel number format.

Just as software intermediates for less, financial innovation lowers the cost of go-betweens. In finance, the things went-between are people paying to offload risk to people looking to get paid for warehousing risk. In software and finance, skimming a tiny bit of rent on those transactions is lucrative.

How good or bad we can feel about the degree of skimming depends on how much surplus is created versus the higher friction model. The value of information liquidity is fairly obvious so Google enjoyed a positive reputation for at least its first decade in business. Meanwhile, finance feels like a constant barrage of “what did Wells Fargo do now?” or words that rhyme with Fonzi. People outside finance can be excused for having a dim, albeit biased, view of the profession since nobody reports on people doing an honest job.

With that in mind, I leave you with Mitchell’s understandable question:

Here’s my quick response:

Agustin’s response:

I’ll wrap with a footnote from a recent post:

The slicing and dicing of risk is finance’s salutary arrow of progress. Real economic growth is human progress in its battle against entropy. By farming, we can specialize. By pooling risk, we can underwrite giant human endeavors with the risk spread out tolerably. People might not sink the bulk of their net worth into a home if it wasn’t insurable. Financial innovation is matching a hedger with the most efficient holder of the risk. It’s matching risk-takers who need capital, with savers who are willing to earn a risk premium. Finance gets a bad rap for being a large part of the economy, and there are many headlines that enflame that view. I, myself, have a dim view of many financial practices. I have likened asset management to the vitamin industry — it sells noise as signal. But the story of finance broadly goes hand in hand with human progress. It might not be “God’s work” as Goldman’s boss once cringe-blurted, but its most extreme detractors as well as the legions of “I wish I was doing something more meaningful with my life” soldiers are discounting the value of its function which is buried in abstraction. Finance is code, so if software is eating the world, financialization is its dinner date.

If You Make Money Every Day, You’re Not Maximizing


This is an expression I heard early in my trading days. In this post, we will use arithmetic to show what it means in a trading context, specifically the concept of hedging.

I didn’t come to fully appreciate its meaning until about 5 years into my career. Let’s start with a story. It’s not critical to the technical discussion, so if you are a robot feel free to beep boop ahead.

The Belly Of The Trading Beast

Way back in 2004, I spent time on the NYSE as a specialist in about 20 ETFs. A mix of iShares and a relatively new name called FEZ, the Eurostoxx 50 ETF. I remember the spreadsheet and pricing model to estimate a real-time NAV for that thing, especially once Europe was closed, was a beast. I also happened to have an amazing trading assistant that understood the pricing and trading strategy for all the ETFs assigned to our post. By then, I had spent nearly 18 months on the NYSE and wanted to get back into options where I started.

I took a chance.

I let my manager who ran the NYSE floor for SIG know that I thought my assistant should be promoted to trader. Since I was the only ETF post on the NYSE for SIG, I was sort of risking my job. But my assistant was great and hadn’t come up through the formal “get-hired-out-of-college-spend-3-months-in-Bala” bootcamp track. SIG was a bit of a caste system that way. It was possible to cross over from external hire to the hallowed trader track, but it was hard. My assistant deserved a chance and I could at least advocate for the promotion.

This would leave me in purgatory. But only briefly. Managers talk. Another manager heard from my current manager that I was looking for a fresh opportunity. He asked me if I wanted to co-start a new initiative. We were going to the NYMEX to trade futures options. SIG had tried and failed to break into those markets twice previously but could not gain traction. The expectations were low. “Go over there, try not to lose too much money, and see what we can learn. We’ll still pay you what you would have expected on the NYSE”.

This was a lay-up. A low-risk opportunity to start a business and learn a new market. And get back to options trading. We grabbed a couple clerks, passed our membership exams, and took inventory of our new surroundings.

This was a different world. Unlike the AMEX, which was a specialist system, the NYMEX was open outcry. Traders here were more aggressive and dare I say a bit more blue-collar (appearances were a bit deceiving to my 26-year-old eyes, there was a wide range of diversity hiding behind those badges and trading smocks. Trading floors are a microcosm of society. So many backstories. Soft-spoken geniuses were shoulder-to-shoulder with MMA fighters, ex-pro athletes, literal gangsters or gunrunners, kids with rich daddies, kids without daddies). We could see how breaking in was going to be a challenge. These markets were still not electronic. Half the pit was still using paper trading sheets. You’d hedge deltas by hand-signaling buys and sells to the giant futures ring where the “point” clerk taking your order was also taking orders from the competitors standing next to you. He’s been having beers with these other guys for years. Gee, I wonder where my order is gonna stand in the queue?

I could see this was going to be about a lot more than option math. This place was 10 years behind the AMEX’s equity option pits. But our timing was fortuitous. The commodity “super-cycle” was still just beginning. Within months, the futures would migrate to Globex leveling the field. Volumes were growing and we adopted a solid option software from a former market-maker in its early years (it was so early I remember helping their founder correct the weighted gamma calculation when I noticed my p/l attribution didn’t line up to my alleged Greeks).

We split the duties. I would build the oil options business and my co-founder who was more senior would tackle natural gas options (the reason I ever got into natural gas was because my non-compete precluded me from trading oil after I left SIG). Futures options have significant differences from equity options. For starters, every month has its own underlyers, breaking the arbitrage relationships in calendar spreads you learn in basic training. During the first few months of trading oil options, I took small risks, allowing myself time to translate familiar concepts to this new universe. After 6 months, my business had roughly broken even and my partner was doing well in gas options. More importantly, we were breaking into the markets and getting recognition on trades.

[More on recognition: if a broker offers 500 contracts, and 50 people yell “buy em”, the broker divvies up the contracts as they see fit. Perhaps his bestie gets 100 and the remaining 400 get filled according to some mix of favoritism and fairness. If the “new guy” was fast and loud in a difficult-to-ignore way, there is a measure of group-enforced justice that ensures they will get allocations. As you make friends and build trust by not flaking on trades and taking your share of losers, you find honorable mates with clout who advocate for you. Slowly your status builds, recognition improves, and the system mostly self-regulates.]

More comfortable with my new surroundings, I started snooping around. Adjacent to the oil options pit was a quirky little ring for product options — heating oil and gasoline. There was an extremely colorful cast of characters in this quieter corner of the floor. I looked up the volumes for these products and saw they were tiny compared to the oil options but they were correlated (gasoline and heating oil or diesel are of course refined from crude oil. The demand for oil is mostly derivative of the demand for its refined products. Heating oil was also a proxy for jet fuel and bunker oil even though those markets also specifically exist in the OTC markets). If I learned anything from clerking in the BTK index options pit on the Amex, it’s that sleepy pits keep a low-profile for a reason.

I decided it was worth a closer look. We brought a younger options trader from the AMEX to take my spot in crude oil options (this person ended up becoming a brother and business partner for my whole career. I repeatedly say people are everything. He’s one of the reasons why). As I helped him get up to speed on the NYMEX, I myself was getting schooled in the product options. This was an opaque market, with strange vol surface behavior, flows and seasonality. The traders were cagey and clever. When brokers who normally didn’t have business in the product options would catch the occasional gasoline order and have to approach this pit, you could see the look in their eyes. “Please take it easy on me”.

My instincts turned out correct. There was edge in this pit. It was a bit of a Rubik’s cube, complicated by the capital structure of the players. There were several tiny “locals” and a couple of whales who to my utter shock were trading their own money. One of the guys, a cult legend from the floor, would not shy away from 7 figure theta bills. Standing next to these guys every day, absorbing the lessons in their banter, and eventually becoming their friends (one of them was my first backer when I left SIG) was a humbling education that complemented my training and experience. It illuminated approaches that would have been harder to access in the monoculture I was in (this is no shade on SIG in any way, they are THE model for how to turn people into traders, but markets offer many lessons and nobody has a monopoly on how to think).

As my understanding and confidence grew, I started to trade bigger. Within 18 months, I was running the second-largest book in the pit, a distant second to the legend, but my quotes carried significant weight in that corner of the business. The oil market was now rocking. WTI was on its way to $100/barrel for the first time, and I was seeing significant dislocations in the vol markets between oil and products1. This is where this long-winded story re-connects with the theme of this post.

How much should I hedge? We were stacking significant edge and I wanted to add as much as I could to the position. I noticed that the less capitalized players in the pit were happy to scalp their healthy profits and go home relatively flat. I was more brash back then and felt they were too short-sighted. They’d buy something I thought was worth $1.00 for $.50 and be happy to sell it out for $.70. In my language, that’s making 50 cents on a trade, to lose 30 cents on your next trade. The fact that you locked in 20 cents is irrelevant.

You need to be a pig when there’s edge because trading returns are not uniform. You can spend months breaking even, but when the sun shines you must make as much hay as possible. You don’t sleep. There’s plenty of time for that when things slow down. They always do. New competitors will show up and the current time will be referred to as “the good ole’ days”. Sure enough, that is the nature of trading. The trades people do today are done for 1/20th the edge we used to get.

I started actively trading against the pit to take them out of their risk. I was willing to sacrifice edge per trade, to take on more size (I was also playing a different game than the big guy who was more focused on the fundamentals of the gasoline market, so our strategies were not running into one another. In fact, we were able to learn from each other). The other guys in the pit were hardly meek or dumb. They simply had different risk tolerances because of how they were self-funded and self-insured. My worst case was losing my job, and that wasn’t even on the table. I was transparent and communicative about the trades I was doing. I asked for a quant to double-check what I was seeing.

This period was a visceral experience of what we learned about edge and risk management. It was the first time my emotions were interrupted. I wanted assurance that the way I was thinking about risk and hedging was correct so I could have the fortitude to do what I intellectually thought was the right play.

This post is a discussion of hedging and risk management.

Let’s begin.

What Is Hedging?

Investopedia defines a hedge:

A hedge is an investment that is made with the intention of reducing the risk of adverse price movements in an asset. Normally, a hedge consists of taking an offsetting or opposite position in a related security.

The first time I heard about “hedging”, I was seriously confused. Like if you wanted to reduce the risk of your position, why did you have it in the first place? Couldn’t you just reduce the risk by owning less of whatever was in your portfolio? The answer lies in relativity. Whenever you take a position in a security you are placing a bet. Actually, you’re making an ensemble of bets. If you buy a giant corporation like XOM, you are also making oblique bets on GDP, the price of oil, interest rates, management skill, politics, transportation, the list goes on. Hedging allows you to fine-tune your bets by offsetting the exposures you don’t have a view on. If your view was strictly on the price of oil you could trade futures or USO instead. If your view had nothing to do with the price of oil, but something highly idiosyncratic about XOM, you could even short oil against the stock position.

Options are popular instruments for implementing hedges. But even when used to speculate, this is an instance of hedging bundled with a wager. The beauty of options is how they allow you to make extremely narrow bets about timing, the size of possible moves, and the shape of a distribution. A stock price is a blunt summary of a proposition, collapsing the expected value of changing distributions into a single number. A boring utility stock might trade for $100. Now imagine a biotech stock that is 90% to be worth 0 and 10% to be worth $1000. Both of these stocks will trade for $100, but the option prices will be vastly different 2.

If you have a differentiated opinion about a catalyst, the most efficient way to express it will be through options. They have the most urgent function to a reaction. If you think a $100 stock can move $10, but the straddle implies $5 you can make 100% on your money in a short window of time. Annualize that! Go a step further. Suppose you have an even finer view — you can handicap the direction. Now you can score a 5 or 10 bagger allocating the same capital to call options only. Conversely, if you do not have a specific view, then options can be an expensive, low-resolution solution. You pay for specificity just like parlay bets. The timing and distance of a stock’s move must collaborate to pay you off.

So options, whether used explicitly for hedging or for speculating actually conform to a more over-arching definition of hedging — hedges are trades that isolate the investor’s risk.

The Hedging Paradox

If your trades have specific views or reasons, hedging is a good idea. Just like home insurance is a good idea. Whether you are conscious of it or not, owning a home is a bundle of bets. Your home’s value depends on interest rates, the local job market, and state policy. It also depends on some pretty specific events. For example, “not having a flood”. Insurance is a specific hedge for a specific risk. In The Laws Of Trading, author and trader Agustin Lebron states rule #3:

Take the risks you are paid to take. Hedge the others.

He’s reminding you to isolate your bets so they map as closely as possible to your original reason for wanting the exposure.

You should be feeling tense right about now. “Dude, I’m not a robot with a Terminator HUD displaying every risk in my life and how hedged it is?”.

Relax. Even if you were, you couldn’t do anything about it. Even if you had the computational wherewithal to identify every unintended risk, it would be too expensive to mitigate3. Who’s going to underwrite the sun not coming up tomorrow? [Actually, come to think of it, I will. If you want to buy galactic continuity insurance ping me and I’ll send you a BTC address].

We find ourselves torn:

  1. We want to hedge the risks we are not paid to take.
  2. Hedging is a cost

What do we do?

Before getting into this I will mention something a certain, beloved group of wonky readers are thinking: “Kris, just because insurance/hedging on its own is worth less than its actuarial value, the diversification can still be accretive at the portfolio level especially if we focus on geometric not arithmetic returns…rebalancing…convexi-…”[trails off as the sound of the podcast in the background drowns out the thought]. Guys (it’s definitely guys), I know. I’m talking net of all that.

As the droplets of caveat settle the room like nerd Febreze, let’s see if we can give this conundrum a shape.

Reconciling The Paradox

This is a cornerstone of trading:

Edge scales linearly, risk scales slower

[As a pedagogical matter, I’m being a bit brusque. Bear with me. The principle and its demonstration are powerful, even if the details fork in practice.]

Let’s start with coin flips:

[A] You flip a coin 10 times, you expect 5 heads with a standard deviation of 1.58 [footnote 4].

[B] You flip a coin 100 times, you expect 50 heads with a standard deviation of 5.

Your expectancy scaled with N. 10x more flips, 10x more expected heads.

But your standard deviation (ie volatility) only grew by √10 or 3.16x.

The volatility or risk only scaled by a factor of √N while expectancy grew by N.
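
To make the scaling concrete, here is a minimal Python sketch of the coin-flip arithmetic. It assumes fair, independent flips (a plain binomial). The function name `heads_stats` and the trial counts beyond 10 and 100 are my own, picked only for illustration.

```python
import math

def heads_stats(n_flips, p_heads=0.5):
    """Expected number of heads and its standard deviation for independent flips."""
    expected = n_flips * p_heads
    std_dev = math.sqrt(n_flips * p_heads * (1 - p_heads))
    return expected, std_dev

# Expectancy grows with N, the standard deviation only with sqrt(N),
# so the ratio of the two keeps improving as the flips pile up.
for n in (10, 100, 1000, 10000):
    expected, std_dev = heads_stats(n)
    print(f"N={n:>6}  expected heads={expected:>7.1f}  "
          f"std dev={std_dev:>6.2f}  expected/std={expected / std_dev:>6.2f}")
```

Going from 10 to 100 flips multiplies the expected heads by 10 but the standard deviation by only about 3.16.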

This is the basis of one of my most fundamental posts, Understanding Edge. Casinos and market-makers alike “took a simple idea and took it seriously”. Taking this seriously means recognizing that edges are incredibly valuable. If you find an edge, you want to make sure to get as many chances to harvest it as possible. This has 2 requirements:

  1. You need to be able to access it.
  2. You need to survive so you can show up to collect it.

The first requirement requires spotting an opportunity or class of opportunities, investing in its access, and warehousing the resultant risk. The second requirement is about managing the risk. That includes hedging and all its associated costs.

The paradox is less mystifying as the problem takes shape.

We need to take risk to make money, but we need to reduce risk to survive long enough to get to a large enough number of bets on a sliver of edge to accumulate meaningful profits. Hedging is a drawbridge from today until your capital can absorb more variance.

The Interaction of Trading Costs, Hedging, and Risk/Reward

Hedging reduces variance, in turn improving the risk/reward of a strategy. This comes at a substantial cost. Every options trader has lamented how large a line item this cost has been over the years. Still, as the cost of survival, it is non-negotiable. We are going to hedge. So let’s pull apart the various interactions to gain intuition for the various trade-offs. Armed with the intuition, you can then fit the specifics of your own strategies into a risk management framework that aligns your objectives with the nature of your markets.

Let’s introduce a simple numerical demonstration to anchor the discussion. Hedging is a big topic subject to many details. Fortunately, we can gesture at a complex array of considerations with a toy model.

The Initial Proposition

Imagine a contract that has an expected value of $1.00 with a volatility (i.e. standard deviation) of $.80. You can buy this contract for $.96 yielding $.04 of theoretical edge.

Your bankroll is $100.

[A quick observation so more advanced readers don’t have this lingering as we proceed:

The demonstration is going to bet a fixed amount, even as the profits accumulate. At first glance, this might feel foreign. In investing we typically think of bet size as a fraction of bankroll. In fact, a setup like this lends itself to Kelly sizing5. However, in trading businesses, the risk budget is often set at the beginning of the year based on the capital available at that time. As profits pile up, contributing to available capital, risk limits and bet sizes may expand. But such changes are more discrete than continuous so if we imagine our demonstration is occurring within a single discrete interval, perhaps 6 months or 1 year, this is a reasonable approach. It also keeps this particular discussion a bit simpler without sacrificing intuition.]

The following table summarizes the metrics for various trial sizes.

What you should notice:

  • Expected value grows linearly with trial size
  • The standard deviation of p/l grows slower (√N)
  • Sharpe ratio (expectancy/standard deviation) is a measure of risk-reward. Its progression summarizes the first 2 bullets…as trials increase the risk/reward improves
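
The metrics for any trial size are easy to recompute. This is a rough sketch, assuming every trial is independent with the same $.04 edge and $.80 volatility and a fixed bet size. The trial sizes printed are illustrative choices of mine, not necessarily the rows of the original table, and `strategy_metrics` is a made-up name.

```python
import math

EDGE = 0.04        # $1.00 expected value minus the $.96 purchase price
VOLATILITY = 0.80  # standard deviation of a single contract's outcome

def strategy_metrics(n_trials, edge=EDGE, vol=VOLATILITY):
    """Expected p/l, standard deviation of p/l, and their ratio for n_trials fixed-size bets."""
    expected_pnl = n_trials * edge        # scales linearly with N
    pnl_std = vol * math.sqrt(n_trials)   # scales with sqrt(N)
    sharpe = expected_pnl / pnl_std
    return expected_pnl, pnl_std, sharpe

print(f"{'trials':>8} {'expected p/l':>13} {'std dev':>9} {'sharpe':>7}")
for n in (1, 10, 100, 400, 1000):
    expected_pnl, pnl_std, sharpe = strategy_metrics(n)
    print(f"{n:>8} {expected_pnl:>13.2f} {pnl_std:>9.2f} {sharpe:>7.2f}")
```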

Introducing Hedges

Let’s show the impact of adding a hedge to reduce risk. Let’s presume:

  • The hedge costs $.01.

    This represents 25% of your $.04 of edge per contract. Options traders and market makers like to transform all metrics into a per-contract basis. That $.01 could be made up of direct transaction costs and slippage.

    [In reality, there is a mix of drudgery, assumptions, and data analysis to get a firm handle on these normalizations. A word to the uninitiated, most of trading is not sexy stuff, but tons of little micro-decisions and iterations to create an accounting system that describes the economic reality of what is happening in the weeds. Druckenmiller and Buffett’s splashy bets get the headlines, but the magic is in the mundane.]

  • The hedge cuts the volatility in half.

Right off the bat, you should expect the sharpe ratio to improve — you sacrificed 25% of your edge to cut 50% of the risk.

The revised table:


  • Sharpe ratio is 50% higher across the board
  • You make less money.

Let’s do one more demonstration. The “more expensive hedge scenario”. Presume:

  • The hedge costs $.02

    This now eats up 50% of your edge.

  • The hedge reduces the volatility 50%, just as the cheaper hedge did.



  • The sharpe ratio is exactly the same as the initial strategy. Both your net edge and volatility dropped by 50%, affecting the numerator and denominator equally. 

  • Again the hedge cost scales linearly with edge, so you have the same risk-reward as the unhedged strategy. You just make less money.

If hedging doesn’t improve the sharpe ratio because it’s too expensive, you have found a limit. Another way it could have been expensive is if the cost of the hedge stayed fixed at $.01 but the hedge only chopped 25% of the volatility. Again, your sharpe would be unchanged from the unhedged scenario but you just make less money.
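
The hedge scenarios can be lined up side by side in a few lines of Python. Treat it as a sketch under the toy model’s assumptions (independent trials, fixed bet size). The function name, the scenario labels, and the choice of 100 trials are mine, not part of the original tables.

```python
import math

BASE_EDGE = 0.04  # unhedged edge per contract
BASE_VOL = 0.80   # unhedged volatility per contract

def hedged_metrics(n_trials, hedge_cost=0.0, vol_reduction=0.0):
    """Expected p/l and Sharpe after paying hedge_cost per contract to cut volatility by vol_reduction."""
    net_edge = BASE_EDGE - hedge_cost
    net_vol = BASE_VOL * (1 - vol_reduction)
    expected_pnl = n_trials * net_edge
    sharpe = expected_pnl / (net_vol * math.sqrt(n_trials))
    return expected_pnl, sharpe

# (hedge cost per contract, fraction of volatility removed)
SCENARIOS = {
    "unhedged":                (0.00, 0.00),
    "$.01 hedge, 50% vol cut": (0.01, 0.50),
    "$.02 hedge, 50% vol cut": (0.02, 0.50),
    "$.01 hedge, 25% vol cut": (0.01, 0.25),
}

N_TRIALS = 100  # illustrative trial count
for name, (cost, cut) in SCENARIOS.items():
    pnl, sharpe = hedged_metrics(N_TRIALS, cost, cut)
    print(f"{name:<26} expected p/l=${pnl:5.2f}  sharpe={sharpe:.3f}")
```

Only the cheap, effective hedge raises the Sharpe. The other two hedged rows land on the same Sharpe as the unhedged strategy with less expected profit.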

We can summarize all the results in this chart.

The Bridge

As you book profits, your capital increases. This leaves you with at least these choices:

  1. Hedge less since your growing capital is absorbing the same risk
  2. Increase bet size
  3. Increase concurrent trials

I will address #1 here, and the remaining choices in the ensuing discussion.

Say you want to hedge less. This is always a temptation. As we’ve seen, you will make money faster if you avoid hedging costs. How do we think about the trade-off between the cost of hedging and risk/reward?

We can actually target a desired risk/reward and let the target dictate if we should hedge based on the expected trial size.

Sharpe ratio is a function of trial size:

Sharpe = (E × N) / (σ × √N) = (E / σ) × √N

E = edge per trial
σ = volatility per trial
N = trials

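If we target a sharpe ratio of 1.0 we can re-arrange the equation to solve for how large our trial size needs to be to achieve the target. Since sharpe is (E / σ) × √N, the required N is (target × σ / E)². For the unhedged contract that is (1.0 × .80 / .04)² = 400 trials.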

If our capital and preferences allow us to tolerate a sharpe of 1 and we believe we can get at least 400 trials, then we should not hedge.

Suppose we don’t expect 400 chances to do our core trade, but the hedge that costs $.01 is available. What is the minimum number of trades we can do if we can only tolerate a sharpe as low as 1?

Using the same math as above: (1/.075)² ≈ 178, where .075 is the hedged edge of $.03 divided by the hedged volatility of $.40.

The summary table:

If our minimum risk tolerance is a 1.5 sharpe, we need more trials:

If your minimum risk tolerance is 1.5 sharpe, and you only expect to do 2 trades per business day or about 500 trades per year, then you should hedge. If you can do twice as many trades per day, it’s acceptable to not hedge.
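
Here is one way to sketch that decision rule in Python. It assumes the toy numbers above (unhedged: $.04 edge, $.80 vol; hedged with the $.01 hedge: $.03 edge, $.40 vol) and works in cents to keep the arithmetic clean. The names `trials_needed` and `hedge_decision` are hypothetical helpers, not anything from a library.

```python
import math

def trials_needed(target_sharpe, edge_cents, vol_cents):
    """Smallest whole N with (edge / vol) * sqrt(N) >= target_sharpe."""
    exact = (target_sharpe * vol_cents / edge_cents) ** 2
    return math.ceil(round(exact, 9))  # round first to shed floating-point noise

def hedge_decision(expected_trials, target_sharpe):
    """Pick the cheapest approach that still reaches the target Sharpe."""
    unhedged = trials_needed(target_sharpe, edge_cents=4, vol_cents=80)
    hedged = trials_needed(target_sharpe, edge_cents=3, vol_cents=40)
    if expected_trials >= unhedged:
        return "no hedge needed"
    if expected_trials >= hedged:
        return "hedge"
    return "target out of reach"

for target in (1.0, 1.5):
    print(f"target sharpe {target}: "
          f"unhedged needs {trials_needed(target, 4, 80)} trials, "
          f"hedged needs {trials_needed(target, 3, 40)} trials")

print("500 trades/year at 1.5 sharpe:", hedge_decision(500, 1.5))
print("1000 trades/year at 1.5 sharpe:", hedge_decision(1000, 1.5))
```

The third branch is my addition for completeness: if you cannot even reach the hedged trial count, neither version of the trade meets the target.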

These toy demonstrations show:

  • If you have positive expectancy, you should be trading
  • The cost of a hedge scales linearly with edge, but volatility does not
  • If the cost of a hedge is less than its proportional risk-reduction you have a choice whether to hedge or not
  • The higher your risk tolerance the less you should hedge
  • The decision to dial back the hedging depends on your risk tolerance (as proxied by a measure of risk/reward) vs your expected sample size

Variables We Haven’t Considered

The demonstrations were simple but provide a mental template to contextualize cost/benefit analysis of risk mitigation in your own strategies. We kept it basic by only focusing on 3 variables:

  • edge
  • volatility
  • risk tolerance as proxied by sharpe ratio

Let’s touch on additional variables that influence hedging decisions.


Bankroll

If your bankroll or capital is substantial compared to your bet size (perhaps you are betting far below Kelly or half-Kelly prescribed sizes) then it does not make sense to hedge. Hedges are negative expectancy trades that reduce risk.

We can drive this home with a sports betting example from the current March Madness tournament:

If you placed a $10 bet on St. Peter’s, by getting to the Sweet 16 you have already made 100x. You could lock it in by hedging all or part of it by betting against them, but the bookie vig would eat a slice of the profit. More relevant, the $1000 of equity might be meaningless compared to your assets. There’s no reason to hedge, you can sweat the risk. But what if you had bet $100 on St. Pete’s? $10,000 might quicken the ole’ pulse. Or what if you somehow happened upon a sports edge (just humor me) and thought you could put that $10k to work somewhere else instead of banking on an epic Cinderella story? If St. Pete’s odds for the remainder of the tourney are fair, then you will sacrifice expectancy by hedging or closing the trade. If you are rich, you probably just let it ride and avoid any further transaction costs.

If you are trading relatively small, your problem is that you are not taking enough risk. The reason professionals don’t take more risk when they should is not because they are shy. It’s because of the next 2 variables.

Capacity Per Trade

Many lucrative edges are niche opportunities that are difficult to access for at least 2 reasons.

  • Adverse selection

    There might only be a small amount of liquidity at dislocated prices (this is a common oversight of backtests) because of competition for edge. 

    Let’s return to the contract from the toy example. Its fair value is $1.00. Now imagine that there are related securities that are getting bid up and the market for our toy contract is:

 bid  ask
.95 – 1.05

10 “up” (ie there are 10 contracts on the offer and 10 contracts bid for)

Based on what’s trading “away”, you think this contract is now worth $1.10.

Let’s game this out.

You quickly determine that the .95-1.05 market is simply a market-maker’s bid-ask spread. Market-makers tend to be large firms with tentacles in every related market to the ones they quote. It’s highly unlikely that the $1.05 offer is “real”. In other words, if you tried to lift it, you would only get a small amount of size.

What’s going on?

The market-maker might be leaving a stale quote to maximize expectancy. If a real sell order were to come in and offer at $1.00, the market maker might lift the size and book $.10 of edge to the updated theoretical value. 

Of course, there’s a chance they might get lifted on their $1.05 stale offer but they might honor only a couple contracts. This is a simple expectancy problem. If 500 lots come in offered at $1.00, and they lift it, they make $5,000 profit ($.10 x 500 x option multiplier of 100). If you lift the $1.05 offer and they sell you 10 contracts, they suffer a measly $50 loss. 

So if they believe there’s a 1% chance or greater of a 500 lot naively coming in and offering at mid-market then they are correct in posting the stale quote.
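
The market-maker’s side of this can be written out as a simple expectancy calculation. A minimal sketch, assuming a deliberately crude binary world: either the naive 500 lot shows up and gets sniped, or the stale offer gets lifted for 10 contracts. The function and parameter names are mine.

```python
MULTIPLIER = 100  # option multiplier used in the example

def stale_quote_economics(theo, snipe_price, snipe_lots, stale_offer, lots_honored):
    """Return (profit if a naive seller is sniped, loss if the stale offer is lifted, break-even probability)."""
    snipe_profit = (theo - snipe_price) * snipe_lots * MULTIPLIER
    lifted_loss = (theo - stale_offer) * lots_honored * MULTIPLIER
    # Posting the stale quote breaks even when p * profit = (1 - p) * loss.
    breakeven_prob = lifted_loss / (snipe_profit + lifted_loss)
    return snipe_profit, lifted_loss, breakeven_prob

profit, loss, p = stale_quote_economics(
    theo=1.10, snipe_price=1.00, snipe_lots=500, stale_offer=1.05, lots_honored=10
)
print(f"snipe profit ${profit:,.0f}, stale-quote loss ${loss:,.0f}, "
      f"break-even probability {p:.2%}")
```

The break-even probability comes out just under 1%, which is where the “1% chance or greater” threshold comes from.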

What do you do?

You were smart enough to recognize the game being played. You used second-order thinking to realize the quote was purposefully stale. In a sense, you are now in cahoots with the market maker. You are both waiting for the berry to drop. The problem is your electronic “eye” will be slower than the market-maker to snipe the berry when it comes in. Still, even if you have a 10% chance of winning the race, it still makes sense to leave the quote stale, rather than turn the offer. If you do manage to get at least a partial fill on the snipe, there’s no reason to hedge. You made plenty of edge, traded relatively small size, and most importantly know your counterparty was not informed!

As a rule, liquidity is poor when trades are juiciest. The adverse selection of your fills is most common in fast-moving markets if you do not have a broad, fast view of the flows. This is why a trader’s first questions are “Do I think I’m the first to have seen this order? Did someone with a better perch to see all the flow already pass on this trade?”

In many markets, if you are not the first you might as well be last. You are being arbed because there’s a better relative trade somewhere out there that you are not seeing.

[Side note: many people think a bookie or market-maker’s job is to balance flow. That can be true for deeply liquid instruments. But for many securities out there, one side of the market is dumb and one side is real. Markets are often leaned. Tables are set when certain flows are anticipated. If a giant periodic buy order gets filled at mid-market or even near the bid, look at the history of the quote for the preceding days. Market-making is not an exercise in posting “correct” markets. It’s a for-profit enterprise.]

  • Liquidity

    The bigger you attempt to trade at edgy prices, the more information you leak into the market. You are outsizing the available liquidity by allowing competitors to reverse engineer your thinking. If a large trade happens and immediately looks profitable to bystanders, they will study the signature of how you executed it. The market learns and copies. The edge decays until you’re flipping million dollar coins for even money as a loss leader to get a look at juicier flow from brokers. 

    As edge in particular trades dwindles, the need to hedge increases. The hedges themselves can get crowded or at least turn into a race.


Leverage

If a hedge, net of costs, improves the risk/reward of your position, you may entertain the use of leverage. This is especially tempting for high-Sharpe trades that have low absolute rates of return or edge. Market-making firms embody this approach. As registered broker-dealers they are afforded generous leverage. Their businesses are ultimately capacity constrained and the edges are small but numerous. The leverage combined with sophisticated diversification (hedging!) creates a suitable if not impressive return on capital.

The danger with leverage is that it increases sensitivity to path and “risk of ruin”. In our toy model, we assumed a Gaussian distribution. Risk of ruin can be hard to estimate when distributions have unknowable amounts of skew or fatness in their tails. Leverage erodes your margin of error.

General Hedging Discussion

As long as hedging, again net of costs, improves your risk/reward there is substantial room for creative implementation. We can touch on a few practical examples.

Point of sale hedging vs hedging bands

In the course of market-making, the primary risk is adverse selection. Am I being picked off? If you suspect the counterparty is “delta smart” (whenever they buy calls the stock immediately rips higher), you want to hedge immediately. This is a race condition with any other market makers who might have sold the calls and the bots that react to the calls being printed on the exchange. A point-of-sale hedge is an immediate response to a suspected “wired” order.

If you instead sold calls to a random, uninformed buyer you will likely not hedge. Instead, the delta risk gets thrown on the pile of deltas (ie directional stock exposures) the firm has accumulated. Perhaps it offsets existing delta risk or adds to it. Either way, there is no urgency to hedge that particular deal.

In practice, firms use hedging bands to manage directional risk. In a similar process to our toy demonstration, market-makers decide how much directional risk they are willing to carry as a function of capital and volatility. This allows them to hedge less, incurring less costs along the way, and allowing their capital to absorb randomness. Just like the rich bettor, who lets the St. Peter’s bet ride.

In The Risk-Reversal Premium, Euan Sinclair alludes to band-based hedging:

While this example shows the clear existence of a premium in the delta-hedged risk-reversal, this implementation is far from what traders would do in practice (Sinclair, 2013). Common industry practice is to let the delta of a position fluctuate within a certain band and only re-hedge when those bands are crossed. In our case, whenever the net delta of the options either drops below 20 or above 40, the portfolio is rebalanced by closing the position and re-establishing with the options that are now closest to 15-delta in the same expiration.
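
A band rule is simple to express in code. This sketch is my own simplification: it hedges back to a target delta with the underlying whenever the band is breached, whereas the quoted implementation rebalances by re-striking the options. The delta path and the band widths are hypothetical numbers chosen to show the mechanics.

```python
def band_hedge(net_delta, lower, upper, target=0):
    """Delta to trade: nothing inside the band, back to target once outside it."""
    if lower <= net_delta <= upper:
        return 0
    return target - net_delta

def count_hedges(delta_changes, lower, upper, target=0):
    """Walk a path of delta changes; return (number of hedges, total deltas traded)."""
    net_delta = target
    hedges = 0
    traded = 0
    for change in delta_changes:
        net_delta += change
        trade = band_hedge(net_delta, lower, upper, target)
        if trade != 0:
            hedges += 1
            traded += abs(trade)
            net_delta += trade
    return hedges, traded

# Hypothetical sequence of delta changes from fills and market moves.
path = [30, -20, 50, -10, -40, 80, -30]

print("hedge every change:", count_hedges(path, lower=0, upper=0))
print("hedge outside +/-50:", count_hedges(path, lower=-50, upper=50))
```

On this path the band trades far fewer deltas than hedging every change, at the price of carrying up to 50 deltas of directional noise in between.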

Part art, part science

Hedging is a minefield of regret. It’s costly, but the wisdom of offloading risks you are not paid for and conforming to a pre-determined risk profile is a time-tested idea. Here’s a dump of concerns that come to mind:

  • If you hedge long gamma, but let short gamma ride you are letting losers grow and cutting winners short. Be consistent. If your delta tolerance is X and you hedge twice a day, you can cut all deltas in excess of X at the same 2 times every day. This will remove discretion from the decision. (I had one friend who used to hedge to flat every time he went to the bathroom. As long as he was regular this seemed reasonable to me.)

  • Low net/high gross exposures are a sign of a hedged book. There are significant correlation risks under that hood. It’s not necessarily a red flag, but when paired with leverage, this should make you nervous. 

  • Are you hedging your daily, weekly, or monthly p/l? Measures of local risk like Greeks and spot/vol correlation are less trustworthy for longer timeframes. Spot/vol correlation (ie vol beta) is not invariant to price level, move size, and move speed. Longer time frames provide larger windows for these variables to change. If oil vol beta is -1 (ie if oil rallies 1%, ATM vol falls 1%) do I really believe that the price going from 50 to 100 cuts the vol in half?

  • There are massive benefits to scale for large traders who hedge. The more flow they interact with the more opportunity to favor anti-correlated or offsetting deltas because it saves them slippage on both sides. They turn everything they trade into a pooled delta or several pools of delta (so any tech name will be re-computed as an NDX exposure, while small-caps will be grouped as Russell exposures). This is efficient because they can accept the noise within the baskets and simply hedge each of the net SPX, NDX, IWM to flat once they reach specified thresholds.

    The second-order effect of this is subtle and recursively makes markets more efficient. The best trading firms have the scale to bid closest to the clearing price for diversifiable risk6. This in turn, allows them to grab even more market share widening their advantage over the competition. If this sounds like big tech7, you are connecting the dots. 
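The pooling idea in that last bullet can be sketched in a few lines of Python. Everything here is hypothetical: the tickers, dollar deltas, betas, pool assignments, and the threshold are made up, and a real desk would estimate betas and choose pools with far more care.

```python
from collections import defaultdict


def pool_deltas(positions):
    """Sum beta-adjusted dollar deltas by hedge pool."""
    pools = defaultdict(float)
    for pos in positions:
        pools[pos["pool"]] += pos["dollar_delta"] * pos["beta"]
    return dict(pools)


def hedge_trades(pools, threshold):
    """Flatten only the pools whose net exposure exceeds the threshold."""
    return {name: -net for name, net in pools.items() if abs(net) > threshold}


# Hypothetical book: tickers, deltas, betas and pool assignments are invented.
book = [
    {"ticker": "TECH_A", "pool": "NDX", "dollar_delta": 2_000_000, "beta": 1.25},
    {"ticker": "TECH_B", "pool": "NDX", "dollar_delta": -1_500_000, "beta": 1.0},
    {"ticker": "SMALL_A", "pool": "IWM", "dollar_delta": 1_000_000, "beta": 1.5},
    {"ticker": "SMALL_B", "pool": "IWM", "dollar_delta": -200_000, "beta": 1.0},
]

pools = pool_deltas(book)
trades = hedge_trades(pools, threshold=1_000_000)

print("net pooled exposure:", pools)
print("index hedges to execute:", trades)
```

In this example the tech names net out to the threshold and are left alone, while the small-cap pool gets a single index hedge instead of one hedge per name.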

Wrapping Up

The other market-makers in the product options pit were not wrong to hedge or close their trades as quickly as they did. They just had different constraints. Since they were trading their own capital, they tightly managed the p/l variance.

At the same time, if you were well-capitalized and recognized the amount of edge raining down in the market at the time, the ideal play was to take down as much risk as you could and find a hedge with perhaps more basis risk (and therefore less cost because the more highly correlated hedges were bid for) or simply allow the firm’s balance sheet to absorb it.

Since I was being paid as a function of my own p/l there was not perfect alignment of incentives between me and my employer (who would have been perfectly fine with me not hedging). If I made a great bet and lost, it would have been the right play but I personally didn’t want to tolerate not getting paid.

Hedging is a cost. You need to weigh that with the benefit and that artful equation is a function of:

  • risk tolerance at every level of stakeholder — trader, manager, investor
  • capital
  • edge
  • volatility
  • liquidity
  • adverse selection

Maximizing is uncomfortable. Almost unnatural. It calls for you to tolerate larger swings, but it allows the theoretical edge to pile up faster. This post offers guardrails for dissecting a highly creative problem.

But if you consistently make money, ask yourself how much you might be leaving on the table. If you are making great trades somewhere, are you locking it in with bad trades? If you can’t tell what the good side is that’s ok.

But if you know the story of your edge, there’s a good chance you can do better.

Momentum Psychology

His tweet brought my attention to @cobie and his masterful description of psychology.

I also appreciated Josh Brown’s take on the sell-off in so-called growth or momentum names. Here’s an excerpt from Jan 31’s It’s not over yet.

I’m less interested in the real-time action. Focus on the evergreen psychology instead:

Where do bounces come from in a midst of a correction?

Sometimes it’s just that stocks have fallen too far for sellers to want to keep selling. This isn’t bullish. In fact, this type of bounce can suck people back in by creating the appearance that the worst is over. Growth stocks in particular. Because belief dies hard and enthusiasm for cutting edge technologies fades slowly, not suddenly. Which mean the give-up process is long and drawn out – even after a stock is cut in half sometimes the worst is still yet to come. The slow bleed after is often worse than the initial shocking drop that preceded it.

Over at Verdad Capital, Dan Rasmussen revisits their “Bubble 500” list of overpriced growth stocks, originally created in the Summer of 2020. It’s filled with money-losing companies working in exciting areas of technology such as electric vehicles and gene editing therapy and so on. Needless to say, this list of bubble stocks has gotten absolutely destroyed year-to-date, after having run straight up in Verdad’s face through the middle of 2021.  Dan explains two very important things in his update this week: The first is that sell-offs for growth stocks differ from sell-offs for value stocks in one very important way:

This breakdown is significant, especially for growth stocks. Remember, growth stocks trend, and value stocks mean revert. The psychology is simple. People hear about a hot stock that’s gone up 3x, they buy some, it goes up 2x, they buy more: the whole attraction of buying a hot growth stock is the historic return trajectory. Value stocks are the opposite: you do well buying them when they’re down…

This idea is counterintuitive – that some stocks actually become worse buys as they are falling to lower prices, but the explanation is psychological, not financial. Stocks trading at excessive valuations require a fan base to sustain their share prices. That fan base is often a bandwagon-jumping melange of traders and investors who are attracted to recent gains. Yes, they’ll latch onto the fundamental story, but the fact that the stock has been and currently is going up is the main thing. When the stock breaks, so too does the fandom. And when the fan base moves on to greener pastures or runs out of money, a new fan base will not form for this stock with its chart in decline. Broken growth stocks become orphans. There is no natural place for them to find a home.

Momentum is a divergent strategy while “value” is a mean-reverting strategy. Several years ago the research team at OSAM published edifying papers on how these approaches work. I wrote a summary here:

✍️ Notes on OSAM’s Factors from Scratch (6 min read)

Value works by fading overreaction. Momentum is attributed to underreaction. In a name trending higher, the sellers are discounting the substance of new information too aggressively. In dork world, we call this anchoring. If you pay attention to “anomalies” you may recognize the concept of post-earnings drift as an acute example of anchoring. Wikipedia even has an entry for it.

Fear or FOMO in markets cuts both ways. On the way down, we fear a loss of wealth. On the way up we fear social embarrassment — we aren’t keeping up with our neighbors. We are caught between self-preservation and shame. I wonder if being part of the herd is any consolation on the way down while everyone loses. Or is this just another miserable psychological asymmetry inseparable from speculation?

Anyway, I don’t have much to add. Investing requires you to be honest about your desires, constraints, and emotional tolerance. If you can get honest with yourself, you can initiate a plan that you can stick to. You want to avoid ad-hoc decisions with the bulk of your savings (I’m not gonna poo poo on gambling with 1 or 2% of your wealth, especially if it suppresses wider risk-seeking behavior. Agustin Lebron’s Laws of Trading has a provocative section about “risk set points” that operate like weight set points. If your life becomes too dull in one way you spice it up in another and vice versa. Maybe Alex Honnold’s portfolio is all in bonds 🤷🏽).

Oh and just a quick observation that you can ponder in the context of those flashy, earningless momentum stocks. If you start at $1 and double 6x you get to $64. If the stock then drops 50%, you’ve only erased 1 doubling.

Be careful knife catching.
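The arithmetic behind that observation, as a quick Python check. The $1 start and six doublings come from the text; the variable names are mine.

```python
import math

start = 1.0
doublings = 6
peak = start * 2 ** doublings                   # price after six doublings

after_drop = peak * (1 - 0.50)                  # a 50% drawdown from the peak
doublings_left = math.log2(after_drop / start)  # doublings still in the price

# Drawdown from the peak that would erase all six doublings
full_round_trip = 1 - start / peak

print(f"peak: ${peak:.0f}, after a 50% drop: ${after_drop:.0f}")
print(f"doublings still in the price: {doublings_left:.0f}")
print(f"drawdown needed to get back to ${start:.0f}: {full_round_trip:.2%}")
```

A stock that is "cut in half" from $64 still has five doublings of air underneath it.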

There’s Gold In Them Thar Tails: Part 2

This is Part 2 of a discussion of how sourcing talent or outcomes in the tails or extremes of a distribution calls for our selection criteria to embrace more variance than searches in the heart of a distribution. To catch up please read There’s Gold In Them Thar Tails: Part 1.

If you can’t be bothered here’s the gist:

  1. We saw that an explosion of choice, whether it’s job or college applicants, songs to listen to, or athletes to recruit, has made selection increasingly difficult.
  2. A natural response is to narrow the field by filtering more narrowly. We can do this by making selection criteria stricter or deploying smarter algorithms and recommendation engines.
  3. This leads to increased reliance on legible measurements for filtering.
  4. Goodhart’s law expects that the measures themselves will become the target, increasing the pressure on candidates to optimize for narrow targets that are imperfect proxies or predictors of what the measure was filtering for.
  5. Anytime we filter, we face a trade-off between signal (“My criteria is finding great candidates”) and diversity. This is also known as the bias-variance trade-off.
  6. Diversity is an essential input to progress. Nature’s underlying algorithm of evolution penalizes in-breeding.
  7. In addition to a loss of diversity, signal decays as you get closer to the extremes. This is known as tail divergence. The signal can even flip (ie Berkson’s Paradox).
  8. The point where the signal noise overwhelms the variance in the candidates is an efficient cutoff. Beyond that threshold, selectors should think more creatively than “just raise the bar”.

At the end of part 1, there were strategies for both the selector and the selectees to increase diversity to improve outcomes in the extremes.

If narrower filters are less effective in the tails (ie more noise, weaker correlations between criteria and match quality), we should be intentional about the randomness we introduce to the process. A 1500 SAT is a noisy predictor of “largest alumni donor 20 years from now”. Instead, accept the 1350 SAT from the homeschooled kid in Argentina. Experiment with criteria and let chance retroactively hint at divergent indicators that you would never have thought to test. One of the benefits of such an experiment is that if you are methodical about how you introduce chance you can study the results for a hidden edge. If nobody else has internalized this thinking because they think it’s too risky (it’s not…the signal of the tighter filter had already degraded), then you have an opportunity to leap ahead of your competitors who underestimate the optionality in trying many recipes and keeping the ones that taste good. You tolerate some mayonnaise liver sandwiches before you discover pb&j.

In part 2, we reflect on what tail divergence says about life and investing.

Where Instincts Fail

Tail divergence is the simple observation that attributes that correlate with certain outcomes lose their predictive ability as we get into the extremes. If you are 6’7, you’re better at basketball than most of the population. But you couldn’t set foot on the hardwood with the lowly Rockets’ 12th man. Taken further, Berkson’s Paradox shows that it’s possible for the correlation to flip. LessWrong thinks the flippening may be causal because of too much of a good thing:

Maybe being taller at basketball is good up to a point, but being really tall leads to greater costs in terms of things like agility… Maybe a high IQ is good for earning money, but a stratospherically high IQ has an increased risk of productivity-reducing mental illness. Or something along those lines.

The safest generalization to absorb:

When speculating about the tails of a distribution your intuition is less reliable. 

If you can pinpoint causality, that’s a bonus. Simply realizing your guesses about extremes are random is an advantage. It splits your brain wide open to get your imagination oxygen.

Behavioral psychology recognizes the usefulness of heuristics to make judgements while highlighting how “biases” such as framing can short-circuit our “System 1” machinery. Intuition is a useful guide when we have deep experience in a domain, but we should seek external data (base rates) or guidance when we stray from the mundane.

If our intellectual adventures take us from “mediocristan” to “extremistan” then data is not necessarily a helpful tour guide. It can even be harmful if it encourages a false sense of security or a load-bearing assumption that turns out to be hollow 1

A recent example of intuition failing in an extreme scenario still stings. When Covid first started spreading in the US, asset prices and city rents dove lower. Financial markets stabilized and began recovering when the government committed to replacing lost demand with an unprecedented fiscal package for an unprecedented event. My suburban house shot up 15% in value as locked-down city dwellers wanted more space. Seeing the divergence between home price and rentals, I quickly diagnosed the home price bump as a premium needed to absorb a sudden, but transitory urban exodus until we could get a vaccine. While it wasn’t the main consideration for selling, the “trade setup” was not lost on me. My intuition in this extreme scenario couldn’t have fathomed that the price would shoot 20% more (and still going, ughh) through where I sold as the lockdowns lifted. My trading intuition degrades less gracefully than I’d like to admit as the orbits get further from financial options.

Moral Intuition

As technology and science fiction converge, it would be dangerous to lazily extrapolate how we handle routine computer-enabled behavior to edge cases. If you have ever played dark forms of “would you rather?” then you are already familiar with the so-called trolley problem:


The Conversation explains the so-called trolley problem in the context of self-driving cars:

The car approaches a traffic light, but suddenly the brakes fail and the computer has to make a split-second decision. It can swerve into a nearby pole and kill the passenger, or keep going and kill the pedestrian ahead.

This is spiky terrain. What is the value of a life? This is not a novel dilemma. In Tails Explained, I show how courts use probabilities of accidental (ie rare) deaths to estimate tort damages. What is novel is the scale of these considerations once robots take the wheel. The giant fields of AI safety and ethics are proof that scaling up tort law is not going to cut it. We are forced to explicitly study realms that ancient moralities only needed to consider rhetorically. 

In Spot The Outlier,  Rohit writes:

the systems we’d developed to intuit our way through our lives have difficulty with contrived examples of various trolley problems, but that’s mainly because our intuitions work in the 80% of cases where the world is similar to what we’ve seen before, and if the thought experiment is wildly different (e.g., Nozick’s pleasure machine) our intuitions are no longer a reliable guide.

In The Tails Coming Apart As A Metaphor For Life, Slatestarcodex says:

This is why I feel like figuring out a morality that can survive transhuman scenarios is harder than just finding the Real Moral System That We Actually Use. There’s a potentially impossible conceptual problem here, of figuring out what to do with the fact that any moral rule followed to infinity will diverge from large parts of what we mean by morality.

A wave of exponential automation threatens to capsize our moral rafts. Slatestar invokes one of my favorite paragraphs2 of all-time to make his point. 

When Lovecraft wrote that “we live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far”, I interpret him as talking about the region from Balboa Park to West Oakland on the map above [This is a metaphor for moral territory he builds in the full post].

Go outside of it and your concepts break down and you don’t know what to do.

The full opening paragraph of The Call of Cthulhu deserves your eyes:

The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.

Slatestar edits Lovecraft:

The most merciful thing in the world is how so far we have managed to stay in the area where the human mind can correlate its contents.

This is not an optimistic outlook for our ability to reconcile our locally based morality with a species-level perspective. Reasoning about extremes is more futile than we’d like to think. As we search for outliers, we need humility.

Even The Math Prescribes Humility

Let’s translate tail divergence to math terms. We discussed how SAT scores have predictive power for GPA. The issue is that this power loses efficacy as we get to the top tier of GPAs, just as being tall starts to tell us less about the best basketball players once we are dealing with the sample that has made it to the NBA.

This loss of signal manifests as a correlation breakdown over some range of the X or explanatory variable. This is the result of the error terms or variance in a regression increasing or decreasing over some range. The fancy word for this is “heteroscedasticity”. 

See this made-up example from 365DataScience:

The variance of the errors visibly changes as we move from small values of X to large values. 

It starts close to the regression line and goes further away. This would imply that, for smaller values of the independent and dependent variables, we would have a better prediction than for bigger values. And as you might have guessed, we really don’t like this uncertainty.

Ordinary least squares (ie OLS) regression is a common technique for computing a correlation. However, equal variance (homoscedasticity) is one of the 5 assumptions embedded in OLS. Tail divergence is evidence that the data set violates this assumption, so we shouldn’t be surprised when the filters we used in the meat of the distributions lose efficacy in the extremes. 
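To make the violated assumption concrete, here is a small simulation in Python. The data is entirely made up: x stands in for a test score scaled to [0, 1], y tracks x with noise whose standard deviation grows sharply at high x, and the functional forms and the fixed seed are my choices. The sample is then split into the bottom 90% of x and the top decile, and a correlation is computed for each.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up data: the noise standard deviation (0.1 + x**4) rises with x,
# which is exactly the heteroscedasticity described above.
xs = [rng.random() for _ in range(5000)]
ys = [2 + 2 * x + rng.gauss(0, 0.1 + x ** 4) for x in xs]

pairs = sorted(zip(xs, ys))
cut = int(0.9 * len(pairs))  # boundary of the top decile of x
low, top = pairs[:cut], pairs[cut:]

corr_low = pearson([x for x, _ in low], [y for _, y in low])
corr_top = pearson([x for x, _ in top], [y for _, y in top])

print(f"correlation, bottom 90% of x: {corr_low:.2f}")
print(f"correlation, top decile of x:  {corr_top:.2f}")
```

One caveat on the design: the top decile also spans a much narrower range of x, and that range restriction lowers the correlation on its own. Both effects push in the same direction, which is part of why filters that work in the middle go quiet at the top.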

If you broke the regression into 2 separate lines, one for the low to middle range of SAT scores and one for the top decile of SAT scores we could compute different correlations to GPA. If the tails diverge, we would see a lower correlation for the higher range. Correlations even as high as 80% have discouraging amounts of explanatory power. 

For the derivation, see From CAPM To Hedging.
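As a rough numerical companion, the sketch below assumes the standard result that after accounting for a predictor with correlation ρ, the fraction of the outcome's standard deviation left over is sqrt(1 − ρ²). The function name is mine and the correlation levels are arbitrary examples.

```python
import math


def risk_remaining(correlation):
    """Fraction of the outcome's standard deviation left unexplained."""
    return math.sqrt(1 - correlation ** 2)


for rho in (0.5, 0.8, 0.9, 0.99):
    print(f"correlation {rho:.2f} -> risk remaining {risk_remaining(rho):.0%}")
```

At a correlation of 0.8, which sounds strong, 60% of the original volatility is still unexplained.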

We shouldn’t be surprised when the most successful person from your 8th grade class wasn’t a candidate for the “most likely to succeed” ribbon. The qualities that informed that vote leave a lot of “risk remaining” when trying to predict the top performers in the wide-open game of life.

Since the nature of extremes is untamed, we need humility. This is true, but abstract. What does “humility” mean practically? It means making decisions that are robust to the lack of determinism in the tails. In fact, we can construct approaches that actively seek to harness the variance in the tails.

The world of trading and investing is a perfect sandbox to explore such approaches.

Take Advantage of Poor Tail Intuition In Investing

I know the heading is ironic. 

Let’s see if we can use “option-like” approaches to use the divergence or uncertainty in the tails to our advantage. 

Respect Path

Rohit summarized the argument succinctly:

If measurement is too strict, we lose out on variance. 
If we lose out on variance, we miss out on what actually impacts outcomes.

Tails are unpredictable by the same models that might be well-suited for routine scenarios. In fact, rare outcomes can be stubbornly resistant to description by any models in a complex system. The robust response to this situation is not to lean into our models but to relax the filters in favor of diversity, which increases our chance of capturing an outcome nobody has foreseen, because, by definition, nobody’s model could have predicted it (and therefore bid it up) in the first place.

How do you do that?

2 words: Respect. Path. 

Recall from part 1 David Epstein’s research-based suggestion:

One practice we’ve often come back to: not forcing selection earlier than necessary. People develop at different speeds, so keep the participation funnel wide, with as many access points as possible, for as long as possible. I think that’s a pretty good principle in general, not just for sports.

What does this mean in a trading context?

This is easy to explain by its opposite. Let’s rewind a decade. Jon Corzine managed to blow up MF Global by focusing on the belief that European bonds (remember the Greek bond crisis?) would pay out in the end and placing that bet with extreme leverage. While the bonds eventually paid out, the margin calls buried MF Global. This is a common story. I chose it because it exemplifies how a lack of humility is the murder weapon. 

The moment you employ leverage, you are worshiping at the altar of path. Corzine refused to make the appropriate sacrifices to the gods. He focused on the terminal value of the bonds. A focus so myopic, Corzine still stubbornly clings to the idea that he was right. [I once went to dinner with an option trader who worked closely with Corzine. He described him as both smart and unfazed in his path-blindness. I’d like to take issue with “smart” but he’s the one giving a fortune away, so I’ll just shut up.] 

He might be rich, but if you were a stakeholder or client in MF Global, he’s a villain. Let’s not be like Jon Corzine. 

Ways To Respect Path

Treat leverage with respect

The most common forms of financial leverage we employ are mortgages. The primary path risk here is needing to re-locate suddenly and potentially needing to sell at a bad time. If there are many potential forks on your horizon, the liquidity in renting can be worth it3.


“Rebalance timing luck”

This is a term coined by Corey Hoffstein in his paper The Dumb (Timing) Luck of Smart Beta. First of all, this topic is central to any analysis of performance. You can have 10 different trend-following strategies with the same approximate rules but if they vary in their execution by a single day, the impact of luck can be tyrannical. Imagine one strategy was long oil the day it went negative, another strategy got out of the position one day earlier. Is the difference in performance predictive? It’s a bedeviling issue for allocators trying to parse historical returns. 

If timing is not part of your alpha, then leaving it to chance can swamp the edge you worked so hard to find, capture, and market to investors. This is a recipe for disappointment for either the manager (who gets unlucky) or the investor who chose the fund from a crop of competitors based on noise. 

Respecting path means smoothing the effect of rebalance timing luck. This is commonly done by dividing a single strategy into multiple strategies differing only by their rebalance schedule. The ensemble will average the luck across executions, hopefully keeping the results closer to the strategy’s intended expression.
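A toy version of that tranching idea in Python. The signal, the returns, and the 3-day rebalance period are all invented, and the mechanics (earn the day's return on the position carried in, then rebalance at the close) are a simplifying assumption rather than a description of any fund's process.

```python
def tranche_pnl(signals, returns, period, offset):
    """Daily P/L for one tranche that rebalances every `period` days.

    The tranche earns today's return on the position it carried in, then
    resets its position to today's signal if today is a rebalance day.
    """
    position = 0
    pnl = []
    for day, (signal, ret) in enumerate(zip(signals, returns)):
        pnl.append(position * ret)
        if day >= offset and (day - offset) % period == 0:
            position = signal
    return pnl


def ensemble_pnl(signals, returns, period):
    """Equal-weight average of tranches staggered one day apart."""
    tranches = [tranche_pnl(signals, returns, period, k) for k in range(period)]
    return [sum(day) / period for day in zip(*tranches)]


# Invented data: the signal flips from long (1) to flat (0) on day 3,
# one day before a 30% crash.
signals = [1, 1, 1, 0, 0, 0]
returns = [0.01, 0.02, -0.01, 0.01, -0.30, 0.02]

totals = [sum(tranche_pnl(signals, returns, 3, k)) for k in range(3)]
blended = sum(ensemble_pnl(signals, returns, 3))

for k, total in enumerate(totals):
    print(f"tranche rebalancing on offset {k}: {total:+.2%}")
print(f"ensemble of all three: {blended:+.2%}")
```

Same rules, same signal, and one tranche dodges the crash purely because of which day it happened to rebalance. The ensemble gives up the lucky outcome in exchange for not depending on it.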

Path vs terminal value thinking

Corzine had a terminal value opinion (“if I hold these bonds to maturity I’ll get paid”). But any trade that is marked-to-market must still weather path. Leverage makes the trade acutely fragile with respect to path. Even if his bet was a good one at the time, the expression was negligent because it did not properly reflect his constraints.

It’s critical that the expression of a bet clings closely to its thesis. If you want to bet on the final outcome of a trade, you need to insulate the expression from path. Similarly, you can bet on path while being indifferent to the final outcome. For example, a momentum investor may devise a rule-based strategy to levitate with an inflating bubble but exit before holding the bag. These participants bet on path not terminal value. The past few years have glorified such a game of hot potato. 

Whether this game of hot potato is really a game of Russian roulette depends on the expression. Many momentum strategies use stops or trailing stops to escape a trade where the trend has petered out or reversed. This expression mimics a long option position. They are creating unbounded upside and limiting their downside. This expression is banking on a dangerous assumption: liquidity. They are constructing a “soft” option presumably because they think it’s cheaper than purchasing a financial or what I call a “hard” or contractual option.

Let’s ignore realized volatility, which is a first-order determinant of whether the option is cheaper. The biggest problem is gap risk. Soft-option constructions assume continuity. But we know technology breaks, markets close, stocks get halted, countries invade each other, exchanges cancel trades. Pricing gap risk is impossible. That’s why derivative traders say the only hedge for an option is a similar option. Trading strategies are said to be robust to model risk if they contain offsetting exposures to the same model. If you’re short a call option on TSLA the only real hedge is to be long a different TSLA call. Reliance on the mathematical model cancels out.
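The continuity assumption is easy to see in a sketch. The Python below walks a price path with a trailing stop and reports the stop level alongside the price you actually get filled at. The price paths and the 10% trail are made up, and the fill rule (you get the first observed price at or below the stop) is a simplifying assumption.

```python
def trailing_stop_exit(prices, trail):
    """Walk a price path and return (stop_level, fill_price) at the exit.

    The stop trails the running high by `trail` (0.10 = 10%). The fill is the
    first observed price at or below the stop, which can be far below the
    stop itself if the market gaps. Returns None if the stop is never hit.
    """
    high = prices[0]
    for price in prices[1:]:
        stop = high * (1 - trail)
        if price <= stop:
            return stop, price
        high = max(high, price)
    return None


smooth = [100, 110, 120, 115, 107, 100]  # prices trade through the stop
gapped = [100, 110, 120, 115, 80, 75]    # market reopens far below it

print("smooth path (stop, fill):", trailing_stop_exit(smooth, 0.10))
print("gapped path (stop, fill):", trailing_stop_exit(gapped, 0.10))
```

On the smooth path the "soft" option behaves roughly like the floor it was meant to be. On the gapped path the stop is the same but the fill is nowhere near it, which is the risk a contractual option would have covered.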

Zooming in on options (feel free to skip and jump down to Investing for Path)

Some market participants focus on terminal value or the “long run” while others are focused on path. Option prices are consensus mechanisms that balance both views. I discussed this in What The Widowmaker Can Teach Us About Trade Prospecting And Fool’s Gold:

The nat gas market is very smart. The options are priced in such a way that the path is highly respected. The OTM calls are jacked, because if we see H gas trade $10, the straddle will go nuclear.

Why? Because it has to balance 2 opposing forces.

        1. It’s not clear how high the price can go in a true squeeze or shortage
        2. The MOST likely scenario is the price collapses back to $3 or $4.
Let me repeat how gnarly this is.
The price has an unbounded upside, but it will most likely end up in the $3-$4 range.
Try to think of a strategy to trade that.
Good luck.
        • Wanna trade verticals? You will find they all point right back to the $3 to $4 range.
        • Upside butterflies which are the spread of call spreads (that’s not a typo…that’s what a fly is…a spread of spreads. Prove it to yourself with a pencil and paper) are zeros.
The market places very little probability density at high prices but this is very jarring to people who see the jacked call premiums.
That’s not an opportunity. It’s a sucker bet.
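If you would rather not reach for the pencil and paper, the "spread of spreads" claim can be checked at expiration with a few lines of Python. The strikes are hypothetical, and this only compares expiration payoffs, not prices before expiry.

```python
def call_payoff(spot, strike):
    """Expiration value of one call."""
    return max(spot - strike, 0)


def call_spread(spot, low, high):
    """Long the low-strike call, short the high-strike call."""
    return call_payoff(spot, low) - call_payoff(spot, high)


def butterfly(spot, k1, k2, k3):
    """Long 1 k1 call, short 2 k2 calls, long 1 k3 call."""
    return call_payoff(spot, k1) - 2 * call_payoff(spot, k2) + call_payoff(spot, k3)


K1, K2, K3 = 8, 10, 12  # hypothetical, equally spaced strikes

for spot in range(0, 21):
    fly = butterfly(spot, K1, K2, K3)
    spread_of_spreads = call_spread(spot, K1, K2) - call_spread(spot, K2, K3)
    assert fly == spread_of_spreads
    print(f"spot {spot:>2}: fly payoff {fly}")
```

The fly is long the lower call spread and short the upper one, so it only pays when the price settles near the middle strike.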

Investors with different time horizons often trade with each other. It’s even possible they have the same long-term views but Investor A thinks X is overbought in the near-term and sells to Investor B who just wants to buy-and-hold. Investor A is hoping to buy X back cheaper. They are trying to time the market and generate trading P/L, expecting to find a more attractive entry to X later. Perhaps A is a trader more than an investor. A is obsessively conscious of near-term opportunity costs or hurdle rates. As an options trader, I am generally more focused on path than terminal value. 

Let’s see how trade expression varies with your lens of terminal value vs path. 

Static Expressions

A static trade expression means you put your trade on and leave it alone until some pre-defined catalyst. For options this is typically expiration. The reason you might do this is you are aware that you cannot predict the path but do not want to be shaken out of the position because you like the odds the market is offering on the terminal value of a proposition. To use natural gas as an example, suppose the gas futures surge to $6 amidst a polar vortex but you think there is a 25% chance the price falls to $4.50 by expiration.

Suppose you can buy a vertical spread that pays 4-1 on that proposition. The bet is positive expectancy so you decide to take it. This is a discrete bet. The worst-case scenario is losing your premium. You can size the trade by feel (I’m willing to risk 1% to make 4%) or some version of Kelly sizing. Instead of trading towards a target amount of risk (whether that’s delta, vega, etc) you budget a fixed dollar amount towards it and let it ride. I refer to this type of bet as “risk-budgeting”.
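The expectancy and sizing math for that bet, sketched in Python. It treats the vertical as a binary bet (full payout or lose the premium), which is a simplification since a real vertical has in-between payoffs. The 25% probability and 4-1 odds come from the example above; the quarter-Kelly multiplier is purely illustrative.

```python
p_win = 0.25  # your probability the proposition comes true
odds = 4.0    # net payout per 1 of premium risked (the "4-1")

# Expected profit per 1 of premium risked
edge = p_win * odds - (1 - p_win) * 1.0

# Kelly fraction for a binary bet: f = p - q / b
kelly_fraction = p_win - (1 - p_win) / odds

# Many traders size at a fraction of Kelly; the 0.25 multiplier is an
# illustrative choice, not a recommendation.
quarter_kelly = 0.25 * kelly_fraction

print(f"edge per unit risked: {edge:+.2f}")
print(f"full Kelly fraction of bankroll: {kelly_fraction:.2%}")
print(f"quarter Kelly: {quarter_kelly:.2%}")
```

Even with a healthy edge, full Kelly here is a single-digit percentage of the bankroll, so "risk 1% to make 4%" by feel lands in a similar neighborhood to fractional Kelly.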

When “risk-budgeting” a trade you specify a fixed bet size and you do not use leverage or pseudo-leverage (for example taking a short option position which demands margin). The point is to set-it-and-forget-it. 

These types of trades were a small minority of my allocations, but they are the easiest to manage. By design, you are not getting cute with the expression, because you expect the path to your possible outcome to be hairy. This is a self-aware strategy for respecting path.

Dynamic Expressions

Most of my trades were actively managed.  Running a large options portfolio means lots of churn as you whack-a-mole opportunities. You find more attractive positions to warehouse than what’s currently on the books, or perhaps you are adding to get to a more full-size position.

The key is most of the focus is on path not terminal value. Sometimes I’m buying vol because I have a view on volatility, but often I’m buying vol if I think there are going to be more vol buyers. The first kind of buying is a hybrid of path and terminal value thinking, but the second type of vol buying has a momentum mindset. My view on realized vol takes a backseat to my view on flows if I think the option demand will exceed supply at current levels of implied volatility. 

Other dynamic trade expressions:

    1. Implied sentiment

      Another path-aware expression is to bet on the expectations embedded in prices. I might load up on oil calls not because I think oil is going to $200, but because I think the awareness that such a price is possible can emerge due to some catalyst (“saber-rattling”). I’m thinking in terms of path not terminal value when my thesis is “sentiment can go from apathy to fear”. I’m betting on a change in the Overton Window. The change in sentiment can increase call option implied vols and even the futures. But the option trade expression is a purer play than the futures.

      [The number of ways an oil future can rise is greater than the number of drivers to push oil call skew higher, so the call options isolate the thesis better by being directly levered to it. Agustin Lebron’s 3rd Law Of Trading: Only take the risks you are paid to take.]

    2. Owning the wing

      Tail options are on average “expensive” in actuarial terms. But there are several reasons why I do not short them. 

      1. “Average” is hiding a lot of detail. The excess premium in those options can be proportionally small to what those options can be worth conditional on stressed states of the world. Buying them when they are relatively cheap to their own elevated premiums can be worthwhile, especially if those options put you in the driver’s seat when the world starts melting down. If you are the only one with bullets in a warzone, there’s a good chance you have them because the terminal-value-Jon-Corzine crowd underestimated path. Then you can sell the options “closing” at truly outlandish prices. I want the tails because I don’t want to be running a trading business with a prime broker’s trapdoor beneath me. 

      2. I’m not smart enough to know when to sell tail options opening. I buy them when they are relatively cheap (which usually still means expensive to Corzine brains) and I sell them closing when they go nuclear. Like when you throw some insane offer out there and it gets taken. As a rule you don’t want to sell wings to someone who spent more than a few moments thinking about it or used a spreadsheet or model or calculator or star chart. You sell them to people who are forced to buy them. When Goldman blows their customer out they don’t haggle. 

        In practice, ratio put spreads look attractive to terminal value people who like to “buy the one and sell the two” because their breakeven is so “far” out-of-the-money and they get to win on medium drawdowns. I often like to sell the 1 and buy the 2 because conditional on the 1×2 “getting there”, the 2 are going to be untouchable. 

        [The buyer of the one in a 1×2 is happiest in the grinding trend scenario where strike vols underperform the skew.]

      3. In The “No Easy Trade” Principle I explain how implied market parameters do not vary as widely as realized parameters because markets are discounting machines4.

        Markets bet on mean reversion. Vols often underreact when they are rising (or falling) as the regime changes. These turns can be great path trades. They are momentum opportunities to lift or hit slower participants who are anchored to the prior regime. These opportunities are very profitable since you are not only putting the bet on the right way, but you are able to get liquidity from stale actors. (The trouble with many opportunities is getting liquidity — if you know something is going up but everyone else does too, your signal is valid but insufficiently differentiated. Turning every measly 5 lot offer into a new bid makes the market more efficient without extracting a reward for it. In fact, if you do that, you don’t understand expectancy or the principle of maximization. Your job isn’t to correct incorrect markets. It’s to make money. The overlap is imperfect.) The challenge is you somehow need to not be anchored yourself 5.

      4. Humility is recognizing that the craziest event has yet to happen. Market shocks are a feature. They look different every time because we prepare for the last war. The instruments that measure our vitals become the targets themselves. Tail options provide volatility convexity, or exposure to “vol of vol”. You don’t need to know the nature of the next shock to know that you will have wanted vol convexity. See Finding Vol Convexity. 
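
To make the 1×2 put spread from item 2 concrete, here is a minimal sketch of its payoff at expiration. The strikes (90 and 80), the zero net premium, and the function names are all hypothetical illustration values, not numbers from this post. It only shows the terminal-value view that makes "buy the one, sell the two" look attractive; it says nothing about how the structure marks along the path, which is the whole argument for taking the other side.

```python
def put_payoff(strike, spot):
    """Value of one put at expiration."""
    return max(strike - spot, 0.0)


def ratio_put_spread_pnl(spot, upper=90.0, lower=80.0, net_credit=0.0):
    """Expiration P/L per share of buying one `upper` put and selling two `lower` puts.

    Strikes and net_credit are hypothetical values chosen for illustration.
    """
    return put_payoff(upper, spot) - 2 * put_payoff(lower, spot) + net_credit


def lower_breakeven(upper=90.0, lower=80.0, net_credit=0.0):
    """Spot below which the long-1/short-2 structure loses money at expiration."""
    return 2 * lower - upper - net_credit


for spot in (100, 90, 85, 80, 75, 70, 60):
    print(spot, ratio_put_spread_pnl(spot))
```

With the stock at 100, the buyer of the one does not lose at expiration until the stock is below 70, which is the "far" breakeven. The seller of the one and buyer of the two has the mirror-image P/L at expiration and owns the extra put if the market gets there.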

Combining Expressions

I’ll mention this for completeness but it’s a topic I should probably do a video for. It’s not complicated but it’s a bit technical for a post like this. When running an options book, it’s possible to treat some of the positions dynamically and some of them statically. In practice, I “remove” line items that have well-defined risks from my position at the most recent mark-to-market value so that I do not incorporate their Greeks into my book. I don’t hedge them with the rest of the pile.

For example, if I notice an out-of-the-money put spread on my books, instead of dynamically managing a position that was short a tail, I’d put the spread in another account and sell the corresponding delta hedge associated with it. Going forward it would not generate any Greeks in my main risk view so there’s no need to hedge (remember hedging is a cost). The risk is sequestered to the premium. Let’s say it’s $75,000 worth of put spreads. The expectancy of the spread is presumably zero, so it’s like having a simple over/under bet on the books. If expiration goes my way I get to make a multiple of that, but I know the worst (and most likely) case is losing $75k which given the size of the book is noise. If my capital swamps the risk, there’s no point in hedging it especially since it’s short a tail that’s sensitive to vol of vol.
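
A rough sketch of why the sequestered put spread has a well-defined risk. Only the $75,000 premium comes from the example above; the strikes (90/85), the $1.00 price per spread, the 750 contracts, and the function name are assumptions picked so the premium works out to $75,000.

```python
def put_spread_pnl(spot, long_strike, short_strike, premium_paid, contracts, multiplier=100):
    """Expiration P/L in dollars of a long put spread held unhedged.

    Long the higher-strike put, short the lower-strike put.
    """
    long_leg = max(long_strike - spot, 0.0)
    short_leg = max(short_strike - spot, 0.0)
    return (long_leg - short_leg) * multiplier * contracts - premium_paid


# Hypothetical position: 750 of the 90/85 put spreads bought for $1.00 each.
contracts = 750
premium_paid = 1.00 * 100 * contracts  # $75,000

for spot in (100.0, 90.0, 87.5, 85.0, 80.0, 0.0):
    pnl = put_spread_pnl(spot, 90.0, 85.0, premium_paid, contracts)
    print(f"spot {spot:>6.1f}  P/L {pnl:>12,.0f}")
```

Whatever the stock does, the loss cannot exceed the premium and the gain cannot exceed the spread width times size minus the premium. That bounded range is what allows the line item to sit outside the dynamically hedged book.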


Investing for path


Venture capital is a strategy that is robust to path. The fact that the portfolio marks are fairy dust helps, but in this context that is not important. Why is venture a strategy that exploits divergence in the tails?

Because from its construction, it admits it doesn’t know much. If you believe you are sampling from start-ups that have a power-law distribution (admittedly a big “if”), then the correct strategy is indeed to “spray and pray”6.

Byrne Hobart piggybacks Jerry Neumann in his explanation:

One of my favorite blog posts on venture returns is Jerry Neumann’s power laws in venture. His key point is that if venture returns follow a power-law distribution, average returns rise indefinitely as you get a bigger sample set. There is no well-defined mean! If you measure adult height, you quickly converge on 5’9” for American men and 5’4” for American women. You will find outliers, but they’re equally common at both ends of the distribution. But if you measure startup investing returns, you’ll keep getting tripped up: flop, failure, failure, flop, Google, fad, fraud, freaky scandal, Facebook…

Does this imply that the ideal strategy for venture is to invest in as many companies as possible? If you’re sampling from a power-law distribution, that’s what you should do. 
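
The "no well-defined mean" claim can be illustrated without any randomness. The sketch below is an assumption-laden toy: it models returns as a Pareto distribution, uses evenly spaced quantiles as a noise-free stand-in for a sample of size n, and picks tail exponents of 1 and 3 purely for contrast. None of those choices come from Neumann's or Hobart's posts.

```python
def pareto_quantile(u, alpha, x_min=1.0):
    """Inverse CDF of a Pareto distribution with tail exponent alpha."""
    return x_min * (1.0 - u) ** (-1.0 / alpha)


def idealized_sample_mean(n, alpha):
    """Mean of n evenly spaced quantiles, a stand-in for a sample of size n."""
    return sum(pareto_quantile((i + 0.5) / n, alpha) for i in range(n)) / n


for n in (10, 100, 1000, 10000):
    heavy = idealized_sample_mean(n, alpha=1.0)  # mean is undefined
    thin = idealized_sample_mean(n, alpha=3.0)   # true mean is 1.5
    print(f"n={n:>6}  alpha=1 mean={heavy:6.2f}  alpha=3 mean={thin:5.3f}")
```

With a tail exponent of 3 the average settles near 1.5 almost immediately. With a tail exponent of 1 it keeps climbing as the sample grows, because a larger sample reaches further into the tail, which is the argument for sampling as many companies as possible.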

Lux Capital partner Josh Wolfe’s approach epitomizes the spirit of searching for gold in the tails. On Invest Like The Best, he explained his investing beliefs:

  • Confident that curiosity, following leads, and relentlessness will lead you to the next idea.
  • Confident you won’t know when or how you happen upon the idea.
  • Confident that the idea lies in the edges of companies that are doing innovative things, often from first principles or science, and very few people are looking there.

These principles propagate from a commitment to benefitting from optionality and positive convexity of non-linear relationships. 

The key line follows:

When analyzing how they found deals it only made linear, narrative sense after the fact.

This is reinforced in On Contrarianism, where I quote Wolfe as well as Marc Andreessen and trader Agustin Lebron on why the best investments start out controversial. The gist is that an idea must be so radical and far-fetched that it doesn’t get bid up while also being possible. The intersection of great ideas after-the-fact that sound dumb before-the-fact is nearly invisible. Most ideas people think are dumb are, indeed, dumb. Venture understands this and systematically wraps a sound process around a low hit rate. 

“Gorilla” Investing

Gorilla investing is another strategy designed to look like a long option. The gist of it is to invest an equal amount in a list of candidates that are competing for a giant market. As the winners start pulling away, you shed the losers and reallocate the proceeds back into the winners. 

Since it rebalances away from losers into winners, it explicitly bets against mean reversion. It’s a divergent strategy that growth investors employ in winner-take-all sectors7.
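
The rebalancing loop can be sketched in a few lines. This is a toy under stated assumptions, not the procedure from the book: the `cut_ratio` rule (sell anything worth less than half the current leader), the pro-rata reinvestment, and the tickers and returns are all hypothetical.

```python
def gorilla_step(holdings, returns, cut_ratio=0.5):
    """One period: apply returns, shed laggards, reinvest the proceeds in survivors.

    cut_ratio is a hypothetical rule: sell any name whose value is below
    cut_ratio times the value of the current leader.
    """
    marked = {name: value * (1.0 + returns[name]) for name, value in holdings.items()}
    leader_value = max(marked.values())
    survivors = {n: v for n, v in marked.items() if v >= cut_ratio * leader_value}
    survivor_total = sum(survivors.values())
    proceeds = sum(marked.values()) - survivor_total
    # Reinvest pro rata to current value, so the biggest winner gets the most.
    return {n: v + proceeds * v / survivor_total for n, v in survivors.items()}


# Equal dollars in four hypothetical candidates.
holdings = {"A": 100.0, "B": 100.0, "C": 100.0, "D": 100.0}
returns = {"A": 1.0, "B": 0.2, "C": -0.5, "D": -0.8}

print(gorilla_step(holdings, returns))
```

After one step the two laggards are gone and their proceeds have been added to the winners in proportion to size. The portfolio is worth the same in total, but exposure to the leader has grown.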

The strategy requires extensive judgment, but I highlight it as another example of an investing algorithm with roots in epistemic humility. If you want to learn more about this strategy see the notes for Gorilla Game or pick up the book. 


Like venture or Rohit’s advice on recruiting, gorilla investing casts a wide net from a sufficiently narrowed field and lets attrition decide where to allocate more. In Where Does Convexity Come From? I explain that the essence of convexity is a non-linear p/l resulting from a change in your position size in the same direction as the return of your position. Your exposure to a winning trade grows the more it wins. 
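
That definition of convexity can be checked with simple arithmetic. The sketch below assumes a hypothetical sizing rule (start with 100 shares and add 10 shares for every $1 the price is above the start, or cut 10 for every $1 below) and compares it with a static position over the same paths. The numbers and function name are illustrative only.

```python
def path_pnl(prices, base_shares=100.0, shares_added_per_dollar=0.0):
    """P/L along a price path when the position is resized after each move.

    shares_added_per_dollar is a hypothetical sizing rule: a positive value
    means adding after gains and cutting after losses.
    """
    start = prices[0]
    pnl = 0.0
    for prev, curr in zip(prices, prices[1:]):
        shares = base_shares + shares_added_per_dollar * (prev - start)
        pnl += shares * (curr - prev)
    return pnl


up = [100 + i for i in range(11)]    # 100 -> 110 in $1 steps
down = [100 - i for i in range(11)]  # 100 -> 90 in $1 steps
chop = [100, 101, 100]               # round trip

for name, path in (("up", up), ("down", down), ("chop", chop)):
    static = path_pnl(path)
    dynamic = path_pnl(path, shares_added_per_dollar=10.0)
    print(f"{name:>4}: static {static:8.0f}  dynamic {dynamic:8.0f}")
```

On the trending paths the resized position makes more on the way up and loses less on the way down than the static one. On the round trip it loses a little while the static position breaks even, which is the cost of betting against mean reversion.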

Byrne writes:

Since venture success is defined by dealflow, i.e. by whether or not you have a chance to invest in the hottest companies, the main function of the Series A investment is to get a chance to invest in Series B and Series C and so on. Arguably, the better the fund, the more of its real value today consists of pro-rata rights rather than the investments themselves.

That’s a general case of positive convexity: the better the situation, the higher your exposure.

This is the essence of capturing the upside when our signals struggle to parse winners from an exclusive field. If we cannot predict what will happen in the tails, the next best thing is the ability to increase our exposure to momentum when it’s going our way. This begins with humility and funneling wider than our instincts suggest. From that point, we let actual performance provide us with incremental information on what works and what doesn’t.

Contrast this with a model that takes itself more seriously than tail correlations warrant. The model is filtering prematurely. We don’t look for tomorrow’s star athletes amongst the best 8-year-olds because we know puberty is a reshuffling machine.  

Keep in mind:

  • Correlations break down or invert in the extreme.
  • Make your selections robust to path, or possibly able to take advantage of it. 
  • Systematize finding gold in diversity. There’s a decent chance others won’t be looking there. 

Happy prospecting!