Reasoning Through A Housing Trade Out Loud

Today I’ll share a personal investing story. It’s in the thinking-out-loud category. I can see the spots where someone could say “that’s stupid” (don’t let that deter you from pointing them out). And that’s why I want to share it — this is the messy process of making a decision. It’s imprecise. It has more “vibes” than I’m supposed to admit. But at the end of the day, there’s an irreducible amount of “putting your finger in the air” with most investing decisions.

The Housing Trade

At the start of 2022, I felt housing might be screwed. Home prices and inflation were red-hot and the risk of the Fed’s hand being forced to raise interest rates was beginning to materialize. Mortgage payments were extra sensitive to bond duration math if rates were to start lifting from such a low base. This would slow housing demand. On the supply side, there were still materials and labor supply shortages. Superficially this is bullish housing but that was already in the price. Looking ahead, this combination felt (notice the vibes…I’m not looking any data up. It’s pure staring out the window) like it could destroy demand. The idea of demand destruction reverberates from my oil trading past. OPEC doesn’t optimize for the maximum price the way you might expect from a cartel. They can be quick to supply the market because they don’t want to kill their customers. Sure a high price means the inventory in the ground is worth more but the business of producing oil, the business that enjoys a multiple, is burnt toast.

The most vulnerable part of the stack felt like the homebuilders because, like an oil refiner, they sit in between the raw materials and the finished goods. They would be squeezed on both sides. Cancellations + high costs.

I pulled up a chart in March of 2022 (this is what it looks like through this weekend of course).

Since the beginning of the year, in less than 90 days, XHB underperformed SPY by nearly 20%.

The market was well ahead of me. Dammit. It appears there’s nothing to do. In the liquid market at least.

I had 2 ideas that could be applied to stale markets.

  1. Decline to invest in the next batch of Austin flips. We had been bankrolling a friend’s short-term flips in Austin since the pandemic. We were just receiving our return from the most recent one and while we’d normally just re-invest, we took a break.
  2. Sell the house we bought in Texas the prior summer. We had a renter in place and we still hadn’t owned the house for a year (meh, short-term gains). We asked our realtor what he thought the house could fetch and he indicated the market was still hot. He thought we could get 35-40% more than we paid the prior July (which is really nuts since the house had already appreciated since the pandemic and our purchase price was a 12% overbid to the listing price). The realtor’s number sounded optimistic but looking at comps I thought there was maybe a 15-20% chance of catching his number, and in most other cases we’d still get some kind of quick profit. But I wasn’t really pricing it off profit. I was worried about risk. The cap rate would be terrible if rates went up even 1% and since we were committed to CA we didn’t want the property anymore anyway. The liquid markets were a sell signal. The illiquid market was lagging.

A family with small kids and another on the way was renting the house so our ability to move quickly was a bit hampered with respect to showing but we did get the house on the market by April. We immediately caught a bid above our ridiculous asking price! 2 days later, the stock market dove. Yinh and I were convinced they would back out.

We were right. A day later we got the call. They’re out. Apparently, their financial advisor told them to cancel. I feigned annoyance while secretly thinking “smart advisor”.

Skipping ahead, we cut the price and caught one single bid. But we needed to agree to a long closing period. We’d wake up every day “please no whammy”.

It finally closed in October. We made a touch over 20% before commission which felt so lucky. By now it was also a long-term capital gain.

But what do we do with the cash?

Rebalance.

You sell the thing up 20%, what’s on sale to buy?

We would reallocate the cash to stocks on a relatively vol-neutral basis (if we sold $1 of house, maybe buy about $.50 of stocks if we think stocks are twice as volatile as residential RE).
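Here’s a minimal sketch of that vol-neutral sizing. The volatility numbers are placeholders I made up for illustration (the only thing the text assumes is that stocks are about twice as volatile as residential RE), and `vol_neutral_notional` is a hypothetical helper name.

```python
def vol_neutral_notional(sold_notional, sold_vol, bought_vol):
    """Dollars of the new asset that carry the same volatility-dollars as what was sold."""
    return sold_notional * sold_vol / bought_vol

# Assumed, illustrative vols: only their 2:1 ratio comes from the text.
housing_vol = 0.10
stock_vol = 0.20

house_sold = 1.00  # per $1 of house sold
stocks_to_buy = vol_neutral_notional(house_sold, housing_vol, stock_vol)

print(f"Per ${house_sold:.2f} of house sold, buy about ${stocks_to_buy:.2f} of stocks")
```

If you think stocks are three times as volatile as housing, the same function says to buy about a third as many dollars of stocks. The ratio of vols does all the work.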

But there was another risk on my mind.

Being renters ourselves we were effectively “short” or underweight housing after selling the property. From a liability-matching investing lens, this was unsettling. Conveniently, the homebuilders were now down about 40% compared to SPY — the thing I wanted to short a few months earlier I now wanted to buy because it filled a risk hole AND was pricing in pain. So we put 1/4 of the proceeds from the house sale into IWM and 1/4 into XHB.

(I just cut half the position a couple weeks ago as we reduced our net equity exposure and rolled into T-bills. I keep our equity exposure in a band and I chose to sell XHB based on its outperformance.)

Things I believe

  1. Markets are smart. Liquid markets adjust quickly.
  2. My life’s work is not figuring out what prices are right, so my allocations are driven by desired exposures or non-exposures to risk. That’s the best I can do given how much time I am willing to spend thinking about things I have no control over.
  3. Within that framework, my choice of diversified exposure is relative value voodoo and vibes. But you know what…even in my professional trading that was true. In that case, my life’s work was to measure option prices at much closer resolutions than anything I’m doing here, but pulling the trigger felt pretty much the same. What’s the liquid market telling me about fair value and what do I do with that info? Any individual trade is noise, but if I’m disciplined about risk then no decision carries the risk of the whole portfolio and the framework is left to converge to its logic over time: capture a risk premium without mortally wounding yourself along the way.
  4. Luck will betray you one day so enjoy it when she smiles upon you. We felt like we caught the last bid in America on that house. If we listed the house a few months earlier (which we might have except for the complications with the tenant; it was no one’s fault, just a matter of details) we would have been extra lucky, but we would have gotten a worse price on our stock buys. And if we don’t make the sale? Pain parade. We miss the profits, don’t get to rebalance, and I curse myself for getting into an illiquid asset. I hate illiquidity already. As I get more experience, I want to rule out illiquidity more and more. Ruling-in needs to be for a justifiably unique exposure. The option to rebalance has a value. Whether you choose to ignore it or not is up to you. See How Much Extra Return Should You Demand For Illiquidity?

A note on taxes

We will pay LT gains taxes of about 30% between Fed and CA. Why not 1031 exchange? Well, I thought real estate prices would be too sticky (ie they won’t come down enough) before our 6-month window to close on a new property. I expected wide bid-asks as sellers locked into low fixed rates try to wait out market weakness. I didn’t want to sell something up 20% to buy something down 5 or 10% when I could buy something down 40% (which is more standard deviations — again, think in vol-adjusted terms. This is also why buying high growth wasn’t attractive even if they were down more than housing…they are higher vol plus the skew in their distributions means volatility is understating the risks — that’s a post for another time).

More generally, let’s examine the math of 1031 tax savings. Imagine the house I sold went from $800k to $1mm. My tax liability is about 30% of 200k or $60k. But the brokerage cost of what I buy on the backend is pretty close to that (5% of $1mm when I eventually sell the 1031 property). It’s true that the cost is deferred but the cost is also inflation indexed since it’s a percent of the home value. You are not saving nearly as much as you think because you are forced into a high transaction cost asset and the cost is a percentage of the entire asset value, not just the profit.
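A quick sketch of that comparison, using only the numbers from the paragraph above. The one thing I’m adding is an assumption that the 1031 replacement property is bought for the full $1mm and then appreciates by some amount before it is eventually sold; `exit_commission` is a hypothetical helper name.

```python
purchase_price = 800_000
sale_price = 1_000_000
lt_gains_rate = 0.30    # Fed + CA, from the text
brokerage_rate = 0.05   # commission on the eventual sale, from the text

gain = sale_price - purchase_price
tax_if_sold_outright = lt_gains_rate * gain

def exit_commission(appreciation):
    """Commission on the replacement property (assumed bought at $1mm)
    if it appreciates by `appreciation` before the eventual sale."""
    return brokerage_rate * sale_price * (1 + appreciation)

print(f"Tax on the gain today: ${tax_if_sold_outright:,.0f}")
for appreciation in (0.0, 0.2, 0.5):
    print(f"Commission if the next house rises {appreciation:.0%}: "
          f"${exit_commission(appreciation):,.0f}")
```

The commission scales with the whole asset value, so in this sketch a 20% rise in the replacement home is enough for the back-end commission to match the tax bill.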

[Note 1: If you don’t have to pay that fat state tax and your LT gains rate is closer to 20% then this argument is even stronger.]

[Note 2: This argument is much less compelling if you plan to never sell and get stepped-up basis for your heirs. But you get stepped-up basis on stocks when you die too. But anyway, I’m not in the never-sell camp because the tax tail isn’t going to wag my risk dog. There’s always a price that warrants saying “sold” to. If a HODLer wins they get concentrated. That might be ok for your human capital but that’s not a strategy for a random number generator. And from my unenlightened seat, the market’s job is to set prices for great assets so that they are effectively random. If you disagree, you should invest for a living. I heard you can get rich doing that. Actually, you have a better chance of getting rich by convincing people you could do that.]

It’s Not The Merit It’s The Price

My past self makes me cringe.[1]

I remember a weekend Yinh and I spent in Big Sur before having kids. We stayed at a resort/hotel place for free in exchange for listening to the timeshare spiel. I’m just pushing back on every point, complaining about the math this poor lady on the bottom-of-the-realtor-totem-pole is conveniently ignoring. Looking back, I’m genuinely sorry to have been acting myself in that moment.

When you feel your blood pressure rising you can channel some grace by just thinking of someone you know who would be smooth in that situation. The aspirational move here is just smile and nod. I had the situation exactly backward — it was me who was embarrassing himself, not her with the canned pitch as pushy and nonsensical as it was.

Luckily I have this moon letter thing as an outlet for my teeth-grinding financial complaints. I’m over the timeshare sales thing (well, actually I just pay for a room and save myself the grief. I admit this feels more like a hair dryer solution [2] than addressing the root of my anger) and onto another: I can’t stand when a life insurance salesperson pretends they are doing god’s work by telling me about their widow client’s big settlement. I’m not against buying insurance. I have car insurance and life insurance. But I’m against motte-and-bailey persuasion techniques. If a widow getting paid is deemed a self-congratulatory act of corporate benevolence then Warren Buffett is the priest of puts, a hokey paragon of virtue, backstopping markets with the heart of a patriot. Ok.

Defending life insurance by focusing on the settlements that get paid out is as silly as branding calls sold as income. And for the same reason — there is no consideration of price. Let’s compare:

Defense of insurance: “Look at the settlement the policyholder received. It has so many zeros in it.”

Rebuttal: That would be true even if the insurance cost twice as much. So the issue isn’t whether there would be a settlement, it’s the proposition on the whole.

Defense of covered calls: “The premium you collect is extra income, and if the calls go in-the-money you’ll be happy anyway”

Rebuttal: This would be true if I sold the calls for 1/2 the price that I actually sold them for.

In other words, both of these defenses are empty words because they skirt the defining point:

It’s not the merit of the idea — it’s the price.

The wrong price will ruin any proposition. Ideas without prices are worthless. “It’s a good idea to brush your teeth.” But if brushing your teeth took 8 hours a day, you’re better off pulling them all and getting implants.

“It’s a good idea to get insurance” has the invisible qualifier “assuming the price is reasonable”. From there we can debate “reasonable” and we should. But I assure you the percentage of time spent in a life insurance consultation that’s devoted to decomposing its cost is not commensurate to how important it is in the decision.

Money Angle

Let’s harp on this “merit cannot exist independent of price” idea. We’ll return to insurance for a moment.

The griftiness of insurance sales as a function of complexity is an inverted U curve. Term insurance is not complex, it’s highly competitive and low margin. Private placements, which I’ve written about, are sold to very wealthy people who likely have a CFO-type managing their money. It’s the midwit crowd from all ends of the income spectrum that express their snowflake exceptionalism in exactly the wrong place and end up paying for their agents’ kids’ private school tuition.

Many insurance products are complex and seriously difficult to understand — every now and then I’ll take a hard look at one and just think, “they expect the average person to comprehend what’s actually going on inside this black box?!” And of course, the answer is “no”. That’s actually the point.

Here’s a tip — run away if you can’t understand the insurance product better than the salesperson. This is not as high a bar as you think. Salespeople are experts at sales not financial engineering. If they weren’t selling annuities they’d be selling cars or homes. (It’s a blanket statement so there are exceptions — but you know who will agree with me the most? Nerdy advisors who don’t have perfect teeth. This is the old Taleb bit “surgeons shouldn’t look like surgeons”.)

When I look at insurance products, especially structured products, I look for the options embedded in them. The cost of these options is opaque. Many of them have analogs in the listed options markets, but ultimately the ones buried in insurance policies resemble illiquid flex options with long-dated maturities and substantial padding added to their prices. If you wanted to be rigorous about valuing an insurance policy you’d need to know everything from the value of these hidden options to how much credit risk to discount the various issuers’ policies by. Apples-to-apples comparisons are impossible. This de-commoditizes the products, giving unscrupulous salespeople ample room to practice their dark art.

An aside about options thinking

I know someone who negotiates and prices leases for commercial office space. They work on huge leases with clients like FAANG. One of the things they mentioned was how they would try to embed provisions in leases which were basically hard-to-price options. The person also spent a couple years with an options market-making group and is generally very quantitative — I would use the person for math help regularly.

I also know of a few wildly successful option traders who did quite well in personal RE investing by structuring options with potential sellers (one of these stories was focused on an ex-colleague of mine which was discussed in a certain big city’s media post-GFC).

And one more related bit — an option manager I know is friends with a fund manager who deals exclusively in the pre-IPO share market. This is a class of funds that provide liquidity to late-stage VC portfolio company employees. The manager was able to help the fund manager by showing them how a particular option embedded in their structures was deeply mispriced.

A final aside on the usefulness of option thinking…in Option Theory As A Pillar Of Decision-Making, I include this:

Getting to The Price

A current example of the need to assess a proposition by understanding its price comes from the boom in covered-call ETFs. Jason Zweig of the WSJ recently published:

Why Investors Are Piling Into Funds That Promise Not to Beat the Stock Market (paywalled)

After great returns last year, covered-call funds are all the rage among income-oriented investors. But their high yields aren’t a free lunch.

The article covers the explosion in AUM in covered-call funds like the JPMorgan Equity Premium Income ETF (JEPI) or Global X Nasdaq 100 Covered Call ETF (QYLD).

These ETFs manage roughly $20B and $6B in AUM respectively.

We’ll talk about QYLD because its holdings are published while JEPI is a discretionary, actively managed ETF. (But I still want to know who gets to hungry-hungry hippo those option orders!).

QYLD sells covered calls on the Nasdaq 100. That means it sells a call option while owning the underlying index. If you buy 100 shares of QQQ and sell a call option you could do the same thing. That’s not an argument against this product though. Ease is a valid use case for a product.

More background: it sells the 1-month at-the-money call as opposed to out-of-the-money calls which is what people generally think of with covered-call strategies (when I was just a boy they called these “buy-writes” but I haven’t heard that term since Arrested Development was on the air).

I’ve addressed “selling options for income” as euphemistic, sales-led framing. I’m not necessarily opposed to selling options but when you brand it as “income” you are blatantly misrepresenting reality. You are pretending the option premium is income when the bulk of it is just the fair discounted weighted average of a set of possible futures. My bone with the marketing pitch is that there’s no discussion of price. Again, whether this is a good strategy depends on price and the price isn’t static. (I feel like I’ve force-fed you like foie gras on this topic. If I have to hear about this “strategy” from one more medical professional, I hope I’m sedated on an operating table so I can finally drown it out.)

When the marketers show me the level of implied correlations they are selling in the calls then we can have a good-faith conversation. Or how about when they tell me who the buyer for those calls is? Because I can assure you there’s no natural buyer. The boys and girls buying those calls are only doing so because they are too cheap. They didn’t wake up in the morning and think “I’m not going to look at prices, I just think owning call options that go to zero is a reasonable way to invest my money.” You know what traders are thinking when they see the marketers’ pitch: “Thank you for stocking the pond, we’ll be waiting”.

And they will be waiting. Market-makers are lions in the bush who know the dinner’s migration patterns. Unlike lions, they need to be discreet. You can’t just pounce and scare everyone off. You don’t want to make a scene. So they pre-position.

The market-makers’ pre-positioning serves a dual purpose.

  1. It spreads the market impact over a longer window of liquidity. This is actually pro-social: it’s “markets properly working”. The telegraphed order is not as scary even though it’s a large size because the end of it is known and there’s no adverse selection risk. It’s what’s known as a “dumb” or uninformed order. It’s not reasonable to expect zero market impact because unless there’s someone who wants to buy all these options, the pool of greeks needs to be absorbed by a get-paid-to-warehouse-risk-in-exchange-for-profit entity. The market is just an auction for that clearing price and the greeks dropped on the market will be recycled in adjacent markets emanating from the original disturbance. (I.e. the market makers will buy vega from you and sell it in some other correlated market where the entire proposition presents an attractive relative value play. It’s just a big web. Market-makers are the silk between the nodes.)
  2. You want the option seller to get filled near the offer so they feel good about the fill. That’s what it means to “not leave a scene”. So now that you are short vol 3 days ahead of the anticipated arrival of the order, knowing that the current vol level incorporates the impact of your own selling, you are ready to buy the new supply “in line”. Remember this is not frontrunning. It’s a probabilistic bet. The market-makers have no fiduciary duty to the fund (as opposed to actual frontrunning where the broker trades ahead of an order they control). Market-makers want the brokers to “feel” like they got a good fill. There are no fingerprints. A TCA that looks at execution price vs arrival price is already benchmarked to a mid-market price that has been faded to absorb the flow.

What does this mean for the cost of something like QYLD?

A napkin math approach

Assumptions:

  • At the current AUM, they sell about 5,000 NDX at-the-money call options (equivalent to 200,000 QQQ options) every month.
  • Implied volatility is about 25% so the fund collects 2.89% of the index level [3] in premium monthly. (Can you see how ridiculous it is to call this income? Would you call it income regardless of how little premium it collected? What if the option was in-the-money and they collected the same amount of premium? Conflating premium with income is a timeshare tactic, except it’s pushed by corporations who know better, not Jane “it’s this job or dogfood for dinner” Doe.)
  • The ATM call is pure extrinsic value.
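For anyone wondering where a number like 2.89% comes from: a common rule of thumb prices an at-the-money option at roughly 0.4 × implied vol × √(time to expiry), as a fraction of the underlying. I’m assuming that’s the shortcut behind the figure above (it reproduces it), and `atm_premium_fraction` is a hypothetical helper name. It ignores rates and dividends.

```python
import math

def atm_premium_fraction(implied_vol, years_to_expiry):
    """Rule-of-thumb ATM option price as a fraction of the underlying.

    0.4 is a rounded 1/sqrt(2*pi); the shortcut ignores rates and dividends.
    """
    return 0.4 * implied_vol * math.sqrt(years_to_expiry)

monthly_premium = atm_premium_fraction(implied_vol=0.25, years_to_expiry=1 / 12)
print(f"1-month ATM call at 25% vol: {monthly_premium:.2%} of the index level")
```

Notice the premium is linear in vol under this approximation. That’s what makes it easy to translate vol slippage into premium lost.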

The question is how much vol slippage can we expect on that order. I asked around and a full vol point seems like a reasonable estimate. Because of the “setting the table” pre-positioning effect it’s hard to get a perfect answer. So we’ll use 1 vol point and you can adjust the final analysis by changing it.

If there is 1 full vol click of slippage and the option you sell is pure extrinsic, then you are losing:

1 vol point / 25 vol points x 2.89% of AUM x 12 months in annual slippage.

That’s 139 bps in annual slippage. That needs to be added to the 60 bp expense ratio for the fund.

So you are paying 1.99% per year for a beta-like exposure created with vanilla products. And the alleged income is not income. It’s a correctly priced option premium in one of the most liquid equity index markets in the world.
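Since the slippage estimate is the squishiest input, here’s the napkin math as a small sketch so you can swap in your own number. The inputs are the ones from the assumptions above; the function names are hypothetical. It leans on the option being pure extrinsic, so premium scales linearly with vol.

```python
IMPLIED_VOL_POINTS = 25.0     # from the assumptions above
MONTHLY_PREMIUM_PCT = 2.89    # percent of AUM collected each month
EXPENSE_RATIO_BPS = 60.0      # fund expense ratio
ROLLS_PER_YEAR = 12

def annual_slippage_bps(vol_slippage_points):
    """Annual cost of selling the monthly call `vol_slippage_points` too cheap."""
    share_of_premium_lost = vol_slippage_points / IMPLIED_VOL_POINTS
    monthly_loss_pct = share_of_premium_lost * MONTHLY_PREMIUM_PCT
    return monthly_loss_pct * ROLLS_PER_YEAR * 100  # percent -> basis points

def total_cost_bps(vol_slippage_points):
    """Slippage plus the stated expense ratio."""
    return annual_slippage_bps(vol_slippage_points) + EXPENSE_RATIO_BPS

for slippage in (0.5, 1.0, 2.0):
    print(f"{slippage} vol points: slippage {annual_slippage_bps(slippage):.0f} bps, "
          f"all-in {total_cost_bps(slippage):.0f} bps")
```

At half a vol point the all-in cost is still roughly double the headline expense ratio, which is the point: the fee you see is not the fee you pay.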

Even if I grant you a 10% VRP (the variance risk premium is the idea that options are bid beyond their fair value for any number of reasons, like convexity preference, hedging demand, or the possibility that markets set prices according to efficient portfolios, so a single asset that looks mispriced in isolation might not be mispriced from a portfolio point of view), that means the alleged income is 10% of what the marketers claim.

This whole trend in covered-call ETFs feels more like an innovation for getting paid for commoditized exposures in a fee-compressed landscape than an innovation that actually improves investing outcomes.


An (Overly) Candid Opinion

I’m not some socialist arguing against giving people an abundance of choice. I just want to remind you that no smart-sounding idea gets a free pass without consideration of its cost. And my own wholly personal opinion is you are paying a lot for convenience here. Plus the more AUM these things get the worse the slippage.

A saying I repeat too much: Asset management is the vitamin industry. It sells placebos. It sells noise as signal.

The proliferation of option products seems like something devised by products people not alpha people, a complaint I’d charge against most of the asset management world (which probably means I’m being too harsh but also I’m not criticizing any single firm — I don’t even know anything about these large fund companies because they were not part of my career genealogy. To me, they were always just the names of customers). Another reason I should be softer on all this is that, in aggregate, active management is critical. But there’s a paradox of thrift thing where we should (and this is dark) encourage it for others but not subscribe ourselves.

If you are truly obsessed and love investing then you can figure out your own way and maybe I’m just a faint admonishing voice in the background that you mostly ignore (I do hope I help you think better around the edges at least). But for the casual investor who’s targeted by pitches and thinks they are missing out, you are given permission to live FOMO-free. There’s nothing to see except a midwit trap.

[And definitely don’t look at these. Gag me.

Actually, any TSLA options mm wants to gag me for raining on their parade. That should tell you something.]

Spending As Self-Discovery

There are a couple of ideas that tug on me with a force I haven’t felt before my experience in the last 2 years.

Regarding my career

It’s become more of a priority to sustain a living in a way that feels deeply tied to others. There are a lot of ways to make money, but finding a way that doesn’t feel alienating, finding a way that feels like I’m lifting others is harder to pull off. To be clear, any constraint limits your options. Expectancy-wise I’m paying for this preference the same way a remote worker might be taking a pay cut compared to the counterfactual. But this is only a concern if I measure expectancy only in dollars. That’s falling into a trap of letting legible accounting dictate all of what matters. You know what’s more baller — giving your soul a bank account and finding a way to stack it. That means more energy for everyone else around you.

Look, if you have the luxury of reading a Substack at your desk, then simply making money is easy. You just have to be good at something. By definition, mediocrity is everywhere. It’s a low bar for ambition. Making enough money on your terms takes either top 1% talent or being good at something plus courage. If you stop at just being above average at something, some overlord will always stand ready to make you part of their portfolio. But you’ll just be another asset to be rebalanced on their schedule.

You don’t become irreplaceable until you display how you are different (I’m obviously talking about the differences that are helpful while remaining aware that many strengths are weaknesses in other contexts). You cannot do this from a place of fear. You can’t maximize opportunities if you can’t afford to say no. So it stands to reason that you don’t want to build your life in a way that makes you unable to say no. That’s the real trap of conformity. Conformity isn’t clothing. It’s not even what you say necessarily. It’s relegating your agency for comfort to someone with a different mix of values because you were too lazy to identify your own. And the irony is they probably did the same.

[It’s out of scope for this post, but this idea is deeply intertwined with learning which I believe is really about agency and freedom from conformity. And I don’t mean that in some truther way which is just conformity sold as alternative. Like what “grunge” became. And I say this as someone with all the albums.]

Spending Money Strategically

I mentioned that emotionally moving was expensive. It’s expensive to transact real estate and travel. But the costs were worth the information. That’s a concept that exists in poker or even trading. For example, you can “fish” mid-market by dangling a one-lot to see where the bots live (and to defend yourself against such tactics, in a dark pool for example, you can define a minimum trade size).

In The Art and Science Of Spending Money, Morgan Housel offers a menu of ways to spend money that you may not have considered. #10 explains why:

Not knowing what kind of spending will make you happy because you haven’t tried enough new and strange forms of spending.

Evolution is the most powerful force in the world, capable of transforming single-cell organisms into modern humans.

But evolution has no idea what it’s doing. There’s no guide, no manual, no rulebook. It’s not even necessarily good at selecting traits that work.

Its power is that it “tries” trillions upon trillions of different mutations and is ruthless about killing off the ones that don’t work. What’s left – the winners – stick around.

There’s a theory in evolutionary biology called Fisher’s Fundamental Theorem of Natural Selection. It’s the idea that variance equals strength, because the more diverse a population is the more chances it has to come up with new traits that can be selected for. No one can know what traits will be useful; that’s not how evolution works. But if you create a lot of traits, the useful one – whatever it is – will be in there somewhere.

There’s an important analogy here about spending money.

A lot of people have no idea what kind of spending will make them happy. What should you buy? Where should you travel? How much should you save? There is no single answer to these questions because everyone’s different. People default to what society tells them – whatever is most expensive will bring the most joy.

But that’s not how it works. You have to try spending money on tons of different oddball things before you find what works for you. For some people it’s travel; others can’t stand being away from home. For others it’s nice restaurants; others don’t get the hype and prefer cheap pizza. I know people who think spending money on first-class plane tickets is a borderline scam. Others would not dare sit behind row four. To each their own.

The more different kinds of spending you test out, the closer you’ll likely get to a system that works for you. The trials don’t have to be big: a $10 new food here, a $75 treat there, slightly nice shoes, etc.

Here’s Ramit Sethi again: “Frugality, quite simply, is about choosing the things you love enough to spend extravagantly on—and then cutting costs mercilessly on the things you don’t love.”

There is no guide on what will make you happy – you have to try a million different things and figure out what fits your personality.

Recently, I’ve seen a wild example of this.

My wife has a close friend that was in the rat race grind. After joining Yinh on the board of OrFA, the friend has been regularly visiting the orphanage in Vietnam. She’s a tall blond, as American as apple pie. A stranger in a strange land in the Vietnamese countryside. But she found herself deeply moved by not only the children but the local culture and all its people. She has dramatically re-arranged her entire life to prioritize her involvement and presence in Vietnam. Witnessing the impact on an otherwise familiar, professional life that started with a donation and some curiosity has been a powerful frame shake. (My family, 13 of us in total, are going to Vietnam for a few weeks this year and have already planned a soccer match between all the kids at the orphanage. If my kids moan about what’s for dinner after that trip, they’re getting lit the f up).

This is a reminder. You can just do things. You can’t introspect your way to knowing what you want. It’s too much projection of your current self into a different reality. It doesn’t recognize that the feedback changes you.

Spending money in a new way is just another method to try on different versions of yourself. To explore personal frontiers on the not-so-crazy lark that there are deeply rewarding modes to explore the world that you are completely ignoring. Your fixation on the crowded paths your surroundings have directed you towards might be frustrating, not because of any personal failings, but because those paths are overbid (look no further than the college admission Hunger Games).

If you can’t be happy unless you get that house or that wedding or that title, it’s not because they are your destiny — it’s because you haven’t taken your imagination off-leash.

Using Log Returns And Volatility To Normalize Strike Distances

Basic Review

Consider a $100 stock. In a simple return world, $150 and $50 are each 50% away. They are equidistant. But in compounded return world they are not. $150 is closer. This blog post will progress from an understanding of natural logs to normalizing the distance of asset strikes.
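To make the asymmetry concrete, here’s a small sketch comparing the simple and log distances of the two strikes. The last two lines back out which strikes would actually be equidistant from $100 in log terms.

```python
import math

spot = 100.0

def simple_return(strike):
    return strike / spot - 1

def log_return(strike):
    return math.log(strike / spot)

up_log = log_return(150.0)    # about +0.405
down_log = log_return(50.0)   # about -0.693

for strike in (150.0, 50.0):
    print(f"${strike:.0f}: simple {simple_return(strike):+.1%}, log {log_return(strike):+.3f}")

# Strikes that mirror each other in log space around $100
mirror_of_150 = spot * math.exp(-up_log)    # 100 / 1.5
mirror_of_50 = spot * math.exp(-down_log)   # 100 / 0.5
print(f"Log-mirror of $150 is ${mirror_of_150:.2f}; log-mirror of $50 is ${mirror_of_50:.2f}")
```

So in compounded-return terms, the downside twin of $150 is about $66.67, and the upside twin of $50 is $200.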

Log returns are useful in financial and derivatives modeling because investing contexts usually involve re-investing your capital. In other words, the growth process is multiplicative, not additive. But if it’s multiplicative we find ourselves needing to specify a compounding interval. This is an invitation to attach a cumbersome asterisk to every model.

Logarithms offer an elegant solution. They allow us to standardize an assumption: returns are continuously compounded.

If you are uncomfortable already, these short primer posts will help you catch up. And don’t worry, we will revisit HS math intuitively in this post before getting to the main course.

  • In Examples Of Comparing Interest Rates With Different Compounding Intervals, we saw how to convert back and forth between simple returns and compounded returns by dividing a holding period into different intervals.
  • In Understanding Log Returns we showed how log returns are an extreme case of compounded returns: they assume that compounding occurs continuously. In other words, as you divide the holding period into smaller and smaller intervals, you find a rate that is smaller than the growth rate for the entire holding period. If the growth from $1 to $2 is fixed, then the more compounding periods there are, the lower the rate must be in order for $1 to end up being $2.

Math Class Made Intuitive

You probably remember hearing about the constant e and the natural log from math class. You also repressed it. Because it was taught poorly.

Understanding e

We’ll turn to betterexplained.com:

e is NOT just a number!

Describing e as “a constant approximately 2.71828…” is like calling pi “an irrational number, approximately equal to 3.1415…”. Sure, it’s true, but you completely missed the point. Pi is the ratio between circumference and diameter shared by all circles. It is a fundamental ratio inherent in all circles and therefore impacts any calculation of circumference, area, volume, and surface area for circles, spheres, cylinders, and so on.

e is the base rate of growth shared by all continually growing processes. e lets you take a simple growth rate (where all change happens at the end of the year) and find the impact of compound, continuous growth, where every nanosecond (or faster) you are growing just a little bit. 

e shows up whenever systems grow exponentially and continuously: population, radioactive decay, interest calculations, and more.

Just like every number can be considered a scaled version of 1 (the base unit), every circle can be considered a scaled version of the unit circle (radius 1), and every rate of growth can be considered a scaled version of e (unit growth, perfectly compounded).

So e is not an obscure, seemingly random number. e represents the idea that all continually growing systems are scaled versions of a common rate.

Let’s say our basic unit of time is a year.

e is the constant that says “if I start with $1 and continuously compound at a rate of 100%, how much do I end up with…$2.71828”
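A quick numerical sketch of that statement: split the year into more and more compounding intervals at a 100% annual rate and watch $1 creep toward e. The function name grow_one_dollar is made up for illustration.

```python
import math

def grow_one_dollar(n_intervals: int, rate: float = 1.0) -> float:
    """Terminal value of $1 compounded n_intervals times over one year at `rate`."""
    return (1 + rate / n_intervals) ** n_intervals

# Annual, monthly, daily, then a million slices of the year
for n in (1, 12, 365, 1_000_000):
    print(f"{n:>9} intervals -> ${grow_one_dollar(n):.5f}")

print(f"continuous limit (e) -> ${math.e:.5f}")
```

With one interval you simply double your money. As the intervals shrink, the result converges on 2.71828.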

Understanding the natural logarithm (ln)

It’s true that the natural log is the inverse of an exponential of base e just as logs answer the question “what power do I raise 10 to in order to get to X?”. But defining the natural log as an inverse is circular not intuitive. Again, we turn to BetterExplained. From Demystifying the Natural Logarithm (ln):

The natural log gives you the time needed to reach a certain level of growth.

e and the Natural Log are twins:

e^x is the amount we have after starting at 1.0 and growing continuously for x units of time

ln⁡(x) is the time to reach amount x, assuming we grew continuously from 1.0

If e is about growth, the natural log (ln) is about how much time it takes to achieve that growth.

The Natural Log is About Time

    • e^x lets us plug in time and get growth.
    • ln(x) lets us plug in growth and get the time it would take.

For example:

    • e^3 is 20.08. After 3 units of time, we end up with 20.08 times what we started with.
    • ln⁡(20.08) is about 3. If we want growth of 20.08, we’d wait 3 units of time (again, assuming a 100% continuous growth rate).
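The "twins" idea can be checked in a couple of lines. This is only a sketch using Python's standard math module, with variable names chosen for illustration.

```python
import math

# Plug in time, get growth: 3 units of time at 100% continuous growth
growth = math.exp(3)

# Plug in growth, get time: how long did that growth take?
time_needed = math.log(growth)

print(f"growth after 3 units of time: {growth:.2f}x")
print(f"time needed to reach that growth: {time_needed:.2f}")
```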

Let’s apply e and natural logs to asset returns to understand how to normalize distances.

Normalizing Distance

Let’s return to the $100 stock. We said $150 is closer than $50 in the world of compounding. Let’s assume our growth occurs over 3 years. Here’s a summary of simple returns vs annually compounded returns (or CAGR):

So far so good. The compounded returns are lower than the simple average return. Since log returns are just compounded returns sampled continuously we’d expect them to be even lower.

The total log return is indeed lower than the total simple return.

We can also see that in logspace -50% total return is “further” away than up 50%. This is the first encounter we get with the concept of distance where we see that 50% in either direction is not the same. But by the end of this post, you will learn how to normalize even 2 log returns that look the same, but don’t mean the same thing.

But before that, we need to complete our understanding of log returns. We saw that the 3-year total log returns are lower than the 3-year total simple returns. To take the next step, I pose the question:

Can you compute the annualized log returns?

Pattern-matching the computations for average simple returns and CAGR, it appears we have 2 choices respectively:

  1. Total log return / 3 or
  2. (1 + Total log return)^(1/3) – 1

Remember what e and ln mean in the first place:

The expression e^x is a total quantity of growth. It’s actually assumed to be e^(1 * x) where the 1 represents 100% continuously compounded growth and x represents a unit of time. The natural log or ln(e^x) then solves for how much time (ie x) it took to arrive at the total quantity of growth assuming 100% continuous compounding.

A key insight is that we don’t need to assume a 100% rate and x to be time. We can simply think of x as the product of “rate multiplied by time”. This allows us to substitute any rate for the assumed rate of 100% to find the time. Once again we turn to BetterExplained:

We can use their logic to return to our question: Can you compute the annualized log returns from these total 3-year  log returns?

Down Case:

log return = -69%

rate x time = -69%

rate x 3 = -69%

The annualized rate must be -23.1%

To annualize log returns, we simply take the total log return and divide by the number of years!

The complete summary table:

All is right in the world…the more compounding intervals we divide the total period into the lower the return must be. Continuous compounding represents the most intervals we can slice the period into and therefore it is the smallest rate.
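For anyone who wants to rebuild the summary numbers, here is a minimal sketch for the $100 stock over 3 years. The helper return_summary and its labels are my own naming, not anything from a library.

```python
import math

def return_summary(start: float, end: float, years: float) -> dict:
    """Simple, annually compounded, and log returns for one start/end pair."""
    total_simple = end / start - 1
    total_log = math.log(end / start)
    return {
        "total simple": total_simple,
        "simple avg / yr": total_simple / years,
        "CAGR": (end / start) ** (1 / years) - 1,
        "total log": total_log,
        "log / yr": total_log / years,  # rate x time = total log return
    }

for end in (150, 50):
    row = return_summary(100, end, 3)
    print(end, {label: f"{value:.1%}" for label, value in row.items()})
```

In both the up and down cases the annualized log return comes out lowest, the CAGR sits in the middle, and the simple average is highest.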

Recapping so far:

  • Compounded rates are lower than simple rates for the same total return
  • Log returns are convenient measuring sticks because we just assume continuous compounding
  • e tells us how much continuously compounded growth we get if we know the time period and rate
  • The natural log can tell us:
    • How much time we needed at a given rate to achieve that growth
    • What rate we needed for a given time period to achieve that growth

Normalizing Distances For Volatility

Let’s return to the $100 stock and assume continuous compounding. What price on the downside is the equivalent of the stock moving up $20? By now, we understand, the equivalent downside move is less than $20 away. Let’s compute the equivalent distances in log space.

ln(120/100) = 18.23%

We solve for a negative 18.23% log return:

ln(x/100) = -18.23%

x/100 = e^(-18.23%)

x = .8333 * 100 = $83.33

If the stock starts at $100 then $120 and $83.33 are equidistant in log space.
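The same steps as a small function, in case you want to try other prices. The name mirror_price is hypothetical.

```python
import math

def mirror_price(spot: float, price: float) -> float:
    """Price on the other side of spot that is equally far away in log space."""
    log_distance = math.log(price / spot)
    return spot * math.exp(-log_distance)

print(f"${mirror_price(100, 120):.2f}")  # the downside twin of $120
print(f"${mirror_price(100, 150):.2f}")  # the downside twin of $150
```

Notice the shortcut hiding in the algebra: the mirror price is just spot squared divided by the original price.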

We want to take this further. To compare distances, especially in different assets, we want to normalize for volatility.

Volatility is just another word for standard deviation. A 10% log return in BTC means a lot less than a 10% log return in 5-year Treasury notes. We should measure log returns in terms of how many standard deviations away a specified amount of growth is. Note, this is exactly what the concept of a z-score is in statistics. It tells us how far away from the mean a particular observation is.

Let’s stick with our $100 stock and give it a volatility of 18.23%.

  • A 1 standard deviation move to the upside in 1 year is $120
  • A 1 standard deviation move to the downside in 1 year is $83.33

If we define K as a strike price, we can back into a general formula for how far K is from the spot price in terms of standard deviations. Let’s define all our variables first:

K = strike price

S = Spot price

σ = volatility

t = time (in years)

We start with an intuitive expression for a Z-score using our variables:

Z = ln(K/S) / (σ√t)

We can confirm this makes sense with numbers from the previous example. We’ll set t to 1 (ie 1 year) and the Z-score is 1 corresponding to 1 standard deviation:

ln(120/100) / (18.23% * √1) = 18.23% / 18.23% = 1

The formula makes sense. In English, it says “divide the distance in logspace by the annualized volatility scaled to 1 year”.
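Here is that English sentence as code. It is a sketch that assumes the expression is ln(K/S) divided by σ√t, which is how I read the description above. The function name strike_z_score is my own.

```python
import math

def strike_z_score(strike: float, spot: float, vol: float, t_years: float) -> float:
    """Log distance from spot to strike, measured in standard deviations."""
    return math.log(strike / spot) / (vol * math.sqrt(t_years))

vol = math.log(120 / 100)  # the 18.23% volatility from the example

print(round(strike_z_score(120, 100, vol, 1.0), 4))    # 1 standard deviation up
print(round(strike_z_score(83.33, 100, vol, 1.0), 4))  # very close to 1 standard deviation down
print(round(strike_z_score(120, 100, vol, 0.25), 4))   # same strike, only 3 months to get there
```

The last line shows why time matters: with a quarter of the time, the same $120 strike is twice as many standard deviations away.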

This simply validated the expression for Z-score. We still want to define any strike price, K, as a function of its volatility and time.

Algebra ensues:

ln(K/S) = Z * σ√t

K = Se^(Zσ√t), which for a 1 standard deviation move (Z = 1) is K = Se^(σ√t)

  • If you input a positive volatility number, the formula spits out what a 1 standard deviation up move is.
  • If you input a negative volatility number, the formula spits out what a 1 standard deviation down move is.

If you recall, the big insight from earlier:

The expression e^x is a total quantity of growth…we don’t need to assume a 100% rate and x to be time. We can simply think of x as the product of “rate multiplied by time”.

This fact can allow us to decompose the Z-score expression to account for the fact that our underlying stock process has both:

  1. a drift component (option theory uses the risk-free rate for reasons that are beyond this post)
    and
  2. a random component drawn from a distribution defined by a mean (spot + drift) and volatility.

Defining the expressions:

  • Risk-free rate or drift = r
  • The mean of the distribution (aka the “forward”) = Se^(rt)
  • The standard deviation scaled to time = σ√t

The Z-score formulas that incorporate drift for 1 standard deviation up and down respectively:

  • K_up = Se^(rt + σ√t)
  • K_down = Se^(rt – σ√t)

[The rate in the e^x portion is part drift and part random. Why do we combine them with addition instead of multiplication? Because the time portion affects each component differently. We can’t double the variance and halve the time because time also factors into the drift (ie the interest rate)]

Let’s wrap with an example, this time including the drift.

Set r = 5% and t = 1

Fwd = 100e^(.05) = $105.13

If we are just considering the one standard deviation around the mean (as opposed to a full standard deviation up or down) this is the theoretical stock distribution:
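The one standard deviation prices around the forward can be computed directly from the K_up and K_down expressions. A sketch, assuming the same 18.23% volatility used earlier. The function name one_sd_strikes is hypothetical.

```python
import math

def one_sd_strikes(spot: float, rate: float, vol: float, t_years: float):
    """Forward plus the 1 standard deviation up/down prices, drift included."""
    drift = rate * t_years
    sd = vol * math.sqrt(t_years)
    forward = spot * math.exp(drift)
    k_up = spot * math.exp(drift + sd)
    k_down = spot * math.exp(drift - sd)
    return forward, k_up, k_down

fwd, k_up, k_down = one_sd_strikes(spot=100, rate=0.05, vol=math.log(1.2), t_years=1)
print(f"forward ${fwd:.2f}, 1 sd up ${k_up:.2f}, 1 sd down ${k_down:.2f}")
```

With these inputs the forward is about $105.13 and the band runs from roughly $87.61 to $126.15. The two strikes are equidistant from the forward in log space, not in dollars.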

What’s the point of all this?

For anyone within sneezing distance of a derivatives desk, these are rudiments. These computations are the meaning behind the Black-Scholes z-scores (d1 and d2) and probabilities. These standardizations are critical for comparing vol surfaces. If you can’t contextualize how far away a price is, you cannot make meaningful comparisons between option volatilities and therefore prices.

If you only trade linear instruments because you are a well-adjusted human then hopefully you still found this lesson helpful. Seeing math from different angles is like filling in the grout in the tiles of your mental processing. You can measure the distance (or accumulated growth, positive or negative) in log space to account for compounding. You can standardize comparisons by using the asset’s vol as a measuring stick. And after all that, if you still don’t enjoy this, you can feel better about your life choices to do work that doesn’t rely on it.

If you do rely on understanding this stuff, hopefully you got e^(.00995) – 1 better today.

Understanding Log Returns

If you draw a simple return at random from a normal (ie bell curve) distribution and compound it over time, the resultant wealth distribution will be lognormally distributed with the center of mass corresponding to the CAGR.

Imagine your total 1-year return is 10%. So your terminal wealth is 1.10.

If you compounded monthly to end up at a terminal wealth of 1.10 we can compute the monthly compounding rate as:

1.10 ^ (1/12) – 1 = .797% per month or annualized (ie x12) = 9.57%

Let’s instead compound daily to end up with a terminal wealth of 1.10.

1.10 ^ (1/365) – 1 = .026% or annualized (x365) = 9.53%

The more frequently we compound while keeping the total return the same the lower the compounded rate or average rate that prevails to get us from initial to terminal wealth.

Log returns are returns compounded continuously (as if you were going to compound even more frequently than every single second but at a tiny rate). When we annualize that rate as we did in the prior examples we end up with a log return.

Or simply:

Ln(1.10) = 9.53%

Similar after rounding to just compounding daily.
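To watch the annualized rate fall toward the log return as the compounding interval shrinks, here is a short sketch. The helper annualized_rate is my own naming.

```python
import math

def annualized_rate(total_return: float, periods_per_year: int) -> float:
    """Per-period compounded rate times periods per year, for a 1-year total return."""
    per_period = (1 + total_return) ** (1 / periods_per_year) - 1
    return per_period * periods_per_year

# Annual, monthly, daily, and every-second compounding to reach the same 1.10
for n in (1, 12, 365, 365 * 24 * 3600):
    print(f"{n:>10} periods per year: {annualized_rate(0.10, n):.4%}")

print(f"log return ln(1.10):        {math.log(1.10):.4%}")
```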

Let’s say your $1 grows to $1.50 after 1 year, then

  • your simple return is 50%
  • your log return is ln(1.5) = 40.5%

This chart reveals 2 facts:

  1. Log returns are always smaller than simple returns just as compounded returns are lower than simple returns. This makes sense because log returns are just compounding where the interval between compounding is reduced to zero so it takes a lower rate applied more frequently to get to the same total return.
  2. Higher volatility (ie the larger changes) means a wider gap between the simple and log return. Again, reminiscent of the formula relating geometric and arithmetic returns.

The chart raises a question. We know that volatility increases the gap between simple and compounded returns but why is this exacerbated on the downside? There was nothing in the formula (CAGR = Arithmetic Mean – .5 * σ²) that points to any such asymmetry.

The answer lies in an illusion.

In the chart, 1.5 and .5 appear to be equidistant away. They are both 50% away, right?

That’s true…but only in simple terms!

In compounded terms, .50 is “further away” than 1.5.

A thought exercise will make this clear:

If I start at 100 and can only move in increments of 10%, I can get to 150 in 5 moves.

100 * 1.10 * 1.10 * 1.10 * 1.10 * 1.10 = 161

But on the downside, compounding by a fixed amount means more moves to cover the same absolute distance.

100 * .9⁵ = 59

In fact, I need 2 more moves to “cross” 50. With 7 moves I finally get to 47.8
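The thought exercise is easy to replay in code. A minimal sketch with a made-up helper, moves_to_cross, that counts fixed-percentage steps until the target is reached or passed.

```python
def moves_to_cross(start: float, target: float, step: float) -> int:
    """Count fixed-percentage moves of size `step` needed to reach or pass `target`."""
    price, moves = start, 0
    going_up = target > start
    factor = 1 + step if going_up else 1 - step
    while (price < target) if going_up else (price > target):
        price *= factor
        moves += 1
    return moves

print(moves_to_cross(100, 150, 0.10))  # 5 moves up
print(moves_to_cross(100, 50, 0.10))   # 7 moves down
```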

The chart masks the fact that in logspace .5 is much further than 1.5 and therefore to have moved 50% from the start the volatility (ie the move size) must have been higher. And that’s exactly what the log returns show:

Price   Simple Return   Log Return
50      -50%            -69%
150     50%             41%

$50 is further away in logspace corresponding to a higher compounded volatility. If the volatility is higher, the gap between the simple and log-returns is wider.

Application to options

The analogy to options: the x-axis in this chart plays the role of strike prices because the points are equal absolute distances apart. They are not equidistant in logspace!

We make the x-axis equidistant in logspace by making the log returns 10% apart.

Now we can chart the log returns on the x-axis. The distance of each total return from the diagonal shows the divergence between the log returns and simple return. It widens as you expect as we get to larger move sizes, but the chart is more symmetrical because the distance between the “strikes” is now normalized to compounded returns. 
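If you want to generate "strikes" that are evenly spaced in logspace yourself, a sketch like the following works. The function log_spaced_strikes and the choice of three strikes on each side are assumptions for illustration.

```python
import math

def log_spaced_strikes(spot: float, log_step: float, n_each_side: int) -> list:
    """Strikes whose log returns from spot are evenly spaced by `log_step`."""
    return [spot * math.exp(k * log_step) for k in range(-n_each_side, n_each_side + 1)]

for strike in log_spaced_strikes(100, 0.10, 3):
    log_ret = math.log(strike / 100)
    simple_ret = strike / 100 - 1
    print(f"strike {strike:7.2f}   log return {log_ret:+.0%}   simple return {simple_ret:+.1%}")
```

The dollar gaps between strikes get wider as you move up and narrower as you move down, even though every step is the same 10% in log terms.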

Geometric vs Arithmetic Mean In The Wild

Review

In ‘Well What Did You Expect’? we learned:

  • Mathematical “expectation” is a simple average or arithmetic mean of various outcomes weighted by their probability
  • Arithmetic means are familiar. Your average score in a class is the sum of your test scores divided by the number of tests. If you score 85, 90, 98  your average for the class is:  (85+90+98)/3 = 91

    Note the scores are weighted equally. Here’s what the number sentence looks like without factoring out the 1/3:

    .33 * 85+ .33 * 90 + .33 * 98 = 91

    If the final test is worth 50% of the total grade the weighted average is computed: .25 * 85 + .25 * 90 + .50 * 98  = 92.75

    Whether we are weighting the results equally or not, we are still computing the average by summing, then dividing.

  • Geometric means are like arithmetic means except quantities are multiplied instead of summed. Since investing is the process of earning a return and reinvesting the total proceeds we are multiplying, not summing results. If you invest $100 at 10% for 5 years your final wealth is given by:

    $100 * (1.10) * (1.10) * (1.10) * (1.10) * (1.10)  or simply $100 * (1.10)⁵ = $161.05

    In life, we often know the ending amount and the initial investment but want to know “what was my average growth rate per year?”

    The answer to that question is not the simple arithmetic average but the geometric average because we were re-investing or multiplying our capital each year by some rate. That rate is known as the CAGR or “compound annual growth rate”

    If we start with $100 and have $161.05 after 5 years we compute the geometric average in an analogous way to arithmetic averages, but instead of dividing by the number of years, we take the Nth root of our total growth where N is the number of years we compounded for.

    CAGR for 5 years = ($161.05/$100) ^ (1/5) -1 = 10% 

    [we subtract that 1 at the end to remove our starting capital and just have the rate]

  • CAGR vs Simple Average Returns

With investing we are almost always re-investing our capital. That means our capital is being multiplied by a rate from one period to the next. When we want to know the average rate, we really want to pick the geometric average not the arithmetic one (there are other types of averages too like the harmonic average!). We want to compute the CAGR.

As a last proof that the CAGR and simple arithmetic average are different we can revisit the example above. If we compound an initial capital of $100 at 10% per year for 5 years we end up with $161.05 for a total return of 61.05%.

If we compute the simple average:

61.05% / 5 = 12.2%

This is higher than the CAGR of 10%

This is a consistent result. The geometric mean is always lower than the arithmetic mean!
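The comparison above in a few lines of code, as a sketch. The cagr helper is my own naming.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: the Nth root of total growth, minus 1."""
    return (end_value / start_value) ** (1 / years) - 1

end_value = 100 * 1.10 ** 5                  # $100 compounded at 10% for 5 years
simple_average = (end_value / 100 - 1) / 5   # total return divided by years

print(f"ending wealth  ${end_value:.2f}")
print(f"simple average {simple_average:.2%}")
print(f"CAGR           {cagr(100, end_value, 5):.2%}")
```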

How much lower?

It depends on how volatile the investment is. The reason is intuitive.

Imagine making 50% and losing 50%. The order doesn’t matter. You have net lost 25% of your initial capital.

The formula that relates the arithmetic mean and CAGR:

CAGR = Arithmetic Mean – .5 * σ²

where:

σ = annualized volatility
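A sketch of the +50% / -50% example next to the formula. Keep in mind the formula is an approximation, so it will be close to the exact geometric mean rather than identical. I use the population standard deviation of the two returns as the volatility, which is my own choice for this toy case.

```python
import statistics

def geometric_mean_return(returns) -> float:
    """Per-period growth rate that compounds to the same terminal wealth."""
    wealth = 1.0
    for r in returns:
        wealth *= 1 + r
    return wealth ** (1 / len(returns)) - 1

returns = [0.50, -0.50]
arith_mean = statistics.mean(returns)
vol = statistics.pstdev(returns)
approx_cagr = arith_mean - 0.5 * vol ** 2

print(f"terminal wealth per $1:   {1.5 * 0.5:.2f}")
print(f"arithmetic mean:          {arith_mean:+.1%}")
print(f"exact geometric mean:     {geometric_mean_return(returns):+.1%}")
print(f"mean minus half variance: {approx_cagr:+.1%}")
```

The arithmetic mean says you broke even. The geometric mean and the formula both say you lost double digits per period.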

 

This Is Not Just Theoretical

I grabbed SP500 total returns by year going from 1926-2023. Here’s what you find:

Simple arithmetic mean of the list: 12.01%

Standard deviation of returns: 19.8%

These are actual sample stats.

What did an investor experience?

If you start with $100 and let it compound over those 97 years, you end up with $1,151,937. 

What’s the CAGR?

CAGR = ($1,151,937 / $100)^(1/97) – 1 

CAGR = 10.12%

These are the actual historical results. An average annual return of 12.01% translated to an investor’s lived experience of compounding their wealth at 10.12% per year. 

Comparing the sample to theory

If you knew in advance that the stock market would increase 12.01% per year and you used the CAGR formula with our sample arithmetic mean return and standard deviation, what compound annual growth rate would you predict?

CAGR = Arithmetic Mean – .5 * σ²

CAGR = 12.01% – .5 * 19.8%²

CAGR = 10.06%

An average arithmetic return of 12.01% at 19.8% vol predicted a CAGR of 10.06% vs an actual result of 10.12%

Not too shabby. 
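You can redo this comparison with the rounded figures quoted above. This is only a sketch with my own function names, and because the inputs are rounded the estimate can differ from the quoted number in the last decimal.

```python
def cagr_from_wealth(start: float, end: float, years: float) -> float:
    """Realized compound annual growth rate from starting and ending wealth."""
    return (end / start) ** (1 / years) - 1

def cagr_estimate(arith_mean: float, vol: float) -> float:
    """Approximation: arithmetic mean minus half the variance."""
    return arith_mean - 0.5 * vol ** 2

realized = cagr_from_wealth(100, 1_151_937, 97)
predicted = cagr_estimate(0.1201, 0.198)

print(f"realized CAGR  {realized:.2%}")
print(f"predicted CAGR {predicted:.2%}")
print(f"gap            {realized - predicted:.2%}")
```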

I used the same parameters to run a simulation where every year you draw a return from a normal distribution with mean 12% and standard deviation of 19.8% and compounded for 97 years.  

I ran it 10,000 times. (Github code — it works but you’ll go blind)
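If you’d rather not go blind, here is a compact stand-in. To be clear, this is not the Github code, just a standard-library sketch of the same idea. The function names, the seed, and the floor on returns at -100% are my own assumptions.

```python
import random
import statistics

def simulate_terminal_wealth(n_sims=10_000, years=97, mean=0.12, vol=0.198,
                             start=100.0, seed=0):
    """Compound a normally distributed annual return for `years`, `n_sims` times."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        wealth = start
        for _ in range(years):
            # Assumption: floor the draw at -100% so wealth can never go negative
            annual_return = max(-1.0, rng.gauss(mean, vol))
            wealth *= 1 + annual_return
        results.append(wealth)
    return results

def summarize(wealths, years=97, start=100.0):
    """Median and mean terminal wealth, plus the CAGR implied by the median."""
    median_wealth = statistics.median(wealths)
    return {
        "median_cagr": (median_wealth / start) ** (1 / years) - 1,
        "median_wealth": median_wealth,
        "mean_wealth": statistics.fmean(wealths),
    }

if __name__ == "__main__":
    stats = summarize(simulate_terminal_wealth())
    print(f"median CAGR   {stats['median_cagr']:.2%}")
    print(f"median wealth ${stats['median_wealth']:,.0f}")
    print(f"mean wealth   ${stats['mean_wealth']:,.0f}")
```

Your numbers will not match mine exactly since the draws are random, but the median CAGR should land near 10% and the mean wealth should sit far above the median.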

Theoretical expectations

CAGR = median return = mean return – .5 * σ²

CAGR = .12 – .5 * .198² = 10.04% 

Median terminal wealth = 100 * (1+ CAGR)^ (N years)

Median terminal wealth = $100 * (1+ .1004)^ (97) = $1,072,333

Arithmetic mean wealth = 100 * (1+ mean return)^ (N years)

Arithmetic mean wealth = $100 * (1+ .12)^ (97) = $5,944,950

The sample results from 10,000 sims

The median sample CAGR: 10.19%

The median sample terminal wealth = $1,225,590

The mean terminal wealth: $5,952,373

Summary Table 

The most salient observation:

The median terminal wealth, the result of compounding, is much less than what simple returns suggest. When you are presented with an opportunity to invest in something with an IRR or expected return of X, your actual return if you keep re-investing will be lower than if you take the simple average of the annual returns.

If the investment is highly volatile…it will be much lower. 

The distribution of terminal wealth

The nice thing about simulating this process 10,000x is we can see the wealth distribution not just the mean and median outcomes.

Remember the assumptions:

  • Drawing a random sample from a normal distribution with a mean of 12% and standard deviation of 19.8%
  • Assume we fully re-invest our returns for 97 years

And our results:

  • The median sample CAGR: 10.19%

  • The median sample terminal wealth = $1,225,590

  • The mean terminal wealth: $5,952,373

This was the percentile distribution of terminal wealth:

The mean wealth outcome is 5x the median wealth outcome due to a 2% gap between the arithmetic and geometric returns. The geometric return compounded corresponds exactly to the median terminal wealth which is why we use CAGR, a measure that includes the punishing effect of volatility. 

In terms of mathematical expectation, if you lived 10,000 lives, on average your terminal wealth would be nearly $6mm but in the one life you live, the odds of that happening are less than 20%.

The chart was calculated from this table:

Percentile   Wealth        97-year CAGR
0.95         $22,323,532   13.5%
0.9          $12,048,311   12.8%
0.85         $7,955,791    12.3%
0.8          $5,601,855    11.9%
0.75         $4,098,451    11.6%
0.7          $3,210,573    11.3%
0.65         $2,480,813    11.0%
0.6          $1,981,453    10.7%
0.55         $1,604,153    10.5%
0.5          $1,275,987    10.2%
0.45         $1,009,583    10.0%
0.4          $804,035      9.7%
0.35         $627,807      9.4%
0.3          $476,756      9.1%
0.25         $357,112      8.8%
0.2          $257,498      8.4%
0.15         $186,552      8.1%
0.1          $115,257      7.5%
0.05         $58,646       6.8%

Note also that 20% of the time, your $100 compounded for 97 years turns into $257,498 or less, a CAGR of 8.4% or lower. That result is about 1/5 of the median and roughly 1/20 of the mean. Ouch.

So when someone says the stock market returns 10% per year because they looked at the average return in the past, realize that after adjusting for volatility and the fact that you will be re-investing your proceeds (a multiplicative process), you should expect something closer to 8% per year. 

And one last thing…you should be able to see how rates of return, when compounded for long periods of time, lead to dramatic differences in wealth. Taxes and fees are percentages of returns or invested assets. Make sure you are spending them on things you can’t get for free (like beta).

A Question I Wonder About

If you draw a simple return at random from a normal (ie bell curve) distribution and compound it over time, the resultant wealth distribution will be lognormally distributed with the center of mass corresponding to the CAGR.

We saw that theory, simulation and reality all agreed. 

Or did they?

The simulation and theory were mechanically tied. I drew a random return from N [μ=12%, σ = 19.8%] and compounded it. But reality also agreed.

It may have been a coincidence. Let me explain. 

Stock market returns are not normally distributed. They are well-understood to differ from normal because they have a heavy fat-left tail and negative skew.

  1. The fat-left tail describes the tendency for returns to exhibit extreme (ie multi-standard deviation) moves more frequently than the volatility would suggest.
  2. Negative skew means that large moves are biased toward the downside.

These scary qualities are counterbalanced by the fact that the stock market goes up more often than it goes down. In the 97-year history I used to compute the stats, positive years outnumbered negative years 71-26 or nearly 3-1. 

The average return, whichever average you care to look at, is the result of this tug-of-war between scary qualities and a bias toward heads. With the distribution not being a normal bell curve it feels suspicious that the relationship between CAGR and arithmetic mean returns conformed so closely to theory.

I have some intuitions about negative skew (that’s a long overdue post sitting in my drafts that I need to get to) that tell me that in the presence of lots of negative skew, volatility understates risk in a way that would artificially and optically narrow the gap between CAGR and mean return. By extension, I would expect that the measured CAGR of the last 97 years would have been lower relative to the theory’s prediction. 

But we did not see that.

I have 2 ideas why the CAGR was held up as expected, despite non-normal features that should penalize CAGR relative to mean return. 

  1. Path

    In Path: How Compounding Alters Return Distributions, we saw that trending markets actually reduce the volatility tax that causes CAGRs to lag arithmetic returns. It’s the “choppy” market that goes up and down by the same percent that leaves you worse off for letting your capital compound instead of rebalancing back to your original position size. The volatility tax or “variance drain” occurs when the chop happens more than trends (holding volatility constant of course). But since the stock market has gone up nearly 3x as often as it went down perhaps this trend compounding “bonus” offset the punitive negative skew effect on CAGR. 

  2. What negative skew?
                Qty   Avg Return   St Dev of Returns
    Up years    71    21.3%        12.7%
    Dn years    26    -13.4%       11.4%

    Using annual point-to-point returns, I’m not seeing negative skew. 

I’ve exhausted my bandwidth for this topic so I’ll leave it to the hive. Hit me up with your guesses. 

 

 

Trading Is Like Any Other Business

This post is adapted from this twitter thread.

If you are on the outside looking in at trading, you might think there’s some magic about knowing what a good trade is. In the past week, I’ve explained the same idea to 2 different people so might as well say it here…

In the day-to-day grind of trading options/derivs, all the sharps know the good side of a trade (sports sharps, is this true there as well?) If you could freeze time Zack Morris style and poll them you’d see that quickly. If you ask them why they want to take that side the reasoning will not be because of some opinion or fundamental idea…the logic will simply look like “well party X is bidding Y for Z so if I can buy A for price B and sell Z at Y to X then I’ve got a good trade on”.

So the key to being a sharp is more of seeing flow and knowing where the real bids and offers are. It’s knowing where the buyer’s bid is, and if they have more behind, or if they are in-between reloads and so on. It’s not based on some super-secret model. All the sharps wanna do the same thing which tells you opinions of what is cheap and expensive are table stakes. And if they disagree…congrats, you’ve found fair value by definition. The pick’em price you can buy or sell at.

The market tells you immediately if your trade is a good or bad one. It offers in your face after you just bought (bad trade). Or it fades when you try to lift and ticks you (good trade). I remember in trading class one of the partners explained that the definition of a good trade was one that is bid where you just bought. The worst-case scenario is you just scratch your trade if you want. If your trade is bad it’s because you don’t know about something else out there. You are being quasi-arbed. You are actually providing liquidity via an intermediary to something else you don’t see!

The role of a market-maker, and why it’s a business and not speculation makes sense. They are the intermediaries who bridge the liquidity between actors who don’t see the whole picture because they are narrowly focused on their own knitting. For the market maker to be effective, they of course have to have the table-stakes sense of what’s cheap and expensive on a relative basis.

But the main job is access.

They need to see the flow.

Every limit order out there is an option for them to lean on. If there were no orders, no flow, no need for immediacy by a natural investor, then the cloud of prices would be quoted around some generic model. It would look orderly. Things get out of line because of flow. The info in that flow causes market-makers to Bayesian update fair value.

Brokers bring flow to market. It’s a competitive biz. If JPM gets their cust a bad fill, next time the client will use GS. So a market maker has a homeostatic relationship with the broker. You need to be tight enough to win the broker’s flow so they don’t get embarrassed (if I sell you something at $5 and it’s immediately offered at $4.90 the broker is gonna be pissed), but not so tight that there’s no margin in the trade for you.

There are many traders willing to provide prices to brokers. The brokers own the relationships, the flow, the lifeblood of the business. The resulting dynamic has parallels to other businesses in terms of economies of scale. For example, I might need to put up the brokers at fair value in liquid ETF options so that someone else on my team can win the more lucrative single stock flow. In a classic sense, I’m a loss leader.

I traded commodity options. This is an intensely competitive options business because banks are willing to provide large physical clients attractive option prices because there’s lending/banking business they can win from them. Being a market maker in a market where some party has effectively “commoditized the complement” is a tough place to be when your business is only in trading the “complement”. In a broad sense, the bank is seeing more flow than you. If you insist on competing with that you need to be aware and lean into your competitive advantages over the bank. Maybe the bank desks are silo’d along another dimension where you aren’t. That can be a wedge for you to compete.

In sum, trading is just one of many types of businesses and has many parallels.

It’s not some magic crystal-balling. The models are not the edge. It’s the combo of doing many things well. Technology, relationships, organizational behavior (just think of the alignment issue where one desk is a loss leader for another…how does comp work in such a situation? Who gets what seat? What skill is more scarce? Chickens/egg problems abound).

Well What Did You “Expect”?

Here’s a simple coin flip game. It costs $1 to play.

  • Heads: you get paid an additional $1 (ie 100% return)
  • Tails: you lose $.90

The expectancy of the game is $.05 or 5%.

We compute expectancy:

.5 * $1.00 + .5 * (-$.90)

It’s exactly the same calculation as a weighted average or arithmetic mean. This is a useful computation for many simple one-off decisions. Like should I buy an airline ticket for $1,000 or the refundable fare for $1,100?

If there’s a 10% chance I need a refund then the extra $100 saves me $1,100.

10% * $1,100 = $110 which is greater than the $100 surcharge. About 9% ($100 / $1,100) is my breakeven probability.
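Both expectancy calculations fit in a few lines. A sketch, where expected_value is a hypothetical helper that takes (probability, payoff) pairs.

```python
def expected_value(outcomes) -> float:
    """Probability-weighted average of (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# Coin game: win $1.00 on heads, lose $0.90 on tails
coin_game = [(0.5, 1.00), (0.5, -0.90)]
print(f"coin game expectancy: ${expected_value(coin_game):.2f}")

# Refundable fare: a $100 surcharge protects the $1,100 fare
surcharge, protected = 100, 1_100
print(f"value of refundability at 10%: ${expected_value([(0.10, protected)]):.2f}")
print(f"breakeven probability: {surcharge / protected:.1%}")
```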

It’s tempting to use this logic in investing. Let’s say you expect the stock market to return 7% per year on average for 40 years. Start with $100 and plug in numbers:

$100 * 1.07⁴⁰ = $1497

Yay, you expect to have about 15x your starting capital after 40 years!

Eh. Sort of.

See the word “expect” in math terms and in colloquial terms is a bit different.

If I bet $1 on that coin game I theoretically expect to have $1.05 after 1 trial. In reality, I’m either going to end up with $2 when I double up or $.10 when I lose.

Another example:

I roll a die. If it comes up “1”, I win $600. Otherwise, nothing happens. Theoretically, I expect to win $100:

1/6 * $600 + 5/6 * $0 = $100

But if I asked you what you “expect” to happen if you play this game…you “expect” to win nothing. You only win 1/6 of the time after all.

Back to the investing example.

Investing is not a one-off game. It’s a compounding game where you plow your total capital back into the sausage machine to get that 7%.

That’s why we use 1.07⁴⁰.

You are counting on your $100 growing by 1.07 * 1.07 * 1.07…

So that 15x number…that’s mathematical expectancy the same way the dice game is worth $100 or the coin game is worth $1.05 even though those outcomes are never actually experienced.

What you expect to happen in the colloquial sense of the term is the geometric mean. The arithmetic average is a measure of centrality when you sum the results and divide by the number of results. (In our examples you are summing results weighted by their probabilities, but you are still summing). The geometric mean corresponds to the median result of a compounding process. Compounding means “multiplying not summing”. The median is the measure that maps to our colloquial use of “expected” because it’s the 50/50 point of the distribution. That’s the number you plan life around.

The theoretical arithmetic mean result of playing the lotto might be losing 50% of your $2 Powerball ticket (which is another way of saying you are paying 2x what the ticket is mathematically worth). The median result is you lit your cash on fire. You plan your life around the median, especially when it’s far away from the mean. We’ll come back to that.

With investing we are multiplying our results from one year to the next together. The geometric mean is what you actually “expect” in the colloquial sense of the term. The geometric mean is more familiarly known as the CAGR or ‘compound annual growth rate’.

What is the relationship between the arithmetic mean to the geometric mean? This is the same exact question as “what is the relationship of mathematical expectancy and the CAGR?”

It’s an important question since that theoretical arithmetic mean is only expected if we live thousands of lives (actually there are ways to experience the arithmetic mean without relying on reincarnation. This is pleasant news because what good is being rich if you come back a pony.) We want to focus on the CAGR, which is much closer to what we might experience.

It turns out that number is lower.

How much lower? It depends on how volatile the investment is. The formula that relates the arithmetic mean and CAGR:

CAGR = Arithmetic Mean – .5 * σ²

where:

σ = annualized volatility

If an investment earned 7% per year with a standard deviation(ie volatility) of 20% you can estimate the CAGR as follows:

CAGR = .07 – .5 * .20² = .05

In arithmetic expectancy, over 40 years you expect to earn 1.07⁴⁰ = 15. You expect to have 15x’d your money.

But the median outcome, which corresponds to the geometric mean is 1.05⁴⁰ = 7.

7x is much closer to what you “expect” in the colloquial sense of the term. Less than 1/2 the arithmetic expectation!

The formula tells us that the arithmetic and geometric mean (“CAGR”) will diverge by the volatility. And that volatility term is squared…which means the divergence is extremely sensitive to the volatility.

This is a table of CAGRs where you can see the destructive power of volatility:
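A grid like this can be rebuilt straight from the formula. This is a sketch, not the original table: the means and vols in the grid are arbitrary picks for illustration, and `approx_cagr` is a name I made up.

```python
def approx_cagr(arith_mean, vol):
    """CAGR estimate: arithmetic mean minus half the variance."""
    return arith_mean - 0.5 * vol ** 2

def cagr_table(means, vols):
    """One row per arithmetic mean, one column per volatility."""
    return [[approx_cagr(m, v) for v in vols] for m in means]

means = [0.05, 0.07, 0.10]        # illustrative arithmetic returns
vols = [0.10, 0.20, 0.40, 0.60]   # illustrative annualized vols

table = cagr_table(means, vols)
print("mean  " + "".join(f"vol={v:>4.0%} " for v in vols))
for m, row in zip(means, table):
    print(f"{m:>4.0%}  " + "".join(f"{c:>8.1%} " for c in row))
```

Reading across any row, the CAGR falls faster and faster as vol rises because the penalty is in the squared term.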

Why is volatility so impactful on a compounded return?

An easy way to see the impact of high volatility is to imagine making 50% and losing 50%. The order doesn’t matter. You have net lost 25% of your initial capital.

We can compute the geometric mean by weighting each possibility by its frequency in the exponent (in this case the exponents must sum to 2 because that’s the sample space — up and down):

.5¹ x 1.5¹ = .75

Go back to the first game in the post. You invest $1 in a coin game. Heads you make 100%, tails you lose 90%. This game had a positive arithmetic expectancy of 5%.

What is the arithmetic expectancy if you compound (ie re-invest) by playing twice? The possibilities are:

HT: 2 x .1 = .2

HH: 2 x 2 = 4

TH: .1 x 2 = .2

TT: .1 x .1 = .01

Since each scenario is equally likely (25% each) the arithmetic expectancy is simply the average = 1.1025

This jibes with 1.05² = 1.1025

The average arithmetic return compounds as expected.

But our lived (median) experience is much worse. The median result is .20, a loss of 80%!

We could have seen that by computing the geometric mean:

2¹ x .1¹ = .20
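To see the mean and median pull apart, you can brute-force every path. A small sketch using only the standard library (the function name is hypothetical):

```python
from itertools import product
from statistics import mean, median

HEADS, TAILS = 2.0, 0.1   # wealth multipliers: +100% and -90%

def terminal_wealths(n_flips):
    """Ending wealth per $1 for every equally likely sequence of flips."""
    results = []
    for path in product([HEADS, TAILS], repeat=n_flips):
        wealth = 1.0
        for multiplier in path:
            wealth *= multiplier
        results.append(wealth)
    return results

two_flips = terminal_wealths(2)
print(sorted(two_flips))       # the four HH/HT/TH/TT outcomes
print(mean(two_flips))         # arithmetic expectancy, about 1.1025
print(median(two_flips))       # the lived experience, about 0.20
```

Bump `n_flips` up and the mean keeps compounding at 5% per flip while the median collapses toward zero.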

Driving the point home with an extreme example

Consider a super favorable bet.

You roll a die:

  • Any number except a ‘6’: 10x your bet
  • Roll a ‘6’: Lose your entire bet

The arithmetic expectancy is ridiculous.

5/6 x 10 + 1/6 x 0 = 8.33x your bet, or a ~733% return

But if you keep reinvesting your proceeds in this bet, you will go bust as soon as the 6 comes up. The median experience is a total loss, even though the arithmetic expectancy compounded is wildly positive. If you played this game 20 times in a row you’d [arithmetically] expect to end up with ~8.33²⁰ times your stake.

But you have a 97.4% chance of going broke because you need “not a 6” to come up 20 times in a row = 1 – (5/6)²⁰

That arithmetic expectancy of ~8.33²⁰ is being driven by the single scenario where the 6 never comes up (that occurs 2.6% of the time). In that case, your p/l is $10²⁰ or between a quintillion and sextillion dollars.

But the geometric mean is 0 because multiplying over the 6 sample spaces:

10⁵ x 0¹ = 0
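A quick check of those numbers, as a sketch. I'm assuming a winning roll multiplies your stake by 10 and a 6 multiplies it by 0, and I use exact fractions so nothing gets lost to rounding.

```python
from fractions import Fraction

P_WIN = Fraction(5, 6)   # any roll except a 6
WIN_MULT = 10            # assumed: stake is multiplied by 10 on a win
ROUNDS = 20

p_survive = P_WIN ** ROUNDS                # never rolling a 6
p_bust = 1 - p_survive
expected_mult = (P_WIN * WIN_MULT) ** ROUNDS   # losing rolls contribute 0
wealth_if_survive = WIN_MULT ** ROUNDS

print(f"P(bust within {ROUNDS} rolls): {float(p_bust):.1%}")
print(f"expected wealth multiple: {float(expected_mult):.3e}")
# The whole expectation comes from the one surviving path.
print(expected_mult == p_survive * wealth_if_survive)
```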

I chose such extreme examples because nothing illustrates volatility like all-or-nothing bets. The intuition you need to keep is that high volatility means you should expect to lose your money even if the arithmetic expectancy is high.

As soon as you start re-investing (ie compounding) your results are going to be governed by that geometric mean which hates volatility.

For the people who tout lotto ticket investments like crypto or transformative technologies with talks of “asymmetrical upside” or “super positive expectancy” remember even if they might be right, the most likely scenario is they lose all their money on that investment. Even literal lotto tickets can tip into positive expectancy. When that happens how much do you put into it?

Exactly. Not much. Because you know what to expect.

The role of rebalancing and diversification

Investing is not a one-off game. You always re-invest. By re-balancing, you “create” more lives by not concentrating your wealth in a single bucket which swamps the rest of your portfolio as it grows. If you never rebalanced BTC on the way up it would have eventually become nearly 100% of your portfolio and then 2022 happened.

If you don’t ever rebalance you are effectively praying that “not a 6” comes up for the 40 years you are compounding wealth. It’s not as extreme as that because market volatility isn’t as extreme as dice or coins. But the principle holds.

You only get one life so you care about the median. Diversification plus rebalancing gives you the god-perspective of getting to invest a fraction of your wealth into many lives.

Keep in mind — rebalancing is not changing your overall expectancy; it’s changing the distribution of returns by pushing the median return (geometric mean or CAGR) up to your theoretical arithmetic return. This trade-off is not free. If you rebalance you don’t get the 1000x payoff that occurs when a single concentrated position hits 50 heads in a row.

Money Angle For Masochists

Imagine a $100 stock that can either go up or down 25% every year.

It’s 50/50 to be up or down.

Let’s look at the distribution of the stock after 4 years (with the probabilities of each price below it)

Look at the extremes after 4 years:

  • $31.64

    A -25% CAGR over 4 years = cumulative loss of 68%

  • $244.14

    A +25% CAGR over 4 years = cumulative gain of 144%

If you sumproduct every terminal probability by terminal price you get $100. And yet, while the stock is fairly valued at $100, after 4 years, you have lost money in 11/16th of scenarios (~69%). The right tail is driving the fair value of $100 while most paths take the stock lower.

This is the mathematical nature of compounding. The most likely outcomes are lower even if the stock is fairly priced.
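The distribution is easy to rebuild yourself. A minimal sketch of the 4-year tree, with function and parameter names of my own choosing:

```python
from math import comb

def terminal_distribution(spot=100.0, up=1.25, down=0.75, years=4, p_up=0.5):
    """(price, probability) pairs after `years` up/down moves."""
    dist = []
    for ups in range(years + 1):
        price = spot * up ** ups * down ** (years - ups)
        prob = comb(years, ups) * p_up ** ups * (1 - p_up) ** (years - ups)
        dist.append((price, prob))
    return dist

dist = terminal_distribution()
fair_value = sum(price * prob for price, prob in dist)
p_loss = sum(prob for price, prob in dist if price < 100.0)

for price, prob in dist:
    print(f"${price:>7.2f}  {prob:.4f}")
print(f"probability-weighted price: ${fair_value:.2f}")
print(f"chance of ending below $100: {p_loss:.2%}")
```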

In the real world, stocks don’t just flip up and down like coins. The probabilities are not 50/50 and there aren’t just 2 buckets they can rest in from one year to the next. The beauty of option surfaces is they allow us to separate the probabilities from the distance of the buckets (and the number of buckets is continuous…there’s no price the stock is not allowed to go to).

Here’s some homework you can do with the above data:

  1. What’s the 4-year $146.48 strike call worth?1
  2. What’s the value of the 4-year $75 strike put? 2
  3. How about the 4-year $125 call? 3

Bonus Questions

Imagine this stock is an ETF and there’s a 2x levered version (which means it’s 2x as volatile) of it.

  • What strike call on the levered ETF is equivalent to the $146.48 strike on the unlevered ETF?4 (Hint: It’s further than $46.48 OTM)
  • What’s the value of the call at that strike? 5
  • If I was a market-maker and I got lifted at fair value on the 2x levered ETF 4-year 200 strike call and I go buy the regular ETF 4-year 150 calls to cover my risk, how many do I need to buy to be perfectly hedged? (Assume you can buy them for what they’re worth…you have enough information to compute their fair value). 6

If you got through this then you have a new appreciation for how far certain prices are from a spot price and how it depends on time and volatility!
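If you'd rather check your homework in code, here is a bare-bones helper. It is a sketch under my own assumptions: zero interest rates and no dividends, so an option is worth its probability-weighted payoff. Treating the levered ETF as moving up or down 50% a year is just one reading of "2x as volatile".

```python
from math import comb

def terminal_distribution(spot=100.0, move=0.25, years=4):
    """(price, probability) pairs for a 50/50 up/down-by-`move` stock."""
    return [
        (spot * (1 + move) ** ups * (1 - move) ** (years - ups),
         comb(years, ups) / 2 ** years)
        for ups in range(years + 1)
    ]

def call_value(dist, strike):
    """Expected call payoff at expiry (no discounting assumed)."""
    return sum(prob * max(price - strike, 0.0) for price, prob in dist)

def put_value(dist, strike):
    """Expected put payoff at expiry (no discounting assumed)."""
    return sum(prob * max(strike - price, 0.0) for price, prob in dist)

unlevered = terminal_distribution(move=0.25)
levered = terminal_distribution(move=0.50)   # assumed 2x version

# Sanity check: with the stock fairly priced at $100, the $100 call
# and the $100 put should be worth the same.
parity_gap = call_value(unlevered, 100.0) - put_value(unlevered, 100.0)
print(f"call minus put at the $100 strike: {parity_gap:.6f}")
```

Plug in the homework strikes with `call_value(unlevered, strike)` or `put_value(unlevered, strike)`, and swap in `levered` for the bonus questions.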


Starting from basics like the volatility tax, progressing to how path influences the volatility tax (trends are more like a volatility rebate and choppiness is a tax….the ratio of trend to chop will determine the ultimate cost of the volatility), and finally bridging these concepts to Black Scholes this series will take your understanding of compounding and how returns work to a deeper level.

  1. The Volatility Drain
  2. Path: How Compounding Alters Return Distributions
    [Between this post and the bonus questions you can start to see why pricing OTM options on levered ETFs given a liquid options market on the unlevered version is an application of these concepts]
  3. Solving A Compounding Riddle With Black-Scholes

Shout Out To Matt Hollerbach

Despite trading options for nearly 20 years at the time, it wasn’t until 2019 that I thought really hard about compounding. I knew how to manipulate formulas and how it related to options but it wasn’t until I discovered Matt’s work that I started to see it from a new angle. Matt makes it approachable and builds up insights in small steps. His blog inspired mine, especially many of my earlier posts. The entire blog is worth spending time working through. It’s similar to what I’ve said about gambling — it’s a place where you will learn how to think about risk and return far better than what finance texts will teach.

These are all-time great ones:

Trend Following is Hot Air

Investing Games

Solving the Equity Premium Puzzle, and Uncovering a Huge Flaw in Investment Theory

It’s painful to watch the median (or should I say average) “investor” reason about how markets work because without these intuitions (you don’t need to know formulas necessarily) you are innumerate. That’s like being illiterate but for like numbers and stuff. And the deficiency is as obvious as illiteracy is to a literate person.

The good news is we can all get better.

Practice Second Gear Thinking

A theme I harp on is that investing is biology not physics. It’s a competitive game. You figure something out, the game adjusts. When people say “X happens in Y-year cycles” they are thinking like an astronomer who notes the fixed periodicity by which the Earth orbits the sun. This is physics thinking.

But markets price in common information.

Housing has been an inflation hedge. But if everyone knows that, will they bid it up until it’s not? If someone pays 2x the Zestimate for your house, then is this “investment” still an inflation hedge?

Absolute statements in the world of investing without any consideration of price are a red flag that the speaker doesn’t understand the difference. We know the Celtics are good. But what matters to a bettor is their record against the point spread. The point spread is why investing will never be QED’d.

You can learn more about investing from games and betting than books that have “investing” in the title. The investing books are fine for glossaries and knowing the mechanics of a bond just as the rulebook is imperative to play Scrabble. But the rulebook, like the textbooks, teaches you nothing of how to be good at the game at hand.

In an adaptive game, you need to see the next level. Let’s look at 2 types of second-level.

“Theory of Mind”

Wikipedia defines “theory of mind” as the “capacity to understand other people by ascribing mental states to them.”

In poker, when you bet, you know your cards and have some sense of how they might rank at the table. But the key piece of information is “when I make a bet, what hand do my opponents think I have?” Without considering such second-order knowledge, how do you weigh the information you receive when they call your bet? You must inhabit your opponent’s mind. It’s the same skill you need to interpret market prices (see Staring Out The Window). What expectations and beliefs are in the price? This is second-order thinking.

Second order sensitivities

Besides second-order thinking, we must identify second-order effects. In the options world, the “greeks” are sensitivities. Delta is the option’s sensitivity to the underlying. Gamma is a second-order sensitivity that describes how an option’s delta changes with respect to the underlying.

But this topic is everywhere. If a company sells more widgets it makes more profit. But second-order effects mean attracting more competition or saturating a market. Every satisfied customer is one less customer that needs satisfying. So if I build a model of profitability based on units sold, when does the function inflect? When does opportunity fade into unsold inventory?

[A fun way to think about second-order sensitivities is playing “engine builder” boardgames like Dominion or Wingspan where synergies between your cards lower the marginal costs of later actions2. In essence, the cards have gamma based on how you stack them. Every time I use a card it might increase my odds of winning by X. That’s the delta or “benefit per use”. But the delta itself increases with synergy, so as the game progresses, you get more delta or benefit/use ratio, from the same card]

An exercise in thinking about second-order sensitivities

Commercial real estate investor Bill Lenehan on Invest Like The Best:

Here in Marin County, where values have gone up substantially and having a small house in Montana, where that market has similarly boomed post-COVID is that it is unquestionable that these property values are not sustainable…As someone who’s trying to build a business, which includes recruiting people, training them, compensating them, et cetera, doing that in this housing market is substantially more challenging. Well, I guess it feels good that the house is worth more than you paid for it. Net-net, I would welcome a decline in housing to a more normalized level.

Bill is looking at second-order effects. What would it look like if we translated bits of this quote to “greeks”?

Bill’s personal net worth sounds like it had an uptick because he owns a home in Marin. But he knows his net worth would have gone up even more without the residential home appreciation. That’s because he knows he has a larger delta to his business than his home.

Define delta as a change in Bill’s net worth with respect to real estate prices

What does this depend on?

For our narrow example, we will limit this to 2 sources of his net worth.

  • The weight of his home as % of net worth
  • The (delta of his business to RE prices) x (his ownership of the business)

But delta of his business to RE prices decomposes further:

  1. There’s the value of his company’s real estate
  2. The expected growth of the business which depends on operating margins amortized into a current valuation.

Bill recognizes that the growth of the business is the largest driver of his net worth and the rise in value of his Marin home represents a slowdown on this larger (albeit nebulous) factor.

First-order thinking is delta. “What happens to my income if I work more hours?” That’s a simple line with a slope of “pay per hour”. Suppose the extra work means you need to employ a babysitter. The slope is just “net pay per hour after paying the sitter”. But if the sitter is a student who can only work an additional 5 hours per week, then the delta or net pay per hour changes because you need to find a higher-priced sitter at some threshold of hours. That’s gamma – the change in your delta. What if there are no other sitters? Then you hit a wall. Your payoff function is abruptly halted. Or what if it inflects because now you need more massages because work makes your body hurt? These are all second-order effects on your delta (slope = net pay per hour) with respect to an increase in hours.

Mathematically, deltas are the slopes of lines. The cause-effect relationship of anything important is rarely so simple. It is convenience that compels us to describe how things work by pointing to lines. The deltas themselves change reminding you that linear thinking is just a snapshot in time. In fact, that’s all a calculus derivative is — zooming in so close to a function that its slope at that point describes a line.
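The babysitter example can be made concrete with a toy payoff function. Every number below (the wage, the sitter rates, the 5-hour cap) is made up for illustration. The point is only that the slope changes once you cross the threshold.

```python
def net_income(hours, wage=50.0, cheap_sitter=15.0,
               pricey_sitter=30.0, cheap_cap=5.0):
    """Net pay for extra hours worked after paying a sitter every hour."""
    cheap_hours = min(hours, cheap_cap)          # student sitter, capped
    pricey_hours = max(hours - cheap_cap, 0.0)   # higher-priced sitter
    return (wage * hours
            - cheap_sitter * cheap_hours
            - pricey_sitter * pricey_hours)

def delta(f, x, h=1e-4):
    """Slope of f at x by central finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

delta_before = delta(net_income, 3.0)   # below the cap
delta_after = delta(net_income, 8.0)    # above the cap
print(f"net pay per hour at 3 hours: {delta_before:.2f}")
print(f"net pay per hour at 8 hours: {delta_after:.2f}")
print(f"change in delta across the threshold: {delta_after - delta_before:.2f}")
```

The "gamma" here is lumpy: zero everywhere except at the 5-hour kink, where the delta drops all at once.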

When you listen to explanations, try to fill in the gaps of logic that the speaker understands but are unsaid. You are making the “greeks” or sensitivities explicit. Then you are only one step away from asking “what other greeks are at play?” and what is the “shape of their functions?”

To ponder:

What does a “too much of a good thing” function look like?

How about a hormesis (ie “a little of bad” thing) function?

What does a discontinuous phase transition (ie “gas > liquid > solid”) look like?

What does a logistic or S-curve function signify?

What phenomena follow a convex function? A concave function?

What’s a “winner-take-all” function look like?


Sell Your Textbooks For Boardgames

My bias is that traders should study gambling, not investors and definitely not macroeconomics. I feel trading requires self-awareness and a unique mix of humility and confidence. Humility demands questioning how you know what you think you know. But this is also a description of a cat chasing its epistemological tail. This needs to be balanced with the confidence to make a decision before you are comfortable. Otherwise, you will be too late.

This brings me to Aaron Brown’s article in Bloomberg (paywalled):

Want to Succeed on Wall Street? Learn Poker, Not Economics

These excerpts will save you a click.

What

The Federal Reserve Bank of New York worked in conjunction with researchers at the University of Southern California and University College London on a paper titled Strategic Sophistication and Trading Profits: An Experiment with Professional Traders. The authors recruited 56 professional traders, plus an equal-size sample of students for controls, and evaluated their performance in a computer-simulated trading game. They then tested their subjects on a wide range of specific skills to see which skills were correlated to trading success.

Main findings

  • Among students, the only useful predictor of trading success was general intelligence.
  • Among professional traders, though, neither intelligence nor other personality traits and cognitive skills mattered much. Success did not depend on any fundamental insight about value. What mattered was strategic sophistication in the sense of taking an analysis of other people’s behavior to high levels. This calls to mind the folk wisdom found in poker, which is that “beginners think about their cards. With a little experience, they start thinking of the other guy’s cards. Poker begins when you think about what the other guy thinks about your cards.” The Fed paper suggests that professional traders are playing poker, while the students are playing games like chess, backgammon, or blackjack that depend on intelligence rather than guessing what other people are thinking.
  • The paper’s finding goes well beyond the claim that strategy is valuable for trading. It suggests that other things such as intelligence, risk strategies, personality traits or knowledge of fundamental value do not matter, or at least are so evenly distributed among traders that they can’t be used to predict success.

Murky interpretations

  • The Fed paper did not find any advantage to years of education or experience or other indicators of trading. Who should you believe? The Turtle experiment and Wall Street folk wisdom have one great advantage, in that they are based on real people trading large amounts of money in real financial markets. Unfortunately, that makes controlled experimentation prohibitively expensive. Formal studies and other academic work conducted under laboratory conditions make the results much more scientific but at the cost of being one layer removed from reality.
  • If you are not a trader but want to be one, either for your own account or for an institution, the study suggests you should play poker rather than attending class and take game theory courses over economics…but
  • Conventional wisdom says you should develop your comparative advantages, whatever they are, and study successful traders. If your interest is to understand the economic function of trading, the study suggests it is a game that rewards aggregating information from others’ bids and offers and using that information to provide liquidity. Conventional wisdom suggests trading is a broader skill that combines fundamental and technical information to produce an equilibrium, with many different types of traders performing different functions.

Practical upshot (emphasis mine)

If you like poker more than class and game theory more than economics, it’s good news. You may lose in trading competitions with fellow students, but you have a bright future on Wall Street. On the other hand, if you’re counting on traders to assess fundamental economic value, the study is bad news. It suggests they’re focused on outsmarting each other, not on investigating reality.

Whatever you think about the study and possible implications, it’s always good to see a careful, controlled, rigorous analysis in an area where opinions tend to be much stronger than the foundations for those opinions.

Understanding Implied Forwards

These are not trick questions:

Suppose you have an 85 average on the first 4 tests of the semester. There’s one test left. All tests have an equal value in your final score. You need a 90 average for an A in the class.

What do you need on the last test to get an A in the class?

What is the maximum score you can get for the semester?

If you are comfortable with the math you have the prerequisites required to learn about a useful finance topic — implied forwards!

Implied forwards can help you:

  • find trading opportunities
  • understand arbitrage and its limits

We’ll start in the world of interest rates.

The Murkiness Of Comparing Rates Of Different Maturities

Consider 2 zero-coupon bonds. One that matures in 11 months and one that matures in 12 months. They both mature to $100.

Scenario A: The 11-month bond is trading for $92 and the 12-month bond is trading for $90.

What are the annualized yields of these bonds if we assume continuous compounding?1

Computing the 12-month yield

r = ln($100/$90) = 10.54%

Computing the 11-month yield

r = ln($100/$92) * 12/11 = 9.10%

This is an ascending yield curve. You are compensated with a higher interest rate for tying up your money for a longer period of time.

But it is very steep.

You are picking up 140 extra basis points of interest for just one extra month.

Let’s do another example.

Scenario B: We’ll keep the 12-month bond at $90 but say the 11-month bond is trading for only $91.

Computing the 11-month yield

r = ln($100/$91) * 12/11 = 10.29%

So now the 11-month bond yields 10.29% and the 12-month bond yields 10.54%

You still get paid more for taking extra time risk but maybe it looks more reasonable. It’s kind of hard to reason about 25 bps for an extra month. It’s murky.
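The yield math for both scenarios fits in a few lines. A sketch, with a helper name of my own:

```python
from math import log

def continuous_yield(price, months, face=100.0):
    """Annualized, continuously compounded yield of a zero-coupon bond."""
    return log(face / price) * 12.0 / months

y12 = continuous_yield(90.0, 12)     # 12-month bond at $90
y11_a = continuous_yield(92.0, 11)   # scenario A: 11-month bond at $92
y11_b = continuous_yield(91.0, 11)   # scenario B: 11-month bond at $91

print(f"12-month yield:          {y12:.2%}")
print(f"11-month yield (A):      {y11_a:.2%}")
print(f"11-month yield (B):      {y11_b:.2%}")
print(f"pickup for 1 month (A):  {(y12 - y11_a) * 1e4:.0f} bps")
print(f"pickup for 1 month (B):  {(y12 - y11_b) * 1e4:.0f} bps")
```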

Think back to the test score question this post opened with. There is another way of looking at this if we use a familiar concept — the weighted average.

The Implied Forward Interest Rate

We can think of the 12-month rate as the average rate over all the intervals. Just like a final grade is an average of the individual tests.

We can decompose the 12-month rate into the average of an 11-month rate plus a month-11 to month-12 forward rate:

“12-month” rate = “11-month” rate + “11 to 12-month” forward rate

Let’s return to scenario A:

12-month rate = 10.54%

11-month rate = 9.1%

Compute the “11 to 12-month” forward rate like a weighted average:

10.54% x 12 = 9.1% x 11 + Forward Rate₁₁₋₁₂ x 1

Forward Rate₁₁₋₁₂ = 26.37%

We knew that 140 bps was a steep premium for one month but when you explicitly compute the forward you realize just how obnoxious it really is.

How about scenario B:

12-month rate = 10.54%

11-month rate = 10.29%

Compute the “11 to 12-month” forward rate like a weighted average:

10.54% x 12 = 10.29% x 11 + Forward Rate₁₁₋₁₂ x 1

Forward Rate₁₁₋₁₂ = 13.26%
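The weighted-average rearrangement is the same one-liner whether the inputs are test scores or rates. A minimal sketch (the function name is mine; yields are computed from the bond prices so rounding doesn't creep in):

```python
from math import log

def implied_forward(short_avg, short_n, long_avg, long_n):
    """Average over the gap implied by a short-window and long-window average."""
    return (long_avg * long_n - short_avg * short_n) / (long_n - short_n)

def continuous_yield(price, months, face=100.0):
    return log(face / price) * 12.0 / months

y12 = continuous_yield(90.0, 12)
fwd_a = implied_forward(continuous_yield(92.0, 11), 11, y12, 12)
fwd_b = implied_forward(continuous_yield(91.0, 11), 11, y12, 12)
print(f"scenario A forward (11 to 12 months): {fwd_a:.2%}")
print(f"scenario B forward (11 to 12 months): {fwd_b:.2%}")

# Same function on the test-score question: 85 average on 4 tests,
# 90 average needed over 5.
needed_score = implied_forward(85, 4, 90, 5)
print(f"score needed on the last test: {needed_score:.0f}")
```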

Arbitraging The Forward Rate (Sort Of)

It’s common to have a dashboard that shows term structures. But the slopes between months can be optically underwhelming with such a view. Seeing that the implied forward rate is 13.26% feels more profound than seeing a 25 bps difference between month 11 and month 12.

You may be thinking, “this forward rate is a cute spreadsheet trick, but it’s not a rate that exists in the market.”

Let’s take a walk through a trade and see if we can find this rate in the wild.

The first step is just to ground ourselves in a basic example before we understand what it means to capture some insane forward rate.

Consider a flat-term structure:

[Note: the forward rate should be 10.54% but because I’m computing YTM on a bond price that only goes to 2 decimal places we are getting an artifact. It’s immaterial for these demonstrations]

Now let’s look back at the steep term structure from scenario A:

With an 11-month rate of 9.10% and a 12-month rate of 10.54% we want to borrow at the shorter-term rate and lend at the longer-term rate. That means selling the nearer bond and buying the longer bond.

When you study asset pricing, one of the early lessons is to step through the cash flows. This is the basis of arbitrage pricing theory (APT), a way of thinking about asset values according to their arbitrage or boundary conditions. As opposed to other pricing models, for example CAPM, someone using APT says the price of an asset is X because if it weren’t there would be free money in the world. By walking through the cash flows, they would then show you the free money2. The fair APT price is the one for which there is no free money.

Stepping Thru The Cash Flows

Let’s see how this works:

Today

  1. We short the 11-month bond at $92
  2. We buy 1.022 12-month bonds for $90. We can buy 1.022 of the cheaper bonds from the proceeds of selling the more expensive $92 bond. The net cash flow or outlay is $0.
  3. Spend the next 11 months surfing.

At the 11-month maturity

We will need $100 to pay the bondholder of the 11-month bond so we sell 12-month bonds.

But for what price?

Well, let’s say the prevailing 1-month interest rate matched the rates we were seeing in the flat term structure world of 10.49%, the rate implied by the 11-12 month forward when we initiated the trade.

In that case, the bonds we own are worth $99.13.

[With one month to maturity we compute the continuous YTM: ln(100/99.13) * 12 = 10.49%]

If we sell 1.009 of our bonds at $99.13 we can raise the $100 to pay back the loan. We are left with .0134 bonds.

At the 12-month maturity

Our stub of .0134 bonds mature and we are left with $1.34.

So what was our net return?

Hmm, lemme think, carry the one, uh — infinite!

We did a zero cash flow trade at the beginning. We didn’t lay out any money and ended with $1.34.

That’s what happens when you effectively shorted a 26.37% forward rate but the one-month rate has rolled down to something normal, in this case about 10.50%

[In real life there are all kinds of frictions. You know, like collateral when you short bonds.]

Summary table:

What if somehow, that crazy 26.37% “11-12 month forward rate” didn’t roll down to a reasonable spot rate but actually turned out to be a perfect prediction of what the 1-month rate would be in 11 months?

Let’s skip straight to the summary table.

Note the big difference in this scenario: the bond with 1 month remaining until maturity is only worth $97.83 (corresponding to that 26.33% yield, ignore small rounding). So you need to sell all 1.022 of the bonds to raise $100 to pay back the loan.

Besides frictions, you can see why this is definitely not an arbitrage — if the 1-month rate spiked even higher than 26.33% the price of the bonds would be lower than $97.83. You would have sold all 1.022 of your bonds and still not been able to repay the $100 you owe!

So the “borrow short, lend long” trade is effectively a way to short a 1-month forward at 26.33%. It might be a good trade but it’s not free money.

Still, this exercise shows how our measure of the forward is a tradeable level!
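The cash flows can be stepped through in code for any 1-month rate you care to imagine at the 11-month mark. This is a frictionless sketch. If the bonds you hold don't cover the $100, I assume the shortfall is covered by shorting more of the same bond at the same price, so a negative stub is simply a loss at maturity.

```python
from math import exp, log

def borrow_short_lend_long_pnl(one_month_rate, short_price=92.0,
                               long_price=90.0, face=100.0):
    """P&L at month 12 of shorting one 11-month bond and buying
    12-month bonds with the proceeds (zero outlay today)."""
    bonds_bought = short_price / long_price              # about 1.022
    price_at_11m = face * exp(-one_month_rate / 12.0)    # 1 month to go
    bonds_sold = face / price_at_11m                     # raise $100
    stub = bonds_bought - bonds_sold                     # bonds left over
    return stub * face

breakeven_rate = 12.0 * log(92.0 / 90.0)   # the implied forward, ~26.37%

for rate in (0.1049, breakeven_rate, 0.30):
    pnl = borrow_short_lend_long_pnl(rate)
    print(f"1-month rate {rate:.2%} -> P&L ${pnl:+.2f}")
```

The P&L flips sign exactly at the implied forward, which is what makes it a tradeable level rather than a spreadsheet curiosity.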

[If you went through the much more arduous task of adjusting for all the real-world frictions and costs you would impute a forward rate that better matched what you considered to be a “tradeable price”. The principle is the same, the details will vary. I was not a fixed-income trader and own all the errors readers discover.]

The Implied Forward Implied Volatility

Now you’re warmed up.

Like interest rates, implied volatilities have a term structure. Every pair of expiries has an implied forward volatility. The principle is the same. The math is almost the same.

With interest rates we were able to do the weighted average calculation by multiplying the rates by the number of days or fraction of the year. That’s because there is a linear relationship between time and rates. If you have an un-annualized 6-month rate, you simply double it to find the annualized rate. You can’t do that with volatility.3

The solution is simple. Just square all the implied volatility inputs so they are variances. Variance is proportional to time so you can safely multiply variance by the number of days. Take the square root of your forward variance to turn it back into a forward volatility.

Consider the following hypothetical at-the-money volatilities for BTC:

                           Expiry 1    Expiry 2
Implied Vol                40%         42%
Variance (Vol²)            .16         .1764
Time to Expiry (in days)   20          30

Let’s compute the 20-to-30 day implied forward volatility. We follow the same pattern as the weighted test averages and weighted interest rate examples.

The decomposition where DTE = “days to expiry”:

“variance for 30 days” = “variance for 20 days” + “variance from day 20 to 30”

Expiry2 variance * DTE_Expiry2 = Expiry1 variance * DTE_Expiry1 + Fwd Variance₂₀₋₃₀ * Days₂₀₋₃₀

Re-arrange for forward variance:

Fwd Variance₂₀₋₃₀ = (Expiry2 variance * DTE_Expiry2 – Expiry1 variance * DTE_Expiry1) / Days₂₀₋₃₀

Fwd Variance₂₀₋₃₀ = (.1764 * 30 – .16 * 20) / 10

Fwd Variance₂₀₋₃₀ = .2092

Turning variance back into volatility:

√.2092 = 45.7%

If the 20-day option implies 40% vol and the 30-day option implies 42% vol, then it makes sense that the vol between 20 and 30 days must be higher than 42%. The 30-day volatility includes 40% vol for the first 20 days, so the time contained in the 30-day option that DOES NOT overlap with the 20-day option must be high enough to pull the entire 30-day vol up.

This works in reverse as well. If the 30-day implied volatility were lower than the 20-day vol, then the 20-30 day forward vol would need to be lower than the 30-day volatility.
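The variance-weighting steps above, as a small function. It is a sketch with names I chose; raising an error on negative forward variance is my own design choice, not something from an options library.

```python
from math import sqrt

def forward_vol(vol_near, days_near, vol_far, days_far):
    """Implied forward vol between two expiries via time-weighted variance."""
    fwd_var = ((vol_far ** 2 * days_far - vol_near ** 2 * days_near)
               / (days_far - days_near))
    if fwd_var < 0:
        raise ValueError("far expiry has less total variance than near expiry")
    return sqrt(fwd_var)

ascending = forward_vol(0.40, 20, 0.42, 30)    # BTC example
descending = forward_vol(0.42, 20, 0.40, 30)   # the reverse case
print(f"20-to-30 day forward vol (40% then 42%): {ascending:.1%}")
print(f"20-to-30 day forward vol (42% then 40%): {descending:.1%}")
```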

The Arbitrage Lower Bound of a Calendar Spread

The fact that the second expiry includes the first expiry creates an arbitrage condition (at least in equities). An American-style time spread cannot be worth less than 0. In other words, a 50 strike call with 30 days to expiry cannot be worth less than a 50 strike call with 20 days to expiry.

Here’s a little experiment (use ATM options, it will not work if the options are far OTM and therefore have no vega):

Pull up an options calculator where you make a time spread worth 0.

I punched in a 9-day ATM call at 39.6% vol and a 16-day ATM call at 29.70001% vol. These options are worth the same (for the $50 strike ATM they are both worth $1.24).
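If you don’t have an options calculator handy, a rough Python sketch reproduces those prices. I’m assuming plain Black-Scholes with zero interest rates, no dividends, and a 365-day year (under those assumptions an American call is worth the same as a European one). The helper names are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, vol, days, rate=0.0, days_per_year=365):
    """Black-Scholes call price. rate=0 and a 365-day year are assumptions."""
    t = days / days_per_year
    sd = vol * math.sqrt(t)  # total standard deviation to expiry
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)

near = bs_call(50, 50, 0.396, 9)
far = bs_call(50, 50, 0.2970001, 16)
print(round(near, 2), round(far, 2))  # 1.24 1.24
```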

Now compute the implied forward vol.

                            Expiry 1    Expiry 2
Implied Vol                 39.6%       29.70001%
Variance (Vol²)             .157        .088
Time to Expiry (in days)    9           16

You can predict what happens when we weight the variance by days:

Expiry1 = .157 * 9 = 1.411

Expiry2 = .088 * 16 = 1.411

Expiry 2 has the same total variance as Expiry 1 which means there is zero implied variance between day 9 and day 16.

The square root of zero is zero. That’s an implied forward volatility of zero!

A possible interpretation of zero implied forward vol:

The market expects a cash takeover of this stock to close no later than day 9 with 100% probability.
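The zero-forward-vol condition also tells you where the 29.7% came from: it is the near vol scaled by the square root of the ratio of days, 39.6% × √(9/16). A short sketch of that floor, plus a scan that flags adjacent expiries whose total variance declines. Both function names are hypothetical, and the check only looks at ATM vols on the same strike, so treat it as a screen rather than proof of an arbitrage.

```python
import math

def min_far_vol(vol_near, days_near, days_far):
    """Lowest far-dated vol that keeps forward variance non-negative."""
    return vol_near * math.sqrt(days_near / days_far)

def calendar_violations(term_structure):
    """Return (near_days, far_days) pairs where total variance decreases.

    term_structure is a list of (days, atm_vol) pairs with distinct days.
    """
    points = sorted(term_structure)
    flagged = []
    for (d1, v1), (d2, v2) in zip(points, points[1:]):
        # Total variance (vol^2 * days) should never shrink with time.
        if v2 ** 2 * d2 < v1 ** 2 * d1:
            flagged.append((d1, d2))
    return flagged

print(f"{min_far_vol(0.396, 9, 16):.1%}")             # 29.7%
print(calendar_violations([(9, 0.396), (16, 0.29)]))  # [(9, 16)]
```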

A Simple Tool To Build

With a list of expirations and corresponding ATM volatility, you can construct your own forward implied volatility matrix:
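A minimal sketch of such a matrix in plain Python. The term structure below is made up for illustration (only the 20-day and 30-day points come from the earlier example), and the function names and output layout are my own assumptions. Each cell is the forward vol from the row’s expiry to the column’s expiry.

```python
import math

def forward_vol_matrix(expiries):
    """Map (start_days, end_days) -> implied forward vol.

    expiries is a list of (days, atm_vol) pairs with distinct days.
    A cell is None when the implied forward variance is negative.
    """
    points = sorted(expiries)
    matrix = {}
    for i, (d1, v1) in enumerate(points):
        for d2, v2 in points[i + 1:]:
            fwd_var = (v2 ** 2 * d2 - v1 ** 2 * d1) / (d2 - d1)
            matrix[(d1, d2)] = math.sqrt(fwd_var) if fwd_var >= 0 else None
    return matrix

def format_matrix(expiries):
    """Render the matrix as a text grid: rows are start days, columns end days."""
    days = sorted(d for d, _ in expiries)
    matrix = forward_vol_matrix(expiries)
    lines = ["from\\to" + "".join(f"{d:>8}" for d in days[1:])]
    for d1 in days[:-1]:
        cells = []
        for d2 in days[1:]:
            if (d1, d2) not in matrix:
                cells.append(" " * 8)          # end is not after start
            elif matrix[(d1, d2)] is None:
                cells.append(f"{'n/a':>8}")    # negative forward variance
            else:
                cells.append(f"{matrix[(d1, d2)]:>8.1%}")
        lines.append(f"{d1:>7}" + "".join(cells))
    return "\n".join(lines)

# Hypothetical ATM term structure: (days to expiry, implied vol).
hypothetical_term_structure = [(20, 0.40), (30, 0.42), (60, 0.43), (90, 0.44)]
print(format_matrix(hypothetical_term_structure))
```

The cells just off the diagonal are the forward vols between adjacent expiries, which is usually where a kink in the term structure shows up first.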

Arbitrage?

Like the interest rate forward example, there’s no arbitrage in trying to isolate the forward volatility unless you can buy a time spread for zero.[4]

For most of the past decade, implied volatility term structures have been ascending (or “contango” for readers who once donned a NYMEX or CBOT badge). If you sell a fat-looking time spread you have a couple major “gotchas” to contend with:

  1. Weighting the trade
    If you are short a 1-to-1 time spread you are short vega, long gamma, and paying theta. This is not inherently good or bad. But you need a framework for choosing which risks you want and at what price (that statement is basically the bumper sticker definition of trading imbued simultaneously with truth and banality). If you want to bet on the time spread narrowing, i.e. the forward vol declining, then you need to ratio the trades. The end of Moontower On Gamma discusses that. Even then, you still have problems with path-dependence because the gamma profile of the spread will change as soon as the underlying moves. The reason people trade variance swaps is that the gamma profile of the structure is constant over a wide range of strikes, providing even exposure to the realized volatility. Sure you could implement a time spread with variance swaps, but you get into idiosyncratic issues such as bilateral credit risk and greater slippage.
  2. The bet, like the interest rate bet, comes down to what the longer-dated instrument does outright. You were trying to isolate the forward vol, but as time passes your net vega grows until eventually the front month expires and you are left with a naked vol position in the longer-dated expiry and your gamma flips from highly positive to negative (assuming the strikes were still near the money).
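To see the sign of each Greek in a short 1-to-1 time spread, here is a rough sketch using Black-Scholes Greeks with zero rates. The spot, strike, vols, and expiries are hypothetical inputs I picked for illustration, and the helper names are my own.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bs_greeks(spot, strike, vol, days, days_per_year=365):
    """Black-Scholes vega, gamma, and theta with zero rates (an assumption).

    With zero rates these three Greeks are identical for calls and puts.
    """
    t = days / days_per_year
    sd = vol * math.sqrt(t)
    d1 = (math.log(spot / strike) + 0.5 * vol ** 2 * t) / sd
    pdf = norm_pdf(d1)
    return {
        "vega": spot * pdf * math.sqrt(t) / 100.0,  # per 1 vol point
        "gamma": pdf / (spot * sd),                 # per $1 move in spot
        "theta": -spot * pdf * vol / (2.0 * math.sqrt(t)) / days_per_year,  # per day
    }

def short_time_spread(spot, strike, near, far):
    """Net Greeks of long 1 near-dated option, short 1 far-dated option.

    near and far are (vol, days) tuples.
    """
    g_near = bs_greeks(spot, strike, *near)
    g_far = bs_greeks(spot, strike, *far)
    return {name: g_near[name] - g_far[name] for name in g_near}

# Hypothetical: long the 30-day at 40% vol, short the 90-day at 42% vol.
net = short_time_spread(50, 50, near=(0.40, 30), far=(0.42, 90))
for name, value in net.items():
    print(f"{name:>5}: {value:+.4f}")
```

Net vega comes out negative, net gamma positive, and net theta negative, matching the description in the first item. Rerunning it with fewer days left on both legs shows the front-month gamma and theta growing relative to the back month, which is the path-dependence in the second item.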

Term structure bets are usually not described as bets on forward volatility but more in the context of harvesting a term premium as time passes and implied vols “roll down the term structure”. This is a totally reasonable way to think of it, but using an implied forward vol matrix is another way to measure term premiums.

The Wider Lessons

Process

Forward vols represent another way to study term structures. Since term structures can shift, slope, and twist, you can make bets on the specific movements using outright vega, time spreads, and time butterflies respectively. A tool to measure forward vols is a thermometer in a doctor’s bag. How do we conceptually situate such tools in the greater context of diagnosis and treatment?

Here’s my personal approach. Recognize that there are many ways to skin a cat; this is just my own.

  1. I use dashboards with cross-sectional analysis as the top of an “opportunity funnel”. You could use highly liquid instruments to calibrate to a fair pricing of parameters (skew, IV risk premium, term premium, wing pricing, etc) in the world at any one point in time. This is not trivial and why I emphasize that trading is more about measurement than prediction. To compare parameters you need to normalize across asset types.
    To demonstrate just how challenging this is, an interview question I might ask is:

    Price a 12-month option on an ETF that holds a rolling front-month contract on the price of WTI crude oil[5]

    I wouldn’t need the answer to be bullseye accurate. I’m looking for the person’s understanding of arbitrage-pricing theory which is fundamental to being able to normalize comparisons between financial instruments. The answer to the question requires a practical understanding of replicating portfolios, walking through the time steps of a trade, and computing implied forward vols on assets with multiple underlyers. (Beyond pricing, actually trading such a derivative requires understanding the differences in flows between SEC and CFTC-governed markets and who the bridges between them are.)

  2. The contracts or asset classes that “stick out” become a list of candidates for research. There are 2 broad steps for this research.
    • Do these “mispriced” parameters reveal an opportunity or just a shortcoming in your normalization?
      Sleuthing the answer to that may be as simple as reading something publicly available or could require talking to brokers or exchanges to see if there’s something you are missing. If you are satisfied, to a degree of certainty commensurate with the edge in the opportunity, that you are not missing anything crucial, then you can move to the next stage of investigation.
    • Understanding the flow
      What flow is causing the mispricing? What’s the motivation for the flow? Is it early enough to bet with it? Is it late enough to bet against it? You don’t want to trade the first piece of a large order but you will not get to trade the last piece either (that piece will either be fed to the people who got hurt trading with the flow too early, as a favor from the broker who ran them over because trading is a tit-for-tat iterated game, or be internalized by the bank that controls the flow and knows the end is near).

3. Execute

Suppose you determine that the term structure is too cheap compared to a “fair term structure” as triangulated by an ensemble of cross-sectional measurements. Perhaps, there is a large oil refiner selling gasoline calls to hedge their inventory (like covered calls in the energy world). You can use the forward vol matrix to drill down to the expiry you want to buy. “Ah, the 9-month contract looks like the best value according to the matrix. Let’s pull up a montage and see if it’s really there. Let’s see what the open interest is?…”

As you examine quotes from the screens or brokers, you may discover that the tool is just picking up a stale bid/ask or wide market, and that the cheapest term isn’t really liquid or tradeable. This isn’t a problem with the tool, it’s just a routine data screening pitfall. The point is that tools of this nature can help you optimize your trade expression in the later stage of the funnel.

Meta-understanding

This discussion of forward vols was like month 1 learning at SIG. It’s foundational. It’s also table stakes. Every pro understands it. I’m not giving away trade secrets. I am not some EMH maxi[6] but I’ll say I’ve been more impressed than not at how often I’ll explore some opportunity and be discouraged to learn that the market has already figured it out. The thing that looks mispriced often just has features that are overlooked by my model. This doesn’t become apparent until you dig further, or until you put on a trade only to get bloodied by something you didn’t account for as a particular path unfolds.

This may sound so negative that you may wonder why I even bother writing about this on the internet. Most people are so far out of their depth, is this even useful? My answer is a confident “yes” if you can learn the right lesson from it:

There is no silver bullet. Successful trading is the sum of doing many small things correctly including reasoning. Understanding arbitrage-pricing principles is a prerequisite for establishing what is baked into any price. Only from that vantage point can one then reason about why something might be priced in a way that doesn’t make sense and whether that’s an opportunity or a trap.[7] By slowly transforming your mind to one that compares any trade idea with its arbitrage-free boundary conditions or replicating portfolio/strategy, you develop an evergreen lens to ever-changing markets.

You may only gain or handle one small insight from these posts. But don’t be discouraged. Understanding is like antivenom. It takes a lot of cost and effort to produce a small amount.[8] If you enjoy this process despite its difficulty then it’s a craft you can pursue for intellectual rewards and profit.

If profit is your only motivation, at least you know what you’re up against.