A Simple Demonstration of Return Vs Volatility

Arithmetic returns

  • Expected return for a bet is the simple probability-weighted average of outcomes.
  • If there is a 50% chance of a bet making 21% and a 50% chance of it returning 19%, this is a good bet that is also not volatile. You expect to make 20% on average (even though you can never make exactly that on any single bet, since you can only earn 19% or 21%).
  • Your expected terminal wealth after a single trial is 1.2x what you started with.
  • Since we took a simple average of the outcomes we computed an arithmetic mean return of 20%

Compounded returns

For multi-period investing where we do not take any distributions or “money off the table” we cannot use simple arithmetic means to compute an expected return.

Consider the same bet after 2 trials. These are the 4 possibilities each equally likely:

  • Best return, best return
  • Best return, worst return
  • Worst return, best return
  • Worst return, worst return

If we look at the summary table, there is no difference between the mean expected return and the median.

Let’s keep the mean return the same but raise the volatility. An investment that is equally likely to:

  • go up 100%
  • fall by 60%

Even though this is more volatile than the first investment, the mean expected return is still 20% per trial. You can compute this in 2 ways:

50% * +100% + 50% * -60% = 20%


Terminal wealth  = 50% * 2 + 50% * .4 = 1.2 or 20% return

But let’s see what happens when we look at the compounded scenario where we fully re-invest the proceeds of the first period into a second period.

Now the mean compounded return (the average of the per-path annualized returns) has dropped from 20% to just 4.72% and the median outcome is a loss of 10.6%!

The divergence between mean and median returns comes from the compounded effect of volatility.
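The two-period comparison can be reproduced by brute force. This is a minimal sketch, assuming each outcome is equally likely and that "mean compounded return" means the average of each path's annualized return; the function name `path_summary` is made up for illustration.

```python
from itertools import product
from statistics import mean, median

def path_summary(up, down, periods=2):
    """Enumerate every equally likely path of a two-outcome bet.

    `up` and `down` are per-period wealth multipliers (e.g. 2.0 and 0.4).
    Returns the mean and median of the per-path compounded returns.
    """
    cagrs = []
    for path in product([up, down], repeat=periods):
        terminal = 1.0
        for multiplier in path:
            terminal *= multiplier          # fully re-invest each period
        cagrs.append(terminal ** (1.0 / periods) - 1.0)
    return mean(cagrs), median(cagrs)

stable_mean, stable_median = path_summary(1.21, 1.19)
volatile_mean, volatile_median = path_summary(2.0, 0.4)

print(f"stable:   mean {stable_mean:.2%}, median {stable_median:.2%}")
print(f"volatile: mean {volatile_mean:.2%}, median {volatile_median:.2%}")
```

The stable bet's mean and median are nearly identical, while the volatile bet's split apart even though both bets average 20% per single trial.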

Investing Is a Multiplicative Process

When it comes to investing, we are usually re-investing rather than taking our profits off the table each year. We hope to grow our wealth year by year like this:

1.10 * 1.10 * 1.10 … or 1.10^n where n is the number of compounding intervals (typically years).

Therefore, we want to look at compounded rates of return, not arithmetic means. To compute them we simply take the n-th root of our terminal wealth, where n is the number of years, and subtract 1.

If you doubled your money in 5 years then your CAGR = 2^(1/5) - 1 = 14.9%

Note that if you took the naive average return you could say you earned 100% in 5 years or 20% per year. But this defies reality where you re-invested a growing sum of capital every year.
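A small sketch of the CAGR arithmetic, set against the naive average. The helper name `cagr` is illustrative only.

```python
def cagr(terminal_multiple, years):
    """Compound annual growth rate: the n-th root of terminal wealth, minus 1."""
    return terminal_multiple ** (1.0 / years) - 1.0

doubling_cagr = cagr(2.0, 5)      # doubled your money in 5 years
naive_average = (2.0 - 1.0) / 5   # 100% total gain spread evenly over 5 years

print(f"CAGR:          {doubling_cagr:.1%}")
print(f"naive average: {naive_average:.1%}")

# Compounding the CAGR for 5 years recovers the doubling; the naive rate overshoots.
print(f"(1 + CAGR)^5  = {(1 + doubling_cagr) ** 5:.3f}")
print(f"(1 + naive)^5 = {(1 + naive_average) ** 5:.3f}")
```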

CAGR is a median return

It’s important to note that the expected mean return of these investments is still 20% per year. It’s just that the median is much lower. In the high volatility example, your lived experience usually results in a loss of 10.6% but the mean compounded return over 2 periods is still positive 4.7%. The complication is that the average is driven by the 25% probability that you double your money in 2 consecutive years. In every other scenario, you lose money.

Volatility is altering the distribution of your outcomes not the mean outcome. 

Mathematically the median is the geometric mean. In a multiplicative process, you care more about the geometric mean. After all, you only get one life.

A note on log returns

A log return is a compounded return where we assume continuous compounding. So instead of every year, it’s more like every second. Of course, if our wealth grows from $1 to $2 in 5 years but we assume tiny compounding intervals, then the rate per interval must be small. After all, the start and end of our journey ($1 to $2) are the same; we are just slicing it into smaller sections.

Computing an expected log return is simple. Using the volatile example:

.5 * ln(2) + .5 * ln(.40) = -11.2%

Note that this is slightly worse than the geometric mean return (aka median) we computed earlier of -10.6%
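The link between the two numbers is easy to check: exponentiating the expected log return gives back the geometric mean. A minimal sketch using the volatile bet's two outcomes:

```python
import math

outcomes = [2.0, 0.4]        # wealth multipliers: up 100% or down 60%
probabilities = [0.5, 0.5]

# Expected log return: probability-weighted average of ln(multiplier).
expected_log_return = sum(p * math.log(x) for p, x in zip(probabilities, outcomes))

# Converting the continuously compounded rate back to a per-period rate
# recovers the geometric mean (median) return.
geometric_return = math.exp(expected_log_return) - 1.0

print(f"expected log return: {expected_log_return:.1%}")
print(f"geometric return:    {geometric_return:.1%}")
```

The log return is the more negative of the two only because it is quoted with continuous compounding; both describe the same terminal wealth.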

Volatility’s effect on compounded returns

The following table presents different investments that each have an expected arithmetic return of 20%. Just like the examples above. But the various payoffs are altered to proxy different levels of volatility. An investment that can earn 21% or 19% is much less volatile than one that can return 100% or -60% even though the average return is the same.

We use the simplest measure to represent the volatility — the ratio of the best return to the worst return.

The stable investment volatility proxy is 1.21 / 1.19 = 1.017

The volatile investment above is 2 / .4 = 5.00

Table snippet:

These charts show the divergence between arithmetic and median returns as we increase the volatility (the ratio of the best return to the worst return):

An investment that is equally likely to return 60% as it is to lose 20% has a 20% expected return but if you keep re-investing your long-term median outcome is closer to a 13% CAGR.

What if we raise the volatility further to a ratio of 5 (terminal wealth of 2x vs .4x):

At a ratio of 3.5 (1.87x vs .53x) our median result is roughly zero. At a ratio of 5, the average return remains 20% but the median return is a loss of about 10%. Almost all the paths are losing; they are just being counterbalanced by the unlikely event that you keep flipping heads.
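The relationship between the volatility ratio and the median return can be recomputed directly. This sketch assumes 50/50 bets whose two payoffs always average 1.2x; the specific `up` multipliers are picked to match the examples in the text, and `payoff_row` is a made-up helper name.

```python
import math

def payoff_row(up, mean_multiple=1.2):
    """For a 50/50 bet with the given `up` multiplier, find the `down`
    multiplier that keeps the arithmetic mean at `mean_multiple`."""
    down = 2 * mean_multiple - up
    ratio = up / down                        # the post's volatility proxy
    median_cagr = math.sqrt(up * down) - 1   # geometric mean return per period
    return down, ratio, median_cagr

print("up     down   ratio  median CAGR")
for up in (1.21, 1.40, 1.60, 1.87, 2.00):
    down, ratio, median_cagr = payoff_row(up)
    print(f"{up:.2f}x  {down:.2f}x  {ratio:5.2f}  {median_cagr:+.1%}")
```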


  • Investing is a multiplicative process so we want to look at compounded or log returns not simple returns
  • Compounded returns ask “what growth rate when multiplied from period to period gets us from the start point to the end point?”
  • Compounded and log returns are always less than arithmetic returns whenever there is any volatility
  • Compounded and log returns are better measures for what you expect to find in your bank account after volatility has taken its toll. Remember if you lose 50% on an investment you need 100% to get back to even. If you earn 50% on an investment you only need to lose 33% to be back at even.
  • If there was no volatility there’d be no promise of return, but volatility is a quadratic drag on returns. The sweet spot for your portfolio likely falls in the realm of the volatility of broadly diversified portfolios. By rebalancing you can reduce concentration risks that threaten to turn your entire nest egg into a coin flip. Even if this coin has positive expectancy, remember you can’t eat theoretical edge.
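Two of the claims above are quick to verify: the asymmetry between losses and the gains needed to recover, and the "quadratic drag". The drag formula used here (arithmetic mean minus half the variance) is a common rule of thumb and an assumption of this sketch, not something derived in the post; it is only approximate.

```python
import math

def gain_to_recover(loss):
    """Gain needed to get back to even after losing `loss` (0.5 means 50%)."""
    return 1.0 / (1.0 - loss) - 1.0

def loss_to_give_back(gain):
    """Loss that returns you to even after gaining `gain`."""
    return 1.0 - 1.0 / (1.0 + gain)

def exact_and_approx(up, down):
    """Exact geometric return of a 50/50 bet vs. the mean-minus-half-variance rule."""
    mu = (up + down) / 2 - 1          # arithmetic mean return
    sigma = (up - down) / 2           # standard deviation of a 50/50 two-point bet
    exact = math.sqrt(up * down) - 1
    approx = mu - sigma ** 2 / 2
    return exact, approx

print(f"lose 50% -> need {gain_to_recover(0.5):.0%} to recover")
print(f"gain 50% -> a {loss_to_give_back(0.5):.0%} loss puts you back at even")

for up, down in ((1.21, 1.19), (1.6, 0.8), (2.0, 0.4)):
    exact, approx = exact_and_approx(up, down)
    print(f"{up}x / {down}x: exact {exact:+.1%}, rule of thumb {approx:+.1%}")
```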

What I Learned About TIPs

I tried to buy some Treasury Inflation-Protected securities (TIPs bonds) last week. I failed. But I learned a lot about how they work.

I’ll paraphrase and quote from State Street’s outstanding primer for basic background info then explain the mechanics with an example.


  • TIPs bonds were introduced in 1997
  • 5, 10, 30 year terms
  • Like ordinary treasuries they are backed by the “full faith and credit” of the US government
  • The principal and income are indexed to inflation
  • At maturity, the holder receives the greater of the inflation-indexed principal or the original principal. In other words, there is an embedded put option struck at the inflation index price level at the time of issue


  • The annual coupon rate is fixed at issue
  • The coupon is paid semi-annually (each payment is 1/2 the coupon rate)
  • Although the coupon rate is fixed, the principal amount of the bond adjusts monthly based on the CPI-U or Consumer Price Index for All Urban Consumers (not seasonally adjusted)

Relationship to inflation

The breakeven rate allows you to compare TIPs to nominal treasuries

Breakeven: The annualized rate of CPI inflation over the life of the bond that makes the total return of a TIPS equal to that of a similar-tenor Treasury. Calculated as the yield difference between Treasury bonds and TIPS of the same maturity, breakeven rates are, ultimately, a proxy for the market’s inflation expectations. The lower the rate, the lower the expectation for inflation.

Positive inflation typically benefits the performance of TIPS, while falling inflation (deflation/ disinflation) may cause lower performance. It is important to note that market inflation expectations are often already priced into TIPS. Therefore, for inflation trends to be beneficial for the relative return of TIPS, it must develop at a rate that is higher than the market’s anticipated breakeven inflation rates.

The following example illustrates how the inflation adjustment feature of TIPS works during a period of inflation and what it means for returns. If the US 10-year yield is 3.87% and the yield on a 10-year TIPS bond is 1.58%, this means that the breakeven rate is 2.29%. If inflation over the next 10 years is actually 2.5%, this would lead to stronger relative performance, all else equal, for TIPS versus nominals, as realized inflation was higher than what was estimated (as represented by the breakeven) at the time of purchase.
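The breakeven arithmetic in that example, as a tiny sketch. The rule that the TIPS comes out ahead when realized inflation exceeds the breakeven ignores price moves before maturity, taxes, and compounding details, so treat the "edge" as a rough figure.

```python
def breakeven_rate(nominal_yield, tips_yield):
    """Breakeven inflation: nominal Treasury yield minus TIPS (real) yield, in percent."""
    return nominal_yield - tips_yield

breakeven = breakeven_rate(3.87, 1.58)   # 10-year yields from the example, in percent
realized_inflation = 2.5                 # the example's realized inflation, in percent

tips_outperforms = realized_inflation > breakeven
rough_annual_edge = realized_inflation - breakeven

print(f"breakeven: {breakeven:.2f}%")
print(f"TIPS outperforms nominal if held to maturity: {tips_outperforms}")
print(f"rough annual edge: {rough_annual_edge:.2f}%")
```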

A change in market expectations or uncertainty about inflation can change TIPS prices before maturity, however. For example, beginning in April 2021 nominal and real yields both fell. Yet, real yields fell faster as a result of widening breakeven rates and investors’ desire to mitigate the effects of inflation on their Treasury exposure. At the time, therefore, investors felt breakeven rates (i.e., market-based inflation expectations) were understated and not reflective of the loose policy environment. As expectations increased, TIPS outperformed nominal Treasuries by more than 8% through 2021.

Why are TIPs returns only loosely correlated to inflation?

Because interest rates themselves are correlated with inflation and…

like all bonds, TIPS are subject to interest-rate risk. And because of this, they are not a perfect hedge against inflation. For example, in March 2022, the Fed began an aggressive rate hike campaign to combat rising inflation. Through year end, the central bank raised rates by a total of 4.25%, which was the fastest rate hike cycle in decades. During the same period, TIPS registered a loss of 10.8%, primarily due to their duration risk amid the unprecedented speed of the rate hike cycle. These bond losses were widespread among many other fixed income asset classes, such as nominal Treasuries, investment grade corporate bonds, high yield bonds, etc.


  • TIPs income is taxed at ordinary rates
  • The income is state and local tax-exempt
  • Phantom income tax: The principal amount of the bond is indexed to inflation. In a positive inflation environment that increase in principal is taxable even though you do not receive distributions. It is possible that your annual tax liability exceeds the coupon income. At maturity, you will be repaid the appreciated principal amount but there will not be a capital gains tax as you have been paying the taxes on the phantom income during the holding period


A quick bit on the CPI-U inflation index.

From Historical CPI-U data we can see:

  • The period from 1982-1984 has a defined index value of 100
  • The August 2023 index value was 307.026

In other words, $3.07 today equates to $1.00 in the 1982-1984 period

Let’s examine a 30-year TIPs bond issued on February 28, 2023

  1. The dated date is February 15, 2023. That’s when interest starts to accrue.
  2. The coupon is 1.50%. That rate is fixed forever.
    • The holder receives .75% of the adjusted principal value every 6 months from the dated date (every August and February 15th)

Adjusting the principal

When the above 30-year TIPs was dated on February 15, 2023 the CPI-U index was at 297.254 which would have corresponded to an index ratio of 1.00.

The index ratio is the factor you multiply the $100 par value of the bond by to find the adjusted principal.

On October 1, 2023, ref CPI was 305.691

305.691/297.254 = 1.02838

According to the table that’s exactly the index ratio on October 1. That means the adjusted principal of the bond on that day is $102.838

For demonstration’s sake, pretend it paid a coupon on October 1.

The coupon payment would be:

.75% x 102.838 = $0.7713 for each $100 worth of bonds you originally bought.

  • As CPI-U increases the index ratio increases. The fixed-rate coupon is multiplied by this index ratio to compute your interest.
  • At maturity, you aren’t paid back $100 but the adjusted principal per $100 of bond you originally bought.
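The principal adjustment and coupon math above can be sketched in a few lines. The function names are illustrative, and the October 1 coupon is the same pretend payment used in the walkthrough.

```python
def index_ratio(ref_cpi, dated_cpi):
    """Reference CPI on the valuation date divided by reference CPI on the dated date."""
    return ref_cpi / dated_cpi

def adjusted_principal(par, ratio):
    """Inflation-adjusted principal for a given par amount."""
    return par * ratio

def semiannual_coupon(annual_coupon_rate, principal):
    """Each payment is half the fixed annual coupon rate applied to adjusted principal."""
    return annual_coupon_rate / 2 * principal

ratio = index_ratio(ref_cpi=305.691, dated_cpi=297.254)
principal = adjusted_principal(100.0, ratio)
coupon = semiannual_coupon(0.015, principal)   # 1.50% coupon, paid in two halves

print(f"index ratio:        {ratio:.5f}")
print(f"adjusted principal: ${principal:.3f} per $100 par")
print(f"coupon payment:     ${coupon:.4f} per $100 par")
```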

Quoting convention

As you can imagine, buying TIPs later in the secondary market means the security’s index ratio is much higher than 1 because of the accumulated inflation.

Consider a 20-year TIPs that:

  • was issued in 2009
  • has less than 6 years remaining until maturity
  • referenced a CPI-U index of 214.7 when it was dated

The bond will be quoted as a percentage of 100 par.

This is a snapshot of the bond at the close of 10/11/23 from my Interactive Brokers account:

Even though the bond is quoted as a percentage of par — we’ll use a last sale of 100.7255 — the outlay must be multiplied by the bond adjustment factor (ie the index ratio of 1.42581).

Outlay  = price x index ratio

It would cost 100.7255 x 1.42581 = 143.615 per 100 of face, or $1,436.15 per $1,000 of face value.

Your coupon payments will be of course indexed to an adjusted principal value:

adjusted principal = face value x index ratio

adjusted principal = $1,000 x 1.42581 = $1,425.81

If October 11, 2023, was a coupon date:

coupon payment = 1.25% x 1,425.81 = $17.82

Don’t forget at maturity, your bonds return the adjusted principal not the $1,000 face value.
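A sketch of the secondary-market arithmetic, using the quoted price and index ratio from the example. It ignores accrued interest and commissions, which a real trade ticket would include; `tips_outlay` is a hypothetical helper name.

```python
def tips_outlay(quoted_price, index_ratio, face_value):
    """Cash outlay = quoted price (percent of par) x index ratio x face value."""
    return quoted_price / 100.0 * index_ratio * face_value

quoted_price = 100.7255   # last sale, as a percentage of par
ratio = 1.42581           # the bond's index ratio (adjustment factor)
face = 1_000.0

outlay = tips_outlay(quoted_price, ratio, face)
adjusted_principal = face * ratio
coupon_payment = 0.0125 * adjusted_principal   # 1.25% semiannual payment from the example

print(f"outlay:             ${outlay:,.2f}")
print(f"adjusted principal: ${adjusted_principal:,.2f}")
print(f"coupon payment:     ${coupon_payment:,.2f}")
```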

Additional Considerations

On-the-run vs off-the-run

Consider 2 bonds:

  • A 10-year bond issued today
  • A 30-year bond issued 20 years ago

Both bonds have 10 years remaining to maturity. The new bond is called on-the-run and the old one is known as the off-the-run.

Off-the-run bonds will trade at a discount to the on-the-run (ie the off-the-run will offer a higher yield). Why?

If you buy a new bond with an index ratio near 1.00 you cannot lose if there is deflation. The bond’s adjusted principal cannot fall below 100% of par.

If you buy an off-the-run bond whose years of prior inflation have pushed the adjusted principal up to, say, 1.4x of par, then month-over-month deflation will lower the CPI-U index, bringing the ratio and adjusted principal down with it. The 1.00 strike put embedded in the bond doesn’t protect you from a falling price level the way it protects a newly issued bond.
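To see why the embedded put matters more for a new issue, compare what each bond repays at maturity after a bout of deflation. This is a sketch under simplifying assumptions: the 5% cumulative deflation is a made-up scenario, and the comparison looks only at principal, not coupons or market prices.

```python
def maturity_principal(par, ratio_at_maturity):
    """At maturity the holder receives the greater of adjusted or original principal."""
    return par * max(ratio_at_maturity, 1.0)

def principal_lost_to_deflation(par, ratio_now, cumulative_deflation):
    """Drop from today's adjusted principal to the amount repaid at maturity."""
    ratio_later = ratio_now * (1.0 - cumulative_deflation)
    return par * ratio_now - maturity_principal(par, ratio_later)

deflation = 0.05   # hypothetical: price level falls 5% between now and maturity

on_the_run_loss = principal_lost_to_deflation(100.0, 1.00, deflation)
off_the_run_loss = principal_lost_to_deflation(100.0, 1.40, deflation)

print(f"on-the-run  (ratio 1.00): loses {on_the_run_loss:.2f} per 100 par")
print(f"off-the-run (ratio 1.40): loses {off_the_run_loss:.2f} per 100 par")
```

The new bond's floor at par absorbs the whole decline, while the seasoned bond's floor sits far below its current adjusted principal and never kicks in.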

All the TIPs on IB’s platform seem to be off-the-run except for the recent 30-year bond. The lowest index ratio on a bond with less than 10 years til maturity is about 1.40 which means there are no on-the-run 5 or 10-year TIPs on the platform.


You can access TIPs via ETFs. State Street is an ETF provider so they remind you that TIPs ETFs avoid phantom income:

One of the complicating issues of using individual TIPS is that investors must pay taxes each year on the inflation adjustment to the principal even though the inflation adjustment isn’t received until the bond matures. ETFs avoid issuing this “phantom income” by distributing all inflation adjustments (classified as Treasury income) as they are accrued. This turns phantom income into realized cash flows.

I’ll add one more thought. ETFs offer something akin to a constant maturity exposure. So for example, if you buy an ETF targeting 7-10 year durations you will have constant exposure to that level of interest rate risk/sensitivity.

If you buy individual bonds, as time passes they become shorter-dated which reduces the exposure to interest rates.

It’s not a matter of what’s better, it’s just a question of whether you want constant exposure to interest rate volatility or if you like an entry point today and are content to receive the carry while the exposure to rates dwindles away.

I saw a YouTube video comparing the performance of a TIPs ETF vs just owning an individual TIPs bond and then it moaned about how the ETF did worse in the rising rate environment. Well of course it would: its interest rate exposure never lapses. If rates were falling, the ETF would have done better than the individual bond.

If you are going to compare an ETF to individual bonds, you should compare the ETF to a bond ladder that is rebalanced to a constant exposure to the same duration.

Final observations

Inflation reporting

Inflation reporting denotes a trailing 1-year rate. The latest CPI index vs 12 months earlier. Just like you might prefer a shorter moving average to emphasize recent data you might want to just look at the monthly inflation index. You could even annualize the recent 3-month change if you thought that was more relevant. If you do that, don’t forget to consider seasonal biases.

  • For the past 3 years, inflation has been compounding at a touch over 6%
  • In the past year, we’ve seen inflation of about 3.70%
  • In the past 6 months, we’ve seen inflation of about 3.94% annualized

TIPs seem cheap relative to nominal bonds (ie breakevens are “low”)

Implied breakevens:

5 Year Breakeven Inflation = 2.20%

10-year Breakeven Inflation = 2.32%

Does this imply:

  1. Bearishness (ie inflation is going to tank — this doesn’t seem to be reflected in equity valuations)?
  2. Flow anomalies?
  3. The breakevens are derived from off-the-run TIPs whose embedded put option is worthlessly far out-of-the-money?

Oh yea, one last thing…

I said in the beginning I tried to buy TIPs and failed. There were over $5,000,000 worth of bonds on the offer. I tried to pay the ask for $100,000 worth and got a message that the NBBO did not need to be honored for less than the displayed size. I don’t know how the heck the bond market works but that just smells.

Outline of the Risk-Neutral Probability lessons

I published a new lesson. It’s a big post with 5 embedded sub-posts and exercises for the reader. It ties together many concepts I write about.

I adapted the Tweet thread I used to promote it into this post to provide an outline for prospective readers.

Understanding Risk-Neutral Probability (Moontower)


To let you build your own understanding, the lesson begins with a progression of simple questions.


From this simple progression, you have actually self-derived a foundation for a key concept that we are going to get some mileage out of. Plus more practice to help you internalize the concept. It will make sense!


For those learning about derivatives or even pros who aren’t academically minded (myself included) there are exploration detours into 2 topics.

Useful Detours

1. Advanced Topic: How To Compute Risk-Neutral Probabilities From A Binomial Tree (Moontower)

That section includes exercises, again so you can own the knowledge, and then a derivation that you will probably have figured out intuitively:

  2. Advanced Topic: ReplicationReal World vs Risk-Neutral Worlds (Moontower)

    This one will be especially fun for traders because it includes:

    💡 What seasoned option traders get wrong

    💡 How divergence between real and risk-neutral probabilities leads to mutual opportunity for speculators & traders

    I give examples from:

    1. Warren Buffett
    2. FX Carry

Returning to the main post

After the detours, we pick up again with the idea of risk-neutrality and make a logical step into the principle that underpins how investments will be priced:

🧽risk absorbability


This is an underappreciated idea. I can tell because people confess their ignorance all the time when talking about their great investments. They don’t understand that the default assumption should be idiosyncratic risks don’t pay.

Why not?

We consider 2 significant ways that an investing entity can absorb risk and use simple examples to demonstrate how risky propositions can be rationally priced with no risk premium:

1. Bet Sizing

2. Diversification

Let’s look at each:

Risk absorption by bet sizing


Risk absorption by diversification:


The key takeaway: The more risk you can absorb, the closer your bid approaches the risk-neutral (ie arbitrage-free) price.
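One way to make the bet-sizing version of that takeaway concrete is a toy model. The log-utility bettor below is purely an assumption of this sketch (the lesson itself may frame it differently), and `max_bid` is a made-up name. It finds the highest price a bettor would pay for a ticket that pays $100 on a coin flip, and shows the bid creeping toward the $50 fair value as the ticket shrinks relative to wealth.

```python
import math

def max_bid(wealth, payoff, prob):
    """Highest price a log-utility bettor pays for a ticket that pays
    `payoff` with probability `prob` and nothing otherwise."""
    fair_value = prob * payoff

    def utility_gain(bid):
        win = math.log(wealth - bid + payoff)
        lose = math.log(wealth - bid)
        return prob * win + (1 - prob) * lose - math.log(wealth)

    # The gain is positive at a zero bid and non-positive at fair value,
    # so bisect between them (never bidding the entire bankroll).
    lo, hi = 0.0, min(fair_value, wealth * (1 - 1e-12))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if utility_gain(mid) > 0:
            lo = mid
        else:
            hi = mid
    return lo

for wealth in (100, 1_000, 10_000, 1_000_000):
    bid = max_bid(wealth, payoff=100.0, prob=0.5)
    print(f"wealth ${wealth:>9,}: max bid ${bid:.2f} (fair value $50.00)")
```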

Bringing it all together

Now we get to tie all of this together to reason about investing broadly. There’s a short recap before we get to heavier lifting.


Now we’re ready for the real discussion which starts with an obvious question:

🏗️Why did we slowly build all this theoretical scaffolding?

2 reasons that I categorize as:

1. Instrumental

2. Appreciative

These are essays in themselves.

🔗Instrumental Reasons (aka the practical reasons to learn this stuff)

We divide the audience for this into

a) Traders

b) Investors


I also give an example of the line being blurry between them:


Then we really address the implications for investors which applies to many more people.

We do this in Socratic form.


⚕️Then I offer some prescriptions


That covers instrumental reasons but I am an advocate of financial masturbation. So we also look at the appreciative reasons to understand this stuff.

Like I mentioned, this is an essay unto itself:

🔗Appreciative Reasons

This is the outline:


Followed by some cope:


Then closing comments and links for further reading. This post tied together a lot of material.


🔗Links For Further Reading

You Think You’re Trading Vol….But Are You Even?

Option amateurs underappreciate the role of funding in pricing derivatives. Professional options traders need to be obsessed with funding costs because they are trading for tiny, often sub-penny, margins.

Here’s a simple example to demonstrate the tyrannical effect of funding on pricing:

What is a 1-year American at-the-forward call option on a non-div paying, 20% implied vol, $100 stock worth?

You need to feed the model an interest rate to get an answer. You look at the yield curve and see a 5% rate (making this up) for 1 year. This yields a forward price of $105 (we can hand-wave simple vs compounded rates for this purpose).

Imagine the bid-ask for this call is 40 cents wide $7.80 – $8.20

If you buy on the bid and sell on the offer you make a .40 profit. Easy-peasy.

Now imagine you buy the bid and hedge the position until expiry. What implied vol did you buy?

The first thing to recognize is that you will be shorting the stock to hedge. Assuming it’s easy to borrow, you are still not going to receive a 5% rate on the cash proceeds. Your prime broker needs to earn its margin. If 5% is the risk-free rate, let’s assume they pay you 4.5% on cash balances. Conversely, the prime broker will lend at 5.5% (this is known as the “long rate” and it’s the rate you finance long positions at). If you sell the call on the offer you will need to pay that rate to finance the shares you buy.

Uh oh.

If you buy the call you need to use a 4.5% rate in the model to back out an implied vol and if you sell the call you need to use a 5.5% rate in the model. You can see where this is going.

  • If you buy the call on the bid you are paying 20.06% implied vol.
  • If you sell the call on the offer you are selling 19.95% implied vol.

(Check the math if you want)

You think you’re trading vol but because of the bid-ask spread on your funding rate, you are basically trading the same implied vol even if you buy the bid and sell the ask. Rho is the sensitivity of the option price for a 1% change in the interest rate. The vega of an option is the sensitivity of its price for a 1-point change in volatility.

The rho of this call option is 46 cents vs a vega of 40 cents.
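For anyone who does want to check the math, here is a minimal Black-Scholes sketch. It leans on a few assumptions that are not spelled out in the example: continuously compounded rates, a $105 strike (the hand-waved forward), and European pricing, which is fine here because an American call on a non-dividend stock is not worth exercising early when rates are positive. Because of the rate convention, the implied vols it prints land near, but not exactly on, the 20.06% and 19.95% quoted above.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1_d2(spot, strike, rate, vol, t):
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * math.sqrt(t))
    return d1, d1 - vol * math.sqrt(t)

def bs_call(spot, strike, rate, vol, t):
    d1, d2 = d1_d2(spot, strike, rate, vol, t)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)

def implied_vol(price, spot, strike, rate, t, lo=1e-4, hi=2.0):
    """Bisect on vol; the call price is increasing in vol."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, rate, mid, t) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def vega_per_vol_point(spot, strike, rate, vol, t):
    d1, _ = d1_d2(spot, strike, rate, vol, t)
    return spot * norm_pdf(d1) * math.sqrt(t) / 100.0

def rho_per_rate_point(spot, strike, rate, vol, t):
    _, d2 = d1_d2(spot, strike, rate, vol, t)
    return strike * t * math.exp(-rate * t) * norm_cdf(d2) / 100.0

S, K, T = 100.0, 105.0, 1.0
bid_vol = implied_vol(7.80, S, K, 0.045, T)   # buy the bid, earn 4.5% on short proceeds
ask_vol = implied_vol(8.20, S, K, 0.055, T)   # sell the offer, pay 5.5% to finance stock
one_rate_spread = implied_vol(8.20, S, K, 0.05, T) - implied_vol(7.80, S, K, 0.05, T)

print(f"bid vol at 4.5%: {bid_vol:.2%}, ask vol at 5.5%: {ask_vol:.2%}")
print(f"vol spread if both sides funded at 5%: {one_rate_spread:.2%}")
print(f"vega: ${vega_per_vol_point(S, K, 0.05, 0.20, T):.2f} per vol point")
print(f"rho:  ${rho_per_rate_point(S, K, 0.05, 0.20, T):.2f} per 1% in rates")
```

With a single funding rate the 40-cent market is about a vol point wide; with the funding spread applied, the two sides collapse to nearly the same vol.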

A 1% difference in funding rate (ie 4.5% vs 5.5%) is an institutional level bid-ask. It can be much worse for retail.

If you are trying to make markets you think you’re trading vol but are you even?

Pricing and carrying longer-dated options is crucially dependent on funding costs and the bid-ask spreads might not even be wide enough to compensate a market maker for their funding spread. Another way of saying this: the market-maker with such a 1% wide funding rate is making a 20% “choice” market in the vol. If the bid-ask was tighter they would be bidding a higher vol than they were offering!

(Again this assumes they hold and manage the position as opposed to spreading the options off by say buying one call and selling another or having the privileged position of just getting ping-ponged on their posted bid-ask all day)

It’s Not The Merit It’s The Price

My past self makes me cringe.

I remember a weekend Yinh and I spent in Big Sur before having kids. We stayed at a resort/hotel place for free in exchange for listening to the timeshare spiel. I’m just pushing back on every point, complaining about the math this poor lady on the bottom-of-the-realtor-totem-pole is conveniently ignoring. Looking back, I’m genuinely sorry to have been acting myself in that moment.

When you feel your blood pressure rising you can channel some grace by just thinking of someone you know who would be smooth in that situation. The aspirational move here is to just smile and nod. I had the situation exactly backward: it was me who was embarrassing himself, not her with the canned pitch, as pushy and nonsensical as it was.

Luckily I have this moon letter thing as an outlet for my teeth-grinding financial complaints. I’m over the timeshare sales thing (well, actually I just pay for a room and save myself the grief. I admit this feels more like a hair dryer solution than addressing the root of my anger) and onto another: I can’t stand when a life insurance salesperson pretends they are doing god’s work by telling me about their widow client’s big settlement. I’m not against buying insurance; I have car insurance and life insurance. But I’m against motte-and-bailey persuasion techniques. If a widow getting paid is deemed a self-congratulatory act of corporate benevolence then Warren Buffett is the priest of puts, a hokey paragon of virtue, backstopping markets with the heart of a patriot. Ok.

Defending life insurance by focusing on the settlements that get paid out is as silly as branding calls sold as income. And for the same reason — there is no consideration of price. Let’s compare:

Defense of insurance: “Look at the settlement the policyholder received. It has so many zeros in it.”

Rebuttal: That would be true even if the insurance cost twice as much. So the issue isn’t whether there would be a settlement, it’s the proposition on the whole.

Defense of covered calls: “The premium you collect is extra income, and if the calls go in-the-money you’ll be happy anyway”

Rebuttal: This would be true even if I sold the calls for 1/2 the price that I actually sold them for.

In other words, both of these defenses are empty words because they skirt the defining point:

It’s not the merit of the idea — it’s the price.

The wrong price will ruin any proposition. Ideas without prices are worthless. “It’s a good idea to brush your teeth.” But if brushing your teeth took 8 hours a day, you’re better off pulling them all and getting implants.

“It’s a good idea to get insurance” has the invisible qualifier “assuming the price is reasonable”. From there we can debate “reasonable” and we should. But I assure you the percentage of time spent in a life insurance consultation that’s devoted to decomposing its cost is not commensurate to how important it is in the decision.

Money Angle

Let’s harp on this “merit cannot exist independent of price” idea. We’ll return to insurance for a moment.

The griftiness of insurance sales as a function of complexity is an inverted U curve. Term insurance is not complex, it’s highly competitive and low margin. Private placements, which I’ve written about, are sold to very wealthy people who likely have a CFO-type managing their money. It’s the midwit crowd from all ends of the income spectrum that express their snowflake exceptionalism in exactly the wrong place and end up paying for their agents’ kids’ private school tuition.

Many insurance products are complex and seriously difficult to understand — every now and then I’ll take a hard look at one and just think, “they expect the average person to comprehend what’s actually going on inside this black box?!” And of course, the answer is “no”. That’s actually the point.

Here’s a tip — run away if you can’t understand the insurance product better than the salesperson. This is not as high a bar as you think. Salespeople are experts at sales not financial engineering. If they weren’t selling annuities they’d be selling cars or homes. (It’s a blanket statement so there are exceptions — but you know who will agree with me the most? Nerdy advisors who don’t have perfect teeth. This is the old Taleb bit “surgeons shouldn’t look like surgeons”.)

When I look at insurance products, especially structured products, I look for the options embedded in them. The costs of these options are opaque. Many of them have analogs in the listed options markets, but ultimately the ones buried in insurance policies resemble illiquid flex options with long-dated maturities and substantial padding added to their prices. If you wanted to be rigorous about valuing an insurance policy you’d need to know everything from the value of these hidden options to how much credit risk to discount the various issuers’ policies by. Apples-to-apples comparisons are impossible. This de-commoditizes the products, giving unscrupulous salespeople ample room to practice their dark art.

An aside about options thinking

I know someone who negotiates and prices leases for commercial office space. They work on huge leases with clients like FAANG. One of the things they mentioned was how they would try to embed provisions in leases which were basically hard-to-price options. The person also spent a couple years with an options market-making group and is generally very quantitative — I would use the person for math help regularly.

I also know of a few wildly successful option traders who did quite well in personal RE investing by structuring options with potential sellers (one of these stories was focused on an ex-colleague of mine which was discussed in a certain big city’s media post-GFC).

And one more related bit — an option manager I know is friends with a fund manager who deals exclusively in the pre-IPO share market. This is a class of funds that provide liquidity to late-stage VC portfolio company employees. The manager was able to help the fund manager by showing them how a particular option embedded in their structures was deeply mispriced.

A final aside on the usefulness of option thinking…in Option Theory As A Pillar Of Decision-Making, I include this:

Getting to The Price

A current example of the need to assess a proposition by understanding its price comes from the boom in covered-call ETFs. Jason Zweig of the WSJ recently published:

Why Investors Are Piling Into Funds That Promise Not to Beat the Stock Market (paywalled)

After great returns last year, covered-call funds are all the rage among income-oriented investors. But their high yields aren’t a free lunch.

The article covers the explosion in AUM in covered-call funds like the JPMorgan Equity Premium Income ETF (JEPI) or Global X Nasdaq 100 Covered Call ETF (QYLD).

These ETFs manage roughly $20B and $6B in AUM respectively.

We’ll talk about QYLD because its holdings are published while JEPI is a discretionary, actively managed ETF. (But I still want to know who gets to hungry-hungry hippo those option orders!).

QYLD sells covered calls on the Nasdaq 100. That means it sells a call option while owning the underlying index. If you buy 100 shares of QQQ and sell a call option you could do the same thing. That’s not an argument against this product though. Ease is a valid use case for a product.

More background: it sells the 1-month at-the-money call as opposed to out-of-the-money calls which is what people generally think of with covered-call strategies (when I was just a boy they called these “buy-writes” but I haven’t heard that term since Arrested Development was on the air).
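For reference, the payoff of an at-the-money covered call held to expiry looks like this. The $100 index level and $3 premium are made-up numbers for illustration, not QYLD's actual figures.

```python
def covered_call_pnl(start_price, end_price, strike, premium):
    """P&L per share at expiry: long the underlying plus short one call."""
    stock_pnl = end_price - start_price
    short_call_pnl = premium - max(end_price - strike, 0.0)
    return stock_pnl + short_call_pnl

start = 100.0      # hypothetical index level when the call is sold
strike = 100.0     # at-the-money, as in the fund's approach
premium = 3.0      # hypothetical 1-month call premium

for end in (85.0, 95.0, 100.0, 105.0, 115.0):
    full = covered_call_pnl(start, end, strike, premium)
    half = covered_call_pnl(start, end, strike, premium / 2)
    print(f"index ends at {end:6.1f}: P&L {full:+6.2f} (at half the premium: {half:+6.2f})")
```

The upside is capped at the premium while the downside is the index's loss less the premium. The shape is the same at half the premium, which is why the pitch says nothing until a price is attached to it.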

I’ve addressed “selling options for income” as euphemistic, sales-led framing. I’m not necessarily opposed to selling options but when you brand it as “income” you are blatantly misrepresenting reality. You are pretending the option premium is income when the bulk of it is just the fair discounted weighted average of a set of possible futures. My bone with the marketing pitch is that there’s no discussion of price. Again, whether this is a good strategy depends on price and the price isn’t static. (I feel like I’ve force-fed you like foie gras on this topic. If I have to hear about this “strategy” from one more medical professional, I had better be sedated on an operating table so I can finally drown it out.)

When the marketers show me the level of implied correlations they are selling in the calls then we can have a good-faith conversation. Or how about when they tell me who the buyer for those calls is? Because I can assure you there’s no natural buyer — the boys and girls buying those calls are only doing so because they are too cheap. They didn’t wake up in the morning and think “I’m not going to look at prices, I just think owning call options that go to zero is a reasonable way to invest my money.” You know what traders are thinking when they see the marketers pitch: “Thank you for stocking the pond, we’ll be waiting”.

And they will be waiting. Market-makers are lions in the bush who know the dinner’s migration patterns. Unlike lions, they need to be discreet. You can’t just pounce and scare everyone off. You don’t want to make a scene. So they pre-position.

The market-makers’ pre-positioning serves a dual purpose.

  1. It spreads the market impact over a longer window of liquidity. This is actually pro-social: it’s “markets properly working”. The telegraphed order is not as scary even though it’s a large size because the end of it is known and there’s no adverse selection risk. It’s what’s known as a “dumb” or uninformed order. It’s not reasonable to expect zero market impact because unless there’s someone who wants to buy all these options, the pool of greeks needs to be absorbed by a get-paid-to-warehouse-risk-in-exchange-for-profit entity. The market is just an auction for that clearing price and the greeks dropped on the market will be recycled in adjacent markets emanating from the original disturbance. (I.e. the market makers will buy vega from you and sell it in some other correlated market where the entire proposition presents an attractive relative value play. It’s just a big web. Market-makers are the silk between the nodes.)
  2. You want the option seller to get filled near the offer so they feel good about the fill. That’s what it means to “not make a scene”. So now that you are short vol 3 days ahead of the anticipated arrival of the order, knowing that the current vol level incorporates the impact of your own selling, you are ready to buy the new supply “in line”. Remember this is not frontrunning. It’s a probabilistic bet. The market-makers have no fiduciary duty to the fund (as opposed to actual frontrunning where the broker trades ahead of an order they control). Market-makers want the brokers to “feel” like they got a good fill. There are no fingerprints. A TCA that looks at execution price vs arrival price is already benchmarked to a mid-market price that has been faded to absorb the flow.

What does this mean for the cost of something like QYLD?

A napkin math approach


  • At the current AUM, they sell about 5,000 NDX at-the-money call options (equivalent to 200,000 QQQ options) every month.
  • Implied volatility is about 25% so the fund collects 2.89% of the index level in premium monthly. (Can you see how ridiculous it is to call this income? Would you call it income regardless of how little premium it collected? What if the option was in-the-money and they collected the same amount of premium? Conflating premium with income is a timeshare tactic except it’s pushed by corporations who know better, not Jane “it’s this job or dogfood for dinner” Doe.)
  • The ATM call is pure extrinsic value.

The question is how much vol slippage can we expect on that order. I asked around and a full vol point seems like a reasonable estimate. Because of the “setting the table” pre-positioning effect it’s hard to get a perfect answer. So we’ll use 1 vol point and you can adjust the final analysis by changing it.

If there is 1 full vol click of slippage and the option you sell is pure extrinsic, then you are losing:

1 vol point / 25 vol points x 2.89% of AUM x 12 months in annual slippage.

That’s 139 bps in annual slippage. That needs to be added to the 60 bp expense ratio for the fund.

So you are paying 1.99% per year for a beta-like exposure created with vanilla products. And the alleged income is not income. It’s a correctly priced option premium in one of the most liquid equity index markets in the world.
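The napkin math fits in a few lines of Python. Treat this as a rough sketch rather than the fund's actual accounting: the 0.4 × vol × √time rule of thumb for an at-the-money option is my assumption about where the 2.89% comes from, and the variable names are invented for the example.

```python
import math

# Inputs taken from the napkin math above
implied_vol = 0.25        # annualized implied volatility
vol_slippage = 0.01       # 1 vol point of slippage (change this to taste)
expense_ratio = 0.0060    # 60 bp fund expense ratio
t = 1 / 12                # 1-month option, in years

# Assumption: rule-of-thumb ATM option value as a fraction of spot,
# roughly 0.4 * vol * sqrt(time)
monthly_premium = 0.4 * implied_vol * math.sqrt(t)

# A pure-extrinsic option's price scales roughly linearly with vol,
# so giving up 1 of 25 vol points gives up 1/25 of the premium each month
annual_slippage = (vol_slippage / implied_vol) * monthly_premium * 12
total_cost = annual_slippage + expense_ratio

print(f"monthly premium:   {monthly_premium:.2%}")
print(f"annual slippage:   {annual_slippage * 1e4:.0f} bps")
print(f"total annual cost: {total_cost:.2%}")
```

Change `vol_slippage` to half a point or two points to see how sensitive the all-in cost is to the one number nobody publishes.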

Even if I grant you a 10% VRP (variance-risk-premium is an idea that options are bid beyond their fair value for any number of reasons like convexity-preference, hedging demand, or the possibility that markets allocate prices according to efficient portfolios and single assets being mispriced might not be from a portfolio point-of-view) that means the alleged income is 10% of what the marketers claim.

This whole trend in covered-call ETFs feels more like an innovation for getting paid for commoditized exposures in a fee-compressed landscape than an innovation that actually improves investing outcomes.

An (Overly) Candid Opinion

I’m not some socialist arguing against giving people an abundance of choice. I just want to remind you that no smart-sounding idea gets a free pass without consideration of its cost. And my own wholly personal opinion is you are paying a lot for convenience here. Plus the more AUM these things get the worse the slippage.

A saying I repeat too much: Asset management is the vitamin industry. It sells placebos. It sells noise as signal.

The proliferation of option products seems like something devised by products people not alpha people, a complaint I’d charge against most of the asset management world (which probably means I’m being too harsh but also I’m not criticizing any single firm — I don’t even know anything about these large fund companies because they were not part of my career genealogy. To me, they were always just the names of customers). Another reason I should be softer on all this is that, in aggregate, active management is critical. But there’s a paradox of thrift thing where we should (and this is dark) encourage it for others but not subscribe ourselves.

If you are truly obsessed and love investing then you can figure out your own way and maybe I’m just a faint admonishing voice in the background that you mostly ignore (I do hope I help you think better around the edges at least). But for the casual investor who’s targeted by pitches and thinks they are missing out, you are given permission to live FOMO-free. There’s nothing to see except a midwit trap.

[And definitely don’t look at these. Gag me.

Actually, any TSLA options mm wants to gag me for raining on their parade. That should tell you something.]

Using Log Returns And Volatility To Normalize Strike Distances

Basic Review

Consider a $100 stock. In a simple return world, $150 and $50 are each 50% away. They are equidistant. But in compounded return world they are not. $150 is closer. This blog post will progress from an understanding of natural logs to normalizing the distance of asset strikes.

The use of log returns in financial and derivatives modeling is useful because investing contexts usually involve re-investing your capital. In other words, the growth process is multiplicative, not additive. But if it’s multiplicative we find ourselves needing to specify a compounding interval. This is an invitation to attach a cumbersome asterisk to every model.

Logarithms offer an elegant solution — they allow us to standardize an assumption:  returns are continuously compounded.

If you are uncomfortable already, these short primer posts will help you catch up. And don’t worry, we will revisit HS math intuitively in this post before getting to the main course.

  • In Examples Of Comparing Interest Rates With Different Compounding Intervals, we saw how to convert back and forth between simple returns and compounded returns by dividing a holding period into different intervals.
  • In Understanding Log Returns we showed how log returns are an extreme case of compounded returns: they assume that compounding occurs continuously. In other words, as you divide the holding period into smaller and smaller intervals, you find a rate that is smaller than the growth rate for the entire holding period. If the growth from $1 to $2 is fixed then the more compounding periods there are, the lower the rate must be in order for $1 to end up being $2.

Math Class Made Intuitive

You probably remember hearing about the constant e and the natural log from math class. You also repressed it. Because it was taught poorly.

Understanding e

We’ll turn to betterexplained.com:

e is NOT just a number!

Describing e as “a constant approximately 2.71828…” is like calling pi “an irrational number, approximately equal to 3.1415…”. Sure, it’s true, but you completely missed the point. Pi is the ratio between circumference and diameter shared by all circles. It is a fundamental ratio inherent in all circles and therefore impacts any calculation of circumference, area, volume, and surface area for circles, spheres, cylinders, and so on.

e is the base rate of growth shared by all continually growing processes. e lets you take a simple growth rate (where all change happens at the end of the year) and find the impact of compound, continuous growth, where every nanosecond (or faster) you are growing just a little bit. 

e shows up whenever systems grow exponentially and continuously: population, radioactive decay, interest calculations, and more.

Just like every number can be considered a scaled version of 1 (the base unit), every circle can be considered a scaled version of the unit circle (radius 1), and every rate of growth can be considered a scaled version of e (unit growth, perfectly compounded).

So e is not an obscure, seemingly random number. e represents the idea that all continually growing systems are scaled versions of a common rate.

Let’s say our basic unit of time is a year.

e is the constant that says “if I start with $1 and continuously compound at a rate of 100%, how much do I end up with…$2.71828”
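To see that $2.71828 appear, here is a small sketch that compounds a 100% annual rate over more and more periods. The function name and the choice of period counts are mine, purely for illustration.

```python
import math

def grow_one_dollar(periods_per_year: int, rate: float = 1.0) -> float:
    """Value of $1 after one year at `rate`, compounded `periods_per_year` times."""
    return (1 + rate / periods_per_year) ** periods_per_year

for n in (1, 12, 365, 1_000_000):
    print(f"{n:>9} periods: ${grow_one_dollar(n):.5f}")

# The limit as the number of periods grows without bound
print(f"continuous limit e: ${math.e:.5f}")
```

Compounding once gives $2, and slicing the year ever finer creeps toward e without passing it.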

Understanding the natural logarithm (ln)

It’s true that the natural log is the inverse of an exponential of base e just as logs answer the question “what power do I raise 10 to in order to get to X?”. But defining the natural log as an inverse is circular not intuitive. Again, we turn to BetterExplained. From Demystifying the Natural Logarithm (ln):

The natural log gives you the time needed to reach a certain level of growth.

e and the Natural Log are twins:

e^x is the amount we have after starting at 1.0 and growing continuously for x units of time

ln⁡(x) is the time to reach amount x, assuming we grew continuously from 1.0

If e is about growth, the natural log (ln) is about how much time it takes to achieve that growth.

The Natural Log is About Time

    • e^x lets us plug in time and get growth.
    • ln(x) lets us plug in growth and get the time it would take.

For example:

    • e^3 is 20.08. After 3 units of time, we end up with 20.08 times what we started with.
    • ln⁡(20.08) is about 3. If we want growth of 20.08, we’d wait 3 units of time (again, assuming a 100% continuous growth rate).

Let’s apply e and natural logs to asset returns to understand how to normalize distances.

Normalizing Distance

Let’s return to the $100 stock. We said $150 is closer than $50 in the world of compounding. Let’s assume our growth occurs over 3 years. Here’s a summary of simple returns vs annually compounded returns (or CAGR):

So far so good. The compounded returns are lower than the simple average return. Since log returns are just compounded returns sampled continuously we’d expect them to be even lower.

The total log return is indeed lower than the total simple return.

We can also see that in logspace -50% total return is “further” away than up 50%. This is the first encounter we get with the concept of distance where we see that 50% in either direction is not the same. But by the end of this post, you will learn how to normalize even 2 log returns that look the same, but don’t mean the same thing.

But before that, we need to complete our understanding of log returns. We saw that the 3-year total log returns are lower than the 3-year total simple returns. To go a step further, I pose the question:

Can you compute the annualized log returns?

Pattern-matching the computations for average simple returns and CAGR, it appears we have 2 choices respectively:

  1. Total log return / 3, or
  2. (1 + Total log return)^(1/3) – 1

Remember what e and ln mean in the first place:

The expression e^x is a total quantity of growth. It’s actually assumed to be e^(1 * x) where the 1 represents 100% continuously compounded growth and x represents a unit of time. The natural log or ln(e^x) then solves for how much time (ie x) it took to arrive at the total quantity of growth assuming 100% continuous compounding.

A key insight is that we don’t need to assume a 100% rate and x to be time. We can simply think of x as the product of “rate multiplied by time”. This allows us to substitute any rate for the assumed rate of 100% to find the time. Once again we turn to BetterExplained:

We can use their logic to return to our question: Can you compute the annualized log returns from these total 3-year  log returns?

Down Case:

log return = -69%

rate x time = -69%

rate x 3 = -69%

The annualized rate must be -23.1%

To annualize log returns, we simply take the total log return and divide by the number of years!
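Here is a quick sketch that lines up the three ways of annualizing for the $100 stock ending at $150 or $50 after 3 years. It only restates the arithmetic above. The layout and variable names are my own.

```python
import math

start, years = 100.0, 3

print(f"{'price':>6} {'avg simple':>11} {'CAGR':>8} {'avg log':>8}")
rows = {}
for end in (150.0, 50.0):
    growth = end / start
    avg_simple = (growth - 1) / years      # total simple return / years
    cagr = growth ** (1 / years) - 1       # annually compounded rate
    avg_log = math.log(growth) / years     # total log return / years
    rows[end] = (avg_simple, cagr, avg_log)
    print(f"{end:>6.0f} {avg_simple:>11.2%} {cagr:>8.2%} {avg_log:>8.2%}")
```

In both rows the rate falls as you move from simple to annually compounded to continuously compounded.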

The complete summary table:

All is right in the world…the more compounding intervals we divide the total period into the lower the return must be. Continuous compounding represents the most intervals we can slice the period into and therefore it is the smallest rate.

Recapping so far:

  • Compounded rates are lower than simple rates for the same total return
  • Log returns are convenient measuring sticks because we just assume continuous compounding
  • e tells us how much continuously compounded growth we get if we know the time period and rate
  • The natural log can tell us:
    • How much time we needed at a given rate to achieve that e growth
    • What rate we needed for a given time period to achieve that e growth

Normalizing Distances For Volatility

Let’s return to the $100 stock and assume continuous compounding. What price on the downside is the equivalent of the stock moving up $20? By now, we understand, the equivalent downside move is less than $20 away. Let’s compute the equivalent distances in log space.

ln(120/100) = 18.23%

We solve for a negative 18.23% log return:

ln(x/100) = -18.23%

x/100 = e^(-18.23%)

x = .8333 * 100 = $83.33

If the stock starts at $100 then $120 and $83.33 are equidistant in log space.
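The same steps as a tiny helper, in case you want to mirror other prices. `log_mirror` is a name I made up for this sketch.

```python
import math

def log_mirror(spot: float, price: float) -> float:
    """Price the same log distance from `spot` as `price`, in the opposite direction."""
    log_return = math.log(price / spot)
    return spot * math.exp(-log_return)

down = log_mirror(100, 120)
print(f"${down:.2f}")
```

Mirroring the result again takes you back to $120, which is what equidistant should mean.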

We want to take this further. To compare distances, especially in different assets, we want to normalize for volatility.

Volatility is just another word for standard deviation. A 10% log return in BTC means a lot less than a 10% log return in 5-year Treasury notes. We should measure log returns in terms of how many standard deviations away a specified amount of growth is. Note, this is exactly what the concept of a z-score is in statistics. It tells us how far away from the mean a particular observation is.

Let’s stick with our $100 stock and give it a volatility of 18.23%.

  • A 1 standard deviation move to the upside in 1 year is $120
  • A 1 standard deviation move to the downside in 1 year is $83.33

If we define K as a strike price, we can back into a general formula for how far K is from the spot price in terms of standard deviations. Let’s define all our variables first:

K = strike price

S = Spot price

σ = volatility

t = time (in years)

We start with an intuitive expression for a Z-score using our variables:

We can confirm this makes sense with numbers from the previous example. We’ll set t to 1 (ie 1 year) and the Z-score is 1 corresponding to 1 standard deviation:

The formula makes sense. In English, it says “divide the distance in logspace by the annualized volatility scaled to 1 year”.

This simply validated the expression for Z-score. We still want to define any strike price, K, as a function of its volatility and time.

Algebra ensues:

  • If you input a positive volatility number, the formula spits out what a 1 standard deviation up move is.
  • If you input a negative volatility number, the formula spits out what a 1 standard deviation down move is.
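The formulas themselves did not survive in this copy, so the sketch below is my reconstruction from the English description: Z is the log distance ln(K/S) divided by σ√t, and solving for the strike gives K = S·e^(Zσ√t). Function names are invented for the example.

```python
import math

def z_score(strike: float, spot: float, vol: float, t: float) -> float:
    """Log distance from spot to strike, measured in standard deviations."""
    return math.log(strike / spot) / (vol * math.sqrt(t))

def strike_at(spot: float, vol: float, t: float, z: float = 1.0) -> float:
    """Invert z_score: the strike sitting z standard deviations from spot."""
    return spot * math.exp(z * vol * math.sqrt(t))

vol = math.log(1.2)   # 18.23%, the volatility used in the example
print(round(z_score(120, 100, vol, 1), 4))
print(round(strike_at(100, vol, 1, +1), 2))
print(round(strike_at(100, vol, 1, -1), 2))
```

Passing `z=-1` is the same as feeding in a negative volatility, which is how the bullets above describe getting the down move.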

If you recall, the big insight from earlier:

The expression e^x is a total quantity of growth…we don’t need to assume a 100% rate and x to be time. We can simply think of x as the product of “rate multiplied by time”.

This fact can allow us to decompose the Z-score expression to account for the fact that our underlying stock process has both:

  1. a drift component (option theory uses the risk-free rate for reasons that are beyond this post)
  2. a random component drawn from a distribution defined by a mean (spot + drift) and volatility.

Defining the expressions:

  • Risk-free rate or drift = r
  • The mean of the distribution (aka the “forward”) = Se^(rt)
  • The standard deviation scaled to time = σ√t

The Z-score formulas that incorporate drift for 1 standard deviation up and down respectively:

  • K_up = Se^(rt + σ√t)
  • K_down = Se^(rt – σ√t)

[The rate in the e^x portion is part drift and part random. Why do we combine them with addition instead of multiplication? Because the time portion affects each component differently. We can’t double the variance and halve the time because time also factors into the drift (ie the interest rate)]

Let’s wrap with an example, this time including the drift.

Set r = 5% and t = 1

Fwd = 100e^.05 = $105.13
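Plugging the same inputs into the up and down formulas is a one-liner each. I am assuming the 18.23% volatility carries over from the earlier example. The resulting strikes are simply what the formulas produce and are not figures quoted in the post.

```python
import math

# r and t from the example; vol assumed to be the 18.23% used earlier
spot, r, vol, t = 100.0, 0.05, 0.1823, 1.0

forward = spot * math.exp(r * t)
k_up = spot * math.exp(r * t + vol * math.sqrt(t))
k_down = spot * math.exp(r * t - vol * math.sqrt(t))

print(f"forward:       ${forward:.2f}")
print(f"1 st dev up:   ${k_up:.2f}")
print(f"1 st dev down: ${k_down:.2f}")
```

Note that the up and down strikes are equidistant from the forward in log space, not from the $100 spot.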

If we are just considering the one standard deviation around the mean (as opposed to a full standard deviation up or down) this is the theoretical stock distribution:

What’s the point of all this?

For anyone within sneezing distance of a derivatives desk, these are rudiments. These computations are the meaning behind the Black Scholes’s z-scores (d1 and d2) and probabilities. These standardizations are critical for comparing vol surfaces. If you can’t contextualize how far a price is you cannot make meaningful comparisons between option volatilities and therefore prices.

If you only trade linear instruments because you are a well-adjusted human then hopefully you still found this lesson helpful. Seeing math from different angles is like filling in the grout in the tiles of your mental processing. You can measure the distance (or accumulated growth, positive or negative) in log space to account for compounding. You can standardize comparisons by using the asset’s vol as a measuring stick. And after all that, if you still don’t enjoy this, you can feel better about your life choices to do work that doesn’t rely on it.

If you do rely on understanding this stuff, hopefully you got e^.00995 – 1 better today.

Understanding Log Returns

If you draw a simple return at random from a normal (ie bell curve) distribution and compound it over time, the resultant wealth distribution will be lognormally distributed with the center of mass corresponding to the CAGR return.

Imagine your total 1-year return is 10%. So your terminal wealth is 1.10.

If you compounded monthly to end up at a terminal wealth of 1.10 we can compute the monthly compounding rate as:

1.10 ^ (1/12) – 1 = .797% per month or annualized (ie x12) =  9.57% 

Let’s instead compound daily to end up with a terminal wealth of 1.10.

1.10 ^ (1/365) – 1 = .026% or annualized (x365) = 9.53%

The more frequently we compound while keeping the total return the same the lower the compounded rate or average rate that prevails to get us from initial to terminal wealth.

Log returns are returns compounded continuously (as if you were going to compound even more frequently than every single second but at a tiny rate). When we annualize that rate as we did in the prior examples we end up with a log return.

Or simply:

Ln(1.10) = 9.53%

Similar after rounding to just compounding daily.
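A short sketch of the same progression, holding terminal wealth at 1.10 and shrinking the compounding interval. The "every second" row is my addition to show how close you get to the log return.

```python
import math

terminal_wealth = 1.10   # $1 grows to $1.10 over the year

def annualized_rate(periods: int) -> float:
    """Per-period compounding rate multiplied by the number of periods in the year."""
    # expm1(log(w) / n) equals w ** (1 / n) - 1, but stays accurate for huge n
    per_period = math.expm1(math.log(terminal_wealth) / periods)
    return per_period * periods

for label, n in (("annual", 1), ("monthly", 12), ("daily", 365),
                 ("every second", 31_536_000)):
    print(f"{label:>12}: {annualized_rate(n):.4%}")

print(f"{'continuous':>12}: {math.log(terminal_wealth):.4%}")
```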

Let’s say your $1 grows to $1.50 after 1 year, then

  • your simple return is 50%
  • your log return is ln(1.5) = 40.5%

This chart reveals 2 facts:

  1. Log returns are always smaller than simple returns just as compounded returns are lower than simple returns. This makes sense because log returns are just compounding where the interval between compounding is reduced to zero so it takes a lower rate applied more frequently to get to the same total return.
  2. Higher volatility (ie the larger changes) means a wider gap between the simple and log return. Again, reminiscent of the formula relating geometric and arithmetic returns.

The chart raises a question. We know that volatility increases the gap between simple and compounded returns but why is this exacerbated on the downside? There was nothing in the formula (CAGR = Arithmetic Mean – .5 * σ²) that points to any such asymmetry.

The answer lies in an illusion.

In the chart, 1.5 and .5 appear to be equidistant away. They are both 50% away, right?

That’s true…but only in simple terms!

In compounded terms, .50 is “further away” than 1.5.

A thought exercise will make this clear:

If I start at 100 and can only move in increments of 10%, I can get to 150 in 5 moves.

100 * 1.10 * 1.10 * 1.10 * 1.10 * 1.10 = 161

But on the downside, compounding by a fixed amount means more moves to cover the same absolute distance.

100 * .9⁵ = 59

In fact, I need 2 more moves to “cross” 50. With 7 moves I finally get to 47.8
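If you want to try other step sizes, the thought exercise is easy to automate. This is a minimal sketch and `moves_to_cross` is a name I made up.

```python
def moves_to_cross(target: float, step: float, start: float = 100.0) -> int:
    """Count fixed-percentage moves of size `step` needed to reach or pass `target`."""
    going_up = target > start
    factor = 1 + step if going_up else 1 - step
    price, moves = start, 0
    while (price < target) if going_up else (price > target):
        price *= factor
        moves += 1
    return moves

print(moves_to_cross(150, 0.10))   # moves needed to get up to 150
print(moves_to_cross(50, 0.10))    # moves needed to get down to 50
```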

The chart masks the fact that in logspace .5 is much further than 1.5 and therefore to have moved 50% from the start the volatility (ie the move size) must have been higher. And that’s exactly what the log returns show:

Price   Simple Return   Log Return
50      -50%            -69%
150     50%             41%

$50 is further away in logspace corresponding to a higher compounded volatility. If the volatility is higher, the gap between the simple and log-returns is wider.

Application to options

The analogy to options is the x-axis in this chart is strike prices because they are absolute distances apart. They are not equidistant apart in logspace!

We make the x-axis equidistant in logspace by making the log returns 10% apart.
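One way to build such a grid, as a sketch: space the strikes by a constant log return so each is the same multiple of its neighbor. The $100 center and the range of 5 strikes on either side are my choices for illustration, not the values behind the chart.

```python
import math

spot = 100.0
log_step = 0.10   # strikes 10% apart in log-return terms

# Equal steps in log space mean a constant ratio between neighboring strikes
strikes = [spot * math.exp(log_step * k) for k in range(-5, 6)]

for strike in strikes:
    simple = strike / spot - 1
    log_ret = math.log(strike / spot)
    print(f"strike {strike:7.2f}  simple {simple:+8.2%}  log {log_ret:+8.2%}")
```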

Now we can chart the log returns on the x-axis. The distance of each total return from the diagonal shows the divergence between the log returns and simple return. It widens as you expect as we get to larger move sizes, but the chart is more symmetrical because the distance between the “strikes” is now normalized to compounded returns. 

Geometric vs Arithmetic Mean In The Wild


In ‘Well What Did You Expect’? we learned:

  • Mathematical “expectation” is a simple average or arithmetic mean of various outcomes weighted by their probability
  • Arithmetic means are familiar. Your average score in a class is the sum of your test scores divided by the number of tests. If you score 85, 90, 98  your average for the class is:  (85+90+98)/3 = 91

    Note the scores are weighted equally. Here’s what the number sentence looks like without factoring out the 1/3:

    .33 * 85+ .33 * 90 + .33 * 98 = 91

    If the final test is worth 50% of the total grade the weighted average is computed: .25 * 85 + .25 * 90 + .50 * 98  = 92.75

    Whether we are weighting the results equally or not, we are still computing the average by summing, then dividing.

  • Geometric means are like arithmetic means except quantities are multiplied instead of summed. Since investing is the process of earning a return and reinvesting the total proceeds we are multiplying, not summing results. If you invest $100 at 10% for 5 years your final wealth is given by:

    $100 * (1.10) * (1.10) * (1.10) * (1.10) * (1.10)  or simply $100 * (1.10)⁵ = $161.05

    In life, we often know the ending amount and the initial investment but want to know “what was my average growth rate per year?”

    The answer to that question is not the simple arithmetic average but the geometric average because we were re-investing or multiplying our capital each year by some rate. That rate is known as the CAGR or “compound annual growth rate”

    If we start with $100 and have $161.05 after 5 years we compute the geometric average in an analogous way to arithmetic averages, but instead of dividing by the number of years, we take Nth root of our total growth where N is the number of years we compounded for.

    CAGR for 5 years = ($161.05/$100) ^ (1/5) -1 = 10% 

    [we subtract that 1 at the end to remove our starting capital and just have the rate]

  • CAGR vs Simple Average Returns

With investing we are almost always re-investing our capital. That means our capital is being multiplied by a rate from one period to the next. When we want to know the average rate, we really want to pick the geometric average not the arithmetic one (there are other types of averages too like the harmonic average!). We want to compute the CAGR.

As a last proof that the CAGR and simple arithmetic average are different we can revisit the example above. If we compound an initial capital of $100 at 10% per year for 5 years we end up with $161.05 for a total return of 61.05%.

If we compute the simple average:

61.05% / 5 = 12.2%

This is higher than the CAGR of 10%
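The two averages side by side, as a quick sketch using the numbers from the example:

```python
start, end, years = 100.0, 161.05, 5

total_return = end / start - 1
simple_average = total_return / years        # total return spread evenly over the years
cagr = (end / start) ** (1 / years) - 1      # rate that compounds start into end

print(f"total return:   {total_return:.2%}")
print(f"simple average: {simple_average:.2%}")
print(f"CAGR:           {cagr:.2%}")
```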

This is a consistent result. The geometric mean is always lower than the arithmetic mean!

How much lower?

It depends on how volatile the investment is. The reason is intuitive.

Imagine making 50% and losing 50%. The order doesn’t matter. You have net lost 25% of your initial capital.

The formula that relates the arithmetic mean and CAGR:

CAGR = Arithmetic Mean – .5 * σ²


σ = annualized volatility
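To get a feel for the formula, here is a sketch that runs the make-50%-lose-50% case through it. I am using the population standard deviation of the two returns as σ, which is my assumption. The formula is an approximation, so expect it to land near the true compounded rate rather than exactly on it.

```python
import statistics

returns = [0.50, -0.50]   # make 50%, then lose 50%

wealth = 1.0
for r in returns:
    wealth *= 1 + r

arithmetic_mean = statistics.mean(returns)
vol = statistics.pstdev(returns)                  # assumption: population st dev
actual_cagr = wealth ** (1 / len(returns)) - 1    # true per-period compounded rate
approx_cagr = arithmetic_mean - 0.5 * vol ** 2    # the formula above

print(f"ending wealth:   {wealth:.2f}")
print(f"arithmetic mean: {arithmetic_mean:.1%}")
print(f"actual CAGR:     {actual_cagr:.1%}")
print(f"formula CAGR:    {approx_cagr:.1%}")
```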


This Is Not Just Theoretical

I grabbed SP500 total returns by year going from 1926-2023. Here’s what you find:

Simple arithmetic mean of the list: 12.01%

Standard deviation of returns: 19.8%

These are actual sample stats.

What did an investor experience?

If you start with $100 and let it compound over those 97 years, you end up with $1,151,937. 

What’s the CAGR?

CAGR = ($1,151,937 / $100)^(1/97) – 1 

CAGR = 10.12%

These are the actual historical results. An average annual return of 12.01% translated to an investor’s lived experience of compounding their wealth at 10.12% per year. 

Comparing the sample to theory

If you knew in advance that the stock market would increase 12.01% per year and you used the CAGR formula with our sample arithmetic mean return and standard deviation, what compound annual growth rate would you predict?

CAGR = Arithmetic Mean – .5 * σ²

CAGR = 12.01% – .5 * 19.8%²

CAGR = 10.06%

An average arithmetic return of 12.01% at 19.8% vol predicted a CAGR of 10.06% vs an actual result of 10.12%

Not too shabby. 
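You can reproduce the comparison with the rounded stats quoted above. With these rounded inputs the formula comes out around 10.05%, a hair under the 10.06% in the text. My guess is that the difference is rounding in the quoted mean and standard deviation.

```python
years = 97
start_wealth, end_wealth = 100.0, 1_151_937.0
arithmetic_mean, vol = 0.1201, 0.198   # rounded sample stats from the text

realized_cagr = (end_wealth / start_wealth) ** (1 / years) - 1
predicted_cagr = arithmetic_mean - 0.5 * vol ** 2

print(f"realized CAGR:  {realized_cagr:.2%}")
print(f"predicted CAGR: {predicted_cagr:.2%}")
print(f"gap:            {(realized_cagr - predicted_cagr) * 1e4:.0f} bps")
```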

I used the same parameters to run a simulation where every year you draw a return from a normal distribution with mean 12% and standard deviation of 19.8% and compounded for 97 years.  

I ran it 10,000 times. (Github code — it works but you’ll go blind)

Theoretical expectations

CAGR = median return = mean return – .5 * σ²

CAGR = .12 – .5 * .198² = 10.04% 

Median terminal wealth = 100 * (1+ CAGR)^ (N years)

Median terminal wealth = $100 * (1+ .1004)^ (97) = $1,072,333

Arithmetic mean wealth = 100 * (1+ mean return)^ (N years)

Arithmetic mean wealth = $100 * (1+ .12)^ (97) = $5,944,950

The sample results from 10,000 sims

The median sample CAGR: 10.19%

The median sample terminal wealth = $1,225,590

The mean terminal wealth: $5,952,373
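If you would rather not go blind, here is a standard-library sketch of the same experiment. It is not the code linked above. The seed, the function name, and the floor on returns are all my assumptions, and your numbers will differ a little from the ones quoted because the draws are random.

```python
import random
import statistics

def simulate(n_sims=10_000, n_years=97, mean=0.12, vol=0.198, start=100.0, seed=42):
    """Compound normally distributed annual returns; return terminal wealth per path."""
    rng = random.Random(seed)
    terminal = []
    for _ in range(n_sims):
        wealth = start
        for _ in range(n_years):
            # Assumption: floor returns at -99% so a rare extreme draw
            # cannot push wealth below zero
            wealth *= 1 + max(rng.gauss(mean, vol), -0.99)
        terminal.append(wealth)
    return terminal

n_years, start = 97, 100.0
terminal = simulate(n_years=n_years, start=start)

median_wealth = statistics.median(terminal)
mean_wealth = statistics.mean(terminal)
median_cagr = (median_wealth / start) ** (1 / n_years) - 1

print(f"median CAGR:            {median_cagr:.2%}")
print(f"median terminal wealth: ${median_wealth:,.0f}")
print(f"mean terminal wealth:   ${mean_wealth:,.0f}")
```

The mean is dragged far above the median by a handful of paths that compound at very high rates.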

Summary Table 

The most salient observation:

The median terminal wealth, the result of compounding, is much less than what simple returns suggest. When you are presented with an opportunity to invest in something with an IRR or expected return of X, your actual return if you keep re-investing will be lower than if you take the simple average of the annual returns.

If the investment is highly volatile…it will be much lower. 

The distribution of terminal wealth

The nice thing about simulating this process 10,000x is we can see the wealth distribution not just the mean and median outcomes.

Remember the assumptions:

  • Drawing a random sample from a normal distribution with a mean of 12% and standard deviation of 19.8%
  • Assume we fully re-invest our returns for 97 years

And our results:

  • The median sample CAGR: 10.19%

  • The median sample terminal wealth = $1,225,590

  • The mean terminal wealth: $5,952,373

This was the percentile distribution of terminal wealth:

The mean wealth outcome is 5x the median wealth outcome due to a 2% gap between the arithmetic and geometric returns. The geometric return compounded corresponds exactly to the median terminal wealth which is why we use CAGR, a measure that includes the punishing effect of volatility. 

In terms of mathematical expectation, if you lived 10,000 lives, on average your terminal wealth would be nearly $6mm but in the one life you live, the odds of that happening are less than 20%.

The chart was calculated from this table:

Percentile Wealth 97-year CAGR
0.95 $22,323,532 13.5%
0.9 $12,048,311 12.8%
0.85 $7,955,791 12.3%
0.8 $5,601,855 11.9%
0.75 $4,098,451 11.6%
0.7 $3,210,573 11.3%
0.65 $2,480,813 11.0%
0.6 $1,981,453 10.7%
0.55 $1,604,153 10.5%
0.5 $1,275,987 10.2%
0.45 $1,009,583 10.0%
0.4 $804,035 9.7%
0.35 $627,807 9.4%
0.3 $476,756 9.1%
0.25 $357,112 8.8%
0.2 $257,498 8.4%
0.15 $186,552 8.1%
0.1 $115,257 7.5%
0.05 $58,646 6.8%

Note also that 20% of the time, your $100 compounded for 97 years turns into $257,498 or less, a CAGR of 8.4% at best. That is a result roughly 1/5 of the median and 1/20 of the mean. Ouch. 

So when someone says the stock market returns 10% per year because they looked at the average return in the past, realize that after adjusting for volatility and the fact that you will be re-investing your proceeds (a multiplicative process), you should expect something closer to 8% per year. 

And one last thing…you should be able to see how rates of return, when compounded for long periods of time, lead to dramatic differences in wealth. Taxes and fees are percentages of returns or invested assets. Make sure you are spending them on things you can’t get for free (like beta).

A Question I Wonder About

If you draw a simple return at random from a normal (ie bell curve) distribution and compound it over time, the resultant wealth distribution will be lognormally distributed with the center of mass corresponding to the CAGR return.

We saw that theory, simulation and reality all agreed. 

Or did they?

The simulation and theory were mechanically tied. I drew a random return from N [μ=12%, σ = 19.8%] and compounded it. But reality also agreed.

It may have been a coincidence. Let me explain. 

Stock market returns are not normally distributed. They are well-understood to differ from normal because they have a heavy fat-left tail and negative skew.

  1. The fat-left tail describes the tendency for returns to exhibit extreme (ie multi-standard deviation) moves more frequently than the volatility would suggest.
  2. Negative skew means that large moves are biased toward the downside.

These scary qualities are counterbalanced by the fact that the stock market goes up more often than it goes down. In the 97-year history I used to compute the stats, positive years outnumbered negative years 71-26 or nearly 3-1. 

The average return, whichever average you care to look at, is the result of this tug-of-war between scary qualities and a bias toward heads. With the distribution not being a normal bell curve it feels suspicious that the relationship between CAGR and arithmetic mean returns conformed so closely to theory.

I have some intuitions about negative skew (that’s a long overdue post sitting in my drafts that I need to get to) that tell me that in the presence of lots of negative skew, volatility understates risk in a way that would artificially and optically narrow the gap between CAGR and mean return. By extension, I would expect that the measured CAGR of the last 97 years would have been lower relative to the theory’s prediction. 

But we did not see that.

I have 2 ideas why the CAGR was held up as expected, despite non-normal features that should penalize CAGR relative to mean return. 

  1. Path

    In Path: How Compounding Alters Return Distributions, we saw that trending markets actually reduce the volatility tax that causes CAGRs to lag arithmetic returns. It’s the “choppy” market that goes up and down by the same percent that leaves you worse off for letting your capital compound instead of rebalancing back to your original position size. The volatility tax or “variance drain” occurs when the chop happens more than trends (holding volatility constant of course). But since the stock market has gone up nearly 3x as often as it went down perhaps this trend compounding “bonus” offset the punitive negative skew effect on CAGR. 

  2. What negative skew?
                 Qty    Avg Return    St Dev of Returns
    Up years      71       21.3%           12.7%
    Dn years      26      -13.4%           11.4%

    Using annual point-to-point returns, I’m not seeing negative skew. 

I’ve exhausted my bandwidth for this topic so I’ll leave it to the hive. Hit me up with your guesses. 



Well What Did You “Expect”?

Here’s a simple coin flip game. It costs $1 to play.

  • Heads: you get paid an additional $1 (ie 100% return)
  • Tails: you lose $.90

The expectancy of the game is $.05 or 5%.

We compute expectancy:

.5 * $1.00 + .5 * (-$.90)

It’s exactly the same calculation as a weighted average or arithmetic mean. This is a useful computation for many simple one-off decisions. Like should I buy an airline ticket for $1000 or the refundable fare for $1,100?

If there’s a 10% chance I need a refund then the extra $100 saves me $1,100.

10% * $1,100 = $110 which is greater than the $100 surcharge. About 9% is my breakeven probability.

It’s tempting to use this logic in investing. Let’s say you expect the stock market to return 7% per year on average for 40 years. Start with $100 and plug in numbers:

$100 * 1.07⁴⁰ = $1497

Yay, you expect to have about 15x your starting capital after 40 years!

Eh. Sort of.

See the word “expect” in math terms and in colloquial terms is a bit different.

If I bet $1 on that coin game I theoretically expect to have $1.05 after 1 trial. In reality, I’m either going to end up with $2 when I double up or $.10 when I lose.

Another example:

I roll a die. If it comes up “1”, I win $600. Otherwise, nothing happens. Theoretically, I expect to win $100:

1/6 * $600 + 5/6 * $0 = $100

But if I asked you what you “expect” to happen if you play this game…you “expect” to win nothing. You only win 1/6 of the time after all.

Back to the investing example.

Investing is not a one-off game. It’s a compounding game where you plow your total capital back into the sausage machine to get that 7%.

That’s why we use 1.07⁴⁰.

You are counting on your $100 growing by 1.07 * 1.07 * 1.07…

So that 15x number…that’s mathematical expectancy the same way the dice game is worth $100 or the coin game is worth $1.05 even though those outcomes are never actually experienced.

What you expect to happen in the colloquial sense of the term is the geometric mean. The arithmetic average is a measure of centrality when you sum the results and divide by the number of results. (In our examples you are summing results weighted by their probabilities, but you are still summing). The geometric mean corresponds to the median result of a compounding process. Compounding means “multiplying not summing”. The median is the measure that maps to our colloquial use of “expected” because it’s the 50/50 point of the distribution. That’s the number you plan life around.

The theoretical arithmetic mean result of playing the lotto might be losing 50% of your $2 Powerball ticket (which is another way of saying you are paying 2x what the ticket is mathematically worth). The median result is you lit your cash on fire. You plan your life around the median, especially when it’s far away from the mean. We’ll come back to that.

With investing we are multiplying our results from one year to the next together. The geometric mean is what you actually “expect” in the colloquial sense of the term. The geometric mean is more familiarly known as the CAGR or ‘compound annual growth rate’.

What is the relationship between the arithmetic mean to the geometric mean? This is the same exact question as “what is the relationship of mathematical expectancy and the CAGR?”

It’s an important question since that theoretical arithmetic mean is only expected if we live thousands of lives (actually there are ways to experience the arithmetic mean without relying on reincarnation. This is pleasant news because what good is being rich if you come back a pony.) We want to focus on the CAGR, which is much closer to what we might experience.

It turns out that number is lower.

How much lower? It depends on how volatile the investment is. The formula that relates the arithmetic mean and CAGR:

CAGR = Arithmetic Mean – .5 * σ²


σ = annualized volatility

If an investment earned 7% per year with a standard deviation(ie volatility) of 20% you can estimate the CAGR as follows:

CAGR = .07 – .5 * .20² = .05

In arithmetic expectancy, over 40 years you expect to earn 1.07⁴⁰ = 15. You expect to have 15x’d your money.

But the median outcome, which corresponds to the geometric mean is 1.05⁴⁰ = 7.

7x is much closer to what you “expect” in the colloquial sense of the term. Less than 1/2 the arithmetic expectation!
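The arithmetic above is easy to reproduce. This is a minimal sketch; the function names and the extra volatility levels in the loop are my own choices for illustration.

```python
def approx_cagr(arith_mean, vol):
    """Approximate CAGR: arithmetic mean minus half the variance."""
    return arith_mean - 0.5 * vol ** 2

def terminal_multiple(rate, years):
    """Wealth multiple from compounding a constant rate."""
    return (1 + rate) ** years

cagr = approx_cagr(0.07, 0.20)
mean_multiple = terminal_multiple(0.07, 40)    # arithmetic expectancy
median_multiple = terminal_multiple(cagr, 40)  # median outcome

print(f"CAGR: {cagr:.2%}")
print(f"Arithmetic: {mean_multiple:.1f}x  Median: {median_multiple:.1f}x")

# Same 7% arithmetic mean at a few illustrative volatility levels.
for vol in (0.10, 0.20, 0.30, 0.40):
    rate = approx_cagr(0.07, vol)
    print(f"vol {vol:.0%}: CAGR {rate:.2%}, 40-year multiple {terminal_multiple(rate, 40):.1f}x")
```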

The formula tells us that the arithmetic and geometric mean (“CAGR”) will diverge by the volatility. And that volatility term is squared…which means the divergence is extremely sensitive to the volatility.

This is a table of CAGRs where you can see the destructive power of volatility:

Why is volatility so impactful on a compounded return?

An easy way to see the impact of high volatility is to imagine making 50% and losing 50%. The order doesn’t matter. You have net lost 25% of your initial capital.

We can compute the compounded result by weighting each possibility by its frequency in the exponent (the exponents sum to 2 because the sample space has 2 outcomes, up and down). The per-period geometric mean is the square root of this product, √.75 ≈ .866:

.5¹ x 1.5¹ = .75

Go back to the first game in the post. You invest $1 in a coin game. Heads you make 100%, tails you lose 90%. This game had a positive arithmetic expectancy of 5%.

What is the arithmetic expectancy if you compound (ie re-invest) by playing twice? The possible outcomes are:

HT: 2 x .1 = .2

HH: 2 x 2 = 4

TH: .1 x 2 = .2

TT: .1 x .1 = .01

Since each scenario is equally likely (25% each) the arithmetic expectancy is simply the average = 1.1025

This jibes with 1.05² = 1.1025

The average arithmetic return compounds as expected.

But our lived (median) experience is much worse. The median result is .20, a loss of 80%!

We could have seen that by compounding the geometric mean (√(2 x .1) ≈ .447 per flip) over the 2 flips:

2¹ x .1¹ = .20
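If you want to check the 2-flip numbers yourself, here is a small sketch that enumerates every path. The helper name is hypothetical; the multipliers come straight from the game (2x on heads, .1x on tails).

```python
from itertools import product
from statistics import mean, median

HEADS, TAILS = 2.0, 0.1  # wealth multipliers: +100% and -90%

def compounded_outcomes(multipliers, n_flips):
    """Terminal wealth per $1 for every equally likely path."""
    outcomes = []
    for path in product(multipliers, repeat=n_flips):
        wealth = 1.0
        for multiplier in path:
            wealth *= multiplier
        outcomes.append(wealth)
    return outcomes

outcomes = compounded_outcomes((HEADS, TAILS), 2)
arith = mean(outcomes)                  # arithmetic expectancy of terminal wealth
med = median(outcomes)                  # the "lived" result
geo_per_flip = (HEADS * TAILS) ** 0.5   # per-flip geometric mean

print(sorted(outcomes))
print(f"mean {arith:.4f}  median {med:.2f}  geometric mean per flip {geo_per_flip:.3f}")
```

Bump `n_flips` up and the mean keeps compounding at 5% per flip while the median collapses toward zero.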

Driving the point home with an extreme example

Consider a super favorable bet.

You roll a die:

  • Any number except a ‘6’: 10x your bet
  • Roll a ‘6’: Lose your entire bet

The arithmetic expectancy is ridiculous.

5/6 x 10 + 1/6 x 0 = 8.33x your bet, or a ~700% return

But if you keep reinvesting your proceeds in this bet, you will go bust as soon as the 6 comes up. The median experience is a total loss, even though the arithmetic expectancy compounded is wildly positive. If you played this game 20 times in a row you’d [arithmetically] expect to make ~ 700%²⁰.

But you have a 97.4% chance of going broke because you need “not a 6” to come up 20 times in a row = 1 – (5/6)²⁰

That arithmetic expectancy of ~ 700%²⁰ is being driven by the single scenario where the 6 never comes up (that occurs 2.6% of the time). In that case, your p/l is $10²⁰ or between a quintillion and sextillion dollars.

But the geometric mean is 0 because multiplying over the 6 sample spaces:

10⁵ x 0¹ = 0
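A quick sketch of the die game using exact fractions. I am assuming "10x your bet" means your stake turns into 10x (and a 6 turns it into 0), which is how the $10²⁰ figure above treats it. The variable names are mine.

```python
from fractions import Fraction

p_win = Fraction(5, 6)   # any roll except a 6
win_multiple = 10        # assumption: stake becomes 10x on a win, 0 on a 6
rolls = 20

p_survive = p_win ** rolls
p_bust = 1 - p_survive

# Expected wealth multiple per roll (the losing roll contributes 0), compounded.
expected_multiple = (p_win * win_multiple) ** rolls
wealth_if_survive = win_multiple ** rolls

print(f"chance of going broke: {float(p_bust):.1%}")
print(f"expected multiple: {float(expected_multiple):.3e}")
print(f"multiple if no 6 ever shows: {wealth_if_survive:.3e}")
```

The entire expectancy equals the survival probability times the one surviving path, which is the point: a single rare scenario carries all of the "expected" value.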

I chose such extreme examples because nothing illustrates volatility like all-or-nothing bets. The intuition you need to keep is that high volatility means you should expect to lose your money even if the arithmetic expectancy is high.

As soon as you start re-investing (ie compounding) your results are going to be governed by that geometric mean which hates volatility.

For the people who tout lotto ticket investments like crypto or transformative technologies with talks of “asymmetrical upside” or “super positive expectancy” remember even if they might be right, the most likely scenario is they lose all their money on that investment. Even literal lotto tickets can tip into positive expectancy. When that happens how much do you put into it?

Exactly. Not much. Because you know what to expect.

The role of rebalancing and diversification

Investing is not a one-off game. You always re-invest. By re-balancing, you “create” more lives by not concentrating your wealth in a single bucket which swamps the rest of your portfolio as it grows. If you never rebalanced BTC on the way up it would have eventually become nearly 100% of your portfolio and then 2022 happened.

If you don’t ever rebalance you are effectively praying that “not a 6” comes up for the 40 years you are compounding wealth. It’s not as extreme as that because market volatility isn’t as extreme as dice or coins. But the principle holds.

You only get one life so you care about the median. Diversification plus rebalancing gives you the god-perspective of getting to invest a fraction of your wealth into many lives.

Keep in mind — rebalancing is not changing your overall expectancy; it’s changing the distribution of returns by pushing the median return (geometric mean or CAGR) up to your theoretical arithmetic return. This trade-off is not free. If you rebalance you don’t get the 1000x payoff that occurs when a single concentrated position hits 50 heads in a row.

Money Angle For Masochists

Imagine a $100 stock that can either go up or down 25% every year.

It’s 50/50 to be up or down.

Let’s look at the distribution of the stock after 4 years (with the probabilities of each price below it)

Look at the extremes after 4 years:

  • $31.64

    A -25% CAGR over 4 years = cumulative loss of 68%

  • $244.14

    A +25% CAGR over 4 years = cumulative gain of 144%

If you sumproduct every terminal probability by terminal price you get $100. And yet, while the stock is fairly valued at $100, after 4 years, you have lost money in 11/16th of scenarios (~69%). The right tail is driving the fair value of $100 while most paths take the stock lower.
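The terminal distribution can be rebuilt in a few lines. This is a sketch of the coin-flip stock described above; the function name and signature are my own.

```python
from math import comb

def terminal_distribution(spot, up, down, years, p_up=0.5):
    """(price, probability) pairs for a stock that moves up or down by a fixed percent each year."""
    dist = []
    for ups in range(years, -1, -1):
        price = spot * (1 + up) ** ups * (1 - down) ** (years - ups)
        prob = comb(years, ups) * p_up ** ups * (1 - p_up) ** (years - ups)
        dist.append((price, prob))
    return dist

dist = terminal_distribution(100, 0.25, 0.25, 4)
fair_value = sum(price * prob for price, prob in dist)          # the sumproduct
p_loss = sum(prob for price, prob in dist if price < 100)       # share of losing paths

for price, prob in dist:
    print(f"${price:8.2f}  {prob:.4f}")
print(f"fair value ${fair_value:.2f}, probability of a loss {p_loss:.2%}")
```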

This is the mathematical nature of compounding. The most likely outcomes are lower even if the stock is fairly priced.

In the real world, stocks don’t just flip up and down like coins. The probabilities are not 50/50 and there aren’t just 2 buckets they can rest in from one year to the next. The beauty of option surfaces is they allow us to separate the probabilities from the distance of the buckets (and the number of buckets is continuous…there’s no price the stock is not allowed to go to).

Here’s some homework you can do with the above data:

  1. What’s the 4-year $146.48 strike call worth?1
  2. What’s the value of the 4-year $75 strike put? 2
  3. How about the 4-year $125 call? 3

Bonus Questions

Imagine this stock is an ETF and there’s a 2x levered version (which means it’s 2x as volatile) of it.

  • What strike call on the levered ETF is equivalent to the $146.48 strike on the unlevered ETF?4 (Hint: It’s further than $46.48 OTM)
  • What’s the value of the call at that strike? 5
  • If I was a market-maker and I got lifted at fair value on the 2x levered ETF 4-year 200 strike call and I go buy the regular ETF 4-year 150 calls to cover my risk how many do I need to buy to be perfectly hedged? (Assume you can buy them for what they’re worth…you have enough information to compute their fair value). 6

If you got through this then you have a new appreciation for how far certain prices are from a spot price and how it depends on time and volatility!

Starting from basics like the volatility tax, progressing to how path influences the volatility tax (trends are more like a volatility rebate and choppiness is a tax….the ratio of trend to chop will determine the ultimate cost of the volatility), and finally bridging these concepts to Black Scholes this series will take your understanding of compounding and how returns work to a deeper level.

  1. The Volatility Drain
  2. Path: How Compounding Alters Return Distributions
    [Between this post and the bonus questions you can start to see why pricing OTM options on levered ETFs given a liquid options market on the unlevered version is an application of these concepts]
  3. Solving A Compounding Riddle With Black-Scholes

Shout Out To Matt Hollerbach

Despite trading options for nearly 20 years at the time, it wasn’t until 2019 that I thought really hard about compounding. I knew how to manipulate formulas and how it related to options but it wasn’t until I discovered Matt’s work that I started to see it from a new angle. Matt makes it approachable and builds up insights in small steps. His blog inspired mine, especially many of my earlier posts. The entire blog is worth spending time working through. It’s similar to what I’ve said about gambling — it’s a place where you will learn how to think about risk and return far better than what finance texts will teach.

These are all-time great ones:

Trend Following is Hot Air

Investing Games

Solving the Equity Premium Puzzle, and Uncovering a Huge Flaw in Investment Theory

It’s painful to watch the median (or should I say average) “investor” reason about how markets work because without these intuitions (you don’t need to know formulas necessarily) you are innumerate. That’s like being illiterate but for like numbers and stuff. And the deficiency is as obvious as illiteracy is to a literate person.

The good news is we can all get better.

Understanding Implied Forwards

These are not trick questions:

Suppose you have an 85 average on the first 4 tests of the semester. There’s one test left. All tests have an equal value in your final score. You need a 90 average for an A in the class.

What do you need on the last test to get an A in the class?

What is the maximum score you can get for the semester?

If you are comfortable with the math you have the prerequisites required to learn about a useful finance topic — implied forwards!

Implied forwards can help you:

  • find trading opportunities
  • understand arbitrage and its limits

We’ll start in the world of interest rates.

The Murkiness Of Comparing Rates Of Different Maturities

Consider 2 zero-coupon bonds. One that matures in 11 months and one that matures in 12 months. They both mature to $100.

Scenario A: The 11-month bond is trading for $92 and the 12-month bond is trading for $90.

What are the annualized yields of these bonds if we assume continuous compounding?1

Computing the 12-month yield

r = ln($100/$90) = 10.54%

Computing the 11-month yield

r = ln($100/$92) * 12/11 = 9.10%

This is an ascending yield curve. You are compensated with a higher interest rate for tying up your money for a longer period of time.

But it is very steep.

You are picking up 140 extra basis points of interest for just one extra month.

Let’s do another example.

Scenario B: We’ll keep the 12-month bond at $90 but say the 11-month bond is trading for only $91.

Computing the 11-month yield

r = ln($100/$91) * 12/11 = 10.29%

So now the 11-month bond yields 10.29% and the 12-month bond yields 10.54%

You still get paid more for taking extra time risk but maybe it looks more reasonable. It’s kind of hard to reason about 25 bps for an extra month. It’s murky.
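The yield calculations for both scenarios, as a minimal sketch. The function name and the $100 default face value are my own conventions.

```python
import math

def continuous_yield(price, months, face=100.0):
    """Annualized continuously compounded yield of a zero-coupon bond."""
    return math.log(face / price) * 12 / months

y12 = continuous_yield(90, 12)     # 12-month bond at $90
y11_a = continuous_yield(92, 11)   # Scenario A: 11-month bond at $92
y11_b = continuous_yield(91, 11)   # Scenario B: 11-month bond at $91

print(f"12-month: {y12:.2%}")
print(f"11-month (A): {y11_a:.2%}  spread {1e4 * (y12 - y11_a):.0f} bps")
print(f"11-month (B): {y11_b:.2%}  spread {1e4 * (y12 - y11_b):.0f} bps")
```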

Think back to the test score question this post opened with. There is another way of looking at this if we use a familiar concept — the weighted average.

The Implied Forward Interest Rate

We can think of the 12-month rate as the average rate over all the intervals. Just like a final grade is an average of the individual tests.

We can decompose the 12-month rate into the average of an 11-month rate plus a month-11 to month-12 forward rate:

“12-month” rate = “11-month” rate + “11 to 12-month” forward rate

Let’s return to scenario A:

12-month rate = 10.54%

11-month rate = 9.1%

Compute the “11 to 12-month” forward rate like a weighted average:

10.54% x 12 = 9.1% x 11 + Forward Rate₁₁₋₁₂ x 1

Forward Rate₁₁₋₁₂ = 26.37%

We knew that 140 bps was a steep premium for one month but when you explicitly compute the forward you realize just how obnoxious it really is.

How about scenario B:

12-month rate = 10.54%

11-month rate = 10.29%

Compute the “11 to 12-month” forward rate like a weighted average:

10.54% x 12 = 10.29% x 11 + Forward Rate₁₁₋₁₂ x 1

Forward Rate₁₁₋₁₂ = 13.26%
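The weighted-average decomposition in code, as a sketch. The function names are mine, and I feed in unrounded yields computed from the bond prices, so the output can differ in the last digit from plugging in the rounded percentages by hand.

```python
import math

def continuous_yield(price, months, face=100.0):
    """Annualized continuously compounded yield of a zero-coupon bond."""
    return math.log(face / price) * 12 / months

def implied_forward_rate(rate_short, months_short, rate_long, months_long):
    """Forward rate between the two maturities, from time-weighted rates."""
    weighted_gap = rate_long * months_long - rate_short * months_short
    return weighted_gap / (months_long - months_short)

y12 = continuous_yield(90, 12)
fwd_a = implied_forward_rate(continuous_yield(92, 11), 11, y12, 12)  # Scenario A
fwd_b = implied_forward_rate(continuous_yield(91, 11), 11, y12, 12)  # Scenario B

print(f"Scenario A forward: {fwd_a:.2%}")
print(f"Scenario B forward: {fwd_b:.2%}")
```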

Arbitraging The Forward Rate (Sort Of)

It’s common to have a dashboard that shows term structures. But the slopes between months can be optically underwhelming with such a view. Seeing that the implied forward rate is 13.26% feels more profound than seeing a 25 bps difference between month 11 and month 12.

You may be thinking, “this forward rate is a cute spreadsheet trick, but it’s not a rate that exists in the market.”

Let’s take a walk through a trade and see if we can find this rate in the wild.

The first step is just to ground ourselves in a basic example before we understand what it means to capture some insane forward rate.

Consider a flat-term structure:

[Note: the forward rate should be 10.54% but because I’m computing YTM on a bond price that only goes to 2 decimal places we are getting an artifact. It’s immaterial for these demonstrations]

Now let’s look back at the steep term structure from scenario A:

With an 11-month rate of 9.10% and a 12-month rate of 10.54% we want to borrow at the shorter-term rate and lend at the longer-term rate. That means selling the nearer bond and buying the longer bond.

When you study asset pricing, one of the early lessons is to step through the cash flows. This is the basis of arbitrage pricing theory (APT), a way of thinking about asset values according to their arbitrage or boundary conditions. As opposed to other pricing models, for example CAPM, someone using APT says the price of an asset is X because if it weren’t there would be free money in the world. By walking through the cash flows, they would then show you the free money2. The fair APT price is the one for which there is no free money.

Stepping Thru The Cash Flows

Let’s see how this works:

  1. We short the 11-month bond at $92
  2. We buy 1.022 12-month bonds for $90. We can buy 1.022 of the cheaper bonds from the proceeds of selling the more expensive $92 bond. The net cash flow or outlay is $0.
  3. Spend the next 11 months surfing.

At the 11-month maturity

We will need $100 to pay the bondholder of the 11-month bond so we sell 12-month bonds.

But for what price?

Well, let’s say the prevailing 1-month interest rate is 10.49%, matching the 11-12 month forward rate we saw in the flat term structure example.

In that case, the bonds we own are worth $99.13.

[With one month to maturity we compute the continuous YTM: ln(100/99.13) * 12 = 10.49%]

If we sell 1.009 of our bonds at $99.13 we can raise the $100 to pay back the loan. We are left with .0134 bonds.

At the 12-month maturity

Our stub of .0134 bonds mature and we are left with $1.34.

So what was our net return?

Hmm, lemme think, carry the one, uh — infinite!

We did a zero cash flow trade at the beginning. We didn’t lay out any money and ended with $1.34.

That’s what happens when you effectively shorted a 26.37% forward rate but the one-month rate has rolled down to something normal, in this case about 10.50%

[In real life there are all kinds of frictions — you know like, collateral when you short bonds.]

Summary table:

What if somehow, that crazy 26.37% “11-12 month forward rate” didn’t roll down to a reasonable spot rate but actually turned out to be a perfect prediction of what the 1-month rate would be in 11 months?

Let’s skip straight to the summary table.

Note the big difference in this scenario: the bond with 1 month remaining until maturity is only worth $97.83 (corresponding to that 26.33% yield, ignore small rounding). So you need to sell all 1.022 of the bonds to raise $100 to pay back the loan.

Besides frictions, you can see why this is definitely not an arbitrage — if the 1-month rate spiked even higher than 26.33% the price of the bonds would be lower than $97.83. You would have sold all 1.022 of your bonds and still not been able to repay the $100 you owe!

So the “borrow short, lend long” trade is effectively a way to short a 1-month forward at 26.33%. It might be a good trade but it’s not free money.
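Here is a sketch of the cash flows for both outcomes. The function is hypothetical and ignores every friction. I also assume any shortfall at month 11 is financed at the prevailing 1-month rate, which is why a negative result can be quoted in month-12 dollars.

```python
import math

def borrow_short_lend_long(short_price, long_price, one_month_rate, face=100.0):
    """Month-12 cash from shorting one 11-month bond and buying 12-month bonds with the proceeds.

    one_month_rate is the continuously compounded 1-month rate prevailing at month 11.
    A negative result means the bond sale could not cover the face value owed.
    """
    bonds_bought = short_price / long_price                # zero net outlay at inception
    price_at_11m = face * math.exp(-one_month_rate / 12)   # long bond with 1 month left
    bonds_sold = face / price_at_11m                       # sold to repay the short
    stub = bonds_bought - bonds_sold
    return stub * face

forward_a = 12 * math.log(92 / 90)  # the ~26.37% implied forward from Scenario A

rolled_down = borrow_short_lend_long(92, 90, 0.1049)       # rate rolls down to normal
forward_realized = borrow_short_lend_long(92, 90, forward_a)
rate_spikes = borrow_short_lend_long(92, 90, 0.30)

print(f"1-month rate 10.49%: ${rolled_down:.2f}")
print(f"forward realized:    ${forward_realized:.2f}")
print(f"1-month rate 30%:    ${rate_spikes:.2f}")
```

The breakeven 1-month rate is the implied forward itself: below it the trade makes money, above it the trade loses.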

Still, this exercise shows how our measure of the forward is a tradeable level!

[If you went through the much more arduous task of adjusting for all the real-world frictions and costs you would impute a forward rate that better matched what you considered to be a “tradeable price”. The principle is the same, the details will vary. I was not a fixed-income trader and own all the errors readers discover.]

The Implied Forward Implied Volatility

Now you’re warmed up.

Like interest rates, implied volatilities have a term structure. Every pair of expiries has an implied forward volatility. The principle is the same. The math is almost the same.

With interest rates we were able to do the weighted average calculation by multiplying the rates by the number of days or fraction of the year. That’s because there is a linear relationship between time and rates. If you have an un-annualized 6-month rate, you simply double it to find the annualized rate. You can’t do that with volatility.3

The solution is simple. Just square all the implied volatility inputs so they are variances. Variance is proportional to time so you can safely multiply variance by the number of days. Take the square root of your forward variance to turn it back into a forward volatility.

Consider the following hypothetical at-the-money volatilities for BTC:

                            Expiry 1    Expiry 2
Implied Vol                 40%         42%
Variance (Vol²)             .16         .1764
Time to Expiry (in days)    20          30

Let’s compute the 20-to-30 day implied forward volatility. We follow the same pattern as the weighted test averages and weighted interest rate examples.

The decomposition where DTE = “days to expiry”:

“variance for 30 days” = “variance for 20 days” + “variance from day 20 to 30”

Expiry 2 variance * DTE₂ = Expiry 1 variance * DTE₁ + Forward variance₂₀₋₃₀ * Days₂₀₋₃₀

Re-arrange for forward variance:

Fwd Variance₂₀₋₃₀ = (Expiry 2 variance * DTE₂ – Expiry 1 variance * DTE₁) / Days₂₀₋₃₀

Fwd Variance₂₀₋₃₀ = (.1764 * 30 – .16 * 20) / 10

Fwd Variance₂₀₋₃₀ = .2092

Turning variance back into volatility:

√.2092 = 45.7%

If the 20-day option implies 40% vol and the 30-day option implies 42% vol, then it makes sense that the vol between 20 and 30 days must be higher than 42%. The 30-day volatility includes 40% vol for the first 20 days, so the time contained in the 30-day option that DOES NOT overlap with the 20-day option must be high enough to pull the entire 30-day vol up to 42%.

This works in reverse as well. If the 30-day implied volatility were lower than the 20-day vol, then the 20-30 day forward vol would need to be lower than the 30-day volatility.
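The same calculation as a small function. This is a sketch: the name and the choice to raise an error on negative forward variance are mine.

```python
import math

def implied_forward_vol(vol_near, days_near, vol_far, days_far):
    """Forward vol between two expiries, computed by weighting variances by days."""
    fwd_var = (vol_far ** 2 * days_far - vol_near ** 2 * days_near) / (days_far - days_near)
    if fwd_var < 0:
        # The far expiry carries less total variance than the near one.
        raise ValueError("negative forward variance")
    return math.sqrt(fwd_var)

ascending = implied_forward_vol(0.40, 20, 0.42, 30)   # the BTC example
descending = implied_forward_vol(0.42, 20, 0.40, 30)  # the reverse case

print(f"20d 40% / 30d 42% -> forward {ascending:.1%}")
print(f"20d 42% / 30d 40% -> forward {descending:.1%}")
```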

The Arbitrage Lower Bound of a Calendar Spread

The fact that the second expiry includes the first expiry creates an arbitrage condition (at least in equities). An American-style time spread cannot be worth less than 0. In other words, a 50 strike call with 30 days to expiry cannot be worth less than a 50 strike call with 20 days to expiry.

Here’s a little experiment (use ATM options, it will not work if the options are far OTM and therefore have no vega):

Pull up an options calculator where you make a time spread worth 0.

I punched in a 9-day ATM call at 39.6% vol and a 16-day ATM call at 29.70001% vol. These options are worth the same (for the $50 strike ATM they are both worth $1.24).

Now compute the implied forward vol.

                            Expiry 1    Expiry 2
Implied Vol                 39.6%       29.70001%
Variance (Vol²)             .157        .088
Time to Expiry (in days)    9           16

You can predict what happens when we weight the variance by days:

Expiry1 = .157 * 9 = 1.411

Expiry2 = .088 * 16 = 1.411

Expiry 2 has the same total variance as Expiry 1 which means there is zero implied variance between day 9 and day 16.

The square root of zero is zero. That’s an implied forward volatility of zero!

A possible interpretation of zero implied forward vol:

The market expects a cash takeover of this stock to close no later than day 9 with 100% probability.

A Simple Tool To Build

With a list of expirations and corresponding ATM volatility, you can construct your own forward implied volatility matrix:


Like the interest rate forward example, there’s no arbitrage in trying to isolate the forward volatility unless you can buy a time spread for zero.4
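A bare-bones version of such a matrix might look like the sketch below. The term structure is made up for illustration (only the first two points reuse the BTC example from earlier), and the function name and output layout are my own.

```python
import math

def forward_vol_matrix(expiries):
    """Forward vols for every pair of expiries.

    expiries: list of (days_to_expiry, atm_vol) sorted by days.
    Returns {(near_days, far_days): forward_vol}, with None where the
    forward variance is negative (a time spread priced below zero).
    """
    matrix = {}
    for i, (near_days, near_vol) in enumerate(expiries):
        for far_days, far_vol in expiries[i + 1:]:
            fwd_var = (far_vol ** 2 * far_days - near_vol ** 2 * near_days) / (far_days - near_days)
            matrix[(near_days, far_days)] = math.sqrt(fwd_var) if fwd_var >= 0 else None
    return matrix

# Hypothetical ATM term structure: (days to expiry, implied vol).
term_structure = [(20, 0.40), (30, 0.42), (60, 0.45), (90, 0.46)]
matrix = forward_vol_matrix(term_structure)

for (near, far), vol in matrix.items():
    label = "n/a" if vol is None else f"{vol:.1%}"
    print(f"{near:>3}d -> {far:>3}d: {label}")
```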

For most of the past decade, implied volatility term structures have been ascending (or “contango” for readers who once donned a NYMEX or CBOT badge). If you sell a fat-looking time spread you have a couple major “gotchas” to contend with:

  1. Weighting the trade
    If you are short a 1-to-1 time spread you are short vega, long gamma, and paying theta. This is not inherently good or bad. But you need a framework for choosing which risks you want and at what price (that statement is basically the bumper sticker definition of trading imbued simultaneously with truth and banality). If you want to bet on the time spread narrowing, ie the forward vol declining, then you need to ratio the trades. The end of Moontower On Gamma discusses that. Even then, you still have problems with path-dependence because the gamma profile of the spread will change as soon as the underlying moves. The reason people trade variance swaps is that the gamma profile of the structure is constant over a wide range of strikes providing even exposure to the realized volatility. Sure you could implement a time spread with variance swaps, but you get into idiosyncratic issues such as bilateral credit risk and greater slippage.
  2. The bet, like the interest rate bet, comes down to what the longer-dated instrument does outright. You were trying to isolate the forward vol, but as time passes your net vega grows until eventually the front month expires and you are left with a naked vol position in the longer-dated expiry and your gamma flips from highly positive to negative (assuming the strikes were still near the money).

Term structure bets are usually not described as bets on forward volatility but more in the context of harvesting a term premium as time passes and implied vols “roll down the term structure”. This is a totally reasonable way to think of it, but using an implied forward vol matrix is another way to measure term premiums.

The Wider Lessons


Forward vols represent another way to study term structures. Since term structures can shift, slope, and twist you can make bets on the specific movements using outright vega, time spreads, and time butterflies respectively. A tool to measure forward vols is a thermometer in a doctor’s bag. How do we conceptually situate such tools in the greater context of diagnosis and treatment?

Here’s my personal approach. Recognize that there are many ways to skin a cat, this is my own.

  1. I use dashboards with cross-sectional analysis as the top of an “opportunity funnel”. You could use highly liquid instruments to calibrate to a fair pricing of parameters (skew, IV risk premium, term premium, wing pricing, etc) in the world at any one point in time. This is not trivial and why I emphasize that trading is more about measurement than prediction. To compare parameters you need to normalize across asset types.
    To demonstrate just how challenging this is, an interview question I might ask is:

    Price a 12-month option on an ETF that holds a rolling front-month contract on the price of WTI crude oil5

    I wouldn’t need the answer to be bullseye accurate. I’m looking for the person’s understanding of arbitrage-pricing theory which is fundamental to being able to normalize comparisons between financial instruments. The answer to the question requires a practical understanding of replicating portfolios, walking through the time steps of a trade, and computing implied forward vols on assets with multiple underlyers. (Beyond pricing, actually trading such a derivative requires understanding the differences in flows between SEC and CFTC-governed markets and who the bridges between them are.)

  2. The contracts or asset classes that “stick out” become a list of candidates for research. There are 2 broad steps for this research.
    • Do these “mispriced” parameters reveal an opportunity or just a shortcoming in your normalization?
      Sleuthing the answer to that may be as simple as reading something publicly available or could require talking to brokers or exchanges to see if there’s something you are missing. If you are satisfied to a degree of certainty commensurate with the edge in the opportunity that you are not missing anything crucial, then you can move to the next stage of investigation.
    • Understanding the flow
      What flow is causing the mispricing? What’s the motivation for the flow? Is it early enough to bet with it? Is it late enough to bet against it? You don’t want to trade the first piece of a large order but you will not get to trade the last piece either (that piece will either be fed to the people who got hurt trading with the flow too early, as a favor from the broker who ran them over since trading is a tit-for-tat iterated game, or internalized by the bank who controls the flow and knows the end is near.)

  3. Execute

Suppose you determine that the term structure is too cheap compared to a “fair term structure” as triangulated by an ensemble of cross-sectional measurements. Perhaps, there is a large oil refiner selling gasoline calls to hedge their inventory (like covered calls in the energy world). You can use the forward vol matrix to drill down to the expiry you want to buy. “Ah, the 9-month contract looks like the best value according to the matrix. Let’s pull up a montage and see if it’s really there. Let’s see what the open interest is?…”

As you examine quotes from the screens or brokers, you may discover that the tool is just picking up a stale bid/ask or wide market, and that the cheapest term isn’t really liquid or tradeable. This isn’t a problem with the tool, it’s just a routine data screening pitfall. The point is that tools of this nature can help you optimize your trade expression in the later stage of the funnel.


This discussion of forward vols was like month 1 learning at SIG. It’s foundational. It’s also table stakes. Every pro understands it. I’m not giving away trade secrets. I am not some EMH maxi6 but I’ll say I’ve been more impressed than not at how often I’ll explore some opportunity and be discouraged to know that the market has already figured it out. The thing that looks mispriced often just has features that are overlooked by my model. This doesn’t become apparent until you dig further, or until you put on a trade only to get bloodied by something you didn’t account for as a particular path unfolds.

This may sound so negative that you may wonder why I even bother writing about this on the internet. Most people are so far out of their depth, is this even useful? My answer is a confident “yes” if you can learn the right lesson from it:

There is no silver bullet. Successful trading is the sum of doing many small things correctly including reasoning. Understanding arbitrage-pricing principles is a prerequisite for establishing what is baked into any price. Only from that vantage point can one then reason about why something might be priced in a way that doesn’t make sense and whether that’s an opportunity or a trap7. By slowly transforming your mind to one that compares any trade idea with its arbitrage-free boundary conditions or replicating portfolio/strategy, you develop an evergreen lens to ever-changing markets.

You may only gain or handle one small insight from these posts. But don’t be discouraged. Understanding is like antivenom. It takes a lot of cost and effort to produce a small amount8. If you enjoy this process despite its difficulty then it’s a craft you can pursue for intellectual rewards and profit.

If profit is your only motivation, at least you know what you’re up against.