There was a famous futures trader on the NYMEX when I was there named Mark Fisher. I leased office space from him (his clearing firm had a large footprint) but I only spoke to him once or twice briefly in passing. We didn’t really know each other. Anyway, he wrote a book called Logical Trader and backed people to trade his system. A copy of it was lying around the office (it’s in my garage in boxes I haven’t unpacked since moving over 3 years ago). I read a chapter that happens to be available for free online.
The book was mentioned on Twitter and got me thinking a bit about its core observation:
If you subscribe to the “random walk” theory, which states that the market’s movements are random and totally unpredictable, then the opening range would not be any more important than any other price level during the trading day. Right? For example, crude oil trades from 9:45 a.m. Eastern time until 3:10 p.m. Eastern Time. If you divided that day into 10-minute intervals, you’d have 32 parcels of time (and five minutes left over). So, each 10-minute time interval would account for roughly 1/32 of the market activity. Using random walk theory, you’d expect that the opening range (established in the first 10 minutes of trading) would be the high 1/32 of the time, or it would be the low 1/32 of the time. Therefore, random walk theory would dictate that 1/16 of the time the opening range would be EITHER the high or the low. Now, what if I told you that in volatile markets – not static, and not necessarily trending markets – the opening range tends to be the high or the low 17-23% of the time? Would that get your attention? Yes. Because this observation would tell you that the opening range being at the high or the low of the day roughly one-fifth of the time is what we call “statistically significant.” In complete layman’s terms, this means the opening range is not just another 10-minute interval out of 32 of them in the trading day. It has more weight than any other time interval.
Let’s take another example. Let’s say that you divide the trading day up into roughly 64, five-minute intervals. Random walk theory would state that the opening, five-minute range would be the high 1/64 of the time or the low 1/64 of the time. So it would be either of those extremes 1/32 of the time. However, in volatile markets, that five-minute opening range is actually the high or the low of the day about 15-18% of the time. So instead of about 3% of the time, as random walk theory would predict, the first five minutes of the trading day turns out to be the high or the low 15-18% of the time. Again, statistically significant. And, from a trader’s perspective, if you knew that something was going to mark the high or the low 15% of the time, you’d want to know that.
In short, if the opening range is the high or low a disproportionate amount of the time, Fisher concludes that the odds are in favor of an intraday breakout strategy. The gist of it is:
Once the opening range is established, if the futures break out of the range by, say, some fraction of a standard deviation, then bet that they will continue.
In the case of an upside breakout, set a stop near the bottom end of the opening range.
The book goes into sizing, money management, where to set levels and other details.
I’m not going to weigh in on the strategy’s merits because it’s not my wheelhouse. I have lots of questions and of course, come from a place of skepticism but that’s just a healthy reaction to any anomaly. It’s not yet the “work”.
But it did get me wondering about how likely you’d expect the opening price in a market to be the high or low.
First, I turned it into a simpler riddle that you could try to solve yourself (or give to some kids to noodle on).
There are a couple of neat ideas in the solutions. (You’ll also find the trick GPT taught me. Satisfying, clever and reusable.)
Based on tinkering with a simple random walk, it does seem that an opening price (or the zero crossover in my toy examples) being the high or low a disproportionate amount of time would not be random.
[Although that isn’t enough to suggest that there is a positive expectancy in the strategy. It’s possible the payoffs on the breakouts times their frequency don’t compensate you for the number of times you get stopped out. If any systematic traders reading this feel like being nerdsniped by researching it I’d love to see the conditional probabilities that surround the different scenarios].
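A minimal sketch of that kind of tinkering, under assumptions that are not Fisher's: a symmetric ±1 random walk, where `n_steps` stands in for how finely the day is sliced, and "high or low" is counted weakly (a later tie with the open still counts). For this toy walk the chance of never dipping below the start has a known closed form, C(n, ⌊n/2⌋) / 2ⁿ, so the simulation is only a sanity check on the formula. The function names are made up for illustration.

```python
import math
import random


def exact_open_is_extreme(n_steps: int) -> float:
    """Chance the start of a symmetric +/-1 walk is its (weak) high or low."""
    # Paths that never go below the start: C(n, n // 2) out of 2**n.
    never_below = math.comb(n_steps, n_steps // 2) / 2 ** n_steps
    # By symmetry the same fraction never goes above the start,
    # and a walk that moves at all cannot do both.
    return 2 * never_below


def simulated_open_is_extreme(n_steps: int, n_paths: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        pos = low = high = 0
        for _ in range(n_steps):
            pos += 1 if rng.random() < 0.5 else -1
            low = min(low, pos)
            high = max(high, pos)
        if low == 0 or high == 0:  # the open was never breached on one side
            hits += 1
    return hits / n_paths


if __name__ == "__main__":
    for n in (10, 100, 1_000, 23_400):
        print(n, round(exact_open_is_extreme(n), 4))
```

The result is very sensitive to the step count. With one step per second of a 6.5-hour session (23,400 steps) the formula gives roughly 1%, while much coarser walks give much larger numbers.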
Money Angle For Masochists
Let’s practice option intuition on this same problem.
The setup:
A 30% vol asset opens at $40
It rallies to $40.50
Half the trading day has elapsed
What’s the probability it crosses $40 again today?
If we assume a Black-Scholes lognormal distribution with no skew (not unreasonable for a single day) we can compute the probability by turning $40 into a Z-score.
ln(K/S) is basically how much percent away $40 is from $40.50.
ln(K/S) = ln(40/40.50) = -1.24%
By dividing by σ√t we scale the 1.24% by standard deviation for the remaining time.
-1.24% / (30% * √(.5/365)) = -1.12
$40 is 1.12 standard devs away.
The probability of the asset sliding at least 1.12 standard devs is 13%.
In Black-Scholes world, the probability of a strike expiring in-the-money is known as N(d2). But for short-dated options, delta is a valid substitute for N(d2).
So we’d expect the delta of the $40 strike with half a trading day remaining and the asset at $40.50 to be 13%.
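The Z-score arithmetic above can be reproduced in a few lines. This sketch follows the post's convention of half a day as 0.5/365 years and, as an assumption, ignores the tiny drift term that a full N(d2) would include. `finish_below_probability` is a name made up for illustration.

```python
import math
from statistics import NormalDist


def finish_below_probability(spot, strike, vol, t_years):
    """Return (z, probability of finishing below the strike)."""
    log_distance = math.log(strike / spot)   # about -1.24% for 40 vs 40.50
    sigma_t = vol * math.sqrt(t_years)       # one standard deviation for the time left
    z = log_distance / sigma_t
    return z, NormalDist().cdf(z)


if __name__ == "__main__":
    z, p = finish_below_probability(spot=40.50, strike=40.00, vol=0.30, t_years=0.5 / 365)
    print(f"z = {z:.2f}, probability = {p:.0%}")
```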
In the context of our earlier conversation, you might think that the probability of crossing zero (ie the opening price) is 13% but we need to make a key distinction based on using the option delta:
The delta is telling us the probability of expiring in-the-money…but our riddle is concerned with whether the price or random walk ever breaches zero even if it goes back up.
The riddle is not concerned with the probability of a vanilla option but a one-touch option.
Investopedia defines these exotic options:
A one-touch option pays a premium to the holder of the option if the spot rate reaches the strike price at any time before option expiration.
I’ve never priced one-touch options but I remember a quant trader telling me that their probability of being triggered was approximately 2x the delta of the vanilla option of the equivalent strike.
In this example, the probability of the asset touching a price less than $40 before the day ends is 13% x 2 = 26%
This is intuitive if we consider an at-the-money option that has a 50% delta. The asset is nearly 100% to touch prices on either side of the strike.
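The 2x rule of thumb can be checked with a small simulation. This is a sketch under stated assumptions: driftless normal steps in log-price, the same inputs as the example above, and a barrier that is only checked at discrete steps. `touch_vs_finish` and its parameters are hypothetical names.

```python
import math
import random


def touch_vs_finish(spot=40.50, barrier=40.00, vol=0.30, t_years=0.5 / 365,
                    n_steps=400, n_paths=4_000, seed=0):
    """Return (P(touch the barrier at any step), P(finish below the barrier))."""
    rng = random.Random(seed)
    step_sd = vol * math.sqrt(t_years / n_steps)
    log_barrier = math.log(barrier / spot)
    touched = finished_below = 0
    for _ in range(n_paths):
        x = 0.0            # log return since "now"
        hit = False
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_sd)   # driftless step
            if x <= log_barrier:
                hit = True
        touched += hit
        finished_below += x <= log_barrier
    return touched / n_paths, finished_below / n_paths


if __name__ == "__main__":
    touch, finish = touch_vs_finish()
    print(f"touch: {touch:.1%}  finish below: {finish:.1%}  ratio: {touch / finish:.2f}")
```

Checking only at discrete steps misses some touches, so the simulated ratio tends to come in a bit under 2. The exact factor of 2 is the reflection principle for a driftless continuous path.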
[It’s convenient and expected that option trader math gets reduced to rules-of-thumb (“straddle price is 80% of the vol scaled by time”, “multiplying the daily move by 16”, “implied correlation is ratio of index variance to avg stock variance”) since so much of flow trading is making quick decisions and on-the-fly comparisons or normalizations.]
If price changes were a random walk I wouldn’t expect the opening price to be the high or low more than 1% of the time. But the open price, while it cannot be predicted, likely holds meaning once it’s established because it is a single clearing price of an auction that accumulated hours of overnight information.
[I spent almost 2 years on the NYSE both as a broker in the “garage” if you are familiar with the place, and as a specialist in ETFs (in the “blue room”). The open is the price that best clears the order book when considering the stack of market and limit orders on both sides. But consider this scenario:
At a price of $40.23 there’s an imbalance of 10,000 shares for sale
At a price of $40.22 there’s an imbalance of 75,000 shares to buy
You can expect the stock to open at $40.23 and for the specialist to buy the 10,000 shares for their own account and then to display a market with $40.22 bid for size to induce buyers. The opening price had information in it.]
Understanding Variance Time (Moontower)
This post ties in nicely with another riddle:
In the option demonstration above I said there was half a day until expiration.
What time is it?
Hint: The answer is not the point. And you won’t get it anyway. I’d consider your response a success if you can just identify what the inputs the answer depends on. Godspeed.
If you use options to hedge or invest, check out the moontower.ai option trading analytics platform
On Wednesday, a friend and I hosted 30 kids ranging from age 7 to 13 for Financial Literacy Session I. Parents had drinks and pizza in the adjacent room. We kept it fun and highly interactive. No grown-ups standing in front of a room. The feedback was overwhelming: the kids not only learned but had a blast.
You can do this for your kids’ friend groups too.
I’d describe it as “Kiyosaki without the brainworms”:
The floor and ceiling on savings (your savings don’t start until you cover your costs while savings from wages are capped by the number of hours in a day) —>
how to increase your savings rate (earn more per hour, even when you’re sleeping, via investing or business ownership) —>
how compounding grows your savings
End
If you seriously decide to do this in your community, I’m happy to offer tips.
A few canned ones:
One tricky thing was the wide 7-13 age range. Littles run out of gas by 7:30pm and fractions are hard or inaccessible to most…but the 9+ group loved the compounding riddles. When I asked who wanted paper to do a math problem I didn’t expect to get mobbed. Something I learn over and over — give kids credit. They want to be stimulated.
We had prizes for right answers and some kids were so on point we had to adlib some timed questions to cull the herd because we didn’t have enough prizes.
No standing in front of the room. Get on their level. Silly is good but be quick to shut down “bottle flipping” distractions or any intra-group condescension. You are trying your best to meet every kid where they are. Also, not all kids are comfortable speaking up. It’s on you to make the environment inclusive to the best of your ability, as opposed to getting carried away with the energy of the dominant kids (much of that energy is insecure competitiveness that kids are understandably still navigating, but then again, you’ve certainly met adults wearing the same masks. They’ve just hardened into a “personality”).
We opened the discussion of compounding with this:
1. You deposit $100 at 10% interest. You pocket the interest at the end of each year. Repeat for 3 years.
2. You don’t remove any money until 3 years elapse.
How much do you end up with in each case?
For the more mathy questions we’d let them work out the questions on paper and work in groups if they wanted.
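For grown-ups who want to check the kids' work, the two cases in the compounding question can be written out as below. The function names are made up for illustration.

```python
def pocket_the_interest(principal, rate, years):
    """Case 1: interest is withdrawn each year, so it never earns interest."""
    return principal + principal * rate * years


def leave_it_alone(principal, rate, years):
    """Case 2: interest stays in the account and compounds."""
    return principal * (1 + rate) ** years


if __name__ == "__main__":
    print(round(pocket_the_interest(100, 0.10, 3), 2))  # deposit plus three $10 payments
    print(round(leave_it_alone(100, 0.10, 3), 2))       # 100 * 1.1 * 1.1 * 1.1
```

The gap between the two answers is the interest earned on interest.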
I’m about 60% through William Poundstone’s Fortune’s Formula: The Untold Story of the Scientific Betting System That Beat the Casinos and Wall Street.
It’s a gripping narrative full of 20th century trivia that ties together the birth of information theory, some of the greatest scientific minds of the 1900s, the rise of quantitative finance, and the role of organized crime. These topics come alive in a fresh, memorable way when discovered through the lens of its colorful characters.
It chronicles the history of the efficient market hypothesis (MIT, U Chicago, Paul Samuelson). You can organize its conclusion around this excerpt:
There is much truth in the efficient market hypothesis. The controversy has always been over just how far the claim can be pressed. Asking whether markets are efficient is like asking whether the world is round. The best way to answer depends on the expectations and sophistication of the questioner. If someone is asking whether the world is round or flat, as fifteenth-century Europeans might have asked, then “round” is a better answer. If someone knows that and is asking whether the earth is a geometrically perfect sphere, the answer is no.
A few ideas that struck me:
An industry uses academic research to protect itself from…academic research
In 1959, Harry Markowitz published his famous book on Portfolio Selection. Everyone in finance read that, or said they did. Financial advisers responded to Markowitz’s model. They were growing aware of this new and threatening current in academic thought: the efficient market hypothesis. Markowitz demonstrated that all portfolios are not alike when you factor in risk.
Investopedia aside:
The efficient frontier is the set of optimal portfolios that offer the highest expected return for a defined level of risk or the lowest risk for a given level of expected return.
Poundstone continues:
Therefore, even in an efficient market, there is reason for investors to pay handsomely for financial advice. Mean-variance analysis quickly swept through the financial profession and academia alike, establishing itself as orthodoxy.
The Problem With Markowitz
1) Indecision
The Hamlet-like indecision of mean-variance analysis
When portfolios are equal on the efficient frontier, it is left to the investor’s risk appetite to decide. Unsatisfying.
2) Only useful for single period analysis
Most people do not invest this way. They buy stocks and bonds and hang on to them until they have a strong reason to sell. Market bets ride, by default. This makes a difference because there are gambles that look favorable as a one-shot, yet are ruinous when repeated over and over. Any type of extreme “overbetting” would fit that description.
(emphasis mine)
Standard mean-variance analysis does not treat the compounding of investments. It is, you might say, a theory for Kelly’s dollar-a-week gambler. But as the wealth to be amassed by compounding is so fantastically greater than can be achieved otherwise, a practical theory of investment must largely be a theory of reinvestment.
A solution to both problems
Indecision
I made up this example inspired by a demonstration in the book.
Consider 2 investments that each have 10 possible discrete returns. The balanced one and the skewed one.
Simple mean-variance metrics will mislead you into thinking the skewed asset is superior. It has a
higher return
lower volatility
cheaper straddle price
higher Sharpe ratio
But the so-called “third moment” of the distribution (the skew) cannot hide from the geometric return which leaves no ambiguity about which investment is superior for a long-term hold.
Aside for masochists
The closer an asset’s return distribution looks to a bell-curve, the closer the straddle price will approximate 80% of the volatility. But when the straddle value is less than .80 of the volatility, you know there is skew or outliers lurking. If you are inspecting an asset’s returns for the first time, a quick trick is to compute the ratio of MAD to Volatility to see if it’s less than .8
A place where this is very handy is in looking at the price changes in inter-month future spreads. If you trade options on them this has important ramifications for pricing. But the lessons extrapolate.
In my contrived example, you are bound to get a “whammy” if you keep pressing.
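Here is a sketch of the same idea with hypothetical numbers. These are not the values from the example above, just two made-up sets of 10 equally likely returns chosen to show the effect. Mean absolute deviation (MAD) stands in for the straddle price, following the aside above.

```python
import math
import statistics

# Hypothetical, equally likely one-period returns (assumed for illustration).
balanced = [-0.40, -0.30, -0.20, -0.10, 0.00, 0.10, 0.20, 0.30, 0.40, 0.50]
skewed = [0.15] * 9 + [-0.75]   # steady gains plus one "whammy"


def summarize(returns):
    mean = statistics.fmean(returns)
    vol = statistics.pstdev(returns)                          # population standard deviation
    mad = statistics.fmean([abs(r - mean) for r in returns])  # straddle proxy
    geo = math.exp(statistics.fmean([math.log1p(r) for r in returns])) - 1
    return {"mean": mean, "vol": vol, "mad": mad,
            "sharpe": mean / vol, "mad_to_vol": mad / vol, "geo": geo}


if __name__ == "__main__":
    for name, rets in (("balanced", balanced), ("skewed", skewed)):
        stats = summarize(rets)
        print(name, {k: round(v, 4) for k, v in stats.items()})
```

With these inputs the skewed set wins on arithmetic mean, volatility, MAD, and Sharpe, yet its geometric return is negative while the balanced set's is positive. Its MAD-to-volatility ratio also sits well below 0.8, which is the tell described in the aside.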
Poundstone writes:
When you try to apply Markowitz theory to compounding, the results can be absurd. One of Ed Thorp’s theoretical contributions to the Kelly criterion literature is a 1969 paper in which he demonstrated the partial incompatibility of mean-variance analysis and the policy of maximizing the geometric mean. Thorp closes his article by declaring that “the Kelly criterion should replace the Markowitz criterion as the guide to portfolio selection.”
Perhaps no economist of the time would have dared such a heresy. It seems unlikely a major economic journal would have published such talk. Thorp’s article appeared in the Review of the International Statistical Institute. Probably few economists saw it. In any event, few economists had heard of John Kelly. That was about to change.
Oft-forgotten history
Defense of Markowitz
Markowitz devoted a chapter of Portfolio Selection to the geometric mean criterion (possibly the most ignored chapter in the book) and cited Latane’s work in the bibliography. Markowitz was virtually the only big-name economist to see much merit in the geometric mean criterion. He recognized that mean-variance analysis is a static, single-period theory. In effect, it assumes that you plan to buy some stocks now and sell them at the end of a given time frame. Markowitz theory tries to balance risk and return for that single period.
The insights derived from the Kelly Criterion have a complex history
Because of this complex lineage, the Kelly criterion has gone by a multitude of names. Not surprisingly, Henry Latané never used “Kelly criterion.” He favored “geometric mean principle.” He occasionally abbreviated that to the catchier “G policy” or even, simply, to “G.”
Breiman used “capital growth criterion,” and the innocuous-sounding “capital growth theory” is also heard. Markowitz used MEL, for “maximize expected logarithm” of wealth. In one article, Thorp called it the “Kelly[-Breiman-Bernoulli-Latané or capital growth] criterion.” This is not counting the yet-more-numerous discussions of logarithmic utility.
This confusion of names has made it relatively difficult for the uninitiated to follow the idea in the economic literature. The person most shortchanged by this nomenclature is probably Daniel Bernoulli. He had 218 years’ priority on Kelly. The unique and unprecedented part of Kelly’s article is the connection between inside information and capital growth. This is a connection that could not have been made before Shannon rendered information measurable. Bernoulli considers a world where the cards are on the table, so to speak, and all the probabilities are public knowledge. There is no hidden information.
I was messing with the Black-Scholes equation (as one does for fun) and happened on another way to visually understand it.
A prerequisite for appreciating this angle is to be familiar with Black-Scholes in the first place. If you aren’t and would like an intuitive understanding of the equation check out:
⚡The Intuition Behind The Black Scholes Equation (Moontower)
This is a review of what you need from that post but if it’s still foggy you can go back to the whole post.
This is the B-S equation:
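For reference, this is the standard textbook form for a European call on a non-dividend-paying stock, written in the notation used later in this post (S for stock price, X for strike, r for the risk-free rate, t for time to expiration). The symbol σ for volatility is an added assumption.

```latex
C = S\,N(d_1) - X e^{-rt}\,N(d_2), \qquad
d_1 = \frac{\ln(S/X) + \left(r + \tfrac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}}, \qquad
d_2 = d_1 - \sigma\sqrt{t}
```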
The equation was born from a principle of no arbitrage:
If you can replicate the cash flow of an asset with a strategy then the price of the asset should equal the cost of executing the strategy.
The left-side of the equation (ie the call option price) is the asset
The right-side of the equation is the strategy
Construct a long/short portfolio:
Short the call (the left-side)
Long the strategy (the right-side)
Strategy – call = 0 profit
The call price must equal the p/l of the strategy for there to be no arbitrage.
Zooming in on the strategy (the right-side of the equation) via AnalystPrep.com:
The right-side of the equation is the strategy that replicates a long call option. (To offset the actual call we are short)
Let’s break this down step-by-step:
That strategy is a portfolio
The value of that portfolio at expiration discounted to present value must equal the value of the call option today
The portfolio has 2 components:
Shares: We need long shares of the underlying stock
Cash: We will need a loan to finance those shares (An important idea in derivatives pricing via arbitrage is that we assume the strategy is “self-financing”. That means you don’t need money to start. If you respond with “But I do have some money to start”, the self-financing paradigm is already taking care of the opportunity cost using the RFR. The computation remains valid.)
How much cash and shares do we need?
Share quantity in the portfolio
The amount of stock you need in this replicating portfolio is weighted by the expected value of the strike being in the money. Notice we say “expected value” which is not just probability but probability x payoff. The phrase expected value of the shares going in the money is what determines the delta or hedge ratio of the option.
Delta = N(d1)
Share quantity = S*N(d1)
Cash quantity in the portfolio
We need cash to finance the purchase of those shares. If we are short the call and it goes in the money we know we will receive the strike price at expiration because the long option holder will exercise the call and we will sell shares to them. If the shares were 100% to be in-the-money then we know we would receive the strike price at expiration. For example, if you sold a call option struck at $125 and it was 100% to be in-the-money, you are certain to sell the stock at $125 and receive that much cash at that future date. Of course, the option is not 100% to be in the money. So we discount the strike in 2 ways:
By the probability that it will be in the money
By the risk-free rate, to get it in present value terms
We can now say, on average, you will receive the present value of the strike weighted by its probability of being in the money.
Probability of strike being in-the-money = N(d2)
Again, this is just expected value logic. We weight the present value of the strike by its probability of being in the money.
Cash quantity = PV(strike) * N(d2)
This ends the review. The next section is new material. If this is still foggy zoom in on this part of the B-S primer: Animating The Equation
Visually Representing These Quantities On The Distribution
First, let’s ignore interest.
We just cross out e⁻ʳᵗ
Remember:
N(d1) = delta
N(d2) = Probability stock (S) finishes > strike price (X)
Call = S * delta – X * P(ITM)
Here’s the logic to go with the picture:
You are replicating a long call option with a mix of shares and cash.
1. You will need to sell a zero-coupon bond at X * P(ITM) to buy the stock
Why that amount?
This is the expectancy or probability weighted cost of buying the stock at the strike. Remember, no-arbitrage replication pricing must be “self-financing”. We need the proceeds of the expected cost of the shares today so that becomes the face value of the zero-coupon bond we sell.
2. We will spend the proceeds of the zero-coupon bond to buy shares today. How many shares do we need to buy?
We must buy S * delta worth of shares. So if the shares are $100 and the delta of the option is 30%, we need to buy $30 worth of stock.
However, there’s a problem.
We can’t afford that many shares with our current proceeds!
Why?
Because this is true:
X * P(ITM) < S * delta
Why is this true?
Because S * delta is the expectancy of the stock given it’s higher than the strike X. It’s the sumproduct of all stock prices above X weighted by their probabilities (ie integral of the PDF).
That quantity must be larger than X * P(ITM) which is the probability of the stock being above the strike times the [single point] strike X.
[Note: This idea is also captured in the fact that d1 > d2]
This shortfall in shares we can afford to buy with the proceeds of the zero-coupon bond sale is what the call option must be worth!
This is the essence of B-S. The call value is the balancing price that equates the option to the cost of the replicating portfolio.
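The argument above can be checked numerically. This sketch assumes zero interest rates and a lognormal terminal price, and uses made-up inputs (stock 100, strike 105, 20% vol, three months). It compares S * delta against a brute-force sum of price times probability over all prices above the strike. Function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def bs_pieces(S, X, vol, t):
    """Zero-rate Black-Scholes pieces: (call, delta, P(ITM))."""
    sd = vol * math.sqrt(t)
    d1 = (math.log(S / X) + 0.5 * sd ** 2) / sd
    d2 = d1 - sd
    delta, p_itm = STD_NORMAL.cdf(d1), STD_NORMAL.cdf(d2)
    return S * delta - X * p_itm, delta, p_itm


def stock_value_above_strike(S, X, vol, t, n=40_000):
    """Midpoint-rule sum of (terminal price x probability) for prices above X."""
    sd = vol * math.sqrt(t)
    lo, hi = -8.0, 8.0
    dz = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * dz
        s_t = S * math.exp(-0.5 * sd ** 2 + sd * z)   # zero-rate lognormal price
        if s_t > X:
            total += s_t * STD_NORMAL.pdf(z) * dz
    return total


if __name__ == "__main__":
    S, X, vol, t = 100.0, 105.0, 0.20, 0.25
    call, delta, p_itm = bs_pieces(S, X, vol, t)
    print("S * delta             :", round(S * delta, 3))
    print("sum of prices above X :", round(stock_value_above_strike(S, X, vol, t), 3))
    print("X * P(ITM)            :", round(X * p_itm, 3))
    print("shortfall = call      :", round(call, 3))
```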
A Visual Decomposition Of Delta
We got this:
Call = S * delta – X * P(ITM)
Let’s re-arrange this equation to be in terms of delta.
delta = [Call + X * P(ITM)] / S
in words:
delta = (Call + weighted strike) / S
We know delta is a hedge ratio between 0 and 1.
Observe:
The lower the weighted strike the more the delta looks like call price/stock price which will be a small value
If the weighted strike is high because P(ITM) is high, then the delta will be closer to 1
We can simplify this even more by noticing that dividing by S is just normalizing by the stock price. In fact, if the stock price is $100 then this will be true:
delta x 100 = Call + weighted strike
The 100 just gets in the way of the intuition and is safe to ignore for our purpose. It’s just a scalar. We can see that delta can be simply decomposed as:
delta = Call + weighted strike
This decomposition is more satisfying visually. But before the grand reveal let’s just be thorough and validate that this calculation of delta matches what my B-S calculator says (it does).
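One way to run that validation without a separate calculator is to compare the decomposition against a bump-and-reprice delta. The sketch below assumes zero rates and made-up inputs (stock 100, 20% vol, six months). The function names are hypothetical.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf


def call_and_p_itm(S, X, vol, t):
    """Zero-rate Black-Scholes call price and N(d2)."""
    sd = vol * math.sqrt(t)
    d1 = (math.log(S / X) + 0.5 * sd ** 2) / sd
    d2 = d1 - sd
    return S * N(d1) - X * N(d2), N(d2)


def delta_from_decomposition(S, X, vol, t):
    """delta = (call + weighted strike) / S."""
    call, p_itm = call_and_p_itm(S, X, vol, t)
    return (call + X * p_itm) / S


def delta_from_bump(S, X, vol, t, bump=0.01):
    """Central finite difference: reprice the call with the stock up and down."""
    up, _ = call_and_p_itm(S + bump, X, vol, t)
    down, _ = call_and_p_itm(S - bump, X, vol, t)
    return (up - down) / (2 * bump)


if __name__ == "__main__":
    for strike in (80, 90, 100, 110, 120):
        a = delta_from_decomposition(100.0, strike, 0.20, 0.5)
        b = delta_from_bump(100.0, strike, 0.20, 0.5)
        print(strike, round(a, 4), round(b, 4))
```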
Let’s also remember what the chart of call delta by strike looks like:
When the call is deep ITM the delta is driven by the expensive call value which is made of lots of intrinsic value. The proceeds of the zero-coupon bond represented by the weighted strike price are extremely low, so the call must be expensive.
As the strike becomes ATM and eventually OTM the value of the call gets cheaper. You are able to buy more shares with the proceeds of the weighted strike, so the call doesn’t need to be worth as much to balance the cost of the replicating portfolio.
As interest rates rise, the discounted strike (the amount you can collect from selling a zero-coupon bond) declines, meaning you can buy fewer shares to replicate the option. So the call option must be worth more. This is why call options have a positive rho or sensitivity to interest rates.
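To see the rate effect, put the discount factor back in. This sketch uses the standard Black-Scholes call formula with made-up inputs (stock 100, strike 100, 20% vol, one year) and reprices at a few rates. `call_price` is a name chosen for illustration.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf


def call_price(S, X, vol, t, r):
    """Black-Scholes call with a continuously compounded rate r."""
    sd = vol * math.sqrt(t)
    d1 = (math.log(S / X) + (r + 0.5 * vol ** 2) * t) / sd
    d2 = d1 - sd
    discounted_strike = X * math.exp(-r * t)   # zero-coupon bond proceeds per unit of N(d2)
    return S * N(d1) - discounted_strike * N(d2)


if __name__ == "__main__":
    for r in (0.0, 0.02, 0.05):
        pv_strike = 100.0 * math.exp(-r * 1.0)
        print(f"r={r:.0%}  PV(strike)={pv_strike:.2f}  call={call_price(100.0, 100.0, 0.20, 1.0, r):.2f}")
```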
It’s neat to think of delta as the ratio of a call price + the weighted strike to the stock price. If you consider the price for an OTM call, you realize its delta is entirely driven by the ratio weighted strike / stock price.
And for some completeness, this is a simple chart decomposing the call value into delta hedge – weighted strike (again assuming interest rates are zero)
Overall, this post is grout in your options thinking. It helps make connections between various concepts by showing you them from a different angle.
The history of the US stock market is a small sample size. The true sample size requires looking at non-overlapping returns as opposed to rolling 12-month returns. Which means you get as many data points as you do years.
In the long term, values are related to macro variables such as inflation, monetary policy, commodity prices, interest rates, and earnings. These change on the order of months and years. Worse still, they are all codependent. A better way to think of market data might be that we are seeing a small number of data points that occur a lot of times. This makes quantitative analysis of historical data much less useful than is commonly thought.
In other words, you should be “thinking in N not T”, an admonition that acknowledges that samples drawn from the same regime reduce N.
[Plagiarism Disclaimer: I can’t find the origin of that “N not T” phrase but it’s not mine originally.]
This is also why the concept of attractor landscapes is important. Samples drawn from the same basin of activity are correlated to an underlying gravity that perpetuates the basin.
Taleb’s ‘Turkey Problem’ is the same idea applied to sample size: all of the turkey’s data is drawn from one regime, the fattening.
A paper that made the rounds in October, Stocks for the Long Run? Sometimes Yes, Sometimes No describes the “regime thesis” and how it shrinks sample size:
In conventional financial history, greatest weight would be placed on values computed over the longest interval available, on the theory that sampling returns over time follows the same logic as sampling people from groups. Longer time samples ought to provide more precise estimates of the true expected return on an asset, just as a larger sample of people would produce a better estimate of any differences in height or weight across groups.
The regime thesis denies that longer time intervals are like larger samples. It does not assume a population of asset returns existing outside of time from which larger samples can be drawn by lengthening the series. There are only the asset returns that have been recorded in history thus far. These do not predict future asset returns because their pattern is specific to the regime that prevailed at the time. No analysis of US bond returns from 1792 to 1941 could have predicted the bond returns seen following the war—such a bond abyss had never occurred before.
A regime is a temporary pattern of asset returns that may persist for decades. The regime thesis entails no periodicity and requires no reversion…Stationarity prevails within regimes but not across them. The idea of temporary stationarity distinguishes the regime thesis from Pástor and Stambaugh’s idea that the parameters of the return distribution are unknown. Under their thesis, the new 19th-century US data and the country-level international results can be characterized as an expanded sample relative to Ibbotson. The much larger sample better captures the true volatility of stock and bond returns by allowing more opportunities for extremes to emerge—in particular, extremes over multi-decade intervals, which are necessarily few in a single-market, single-century sample. The regime thesis differs in expecting sustained but temporary stationarity. Stocks and bonds ran neck and neck for over a century (Figure 1). Then after the war, that regime gave way to one where stocks beat bonds, year after year, decade after decade.
An adjacent idea, perhaps an instance of the regime construct, is a trend.
I’ll start with an example that vol traders will grumble about because it’s too familiar.
Imagine the trader buys an option implying 20% annual volatility. They delta hedge a long 2-week option once a day at the close but the stock goes up 1% per day.
The daily volatility annualized is 1% x √251 = 15.84%
The weekly volatility annualized is 5% x √52 = 36.1%
They would be better off if they hedged (ie sampled the vol) weekly. The auto-correlation of the moves affects the scaling of volatility from 1 day to 1 week.
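The two annualizations above, spelled out. Like the example, this sketch treats the size of the move as the per-period volatility and adds the five daily moves without compounding. `annualized_vol` is a made-up helper name.

```python
import math


def annualized_vol(move_per_period, periods_per_year):
    """Scale a per-period move by the square root of time."""
    return move_per_period * math.sqrt(periods_per_year)


daily_move = 0.01
weekly_move = 5 * daily_move            # five straight +1% days, no compounding

sampled_daily = annualized_vol(daily_move, 251)
sampled_weekly = annualized_vol(weekly_move, 52)

if __name__ == "__main__":
    print(f"sampled daily : {sampled_daily:.2%}")
    print(f"sampled weekly: {sampled_weekly:.2%}")
```

Same price path, two very different volatility readings. The only thing that changed is the sampling interval.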
If you only had a regime crystal ball…
The assumption that volatility scales with √n, where n is a unit of time, only holds when the returns are uncorrelated.
Failing to account for autocorrelated returns can lead to serious biases in our estimates of return volatilities…[in the presence of autocorrelation] the standard square root of time rule that the industry uses to translate daily, weekly, or monthly volatilities into annualized volatilities is wrong and produced biased results.
I gave GPT a break and let Claude summarize the paper:
The paper is looking at how autocorrelation, which is the correlation between a time series and its own past or future values, affects risk measures like volatility and drawdowns for investments.
Two example investments are compared – a global stock index and a commodity trading advisor (CTA) index. Even though they have the same average return and volatility, the stock index has larger and longer drawdowns.
The reason is that the stock returns tend to be positively autocorrelated – if returns are up one month, they tend to also be up the next month. So gains build on gains, and losses build on losses.
The CTA returns tend to be negatively autocorrelated – if returns are up one month, they tend to be down the next. This means drawdowns are shorter for CTAs.
Standard methods for calculating risk don’t account for autocorrelation. So they underestimate risk for positively autocorrelated assets like stocks, and overestimate risk for negatively autocorrelated assets like CTAs.
The paper shows how to adjust volatility and drawdown estimates for autocorrelation. This gives better comparisons of risk across different investments.
It also means that common risk-adjusted return measures like the Sharpe ratio can be misleading when autocorrelation isn’t considered. The risk-adjusted returns look very different for stocks vs. CTAs after adjusting for autocorrelation.
In summary, the paper shows that accounting for autocorrelation gives better measures of risk and returns for investments.
How to adjust for auto-correlation when translating single-period to multi-period volatilities:
I asked Claude for an intuitive explanation of the 1 + 2 x (autocorrelation) adjustment:
But with autocorrelated returns, variances don’t just add up independently. The adjustment accounts for this by multiplying the standard formula by (1 + 2 x Autocorrelation).
If the autocorrelation is positive, the (1 + 2AC) term will be bigger than 1. So the yearly variance will increase compared to the standard calculation.
If autocorrelation is negative, (1 + 2AC) will be less than 1. So the yearly variance will decrease.
Intuitively:
Positive autocorrelation means returns build on themselves – gains lead to more gains, losses to more losses. This increases risk.
Negative autocorrelation means returns tend to reverse – gains are followed by losses. This decreases risk.
The adjustment accounts for this by increasing/decreasing the yearly variance compared to the standard formula.
As an option trader, I’ve long felt intuitively that in a grinding (auto-correlated) trend, realized volatility sampled daily understates the true volatility. To tease it out I could look at realized volatility at different time scales (intra-day, daily, weekly) and consider how the implied volatility was weighing them. The market isn’t stupid…in the earlier example, if the stock has moved 5% in a week with daily 1% moves it’s unlikely that it is going to offer you vol at 16%.
(There’s a nerd reader right now rushing to regress the current IV on various historical realized windows to impute the weights. If there is a known event in the rearview, when you mentally discount its effect on your assessment of future realized volatility you are doing this intuitively!)
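A rough sketch of the sampling point, using the stock that grinds up 1% a day for a week. The zero-mean volatility estimate and the 252-day and 52-week annualization factors are assumptions made for illustration, and the function name is hypothetical.

```python
import math

def annualized_vol(returns, periods_per_year):
    """Zero-mean realized volatility, annualized by the square root of time."""
    mean_square = sum(r * r for r in returns) / len(returns)
    return math.sqrt(mean_square * periods_per_year)

# Five straight +1% days: a grinding, positively autocorrelated week.
daily_returns = [0.01] * 5

# The same week seen as a single compounded weekly return.
weekly_return = math.prod(1 + r for r in daily_returns) - 1

vol_from_daily = annualized_vol(daily_returns, 252)
vol_from_weekly = annualized_vol([weekly_return], 52)

print(f"daily sampling:  {vol_from_daily:.1%}")
print(f"weekly sampling: {vol_from_weekly:.1%}")
```

Sampled daily, the week looks like roughly 16% vol. Sampled weekly, the same path annualizes to more than twice that, which is the gap the implied vol has to weigh.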
Now if we could only predict autocorrelation in advance.
I’ll leave you with one last quote from the paper that can be extrapolated as a wider word of caution:
Correlations are volatile so it takes a long time to detect them with confidence.
The choice of a 60-month rolling estimation period was influenced in part by the time it takes to detect autocorrelation with any kind of statistical reliability. The standard deviation of a single correlation estimate when the true correlation is zero is roughly one over the square root of the number of observations, or 1 / √n. With 60 months, the standard deviation would be about 0.13 [= 1 / √60], which means that five years or 60 months is about the amount of time needed to detect an overall correlation of 0.25.
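The arithmetic in that quote can be sketched in a few lines. The two-standard-error threshold for "detecting" a correlation is an assumption here (the paper only says "about"), and the function names are hypothetical.

```python
import math

def correlation_std_error(n_obs):
    """Approximate standard deviation of a sample correlation when the true correlation is zero."""
    return 1 / math.sqrt(n_obs)

def observations_needed(correlation, n_std_errors=2.0):
    """Observations needed for `correlation` to sit n_std_errors away from zero."""
    return math.ceil((n_std_errors / correlation) ** 2)

se_60 = correlation_std_error(60)
print(f"std error with 60 months: {se_60:.2f}")
for rho in (0.25, 0.10, 0.05):
    print(f"correlation {rho:.2f}: about {observations_needed(rho)} observations")
```

The required sample grows with the inverse square of the correlation, so halving the correlation you hope to detect roughly quadruples the history you need.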
Extra
Claude’s derivation of the adjustment:
Let’s say returns follow an autocorrelated process.
Rt = φRt-1 + εt
Where φ is the autocorrelation coefficient and εt is white noise.
The variance of Rt is then:
Var(Rt) = φ²Var(Rt-1) + Var(εt)
Since Var(εt) is just the 1-period variance σ², this becomes:
Var(Rt) = φ²Var(Rt-1) + σ²
Assume variance is constant over time due to stationarity so:
Var(Rt) = Var(Rt-1)
Plugging in:
Var(Rt) = φ²Var(Rt) + σ²
Factoring:
Var(Rt) - φ²Var(Rt) = σ²
(1- φ²) Var(Rt)= σ²
Var(Rt) = σ²/(1 - φ²)
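Now suppose we are converting a 1-period (e.g. daily) variance to an annual variance over n periods (e.g. n = 250 trading days).
The annual return is the sum of the n single-period returns, so its variance includes the covariances between periods as well as the n individual variances:
Var(Rannual) = nVar(Rt) + 2 Σ (n - k) Cov(Rt, Rt+k), summing over lags k = 1 to n - 1
For this process Cov(Rt, Rt+k) = φ^k Var(Rt). Plugging in and simplifying:
Var(Rannual) = Var(Rt) [n + 2 Σ (n - k) φ^k] ≈
nVar(Rt)(1 + 2φ) ≈
nσ²(1 + 2φ)
Where the approximations hold for small φ and large n, with φ² and higher order terms going to 0 (so Var(Rt) = σ²/(1 - φ²) ≈ σ²).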
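A quick numerical check of the (1 + 2φ) rule, assuming returns follow exactly the process Rt = φRt-1 + εt above. The exact variance of an n-period sum is built from the autocovariances Cov(Rt, Rt+k) = φ^k Var(Rt). The function names are hypothetical and the φ values are illustrative.

```python
def ar1_sum_variance(n, phi, noise_var=1.0):
    """Exact variance of the sum of n consecutive returns from a stationary AR(1) process."""
    single_period_var = noise_var / (1 - phi ** 2)
    # Cov(R_t, R_{t+k}) = phi**k * Var(R_t); lag k occurs (n - k) times, counted twice.
    cross_terms = 2 * sum((n - k) * phi ** k for k in range(1, n))
    return single_period_var * (n + cross_terms)

def rule_of_thumb_variance(n, phi, noise_var=1.0):
    """The n * sigma^2 * (1 + 2 * phi) approximation."""
    return n * noise_var * (1 + 2 * phi)

n = 250
for phi in (-0.2, -0.05, 0.0, 0.05, 0.2):
    exact = ar1_sum_variance(n, phi)
    approx = rule_of_thumb_variance(n, phi)
    print(f"phi={phi:+.2f}  exact/n={exact / n:.3f}  approx/n={approx / n:.3f}")
```

For small φ the two columns sit close together. As φ grows in either direction the higher order terms start to matter and the rule of thumb drifts from the exact figure.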
If you use options to hedge or invest, check out the moontower.ai option trading analytics platform
I am disappointed with the investing section of my last post Plane With Zits. Let’s remediate the problem with it and see where we land.
Recapping:
a) We recalled how volatility, a first order quantity, “drags” down median returns in a non-linear fashion
The volatility drag term is a squared term. The same intuition can be appreciated from another angle: if you lose X% you need to gain back X/(1-X), which you can plot on your trusty TI-82 to see it’s non-linear.
Lose 10% you need to make 11% to get back to even.
Lose 33%, need to gain 50%.
Lose 50%, need to gain 100%.
Lose 75%, need to gain 300%
b) I showed why large drawdowns have an outsize impact on CAGR
My toy example assumed compounded returns of 9% for 19 years then 45% drawdown.
c) In such an event you are roughly in the same place had you put 50% in stocks and 50% in bonds yielding 4%
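Points a) and b) of the recap can be reproduced with a small sketch. The helper names are hypothetical. The toy path is the one from the recap: 19 years at 9% followed by a single 45% drawdown.

```python
import math

def gain_to_recover(loss):
    """Gain needed to get back to even after losing `loss` (as a fraction)."""
    return loss / (1 - loss)

def cagr(growth_factors):
    """Compound annual growth rate of a sequence of yearly growth factors."""
    total_growth = math.prod(growth_factors)
    return total_growth ** (1 / len(growth_factors)) - 1

for loss in (0.10, 1 / 3, 0.50, 0.75):
    print(f"lose {loss:.0%} -> need to gain {gain_to_recover(loss):.0%}")

# Toy example: 19 years at +9%, then one year at -45%.
toy_path = [1.09] * 19 + [0.55]
toy_cagr = cagr(toy_path)
print(f"20-year CAGR of the toy path: {toy_cagr:.1%}")
```

The single drawdown year pulls the 20-year CAGR from 9% down to a little over 5%.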
As soon as I hit send I started to feel weird about it. I did something lazy. And the problem got worse because I got 3 messages from people saying it was one of the best things they’ve seen because it confirmed intuition but hadn’t seen it presented this way. But there’s a problem with it. In fact I told one of the readers to call me because I wanted to explain why this needed revision.
So as a mini-test, ask yourself what the problem is? (It’s not a tax thing either).
🤔
Ok, let’s just jump in to the thought process and the fix.
I originally picked 9% because I wanted a CAGR that our collective conscience would agree is a reasonable guess for what long-term equity index CAGR is.
The problem is I can’t use 9% for 19 out of 20 years because the 20 year CAGR needs to be about 9% inclusive of the drawdown! Our perception of what equities return includes all the terrible times already. I can’t just use that CAGR and then bolt on 45% drawdown.
Instead, I needed to:
Pick a number for those 19 years that was higher than the CAGR
Apply the 45% drawdown
Make sure the resulting 20 year CAGR was 9%
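Those three steps can be sketched by solving for the return directly. This is only an illustration of the logic and assumes a single drawdown year in the 20, with a hypothetical function name.

```python
def required_normal_return(target_cagr, drawdown, years):
    """Return needed in each of the (years - 1) normal years so that one
    drawdown year still leaves the full-period CAGR at target_cagr."""
    target_wealth = (1 + target_cagr) ** years
    wealth_before_drawdown = target_wealth / (1 - drawdown)
    return wealth_before_drawdown ** (1 / (years - 1)) - 1

normal_return = required_normal_return(target_cagr=0.09, drawdown=0.45, years=20)
print(f"return needed in the 19 normal years: {normal_return:.1%}")

# Check: 19 normal years followed by the drawdown compound back to 9%.
check_cagr = ((1 + normal_return) ** 19 * (1 - 0.45)) ** (1 / 20) - 1
print(f"resulting 20-year CAGR: {check_cagr:.1%}")
```

Under these assumptions the 19 normal years need to return roughly 13% each for the 20-year CAGR to land at 9%.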
Once I got to that point I just looked up what SP500 monthly returns were going back to 1926 via https://www.officialdata.org/us/stocks/s-p-500/1900 (The SP500 index didn’t exist then but since they base this on Robert Shiller’s work I’ll just assume the historical reconstitution is valid).
Using monthlies, the data set includes 1161 rolling 12-month returns. We find:
Annual Simple (arithmetic) Return 11.4%
Annual CAGR: 10.2%
Annualized volatility: 15.4%
.50% (ie 1 in 200) of these returns include a 12-month loss of 45% or greater
In the last post I made the disaster year occur 1 out of 20, but historically the odds were much smaller than that measured at monthly resolution.
I re-did the computation assuming that the typical year is an 11.4% return and allowed 2 variables to vary:
the disaster year return (R)
the probability of a disaster (p)
The formula in each cell is:
The table output:
(emphasis on cells with roughly a 10.2% CAGR)
This is not a stock simulation so the 11.4% assumed return can just be thought of as a compounded return net of the volatility. This isolates the effect of a 12-month drawdown of R for probability p just to see how sensitive the total CAGR is.
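A sketch of how such a sensitivity grid could be built. The cell formula used here, a probability-weighted blend of log growth rates, is an assumption and not necessarily the formula behind the table. The grid of R and p values is also illustrative.

```python
import math

TYPICAL_RETURN = 0.114  # the typical-year return assumed in the text

def blended_cagr(disaster_return, disaster_prob, typical_return=TYPICAL_RETURN):
    """Long-run CAGR when a fraction disaster_prob of years return disaster_return
    and the rest return typical_return (assumed: probability-weighted log growth)."""
    log_growth = ((1 - disaster_prob) * math.log(1 + typical_return)
                  + disaster_prob * math.log(1 + disaster_return))
    return math.exp(log_growth) - 1

disaster_returns = (-0.25, -0.35, -0.45, -0.55)  # illustrative values of R
one_in_n_years = (10, 20, 50, 100, 200)          # illustrative values of p = 1/n

print("R \\ p  " + "".join(f"1/{n:<6}" for n in one_in_n_years))
for r in disaster_returns:
    row = "".join(f"{blended_cagr(r, 1 / n):<8.1%}" for n in one_in_n_years)
    print(f"{r:<7.0%}{row}")
```

Reading across a row shows how quickly the CAGR recovers toward the typical-year return as the disaster becomes rarer.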
It’s not until a 45% disaster occurs somewhere between 1 in 200 and 1 in 50 years that it threatens to knock a full 1% off the CAGR.
This might make readers now rush to the other side of the boat…”hey it’s a great idea to put 100% in stocks”
But remember, the history of the US stock market is a small sample size. The true sample size requires looking at non-overlapping returns as opposed to rolling 12-month returns. Which means you get as many data points as you do years.
Plus it’s only the US.
Jared appears again (I’ve been reading him for a decade…his personal finance book comes out soon and this tweet is timely for this post):
But let me add a mathematical point to the discussion…looking at monthly returns hides the emotional path as well as knowledge of the distribution.
Let me explain. Standard deviations are normalized measures. They are move sizes scaled to time.
The Socratic demonstration:
Is it more likely for a stock index to fall 10% in 1 year or 1 day?
That’s easy, in 1 year of course. But the return by itself is not normalized for time. It’s just a raw number…10%
Let’s ask this another way.
Is it more likely for the stock market to fall 3 standard deviations in 1 day or in 1 year?
You should now choose 1 day.
Think of it this way…in 1987 the stock market fell more than 20% in one day. I don’t know what SP500 volatility was leading up to the crash but I’d be surprised if the daily standard deviation was more than say 3%. That day would have been 7 standard deviations.
You have never seen a 1 year 7 standard deviation move.
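The Socratic demonstration can be put in numbers with a rough sketch. It assumes volatility scales with the square root of time, ignores drift, uses the 15.4% annualized volatility from the data above, and assumes 252 trading days a year. The function name is hypothetical.

```python
import math

ANNUAL_VOL = 0.154           # annualized volatility quoted earlier in the post
TRADING_DAYS_PER_YEAR = 252  # assumed day count

def move_in_std_devs(move, annual_vol, horizon_years):
    """Size of a move in standard deviations for the given horizon,
    assuming volatility scales with the square root of time."""
    return move / (annual_vol * math.sqrt(horizon_years))

sigmas_in_year = move_in_std_devs(0.10, ANNUAL_VOL, 1.0)
sigmas_in_day = move_in_std_devs(0.10, ANNUAL_VOL, 1 / TRADING_DAYS_PER_YEAR)

print(f"10% drop in 1 year: {sigmas_in_year:.1f} standard deviations")
print(f"10% drop in 1 day:  {sigmas_in_day:.1f} standard deviations")
```

The same 10% is well under one standard deviation over a year but around ten standard deviations in a day, which is why the raw return and the normalized move give opposite answers to "which is more likely?"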
Largest single day moves for the Dow:
Wikipedia
Using the overlapping data from earlier we find 3 annual standard deviation moves occurring .50% of the time (fatter than normal distribution) but some of these daily moves would be considered impossible.
The shorter the sampling period, the fatter the tails.
Or said otherwise:
For a shorter time horizon, the 1% probability move will be more standard deviations than the longer time horizon. (You can see this implied in option surfaces as well)
So if you look at returns at low resolution, you miss the experience. Even if you look at 2020 monthlies, it doesn’t seem anywhere near as significant as the feelings you had as an investor through it.
Summing up:
Using monthly and annual resolution, I overstated the risk.
But risk depends on the resolution. If you are an investor and can avoid looking at your account, you actually witness less volatility (on a standard deviation basis)! This is an argument for ignoring path.
Calm In The Distance, Turbulence Up Close
The problem is there’s nothing about past US returns that indicate what the future holds. Assuming real returns (after-inflation) of 6% is aggressive.
100% stocks when your investing life is one draw from a 40-year series has more to do with faith than judgement.
Erik Torenberg interviews Byrne Hobart and Daloopa CEO Thomas Li on Hedge Funds, VC, and Finding Alpha
I pulled some excerpts below I’d like to hold for future reference. I used ChatGPT to clean up the transcribed excerpts — the result is a mix of quotes and paraphrases.
On Alfred Winslow Jones’s first hedge fund being similar to the modern pod shop:
But other things were just generally a sensible part of the model that you charge based on performance. You try to hedge your exposure. So you’re not just betting the market goes up. But you’re trying to differentiate among different companies and try to figure out what makes each company unique. Not just understanding the business, but also understanding what makes the stock move. A bunch of other hedge funds started appearing in the 60s. Then, in the early 70s, almost all of them went under. Many realized they could borrow as stocks were going up. They borrowed a lot to buy a lot, and it worked really well for a while. Then most of them got wiped out. However, there were a few survivors. You had this golden age in the 80s where you had people like Soros doing more macro stuff, and people like Tiger doing more company-specific fundamental stuff. As computers got faster and we started having more data, people came up with systematic strategies across different asset classes. A lot happened between the 80s and today, but the current evolution has been towards some funds that run classic strategies like value-based stock picking and being mostly long, with a couple of short positions. The strategy that’s gaining a lot of market share in terms of assets and public attention is the multi-strategy, multi-manager, platform, or pod shop model.
In this model, you give a portfolio manager some capital budget/risk budget. You tell them they are picking stocks in their sector, the kinds of companies they pick, and they must have no net exposure to the market, large stocks versus small stocks, or one industry versus another, or momentum stocks versus value stocks. Once you hedge out all those factors that cause different companies to correlate, you end up with a very pure view of which stock is going up relative to its peers. This model has worked really well as a way to create uncorrelated streams of alpha. So if you have 100 different people doing that in 100 different subsets of the market, and they all stay on top of these companies better than anyone ever before, they will generally figure out when orders are slowing down or picking up, when an airline will accelerate its growth, or when a price war between steel companies will abate. If you are continuously tracking and turning over a portfolio, you end up always identifying the idiosyncratic news that’s going to drive a given stock’s movements, beyond just the random noise that drives prices. That’s a general overview of what those funds do and how they think.
Examples of HF strategies
We’ve mentioned the multi-manager, multi-strategy funds, and they encompass a large number of different strategies within them. We’ve talked about the fundamentals and different strategies, but many of those funds will have systematic strategies. These range from broad-scale strategies, like looking at all the different asset prices and what correlates with what. For example, if there’s a view that deviations from these correlations will snap back. So, if oil stocks generally move together with the price of oil and then one stock is lagging, that’s the one you buy and you short a basket of other stocks against it. You can also have much more sophisticated systematic strategies.
One category that goes through booms and busts is index inclusion strategies. This involves predicting who will get added to or removed from the S&P 500 or other indices. The first order problem is predicting who gets added or removed based on explicit index inclusion criteria and your view of the index committee’s decision-making. You’re also trying to bet on the volume of trades that is already making this bet. For instance, if the index inclusion means that the index funds have to buy 10 million shares of Company X, but traders betting on that inclusion have already bought 15 million anticipating selling them to the index, then it’s actually a bad catalyst. When the inclusion happens, they are trying to sell more stock than the index funds want to buy.
Another style that goes in and out of fashion is global macro, which can be split into two things with opposite cycles. One is doing these global relative value trades, where you look at the world and basically look at which countries seem to be converging in terms of standard of living and government norms with the United States. You buy their currencies or revive their assets, expecting that convergence to continue. The other is, you look at the state of the world, decide something is totally unsustainable, figure out what’s going to break, and find the most cost-effective way to bet that it breaks. This kind of strategy can work extremely well during a crisis or sometimes when there is a crisis in one place or some outlier event.
Every time there’s an election surprise, you wait a couple of weeks, and you’ll find out that some macro hedge fund is up significantly, like 300%, because they had a massively levered bet on something like Argentinian stocks. They were the only ones who truly believed it would happen and that the rally would be as magnificent as it was. Regarding Brexit, there was a lot of activity where hedge funds were commissioning private polling and trying to track the developments over time. They tried to predict what would happen and, if so, what the magnitude of the price impact would be.
Risk-parity and 60/40 being implicit macro bets on low inflation
If you look at a long chart of the equity and fixed income correlation, you see that the sign flips depending on the level and uncertainty of inflation. When I started working at a hedge fund in 2012, it was a given that when stocks went down, treasury bonds went up, and vice versa. This pattern essentially started in 1998, triggered by a market dip due to Long-Term Capital Management, the East Asian financial crisis, and the Russian crisis. The Fed significantly increased liquidity, boosting bond prices, and eventually, stocks snapped back while bonds came down, channeling liquidity into the stock market. This led to the highs of 1999 and part of 2000, which was enjoyable for everyone except the short sellers.
This situation was possible because inflation had been steadily declining since the early 80s. Around 1998, it could be argued that China’s labor supply was almost infinite compared to the world’s demand for physical goods. As long as people could move from the countryside to cities to produce goods, the cost of tradable physical items like TVs, toys, furniture, and apparel would either remain flat or decline. This price drop was largely due to production moving from more expensive countries to cheaper ones, with China offering a huge labor force and good ports, plus a government eager to grow its industrial base and invest in infrastructure.
For a long time, inflation wasn’t a concern. Whenever growth slowed, stocks would drop, and rates would decrease, causing bonds to rise. This made risk parity an excellent trade. However, it turns out that risk parity is essentially a macro bet that inflation will remain low, implying that the risk-adjusted return of stocks plus bonds is significantly better than that of either one alone.
Most strategies are implicitly a bet on the yield curve
If you’re a venture capitalist, you’re interested in the tail end of the yield curve being as low as possible.
In risk parity, the preference is for the yield curve to have a traditional curve shape, essentially what people envision when they think of a yield curve.
For a market neutral or factor neutral hedge fund, the ideal scenario is for the short end of the yield curve to be high, and for it to be flat or almost inverted, indicating high volatility.
…
Don’t tell the VCs, but it’s true. A flat low yield curve implies a very low growth environment where real rates are extremely low. This means that if you can invest in a company that can produce secular growth at a time when rates are low, the valuation becomes completely nonlinear. For instance, look at what companies were trading at in 2021, it was because the present value of profits in 10 years was really close to the value of those profits today. As a result, many of them were valued on a multiple of 2027’s revenue or something similar. As long as they were growing really fast, that multiple made them look quite cheap. [Kris: See Negative Interest Rates and the Perpetuity Paradox]
These things work really well when rates are extremely low. Low rates also mean there’s a lot of capital floating around. This goes back to the earlier point on what Limited Partners (LPs) want; often they seek a single digit return. If you can buy 10-year treasuries at 5%, a single-digit return is not hard to achieve with very simple assets. But if your Treasuries are earning 70 basis points, then you absolutely have to take risks. This creates an interesting feedback loop where a lot of money flows into the growth parts of the economy. Many startups sell things to other startups. So, every time another large check goes into Snowflake before its IPO, suddenly there are more Zoom and DocuSign seats being sold, more Slack seats being sold, and there’s more usage on AWS. It all feeds into the same ecosystem. If everything’s trading at a high enough price-to-sales ratio, then every dollar that goes into the ecosystem increases the market value of that ecosystem by more than that dollar.
Additionally, if companies are increasingly paying people in equity, then you don’t need much cash to keep the flywheel going for a long time. Venture capital turned out, at least in modern venture where you have an ecosystem of startups selling to other startups, to be about understanding unit economics well enough to look at companies burning cash and ask, “What are they getting when they burn that cash? How much Lifetime Value (LTV) are they getting for the Customer Acquisition Cost (CAC) they have to spend?” If that number looks good, then you could put a really high valuation on these companies.
That’s one of the things that changed in the venture ecosystem, even over the five years up to 2021. People got really good at quickly identifying companies with a product-market fit, looking at what the unit economics look like, and discounting that by looking at the Total Addressable Market (TAM) and then basically saying, someone else can also figure out these numbers, so someone else can capture this TAM. Therefore, we absolutely need to give this company massive funding. The playbook for growing a company fast by dumping a lot of money into it got very refined by that time. You could find someone who had worked at a company that scaled at that speed and who knew where the bottlenecks were. Meanwhile, some of the scaling got easier because of all of these third-party services.
You didn’t have to build out an entire internal communications infrastructure like Amazon did when they were getting started; they built their entire customer service system in Emacs Lisp. But now you would just use Front or something similar, so you don’t have to put any engineer hours into building that system, which means you can scale much faster. More of the money went more directly into the company’s core competency because everything that was non-core was somebody else’s SaaS product that you could just buy.
Why shorting overvalued or fraudulent companies is a weak hedge from a correlation point of view
I wrote a piece on shorting recently and how it’s become a worse hedge over time. The basic argument is that when people are shorting, whether it’s on an unconstrained generalist basis or within an industry, they tend to find the same companies. They tend to identify companies that are over-earning, have dishonest CEOs, or are overly promotional, and so they short them by default. Alternatively, they might do the funding short of just picking a company where nothing is going to change over the next decade. So if they have to have a short position, they could just short this and not think about it anymore.
One problem with this is that it means when there are extreme market disruptions and hedge funds are telling all their portfolio managers to cut their exposure in half as quickly as possible, they’re all selling the same stuff or, more likely, selling some of the same stuff and also buying some of the same stuff. Sometimes it’s gratifying when I’m on Twitter and I see a rumor that some pod somewhere blew up, and then I look at the stocks I’m short and see they’re all up five or 10%. It feels good to know that I’m shorting the same things the professionals are, even if I found out because that particular professional didn’t perform well and got fired.
An interesting example of this I stumbled on recently was a company called Zion Oil and Gas, which seems like a scam. They’re drilling for oil in Israel, which is one industry that Israel does not excel in. It’s one part of the Middle East where that’s not the main economic activity. But they’re raising money from American investors who think this is really cool or maybe it’s biblical somehow. The stock in Zion Oil and Gas was at $6 a share in December 2008 and then went up to $14 a share in February 2009, making it one of the better-performing US equities over that time period. This was during the depths of the crisis. I have to assume that a lot of it was that very smart people were shorting this, thinking it’s a retail promotion that’s going to run out of money and die. Then all of them were losing money on everything else they did and had to cut exposure and buy back. So the stock went up. Maybe they did a big promotion, or maybe they had some sort of financial crisis, the End of Times themed stock promotion, but a lot of the worst companies in the world all go up on bad days because everyone is covering. So it becomes harder; over longer periods, shorts do hedge a portfolio, but day-to-day, it’s more painful.
Framing the competition between retail and professional investors
Why different time horizons mean different arenas
In many ways, everyday investors will generally either have a really short timeframe and are more or less gambling, or making educated bets on minor market movements, or they’re making longer-term bets like, “I know this company, I like the company, I use the products all the time, I’m going to buy the stock and hold it for 20 years.” If you’re doing that, it doesn’t really matter if Citadel is better informed about how this quarter is shaping up. Sure, it’s unfortunate that you might have bought the stock for 10% less if you waited a week until they reported bad earnings. But if you truly believe in the company, then it’s a minor difference, especially over longer timescales. And if you’re investing continuously, saving money, and putting a little money into the market every so often, then it all averages out.
One of the nice effects that hedge funds have for you as an investor is that they price in all the incremental changes in the outlook all the time. So every time there’s a new round of data that tells you a bit about share shift within some industry, hedge funds immediately adjust to that, or they have predicted it and already adjusted. This makes you less likely to be blindsided by certain types of surprises, especially on the revenue side of consumer-facing companies. It’s broadly true that hedge funds do make the market more efficient, so you’re getting a better deal.
Hedge funds are not trying to figure out where the stock will trade in 10 years. To the extent that they are, it’s more like they’re trying to reverse engineer the process of large, long-only investors, like Fidelity and Capital Group, etc...and what incremental news flow over the next two weeks will adjust their 10-year price target in a predictable way that you can trade ahead of.
Retail advantages over pros
The single largest source of advantage in the markets, ironically, is not owned by hedge funds but by retail investors, and that’s the time horizon. Over a long enough time horizon, you can actually outperform most hedge funds if you do things with discipline. Hedge funds have some disadvantages which you can easily avoid as a retail investor. The first disadvantage is that hedge funds incur a lot of short-term capital gains tax when they make money because of trades that mostly don’t go above a year. For retail, holding a stock for over a year is not that difficult. The second key benefit is that hedge funds need to show short-term performance; monthly returns matter, quarterly returns absolutely matter. They are forced to take movements when the markets are not favorable. For instance, there’s a grossing down problem. If the markets are bad, and everybody’s losing money, that’s the time you want to be deploying capital. But what typically happens is they’re reducing their exposure to the market to figure out what is going on, and that’s when you see huge market dislocations. As a retail investor, you can sit there and say, “8% is nothing if I’m going to hold the stock for the next 10 years, I’ll just hang on to it.” And that time horizon difference is a huge source of alpha in a market that, for the most part, isn’t competed away, even with the biggest hedge funds, because they don’t have the ability to do that.
Hedge funds measure themselves on a risk-adjusted basis, and part of it is just how they’re structured and capitalized. They’re often levered, like six or eight to one is the usual ratio. So if you’re an individual portfolio manager at one of those funds, if you have a billion-dollar allocation, you think their target return is like 10% a year, but no, their target return is on the order of like two or 3% a year. Because they are hedging so many things out, they just aren’t taking enough risk to make massive returns. The risk comes from stacking a bunch of these portfolios together. And if you make a trade and it’s not working right away, you’re probably going to exit that trade because you don’t know why it’s not working. It means that hedge funds are in this constant effort to generate new ideas. There’s this idea of velocity, like if you have a portfolio and it has X amount of names, and you’re turning over all of the stocks in that portfolio every Y trading days, then you need at least one original long or short idea every workday to have a portfolio with the right structure. The median quality of the ideas is not necessarily good, but it is a volume game.
What is a hedge fund solving for fundamentally?
You’re in the risk removal game, trying to remove as much risk as possible, because you have access to cheap enough leverage that if you can consistently generate a 3% return, it’s world-class, it’s absolutely phenomenal. With that consistency, you can borrow 10 times the money and make a 30% return. So, to achieve a consistent 3%, the key being consistency, you are removing every type of risk possible. However, the challenge of doing that is you often end up in situations with many other funds trying to do the exact same thing. Hedge funds tend to get into crowded longs and crowded shorts, where everyone is following the same thesis. For example, everyone might be long Amazon and short a bunch of other e-commerce tech names, or long Booking and short out the rest of travel.
In these nuanced situations, if a company like Amazon reports earnings and beats them, but not by enough due to the high number of long positions, the stock may trade down. These funds that are long Amazon then have to sell because the earnings, though fundamentally good, didn’t meet the high expectations set by the market. In trying to remove risk, these funds actually take on a significant risk by not considering that everyone else is removing the same risk.
To avoid this problem, one strategy is to engage in areas others are not focusing on. This approach, however, can be challenging because it often means fewer resources, fewer people to talk to, fewer conferences to attend. You’re often on an island, which can be a more difficult psychological battle. When working for a large platform, especially those managing double-digit billions, you quickly realize you can’t deploy hundreds of billions of dollars in ideas that others aren’t looking at. The equity markets will tap out very quickly in those spaces. Thus, the risk many hedge funds end up ultimately taking, which they want to avoid, is the risk of everyone else doing the same thing.
[Kris: This section touches on a few ideas I’ve observed before:
GPs have some misalignment with LPs (and non-partner PMs)
The trading mindset is merging with investing as the focus on alpha marries and operationalizes what “trading as a business” understands with informational inputs that come from understanding what drives business fundamentals and market reaction]
The curse of hedge fund managers is that they start out because they enjoy picking stocks, building systematic models, or day trading, but as they grow, that becomes 0% of their job. Instead, 100% of their time is spent on risk management, investor relations, or recruiting. They end up building a system that automates a lot of what they’re good at and then have to find their own idiosyncratic source of returns. If a hedge fund has access to the best prime brokers, best exchange connectivity, and best algorithms for implementing trades with low slippage, they need to gain an idiosyncratic return by hiring unique people early and onboarding them effectively.
A significant part of the business becomes structuring the trade in a way that defines a person’s incentives and non-compete agreements to capture as much of the alpha as possible at an acceptable price. These funds often offer experienced portfolio managers guaranteed bonuses and agree to hire them at the beginning of a non-compete, allowing them to wait it out. The hedge fund entity’s trade is about defining the person’s incentives so that they capture as much alpha as possible.
From the LP perspective, a hedge fund is like a marvelous treasury bond, producing a stable, non-correlated, and safe return. From the GP perspective, it’s more like a venture fund, looking for the handful of superstars who will consistently generate that 3% growth every year to make the business the best it can be.
Surprisingly, the big platform funds like Point72, Millennium, Citadel, and Balyasny, which have backgrounds in day trading and systematic fixed income, do not come from a background of deeply assessing management integrity, which was a focus of Tiger Management. Tiger Management, once one of the biggest funds, wound down but seeded funding to its best analysts and network, creating an implicit multi-manager fund. However, they didn’t have the central risk management that current multi-strategy platform funds have. Julian Robertson’s funding led to a sort of implicit multi-manager fund, but they all used very similar strategies and often crowded into the same stocks.
This paradox shows that a background in assessing portfolio managers and analysts does not necessarily translate to success in managing a multi-strategy platform fund. The people who excelled at it were those who deeply loved creating the game.
“Peak-pod thesis” and efficiency
If you look back, there was a time when hedge fund returns significantly outperformed the market. However, starting around 2000, this gap began to shrink, and by 2010, it was minimal, closely aligning with the drag from fees and taxes. Hedge funds were once consistently generating a lot of alpha, but that started to decline. Now, the quality of reported alpha is higher, with more funds truthfully reporting no net market exposure or accurately disclosing their exposure and additional returns. However, as the skill level of investors increases and they understand the model better, the quantity of alpha available inevitably shrinks.
Hedge funds have become so proficient at generating ideas and maintaining a certain hit rate that they continue to produce risk-adjusted returns. But as more capital flows into these strategies and into competing funds, it becomes harder to execute large trades. The industry might reach a peak where the role becomes more routine and systematized, potentially leading to lower compensation per person but still remaining a significant job category.
Regarding total investment returns, imagine a stock market chart resembling a zigzag line deviating from a straight linear path. The area under this zigzag line represents the total market returns, predominantly beta. Alpha is the difference between this zigzag line and the linear path. In a market where volatility is high, hedge funds tend to perform better because the deviation from beta is greater, thus increasing the total alpha available. The current question is whether we have reached the peak number of portfolio manager “pods.” This depends on the total market volatility, which has been increasing due to higher interest rates, suggesting a potential for more pods and higher alpha generation.
However, if interest rates decrease and market volatility diminishes, hedge funds may face challenges in maintaining their current levels of alpha generation. They would need to diversify into other sectors to find new sources of volatility and alpha. Theoretically, if the market were to move in a perfectly linear trajectory, there would be little need for hedge fund pods, but such a scenario is unlikely to occur.
The concept that alpha sums to zero before taxes and transaction costs is crucial. If you’re making above-average returns, it’s typically because someone else made less optimal trading decisions, either buying high when you sold or selling low when you bought. Hedge funds rely on a supply of traders who are either valuation insensitive or simply poor at trading. However, this reliance draws other traders to exploit the same opportunities.
In The Laws of Trading you hear that alpha doesn’t last forever, and this applies to both positive and negative alpha. For instance, negative alpha can occur in large pension funds that execute market orders for stocks every two weeks when employees contribute. Over time, traders might notice this pattern and begin buying these stocks a day earlier, selling them back to the pension fund at a higher price, thus reducing the fund’s impact and making it harder for them to systematically lose money. If it were possible to deliberately lose substantial money consistently, then inversely, one could make money by doing the opposite of that losing strategy. In public markets, it’s almost impossible to consistently lose money in the absence of significant transaction costs.
How this can get quite meta
Concerning alpha capture, multi-manager funds analyze their portfolio managers’ decisions to determine their strengths and weaknesses. They can identify managers who consistently perform poorly with certain stocks or situations. This information helps build a meta portfolio that represents what the firm’s portfolio would look like if managers were perfectly self-aware of their abilities. Interestingly, someone who is consistently wrong about a particular stock, like consistently mispredicting Nvidia earnings, can be valuable. Their predictability, even in failure, can be leveraged by a quant model to generate profits by taking the opposite position.
This leads to a somewhat disconcerting situation where a financial professional might realize their value came from consistently incorrect predictions about a specific stock, contributing to their firm’s success by serving as a reliable contrary indicator. It’s a weird Marxist alienation from your labor: you find out that you had a really lucrative financial career entirely because you were really, really bad at Netflix earnings or something, so bad that the quant model realized it could just fade you in much larger size every single time and make money. That’s going to be a depressing realization. But someone, someday, will probably come to the realization that they were so reliably bad in certain situations that they actually made their employer money.
Understanding the good and bad of the job can help you determine if pro investing is for you
It’s exhilarating to feel like you’re always in the flow, that when something happens, you either anticipated it or are among the first to grasp its implications and strategically position yourself. That’s a thrilling feeling, although it’s not the norm. Usually, you feel clueless, underperforming, and stressed by random bad news. It’s like walking into the office and getting hit in the face. But occasionally, it’s extremely fun. The most gratifying things often come through ongoing stress and suffering. If you learn to enjoy that, you’re set.
Working at a hedge fund is unique because of the day-to-day variability. You’re dealing with extreme uncertainty and making decisions where being wrong 45% of the time means you’re top-notch. If you value intellectual honesty and variety, it’s a fantastic career.
However, when things go bad, they can be drastically different. The high level of trust and unpredictability can significantly impact your personal life.
There’s a trend of hedge funds starting venture practices and vice versa. It’s interesting to see if there will be more crossover, as both sectors tolerate a high rate of being wrong. One key difference in venture capital is the longer feedback loop. You won’t know if you’re a good venture investor for many years, unlike the quicker feedback in hedge fund investing.
The hedge fund industry is known for high burnout rates. Many enter in their early to mid-20s and leave by their 30s. Often, these employees haven’t experienced a full market cycle; they’re hired in good times and shocked by downturns. For instance, the downturn in Q4 2018 was mistaken by some as an apocalypse, but it was followed by a great year, giving a misleading impression of real downturns. In 2022, with an actual downturn, the industry faced a harsh reality check.
Updating is something people do a lot within a cycle on relatively minor things. Netflix, for example, was more of a net subscriber additions story for a long time, then became more of a revenue story, and it was also partly a margin story. When there’s a quarter where you correctly predict the net adds but get the revenue wrong, and the stock reacts more to revenue, you must quickly adjust your focus.
You have to very quickly tell yourself “the thing I was really good at predicting actually does not matter as much as this other thing. And so now I have to get good at predicting that.” And it’s hardest when the really big shifts happen, like when the focus shifts from growth to profitability, or when we can no longer assume infinite capital or zero-cost money and get away with lazy discounting. Now you actually have to think about what is the value of 50 cents in five years versus $1 in 10 years, instead of treating $1 in 10 years as worth roughly $1 today.
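To make the 50 cents versus $1 comparison concrete, here is a minimal Python sketch of plain present-value discounting. The discount rates are hypothetical, picked only to show how the ranking can flip; nothing here comes from the interview itself.

```python
def present_value(cash: float, years: int, rate: float) -> float:
    """Discount a single future cash flow back to today at an annual rate."""
    return cash / (1.0 + rate) ** years


# Hypothetical annual discount rates, chosen only for illustration.
for rate in (0.00, 0.05, 0.15):
    near = present_value(0.50, 5, rate)   # 50 cents received in 5 years
    far = present_value(1.00, 10, rate)   # $1 received in 10 years
    print(f"rate={rate:.0%}  $0.50 in 5y = {near:.3f}  $1.00 in 10y = {far:.3f}")

# The two payoffs are worth the same today when (1 + r)^5 = 2.
breakeven_rate = 2 ** (1 / 5) - 1
print(f"breakeven rate = {breakeven_rate:.2%}")
```

At a zero rate, $1 in 10 years really is worth $1 today and easily beats 50 cents in five years. Under this simple model the two are worth the same at a rate a little under 15%, and above that the nearer, smaller payoff wins.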
The ability to quickly adjust perspectives and decide what matters is crucial. Adapting your mental model rapidly during major shifts, such as a shift in focus from growth to profitability, is challenging.
Those who can adapt and last through multiple market cycles do extremely well due to their growing experience and opportunities.
[Kris: See 5 Takeaways From Todd Simkin on The AlphaMind Podcast to understand how a trading firm trains cognitive flexibility. This is especially important when you hire smart people who aren’t used to being wrong. This is echoed below.]
There’s a saying: “a smart person knows what to do, and an experienced person knows the exceptions to what to do.”
The average age in a hedge fund is relatively low compared to many other industries, including their mutual fund counterparts. You often see people working in hedge funds who have had a series of successes throughout their lives to reach their current positions. The typical profile of an analyst, for example, is someone who excelled in high school, attended a prestigious university, graduated at the top of their class in finance or economics, then went on to work at a major investment bank. After one or two years, they’re recruited from that investment bank to a top private equity shop or hedge fund. It’s a chain of success where they haven’t experienced significant career failure.
However, once in a hedge fund, the measure of success is not about the ability to study well or work hard. The skills required for success in a hedge fund are different from those correlating with educational success or early career achievements at places like Goldman Sachs or Morgan Stanley, where hard work is more directly linked to success. In a hedge fund, working harder does not necessarily equate to generating more alpha. If it did, everyone would be working 20-hour days.
If you use options to hedge or invest, check out the moontower.ai option trading analytics platform
I added Money Angle for Masochists to the letter this year. The alliterative phrase is going to stay but only because I didn’t know I should have called it the “Bridge of Asses.”
Medieval students called the moment at which casual learners fail the pons asinorum, or “bridge of asses.” The term was inspired by Proposition 5 of Euclid’s Elements I, the first truly difficult idea in the book. Those who crossed the bridge would go on to master geometry; those who didn’t would remain dabblers.
Wikipedia says the pons asinorum or “bridge of asses” is:
used metaphorically for a problem or challenge which acts as a test of critical thinking, referring to the “ass’ bridge’s” ability to separate capable and incapable reasoners.
The entry later states that economist John Stuart Mill called Ricardo’s Law of Rent the pons asinorum of economics.
Well…that’s just because he didn’t live to see option theory. I don’t mean the math details. I mean the conceptual rails of looking at a web of branching future payoffs, seeing how they could be replicated, and measuring the cost of that replicating portfolio today. It is the formalization of finance’s deepest truth — you cannot eradicate risk, but only change its shape.
At least Dall-E didn’t spell Theor-E
With that, I push you onto the bridge.
[As always, I write these with a motivated high schooler in mind. I’m not sure I always get here and I do think these would be well adapted to video explainers but it’s just hard to prioritize that.]
Each of these posts is a little world of editorials embedded in explanations. I want to bring extra attention to Understanding Risk Neutral Probability because its editorials pull in ideas that are profound but relatively underexposed.
But to use copper this way, you need to imagine an economy where swings in demand for durable goods are a primary driver of the economic cycle. And you also need to assume away any countervailing force. One reason copper broke down as an economic indicator is that the biggest consumer, accounting for half of worldwide demand, is China. And, for a long period that probably ended in the last few years
In another post, Byrne highlights a similar sentiment showing how hard it is to compare data long-term:
Small Caps and Like-for-Like Comparisons
Verdad Research has yet another good piece on the gap between small-cap and large-cap valuations, where they note that the small-cap stock universe is less fundamentally impressive than it used to be. The relative comparison hurts in both directions: larger companies are better-run and faster-growing than they used to be, and investors in small-caps face an adverse selection problem courtesy of private equity firms: PEs will snap up bargains and lever them up enough to compensate for the M&A premium. It’s a good reminder that long-term comparisons between indices are not like-for-like comparisons; small caps got cheap in part because the best of them became large caps and the cheapest got acquired.
I always harp on how markets are biology not physics. On Wednesday I highlighted SIG’s Todd Simkin’s response when he was asked what aspect of trading students have the most difficulty with:
The most difficult aspect, not just for our students, but for our experienced traders as well, is handling the noisy outcome and the noise that comes after the fact. As I mentioned before, the types of people that we tend to hire are those with backgrounds in computer science, physics, finance. However, many of these individuals come from fields where if you can figure out a system, then you can move forward. Biologists are very much in this camp; if you can describe the way biological systems interact, no matter how complex they are, once you’ve described them, you can build on that. You’ve got a description of an underlying process. Germ theory, for example, once developed, everything that can bolt onto germ theory ends up being correct because germ theory itself is a good underlying description of the interaction of germs and health.
But in our world, once you’ve figured out how a system works, it changes the way you behave and once you behave differently, the system itself changes fundamentally. So, we are in this world of constant change and part of that change is our own impact on it. For an astrophysicist, the way a star behaves has nothing to do with whether or not we’re observing it. But for a trader, the way a stock moves has everything to do with our perception of how that stock should move. Once we have an opinion about it, we then go out and do something differently, and somebody else can see what we did and they’re building that into their system and their model of the way the world works. So, dealing with this constant change, I think, is the biggest surprise, especially since we’re bringing in really high-level smart people. We’re not bringing in people who are used to being wrong, and we’re putting them in a world where they’re going to be wrong a lot. Not necessarily in the direction of the trades they make, but certainly wrong in terms if they only evaluated the outcome. Even wrong in terms of having to change their mind frequently, and being open and willing to change your mind and having the right mindset to say this. “This, I think, is correct for now. But it might not be correct tomorrow.” It’s a new experience for a lot of these people who are accustomed to being A+ students, to getting things right. And we’re putting them in a world where they’re not getting a lot right all the time.
Almost every take you see that starts with some comparison of the past and what it should mean for us today materially underweights the biological nature of the system.
This reality is the subtext for the most popular finance post I ever published:
The dynamic in the post is an example of trying to bridge the irreducible paradox of “no, this time is not different” with the plasticity required to incorporate financial actors’ adaptation into the most lindy aspects of your mental model [again, RIP Munger].
About the episode: In this episode, we had the privilege of sitting down with Todd Simkin, currently the CEO of SIG’s Reinsurance business and a veteran at Susquehanna since 1997. Todd has played pivotal roles within the company, from being a trader to overseeing the day-to-day operations and even spearheading firm-wide education and trader development programmes.
Todd is possibly one of the pre-eminent minds in the world of trading, possessing a deep understanding of what contributes to the making of great traders. We were honored that he was willing to converse with us and share with our audience some of his reflections and insights into what goes into the creation of successful traders.
The entire podcast is terrific. These were my favorite parts (emphasis mine):
1) What aspects of trading do students have the most difficulty with?
The most difficult aspect, not just for our students, but for our experienced traders as well, is handling the noisy outcome and the noise that comes after the fact. As I mentioned before, the types of people that we tend to hire are those with backgrounds in computer science, physics, finance. However, many of these individuals come from fields where if you can figure out a system, then you can move forward. Biologists are very much in this camp; if you can describe the way biological systems interact, no matter how complex they are, once you’ve described them, you can build on that. You’ve got a description of an underlying process. Germ theory, for example, once developed, everything that can bolt onto germ theory ends up being correct because germ theory itself is a good underlying description of the interaction of germs and health.
But in our world, once you’ve figured out how a system works, it changes the way you behave and once you behave differently, the system itself changes fundamentally. So, we are in this world of constant change and part of that change is our own impact on it. For an astrophysicist, the way a star behaves has nothing to do with whether or not we’re observing it. But for a trader, the way a stock moves has everything to do with our perception of how that stock should move. Once we have an opinion about it, we then go out and do something differently, and somebody else can see what we did and they’re building that into their system and their model of the way the world works. So, dealing with this constant change, I think, is the biggest surprise, especially since we’re bringing in really high-level smart people. We’re not bringing in people who are used to being wrong, and we’re putting them in a world where they’re going to be wrong a lot. Not necessarily in the direction of the trades they make, but certainly wrong in terms if they only evaluated the outcome. Even wrong in terms of having to change their mind frequently, and being open and willing to change your mind and having the right mindset to say this. “This, I think, is correct for now. But it might not be correct tomorrow.” It’s a new experience for a lot of these people who are accustomed to being A+ students, to getting things right. And we’re putting them in a world where they’re not getting a lot right all the time.
2) Okay, you’ve mentioned that being smart is important. I understand. But are there any other qualities or attributes that you think, if added, would not only ensure a successful journey through the program but also a flourishing outcome?
We’ve given a lot of thought and had many discussions about this. When considering an individual, I believe that a combination of three key skills is essential. These are strong quantitative and analytical skills, which are separate from strong interpersonal skills. They are not negatively correlated, but rather uncorrelated. We’re looking for people who excel in both of these areas.
Quantitative and analytical skills are important, as are interpersonal skills. The ability to communicate effectively with others, whether it’s brokers to develop order flow or peers in the trading world, is crucial. It’s important to be able to learn from and teach others, which is a key part of our culture.
The third dimension is gambling skills. Once you have information about what is fair value and can draw the order out of the market, it’s important to take appropriate risks. Can you identify what risk looks like? Are you taking up the right amount of risk?
The individual we’re looking for excels in all of these areas. We’ve found that being exceptionally good in one area does not compensate for lacking in the other two. We’ve encountered great gamblers with poor interpersonal skills who didn’t succeed with us in the long run. We’ve also met incredibly analytical people who excel at quantitative research but can’t make decisions when it comes to putting money at risk in the trading market. Their gambling skills are low, but their math skills are high. That doesn’t work. They end up not trading.
So, finding the right balance between these three skills is crucial for us.
3) On what it means to develop your “risk-taking” — It’s not “I need to take more risk” but the education of what taking risk means. To learn to think of it in ways that are more complete than how you thought of it before
One of the things that I frequently see from the outside, when talking to non-traders or non-finance people about our role, is they say, “Man, what you’re doing looks really risky.” What they often mean is, “What you’re doing looks really reckless.” They do not make a distinction between measured risk and the ability to see where you’re taking appropriate risk for the amount of capital that you have or the amount of information available on the market. They equate doing something that is going to have a big outcome one way or another as risky. However, I can take tons of positions that look risky, but are really just reckless. That really just means I haven’t given anything enough thought and therefore, this is not a smart risk for me to take.
Likewise, I can take positions that have huge outcomes, bigger outcomes than what we would normally see, but it’s because I’ve got much better research, much better information, and a much better handle on what the risk looks like. I can offset that risk with hedges or I can naturally offset that risk with other positions that we have. Then, I’m not doing something that’s reckless. In fact, I’m doing something that is reducing our risk of ruin, which is better for us. But it looks riskier to people who don’t understand that underlying concept.
Being able to talk through that with somebody and develop their education around that is important. I firmly believe that I can get anybody to understand it, if you give me enough time with them and I can really talk through different examples and different scenarios with them. That’s exactly what our approach is to developing traders as opposed to looking for natural-born traders.
I don’t need somebody who comes in, Wild West style, slinging their guns ready to take on gigantic risks where they don’t have good underlying information. That’s reckless. That’s not what I want. But I can take somebody who is not inclined to put on risky positions and explain why this ends up working out best in the long run. I can get them to feel that this is not a risky endeavor, just because they take on a position that has volatility but in fact, has positive expectancy. We can talk through the appropriate balance.
4) On growth mindset
I believe that anyone can improve on the three dimensions – risk-taking, quantitative and analytical skills, and interpersonal skills – with appropriate training. We, as a firm, are strong believers in the growth mindset. This concept, popularized by educator Carol Dweck, suggests that people are not confined to a fixed set of characteristics. Instead, they can grow and change over time. Once you accept this idea, you can start to teach people skills that they might not be comfortable with today. This could be anything from taking risks, improving their understanding of quantitative models, or enhancing their interpersonal skills.
I believe that one of our fundamental strengths is our philosophical approach to trading. We maintain that if someone is not a good trader, it’s not their fault. The burden lies on us to do a better job of training them. This may seem tangentially related, but hopefully, I’ll be able to connect it back. Are you familiar with the classic problem of the ball and the bat costing $1.10?
[Kris: This part is great. I assume you know the bat and ball riddle but Todd’s about to re-state it and where he goes with it is neat]
I will state the entire problem so that your listeners can think along. The question is:
A ball and a bat cost $1.10 together, and the bat costs $1 more than the ball. How much does the ball cost?
I’ll pause here for a second so people can think about it.
The classic result of this is that people give a quick intuitive answer that is incorrect. The intuitive answer here, which is incorrect, is to say that the ball costs 10 cents. However, that does not satisfy the criteria because if the ball costs 10 cents, then the bat would cost $1 more than that, which would be $1.10. The two together would cost $1.20, which is too much. We want them to be $1.10 together, therefore the ball must cost five cents.
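For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two conditions in the riddle. The function name and the choice to work in whole cents are illustrative assumptions, not something from the podcast.

```python
# Work in whole cents to avoid floating-point rounding.
TOTAL_CENTS = 110       # ball + bat
DIFFERENCE_CENTS = 100  # bat - ball


def solve_ball_and_bat(total: int, difference: int) -> tuple:
    """Solve ball + bat = total and bat - ball = difference."""
    # Substituting bat = ball + difference gives 2 * ball + difference = total.
    ball = (total - difference) // 2
    bat = ball + difference
    return ball, bat


ball, bat = solve_ball_and_bat(TOTAL_CENTS, DIFFERENCE_CENTS)
print(f"ball = {ball} cents, bat = {bat} cents, together = {ball + bat} cents")

# The intuitive answer fails the first condition: a 10-cent ball implies
# a 110-cent bat, so the pair would cost 120 cents.
intuitive_ball = 10
intuitive_total = intuitive_ball + (intuitive_ball + DIFFERENCE_CENTS)
print(f"intuitive answer implies a total of {intuitive_total} cents")
```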
So far, so good. All we’ve said so far is that people answer intuitively and there’s not all that much that’s interesting about it until you start digging deeper into it. Recently, just this past week, I saw research come out. I’m not sure when the research was conducted, but Andrew Meyer and Shane Frederick went back and asked 93 variations on this problem. They surveyed a variety of people, with thousands of participants from sources such as Google Surveys and Mechanical Turk. The results were consistent with the original research, indicating that most people give the intuitive answer rather than the reflective one.
One variation of the question was particularly interesting. The question was posed as, “The bat costs $1 more than the ball. How much does the ball cost?” with a prompt to consider if the ball could cost five cents. This prompt doubled the number of respondents who answered five cents, though it only increased to 31%. The majority still got it wrong.
[Kris: gets better]
Another variation asked the same question, but with a bold statement underneath saying, “The answer to this question is five cents. On the blank below, write five cents.” This led to 77% of people writing five cents as the answer, which means that even when explicitly told the answer, 23% of people still got it wrong. This relates to trading in that some people fall into what Meyer and Frederick call the “hopeless group.” Even when told the truth, they can’t comprehend it. They won’t believe it or share it. [Kris: I hate to say “let that sink in”. But let that sink in]
However, a significant number of people fall into another group. If prompted, they can think about things in a better way. These are the people who, when asked to consider whether five cents makes sense, can recognize their initial error. With prompting and education, they can arrive at the correct answer. This is the approach we take in our education programs, ensuring we provide enough context for people to improve their decision-making process.
5) Why is the answer to every trading question “it depends”?
There’s more information available than we’re ever going to be able to access, and we’re never going to be able to process everything. The right approach at one time might be the wrong approach at another time, given a slightly different set of circumstances. “It depends” is often the answer to what to do next. Starting a new line of questioning with “What does it depend on?” can be a great way to determine how you should change your answer based on the information you find.
However, if you start with a set action, such as “buy 10,000 shares when this happens,” you may find that changing the initial conditions slightly can dramatically alter the outcome. Even a small change in your behavior can significantly impact how others react. There’s no set answer, and to the extent that there is, machines are taking care of that before you’ve even had the chance to see the opportunity for the trade.
So, given that you’re able to see a trade and make a trading decision, the right approach is going to depend on many factors. All these factors will require analysis and reflection. Starting with the understanding that your answer is going to depend on something will lead you to ask the right questions and get the right people to weigh in with the correct information.
Another important point is that rarely is there one answer that is right in all cases. Just yesterday, I was sitting on the trading desk, and on the other side of the desk was one of our senior index traders. He was talking to one of his junior traders, and I wrote down what he said because it was so insightful. He said,
“I’m not saying do this going forward. I’m saying try this for today.”
He was giving advice on something to do, and importantly, the point he was making was not about revamping the way we approach trading forever. Instead, he suggested running a test, seeing this as an opportunity to get feedback from the marketplace about what would happen differently if we did something differently.
Sometimes, this approach might cost us, and we might not make as much money as we would have if we had done it the way we did before. But sometimes, it’s going to give us information that will allow us to grow in a new direction. So, “I’m not saying do this going forward. I’m saying try this for today.” I think that phrase is an important one for traders. And that’s the answer to “it depends.” Don’t change what you do forever. Change what you do right now, and see how that leads to a different impact and how that’s going to change your participation in the market in some way.