A Jane Street Alum Teaches Trading

Finance and internet genius Patrick McKenzie started a podcast called Complex Systems during his sabbatical. His very first interview is instant canon for trader education:

🎙️How the Smart Money teaches trading with Ricki Heicklen (100 min)
Patrick and Ricki discuss real problems in trading, how trading is taught, and pedagogical game design.

Ricki is a former Jane Street trader who now runs trader bootcamps. Like “real estate seminar”, “trader bootcamp” is a word sequence you should mute. This is an exception. I’m not stepping out on any limbs when I say SIG (my alma mater) and Jane Street are tops for trader education. This isn’t surprising since several of JS’s early employees were SIG defectors (I clerked directly for some who became key players at Jane and some of its later offshoots).

Predictably there’s significant overlap with this material and Moontower educational content. This is a great opportunity for me to share excerpts from Ricki’s interview with my own commentary and links to where I have covered similar ideas.

So let’s jump in…(Ricki’s quotes are in italics…all emphasis mine)


The demographic of who is drawn to trading has varied throughout history. What is the profile of someone who goes into trading today?

I went to Princeton University, an Ivy League school; I studied computer science with a focus on theoretical computer science. I had an internship at Jane Street the summer going into my senior year in college. 

I was then hired to work at Jane Street full time as a quantitative trader and started working there immediately after graduating. This is certainly true of the median employee – went to a fancy college or university, for New York traders, mostly coming from the United States, which is where I was. 

Often it’s people who have experience spending a long time thinking about math problems or puzzles, but not necessarily a lot of life experience under their belt. Certainly not a lot of professional training and often less background than you might expect in economics and finance specifically – rather, a general comfort with concepts around probability, expected value, and math puzzles writ large.

[Kris: This was still reasonably true 20 years ago, but because the skillsets and competition for talent have merged with those of the tech giants, the technical floor is much higher today. Trading firms always hired from top schools, but now that list is probably even shorter and the accomplishments of new hires even more exemplary. There were many MIT folks in my cohort, but there were still plenty of humanities majors from good schools. Today math competitions are going to grab more attention than being an intellectually well-rounded athlete.]

On the focus on competition in finance recruiting vs software

Patrick:

The part about competitions is interesting. One of my theories is that there’s some selection effect for people who have that competitive mindset and want to play games about these sorts of things, but I generally tend to think that strong performance in games, particularly in competitive environments, probably predicts performance in real life.

At least in the tech industry, we don’t back-propagate that into our decisions for advertising at places like the International Math Olympiad. Is finance more rational than we are on this?

Ricki:

I don’t personally know how it is that software goes about recruiting as well, but I think that part of what quantitative trading firms are often trying to do is they are trying to recruit people who have the raw intelligence, that when combined with training that those firms are capable of providing in-house will turn into good trading skills. 

I think with software development, there is both more opportunity for people to get good at software development external to the places that are hiring for it, and therefore more ease at measuring what somebody’s skill in that domain is than there is for trading.

Right now, the state of the world is if you want to learn how to be a good trader, basically your only option is to go to a good trading firm. It is very hard to find good materials for learning how to trade, learning how to have the kind of intuitions and heuristics about a market that a trader ought to develop, in any context outside of a firm.

There are a couple of reasons for this. One is that the firms are incentivized to not spread that information too far and wide, and another is that trading skills are not skills that you can easily pick up from reading a book or from consuming a YouTube lecture series. They’re way easier to learn through actually being immersed in the environment and doing it yourself.

That means that you’re going to need a really good ratio of teachers to students in order to properly transmit trading knowledge to those students. This is just not going to be that widely available when you’re bottlenecked on how many good traders there are, and when those traders will benefit a lot more from full time trading than they will from teaching those skills to a few dozen other people.

[Kris: Hence why “trader bootcamps” are a mute term with few exceptions]

“I’m going to impart a bit of information upon you to get you ready for understanding the US equities markets…” – if you only had one sentence, what is that?

The number one sentence for purposes of trading, in general, is to think about adverse selection. Adverse selection is the concept that, conditional on getting to do a trade with someone, your trade might be worse than you’d previously thought it would be – that the world that you are looking at is one that has lots of different models that will explain different systems, and you can make predictions of what those models would output for numbers. But as soon as you are putting an order into a market, you need to think about the profitability of your trade, if it gets traded with, versus if it doesn’t. If it doesn’t, it profits zero.

So the fact of somebody else’s willingness to trade with you should adjust your model, and therefore you should calculate the profitability of the order that you submit based on limiting yourself to worlds in which those trades do happen, i.e. worlds in which the trade that you want to do is worse than it otherwise would have appeared.
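A toy calculation can make the "condition on the fill" idea concrete. This is only a sketch under made-up assumptions: the settlement value is equally likely to be any number in a short list, and some share of sellers are informed and hit your bid only when the value is below it. The function name and parameters are hypothetical.

```python
from fractions import Fraction


def edge_on_bid(bid, values, informed_share):
    """Return (naive_edge, edge_given_fill) for a resting bid.

    values: equally likely settlement values (toy assumption).
    informed_share: fraction of sellers who know the value and sell to
    the bid only when value < bid; everyone else sells regardless.
    """
    naive_edge = Fraction(sum(values), len(values)) - bid
    fill_weight = Fraction(0)
    weighted_pnl = Fraction(0)
    for v in values:
        # Uninformed sellers always fill; informed ones only when it hurts us.
        p_fill = (1 - informed_share) + (informed_share if v < bid else 0)
        fill_weight += p_fill
        weighted_pnl += p_fill * (v - bid)
    if fill_weight == 0:
        return naive_edge, None  # the bid never trades, so it earns zero
    return naive_edge, weighted_pnl / fill_weight


naive, given_fill = edge_on_bid(9, [8, 9, 10, 11, 12], Fraction(1, 2))
print(naive, given_fill)  # 1 2/3
```

With half the sellers informed, a bid that looks like 1 of edge against the model is worth only 2/3 in the worlds where it actually trades. With every seller informed, it loses money on every fill.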

[Kris: This is why backtesting is so hard. Simply assuming your slippage is X bps makes assumptions about liquidity that act as if your orders don’t leak info.]

Patrick explains adverse selection in crowdfunding

If a company has decided to raise money on a crowdfunding market, it has been passed on by the people who have made it their life’s work to find profitable investments in venture capital. And it has also been passed on by rich people in tech who can easily write $250,000 angel checks. There is a reason why it is rational for them to get $1,000 checks. [Patrick says that reason explicitly: Better investors, who write the larger checks, have passed on the opportunity to invest.]

Therefore, I hear people that there is some notion of “equitable access to growth opportunities in the market,” but I don’t think that public crowdfunding will actually give the general public opportunities to tap into growth markets on an equal footing with VC because, bluntly, a level playing field is one in which professionals destroy amateurs.

[Kris: This is so blindingly obvious to anyone that has been in an adversarial environment and yet we find that meme of “maybe it’ll work for us” ready for the next stove-touching FOMO donkey]

The limit of the adverse selection argument

If adverse selection were as powerful a principle as I’m claiming that it is, shouldn’t nobody ever trade with one another, especially in zero-sum environments?

I think the answer that I give to that is, you need a story for why – despite the fact that this trade is available to you, i.e. despite the fact that there’s somebody on the other side of the trade who wants to do it with you, and nobody else has taken your side of the trade yet – it is still worthwhile for you to do it.

There are a lot of different explanations for why this might be. One explanation is, “nobody else has had the opportunity to do it yet.” You are actually the first person to get there. That might be true for those early-stage VCs – honestly, I didn’t even fully follow all the different players in the ecosystem you just mentioned, I might get some of those details wrong. That might be true for the people who get the first opportunity to invest.

As there are more and more people who had that opportunity and turned it down, the phenomenon of adverse selection should be a larger and larger factor in your weighing against whether you choose to invest. But there could be other reasons.

Patrick adds:

An interesting difference between the private markets and the public markets is that – to do a gross generalization, and I know you can come up with all the ways that this is not true – everyone gets access to an incoming order at approximately the same time, where in VC-land, the thing that you want most is differentiated deal flow, which means when someone has an idea for their new company, they think of you and pitch you on the opportunity of investing first. At that point, you are essentially a person who has exclusive rights to take this trade or pass on it. 

[Kris: This is my oft-repeated idea of self-awareness. Are you the first call? Is your money better than green (does a cap table see you as a strategic investor)? Where are you in the pecking order and why? When I was at the fund, one of my advantages was that I could trade large blocks, which earned me flow even if I wasn’t as fast as a market-maker. The point was I understood AND communicated to the brokers why they should show me trades.]

How Ricki’s Intro To Trading Bootcamp opens

I find that the best way to learn trading is by doing it. On day one, the first class that I have people participating in my trading bootcamp go through is a class where you walk in the door and immediately you start trading. 

What does that look like? I have an order book that I’ve written out on a board. I’ve seeded it with a couple orders of my own that have a huge spread between them, and I’ve written up a contract on that board that will resolve to a specific number. 

I like to keep this as far away from actual knowledge of finance and the economy as possible, so my first contract will be, “What is the sum of the number of siblings that each person in this room has?”

This is a nice market for purposes of illustrating trading concepts, because each person in the room has some amount of private information, i.e. the number of siblings they have, has some rough sense of how many siblings on average somebody in the world might have, and then can whittle that down to, what about people primarily from the United States, what about people from the socioeconomic backgrounds that we assess other people in this classroom to most likely be in.

The ‘Tighten or Trade’ Constraint

We go around and play Tighten or Trade, a game in which each person on their turn needs to either tighten the spread by improving the best bid, in that case increasing it, or improving the best offer, decreasing it – or they need to trade with one of the existing orders in the book. 

This is an artificial constraint to ensure that trading happens in an environment that is zero-sum and therefore you should be paranoid about approaching if there weren’t such constraints forcing you to do trades with one another.

I’ve found empirically when teaching this class that people don’t necessarily have that paranoia on day one, of avoiding trading in zero sum environments, but in order to make sure that it happens, putting this constraint of requiring people to tighten or trade is a way of guaranteeing that trades will occur. 
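The Tighten or Trade rule can be written down as a small check on each turn. This is a sketch of the rule as Ricki describes it, not her actual classroom tooling; the action labels and function name are hypothetical.

```python
def is_valid_turn(best_bid, best_offer, action, price):
    """Check one turn of Tighten or Trade against the current best quotes.

    The action labels ("bid", "offer", "buy", "sell") are hypothetical.
    """
    if action in ("bid", "offer"):
        # Tighten: a new bid must be above the best bid, a new offer must be
        # below the best offer, and neither may cross the other side.
        return best_bid < price < best_offer
    if action == "buy":
        return price == best_offer  # trade: lift the best offer
    if action == "sell":
        return price == best_bid    # trade: hit the best bid
    return False                    # passing is not allowed


print(is_valid_turn(5, 40, "bid", 10))   # True: improves the bid
print(is_valid_turn(5, 40, "offer", 45))  # False: worse than the best offer
```

Because every legal turn either narrows the spread or trades, a wide opening market cannot stay wide for long.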

Simulating news and pulling your quotes

While trading is still open, I have each person in the room come up and write the number of siblings that they have on the board one by one so that we are calculating the sum of the number of siblings that we have, combined, in real time. Trading is still open and people are continuing to trade, but they’re updating their models as each new number gets written.

And this allows me to say things to people like, “Hey, somebody just wrote a seven up on the board. This presumably updates your value for the true sum of the number of siblings in this room upwards. What is the first thing you want to do?” 

Usually the first thing that people in the room do is go, “whoa,” and then they look at the trades that they’ve already done and try to figure out how much money they gained or lost as a result of that. For this, I chastise them. I say the absolute first thing you want to do is going to be orienting toward the order book. You want to be saying “out” and clearing all the orders that you have that might be stale, even if you don’t remember whether or not those orders are good, even if you don’t remember what side of the order book they’re on. 

The first thing that happens is, you have new information that markets are different from what you thought, and the fastest thing for you to do to protect yourself is to be out on the stale orders of yours that are now stale to new information. 

The next thing you want to do is to go in the direction of the new information’s indication. And you want to be doing this to approximately the right order of magnitude.

In terms of how much you move the price, if you see a seven, that’s a big surprise relative to a one, which might be your expectation of how many siblings someone has, such that if the spread had been one wide or two wide, you should be happy lifting the offer even if you weren’t paying attention beforehand to what your model says the exact sum should be.

The conservative way to approach things is anytime there’s new information and your model shifts, you should be extra paranoid about the orders that you have that are still posted to the book, and you should be rushing to clear those orders, or to say out on your orders so that they’re removed from the order book, in particular because of the concern we talked about earlier of adverse selection. 

If you still have orders on the book, let’s say those orders are good to the fair value, i.e. you would be happy for them to be traded with, people are not necessarily going to trade with them because they don’t want to do bad trades.

But let’s say your orders are stale in the direction of being bad. Somebody is going to come in, see that, and trade with it, if they can do that faster than you can clear your order. It is more efficient for you to clear your orders, than for you to recalculate what you think the new fair value is based on having now added in that person’s seven siblings, subtracted them out from the number that you multiply by your expectation of the average number of siblings, make sure you’ve counted how many people have already written, come up with a new number and decide if your order looks good to that number. 
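The recalculation described above is short to write down but slow to do in your head, which is the reason to cancel first. A minimal sketch, assuming a room of 20 people and an expected one sibling per person who has not yet written (both numbers are illustrative, as are the function names):

```python
def fair_value(revealed, people_total, avg_siblings=1.0):
    """Sum already on the board plus the expected count for everyone left."""
    remaining = people_total - len(revealed)
    return sum(revealed) + remaining * avg_siblings


def on_new_number(my_orders, revealed, people_total):
    """React to a new number on the board: cancel first, then re-estimate."""
    my_orders.clear()  # say "out" on everything; no arithmetic required
    return fair_value(revealed, people_total)


orders = [("bid", 19), ("offer", 22)]
before = fair_value([1, 0, 2], people_total=20)
after = on_new_number(orders, [1, 0, 2, 7], people_total=20)
print(before, after, orders)  # 20.0 26.0 []
```

The seven moves the estimate up by six (seven minus the expected one), which would have left the 22 offer badly stale had it stayed in the book.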

A lot of the thing that I’m trying to convey in the trading curriculum as a whole is that, to be a good trader, you don’t necessarily need to be the person to get the exact right number after many minutes of painstaking equations and double checking every single odd constant that gets added to the end of that equation.

You need to be the fastest, you need to be going in the correct direction, and you need to have some sense of approximately how much you think the price of this asset should move, or how much you think the price of this stock should move. Those things need to come first because if you are the first trader, it is possible that you will get a good trade. If you are the 10th trader, it is way less likely because someone of the first nine traders was able to do the overwhelming majority of the good trades and take them out from under you. 

You are looking to maximize in dollar terms to make as many dollars as possible, and in order to do that you need to be fast. You need to be fast because, if you are going slower, it is more likely that your model will have mistakes conditional on getting filled, even though your model now feels like it is so much more well thought out and more likely to be correct, if you weren’t then also conditioning on that fill.

[Kris: Classic mock trading, from tighten or trade to teaching people to yell “out” to cancel bids/offers that are stale when news comes out.]

Patrick makes an observation that, if anything, is an understatement:

I like as a pedagogical approach that this allows people to infer some of the lessons without simply being told some of the lessons.

One of the ways adverse selection manifested in StockSlam was that in the final round it was possible to brute force the exact fair value of a color if you had quick mental math. This means a player needs to recognize 2 things:

  1. That it is possible to know fair value in the last round
  2. That if you haven’t figured out fair value and someone trades with your bid or offer, you should feel sad.

Many people discover this lesson only after getting picked off, when they realize what happened. Ricki has an analogous situation in her game:

Every time someone goes up and puts a number on the board, you learn a little bit more, but you also have an implication that there is less new information that’s going to come after me, and that’s another lesson that is easier for bright people to pick up on themselves after actually doing it than it is to say, “by the way, every time you get more information, there’s less information in the world that you don’t already know.”

Why people who have the best model for the world may or may not make the most money from trading.

I actually think in markets like these (i.e. the sibling market game), where there will be a settlement to the correct value, you’re more likely to make money by having good models than you will in markets like for various stocks in the US equities markets, where a lot of what you’re doing is trying to price things relative to what people in the market will think something’s worth rather than relative to a model that takes into account e.g. the earnings reports of a company and figures out what the actual value proposition of the product that they create is.

You are much more interested in what the directions that these prices move within the next few minutes or within the next few days will be. This also exists in a smaller form in the markets that I run that might resolve on the order of half an hour from now, where if you can notice what trends will happen in the next 10 minutes or what trends are already emerging, you can profit by buying and then selling or by selling and then buying a contract that doesn’t actually accumulate a position that you will get paid proportional to at the end, but instead, does what’s called flipping a contract, where by taking both legs of the contract, you can make money on the difference between those two prices.

[Kris: In the StockSlam game sessions we ran there was no private info and the color race was random. However, some players would follow a strategy of hoarding a color because if it won it guaranteed them victory. To be clear, the strategy has zero expectancy. However, if you just try to “flip” for edge you probably won’t win; you’ll lose to a random hoarder. The key is to understand that over the course of many games, the “flipper” who makes positive expectancy trades will win over time even though they may never win any single match! In StockSlam, there was no way to have an “investing” edge by having a better model of the world since it was random, but there were many relative value scalp opportunities.]

Incentivizing liquidity to overcome the fear of adverse selection

By the point that we have people start writing the number of siblings they have up on the board, we’ve relaxed the constraints of Tighten or Trade, and people are now allowed to clear their orders from the market. 

In fact, it is prudent for them to do so.

As a result, we want to incentivize people to trade and not just have liquidity entirely dry up in the market when there isn’t enough trading. We can do this by adding in naive customer flow: every minute we will flip a coin; if it’s heads, we’ll lift the best offer, i.e. buy from the best offer. If it’s tails, we will sell to the best bid. And this will incentivize people to tighten those spreads because they will be competing with one another for that top position, so that they’ll get to do trades with what is obviously explicitly an intentionally naive customer flow, uninformed trades from the coin flip bot.
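The coin-flip customer is simple enough to sketch directly. The function names are hypothetical, and the edge calculation assumes a single trader owns both the best bid and the best offer:

```python
import random


def coin_flip_customer(best_bid, best_offer, rng):
    """Heads lifts the best offer, tails sells to the best bid."""
    if rng.random() < 0.5:
        return ("buy", best_offer)
    return ("sell", best_bid)


def maker_edge_per_flip(best_bid, best_offer, fair):
    """Expected edge versus fair value for whoever owns both best quotes."""
    return 0.5 * (best_offer - fair) + 0.5 * (fair - best_bid)


rng = random.Random(0)
print(coin_flip_customer(10, 12, rng))
print(maker_edge_per_flip(10, 12, fair=11.5))  # 1.0
```

The expected edge works out to half the spread wherever fair value sits between the quotes, which is why the flow is worth competing for by tightening.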

[Kris: In StockSlam we used “broker cards” that players drew to simulate random order flow that you’d be happy to trade with on your bid or offer.]

How order types leak info

Market orders are much more likely to come from naive customers. This is because a market order is strictly dominated by a limit order with a really high price.

There is some number of dollars above which you would not want to buy Shmapple stock. If you specify a limit order with a limit of $200,000, that’s strictly better than a market order because any cases where the market order would trade and the limit order wouldn’t are cases where you should probably feel really, really sad, in that above $200,000 range, to have traded.
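The dominance argument can be checked with a one-function sketch. It assumes a one-share buy against a single best offer; the function name is hypothetical and the $200,000 cap is the one from the example above.

```python
def buy_fill(best_offer, limit=None):
    """Fill price for a one-share buy, or None if nothing trades.

    limit=None models a market order; otherwise the order only trades
    when the best offer is at or below the limit.
    """
    if limit is None or best_offer <= limit:
        return best_offer
    return None


for offer in (150, 199_999, 250_000):
    print(offer, buy_fill(offer), buy_fill(offer, limit=200_000))
```

The two order types fill identically at every offer up to the cap. They differ only above it, where the market order buys at a price you would regret and the limit order does nothing.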

[Kris: A market order is a bit of code that says I’ll buy shares at ANY price — a statement no human has ever made.]

A market order is more likely to be a naive customer. However, I think where that point is less relevant is that you don’t often have access to the information of something being a market order or the ability to select preferential trading with them. There are many structures, including many auction mechanisms, that will prioritize those market orders in terms of who you trade with: a willingness to buy at a higher price than other market participants should allow you to get some kind of priority in a case where those orders are being aggregated and then executed all at the same time, and this should be a reason to make people who are providing to Order Flow, i.e. market makers, in an auction more inclined toward wanting to do so. 

But I do think this fact about market orders is a really useful fact for purposes of making the decision not to send them, and less useful for you as a market maker, because you’re just not able to take advantage of it in nearly as many cases.

“Fill or kill” order types

A fill or kill order says “immediately fill this order for me if there’s an order that would take the other side of that trade in the book, or kill this order – don’t leave it up.” This is as distinguished typically from limit orders, which are orders that, if you send them into the market, will stay on the order book and other people can come and transact with them later. 

One reason you might want to do this is, similar to earlier, the concern about adverse selection: if you leave an order up, it will get traded with in cases where it is stale to new information that comes out, or cases where other people see that order and judge it to be bad. 

Fill or kill isn’t totally immune to the problem of adverse selection because it’s still true that any order it gets to trade with is an indication that somebody else is happy trading with you, even after conditioning on your choice to trade with them and has therefore put up an order that would trade with this. But it is a lot safer in terms of not having the problem of stale orders out on the market that then get traded against.
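The difference between the two order types comes down to what happens when nothing is immediately available to trade with. A sketch of the behavior as described here, assuming one-lot orders so that partial fills never come up (the function and its arguments are hypothetical):

```python
def submit_buy(book_offers, resting_bids, price, fill_or_kill):
    """Send a one-lot buy order; return the fill price or None.

    book_offers: list of resting offer prices (mutated on a fill).
    resting_bids: list that collects bids left on the book.
    """
    tradable = [o for o in book_offers if o <= price]
    if tradable:
        best = min(tradable)
        book_offers.remove(best)
        return best
    if not fill_or_kill:
        resting_bids.append(price)  # a plain limit order rests and can go stale
    return None                     # a fill or kill order leaves nothing behind


offers, bids = [12, 13], []
print(submit_buy(offers, bids, 11, fill_or_kill=True), bids)   # None []
print(submit_buy(offers, bids, 11, fill_or_kill=False), bids)  # None [11]
```

Only the second order leaves an 11 bid sitting on the book, and that resting bid is the one that can later be picked off.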

Why Ricki’s simulation draws stock returns from a stochastic process rather than using Patrick’s approach, where stock prices were real returns from a time in history (although a user would have no feasible way to identify it)

We explicitly model the movement in stock prices as random walks for some of the stocks that we’re putting on the market instead of using actual historical returns. One reason is because it’s simpler and when you’re in the early phases of teaching somebody about trading, keeping the API, keeping the parameters of the trading game much simpler will do a better job at inculcating those first order lessons than something that also incorporates a bunch of noise that comes from, you know, the various other things that might be happening to Bank of America stock in the background that they won’t be able to anticipate.

It’s also simpler for purposes of reviewing and teaching it and saying, “your model doesn’t need to be taking into account all of these other complicated things, but instead should be interacting with this product in this way.” We are very explicit up front about what algorithms will determine the prices of these stocks as they move forward.

The other reason for this is because part of what we’re trying to teach is about the relationships and in particular the ability to do arbitrage between different markets. One thing we have going is we have three different stocks, call them A, B, and C, that are each determined by a random walk that happens continuously over the two hour trading period. We then have a separate product, ETF, which is worth exactly the sum of stocks A, stocks B, and stocks C, but with different bot behavior happening in that market, such that, as you see the prices move in A, B, and C – and those prices will get moved by sophisticated trading bots that know the true future value of them as determined by just the number output by the random walk function – then you can take that information and go and move the markets in ETF and convert shares of A, B, and C into shares of the ETF and vice versa.
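A rough sketch of that setup follows. The source only says the stocks follow random walks, so the step distribution, starting price, and seed here are all assumptions, and the function name is hypothetical.

```python
import random


def simulate(steps, seed=0, start=100.0, step_size=1.0):
    """Random-walk prices for A, B, C plus the ETF's implied fair value."""
    rng = random.Random(seed)
    prices = {"A": start, "B": start, "C": start}
    path = []
    for _ in range(steps):
        for name in prices:
            # Assumed walk: each stock moves up or down one step per tick.
            prices[name] += rng.choice((-step_size, step_size))
        path.append({**prices, "ETF": sum(prices.values())})
    return path


last = simulate(50)[-1]
print(last)
```

Since the ETF's fair value is fully determined by the three walks, any gap between where the ETF trades and that sum is the signal to trade or convert.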

I cannot emphasize this more: keeping things as simple as possible early on is really valuable. That’s why using real life historical returns won’t give you the ability to lever up the impactfulness of those early lessons – and getting people to the point where they understand the first order things actually takes a lot more time than you might expect it to.

You mentioned that class one on day one is teaching you a bit about the laws of adverse selection, and I actually want to correct that. I think class one on day one is just teaching you, “what is the API for interacting with an order book?” It’s still teaching you some lessons as we go – in particular, teaching you lessons about why is it that an exchange might be designed in this way, or what is it that you should be worried about paying attention to; how much magnitude there is in terms of contract size, but it’s also just teaching you, “what does contract size mean when there are two orders out there?” Why is that equivalent to there being one order and then another order at the same price as opposed to meaning, say, “for 35 dollars I’ll buy the package of two shares as opposed to two shares each for $35.” Why does that make sense in the context of the Exchange’s API to this order book? 

(Day two is when we start teaching adverse selection.)

The meaning of arbitrage

I think often one of the sources of confusion is people use the word arbitrage to explain a whole bunch of different things that are not in fact raw arbitrage in the form I’m about to describe. Arbitrage is the act of trading two different products that are essentially the same product, or aggregate to become the same product, in a way that causes you to profit risk free.

What does this mean? Let’s say you have two stocks, A and B, and there is a separate stock, SUM, that is the sum of A and B – that might be an exchange traded fund, an ETF, that is the basket of A and B combined. If you can buy stock A, buy stock B and sell this ETF (the sum of A and B) and make a profit by doing all of those trades, you’ve done arbitrage.

You’ve managed to engage with the market in a way that allows you to end up with no position. You now have no exposure to A, B, or SUM; when there are movements that happen where A increases by a dollar, you profit a dollar off of having a long position in A, but you also lose a dollar off of your short position in SUM.

And as a result, you have not profited or lost out. You have not made a profit or loss in the aggregate of the things you’re holding, and therefore it is essentially the same as if you were to successfully create that ETF and cancel out your positions, or equivalently, redeem it and cancel out your positions in the stocks.

A hands-on experience of arbitrage

One of the ways that we try to teach arbitrage specifically is by creating markets that are still based on things in the real world. We have a crosswords contest in which people race doing crosswords that are up on a screen, doing the exact same crossword – we have the fastest time of a member of team green, the fastest time of a member of team orange, the sum of those two fastest times and the difference green minus orange of those two times.

Then after we’ve concluded trading, I’ll ask everybody to take a few minutes to calculate, to come up with a state of the order books that would allow them to do arbitrage. The first mistake that people make here is they say, “Oh, let’s say green is trading for 10, orange is trading for 14 and sum is trading for 26. I could buy the first two and sell sum.” 

Why is that wrong? It’s wrong because “trading at” isn’t one specific number at which you could do any transaction. There is a bid and there is an offer. You need to be comparing the offers in green and orange with the bid in the sum – or vice-versa, the bids in green and orange if you want to sell it and the offer in sum if you want to be buying it – and figure out are there any sets of trades that you could do that would be profitable, recognizing the need to cross the spreads. You don’t just get to transact at whatever the midpoint is in that market or whatever the last price is in that market. Figuring out which sides of the order book you need to be looking at together to see if you have an arbitrage opportunity is the kind of thing that is conceptually, in theory, trivial, but so, so easy to make a mistake on.
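Writing the check out makes it harder to grab the wrong side of the book. A minimal sketch, assuming each market is summarized by its best bid and best offer and ignoring fees and size (the function name is hypothetical):

```python
def find_arbs(green, orange, total):
    """Each argument is a (bid, offer) pair. Returns (description, profit) pairs."""
    arbs = []
    # Buy both legs at their offers, sell the sum at its bid.
    edge = total[0] - (green[1] + orange[1])
    if edge > 0:
        arbs.append(("buy green and orange, sell sum", edge))
    # Sell both legs at their bids, buy the sum at its offer.
    edge = (green[0] + orange[0]) - total[1]
    if edge > 0:
        arbs.append(("sell green and orange, buy sum", edge))
    return arbs


# Midpoints of 10, 14 and 26 look like free money, but not after the spreads:
print(find_arbs((9, 11), (13, 15), (25, 27)))  # []
# Tighter quotes around similar levels do leave an arbitrage:
print(find_arbs((9, 10), (13, 14), (26, 28)))  # [('buy green and orange, sell sum', 2)]
```

In the first book the midpoints suggest 2 of edge, but buying the legs costs 26 at the offers and the sum can only be sold at 25.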

While teaching this class I regularly make a sign error – accidentally think that we’re supposed to go in one direction and not the other. Working through specific, concrete examples is going to get students way closer to not making those errors, or figuring out where those mistakes will crop up, than just reasoning from first principles about what you want to do.

It also helps them write up in spreadsheets that are reading in the electronic markets we have. What cases cause there to be an arbitrage and what don’t? The thing that I then push them to do is not just check if those models make sense from first principles, but, let’s change the stock price of green by one value – how does this push everything going on? Let’s change the offer here by one value. How does that push everything going on? Let’s say there’s this settlement and you’ve taken on these positions. What is your PNL? 

Walking through different examples of how changing the prices that these values either trade at or settle to, and how that changes your profitability, how that changes the payouts of your positions, does a way better job informing people of the directional pushes and the effects that they have than just explaining from first principles why those would be the case.
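The "change one value and see what moves" exercise is easy to reproduce outside a spreadsheet. This is a sketch with hypothetical contract names for the four crossword markets, with positions stored as a signed quantity and an average price:

```python
def settle_values(green_time, orange_time):
    """Settlement value of each contract given the two fastest times."""
    return {
        "GREEN": green_time,
        "ORANGE": orange_time,
        "SUM": green_time + orange_time,
        "DIFF": green_time - orange_time,
    }


def pnl(positions, settle):
    """positions maps contract -> (signed quantity, average price)."""
    return sum(qty * (settle[c] - px) for c, (qty, px) in positions.items())


hedged = {"GREEN": (1, 10), "ORANGE": (1, 14), "SUM": (-1, 26)}
outright = {"GREEN": (1, 10)}
for green_time in (12, 13):  # bump green's settlement by one
    s = settle_values(green_time, 15)
    print(green_time, pnl(hedged, s), pnl(outright, s))
```

Bumping green by one leaves the hedged P&L at 2 while the outright position moves from 2 to 3, which is the directional push the exercise is meant to expose.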

Real-world concerns with arbitrage

I want to add one more complication, which is anytime you’re engaging with multiple financial products, you are adding additional risk about what might happen with those products. Let’s say you buy shares in an ETF and then the company that is issuing that ETF goes belly up or mismanaged their portfolio or reported a number that wasn’t actually the correct number, but was in fact reporting someone’s error somewhere. There’s so many opportunities for error at each piece in the system, including the ones that you assume are true or think it would be impossible to ever be violated, that adding complications to your portfolio in the interest of succeeding at arbitrage diminishes the guarantee that you are making risk free money. 

You are taking on additional risks the more products you’re interacting with.

So one of the concerns about arbitrage that I teach is this concern about needing to cross spreads in both cases – this is something I sometimes call transaction costs in addition to the kinds of fees you might need to pay for each trade – but another is the risk of just having multiple different positions, both for purposes of your own internal accounting and ways you’re more likely to incur spreadsheet errors, for example, and also in terms of the other sources of risk that come from external factors about why you might think you have a certain position, e.g. that your books don’t necessarily match some of your counterparty’s books for some reason, or those of the exchange you’re interacting with, because there are errors all over the system. 

There are times that one set of trades will get busted or canceled retroactively, and other sets of trades that you’ve done will not – so even though you thought you had a flat position, a flat delta as you mentioned people sometimes refer to it, in fact, you do not, little known to you.
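Two of these concerns are easy to put numbers on: how much of a paper mispricing survives crossing both spreads and paying fees, and what a "flat" book looks like after one leg is busted. The sketch below is illustrative only. The quotes, the fee, the fill records, and both helper functions are hypothetical.

```python
# All prices are in cents to keep the arithmetic exact. Quotes, fees, and
# fill records below are hypothetical.

def edge_after_costs(sell_bid, buy_ask, fee_per_trade=0):
    """Edge per unit from selling one leg at its bid and buying the other at
    its ask, after paying a fee on each of the two trades."""
    return sell_bid - buy_ask - 2 * fee_per_trade

def net_delta(fills, busted_ids=frozenset()):
    """Sum signed fills (positive = bought, in ETF-equivalent units),
    skipping any fill that was later busted."""
    return sum(f["qty"] for f in fills if f["id"] not in busted_ids)

# ETF quoted 6020 bid / 6040 ask, basket quoted 5990 bid / 6010 ask.
mid_gap = (6020 + 6040) // 2 - (5990 + 6010) // 2     # mispricing on paper
edge = edge_after_costs(sell_bid=6020, buy_ask=6010, fee_per_trade=2)

fills = [
    {"id": "A1", "leg": "etf", "qty": -100},      # sold 100 ETF
    {"id": "B1", "leg": "basket", "qty": 100},    # bought 100 baskets
]

print(mid_gap, edge)                          # paper gap vs. capturable edge
print(net_delta(fills))                       # flat if both legs stand
print(net_delta(fills, busted_ids={"B1"}))    # not flat if the buy is busted
```

With these made-up quotes, a 30-cent gap between mids shrinks to 6 cents once both spreads and both fees are paid, and a busted basket fill turns a flat book into a 100-unit short.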

The challenge of teaching position sizing

Position sizing is one of the things that I’ve struggled with teaching the most because I think there’s this intuition of, if a trade is good, you should do that trade for the full size available to you until the point that you’ve moved the market such that it’s no longer good. And again, when I’m teaching things to a first approximation, that’s pretty reasonable. You want to do good trades. One reason to do similarly sized trades across a lot of different markets is that you will get better data about how good your trading is and how much you’re improving.

Another reason is because you will be better diversified to not lose out against noise for purposes of your own portfolio staying positive, separate from your ability to track it for educational purposes or for accounting purposes.

There’s also just the fact that you will be less likely to go down to zero and then no longer be able to make money in the markets if you have size that’s spread more evenly between different places. 

How is it that I teach sizing?

I’m going to be honest with you: a lot of the fundamental lessons about sizing are ones that we don’t get to in the time that we have available to us. 

We’re trying to teach fundamentals that are easy to digest. It’s not that sizing isn’t important – it’s extremely important – it’s that sizing is just a little bit too complicated to teach successfully, especially because you have two different questions.

One is a question of the impact that your trades will have and how they’ll move the market in terms of optimal sizing if that’s the only opportunity available to you, and another is sizing in terms of the relative values of different parts of your portfolio and how you want to keep it steady in light of like different things changing.

There are two different important things to be teaching on how to size trades reasonably.

One is for a specific given trade, if you were only optimizing the decisions you were making as they pertain to that trade, that was the only opportunity you had, what is the optimal size to do that trade for purposes of like your expected value of its performance? That’s the kind of thing that we’ll often teach through having these liquidity-providing bots or having bots that are acting like naive customers in a market that will trade up a stock a certain amount such that you can give people kind of this formal equation, and we try to keep those as simple as possible of how much it’ll move the market. Then people can figure out based on how much the market in a few of the stocks moves, how much should the market in the exchange traded fund move and therefore how much should they trade it up in order to get it to that point.

Likewise, in terms of arbitrage between the ETF and the stock, you should figure out which of those legs of the trade will be more constrained, which one has less size available for you to take on, and have that be your cap on the size that you take on in those two markets so that you’re not sitting with two trades that would have been good if you could have done them for arbitrary size but now have much more size on one of them than on the other.
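Both single-trade ideas, trading a product toward where it should be and capping an arbitrage at its tighter leg, reduce to short arithmetic once an impact rule is assumed. The sketch below assumes a linear impact model (each share traded moves the price by a fixed amount). That model is an assumption for illustration, since the interview only says the formula is kept as simple as possible. The numbers and function names are hypothetical.

```python
def shares_to_reach(current_price, target_price, impact_per_share):
    """Shares to buy (positive) or sell (negative) to push the price to the
    target, assuming each share traded moves the price by impact_per_share."""
    return (target_price - current_price) / impact_per_share

def capped_arb_size(size_available_by_leg):
    """An arbitrage can only be done for the size of its most constrained leg."""
    return min(size_available_by_leg.values())

# The component stocks have moved; the ETF (sum of components) has not yet.
stock_moves = {"red": 2.0, "green": 1.0, "blue": 0.0}
etf_price = 60.0
etf_target = etf_price + sum(stock_moves.values())

shares = shares_to_reach(etf_price, etf_target, impact_per_share=0.5)
size = capped_arb_size({"etf": 40, "basket": 15})

print(etf_target, shares, size)
```

Trading all the way to the target is the upper bound, because the last share has no edge left. The cap keeps the two legs matched instead of leaving extra size on whichever side happened to be more liquid.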

That’s as it pertains to a specific trade. In terms of the question of sizing as it pertains to sizing your different positions across lots of different markets in ways that are reasonably balanced with each other, that level of sophistication is often beyond the scope of a two day trading bootcamp, or even when I do the longer version, a 10 day trading bootcamp for high school students.

It will be the kind of thing I will touch on insofar as it has pedagogical benefits for people to have lots of different positions that allow them to track performance with less noise getting in the way of the performance of their trades, and I’ll talk a little bit about why diversification is important for purposes of not having a portfolio that can easily drop to zero and take you out of the pool.

But the question of what exact equations you want to use to result in sizing between different positions based on how much money is to be made in those different markets, and also on the fact that you want to have some balance between them, tends to be one step more complicated than the things that we end up getting around to covering in this curriculum. I think it’s a hard thing to teach and it’s important to teach well, and you shouldn’t start trading with your own money before having a good understanding of it, but it is interesting to me that it is harder to teach than some of the other concepts that I do manage to cover, like adverse selection and movements in price or how much you want to trade toward that price.

[Kris: I agree with the difficulty of teaching sizing. In an options portfolio context, this post offers a practical perspective: Options trading as a widget factory.

Bet sizing for discrete gambles is not intuitive but is well covered.]
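One standard treatment of sizing a discrete gamble is the Kelly criterion. It is used here purely as an illustration and is not something the interview covers. For a bet that wins with probability p and pays b per 1 staked, the growth-maximizing fraction of bankroll is p - (1 - p) / b. The numbers below are made up.

```python
from math import log

def kelly_fraction(p_win, net_odds):
    """Growth-maximizing bankroll fraction for a bet that pays net_odds per
    1 staked with probability p_win and loses the stake otherwise."""
    return p_win - (1 - p_win) / net_odds

def expected_log_growth(fraction, p_win, net_odds):
    """Expected log growth of the bankroll per bet at a given fraction."""
    return (p_win * log(1 + fraction * net_odds)
            + (1 - p_win) * log(1 - fraction))

p, b = 0.6, 1.0                     # hypothetical 60% coin at even money
f_star = kelly_fraction(p, b)

# Compare half Kelly, full Kelly, and double Kelly.
for f in (0.5 * f_star, f_star, 2 * f_star):
    print(round(f, 2), round(expected_log_growth(f, p, b), 4))
```

The unintuitive part is the last row. On a coin that favors you 60/40, betting twice the Kelly fraction has slightly negative expected log growth, so a good bet sized too large still shrinks the bankroll over time.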

Why teach high schoolers about trading

I’m trying to leave them with two different things.

Number one is a sense of, do you like this thing? Do you maybe want to be a quant trader? Would you enjoy thinking about these questions a lot more and going forth and doing them? That’s something that inclines me very far toward the educational end of the spectrum because I’m trying to give them a flavor for what quant trading is. I think a lot of people who are very knowledgeable in domains that range from software engineering to math puzzles to history have just no idea of what it is a quant trader does, and in particular, what kinds of thinking skills and tools a quant trader ought to have, so even getting people to the point of understanding why heuristics are so important for trading, why speed is important for trading, and what kinds of things to be paranoid about or to pay attention to in markets, will get them a lot of the way there in terms of just having this general domain knowledge that can inform whether it’s something they want to dig into more.

My other goal is to teach them about the kinds of heuristics around trading that can inform them whether or not a specific area or trade is one worth then investing a lot more effort into thinking about. 

Adverse selection is a concept that’s trying to teach you to be more paranoid about your trades. Information about naive customer flow is teaching you that despite adverse selection and other attributes of more sophisticated market players doing better than you, it is still worthwhile sometimes to do trades if you can identify the good ones. And, when should you have a story for why it might be good to trade, such that you then invest a lot more time thinking about that trade, and trying to understand whether your story for it is true or not? 

I guess I’ll add a third thing which is that, I think that a lot of the features in financial markets and the things that I’m trying to teach toward crop up in lots of different places in life – adverse selection being one of the biggest examples of this – in ways that people are not necessarily paying attention to, but once you’ve been a certain amount trading-pilled, once you’ve gone through this curriculum, you’ll notice a lot more and be able to incorporate it into your ability to ascertain whether an environment is more cooperative leaning or competitive leaning – whether an environment is high trust or low trust – and how to set up incentives and agreements and contracts so that your environment can either be more safely high trust or more legibly low trust in order to cause people to make choices and do things that will be positive expectancy for them, and ideally, in the world I’m trying to create, positive expectancy for everyone by giving us the tools required to determine what places are low trust, and how to make places high trust so that we can cooperate and not burn the commons on values that might be good for all of us.

Patrick tests Ricki with the same puzzle he gave to Stockfighter players: “There are a hundred traders in this market, you have access to the order book, find out who the insider is. How would you go about doing that?”

[Patrick notes: Ricki’s answer here was delivered in real-time after thinking for approximately 15 seconds. Fewer than 100 of 50,000 technologists, many of whom strove for several hours, successfully implemented any of the four ideas she came up with on the spot.

Kris: In case you wondered if trading was actually a skill you can learn, Ricki’s answer demonstrates domain knowledge.]

Great question, and I love this exercise. It feels to me like a good simulation of what a lot of interacting in real financial markets is like, in that you will have some participants more sophisticated than others, and identifying which ones are which is especially important.

I think the first thing that I would look to is, if I can see the behavior of the different entities with names attached to them – this is not going to be true in many major financial markets in the real world, but might have been in your simulation – I’m going to look at which ones are kind of always moving in the same direction as the one that the earnings report moves the stock in, in advance of it.

I’m going to want to look in particular at ones that do these trades very shortly before the earnings report is released because it is likely that they will want to focus their positions there so that there’s less noise, so that they are less susceptible to effects of noise that happen over the course of 30 days.

But I’ll also just want to be applying a filter of, do they go in the same direction as the stock ends up moving during the 30 day period prior to the earnings release. 

I’m going to want to look, if there are lots of different financial products available in this market, at the highest-leverage ones to figure out where it is that those traders are putting most of their efforts.

So if you see someone buying, say, out-of-the-money call options on companies that then move up a bunch…

Patrick McKenzie: This is the classic Matt Levine point. If you’re going to do insider trading, don’t do it with very out of the money call options that are expiring this Friday.

Ricki Heicklen: Yep, and nevertheless he keeps gathering more and more examples of people behaving in this way. Of course, he gets to see the examples of the ones who get caught and not the ones who don’t get caught, but…

Patrick McKenzie: We as society are being adverse-selected when we see the results of the legal process. We are finding the dumbest crooks.

Finally, if you have a way of tracking the profits and losses of individual entities, again, if you can track the individual agents’ trades and what their balance is at each point in time, the ones that see the biggest bump following an earnings report and whatever ensuing market volatility takes place on account of that earnings report will likely be the ones that you should be paying more attention to.

A more sophisticated insider trader might make some deliberately bad trades in there in order to throw you off the scent, might go in the opposite direction at certain points in time, but to a first approximation, those are the kinds of signals you’re going to want to be filtering on to find the insider trader in this market.
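The first of these filters, trading in the direction of the earnings move with more weight on trades placed close to the release, can be sketched as a simple score. Everything here is hypothetical: the trade-record layout, the trader IDs, the 1/days weighting, and the function name are assumptions for illustration, not Stockfighter's data format or Ricki's method. The leverage and P&L-jump signals are left out.

```python
from collections import defaultdict

def alignment_scores(trades, earnings_moves, window_days=30):
    """Score each trader from -1 to 1 on how consistently their pre-release
    trades pointed the same way as the eventual earnings move.

    trades: (trader, event_id, days_before_release, signed_qty) tuples
    earnings_moves: event_id -> +1 (stock rose) or -1 (stock fell)
    """
    aligned = defaultdict(float)
    total = defaultdict(float)
    for trader, event_id, days_before, qty in trades:
        if qty == 0 or not 0 < days_before <= window_days:
            continue
        weight = abs(qty) / days_before       # closer to release counts more
        direction = 1 if qty > 0 else -1
        aligned[trader] += weight * direction * earnings_moves[event_id]
        total[trader] += weight
    return {trader: aligned[trader] / total[trader] for trader in total}

earnings_moves = {"e1": 1, "e2": -1}
trades = [
    ("T03", "e1", 20, 100), ("T03", "e2", 10, 100),    # right once, wrong once
    ("T17", "e1", 2, 500),  ("T17", "e2", 1, -400),    # right both times, late
    ("T42", "e1", 25, -50), ("T42", "e2", 15, -50),    # always short
]

scores = alignment_scores(trades, earnings_moves)
suspect = max(scores, key=scores.get)
print(suspect, scores)
```

A score near 1 over many events is the pattern to look at first. As the last paragraph warns, an insider who mixes in deliberately bad trades would pull their own score down, so this is a first-pass filter and not proof.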

An idea I learned at SIG with a poker analog — “paying for information”…Ricki doesn’t call it that but the concept is here

The best way to figure out what trades will cause what effects is by doing those trades for a small size and seeing what happens as a result of them. That will save you so much time, in particular for things like catching your own mistakes in, let’s say, arbitrage land, where you do what we jokingly call “garbitrage” of going in the exact opposite direction of the arbitrage you intend to perform.

This is something you will catch better by just trying and failing to do the arbitrage you want, and why you should always turn on your bots with, you know, a one hundredth of the size that you would want them to actually end up trading with.
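The one-hundredth-size habit can be written as a small guard around whatever function places the trade. This is a sketch under simplifying assumptions: `trade_fn` is a hypothetical stand-in for a bot that returns realized P&L for a given size, and the arbitrage is treated as deterministic, so a single probe reveals its sign. Real probe P&L is noisy and would need more than one sample.

```python
def run_with_probe(trade_fn, full_size, probe_divisor=100):
    """Run trade_fn at a small probe size first, and only trade the full
    size if the probe made money."""
    probe_size = max(1, full_size // probe_divisor)
    probe_pnl = trade_fn(probe_size)
    if probe_pnl <= 0:
        return {"probe_pnl": probe_pnl, "full_pnl": None}   # stop and debug
    return {"probe_pnl": probe_pnl, "full_pnl": trade_fn(full_size)}

EDGE_PER_UNIT = 3   # hypothetical locked-in edge per unit traded

def intended_arb(size):
    """The arbitrage done in the right direction."""
    return EDGE_PER_UNIT * size

def garbitrage(size):
    """The same arbitrage with the sign flipped by mistake."""
    return -EDGE_PER_UNIT * size

print(run_with_probe(intended_arb, 1000))
print(run_with_probe(garbitrage, 1000))
```

In this toy setup the sign error costs 30 at probe size instead of 3,000 at full size, which is the whole argument for paying a little to find out.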

The usefulness of graphs

I especially like the point there about how useful graphs can be. I think that a mistake that people often make with trading is they will have a giant CSV with tons of data points, and the only thing they’ll do with that is calculate summary statistics, like your t-statistic for how good a certain trade would be, or even read through the tape of like what trades happen in a certain time period, but in a way that human brains are less likely to successfully consume than pictures.

Picture books are so much easier to read than massive tomes of just words straightforwardly, and likewise, seeing charts of this data will be a very quick way of causing you to notice patterns that deviate from the other things you see than trying to read through an entire CSV.
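A classic demonstration of why summary statistics alone can mislead is Anscombe's quartet. It is a textbook dataset, not trade data, and it is used here only to illustrate the point. The sketch below takes two of its four series, shows that their means and correlations match to two decimals, and draws a crude text scatterplot of each. The `ascii_scatter` helper is a hypothetical stand-in for a real plotting library.

```python
from statistics import mean

# First two series of Anscombe's quartet (shared x values).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def correlation(xs, ys):
    """Pearson correlation, written out so only the standard library is needed."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

def ascii_scatter(xs, ys, rows=8):
    """Crude text scatterplot: one column per integer x, `rows` buckets of y."""
    lo, hi = min(ys), max(ys)
    grid = [[" "] * (max(xs) - min(xs) + 1) for _ in range(rows)]
    for a, b in zip(xs, ys):
        row = round((hi - b) / (hi - lo) * (rows - 1))   # top row = largest y
        grid[row][a - min(xs)] = "*"
    return "\n".join("".join(line) for line in grid)

for ys in (y1, y2):
    print(round(mean(ys), 2), round(correlation(x, ys), 2))
    print(ascii_scatter(x, ys))
    print()
```

The printed statistics are indistinguishable, but one picture is a noisy line and the other is a clean curve. That difference is the kind of thing a t-statistic over a giant CSV will not surface.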

[Kris: Total agreement. I have a special affinity for scatterplots.]

The information game

[Kris: I’m excerpting a huge chunk of this because it’s great. It’s a topic I wrote about in Twitter Reminds Me Of The Trading Pits and discuss in my interview with Corey Hoffstein.]

Let’s say you are a trader who has a lot of information about a specific strategy that you execute day after day. And you’re at dinner talking to somebody externally, and they’re interested in what you’re working on.

Well, you know that you can’t reveal the specific details of the strategy, and you know that you can’t explain any piece of your code. Let’s say you’re talking to somebody external. That might be a roommate of yours who happens to also work in finance. It might be out at a bar late at night. You will often end up revealing information, even if you are not disclosing any of the details, much less the code of the specific strategy in question, just by revealing what things you care about: what areas of the market, what countries, what asset classes, etc, you are thinking about. You might reveal this just by mentioning something about that asset class. You might even reveal it just by knowing more about that asset class when the other person brings up a few things with you.

If they talk to you about stocks and bonds and options and you reveal that, “oh, I actually don’t know much about bonds, that’s not my area of expertise, but I know a lot about options,” they now have a bit of data that there’s more profit to be made in options in expectation than they had previously thought because your firm has you focusing on options.

If they can get a good sense of how many people are in each different desk in a firm, or what desks even exist in that firm – if you have a desk specifically devoted to trading certain kinds of commodities, that will tip them off to the fact that those commodities have money to be made in them.

Sometimes the only thing that a competitor will need is to know where your focus is in order to be able to then take five of their researchers and say, hey, I think there’s money to be made in Brazilian options, or something like that – let’s put some attention into that part of the ecosystem, and now we too will be able to extract that trade.

Patrick McKenzie: I love this thing that there are things which seem very professionally normative and not leaking inside information at all, which allow others to adversarially reconstruct things that very much are proprietary information – like simply asking a question like, Oh, how many people sit with you? Or, how many friends do you have at work? Or, even like, do you feel lonely at work? “No, I’ve got seven buddies!”

That plus your LinkedIn profile could already be enough to leak market-moving information to other people.

Ricki Heicklen: There are also a lot of these cases of accidental information leakage, in which somebody ends up communicating something to someone else at a different firm, inadvertently, whether through saying how many people sit near them on their desk, or what kinds of products they’re thinking about, or even, reactively to other people’s questions, indicating what they do and don’t know, or what assumptions they make.

One of my favorite examples of this is that the financial industry uses a whole bunch of acronyms. This is because acronyms are more efficient communicators and often clearer ways to express something to someone else. It’s also just because acronyms in general are useful.

Many of these acronyms are overloaded, and there are multiple things they can mean. If somebody external to your firm uses an acronym with you, and you assume it means a certain thing and react as though it means that thing, and they previously thought it meant something else, all of a sudden they know that you think about that concept a bunch, and that that concept is therefore relevant to your work because there is some money to be made in that realm or that asset class or that category of stocks.

This is just so hard to defend against. 

Patrick McKenzie: So, things I’ve seen in real life that have commercially significant consequences without giving away anything. Oh, I don’t even know if it’s possible to not give away anything now. 

Something as simple as a book recommendation in the context of who is doing the recommendation and when they are getting to that book leaks information about e.g. if it is the CTO at a particular firm who is suddenly attempting to bone up on a particular industry and they tweet out what they are reading about, you know, “I’m very into insurance tech right now,” that should move your estimates of whether that firm is institutionally interested in insurance where it wasn’t.

And given the contours of that firm you know, if it’s unexpected that they’re involved in insurance, that is probably useful information to someone somewhere. The classic example of a side channel for this is, planes are very easy to track on the internet for various reasons that get more into aviation than anything else. Some people travel by private planes, and those private planes are registered to them. If a CEO routinely flies to a particular place that only has one interesting business, it is highly likely that there is a deal happening there, and most frequently that deal might be I’m attempting to acquire this interesting business.

[Kris: I recall alt data satellite companies pitching us on this ability to track planes for this reason about 10 years ago…by the time they were pitching a vol trader on this as opposed to a Point 72 long/short pod the opportunity is long gone. Remember, where are you in the pecking order for certain kinds of info?]

And so, people will get an extremely commercially significant bit of information out of something which is one, legally required, two, absolutely anodyne, and three, does not on its surface look like inside information at all, because it isn’t inside information, it’s open source.

Ricki Heicklen: That’s fascinating. I love that example. Another one that might be similarly publicly available is who somebody is following on Twitter.

There are many Twitter accounts that will tweet out stock advice and people working at financial firms will often follow those Twitter accounts – the Twitter accounts they think are more likely to give advice that is a leading indicator of something that will happen in the market.

If you just scroll through somebody’s followers, and you see that there are certain financial advisors, or certain funds, or certain individuals that they’re following, whether those individuals are people who say things that are relevant for markets, like Donald Trump or Joe Biden, or whether they’re people who are giving advice about where to invest, you’ll end up with a competitive advantage because now you know those are places to pay attention to.

Patrick McKenzie: I love Twitter as a product, but there are many, many excellent reasons to not use Twitter. One is that your likes are public – that has caused cancellation of various people for liking tweets that have unpopular opinions. [Kris: not anymore]

One far less obvious threat model is, what is the CEO of this firm thinking about in their private moments on a day-to-day basis?

And one can very easily back solve from that to what the firm might be engaged in. Or, is there a person at another firm who suddenly picks up four followers in a row from a particular firm? Very plausibly, a conversation has happened. There’s just an infinite number of these side channels.

So what can one do about it, aside from not using Twitter and never having friends?

[Ricki poses the common solution — use an alt even if it doesn’t solve all the cases, for example, somebody’s public profile quickly being followed by a number of people from a certain industry.]

Tradeoffs in defending against info leakage

There’s some firm behavior that I think is pretty reasonable that has to do with siloing employees, or taking employees and making sure that they only have access to the amount of information that they need local to the work that they’re doing, and not access to every single trading strategy, or how much money the firm is making, because the best way to keep someone from leaking information is preventing them from having that information in the first place.

Siloing employees comes with a lot of costs. Trends that you might notice between different trading desks, or phenomena you might notice in the market that turn out to be relevant to somebody else’s strategy won’t propagate nearly as easily between different parts of the firm. A mistake that somebody is making is not going to propagate as easily between different parts of the firm.

And two strategies that look like they might be doing different things but are actually doing the same trade and duplicating that trade and therefore in aggregate sizing it too largely, putting on too much size in that trade will not be as easy to detect if you don’t have a bird’s eye view of everything happening, and if you don’t have a lot of eyes on everything happening.

But the benefits of siloing for purposes of information leakage are large, especially the larger your firm gets. First of all, you’re going to have a stronger defense against malicious actors who leave, join another firm, and give their trading strategies, because they’ll only know one or two of the strategies or whatever they were focused on, and not the entire firm’s IP.

Second of all, you are improving how safe your firm is in cases where people leak information inadvertently – they’re not going to be revealing as much information about what different desks are focused on.

[Kris: I must admit the focus on secrecy and hiding your tracks is very reminiscent of my time at SIG.]

The progression of investment skill

Here’s a long excerpt from Byrne Hobart’s appearance on Patrick McKenzie’s Complex Systems Podcast that explains modern alpha-seeking so well:

Hedge funds in their modern incarnation are machines for looking for deficiencies in other people’s model of the world that can be expressed through trades. That model has very much evolved – at least for the largest hedge funds, it’s evolved towards a setup where, if you look at asset classes, you can see that they have different risk and return characteristics, and then within those asset classes, you can make other judgments.

You can say things like, let’s say, very low-rated bonds are much more sensitive to recession risk, and highly-rated bonds are more sensitive to interest rate risk; you can say that, typically, best-performing stocks will actually continue to outperform, as will worst-performing stocks; that typically statistically cheap companies do a bit better over time than statistically expensive ones; that industries correlate and industry membership explains a large share of a given stock’s performance, et cetera.

You can enumerate all of these factors that are just broad statistical explanations for where returns come from, and that allows you to look at someone’s investing track record and identify, how much of this was that you picked good stocks? How much of this was that your career happened to span a bull market? How much of this was, not only was it a bull market, but the first job you got happened to be analyst at a tech fund, and tech did unusually well in that bull market? 

The subtext of this next part is that skill has a fuzzy component

We run these regressions and find out, “okay, you beat the market by five points a year. It turns out that 5.5 of that was luck, and the other negative 0.5 was skill.” Someone actually did this with George Soros’s investment record and found that his skill contributed negative two points, and that following trends in currencies was just a really, really good trade to run at that time.

I think there’s still something valuable in having implicitly done the regression in your head and actually somehow instinctively identified this systematic signal and executed well.
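The "5.5 of luck and negative 0.5 of skill" arithmetic is what falls out of regressing a manager's excess return on a factor's return: the slope times the factor's average return is the part the factor explains, and the intercept is what is left over. The sketch below uses a single hypothetical factor and made-up annual numbers chosen so the split reproduces the quote. Real attribution uses many factors and far more data.

```python
from statistics import mean

def alpha_beta(excess_returns, factor_returns):
    """Ordinary least squares of excess_returns on one factor.
    Returns (alpha, beta): intercept and slope."""
    me, mf = mean(excess_returns), mean(factor_returns)
    cov = sum((e - me) * (f - mf)
              for e, f in zip(excess_returns, factor_returns))
    var = sum((f - mf) ** 2 for f in factor_returns)
    beta = cov / var
    alpha = me - beta * mf
    return alpha, beta

# Hypothetical annual returns in percentage points.
factor = [12, -4, 9, 3, 7.5]            # e.g. a trend-following factor
excess = [11.5, -4.5, 8.5, 2.5, 7.0]    # manager's return over the market

alpha, beta = alpha_beta(excess, factor)
explained_by_factor = beta * mean(factor)

print(mean(excess), explained_by_factor, alpha)
```

Here the manager beat the market by 5 points a year on average, the factor exposure accounts for 5.5 of that, and the intercept, the part usually labeled skill, is -0.5.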

Patrick McKenzie: I think there’s probably a sort of unscored pregame in which you look at every opportunity available in the world and somehow through “luck,” select an opportunity where, go figure, the beta in that opportunity, the returns to the market exceed returns available in other markets during those years. 

Maybe you sub-optimized with respect to how you executed on that opportunity before you, but you picked very well on which opportunity to spend a portion of your professional career going after.

[Patrick notes: I often feel this way about commentary on how geeks are “lucky” to have had their special interest be extremely valuable to e.g. tech industry.]

But that fuzziness will probably shrink as our ability to measure improves

Byrne Hobart: Yeah, I think that is a reasonable way to look at it, especially in earlier history, but as we have more data now, at least within financial markets…

It is very hard to time these factor performances – very hard to time, “when will this industry do well or worse, when will momentum work unusually well or worse” – I’m sure people try to do it, I’m sure some people are good at it, but if you construct a portfolio where you’re netting out exposure to all of those factors, what you have done is you’ve created a portfolio that is just a measure of someone’s skill at identifying the idiosyncratic return drivers of individual stocks. So if they bought NVIDIA, they also had to short a corresponding amount of other large companies, other growth companies, other tech companies, etc. such that, if they make money on NVIDIA after all that hedging, it’s because they actually knew something right about NVIDIA.
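The netting-out step can be made concrete with a toy example. Assuming a stock's factor betas are known, the hedge is a short in each factor sized by beta, and whatever return is left after subtracting the factor-explained part is the idiosyncratic piece. The betas, returns, and function names below are hypothetical and not estimates for NVIDIA or any real stock.

```python
def hedge_notionals(long_notional, betas):
    """Notional to hold in each factor (negative = short) so the long
    position's factor exposure nets to zero."""
    return {factor: -long_notional * beta for factor, beta in betas.items()}

def idiosyncratic_return(stock_return, betas, factor_returns):
    """Return left over after removing what the factor exposures explain."""
    explained = sum(betas[f] * factor_returns[f] for f in betas)
    return stock_return - explained

betas = {"market": 1.2, "tech": 0.5}            # hypothetical exposures
factor_returns = {"market": 5.0, "tech": 4.0}   # percent, over the period
stock_return = 10.0                             # percent, over the period

hedges = hedge_notionals(1_000_000, betas)
idio = idiosyncratic_return(stock_return, betas, factor_returns)

print(hedges, idio)
```

In this example the stock was up 10%, but 8 points of that came from market and tech exposure that the shorts give back, so the hedged book keeps only the 2 points that were specific to the stock.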

What that ends up meaning is that the hedge fund – we’ve actually made the full circle. People used to knock hedge funds as “a compensation scheme masquerading as an asset class,” and as they’ve gotten better at building these hedge funds, market-neutral, factor-neutral portfolios, they are increasingly a method of measuring investment skill masquerading as an asset class. 

Because, what do you want? In theory it makes sense that you should be able to charge a lot of money for skill, and you should not be able to charge very much money for, “you happen to get a job analyzing an industry that happened to do well over the time when you were a portfolio manager.”

So it means that as hedge funds have gotten better at just delivering that idiosyncratic return, and [with] the accumulation of a bunch of different portfolio managers, who are finding a bunch of different ways to extract the “idio” from a bunch of different sets of companies, you can charge a lot more for that – which means you can pay people a lot more, and so you can bring in more talent to the industry.

That model keeps on growing, but it does become a model where you as an analyst or as the trader or as the portfolio manager, you are constantly asking yourself questions like, “why do I deserve to be right about this?” If you have a reason to think this is a good company, what is the reason that someone else looking at the same evidence didn’t think so? 

Sometimes the reason is you looked at more evidence than they did – they talked to five people at private companies that order lots of GPUs, you talked to eight people, you have a slight edge on the person who worked less. 

Sometimes you just have a signal where you identify, not really why didn’t someone else exploit this, but why does this happen in the first place?

So you’re always trying to build this model of the world, and of what you know, what you know relative to other people, what mistakes other people might be making, how persistent those mistakes are, how much competition there is to exploit those mistakes, and you’re trying to measure the degree to which your returns are being competed away…

You’re always doing this kind of introspection and always trying to rigorously measure your own skill as an empiricist. It is basically this exercise in just being a rationalist. It is like they are mentally reinventing the entire LessWrong corpus all the time.

Why measuring VC skill will always be a hard problem

Byrne: Measuring venture investor skill is one of the hard problems in finance, and may never be solved because, if you are in a power law kind of investing situation, you have these long lags between when you write the check and when you get a wire for a much larger amount to your account. Because of that, not only do you have a fairly small sample of successes, but the more successful you are, the more likely it is to be from one really, really big thing you did, which means the more successful someone is, the easier it is to claim that they were lucky. And that just makes it a really frustrating business to analyze and understand.

It also means that it’s very hard for a venture investor to think about the marginal cost of doing one more interview or buying one more data set or something like that – whereas with a hedge fund or a prop trading firm… I don’t know that any of them explicitly measure things like “what is the marginal value of this analyst spending the next 30 minutes reading a transcript of an interview or editing the scraper that we’re using to track the inventory in this API that the company does not realize is actually exposed to the public internet.”

They don’t measure it quite that granularly, but they have a pretty good sense of what is the incremental return on the next action, and they have pretty high confidence in that.

You don’t have that. You don’t know if the next call that you take is from someone who’s starting the next Stripe – the odds are very, very low, but the odds are non-zero, and you will never actually have enough data to be anything like confident in that. 

Mr. My Way

First I want to bring an enlightening podcast episode to your attention.

🎙️How a Professional Sports Bettor Really Makes Money (Bloomberg Odd Lots)

Joe and Tracy interview a pro gambler. In under an hour you can learn a ton about the sports gambling industry. With sports leagues shoving it down our throats, young and old alike, I feel it’s important to understand what’s going on. And it’s not pretty.

I’ve talked about how this industry is rigged in “Free” Markets Wet Dream. This podcast echoes the problems but it encompasses so much more: opportunities, cultural impact, and basic mechanics. There are lots of misconceptions about sports gambling as well.

This is a list of some of the less-obvious ideas found in the interview:

Professional Sports Betting Strategies and Challenges

  • Identifying Market Inefficiencies: Spotting mispriced odds and exploiting gaps in bookmakers’ knowledge, especially in niche markets.

    This exchange is the betting parallel of fundamental vs arbitrage investing:

    Tracy (07:49):

    So just so we can better understand this dynamic, walk us through your sort of day-to-day as a professional sports gambler. What kind of opportunities are you trying to identify and then how do you decide how much money, for instance, to apply to each individual bet?


    Isaac (08:05):

    Yeah, so it’s a great question. So it really depends on what sports are in season. So right now, you know, end of the NBA, but there’s a lot of MLB things like tennis are year round. And so it does depend on the sports.

    There are in general two ways of identifying profitable sports bets. The first is you can take sort of a market-based approach. And by a market-based approach, what we mean is there are tons of sports books all out there and as Joe mentioned, they’re all offering all of these different kinds of bets. And if you constantly scroll through all of the odds, you’re going to find slight mispricings. So let’s say everybody has the Yankees as two-to-one underdogs and one sportsbook has them as a three-to-one underdog. You can identify that as off market and you can place that.

    Tracy (08:45):

    Oh, I see. So you’re not saying that the platforms have the odds wrong, you’re trying to identify outliers among the platforms.

    Isaac (08:51):

    That’s exactly right, yeah. So that’s probably the main [way] that the majority of professional or winning sports bettors make money: by identifying markets which are simply mispriced. And for that you don’t need any special sports knowledge, you just need to have a screen with all the odds and constantly be scrolling through them and looking for price changes and looking for books that are slow to update.

    The other way to do it, which is a lot harder and a lot [more] rare, is to basically create your own numbers. So you say, okay, I know exactly how much each player is worth. I know what the weather is going to be today, I know these matchups and so I’m going to generate kind of from scratch from my own, model the odds and then apply that to the market. And when it comes to major liquid markets like the NFL or the NBA or the MLB, that’s really, really hard to do. And there are very, very few people who can do that, but those are the people who make the most money.

  • “Flipping Whales”: Partnering with high-rolling bettors who are often losing money but have high betting limits, allowing professionals to place larger bets indirectly.
  • Account Limitations and Closures: Sportsbooks limit or close accounts of consistently winning players. In fact, the first few bets you make on a platform are given a heavy weight in determining your betting limits. If you open an account to arb a line, the platform knows it because they know what the sharp side is. To get a healthy bet size limit you want to open an account with trades that casual amateurs make. Bet on the home team. Go for some parlays. Be a sucker.
  • Maintaining Multiple Accounts: To circumvent limitations, professional bettors often manage accounts across various platforms and use accounts of friends or family members.
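The market-based approach Isaac describes (scan every book, flag the one that is off market) can be sketched in a few lines. This is only an illustration: the book names, the quotes, and the 3-point threshold are hypothetical, and it ignores the vig baked into each price.

```python
from statistics import median

def implied_prob(odds_against: float) -> float:
    """Convert 'X-to-1' odds against into an implied win probability."""
    return 1.0 / (odds_against + 1.0)

def off_market(quotes: dict[str, float], threshold: float = 0.03) -> dict[str, float]:
    """Flag books whose implied probability is below the consensus by more than threshold.

    A lower implied probability means a bigger payout than the rest of the market offers.
    """
    probs = {book: implied_prob(odds) for book, odds in quotes.items()}
    consensus = median(probs.values())
    return {book: consensus - p for book, p in probs.items() if consensus - p > threshold}

# Hypothetical quotes: everyone has the Yankees 2-to-1, one book has them 3-to-1
quotes = {"book_a": 2.0, "book_b": 2.0, "book_c": 2.0, "book_d": 3.0}
flags = off_market(quotes)
print(flags)  # only book_d is flagged
```

Using the median as the consensus is my assumption; a real screen would more likely weight toward the sharpest books.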

Regulatory Concerns and Societal Impacts

  • Match-Fixing Concerns:
    • Lower-Profile Sports Vulnerability: More prevalent in sports where athletes earn less, making them susceptible to match-fixing offers.
    • Risky Prop Bets: Bets on specific player performances increase the risk of manipulation.
    • Monitoring and Enforcement: Challenges in maintaining integrity across global sports markets.
  • Addiction Risks:
    • Young Men: Disproportionately affected due to emotional attachments to sports teams and delusional beliefs about their betting acumen.
    • Mobile Access: Easy access through apps exacerbates addiction potential, making gambling omnipresent and more enticing.
  • Lack of Transparency:
    • Tracking Wins/Losses: Many platforms make it difficult for users to track their overall performance.
    • Deceptive Advertising: Emphasis on potential wins while downplaying the risks involved.
    • Targeting Minors: Use of “social” sportsbooks and fantasy sports apps to attract younger audiences.
  • US vs. European Models:
    • European Practices: European sportsbooks are known for being less accommodating to winning players, often quickly limiting or banning them.
    • US Adoption: Many US sportsbooks, owned or influenced by European companies, adopt similar practices, focusing on market share and profitability by restricting successful bettors.

Technological Aspects of Sports Betting

  • Mobile Apps and Online Platforms:
    • 24/7 Betting Access: Allows for instant betting from anywhere at any time.
    • Variety of Bet Types: Including live in-game betting and exotic prop bets.
    • Engagement Tactics: Use of push notifications and personalized offers to keep users betting.
  • Data Analysis and Algorithmic Pricing:
    • Initial Odds Setting: Advanced statistical models and real-time adjustments based on betting patterns.
    • Third-Party Providers: Specialize in generating odds for niche markets, contributing to the breadth of available bets.
  • Geolocation and Identity Verification:
    • Regulatory Compliance: Ensures users are betting within legal jurisdictions.
    • Preventing VPN Bypass: Ensures bettors cannot circumvent geographical restrictions.
    • KYC Processes: Verifying age and identity to comply with legal requirements and prevent underage gambling.

Future Outlook and Potential Reforms

  • Advertising Regulation:
    • Restrictions on Ads: Calls for limits on the frequency and content of gambling advertisements.
    • Honest Marketing: Need for ads that accurately depict the risks of gambling, including mandatory disclosures of odds and average losses.
  • In-App Transparency Improvements:
    • Display of Wins/Losses: Clearer, real-time tracking of a user’s total betting performance.
    • Responsible Gambling Tools: More prominent tools and resources to help users manage their gambling.
  • Research and Data Collection:
    • Independent Studies: Need for unbiased, non-industry-funded research on gambling impacts.
    • Tracking Problem Gambling: Better data on the prevalence and demographics of problem gambling.
    • Regulatory Effectiveness: Evaluating the impact of different regulatory measures.
  • Enhanced Responsible Gambling Tools:
    • Opt-Out Limits: Consideration of default deposit and time limits that users must opt out of, rather than opt in to.
    • AI Integration: Use of artificial intelligence to identify and address problematic betting patterns.
    • Improved Self-Exclusion Programs: Streamlined processes for self-exclusion across multiple platforms to help problem gamblers limit their activity.

Economic Reality of Gambling Companies

  • Misconception of Profitability:
    • Despite the significant growth and visibility of the sports betting industry, many companies are not as profitable as assumed.
    • Customer Acquisition Costs: Sportsbooks spend heavily on marketing, promotions, and sponsorships to attract new users. These costs often outweigh the revenue generated from bets.
    • High Operational Expenses: Maintaining compliance with regulations, technology infrastructure, and customer support adds to the costs.
    • Competitive Market: The need to offer competitive odds and bonuses to attract and retain customers further squeezes profit margins.
    • Focus on Market Share: Many companies prioritize expanding their user base and market share over short-term profitability, leading to significant investments in customer acquisition and retention.
    • iGaming as the Future:
      • Strategic Shift: Many gambling companies see online casino games (iGaming) as their future primary revenue source. These games are more addictive and provide higher margins due to their rapid play nature and the continuous betting opportunities they offer.
      • Regulatory Landscape: iGaming is currently legal in fewer states compared to sports betting, but where it is legal, it dominates revenue figures, indicating its profitability.
  • Predatory Practices:
    • Increasing Losers’ Bet Limits: Sportsbooks often increase betting limits for users who consistently lose money, encouraging them to bet more and lose more. This contrasts sharply with the practice of limiting or banning successful bettors.
    • VIP Programs: Offering special incentives and higher limits to big losers, which can exacerbate their gambling problems and financial losses.

The house edge on typical bets in sports is in the ballpark of 5% if you have to risk $110 to win $100 on a coin flip. The house edge ensures a healthy long-term profit for the sportsbooks if they can avoid smart bettors. But the edge is small enough to keep bad bettors coming back. They win enough to think they have a chance.
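To see where the "ballpark of 5%" comes from, here is a quick sketch of the arithmetic. I'm measuring the edge as expected loss per dollar risked, which is one common convention but an assumption on my part.

```python
def house_edge(risk: float, win: float, p_win: float = 0.5) -> float:
    """Bettor's expected loss as a fraction of the amount risked."""
    expected_profit = p_win * win - (1 - p_win) * risk
    return -expected_profit / risk

def breakeven_win_rate(risk: float, win: float) -> float:
    """Win probability at which the bet has zero expected value."""
    return risk / (risk + win)

edge = house_edge(110, 100)               # lose $5 on average per $110 risked
breakeven = breakeven_win_rate(110, 100)  # need to win 110 out of every 210
print(f"house edge: {edge:.1%}, breakeven win rate: {breakeven:.1%}")
```

A coin-flip bettor gives up about 4.5% per bet and needs to win a bit more than 52% of the time just to break even.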

On my flight back from NJ, I read applied mathematician David Sumpter’s The 10 Equations That Rule The World (he was the author of the sports analytics book Soccermatics as well).

It’s a good introduction to the topics below, especially since it provides lots of real-world applications to problems we encounter on a regular basis. For example, he uses Bayes’ Theorem to show why forgiveness is not just a gracious thing to do but a statistically sound choice. He also bodies Jordan Peterson if you’re into that.

The book is sequential — the equations build on preceding ones to build rich models that underpin profitable business and life decisions.

Taking the baton from Odd Lots, we can use chapter 3’s Confidence Equation and chapter 6’s Market Equation to establish a basis for determining if we actually have an edge.

Let’s get into the details.

Chapter 3: The Confidence Equation

Sumpter defines the equation:

h * n ± 1.96 * σ * √n

where:

h = edge or signal per trial

n = trials

σ = standard deviation

The 1.96 gives away the equation’s identity: it’s the 95% confidence interval.

The best way to understand it is by demonstration.

If you have 3% edge on a bet with a standard deviation of 71% and make 100 bets your realized edge will be:

.03 ± (1.96 * .71)/ √100

.03 ± .14

Your realized edge will have a 95% chance of falling between -11% and +17%
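Note that the worked example is the confidence equation divided through by n: the total h * n ± 1.96 * σ * √n becomes a per-bet average of h ± 1.96 * σ / √n. A minimal sketch of that calculation (the function name is mine, not Sumpter's):

```python
import math

def realized_edge_interval(h: float, sigma: float, n: int, z: float = 1.96):
    """95% interval for the average realized edge per bet after n independent bets."""
    half_width = z * sigma / math.sqrt(n)
    return h - half_width, h + half_width

low, high = realized_edge_interval(h=0.03, sigma=0.71, n=100)
print(f"{low:+.2f} to {high:+.2f}")  # -0.11 to +0.17
```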

While the confidence interval contains zero, you cannot be particularly sure that the signal is positive and that the gambling strategy works.

The value of this equation is often best seen in reverse. We can invert the expression to ask:

“How many trials do I need to be 95% sure that my edge is positive?”

h/σ > 2/√n

*Note: The 2 comes from rounding 1.96 up. Sumpter doesn’t mind sacrificing precision to make the formula memorable.
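Rearranging h/σ > 2/√n gives n > (2σ/h)². A small sketch, again with my own function name, that plugs in the 3% edge and 71% standard deviation from the example:

```python
import math

def trials_needed(h: float, sigma: float, z: float = 2.0) -> int:
    """Smallest whole n satisfying h/sigma > z/sqrt(n), i.e. n > (z*sigma/h)**2."""
    return math.floor((z * sigma / h) ** 2) + 1

n_full = trials_needed(h=0.03, sigma=0.71)   # a bit over 2,200 bets
n_half = trials_needed(h=0.015, sigma=0.71)  # an edge half as large
print(n_full, n_half, round(n_half / n_full, 2))
```

Halving the edge roughly quadruples the number of bets required, which is the square root of n rule in action.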

💡Moontower readers will observe that h/σ is a measure of risk to reward and can be interpreted as a Sharpe ratio.

The bet described above is ascribed to a hypothetical gambler named Lisa. Notice that Lisa is not out of the woods even if she gets to make this bet 2,300x.

Sumpter explains the problem:

During those six years, other gamblers might have picked up on her edge and started to back it. The bookmakers may adjust their odds and the edge disappears. The risk for Lisa is that she doesn’t realize that her edge has gone. It takes over one thousand matches to be confident that an edge exists. It can take just as many expensive losses to realize that it has disappeared. The profits that grew exponentially fast now crash down and decay exponentially fast.

Notwithstanding the ever-present problem of “did the world change while I was deploying my strategy” the blunt math is instructive:

Most amateur investors are vaguely aware that they need to separate the signal from the noise, but very few of them understand the importance of the square root of n rule that arises from the confidence equation. For example, detecting a signal half as strong requires four times as many observations, and increasing the number of observations from 400 to 1,600 allows you to detect edges that are half as large. It is easy to underestimate the amount of data needed to find the tiny edges in the markets.

These ideas were fundamental in options training. You can see them applied in:

Understanding Edge (10 min read)

If You Make Money Every Day, You’re Not Maximizing (23 min read)


During the 1700s, mathematician de Moivre pioneered combinatorics (i.e., how many ways can you be dealt a full house). The combination formula relies on factorials which become computationally impossible when numbers get large, especially in the 18th century. Scottish academic James Stirling showed how, at large n, the binomial distribution can be approximated by the normal bell-curve.

In 1810, Laplace developed the idea of moment-generating functions to describe features of distribution. This allowed him to study how the shape of the distribution changes as random outcomes are added together. Laplace demonstrated something truly remarkable: irrespective of what is being summed, as the number of outcomes we sum increases, the moments always become closer and closer to those of the normal curve.

While there were tricky exceptions to be grappled with:

the result that Jarl Lindeberg finally proved in 1920, it is known today as the central limit theorem, or CLT. It says that whenever we add up lots of independent random measurements, each with mean μ and standard deviation σ, then the sum of n of those measurements has a bell-shaped normal distribution with a mean of n * μ and a standard deviation of σ * √n.

To take in the vast scope of this result, consider just a few examples. If we sum the results of 100 dice throws, they are normally distributed. If we sum the outcomes of repeated games of dice, cards, roulette wheels, or online casinos, they are normally distributed. The total scores in NBA basketball games are normally distributed (illustrated in the bottom panel of figure 3). Crop yields are normally distributed. Speed of traffic on the highway is normally distributed. Our heights, our IQs, and the outcome of personality tests are normally distributed.

Whenever random factors are added up to come, or whenever the same type of observation is repeated over and over again, the normal distribution can be found. De Moivre, Laplace, and, later, Lindeberg created the theoretical bounds within which the confidence equation can be applied. What they couldn’t know, and what scientists have since found, is just how many phenomena can fit under that same curve.
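The dice example is easy to check without any simulation. The sketch below builds the exact distribution of the sum of 100 fair dice by repeated convolution and compares it with what the CLT predicts (mean n * μ, standard deviation σ * √n, about 95% of outcomes within 1.96 standard deviations). The approach is my choice of illustration, not something from the book.

```python
import math

def dice_sum_pmf(n_dice: int) -> dict[int, float]:
    """Exact probability of each total when rolling n_dice fair six-sided dice."""
    pmf = {0: 1.0}
    for _ in range(n_dice):
        nxt: dict[int, float] = {}
        for total, p in pmf.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0.0) + p / 6
        pmf = nxt
    return pmf

n = 100
pmf = dice_sum_pmf(n)
mean = sum(s * p for s, p in pmf.items())
sd = math.sqrt(sum((s - mean) ** 2 * p for s, p in pmf.items()))
inside = sum(p for s, p in pmf.items() if abs(s - mean) <= 1.96 * sd)

# One die has mean 3.5 and variance 35/12, so the CLT predicts
# mean = 3.5 * n and sd = sqrt(35/12) * sqrt(n)
print(mean, sd, inside)
```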

A bridge to markets

We now jump to chapter 6 to see how this concept applies to investing.

Chapter 6: The Market Equation

Sumpter lays it out:

dX = h * dt + f(X) dt + σ * ε

This looks very similar to the stock diffusion equation known as Brownian motion where h is the drift and σ * ε is the random component. The signal and the noise respectively.

But there’s a wrinkle in this version.

We acknowledge:

The key assumption for the central limit theorem is that events are independent. In roulette, one spin of the wheel doesn’t depend on the last; the central limit theorem applies.

But not all financial mathematicians understood that the central limit theorem didn’t apply to markets.

❗This brings us to the f(X) term in the market equation. I haven’t seen that before.

It is a feedback function.

We turn back to Sumpter:

When I met J. Doyne Farmer in 2009, he told me about a colleague at one trading firm—which, unlike Farmer’s own company, had lost a lot of money during the 2007/08 crisis—who referred to the Lehman Brothers investment bank crash as a “twelve sigma event.” As we saw in chapter 3, 1-sigma events occur 1 time in 3, 2-sigma events occur about 1 time in 20, and a 5-sigma event about 1 time in 3.5 million. A 12-sigma event occurs 1 time in, well, I’m not sure, actually, because my calculator fails when I try to find anything larger than a 9-sigma.
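Where a calculator gives up, the complementary error function still works. Here is a hedged sketch of the "1 time in X" numbers under a normal curve. As far as I can tell, the 1-sigma and 2-sigma figures in the quote correspond to a two-sided tail and the 5-sigma figure to a one-sided tail, so the function supports both.

```python
import math

def one_in(k_sigma: float, two_sided: bool = True) -> float:
    """Return X such that a move beyond k_sigma happens about 1 time in X under a normal curve."""
    tail = 0.5 * math.erfc(k_sigma / math.sqrt(2))  # P(Z > k_sigma)
    p = 2 * tail if two_sided else tail
    return 1 / p

print(one_in(1))                   # about 1 in 3
print(one_in(2))                   # about 1 in 22
print(one_in(5, two_sided=False))  # about 1 in 3.5 million
print(one_in(12))                  # on the order of 1 in 10**32
```

If the normal model were right, a 12-sigma day would essentially never happen in the lifetime of the universe, which is the point: the model is what broke.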

The simple signal and noise market model assumes independence in price changes. Under the model, future values should thus follow the √n rule and the normal curve. In reality, they don’t.

On the stock market, one trader who sells causes another to lose confidence and sell too. This invalidates de Moivre’s central limit theorem. Fluctuations in share prices are no longer small and predictable. Stockholders are herd animals, following each other into one boom and bust after another. Introducing the f(X) term into the equation means that traders don’t act independently from each other, but it does assume they have short memories. It again invokes the Markov assumption, this time to say that traders’ feelings about the near future change as a function of their feelings now. Seen this way, the Market equation can be thought of as combining the Confidence equation, for separating signal and noise, with the Influencer equation, for measuring social interactions, in a single model.

Instead, as the theoretical physicists in Santa Fe showed, the variation in future share prices can become proportional to higher powers of n, such as n^(2/3) or even proportional to n itself.

This makes markets scarily volatile.

While I haven’t seen this market equation before, the topic is not foreign. In Thinking In N not T you learn how ignoring auto-correlation causes you to underestimate an asset’s volatility!
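A toy calculation shows why correlated moves break the √n rule. I am assuming a simple stand-in for feedback here: each step has standard deviation σ and steps k apart have correlation ρ^k. That is not Sumpter's f(X), just the easiest way I know to see the effect exactly.

```python
import math

def sum_std(n: int, sigma: float, rho: float) -> float:
    """Std of the sum of n steps, each with std sigma and lag-k correlation rho**k."""
    cross = sum((n - k) * rho ** k for k in range(1, n))
    return sigma * math.sqrt(n + 2 * cross)

n, sigma = 100, 1.0
independent = sum_std(n, sigma, 0.0)  # the sqrt(n) rule: sigma * sqrt(n)
herding = sum_std(n, sigma, 0.5)      # moderately correlated steps
lockstep = sum_std(n, sigma, 1.0)     # perfectly correlated: sigma * n
print(independent, herding, lockstep)
```

With independent steps the spread after 100 steps is 10σ. With ρ = 0.5 it is roughly 17σ, and in the lockstep limit it is 100σ, proportional to n itself. This short-memory toy only stretches the √n constant for any fixed ρ below 1; the in-between power laws in the quote need longer-range dependence than this captures.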

In a world where even laypeople know Taleb’s favorite gym lift, it’s banal to point out that we don’t inhabit Mediocristan. And yet, experience suggests it’s not so banal that we’ve internalized the implications of non-gaussian distributions.

Maja finds that non-mathematicians seldom take the time to reflect on the assumptions that underlie the models she uses. They see what she does as predicting the future, rather than describing future uncertainty. Last time we met for lunch, together with her colleague Peyman, she told me, “The biggest problem I see is when people take the results of models literally.” Peyman agreed. “You show them a confidence interval for some time in the future, and they take that as true. Very few of them understand that our model is based on some very weak assumptions.”

What’s possibly more upsetting is what the equation means for people that spout “reasons”.

The core message of the market equation is to be careful, because almost anything could happen in the future. At best, we can insure ourselves against fluctuations without needing to know why they have occurred. When the markets temporarily melted down and bounced again at the start of 2018, Manoj Narang, CEO of quantitative trading firm MANA Partners, told the business news organization Quartz that “understanding why something happened in the market is only slightly easier than understanding the meaning of life. A lot of people have educated guesses, but they don’t know.”

If traders, bankers, mathematicians, and economists don’t understand the reasons markets move, then what makes you think that you do? What makes you think that Amazon shares have reached their peak or Facebook shares will continue to fall? What makes you so confident when you talk about getting into the housing market at the right time?

Sumpter leaves us with what I’d describe as irreducibly vague advice:

The most important lesson from the market equation, a lesson that applies not only to our economic investments but also to investments in friendships, in relationships, in work, and in our free time. Don’t believe that you can reliably predict what will happen in life. Instead, make decisions that make sense to you—decisions you truly believe in. (Here you should use the judgment ie Bayes equation, of course.) Then use the three terms in the market equation to prepare yourself mentally for an uncertain future:

Remember the noise term: there will be many ups and downs that lie outside your control.

Remember the social term: don’t get caught up in the hype or discouraged when the herd doesn’t share your beliefs.

And remember the signal term: that the true value of your investment is there, even though you can’t always see it.


Finally, here’s a fun excerpt that I want to point out because everyone knows this person — Mr. My Way:

I am sitting in a café in the late afternoon and watch him come in. He shakes a waiter’s hand and then does the same thing with the barista, exchanges smiles and a few words. He doesn’t see me at first, and as I stand up to go over to him, he spots someone else he knows. A round of hugging ensues. I sit back down again, waiting for him to finish.

His celebrity here partly derives from his former life as a professional soccer player, and because his face is often on TV, but he is also popular because of how he holds himself: his confidence, his friendliness, the way he takes the time to talk to people, sharing a few words with everyone.

Within a few minutes of sitting down with me, he is into his spiel. “I think I make a difference because I show them my way of doing things. I think that’s lost sometimes,” he says. “I just do my thing, I tell it as it is, and I am honest, because that’s what is needed in this game. I’ve got a lot of contacts. A lot of meetings like this one, you know, keeping connected. You see, people want to talk to me because I have a unique way of seeing it. Because of my background, you know, a way that no one else has quite got, and that’s what I’m aiming to deliver when I sit down with you.” These observations are interspersed with anecdotes of his playing days, a bit of name-dropping, and rehearsed stories, complete with well-timed jokes.

He smiles, looks me straight in the eyes, and, at times, makes me feel like I’ve asked for all this information. But I haven’t asked for it. I wanted to talk about using data, both as it is employed in the media and within the game of soccer. Unfortunately, I’m not getting anything useful. I call this type of man “Mr. My Way,” after the song Frank Sinatra made famous. The careful steps, the standing tall, and the seeing it through provide the basis for each of his stories. It can make a beautiful melody, and for the two or three minutes during which my current Mr. My Way is hugging and greeting his way into the café, it entertains those he meets. But it only works provided he moves from one person to the next. Now, here am I, stuck in this position, with nowhere to go.

I’ve enjoyed hearing behind-the-scenes stories about players and big matches, and finding out about life at the training ground. Moving from being a fan to being someone who is confided in by those close to the action was, to use the biggest cliché possible, a dream come true. I still love hearing those stories and seeing the real world of my favorite sport for myself. But more often than not, the interesting bits are accompanied by “heroic” tales of Mr. My Way’s “vision,” followed by accounts of how their progress has been foiled by a cheating adversary or how they could do things better than anyone else if they had been given half a chance. Because of my background in math, these guys often feel they have to explain their thinking process to me. They start by telling me that I have a different way of looking at things than they do, without actually asking me how I look at things. “I think stats are great for thinking about the past,” he will tell me, “but what I bring is insight into the future.”

After that, he will explain how he has a unique ability to spot a competitive advantage. Or how it is his self-confidence and strong character that help him to make good decisions. Or how he has cracked a way of picking out patterns in data that I have (he assumes) missed.

His tales tend to include a digression to times that didn’t go quite as well for him. “It was only when I lost concentration that I started to make mistakes,” he tells me. But he always returns to emphasizing his strengths: “When I stay clear and focused, I get it right.”

What I hadn’t understood when I started working with sports statistics was just how much time I would have to sit listening to men telling me why they believed they were the special one. I should have known better because this doesn’t just happen in sports. I have experienced the same thing in industry and business: investment bankers telling me about their unique skill sets. They don’t need math because they have a feeling for their work that their quantitative traders (known as quants) can never have. Or tech leaders explaining to me that their start-up succeeded because of their unique insights and talents. Even academics do it. Failed researchers describe how their ideas were stolen by others or, when they succeed, they tell me how they stuck to their principles. Each of them did it their way. Here is a difficult question to answer: How do I know whether someone is telling me something useful or not? The guy I’m sitting with now is obviously full of it. He has talked about himself nonstop for the last hour and a half. But many other people do have something useful to say, including, on occasion, Mr. My Way. The question is how to separate the useful stuff from the self-indulgent stuff. The difference between a Mr. My Way and a mathematician can be summed up in one word: assumptions. Mr. My Way barrels through the world confident everything he assumes is true really is true.

In case you start feeling too smug about the bullshitters in your work sphere, stop to consider the second order effect of knowing that some signals are more verifiable than others. See the 1-min read The Paradox Of Provable Alpha.

Innocent Fraud

The last time I remember saying more than a sentence about macro was about 2 years ago.

Before that the only real dive into it was reading Jesse’s outstanding paper Upside Down Markets. I read it 3x. It’s the length of a short book so I did a thorough breakdown of it here if you want a condensed version.

A highlight of Jesse’s paper was the use of the Kalecki-Levy “sectoral balances” accounting framework which offers a very organized way to think about economic flows. That framework gained awareness as MMT gained prominence in economic discourse.

MMT or modern monetary theory is called a “heterodox” economic theory. This has nothing to do with what kind of theories it likes to have sex with but just means it’s outside the mainstream. It’s controversial because it’s interpreted as this hyper-Keynesian excuse for the government to spend. Valid concern. But that has less to do with MMT’s descriptive power and more to do with its politics. That said, this problem applies to macro in general. The error bars around macroeconomic theories are far wider than the political battering rams of policy. Policy is always distributional — there are winners and losers. Those contested grounds hold far greater weight to narrow interests and individuals than the economic statistics that roll up to high-level macro summary. [Most economic arguments are politics in disguise.]

I have a friend who manages a pool of capital for a HF-turned-family-office. He’s a creative thinker to bounce ideas off. An inquisitive thinker who reflexively considers many angles to a problem before even speaking. When we chat he never asserts anything. And he’s one of those listeners who makes you nervous. His question-to-opinion ratio is like 100-1. The opposite of the incurious, overconfident Mr. My Ways that are overrepresented in finance (which is foremost a sales industry).

I say that because the book I’m going to recommend came from this friend and our back-and-forths on WhatsApp. It’s definitely not a book I would have read if this guy hadn’t recommended it. I’m really glad I read it. But for reasons that are probably not why most people pick it up. I’ll get into that but first, the book is called:

The Seven Deadly Innocent Frauds (free download)

There’s a blog post sized version.

And my Kindle highlights.

The book is by the “father of MMT” Warren Mosler.

Reading him is a surreal experience because the extent of conventional misunderstandings of how money works at the macro level seems both comically and tragically disturbing. I’m not talking about his suggestions, which are totally contestable, but the revelation of how people in power don’t even understand the mechanics of banking and money.

Watching the crypto grifters “do macro” is even more ridiculous when you discover how many fallacies they’ve inhaled. Although the “taxation is theft” argument they spout is closer to the truth than not. This is not a vote for anarchy but the sentiment puts a finger on a strong feeling one gets as they read this short book: just how deeply coercive the contract between a government and its people is. That feeling is cemented as Mosler beats you over the head with a basic identity in a fiat monetary system: taxes have nothing to do with funding expenditures. Zero. (He actually shows this through an experiment you can do at home with your kids…and of course I will be doing it, muahahaha).

Anyway, I recognize that MMT is left-coded and the finance bro who describes himself as “socially liberal and fiscally conservative” would rather compliment a dude on his veiny calves than read a book by Mosler. So let’s see if I can pitch the book with the right hooks.

Here’s why you should read it:

  • The “sectoral balances” framework has been credited for the late British economist and MMT influence Wynne Godley’s forecasting track record.
  • Jesse, who is anything but an MMT fan, used that framework to write the prescient Upside Down Markets paper.
  • Mosler is a trader first. He is a fund manager whose successful arbitrage style of investing shaped his first-principles understanding of how the economy works. The book is short. The 7 “deadly frauds” are covered in about 50 pages and his career memoir comprises 25 pages. He catalogs a fascinating list of trades, including several pioneering bets on lira-denominated bonds, GNMAs, bond futures and cheapest-to-deliver pricing (there’s a neat side-story to those because they were a release valve for the Hunt Brothers’ silver squeeze profits, and subsequent COMEX governance shenanigans that floor traders are aware of). The rich trader to economist to (attempted) politician arc is not one you see every day.
  • The book is easy and fun to read. Depending on how resigned to cynicism you are, you will laugh as he recounts conversations with famous policy makers and politicians that reveal an unwillingness of prestigious people to get their hands dirty and learn. In some cases, they are quite open about the fact that they are captured by their audience. Incentives. It’s always incentives.
  • It’s worth commenting on Mosler’s leftism. It’s an easy box to put him in through the lens of how left and right are oriented on the chessboard today in common discourse. But if you take a longer view of orientations, his strong pro-markets/small government posture is a degree of sensible nuance that is easy to gloss over in an era where we forward articles after only reading the headline. I think his MMT-flavored recommendations sound like “big government” because his implicit demand is that it is impactful and assertive in a smaller but more focused and well-calibrated role.
  • You’ll walk away realizing how so much of macro is a series of “fallacies of composition” (he uses the paradox of thrift as a classic example which I always appreciate). In fact, the book makes macro seem quite easy. Not as in “I now get it”. I still have lots of questions. But you find that macro effects have a limited number of outcomes when you see them reduced to accounting identities. To be repetitive, you realize just how much most macro discussions are political ones because they deal with the “distributional effects”. The same macro outcome can map to many ways of slicing the prosperity pie in America and that’s really the stuff we argue about. We argue about it as if x or y is good or bad for the economy. By analogy, substitute “league” for economy: nobody cares what’s good for the NFL, they are really just wearing team colors. Maybe that was the most enduring takeaway from the book if you just read between the lines. Whenever you hear people argue macro, just smile and nod. Macro opinions are nothing but political mood rings. You can tell what color they’re feeling when they open their mouths.

A few admin notes about the book:

  • As I alluded to earlier, the book is 100 pages. The first half covers the 7 deadly financial frauds, the next quarter recounts Mosler’s investing career (he’s still active), and the final quarter of the book outlines policy proposals informed by his views. He ran for state office. I barely skimmed the last part.
  • The book was written in the wake of the GFC. Our problems today look like the opposite of what we were facing then but don’t worry we’re equally stupid.

Personal bias note:

I’ve admitted to my George-pilling (this was one of my favorite explorations of a topic in economics) and my preference for a land value tax to replace much of our overly complex and unjust tax system. I think Mosler’s economic thinking dovetails nicely with Georgist ideals and a first-principles understanding of economics.

One last thought:

While a deeply enjoyable read, it makes the world less joyful to look at. We are bombarded with innocent frauds* because we are lazy and it takes effort to learn things but no effort to parrot.

*An “innocent fraud” is a term by John Kenneth Galbraith that inspired Mosler’s title. It refers to misunderstandings that are “sustained by conventional wisdom”. Now you have a term for those crimes of laziness that necessitate yet another buzzy term: Brandolini’s law.

Learn Probability

Dave is a quant at Paradigm. He asked:

The thread is full of recs. I mentioned David Sklansky whose books were assigned reading at SIG 20 years ago. Gambling literature is going to be a great place to search since it will likely balance academic and applied considerations.

To that end I also recommended the OG website —> wizardofodds.com

For decades, they’ve been publishing the combinatorics on casino games and so much more. They even had a list of all the specific video poker machines that had positive edge (yes, there were some. At SIG there was a group that actually exploited this as a fun side extracurricular).

The host of the exceptional gambling podcast Risk of Ruin @halfkelly immediately recommended:

Harry Crane’s First Course In Probability

Others recommended video courses such as:

Statistics 110: Probability by Harvard University

Course in Bayesian Statistics via Virginia Commonwealth University

The Gasoline of the Internet

The internet exposes us to a wide array of perspectives, beliefs, and behaviors that we might not encounter in our immediate, offline environments.

We are outraged by this.

Maybe this is our lizard-brain threat detectors tuned to primitive local survival requirements. An existence where the number of people and places you will encounter in 50 years of life expectancy can be tallied on your fingers. If our ancient white blood cells see modern connectivity as an intruder, our system 2 reasoning can be used to restrain that impulse…”chill sentry, these other ways of being were always there, we just didn’t see them before”.

There’s a particular brand of discourse that recurs constantly on #fintwit’s personal finance channels. It goes like this:

Original poster: “I need this much to retire”

Someone is 100% to be offended and responds:

“No you don’t because [reasons]”

This is a surefire way to farm engagement. It’s a close cousin to the NY Times’ favorite style of finance article: “I make $1mm a year and can’t afford life”.

Here’s Ramit, whose personal finance content I usually like, playing the game:

Everyone’s entitled to their opinion. I find it difficult to get emotional or even conjure opinions on such a topic. It’s an impotent path of inquiry because it is absolutely crushed by the sheer magnitude of how our perceptions emerge from limitless forking paths of human experience.

I just short-circuit at normative questions of “what should be”. Feels arrogant to get riled up about what others think they need. We change. If you grow up with little and do well, your sights will get set higher. Is this bad? Seems like the wrong question. A misdirection from a more useful framing. And that framing is also a matter of opinion. But that’s one I have a view on.

To invite a personal perspective I’ll admit a simplistic desire. When it comes to what I want to afford, I wanna live in a nice town in CA and not have to count my pennies in the course of a comfortable existence. One in which a special occasion stay at a 5-star hotel doesn’t require a separate savings account. But I don’t expect to have a yacht or live-in maid.

At the same time, I don’t pretend we can’t have a lost decade or 2 of returns. It might be low probability but you don’t need to catastrophize to think it’s part of the distribution. Relying on even historical assumptions of investment returns feels uncomfortably fragile to me. Life is a single draw. Relying on non-stationary averages is building a house like the first 2 little piggies.

[Note: This asset management marketing pitch called “evidence-based investing” offers useful heuristics but if it over-loads your “extrapolate from the past” muscle it’s a Trojan Horse.]

My no-shortcuts belief:

Find the mix of work and expectations that give you a chance of getting to a number with a very high margin of safety which is always going to look like a big number. But you don’t make your happiness contingent on getting there.

My rebuttal to all these “you can retire on as little as X” takes is: sure, you can go live in Egypt too for even less. Maybe I need X because I be living like Y. And it takes a lot of arrogance to yuck people’s Ys. It’s possible they have distorted goggles on but, unless they want help, to assume they are broken feels like a strange position. The American who needs 20mm to retire vs the person who needs 1mm.

Does it really take that much effort to imagine either person without thinking one of them is crazy? You won’t unsee it once I say it — arguing over ranges where reasonable people can disagree is the gasoline of the internet.

The whole “what you need” discourse is a distraction from the plain truth: there is an “arrival” fallacy at play. It’s very well documented across the board. Win the championship, celebrate one night, then feel let down. Living in the future (or past) sucks. Never overdose on hope or nostalgia.

So that bloke that gets to 15mm and won’t stop “until he gets 25” is probably wrong on thinking there’s an endpoint.* But calling him out for his mentality is judgmental at best and wrong at worst. Maybe that mentality was a prereq to getting to 15 in the first place. Every strength is also a weakness (or so I tell myself when I forget to buy milk because I was daydreaming).

*I don’t think about a “number” because if you hit it you still have to contend with what you’d do with your time. You should stop pretending that is a problem you acquire when you “arrive”. You have that problem right now.

Instead, we can choose grace and not think unusual people need fixing (again unless the person is seeking help). Should posters just write a boilerplate disclaimer: “I realize it’s gauche to say I want 25mm to retire and any reasonable person could retire on less but…”? Do they need to splay themselves on the altar of “be considerate to the average human” before speaking?

There are subgroups of people that, for better or worse, have their own standards for how they want to spend their revolutions around the sun. We could simply choose rules of charitable engagement to assume that default belief instead of compressing variation. When someone posts on r/FatFire they are talking to a selected group of mutuals. You know what’s going to happen when you paste that in the internet’s town square.

My wife and I come from middle to lower class families. Yet, because of our professions, we have seen a lot of how the other half lives. And we’re part of the other half as our childhood selves would conceive of it. I thought any kid with a GI Joe aircraft carrier was rich. My wife’s “this person must be rich” moment was when she had dinner at a friend’s house and they had Ranch dressing. I only have to drive around my town to know there’s “another half” compared to our current perspective. And to them there’s “another half” flying private everywhere. You get the point.

Not relating to someone else is not news. It’s what should be expected because of how wide the range is. How can something that shouldn’t surprise you get you fired up? The whole discourse is low-brow projection.

Here’s an experiment to undo this impulse. Consider the Guinness Book. Longest fingernails? Is there any point in trying to relate to someone who goes for that? We don’t try because it’s absurd. The range of humanity is blindingly obvious when you turn those pages.

But when people start talking about money (as opposed to their time, which Guinness people used in wild ways) we somehow think that should be more relatable. Resist that illusion.

You don’t need to worry about what everyone wants. These are not matters of right and wrong (vs the boring problem of someone making the equivalent of a counting mistake in describing the how of their money pursuits).

You need $100mm, cool. You wanna live in a van. Right on. For all of you, I hope your expectations are met.

But I don’t think that’s how anything works. Solving for the best way to be useful and happy is a lifelong endeavor. It’s not where you are gonna eventually plant your flag it’s how you carry its weight every day.

It’s true and forgiving to recognize that a lot of people don’t have the luxury of the thought — but what’s much worse is how many don’t have the nerve.

Ratio’d

Last week in Breakpoints, the discussion was about measuring implied skew.

A common measure and the one we use in moontower.ai is normalized skew which computes the percent premium or discount of IV at the 25d strike vs the 50d strike.

It’s not a measure that lends itself to direct interpretation. If 50d IV is 30% and the 25d put is 36% that’s a normalized skew of 20%. It doesn’t mean anything on its own but it is useful to see if skew is relatively or historically high or low. You chart it as a time series or percentile the value on a 1 or 2 year lookback. You can compare skew cross-sectionally across correlated assets.
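For the spreadsheet-averse, here is a minimal sketch of that arithmetic plus the “percentile it on a lookback” step. The function names and the lookback values are made up for illustration; they are not moontower.ai’s implementation.

```python
def normalized_skew(otm_iv, atm_iv):
    """Percent premium (+) or discount (-) of a wing IV versus the 50d IV."""
    return otm_iv / atm_iv - 1.0


def percentile_rank(history, value):
    """Share of past observations at or below `value` (0.0 to 1.0)."""
    if not history:
        raise ValueError("history must not be empty")
    return sum(1 for x in history if x <= value) / len(history)


# The example from the text: 50d IV of 30% and a 25d put IV of 36%.
skew_today = normalized_skew(0.36, 0.30)

# Hypothetical lookback of past normalized skew readings (made-up numbers).
lookback = [0.08, 0.11, 0.14, 0.17, 0.22]
rank_today = percentile_rank(lookback, skew_today)

print(f"normalized skew: {skew_today:.1%}, percentile: {rank_today:.0%}")
```

In practice the lookback would be a 1 or 2 year daily series per name, and you would run the same function across correlated assets for the cross-sectional view.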

Skew, or any measure, can be attacked from any number of angles. Our single measure of normalized skew itself requires choosing tradeoffs. The last post addressed the biases of various breakpoints. Moneyness, standard deviation, and delta-relative are all common ways to fix the gridpoints.

Today, we’ll use an approach that many might find more intuitive — thinking about skew in terms of option premiums instead of implied vol. When we look at option chains we are looking at prices. When we trade options our p/l depends on how the premiums change. For many investors, premiums are a more natural way to think about options than IV.

We will use GME to demonstrate a number of ways to think about skew which are more tightly intertwined with how skew trades are expressed — through verticals and ratio’d verticals.

We can even turn the metrics into a simple oscillator based on arbitrage bounds. If I do my job right, this post will make the concept of skew more concrete and inspire you to track it in new ways.

I started this GME case on a data exploration lark. Baycrest option strategist David Boole said that the call skew on the latest rally surpassed even the 2021 craziness. I wanted to look myself but I didn’t want to just look at normalized skew by delta.

Because I knew the call skew was so fat I was a bit uneasy about the recursive nature of delta-relative gridpoints. The last post uses a concrete example to demonstrate how option vanna causes the delta of an option to change with the IV which muddies the answer to “what did skew do today?”. Truthfully, this is a nerdsnipe for non-vol traders, but as a vol trader it’s bothersome enough that I wanted to choose a different tradeoff.

I opted for a standard-deviation relative surface for the study instead.

Let’s step through it.

1) Pull end-of-day GME data from 1/4/21 to 6/14/24

In particular:

  • .50 delta IV for the option closest to 30 day expiry (actual expiries ranged from 27 to 32 DTE)
  • The breakpoints that correspond to 1 and 2 standard deviation OTM upside strikes estimated using .50 delta IV and actual DTE.
  • Call premiums at the closest strike to the breakpoint subject to some error tolerance (ie the strike needs to be within 10% of the breakpoint IV but that 10% is scaled to IV. If IV is only 30% then a 10% divergence from strike to breakpoint is not acceptable but if IV is 250% then it’s ok). If no such strikes were listed the day is omitted.
  • The vega of each option. I also estimate the vega of the theoretical ATM call using the approximation .4*S√t
  • Call premiums are normalized by measuring them as a percentage of the stock price.
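The data prep above can be sketched roughly as follows. Assumptions on my part: breakpoints are set in log-return space (strike = spot × e^(n·IV·√t)), the tolerance rule is modeled with a hypothetical `tol_per_vol` parameter that scales the allowed strike-to-breakpoint gap by IV, and the spot, IV and strike list are invented. The .4*S√t vega is per 1.00 of vol, so divide by 100 for vega per vol point.

```python
import math


def sd_breakpoint(spot, atm_iv, dte, n_sd):
    """Upside strike that sits n_sd standard deviations OTM (log-return space)."""
    t = dte / 365.0
    return spot * math.exp(n_sd * atm_iv * math.sqrt(t))


def closest_strike(breakpoint, strikes, atm_iv, tol_per_vol=0.04):
    """Nearest listed strike, or None if it is too far from the breakpoint.

    tol_per_vol is a hypothetical knob: the allowed percent gap between
    strike and breakpoint grows with IV (0.04 * 250% IV = 10%).
    """
    best = min(strikes, key=lambda k: abs(k - breakpoint))
    if abs(best / breakpoint - 1.0) > tol_per_vol * atm_iv:
        return None
    return best


def approx_atm_vega(spot, dte):
    """ATM vega approximation .4*S*sqrt(t), per 1.00 (100 points) of vol."""
    return 0.4 * spot * math.sqrt(dte / 365.0)


def premium_pct_of_spot(premium, spot):
    """Normalize an option premium as a fraction of the stock price."""
    return premium / spot


# Made-up snapshot: $25 stock, 150% .50 delta IV, 30 DTE.
spot, atm_iv, dte = 25.0, 1.50, 30
one_sd = sd_breakpoint(spot, atm_iv, dte, 1)
two_sd = sd_breakpoint(spot, atm_iv, dte, 2)
listed = [30.0, 35.0, 37.5, 40.0, 50.0, 60.0]
print(one_sd, closest_strike(one_sd, listed, atm_iv))
print(two_sd, closest_strike(two_sd, listed, atm_iv))
print(approx_atm_vega(spot, dte))
```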

2) Chart the 1 and 2 std dev OTM call premiums as % of stock price.

The .50 delta IV is on the left-axis. We can see the recent spike relative to the early 2021 spike. We can also see how the 1 and 2 st dev OTM calls explode in value.

But we expect option prices to rip when vol explodes.

Skew is an attempt to say something about how the relative value between options of the same expiry changes. So far there is no notion of skew.

That sounds like a job for tracking a vertical spread as percentage of the stock price.

3) Chart the 1 SD / 2 SD call spread.

Hmm…this feels unsatisfying. The call spread is also spiking with the call values. We’re not learning much from displaying the spread.

There’s a good reason for that.

These call spreads, like outright calls, have positive vega. As vol increases, OTM call spreads increase in value.

Instead let’s look at the 1×2 ratio call spread.

4) Chart the 1 SD / 2 SD ratio call spread (2 further OTM calls vs a single 1 SD call)

Ahh, now we are seeing the value of the 1×2 decline on spikes in vol. In other words, a structure that is long 1 OTM option and short 2 further OTM options is losing value when vol roofs. It’s hard to say what’s driving this however.

As IV increases, OTM options gain vega. Not only do the options go up in value, but they become more sensitive to IV. Every uptick in IV increases the option value more than the prior uptick, as if your position were growing! This is the “gamma of vol” or vol convexity.

At an extremely high level of vol, all OTM options approach .50 delta. The vega of options of different strikes will converge.

Assuming you own the lower strike and short 2 further OTM strikes, the 1×2 starts as a long vega trade from low levels of volatility. But it is short vol convexity or “gamma of vol”. At crazy high vols, the vegas converge and your net greeks converge to the net amount of options in your position — in this case you are net short 1 option. Your position vega, or sensitivity to IV, is now negative. So as vol spikes, the 1×2 loses value.

(Another way to think of this is that a 1×2 is equivalent to being long 1 call spread plus being short an extra option. At high enough vols the call spread’s net vega is a wash and you are just short an option.)
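To see the vega flip in numbers, here is a rough Black-Scholes sketch. It assumes zero rates, a flat vol across strikes and fixed dollar strikes (the study itself uses SD-relative strikes that move with vol), and the spot and strikes are hypothetical. It is only meant to show the sign of the 1×2’s net vega changing as vol gets silly.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _d1(spot, strike, vol, t):
    total_vol = vol * math.sqrt(t)
    return math.log(spot / strike) / total_vol + 0.5 * total_vol


def bs_call(spot, strike, vol, t):
    """Black-Scholes call value with zero rates and dividends."""
    d1 = _d1(spot, strike, vol, t)
    d2 = d1 - vol * math.sqrt(t)
    return spot * norm_cdf(d1) - strike * norm_cdf(d2)


def bs_vega(spot, strike, vol, t):
    """Change in value per 1.00 (100 vol points) change in vol."""
    return spot * norm_pdf(_d1(spot, strike, vol, t)) * math.sqrt(t)


def one_by_two(spot, k_near, k_far, vol, t):
    """Value and vega of long 1 near call, short 2 far calls (flat vol)."""
    value = bs_call(spot, k_near, vol, t) - 2.0 * bs_call(spot, k_far, vol, t)
    vega = bs_vega(spot, k_near, vol, t) - 2.0 * bs_vega(spot, k_far, vol, t)
    return value, vega


SPOT, K_NEAR, K_FAR, T = 100.0, 110.0, 120.0, 30 / 365  # hypothetical inputs
for vol in (0.20, 0.50, 1.00, 2.00, 3.00):
    value, vega = one_by_two(SPOT, K_NEAR, K_FAR, vol, T)
    print(f"vol {vol:>4.0%}  1x2 value {value:7.2f}  net vega {vega:7.2f}")
```

At the low end the structure is long vega; somewhere on the way up the net vega crosses zero and the position behaves like the short option it nets to.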

The picture above makes it hard to disentangle skew changes (ie the relationship between the 1 and 2 SD strikes). It is confounded by the changing vega of the structure.

Let’s back up and simplify for a moment.

We can compare the ATM call with the 1 SD OTM call so we aren’t trying to parse skew changes across 2 OTM options. And we’re going to control for vega itself!

5) The vega-neutral call spread

Instead of fixing a ratio such as 1×2, we will stipulate that our call spread must be vega-neutral. To do that we simply solve for the ratio that makes the vega of the structure zero. In other words, we count how many OTM calls we need to short for each ATM call we are long.

The fewer options we need to short, the steeper the skew must be! In an infinite vol situation, all calls go to their maximum value — the stock price itself. Which means the call spread is worth zero (since all the calls are the same price) and each call has the same vega (which is weirdly zero — at infinite vol, changing vol by one point isn’t going to change the option value).

I like using extremes because they establish arbitrage endpoints from which to reason backwards. In a high but not infinite vol situation, the ATM and OTM calls will not be equal. But perhaps the ATM call vega is just a bit higher than the OTM call. In that world, you only have to sell slightly more than 1 OTM call to be vega-neutral.

Repeating — the lower the ratio of the vega-neutral spread, the steeper the skew.
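The ratio itself is just long vega divided by short vega. A small sketch, assuming Black-Scholes vegas with zero rates and hypothetical inputs (a $100 stock, a 115 strike standing in for the 1 SD call, 30 DTE). It compares a flat surface with one where the OTM call carries a fatter IV.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def bs_vega(spot, strike, vol, t):
    """Black-Scholes vega (zero rates), per 1.00 change in vol."""
    total_vol = vol * math.sqrt(t)
    d1 = math.log(spot / strike) / total_vol + 0.5 * total_vol
    return spot * norm_pdf(d1) * math.sqrt(t)


def vega_neutral_ratio(spot, k_long, iv_long, k_short, iv_short, t):
    """Number of short options per long option that zeroes out net vega."""
    return bs_vega(spot, k_long, iv_long, t) / bs_vega(spot, k_short, iv_short, t)


T = 30 / 365
# Hypothetical surface 1: ATM and OTM call both at 50% IV (no call skew).
flat = vega_neutral_ratio(100.0, 100.0, 0.50, 115.0, 0.50, T)
# Hypothetical surface 2: same ATM IV, OTM call bid up to 70% IV.
steep = vega_neutral_ratio(100.0, 100.0, 0.50, 115.0, 0.70, T)

print(f"flat surface ratio:  {flat:.2f}")
print(f"steep surface ratio: {steep:.2f}")
```

The steeper surface needs fewer short calls to flatten the vega, which is the whole idea behind reading the ratio as a skew measure.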

Here’s the chart.

Remember, this is a vega-neutral ATM/ 1 SD call spread. In the recent spike, the OTM calls were so jacked compared to the ATM that selling 11 calls for every 10 you bought would have been vega-neutral!

Let’s reproduce the chart but shift the strikes to 1 SD vs 2 SD.

The picture is similar but the ratio to make the spread vega-neutral is more volatile. (That 2 sd call is also noisier because when the premium is only say 1% of the stock price, slightly leaned or errant marks matter more.)

Extra: A 1×2 indicator

Tracking the ratio of further OTM options needed to make a spread vega-neutral is an alternative way to track skew. Like normalized skew it reduces vol artifacts but because it maps to option premium as a percent of the stock price it feels more interpretable. “Wow, the skew is so high I can sell 50% fewer options to finance the same long premium” or “I can own 3 OTM calls for the price of 1 ATM”.

In practice, the option markets tend to coalesce around some common structures. The 1×2 vertical spread is an easy ratio to keep in mind. It becomes like a tool in the trader quiver…they might look at a surface with low or high skew and gravitate to “how’s the 1×2?”

If we fix the ratio as 1×2, accepting how its shorthand does conflate vol and skew effects a bit, we can create an indicator that has the same shape but lives on an oscillator — it’s bounded by 0 and 1.

To do that think of the extremes.

a) The most a 1×2 ratio call spread can be worth (from the perspective of owning the 1 and shorting a ratio of the further OTM) is the premium of the 1.

Example:

A stock is $50 and the 1 sd strike is $55 and the 2 sd strike is $59. If the 55 call is worth something and the 59 call is worthless, then the 1×2 is simply worth the value of the 55 call.

b) The least a 1×2 ratio call spread can be worth is the negative of the premium of the 1.

Example:

Vol is outrageously high. There is little difference in premium between OTM strikes. In other words, the call spreads are worth very little. Much like GME recently where the 55 and 60 strikes were almost the same value. We’ll be dramatic and say both calls are worth $1. The 1×2 is worth -$1. If you buy the 55 call and sell 2 60 calls, you’ll collect a $1 credit. The credit cannot be larger than this since the 55 call cannot be worth less than the 60 call.

With an upper and lower bound on the value of the ratio we can simply compute the value of the 1×2 in relationship to its range.
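Spelled out, the oscillator is the 1×2’s value placed within its arbitrage range. A minimal sketch with a made-up function name and made-up premiums:

```python
def one_by_two_oscillator(near_premium, far_premium):
    """Where the 1x2 (long 1 near call, short 2 far calls) sits in its range.

    Upper bound: +near_premium (far call worthless).
    Lower bound: -near_premium (far call worth as much as the near call).
    Returns a value between 0 and 1 when 0 <= far_premium <= near_premium.
    """
    if near_premium <= 0:
        raise ValueError("near_premium must be positive")
    value = near_premium - 2.0 * far_premium
    upper = near_premium
    lower = -near_premium
    return (value - lower) / (upper - lower)


# Case (a): near call worth $2 (hypothetical), far call worthless.
print(one_by_two_oscillator(2.0, 0.0))
# Case (b): both calls worth $1, the dramatic example from the text.
print(one_by_two_oscillator(1.0, 1.0))
# Something in between: far call worth half the near call.
print(one_by_two_oscillator(2.0, 1.0))
```

The algebra collapses to 1 minus the ratio of the far premium to the near premium, so low readings mean the far strike is priced almost like the near one.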

Here’s a time series of the GME 1×2 oscillator:


Wrapping up

The point of this post was to provide more angles to rotate the idea of skew in your head. In the process, I hope I was able to convey how implied volatility influences option prices both absolutely and relatively.

As GME goes, the skew does in fact look like it climbed higher than it did in 2021. It’s most noticeable in the simple ATM/ 1 SD vega-neutral spread.

But the peak of the skew wasn’t that much higher than the peak in 2021, suggesting the market adapted pretty quickly the first time around. After all, option market makers presumably learned “total nonsense is possible”. They had the benefit of the 2021 experience to draw from in setting curves and didn’t push them too much further than they did back then.

The craziest event is always in the future, but it’s not unreasonable to reference the GME case as a point of comparison the next time skew explodes in a name and you are wondering “how ridiculous is this situation compared to the Roaring Kitty meme sheets?”


Food for thought

About 20 years ago, as a still junior option trader, I interviewed for an options trading role at a fund chaired by Myron Scholes. They were called Platinum Grove iirc. LTCM lineage. I wasn’t smart enough to work there. A fortuitous miss because I think they got blown out in 2008. I don’t know for sure so don’t quote me. (I can speak more freely in the paid letters but I don’t want to offend or misrepresent unfairly either. This is just what I remember and I didn’t care enough to verify.)

Anyway, one of the pre-screen questions was how can you construct a market-neutral long vol convexity position?

The answer they were looking for was a ratio iron fly. Assuming a “typical” vol surface, you can buy about 1.4 25d strangles for each ATM straddle you sell. The position will be flat vega but:

a) as IV falls you get shorter vol

Think of the extreme where IV falls to something like 5%. The strangle you own is worthless and you are short a straddle. Your vega is short. As IV falls, your vega falls. You are long “vol gamma”.

b) as IV increases you get longer vol

The straddle goes up in price but it doesn’t gain vega. ATM (technically ATF) straddles are already at maximum vega! But the strangles you are long gain vega, so as vol increases they start gaining value at a faster pace than the short straddles hurt you.

A ratio iron fly is equivalent to a ratio call spread + a ratio put spread. If you widen the strikes from 25d to say 10 delta maybe it’s a 1×2 call spread + 1×2 put spread. By tracking the prices of specific structures normalized to the stock price you can get a sense for how the vol surface is behaving without knowing the IV on the strikes themselves.

You will still need some concept of vol to measure the distance of strikes from one another, whether it’s delta or standard deviation.

You can also fix the price of structures and invert the questions — “how far apart are the strikes I need to construct a zero-cost collar” or “how far apart are the strikes that make the 1×3 costless”? Then the distances become values you can track in your analytics.

The more you can apply a familiar lens to various opportunities the more you can build a mental pattern-matching library for what looks “off”.

It might sound salesy but this is very much why the moontower.ai approach is so dear to me. Once I left the desk, I felt blind because I was so accustomed to seeing markets from a lens that efficiently filtered what’s normal from abnormal amidst all the noise. As a discretionary trader, it was the ladder to the diving board. There were still steps to take before you jumped but most of the effort was handled in the canned measures.

moontower.ai is building to recover my sight. In the process, we can give other option users the same vision regardless of what their objectives are.

Breakpoints

I caught David Boole’s segment about GME on CNBC because it was on Twitter. David used to cover me back when I showered every day. I agree with all of the framing.

He mentioned that the call skew in GME was higher this time around than back in early 2021. This made me want to look up the data but also prompted me to measure skew differently. But the “how to measure implied skew” question was a ball bouncing in my head already.

After publishing Scatterplot Gallery, Dave (not David Boole but an options market-maker) responded:

Dave is right. Sell side research likes normalized skew as a measure. So do I. We use it in moontower.ai. It’s how I looked at skew during my days at the fund and it’s the parameter I used at the spline points for my vol surfaces. I built up an intuition for those ratios over time per name. It’s also easy to compute.

Normalized skew is a simple ratio of the volatility at an out-of-the-money point on the curve to the at-the-money volatility. It is common to measure skew at the 25 and 10 delta strikes both on the upside and downside of the volatility surface and use the 50 delta option to normalize (note the 50d strike is often but not always the at-the-money strike).

For example, assume:

Strike: 50 delta, IV: 28%

Strike: 25 delta put, IV: 32%

Normalized skew = OTM volatility/ ATM volatility – 1

Normalized skew = 32%/28% – 1 = 14.3%

The 25d put is trading at a 14% premium to the ATM volatility.

But…

Dave is CORRECT.

The measure doesn’t really make sense. It’s useful because it normalizes skew to ATM vol, but the measure often has a non-linear relationship itself to the vol. OTM options are sticky at the extremes of vol — so normalized skew flattens when vol gets high and steepens when it gets very low. So if you wanted to know if skew is “high” or “low” you still might want to condition it on vol level. Which of course, negates some of the benefit of normalizing it in the first place.

In general, it’s safe to use because it will correlate strongly with alternative measures of skew — it still captures “high” or “low”.

But unless you are a vol trader accustomed to how that parameter maps to prices because you see it in a model every day next to option premiums, it’s abstract.

Measuring surface and skew changes is a big topic in setting vol surfaces (or sheets if you’re “book a colonoscopy years old”).

Today, we’ll discuss skew models which will serve as background for next week’s paid post where we:

  • look at another measure of skew in the spirit of Dave’s comment
  • apply it to GME and David Boole’s comment

For both posts, we will be visual and lean towards simple.

Skew Models

Skew models allow traders to parameterize a vol surface. In other words, fit an IV curve to a discrete chain of strikes. From cubic splines to the vanna-vega-volga model, you can gorge yourself on as much complexity as you want.

Us knuckle-draggers who think an ecole is something you get from bad burger meat are just going to throw a line through some points. This is an example from the famous TT software:

Vol Curve Manager overview | Vol Curve Manager Help and Tutorials

The vol curve requires:

ATM or ATF VOLATILITY

Implying an ATM vol from the market or setting your own:

BREAKPOINTS

Finding the IV of the strike at various moneyness levels, SD’s (standard deviation points), or deltas. Any of these measures is an attempt to measure the distance from the ATM strike to the “breakpoint” (ie 25 delta or 1 SD). Careful, moneyness is not a normalized measure. If a strike is 5% OTM, it’s much further away in a 10% vol name than a 50% vol name.
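A quick sketch of why moneyness is not normalized. It converts the same 5% OTM strike into standard deviations for a 10% vol name and a 50% vol name, assuming a one-year horizon (my assumption, the text doesn’t specify one) and log-return distances.

```python
import math


def sd_distance(moneyness, vol, t_years):
    """Distance of a strike from spot in standard deviations (log space)."""
    return math.log(1.0 + moneyness) / (vol * math.sqrt(t_years))


low_vol_name = sd_distance(0.05, 0.10, 1.0)   # 5% OTM in a 10% vol name
high_vol_name = sd_distance(0.05, 0.50, 1.0)  # 5% OTM in a 50% vol name

print(f"10% vol name: {low_vol_name:.2f} SDs away")
print(f"50% vol name: {high_vol_name:.2f} SDs away")
```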

Example

The pictured model has 3 SD points above the ATM strike and below the ATM strike. Beyond 3 SD’s there will typically be a linear slope coefficient that fits the tails. That IV can then be parameterized by its ratio or spread to the ATM vol. So if the 25d put is 33% and the ATM put is 30% we are running a 25d put skew of 3 vol points or a ratio of 110%. As we’ll see, there’s no right way, just trade-offs.

Comments

  • The goal is to fit the market snugly but without creating arbitrages in the option premiums due to weird kinks. These curves interpolate vols for the strikes in between the breakpoints. Market makers play whack-a-mole with kinks that get out of line.
  • Deal stocks or other assets with idiosyncratic behavior around specific price levels (think of how coal-switching puts a floor on nat gas prices) usually don’t have bell-shaped distributions. It’s hard to fit curves to them. Wrong tool for the job. That’s what Excel and some common sense is for.
  • As mentioned earlier moontower.ai uses normalized skew by delta

On modeling skew changes

  • “Sticky strike” refers to a model that assumes strike vols (vols at specific dollar strikes) stay fixed. This is reasonable over short horizons or smaller moves. It’s unlikely to hold over the time frame in which an OTM 30 strike put becomes far ITM
  • “Sticky delta” means we expect IVs at various deltas to maintain a stable relationship with the ATM volatility. This is an implicit assumption if you use normalized skew — “I see that the 25d put seems to persistently trade at 110 to 115% of ATM vol”

There are hybrids and permutations. In a professional setting, that gamut ranges from highly bespoke proprietary models to a wide array of vendor software. I’ve used many types of models. Spot-vol correlation parameters were much more widely used in my second decade of trading. I’ve even used flat sheets, no skew at all. I’d just have a mental log of how far from flat sheets options of a particular distance from ATM would trade. I’ve used models that don’t have vols — everything is handled in price space. The models themselves were never a source of edge. But if a scale is consistently adding 2 pounds it’s still useful for comparison as long as you weigh everything with that scale.

This is why normalized skew is fine for cross-sectional comparison. It answers the question you’re interested in even if it’s hard to interpret on its own. The truth is measuring skew is like trying to pinpoint a firefly from its sporadic bursts of light.

Consider a made-up $20 stock with a 6-month IV of 40%. I fit a dumb skew model to it (it’s an Easter egg for a certain group of people, all of whom likely have reading glasses by now).

Things to notice:

1) Standard deviations are computed using 40% ATM vol. Those computations are explained in Using Log Returns And Volatility To Normalize Strike Distances.

2) Let’s look at the $24.50 strike. We can describe that strike in many ways:

  • 22.5% OTM (moneyness breakpoint)
  • .25 delta (delta relative breakpoint)
  • .72 SD’s OTM (standard deviation breakpoint)

3) We can describe its skew:

  • 3.9 clicks below ATM vol (spread relative)
  • 9.8% discount to ATM vol (ratio relative aka normalized skew)
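All five descriptions of the $24.50 strike can be reproduced in a few lines. This sketch assumes zero rates, a Black-Scholes call delta computed with the strike’s own 36.1% vol, and SD distance measured in log space with the 40% ATM vol. The function name is mine.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def describe_strike(spot, strike, strike_iv, atm_iv, t_years):
    """Express one strike in moneyness, SD, delta, spread and ratio terms."""
    sqrt_t = math.sqrt(t_years)
    total_vol = strike_iv * sqrt_t
    d1 = math.log(spot / strike) / total_vol + 0.5 * total_vol
    return {
        "moneyness": strike / spot - 1.0,
        "sd": math.log(strike / spot) / (atm_iv * sqrt_t),
        "call_delta": norm_cdf(d1),
        "vol_spread": strike_iv - atm_iv,
        "normalized_skew": strike_iv / atm_iv - 1.0,
    }


# The made-up $20 stock: 6 months to expiry, 40% ATM vol, 36.1% at the 24.50 strike.
desc = describe_strike(20.0, 24.5, 0.361, 0.40, 0.5)
for name, value in desc.items():
    print(f"{name:>16}: {value: .4f}")
```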

Like good little option taxonomists we can say lots of little things about our friend the $24.50 strike.

But now we shall kill him.

Some hedge fund wiseguy with an S&M fetish puts on a giant collar — buys puts and sells calls. Ravages the skew.

The ATM vol stays the same, the collar is vega-neutral. But we’ll assume the 24.50 call’s fixed strike vol drops from 36.1% to 32.3%, nearly 4 full clicks.

What can we say about this option?

  • Well, it’s still 22.5% OTM.
  • It’s still .72 standard deviations away since the ATM vol hasn’t changed (assume it happened quickly, so not much time elapsed).

The normalized skew has changed. The strike now trades at a 19.3% discount to the ATM vol (32.3/40 – 1).

That’s important. It tells us our measure is working — the call got hammered relative to the ATM and the discount got wider. Basic counting still works.

But we have a small problem.

If we parameterize our surface by delta we have a “nail Jell-O to the wall problem”. By virtue of the vol coming in, the delta at the strike also falls (this is also known as “vanna” — the Commander-in-Chief of greeks — getting both too much credit, blame, and airtime relative to its efforts).

The $24 strike is now the .25 delta call. Its vol is 32.9%. If you are looking at skew changes using delta relative parameterization you will see skew fall from -9.8% to -17.8%. A steep drop, but not quite the drop to -19.3% you would see with standard deviation relative or fixed strike relative breakpoints.
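
The Jell-O problem can be reproduced with a plain Black-Scholes call delta. This is a sketch that assumes zero rates and no dividends (an assumption, not something stated in the post), and `call_delta` is a hypothetical helper name. Under those assumptions the $24.50 call starts near a .25 delta, loses delta when its vol drops, and the $24 strike at 32.9% vol becomes the new .25 delta call.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_delta(spot, strike, vol, years, rate=0.0):
    """Black-Scholes call delta with no dividends (rate defaults to zero)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    return norm_cdf(d1)

SPOT, YEARS, ATM_VOL = 20.0, 0.5, 0.40

delta_before = call_delta(SPOT, 24.50, 0.361, YEARS)      # before the collar
delta_after = call_delta(SPOT, 24.50, 0.323, YEARS)       # same strike, lower vol
delta_new_strike = call_delta(SPOT, 24.00, 0.329, YEARS)  # the new .25 delta call

# Normalized skew seen through each lens after the collar
fixed_strike_skew = 0.323 / ATM_VOL - 1.0
delta_relative_skew = 0.329 / ATM_VOL - 1.0

print(f"24.50 call delta before: {delta_before:.3f}")
print(f"24.50 call delta after:  {delta_after:.3f}")
print(f"24.00 call delta after:  {delta_new_strike:.3f}")
print(f"fixed strike skew: {fixed_strike_skew:.2%}, "
      f"delta relative skew: {delta_relative_skew:.2%}")
```

The delta-relative measure understates the move because the .25 delta label has slid to a closer strike with a higher vol.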

Summary:

 

Pictures are always better for these things:

The nuances of tracking skew changes are to finance what Equus Erotica is to sex. Like 40 people care.

Other sources

I’m not at liberty to blast it out but this was a widely-read piece on measuring skew:

You can also see Colin Bennett’s free book. My notes are here.

Just remember…

  • The delta relative breakpoints are recursive in that the strike vols changing alters the strike deltas.
  • SD relative breakpoints slide around with the ATM vol and time passing.
  • Moneyness breakpoints jump around with every tick in the stock.

Whatever you pick, just be consistent and understand its biases. And when I say “you” I mean pros. If you are anyone else losing sleep over this, you’ve lost the script.

Options stuff is fun, “bicycle-for-the-mind” and all that. Just don’t think you need to know this. There’s no money printer at the end of this rainbow. It’s mostly useful for managing the risk of large option portfolios — the less stuff you trade the more irrelevant this stuff is.

You’ll go far in life if you can just be good enough to remember your hole cards and not need to check’em again to see if 3-7 offsuit is playable.

Narrating An Option Trade

I’ve been narrating my small GME trade this week through this substack and twitter.

On Friday I rolled my long June 20 calls up to the 25 strike. I narrated my thinking on twitter but I’ll re-print it here. It’s a combination of real-time thinking and some meta-thoughts about trading as well.

Sharing my monkey thoughts as i mess around in GME… I’m long that 20 lot of June 20/30 call.

Rolling the position

Despite the stock being down today, the 30s are eroding as vol is declining so the spread actually upticked in value.

I’m also looking at the june 20/25 call spread:

From my IB montage

The spread value has increased a lot. The vega on these options is small but not totally negligible. Look at the IV spread…it’s fallen 14 points today on a spread with a penny of vega – that’s a 14c rally in the spread on a delta neutral basis! Since the spread is only .23 delta, the vol change has kept the spread little changed despite the stock move.

I decided to roll my long 20s into long 25s.

So I sold the June 20/25 cs. I was filled at $3.71

I was long the June 20/30 cs from $2.08 but collected $3.71 on the 20/25 cs.

=> On balance, I’m left long the 25/30 cs for a $1.63 credit.
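
The roll arithmetic can be sketched in a few lines of Python. The prices and the 20 lot size come from the post; the helper names are made up, and the expiry P/L ignores commissions, early exercise, and financing (assumptions for illustration).

```python
MULTIPLIER = 100   # shares per option contract
CONTRACTS = 20     # the 20 lot from the post

# Net the legs: +1 is long a call at that strike, -1 is short
legs = {}
for strike, qty in [(20, +1), (30, -1),    # original long 20/30 call spread
                    (20, -1), (25, +1)]:   # sold the 20/25 call spread
    legs[strike] = legs.get(strike, 0) + qty
net_legs = {strike: qty for strike, qty in legs.items() if qty != 0}

paid_20_30 = 2.08
sold_20_25 = 3.71
net_credit = sold_20_25 - paid_20_30       # cash collected per share after the roll

def call_spread_value(stock, low, high):
    """Expiry value of a long low/high call spread."""
    return max(0.0, min(stock, high) - low)

def position_pnl(stock):
    """Expiry P/L in dollars for the whole position."""
    per_share = net_credit + call_spread_value(stock, 25, 30)
    return per_share * MULTIPLIER * CONTRACTS

print("net legs:", net_legs)
print(f"net credit per spread: ${net_credit:.2f}")
for stock in (20, 25, 27.5, 30, 35):
    print(f"stock at {stock}: P/L ${position_pnl(stock):,.0f}")
```

Under these assumptions the position cannot lose money at expiry: the credit is banked and the remaining 25/30 call spread is worth between $0 and $5.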

 

Thinking behind the roll

The roll was driven by a sense that the risk-reward on the 20/25 cs at $3.71 is not as great as owning the 25/30 for a m-t-m level of $1.60.

I’m synthetically “selling the 20/25/30 fly” in my reasoning.
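
To see why this is a fly, compare expiry payoffs: a long 20/25 call spread minus a long 25/30 call spread has the same payoff as a long 20/25/30 call butterfly. Preferring the 25/30 over the 20/25 is therefore a short fly view. A minimal sketch with hypothetical helper names; the implied fly price simply takes the $3.71 fill and the $1.60 mark at face value.

```python
def call_payoff(stock, strike):
    return max(0.0, stock - strike)

def call_spread_payoff(stock, low, high):
    """Long the low-strike call, short the high-strike call."""
    return call_payoff(stock, low) - call_payoff(stock, high)

def butterfly_payoff(stock, low, mid, high):
    """Long one low call, short two mid calls, long one high call."""
    return (call_payoff(stock, low)
            - 2 * call_payoff(stock, mid)
            + call_payoff(stock, high))

rows = []
for stock in (15, 20, 22.5, 25, 27.5, 30, 40):
    spread_diff = (call_spread_payoff(stock, 20, 25)
                   - call_spread_payoff(stock, 25, 30))
    fly = butterfly_payoff(stock, 20, 25, 30)
    rows.append((stock, spread_diff, fly))
    print(f"stock {stock:>5}: 20/25 cs - 25/30 cs = {spread_diff:.2f}, fly = {fly:.2f}")

# Selling the 20/25 cs at 3.71 vs. the 25/30 cs marked at 1.60
implied_fly_sale_price = 3.71 - 1.60
print(f"implied fly sale price: ${implied_fly_sale_price:.2f}")
```

The fly is worth at most $5, and only if the stock pins $25 at expiry.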

This is a mix of seeing the strike vol changes today and feel. This may sound woo woo. If you require higher standards of trade discretion I can understand that but for the most part this is kinda what trading looks like for all the nerdom that gets bandied about.

GME is a name that doesn’t lend itself to data analysis or cross-sectional triangulation.

[Also, these are microscopic stakes —the original trade only risked about $4,200 and was only 1/5 of the size I wanted to scale into. Unfortunately, the call spread went straight up in value after I was filled and I was too anchored to chase]

In a professional setting, you will be more plugged into flow which tightens the reasoning and timing plus better execution tools/costs, but the mental progressions of a MM very much rhyme with mine especially in idio situations.

The value of tracking

After selling the 20/25 call spread, I stored it in my watchlist on a delta-neutral basis vs the stock price reference I sold it against to track its performance and my fill quality.

The market on a delta-neutral basis is $3.52-$3.89, I sold at $3.71.

June 20/25 call spread

I noticed as the stock went down my fill marked better and vice versa.

This can be an artifact of market widths in the legs, especially since both calls are ITM. But if you can rule that out by tracking the counterpart put spread instead (CS + PS = Box, so you can translate the 3.71 CS as a 1.29 PS), you can get a fingertip feel for how the skew moves as the stock changes by watching a delta-neutral price move. I used to do this for lots of structures. They’d be in my window for weeks so I could see how certain large trades worked or not.
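
The box translation is simple enough to sketch. A long call spread plus a long put spread on the same strikes pays the strike width no matter where the stock lands, so the put spread price is the box value minus the call spread price. The function names are hypothetical, and the default `discount_factor=1.0` ignores interest (an assumption that matches the 5.00 - 3.71 = 1.29 translation).

```python
def put_spread_from_call_spread(call_spread_price, low_strike, high_strike,
                                discount_factor=1.0):
    """Price of the same-strike put spread implied by the box relationship."""
    box_value = (high_strike - low_strike) * discount_factor
    return box_value - call_spread_price

def call_spread_payoff(stock, low, high):
    """Long the low-strike call, short the high-strike call."""
    return max(0.0, stock - low) - max(0.0, stock - high)

def put_spread_payoff(stock, low, high):
    """Long the high-strike put, short the low-strike put."""
    return max(0.0, high - stock) - max(0.0, low - stock)

put_spread_price = put_spread_from_call_spread(3.71, 20, 25)
print(f"equivalent 20/25 put spread: ${put_spread_price:.2f}")

# The box pays the strike width at every expiry price
box_payoffs = [
    call_spread_payoff(s, 20, 25) + put_spread_payoff(s, 20, 25)
    for s in (10, 20, 22.5, 25, 40)
]
print("box payoffs:", box_payoffs)
```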

Experience is a repo of unstructured data

Overall, these little habits accumulate as a big unstructured data repo otherwise known as experience. Discretionary trading uses “science” for measuring: models, dashboards, stat studies. But the trading is a boardgame sitting on top of that.

Systematizing

Systematic trading is different. I sat next to people running large systematic strategies. It’s piloting & auto-piloting a more diversified strategy, less chunky risk but it’s not an alien approach. It’s also inspirational because you can port measures and ideas from it.

I never really figured out how to systematize my trading from end to end. I like to say discretionary trading is just trading without ignoring unstructured data. The unavoidable pitfall, however, is that part of that “unstructured data” is bias.

You can fight that with a team of people conscious of behavioral biases because while it’s hard to debug yourself it’s easy to call out others’ blindspots. You can help each other. (see Trading Is A Team Sport)

moontower.ai

It’s funny as I’m sandboxing analytics to develop for moontower.ai, I’m always thinking about how to pre-chew ideas for users.

There’s always a balance between legibility and user effort. If investors just hand you money to manage for them, it requires no effort for them but they have no control over the trades or process.

At the other extreme, you can give a user a Bloomberg terminal where they have full control but need to make all the decisions.

For a retail user this trade-off requires tremendous thought. You could just feed someone a signal without teaching them how to fish. When the signals don’t work they’ll churn. Or you can teach them how to approach their goals methodically understanding that there’s a learning curve. It takes more effort for the user but for the person who wants to learn to fish it’s the only way.

moontower.ai balances being opinionated but not signal-driven because it inherits the exact trade prospecting funnels I used as a discretionary PM.

We can see the seeds of signals in our “axe list” concept. It’s just a roadmap idea at this point. But some of the data play I’m doing for this coming week’s paid post looks like it might be a source for another analytic regarding skew.

As a reminder, if you’d like to signup for moontower.ai as a paid user, a moontower.substack paid sub is included.

We also have a fully automated affiliate program. If your audience would be interested in option analytics with a point of view…you can sign up to be an affiliate and get $100 for users who sign up.


 


Arb in ADBE options? (twitter thread)

A Twitter friend thought he might have found an arb in the ADBE options that could be exploited by a trade known as a ‘conversion’. It was a false alarm but if you scroll through the thread you can learn a lot about computing implied rates and the details of how to execute the conversion arbitrages. Such arbs don’t sit around in plain sight, but this is the masochism section. If you want to learn, the thread is a real-life example of noticing something that looks off, getting to the bottom of it, and learning a lot about options in the process.
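
As a rough illustration of what computing an implied rate means here (not a reconstruction of the thread), a conversion is long stock, long put, and short call at the same strike and expiry. It costs stock + put - call today and is worth the strike at expiry, so the ratio implies a financing rate. The sketch below assumes European exercise, no dividends, and mid prices; the quotes are hypothetical and are not ADBE data.

```python
import math

def conversion_implied_rate(stock, call, put, strike, years):
    """Continuously compounded rate implied by a conversion.

    Long stock + long put + short call (same strike and expiry) costs
    stock + put - call today and is worth the strike at expiry.
    Assumes European exercise and no dividends.
    """
    cost_today = stock + put - call
    return math.log(strike / cost_today) / years

# Hypothetical quotes for illustration only
rate = conversion_implied_rate(stock=500.0, call=40.0, put=28.0,
                               strike=500.0, years=0.5)
print(f"implied rate: {rate:.2%}")
```

If the implied rate sits far from where you can actually borrow or lend, the usual suspects are stale quotes, a dividend, early-exercise value, or hard-to-borrow stock rather than free money.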

When it’s normal to have no idea what your returns are

The lingua franca of asset management regardless of strategy is returns. But this is not necessarily true in trading. In my own 21 years of trading, it was never even brought up internally. It gets mentioned somewhere far in the background. In vague terms at best. And even then, the context is more of a hurdle than a target.

We’re going to talk about this in 2 parts.

Today

I’ll walk through what this meant in the 3 phases of my career (with plenty of color interspersed):

  1. Being a market maker at SIG
  2. Running my own market-making group financed by a Chicago firm that backs traders
  3. As portfolio manager within a relative value volatility trading hedge fund. This last job was for an entity that was backed by LP money and therefore beholden to the concept, language, and delivery of returns. And yet, the idea of returns was alien to how I ran my business there.

Next week

I’ll connect this back to a reader question that got me thinking about all this in the first place:

How important is (il)liquidity in options when making risk-defined trades such as credit/debit spreads or buying single call/put options?

Hmm.

I can feel the doubt.

“Bruh, those lily pads are in different ponds, how we makin this leap?”

We gonna make it. Off we go…

At SIG

My first trading job for SIG where I had my own account and P/L was as an equity options market maker on the AMEX floor.

The AMEX, like the NYSE, is a specialist system. There are “posts” where the options on a list of tickers trade. The specialist is like a lead market-maker who sees the full order book and is required to make a market according to a set of rules and guidelines. The specialist designation is a bundle of obligations and privileges. They must maintain an “orderly” market, set the vol curves which then disseminate electronic bids and offers for every strike and maturity for a name to the world. They are also entitled to 40% of the volume that trades on the bid or at the ask.

Market makers stand in front of the specialist post and announce any bids or offers that improve the specialists’ market. The market-makers are allowed to participate alongside the specialist on the bids and offers if they agree that they are “on the market”. My first year trading, I was a market-maker. The major names at my post were AIG, Qwest, Eastman Kodak, Corning, Cheesecake Factory, and about 25 other stocks. One that was notably missing because it was just delisted from that post — Enron.

As a novice trader, it was really about strapping on a helmet and getting experience. I wasn’t going to be taking massive risks but in the course of trading if I saw anything noteworthy I could discuss it with my manager to see if I should be. There were 3 notable things that happened in my time there.

  1. AIG accounting scandal (this was the only time I ever talked to Jeff Yass about a trade. UBS put up a giant print in the options and Jeff dm’d me to call him with details on one of the black phones placed near the trading posts. I was scared shitless to call because I didn’t do any part of the trade and was afraid he was going to undress me for missing it. He just listened, thanked me, and hung up. I never heard another word about it).
  2. Massive buyers of Qwest teeny puts by cap structure arb flow who were trading puts vs CDS. Selling them week after week in size and overhedging the hell out of them made my year. It was also stressful because up days were painful, but the stock leaked lower and the puts barely budged.

    [A year or two a colleague ran the same playbook in Xerox in much larger size and had an amazing year. If I remember correctly, that fellow was banned from Vegas casinos for card-counting by the time he was 22. He left SIG after that year, 25 or 26 years old, co-founded a prop firm and retired very wealthy in his 30s. Our wily friend also made the news for buying a call option on a penthouse from its owner that ended up being the highest value residential RE trade in the city where it happened. Not a small city. One funny thing I remember about people’s impression of him: he was very lazy but insanely smart. And despite my personal belief that endurance and effort > brains, there were a lot of counterexamples to this back in those days. There were savants who struck it rich and peeled off. The billionaires in the options world were the savants who worked their asses off to build businesses, but the clever cats won life. You got $50mm in 2009 and you’re like 33 years old. You’re gonna triple it by age 50. In flip-flops with a flip phone.]

  3. I found a dividend in Kodak that was being priced in the wrong month. Which brings you to the question — how do you pick off the people who stand next to you all day with a structure that doesn’t tip them off too quickly and also disguises that it’s you?

    [This is a separate topic but trading cultures revere cleverness. It’s a game where the goal is to take people’s money. Any harmony is just a long-run game theoretic compromise. It is fundamentally adversarial. Can you see why effective altruism starts to look like effective autism when you consider the frames traders must adopt to deal with the discomfort of their decisions? At least back then the sociopaths were more forthcoming in their intentions rather than torturing normal people with trolley problems.]

After a year at that post, I was moved to one that included Microsoft (I was there when they announced their first dividend; it was a special $3 div, I believe), Oracle, and Expedia. That was 2003. After 2017, 2003 was my second least favorite year. There was less action compared to prior years and markets were getting tighter (the purely electronic ISE grabbed a ton of market share). I felt deeply discouraged. I came into the business near a peak and this was my first downcycle. I extrapolated doom.

So what were my returns in those first 2 years?

I have no idea. I had close to $1mm in profits in the first year and broke even in the second year.

I don’t know how much capital I used. In fact, they wouldn’t want me to know. For example, the margin I was paying financing charges on is likely much higher than what it was in reality considering the difference in rates a small trader gets versus a giant client like SIG. It would only make sense for the partners to effectively lend me the difference rather than allowing me the credit for funding at the firm’s rates. It’s 6 in one hand, half dozen in the other when you zoom out but it does obscure your true rate of return. It wasn’t clear how much margin you were using or how much it varied. And they had no incentive to elucidate this.

If you were looking to leave then I’m sure you could have backed into a good guess for how much capital you needed. Which is critical if you are looking to build one of these businesses since of course it’s still a capital allocation decision whose ROI must be compared to other uses for the cash. But when you are making the donuts, there’s no talk of return.

So how do you talk about the business?

It’s raw dollars. Management figures “Ok, the average MSFT market maker for SIG should trade 5,000 lots a day for a penny of edge (remember the multiplier is 100, so it’s a true $1 per lot) times 252 trading days — the spot is worth $1.25mm per year”.
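
Management’s back-of-envelope math, spelled out as a short Python sketch using the numbers above. It comes to about $1.26mm, which the text rounds to $1.25mm.

```python
LOTS_PER_DAY = 5_000
EDGE_PER_SHARE = 0.01   # a penny of edge
MULTIPLIER = 100        # shares per option contract
TRADING_DAYS = 252

edge_per_lot = EDGE_PER_SHARE * MULTIPLIER               # dollars of edge per lot
spot_value = LOTS_PER_DAY * edge_per_lot * TRADING_DAYS  # expected dollars per year

print(f"edge per lot: ${edge_per_lot:.2f}")
print(f"expected annual value of the spot: ${spot_value:,.0f}")
```

Notice there is no capital figure anywhere in the calculation. The spot is valued in raw dollars.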

[If you make less, they’ll ding you in your bonus. If you make more, they say the trading was better that year so you were expected to make more: here’s your expected bonus. If you get discretionary bonuses you know the routine. You’ll get a verbal reach around in your bonus meeting, but then the number falls short of the rhetoric. I’ve been able to laugh about this for a long time now, but my wife can remember the days when my trader friends and I would plan what we would yell as we “flipped the desk over” 3 months in advance of our disappointing meetings. No matter how much you get paid, it’s always a letdown. She would joke about the Trader Wives Club where they’d have to hear us whine for 3 months before our reviews, and another week afterwards.]

In short, a trading spot occupied by someone who knows what they’re doing has an expected p/l with a distribution. Based on the activity in the names that year, the value of the spot could be upgraded or downgraded for the following year. There’s always a dollar target and the outcome is a debate about how much skill the trader brought to the result versus how the assumptions of volume, volatility, and competitive forces varied from the start-of-year forecast.

If the cost to man the spot makes sense compared to the expected p/l, someone will be assigned to it. Spots that are more valuable will be staffed with the more experienced/talented traders (ie you should expect that meme stocks are piloted by a prop firm’s top traders). So while there’s no concept of return, hurdles and opportunity costs are baked into the staffing decisions.


Backed by Prime

From 2008 until early 2012 I was backed by Prime International, a prop firm based in Chicago. Back then, they bankrolled about 100 traders, many in futures but there were several option pods of various sizes. I ran a mid-sized one and shared an office with their largest one (that pod was the largest market-maker in crude oil options in the late aughts).

These years were the most fun and learning I had in my career, outside the first year out of college as a trainee which was a zero-to-one combustion. I’ll save the stories for a non-print medium. I vaguely remember colleagues researching what it would take to create a small ice rink and get a penguin for the office. The fact that I would have put a 25 delta on it actually happening is a pretty good indication that this was not a normal work environment. Any job after this was gonna be a letdown but the floors’ days were numbered.

With the backer, everything was transparent. I was getting 70% of my p/l. I could see my daily margin. My shared and non-shared expenses. I could hire and fire as I wanted.

Yet again, no concept of return.

If you just looked at margin, you could back into an expectation that you should return 50-300% per year to make the gig worthwhile.

[This was in fact typical but I don’t think any pod back then was using more than $10mm and most utilizing less than $2mm. Floor trading isn’t that scalable. Ironically, the way to keep a floor well-fed is for fiduciaries to trade as if markets are more scalable than they actually are.]

Unfortunately margin isn’t a great denominator for returns. Buy a bunch of teeny options and you can mask risk. You could hunt for the cheapest thing to buy that makes a concern, defined too objectively on the back of assumptions, go away if you know what canned scenarios the risk group runs. I’m not saying this is done on a conscious or nefarious level but it’s too easy to affect the Ouija board by having a little extra preference for this strike or that maturity.

Computing return on average margin can also be weak depending on the nature of a strategy. Margin calculations are coarse proxies for risk. They aren’t custom enough for option traders. At one end they can be gamed, and on the other end they can be too conservative for arbitrage strategies (for example, no margin relief for WTI look-alike swaps if one leg is on ICE and the other on CME).

A picture is emerging. Return is difficult to compute because the denominator is murky.

We are accustomed to returns and volatilities. They let us use Sharpe ratios to measure risk-adjusted performance. But if we don’t have returns how do we get to risk limits that make sense? How do we decide where to put capital?


Hedge fund days

In 2012, I moved to SF to build the commodity relative value volatility business for Parallax. It is a master fund with a host of sub strategies. LPs see returns but internally there’s no concept of returns at the strategy level. It’s just p/l that rolls up to the fund level. To be clear, this is typical for such a structure. It’s flexible.

The fund posts margin and manages risk such that the margin to equity ratio stays comfortably under 100% even in stressed conditions. Which means there’s always a cash buffer which can be held in T-bills, box spreads, or managed in any highly liquid way.

The sub-strategies’ margin requirements will bounce around based on the volatility and opportunities in the markets. Just to make up numbers, imagine the firm AUM is $1B and runs 50% margin-to-equity. So in typical conditions, the margin requirement is $500mm.

Now consider that my margin requirement in the commodity book ranges between $20mm and $100mm depending on how much risk I’m taking. If I were a standalone fund and wanted to be highly confident that my margin-to-equity wouldn’t exceed 80%, then I’d need to raise $125mm ($100mm / 0.80). Most of the time I need much less, which causes a drag on returns, but in the master fund structure I don’t really worry about this. The firm’s excess cash isn’t allocated to strategy as a hard constraint. The GP acts as the ultimate capital allocator internally.

So the best guess of what my returns are depends on the flawed denominator of average margin. Relative to that number, they would always be worse as a stand-alone fund because I’d need to raise far more capital than I’d typically be deploying.
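
A quick sketch of how the denominator changes the answer. The $100mm peak margin and the 80% cap come from the made-up numbers above; the $10mm annual p/l and $40mm average margin are purely hypothetical placeholders added for illustration.

```python
def standalone_capital(peak_margin, max_margin_to_equity):
    """Equity needed so peak margin stays under the margin-to-equity cap."""
    return peak_margin / max_margin_to_equity

PEAK_MARGIN = 100.0   # $mm, top of the margin range in the text
MAX_M2E = 0.80        # margin-to-equity ceiling in the text

capital = standalone_capital(PEAK_MARGIN, MAX_M2E)

# Hypothetical inputs, not from the post
annual_pnl = 10.0       # $mm
average_margin = 40.0   # $mm

return_on_average_margin = annual_pnl / average_margin
return_on_standalone_capital = annual_pnl / capital

print(f"standalone capital needed: ${capital:.0f}mm")
print(f"return on average margin: {return_on_average_margin:.1%}")
print(f"return on standalone capital: {return_on_standalone_capital:.1%}")
```

Same p/l, very different return, depending only on which denominator you pick.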

How did the business work in practice?

You’d come up with an annual expected p/l. In my case, the median was about 70% of the expectancy because I ran a positively skewed book.

[I was typically long vol convexity and often long gamma. When I was at Prime with my own money on the line, it was not uncommon to be paying 4 figure theta bills. I stood next to a guy that traded 100% of his own money that had a seven figure bill a few times a year. (Random thought but learned a lot about playing the player not the cards from him but this was a small market which he would turn into a game of heads-up no limit with the big customer. He recognized he had a lot of edge, and pushed. I hope he shifted gears when that was no longer the way you could play.)

In my fund book, you could buy a Porsche or Lambo every day with the theta. That said, I kept a close eye on a very simple measure: don’t be long too much extrinsic premium unless there was a specific reason.

VRP language is something that feels like it comes from the asset management world not the floor trader world. There are exceptions. I knew some large short vol traders at every stop. The biggest lesson is that this game is far more artful than risk premia discourse pretends.

As for my own long vol bias and performance, this was not some trick of “Oh I lost half my years but won more in good years”. No backer, prop firm, or absolute return fund would tolerate that. I broke even or was down small 3 out of 20 years, made medium amounts most years and put up the numbers that drive the mean from the median around the GFC and the 3 year period spanning from the 2018 Volmageddon (although I wasn’t directly involved in that trade, it was the return of vol after the 2017 idiocy) through 2020. Look, the job pays when people are in pain. I don’t know what to tell you. The rest of the time I watch beta-maxxers and PE suits buy mansions. I write to feel prosocial. I don’t need trading to feel that way. Sensing a dumb counterparty squirm, one who almost certainly was getting paid too much previously for charlatanism, is a reassuring hug from the god of markets as far as I’m concerned.

I am an agreeable human (I’m like 95th percentile on the Big Five personality test for this) but a highly disagreeable trader. Process, patience, fold, more patience, then f you sold. Boring, boring, boring, paper cuts, I hate this job, boring. Violence. Put your style into simple words one day. It helps steer you back to the North Star whose light you’ll need most when your every decision feels like it takes you further into the dark.]

Back to the returns stuff. You peg an annual p/l target. Management implicitly considers how much risk needs to be tolerated to achieve that raw p/l, deems it satisfactory and off you go. Next year, the landscape is reviewed, growth initiatives weighed and you repeat a somewhat informal process. Any course-correction is mostly handled ad-hoc as the PM sees whether or not the environment calls for taking more or less risk. You can tell when there’s too many predators (entities who see the world in a similar way to us) vs prey (customers that have hedging or punting desires). There’s a time to hunt and time to hibernate. I say it all the time — this is a biological system not a Newtonian one.

So back to returns…it’s not quite right to think of return on margin. If you want to force a business like this into that framework you should probably just be conservative and consider how much AUM you’d need to run the strategy as a stand-alone fund. I’ll use a broad informed stroke. With a few hundred million in capital, I’d guess a strong manager with a trader mindset (as opposed to asset-manager mindset — if you’re in this business you know the difference so don’t @ me) could put up mean returns in the ballpark of 9-12% with the median between 50 and 70% of that.

[You can also see why many of these businesses are best housed within a fund that can flexibly allocate unutilized cash. The traders can be paid well enough to not tempt themselves into the brain damage of starting a fund of their own. To go out on your own has to be about more than money. There needs to be a psychic reason to want your name on the door to endure what it takes to launch a fund.]

This is a damn good proposition because it behaves like a long option position that you get paid to own. It’s unsurprising that the fees for the handful of managers who can do this (if you can even access them) are high. The proposition gets much worse if you’re taxable because it’s a short-term gain bonanza but of course many institutional allocators are tax-exempt.

Most of the risk lies in the ability for a talented group of people to perpetuate itself, and the ability to assess that from the outside is probably almost zero. So all the usual caveats of active management apply.


Like I said earlier, next week I’ll tie this back to a seemingly unrelated question:

How important is (il)liquidity in options when making risk-defined trades such as credit/debit spreads or buying single call/put options?