Notes From Invest Like the Best: Brian Christian


About Brian: Author covering humans’ relationship with technology and AI

Q: What advice would you give to people building careers? We’re in a political cycle now where things like basic income are being discussed. In your view, what are the most defensible areas of human activity, whether that’s some sort of creativity, asking great questions, or coming up with the objective functions that you then feed the machines? What would you recommend people focus on as they think about adding value, whether early or late in their career?

A: There are sort of two ways that I can approach this question. My second book is called Algorithms to Live By, and it looks at things like career decisions from an explicitly algorithmic perspective.

1) Explore/Exploit Trade-off


There’s this paradigm, called the “explore/exploit” trade-off, which is: How much of your energy do you spend gathering information vs how much do you spend committing based on the information? There are a number of decisions that we face throughout life that take the form of a tension or a balance between trying new things and committing to the things that seem to be the best. When we go out to eat, do we go to our favorite restaurant or try a new one? Do we reach out to a new acquaintance we’d like to get to know better, or spend time with our close family or best friend? The same thing is true in investing, and the same thing is true in managing your time and your career.

Generalizing the Problem

The structure of this problem is an iterated decision that you get to make over and over again. Do you continue to put energy into the things that seem promising, or do you spend your energy trying new things? A clinical trial can have that same structure, and indeed the FDA has been increasingly interested in looking over the disciplinary fence at the computer scientists and saying, maybe those algorithms that you’re using to optimize ads could also be used to optimize human lives. The way a computer scientist approaches this question is through something that’s called the multi-armed bandit problem.

The Multi-armed Bandit Problem


In the multi-armed bandit problem, you walk into a casino that has all these different slot machines. Some of them pay out with a higher probability than others, but you don’t know which are which. What strategy do you employ to try to make as much money in the casino as you can? It’s necessarily going to involve some amount of exploration, trying out different machines to see which ones appear to pay out more than others, and some amount of exploitation. To a computer scientist, exploitation doesn’t have the negative connotation that it has in regular English; it just means leveraging the information you’ve gained so far to crank away on those machines that do seem to be the best.

Intuitively, I think most of us would recognize that you need to do some amount of both, but it’s not totally obvious what that balance should look like in practice. Indeed, for much of the 20th century this was considered not only an unsolved problem but an unsolvable problem, and sort of career suicide to think about. During WWII, British mathematicians joked about dropping the multi-armed bandit problem over Germany as the ultimate intellectual sabotage: just waste the brainpower and nerd-snipe all of the German mathematicians. To the field’s own surprise, there came a series of breakthroughs on the multi-armed bandit problem through the second half of the 20th century.


Now we have a pretty good idea of what exact solutions look like given a number of constraints, but also what sort of more general flexible algorithms look like. The critical insight into thinking about this problem is that your strategy should depend entirely on how long you plan to be in the casino. If you feel that you have a long time ahead of you, then it’s worth it to invest in exploration, because if you do find something great, it has a long horizon to pay out. On the other hand, if you feel that you are about to leave the casino, then the return that you would get on making a great new discovery is going to be much smaller, because you have fewer opportunities to crank away on that handle once you find it. We should naturally transition from being more exploratory at the beginning of a process to more exploitative at the end. I think that’s an intuition that makes sense, but the math bears that out very concretely.
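To make the "explore early, exploit late" intuition concrete, here is a minimal sketch in Python. The linear schedule in `explore_probability`, the payout probabilities, and the function names are all illustrative assumptions, not the exact solutions Brian refers to. The sketch only shows that tapering exploration over a known horizon beats pulling handles at random the whole time.

```python
import random

def explore_probability(step, horizon):
    """Chance of trying a random machine: 1.0 at the start, falling toward 0.0 at the end.
    The linear decay is an assumption made for illustration only."""
    return 1.0 - step / horizon

def play_casino(payout_probs, horizon, rng, always_explore=False):
    """Play `horizon` pulls and return total winnings (1 unit per payout)."""
    pulls = [0] * len(payout_probs)
    wins = [0] * len(payout_probs)
    total = 0
    for step in range(horizon):
        if always_explore or rng.random() < explore_probability(step, horizon):
            arm = rng.randrange(len(payout_probs))  # explore: pick any machine
        else:
            # exploit: machine with the best observed win rate (untried machines count as 0)
            arm = max(range(len(payout_probs)),
                      key=lambda a: wins[a] / pulls[a] if pulls[a] else 0.0)
        reward = 1 if rng.random() < payout_probs[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward
    return total

rng = random.Random(0)
probs = [0.2, 0.5, 0.8]  # hypothetical payout probabilities, unknown to the player
decaying = play_casino(probs, 1000, rng)
random_only = play_casino(probs, 1000, rng, always_explore=True)
print("explore early, exploit late:", decaying)
print("explore the whole time:     ", random_only)
```

Shortening the horizon in this sketch makes the schedule turn greedy sooner, which is the point of the casino framing: the less time you have left, the less a new discovery is worth.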

Observation of “Explore/Exploit” Trade-Off in Real Life


It’s interesting to see this idea, which emerged in computer science from the late 50s through the 70s, getting picked up by psychologists and cognitive scientists who are interested in human decision making. For example, Alison Gopnik at UC Berkeley, who studies infant cognition, has been thinking about the “explore/exploit” trade-off as a framework for how the infant mind works. If you think about how children behave, we have all these stereotypes that children are just kind of random and generally incompetent at things, and there’s a huge literature that shows that they have what’s called a “novelty bias”. They’re relentlessly interested in the next thing and the next thing and the next thing. Rather than viewing that as a kind of low willpower or attentional control issue, you can view it as the optimal strategy. It’s as if you’ve just burst through the doors of life’s casino and you have 80 years ahead of you. It really does make a lot of sense to just run around wildly pulling handles at random.

The same is true for the later years of one’s life. We have a lot of stereotypes about older people being set in their ways and resistant to change. There’s a psychology literature that shows that older adults maintain fewer social connections than younger people, and it’s tempting to view that pessimistically. In fact, if you build an argument from the mathematics, you can see that older adults are simply in the exploit phase of their life, and they are again doing the optimal thing given where they are in that interval of time. You have psychologists like Stanford’s Laura Carstensen appealing to the “explore/exploit” trade-off to make this argument that older adults know exactly what they’re doing and are very rationally choosing a strategy that makes sense given where they are. They have a lifetime’s exploration behind them, they know what they really like, they know the people and the connections that matter to them, and they have a finite amount of time left to reap the fruits of some new connection or new discovery, so they’re very deliberately enacting the strategy. The math should predict that, on average, older adults are happier than young people. Despite our preconceptions, that appears to be the case, and her research bears this out.


In business, the problem is very dynamic, which places it in the domain of the “restless bandit problem”. Since the research here is cloudier, researchers can invert the thinking and infer the conditions that lead to the business strategies we observe.

Q: Interesting how this maps on to the life cycles of businesses. In the business context, “explore” might be innovation and “exploit” might be running the same playbook to earn high returns on capital, or something you know works. It seems like you always want to be handing off to a next batch of exploration or innovation, while thoughtfully maintaining something that you know works, if you want to survive for a very long time.

A: There are a couple of things that I think are interesting in a business context. One is that the casino framing that I’ve described implicitly assumes that those probabilities are stable and fixed. Of course, we know that the world is not stable and not fixed, and that things change over time. This is true in our personal lives as well. Your favorite restaurant gets a new line cook and the burgers are not as good. These things shift. This is known as the “restless bandit problem”. How do you play this game when these probabilities are drifting on a random walk?

This is a very interesting case where the theory is not yet consolidated but humans, in practice, seem to have no problem. If you put people in a lab and give them a restless bandit problem, they have no trouble making choices within that environment, but we don’t yet know what the mathematics of the optimal solution looks like. So here’s a case where the computer scientists and the mathematicians are asking the cognitive scientists: what are your models for how humans are actually approaching this? There may be some insight there that we can use on the theory side. One implication of thinking in this way is particularly relevant in a business setting: if the interval of time you perceive yourself to be on determines the strategy that you should employ, then if you observe someone else’s strategy, you can infer the interval that they’re optimizing over.

Inferring The Explore/Exploit Strategy in a Restless Bandit Problem

Let’s give an example from Hollywood. Most people have noticed that it feels like we’re living through a deluge of sequels, such as Marvel movies. It turns out that this is objectively true. There’s been a sea change in Hollywood. In 1982, 2 of the top 10 grossing films were sequels. By 1990 it was six. By the year 2000, it was eight, and I think most recently it was all ten. From that, we can infer that Hollywood has taken a very hard turn towards an exploitative strategy. They are milking their existing franchises, rather than investing money speculatively to try to develop new franchises that will last them into the next few decades. From that, it’s reasonable to infer that movie ticket sales are declining, which turns out to be the case. Hollywood correctly perceives itself to be at the waning time of the golden era of cinema-going. If that’s true, then they really should invest all of their money into just squeezing everything they can out of the existing franchises. More broadly, you can look at different industries and different corporations to see if they have cut their R&D budget. If they’ve given that money to marketing, that would be an indication that they feel that the area has matured or plateaued.

My thoughts

    1. Ahem, asset management, cough
    2. Reminds me of a great Peter Chernin interview where he suggests that every business must be trying to grow new opportunities faster than the old ones die out. While you must do your best to milk the old, it’s imperative to develop the new.

2) Predicting the Impact of Automation

The second avenue is totally different from this way of thinking, which is just what the impacts of something like AI or UBI will be on the economy. I’m reminded of a McKinsey report on which jobs they thought would be the most robust. The big picture thing that was interesting to me is that it cuts across the traditional class lines. It is not a white-collar versus blue-collar thing. It’s not an upper middle class versus lower middle class thing. It’s very sector dependent. The most resilient or robust jobs at the top end were gardener, legislator, and psychotherapist. I thought it was fascinating that it’s this eclectic mixture of things. I don’t think of myself as a prognosticator about these sorts of things, but my way of thinking about it is that there’s a lot of human machinery around how capital moves, how laws get made, and how licensing and permitting happen. It’s still done at a human negotiation level. “I know a guy. I’ll talk to Joe and we’ll sort it out.” I think humans will maintain oversight of these kinds of flows of power and capital, even if the actual value is being created by software. So position yourself closer to the flow of that value than to the actual creation of the value, which may be counterintuitive.

As far as the question of UBI, I don’t have a great intuition for that. There is already a restlessness in the labor force. A lot of the careers that employ the largest numbers of people are the most vulnerable: people who drive cars or trucks, people who work in warehouses. A lot of those jobs are just one innovation away, and it’s not clear to me that there’s going to be a political response as well as just a pure economic response. I grew up in New Jersey, where there was a robust toll collector union, yet they had machines where you could toss your change in a bin and it would automatically sort your change and give you whatever you needed back. There was an effective effort to unionize the toll collectors so that you still had a human being in the booth counting out your quarters. That’s an example where it’s not for lack of technology. We had a coin sorting machine, but there was a political process that was directing the actual level of implementation. People will fight to use licensing requirements and regulations to maintain those things. Even though the actual technological capability has radically changed, it’s very hard to know which areas will look shockingly different from the world today, and which will be in some ways shockingly backwards for their time because, for political reasons, we’ve had to hold the line.

(Reminds me of how rent flows to the owner of a relationship in a competitive market that has been flattened by technology)

Algorithms to make other types of decisions

The mathematics is very instructive, both in specific ways and as a broader set of principles.

Optimal Stopping Problem

Difference from “explore/exploit” trade-off

One thing that comes to mind is the idea called “optimal stopping”. The multi-armed bandit problem behind the “explore/exploit” trade-off presumes a framing that’s highly iterative. You can pull the handles again and again and again. You can go from one machine to another and back. There are many decisions in life where you are forced to make a single binding commitment. It could be something as banal as pulling into a parking space. It could be something like purchasing a house or signing a lease. It could be something like marrying your spouse. There’s a separate mathematics of cases where you need to find the right moment in time to go all-in, commit to an option, and no longer gather any further information.

37% Rule

There’s this very famous result called the “37% rule”. Let’s say you’re looking for an apartment, and it’s a really competitive marketplace. You’re in a situation where you encounter a series of options one by one. At each point in time, you must either immediately commit, and then never know what else might have been out there, or decide to walk away and keep exploring your options but lose that opportunity forever. What do you do to try to end up with the best thing possible, even though you won’t necessarily know at the time whether you’ve found the best option that might be out there? There’s this beautifully elegant result that says that you should spend the first 37% of your search non-committally exploring your options. Don’t bring your checkbook, and don’t commit to anything no matter how good it seems; you’re purely setting a baseline. After that 37%, whether it’s 37% of the time that you’ve given yourself to make the decision or 37% of the way through the pool of options, be prepared to immediately commit to the very first thing you see that’s better than what you saw in that first 37%. This is not just an intuitively satisfying balance between looking and leaping; it is the mathematically optimal result.
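The look-then-leap procedure described above can be sketched as a small simulation. This is a minimal sketch under stated assumptions: options get random scores, there are no ties, and the names `look_then_leap` and `success_rate` are mine, not anything from the podcast or the book.

```python
import math
import random

def look_then_leap(scores, look_fraction=1 / math.e):
    """Return the index chosen by the look-then-leap rule, or None if nothing is chosen.
    The first `look_fraction` of options (about 37%) only set the baseline."""
    n = len(scores)
    cutoff = int(n * look_fraction)  # number of options seen without committing
    baseline = max(scores[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if scores[i] > baseline:
            return i  # leap: first option better than everything in the look phase
    return None  # the best option was in the look phase, so we end up with nothing

def success_rate(n_options, trials, rng):
    """Fraction of trials in which the rule picks the single best option."""
    hits = 0
    for _ in range(trials):
        scores = [rng.random() for _ in range(n_options)]
        choice = look_then_leap(scores)
        if choice is not None and scores[choice] == max(scores):
            hits += 1
    return hits / trials

rate = success_rate(n_options=100, trials=20000, rng=random.Random(1))
print(f"picked the very best option in {rate:.1%} of simulated searches")
```

Changing `look_fraction` to something like 0.1 or 0.7 in this sketch should lower the hit rate, which is one way to see why the cutoff sits near 37%.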

Broader insights on algorithms

Elegant solutions under a range of narrow assumptions about goals and acceptable risks

There are strategies like that which I think are wonderfully crisp in the recommendation they give, but they, of course, rest on a bed of many different assumptions about exactly how the problem is structured and exactly what your goals are. This rule presumes that your entire goal is to maximize the chance that you get the very best thing in the entire pool, but it comes with a 37% chance that you end up with nothing at all, because you’ve passed on everything. Many people would find that unacceptable. We can go down the rabbit hole of how to modify this, and the solutions get less and less clean as you wiggle the assumptions around.

Intuition for how complex decision-making is can be strangely comforting

More broadly, one of the highest level takeaways for me, from working on the book and just thinking in computational terms about decisions in my own life, is that some decisions are just hard. In the classical optimal stopping problem, due to a weird mathematical symmetry, if you follow the 37% rule you will succeed only 37% of the time. The other 63% of the time you’ll fail, and that is the best possible strategy you could enact in that situation. In a weird way, that’s some measure of consolation because often, in real life, we find ourselves not getting the outcome we wanted. While we can rake ourselves over the coals or try to reconstruct our entire thought process, I think it’s some comfort that computer science and mathematics can, in effect, certify that you were just up against a hard problem. If you have the vocabulary to understand the type of problem that you’re facing, and you have some intuitions about the general shape of what optimal solutions look like, then even when you don’t get the outcome that you wanted, you can in some sense rest easy because you know that you followed the appropriate process for dealing with that situation.
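The 37%/63% split can also be checked exactly rather than by simulation. A minimal sketch, assuming the textbook version of the problem (n options in random order, you win only by picking the single best): if you skip the first r options, the chance of success is (r/n) × (1/r + 1/(r+1) + … + 1/(n−1)). The function name is my own.

```python
def success_probability(skip, n):
    """Exact chance of picking the best of n options after skipping the first `skip`."""
    if skip == 0:
        return 1 / n  # committing to the first option is a pure 1-in-n guess
    return (skip / n) * sum(1 / k for k in range(skip, n))

n = 100
best_skip = max(range(n), key=lambda r: success_probability(r, n))
p_best = success_probability(best_skip, n)
p_nothing = best_skip / n  # the best option was among those skipped

print(f"skip the first {best_skip} of {n} options")
print(f"chance of getting the very best: {p_best:.3f}")
print(f"chance of ending up with nothing: {p_nothing:.3f}")
```

Both numbers land close to 37%, which is the "weird mathematical symmetry" mentioned above, and the remaining probability is the case where you commit to something that is not the best.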

Notes from Invest Like the Best: Ali Hamed


About Ali: Partner at CoVenture fund

His approach

  • He looks at new asset classes that can be hard to value.
  • Alternative financing like asset-backed loans (loans against fruit inventory, app for fast-food chain which allows them to clock employees in and out and allow them to pay employees whenever they wanted for a slight pay haircut)
  • Fee structures depend on the dispersion of manager skill.

CoVenture recognized many seed companies never get to Series A

  • Fail to build the planned software to get to market. So CoVenture helps them.
  • Software types who don’t understand the industry they are building a solution for
  • Don’t understand the team they need

How does CoVenture fit into this?

The lesson is that the capital was easier to find than the people who can execute, so:

  • Giving young businesses guidance and connecting them to the personnel they need is very valuable.
  • Having a service which serves common needs to many prospective startups is how to scale this idea.

Thoughts on cost of capital

  • If one VC fund can convince its LPs to accept 1/2 the going return because it has the clout to get the best deals that’s another way of saying it has a lower cost of capital. Sequoia can offer lower rates of return because they are less risky than an upstart fund
  • These relative differences in costs of capital sustain significant advantages.
  • A fund may offer a startup cheap financing in exchange for warrants (similar to a convert). This is a bad strategy b/c the performance of the two instruments is inversely correlated. If the company takes off and does well, the warrants will perform, but a larger fund with a low cost of capital like BlackRock or Apollo will refinance the debt piece for cheaper. In the case where the debt is not refinanced, the warrants will be worthless.

Conundrums for seed funds

  • They are expected to “stick to their knitting” and be contrarian. This is practically impossible since being contrarian requires you to exit the seed company in a year or so to a Series A fund which is by definition consensus.
  • Any seed fund of quality naturally wants to raise more money but will find itself capacity constrained so it will drift towards Series A deals which are outside their expertise
  • Pre-seed round is about trying to methodically uncover if you are creating customer value. Revenue can be falsely equated to customer value. For example, you can spend money marketing which will lead to more revenue but this is not the relevant KPI (“key performance indicator”) to test the hypothesis that you are increasing customer value. The seed round is then about trying to find out if the improvements to KPI can scale.
  • Important to have a strong understanding of the role of the round you are in
  • Judgment vs Empathy at the core of a solution
    • Empathy reflects a true understanding of the practical trade-offs that lie within a business problem.
    • Judgment is typically what an arrogant or ignorant outsider looking at the problem prescribes when crafting the solution
  • Technology has made starting companies cheap but scaling is more expensive.
    • Trade-off when raising capital: balancing a fast start to acquire customers and scale against staying disciplined and avoiding overleverage.

A link to another post with takeaways from this podcast:

Notes from Invest Like the Best: Jesse Livermore


About Jesse: Jesse Livermore is a pseudonym for the financial blogger behind

3 Methods for Drawing Meaningful Inference

  1. Intuition
    • Benefit: Low cost and readily accessible
    • Costs
      • Noisy, especially in ‘wicked’ learning environments
      • Not transparent
    • Traders score high in ‘cognitive reflection’ and have stronger intuition
      • Careful deliberation is a hallmark. Studies have shown that people who take too long or too little to decide do worse.
      • Intuition is necessary to pull triggers, but deciding too quickly without careful deliberation leads to poorer inference
  2. Analysis
    • Benefits
      • Don’t need to gather data
      • A model of how something works can handle regime change by having a transparent mechanism from input to output
    • Costs
      • They are always incomplete and “so easy to be wrong”. The fact that we are prone to stories compounds the danger of analysis.
    • Using it responsibly
      • Leave margin for error
      • Validate

  3. Data Analysis

    • Benefit: It is rooted in reality
    • Costs
      • Without context can be misleading
      • It is more costly
      • Requires sufficient “trial size” not just a naively high sample size
        • If your samples are highly correlated then your effective trial size is much smaller than you think. For example, all financial data drawn from a single regime, or independent coin flips with an unfair coin
      • Data mining and multiple comparison
        • Patterns emerge randomly so this can occur in subtle ways, not necessarily because of fraud or nefarious incentives
          • Suggestions:
            • “Call your shot”
            • Out of sample test
            • Avoid overfitting by testing outcomes against variables that you know should not matter (for example, changing the day of the week an investing strategy occurs on should not change the result meaningfully)
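Two of the data-analysis costs above can be put into rough numbers. A minimal sketch, with assumptions stated loudly: the effective-size formula assumes every pair of samples shares the same correlation `rho`, and the false-discovery formula assumes the tested patterns are independent. Neither formula comes from the podcast; both are standard textbook approximations, and the function names are mine.

```python
def effective_sample_size(n, rho):
    """Independent-equivalent size of n samples that all share pairwise correlation rho.
    Assumes equal correlation between every pair (an idealization)."""
    return n / (1 + (n - 1) * rho)

def chance_of_false_discovery(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent, truly useless patterns
    looks significant at level alpha purely by luck."""
    return 1 - (1 - alpha) ** num_tests

# 100 observations from one regime behave like far fewer independent trials
for rho in (0.0, 0.1, 0.5):
    print(f"rho={rho}: 100 samples act like {effective_sample_size(100, rho):.1f} trials")

# data mining: the more patterns you try, the likelier a fluke
for tests in (1, 20, 100):
    print(f"{tests} patterns tested: {chance_of_false_discovery(tests):.0%} chance of a fluke")
```

This is why "calling your shot" matters: committing to one hypothesis in advance keeps `num_tests` at 1.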

Earnings are a distorted measure

  • Current strict accounting standards around depreciation understate earnings relative to history
    • Old accounting standards did not adjust depreciation for inflation, effectively understating depreciation and overstating earnings. The market is wise: it understood that earnings were overstated and assigned lower multiples during a period of excessive inflation
    • Difficult to compare multiples over time because of this change in standards
    • Depreciation is not just about the physical decay of an asset but also its competitiveness. E.g., an inventory of Kodak cameras became obsolete much faster than it physically decayed once digital cameras emerged. Any typical depreciation formula would have vastly understated the depreciation of the assets and overestimated the book value.
  • Inflation overstates earnings
    • He calculated the book value of the entire market and keeps track of retained earnings
    • The earnings being overstated means that the retained earnings that remain to actually be either re-invested or paid out to shareholders are understated once adjusted for inflation and compared to history. This means that published return on equity is likely understated because the money being re-invested is actually understated.
    • This is a known issue
      • Studied by prior economists
      • Big corps like Sears in mid-1900s argued for inflation adjusting depreciation because the overstated earnings were weakening their position in labor negotiations
  • Free cash flow handles many of these distortions more accurately
    • Free cash flow “plunges” during high inflation periods validating the distortions caused by inflation on earnings
  • P/IE Ratio (price to ‘integrated equity’) adjusts for all these shortcomings
    • Outperforms all measures of valuation, including many permutations of CAPE, in correlating to future returns.
    • Highly correlated itself to CAPE and tells us that market is on the higher end of valuation which Jesse thinks is structurally justifiable
  • If you want to dig deeper, OSAM published their joint findings

Why is it plausible that markets get permanently more expensive?

  1. Valuation is a function of the required rate of return, to which liquidity is an input. Imagine a pre-Fed wildcat bank. You would not accept such meager real rates of return because you would not have confidence in the liquidity of your deposit. So much of our required rates of return comes down to confidence. The progress of finance has been towards greater network-wide levels of confidence, which creates downward pressure on required rates of return. The Fed put is an example of this.
  2. With low growth and inflation (demographics follow Japan, Europe), volatility will be to the downside but the Fed can also act more aggressively without fear of inflation. Higher structural valuations may be reflecting this market understanding.
  • Implications
    • Trend: we are seeing less trend formation and more whipsaws. Speculative but possibly due to Fed put. This has led Jesse to try to restrict his trend strategy to when it is most likely to work (ie fewer whipsaws). Historically, trend’s alpha has come from times of large market drawdowns. So he uses the trend strategy when it coincides with fundamental recession indicators. He admits the sample size is small so the research is thin and probably overfit. Best recession indicators:
      • Retail sales
      • Earnings
      • Unemployment trend
      • Housing starts
      • Industrial Production
    • He is agnostic on trend. Thinks it works but is worried about it.
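To see why a lower required rate of return supports permanently higher valuations (point 1 above), here is a minimal sketch using the Gordon growth model. The choice of model is my assumption for illustration, not something Jesse names in the episode, and the rates below are made-up inputs.

```python
def price_to_dividend(required_return, growth):
    """Gordon growth model: price as a multiple of next year's dividend.
    Only valid when the required return exceeds the growth rate."""
    if required_return <= growth:
        raise ValueError("required return must exceed growth")
    return 1.0 / (required_return - growth)

# hypothetical inputs: growth fixed at 2%, required return falling as confidence rises
for r in (0.08, 0.06, 0.04):
    multiple = price_to_dividend(r, 0.02)
    print(f"required return {r:.0%}: price is {multiple:.1f}x next year's dividend")
```

In this toy setup, cash flows never change; only the return investors demand does, and the justified multiple rises sharply as that required return falls.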

Valuing the Market

  • CAPE and other statistical attempts to correlate valuation with future returns suffer from small trial sizes. Markets move in cycles, so multiple years in succession are really just a draw from the same regime (overlapping data sets)
  • An alternative method of using the relative supply of assets to predict future returns. Derived from his work. My own notes on his full post are here.
  • Interesting inefficiency which hints at the validity of this: There are some egregiously overpriced preferred stocks that carry low yields, are callable, and sit in inferior positions in the cap structure. The only reasonable explanation is that they are in relatively short supply. It’s a “rare baseball card”. The explanation: issuance of preferred stocks has declined faster than investment demand for yielding securities. In other words, the demand for asset allocation in relative proportions has not changed as much as the composition of supply has changed.
  • Being biased towards flow-based explanations of pricing myself, I find this idea very compelling
  • His conclusions without proof: supply matters and there are inefficiencies. The presence of the inefficiency doesn’t surprise me since constrained supply means fear of squeezes and lack of scalability. Arbitrage or relative value trading is less likely to close the mispricing.

How, using OSAM data, he tried to gain insight into how factors work

  • Value and momentum work very differently
  • His “Factors From Scratch” work with OSAM (O’Shaughnessy Asset Management)
  • My own notes on his post as well as the related OSAM work on “Alpha Within Factors”

Notes from Invest Like the Best Podcast: David Epstein


About David: Best-selling author of The Sports Gene and Range: Why Generalists Triumph in a Specialized World.  A former journalist at Sports Illustrated and ProPublica, David is also known for his talks on performance science and the proper use of data across many fields including sports, medicine and natural sciences.


Epstein’s Research Process

  • 10 journal articles a day for 1 year; hire translators for foreign journals
  • Consults with statistician 

A weaker 10,000 hours idea (Tiger vs Roger Problem)

  • Contrary Research Favoring Breadth

Showed elite athletes did not require a head start in deliberate practice. More likely specialization was delayed. A long sampling period exposed them to many sports which allowed them to better match their abilities to the sport.  Evidence: Success of Olympic talent transfer programs in other countries


Lots of variation in how people respond to stimulus. True of medicine. True of training. Your baseline ability is uncorrelated with your ability to improve with training, which makes extrapolating difficult. “So much to gain from fitting people into the right sport”

  • Supporting research flaws
    • “Restriction of range” problem with the study of 30 violinists. When you squash the range of a variable that is correlated with the dependent variable, you risk understating the correlation with the restricted variable. In this case, the sample was violinists who had already been accepted to a famous academy. We have squashed their innate talent even though it likely has a wide range in the general population. Likewise, if you studied the correlation of height to points scored in basketball for NBA players, you would find a jarring negative correlation, but that is because you are selecting from a sample of abnormally tall players to begin with. You’ve squashed the height variable, which would lead people to think that height has no impact on points scored.
    • Inconsistent numerical data, no estimates of variances on variables, poor statistical inference
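The "restriction of range" flaw above is easy to reproduce with made-up data. A minimal sketch, assuming a toy world where performance is innate talent plus equally sized noise and the academy admits only people with talent above a cutoff. The numbers and the 1.5 cutoff are invented for illustration and are not from the violinist study. Note that this toy setup shows the correlation shrinking, not turning negative as in the NBA example.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)
talent = [rng.gauss(0, 1) for _ in range(20000)]
performance = [t + rng.gauss(0, 1) for t in talent]  # talent plus luck/practice noise

full = pearson(talent, performance)

# Only the already-selected elite are studied (hypothetical admission cutoff)
admitted = [(t, p) for t, p in zip(talent, performance) if t > 1.5]
restricted = pearson([t for t, _ in admitted], [p for _, p in admitted])

print(f"whole population:   r = {full:.2f}")
print(f"admitted elite only: r = {restricted:.2f} (n = {len(admitted)})")
```

Talent matters just as much for every individual in both groups; the weaker correlation among the admitted comes purely from having squashed the range of talent in the sample.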


  • When to be like Tiger?
    • Kind learning environment
      • Fast, accurate feedback
      • Discrete turns
      • Well defined rules
  • When to be like Roger? 
    • Wicked learning environment
      • “Martian Tennis”: “You see people out there playing, something’s going on, you don’t know the rules, it’s up to you to introduce them. And they could change at any moment without notice. And that’s the situation that we’re actually in for most of the things, the complex things that most of us care about.”
  • Most surprising study in Range: an Air Force study that is a natural experiment. Professors who were the best at getting students to do well on the test in their own class (i.e., overperforming compared to the baseline characteristics they came in with) systematically undermined those students’ “deep learning” (performance in the follow-on courses).
    • Professors taught narrow performance to optimize for their own exam, to the detriment of overall learning. They undermined the students’ “making connections” framework. Professors failed to learn this themselves because the students who felt rapid progress would rate them highly. “really wicked feedback”
    • Professors themselves are incentivized to maximize for short term evaluations which have impaired their ability to teach frameworks that students can apply in novel situations.
    • Professors who did not teach to the test taught broader concepts relying less on “using procedures” knowledge. This type of knowledge is most effective in kind learning environments where possible tasks and choices are restricted. 
  • “Closed skills”: techniques that you can teach very quickly and see an advantage from. These are temporary advantages, as people with broader frameworks eventually catch up and bring a wider understanding as well.
  • Around the world, we are performing better on “culturally reduced tests” (meaning tests that are not influenced by formal learning). Our collective performance should stay stable on this portion of tests but in fact, our performance is increasing. Known as the Flynn effect. Flynn speculates “we have moved to a world where we are used to classifying things to grouping things instead of being stuck with lots of concrete knowledge and, and factual knowledge.” Pre-modern people did not have much need for classification, but the modern world relies heavily on this ability since we’re constantly laterally translating knowledge to different areas we’ve never seen. This ability to have knowledge that we don’t have from hands-on exposure is really important.

(Me: Don’t be fooled by a sense of progress when the task you are excelling at is not varying. Being able to match abstract models to a correct strategy is a more valuable goal and benefits from practice in dealing with variation. )

  • Learning hacks supported by research but ignored by the media (3 out of 5)
    • Testing: Test people before they have a chance to study. It primes your brain and exploits the “hypercorrection” effect: our tendency to remember the correct answer to a question we turned out to be wrong about
    • Spacing: Intervals between practice make learning stick longer. A useful technique is to learn several subjects at once. Switching provides natural breaks.
      • “Difficulty isn’t a sign that you’re not learning but ease is”. To maximize stickiness you actually want to re-learn something just after you have forgotten it! Your steepest learning occurs when the task is difficult.
    • Interleaving: Mixing types of problems will extend the time it takes to learn one type but improves broader ability to match approach to the type.

Grit is Misunderstood

West Point study: the survey which measured grit was more predictive than the conventional metrics for predicting who would complete Beast Barracks (physically demanding module of training). This grit survey was applied to other domains like the Spelling Bee championship contenders. Grit appears to have a measurable effect, independent of other variables.

  • Problem with these studies is they suffer from the same “restriction of range” problem
  • The measured effect is significant but small. Much smaller than what companies are interested in testing for. 
  • Sample of people is dedicated to a short term task like winning a spelling bee or completing their training. Very difficult to generalize to a wider measure of an individual’s determination when the task is less well-defined
  • When zoomed out, we find that attrition is a poor proxy for ‘lack of grit’. Attrition is occurring at a time when people in these studies are going through periods of rapid self-discovery and personality change during their early 20s (this is the peak change period in our lives) and re-assessing as they search for “match quality”: the degree of fit between work, interests, and ability.
  • Grit is not necessarily stable. It seems to vary within the same individual depending on the context or task.
  • In general, the study of grit has been confined to very short term, narrow environments

Avoiding Premature Optimization

Paul Graham admonishes against working towards some projection of future self when you are young since what you can conceive is too limited because your experience is limited. Too risky to throw yourself on a path based on such a limited hunch. 

  • Our personality is only .23 correlated between teen years and middle age
  • We learn by doing then reflecting, rather than introspecting to form a theory about ourselves. Frequent trial and error is a better way to decide which direction to go.
  • Harvard’s Dark Horse Project studies how people match careers. The students who matched best excelled in short term planning.
  • Economist Robert Miller refers to the “multi-armed bandit process”. Metaphor of a gambler pulling levers in a casino and getting feedback before focusing on a game. He advocates jumping into high risk, high reward fields early because you learn the most from them. That informational signal is a faster input into your decision path.
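
(A minimal sketch of the bandit idea, using an epsilon-greedy rule. This is a standard textbook strategy swapped in for illustration, not Miller's model; the payout probabilities, the 10% exploration rate, and the function name are all assumptions. Each pull either tries a random lever, which is exploring, or goes back to the lever with the best average so far, which is exploiting.)

```python
import random

def epsilon_greedy(payout_probs, pulls=5000, epsilon=0.1, seed=0):
    """Return how many times each arm was pulled."""
    rng = random.Random(seed)
    counts = [0] * len(payout_probs)    # pulls per arm
    totals = [0.0] * len(payout_probs)  # total reward per arm
    for _ in range(pulls):
        if 0 in counts or rng.random() < epsilon:
            # Explore: pick any arm at random (also used until every arm is tried).
            arm = rng.randrange(len(payout_probs))
        else:
            # Exploit: pick the arm with the best observed average payout.
            arm = max(range(len(counts)), key=lambda i: totals[i] / counts[i])
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts

# Two hypothetical "careers": one pays off 30% of the time, the other 60%.
counts = epsilon_greedy([0.3, 0.6])
print(counts)
```

Most pulls end up on the better lever, but only because some pulls were spent sampling both. That is the case for early trial and error.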

Opportunity to recombine

Information, including specialized information, is disseminated more widely and quickly than ever and at an increasing rate, giving people greater opportunity to recombine from all the available information.

  • Parallel trenches: “everyone’s in their own trench and not usually standing up to look over at the next trench even though that might be where their answer is” (this is why he hires translators)
    • Gunpei Yokoi — Nintendo employee who used lateral thinking to recombine older, cheaper, “withered” technologies to create products including the GameBoy. The GameBoy competed with more advanced products on the basis of its ease and durability.
    • Yokoi viewed cutting edge technologies as zero-sum arms races fought by specialists. “Many more opportunities to take this stuff that was already well known that everyone was looking past and recombine them in new ways”
    • We are in an age where it’s feasible for a generalist to crowdsource specialists in novel ways which allow them to outperform specialists themselves (Kaggle has been able to solve problems that have stumped NASA)
    • Specialists perform better when the next steps are clear and the path is more obvious. The right mix of generalists and specialists depends on how well characterized the problem is.
    • 3M has many interesting examples and lateral thinking is entrenched in their DNA. They maintain a “periodic table of technologies” so its teams can use their awareness to recombine. 
  • Superman or Fantastic Four
    • Metric that best predicted a comic book creator’s potential to write a blockbuster was the range of genres they covered, not reps or experience. 
    • In addition, they found that a team of writers with combined experience in diverse genres outperformed a single writer unless the single writer was fluent in at least 4 genres. “Individual in some ways is the best unit for integrating information” although a diverse team is next best.
  • To a specialist with a hammer “everything looks like a nail”
    • Specialists continuing to administer procedure in face of evidence that it doesn’t work
      • Scandinavian meniscus study: placebo (sham) surgery results undermine the claimed benefit of the real surgery
      • Practices that make intuitive sense (“bioplausible”) but are poorly supported by evidence
        • When outcomes are poor surrogates for health: stents for otherwise healthy people with a narrowed artery do not reduce their heart attack or mortality rates. A wider artery is not a perfect proxy for the desired outcome, despite the intuition: “There’s a clogged artery, how could opening it up not work. It’s got to work except it turns out the body’s much more complicated than like a kitchen sink, and we didn’t design it. And it’s the disease is much more diffuse.” (Me: any counterintuitive but effective remedy that works by using a seemingly oblique strategy is at risk of confusing surrogate markers for the outcome. Hormetic processes, body’s use of iron, etc).
  • A better way forward
    • Need generalists to work with the specialists for a more zoomed out view which better aligns practice with objectives. Medicine seems especially prone to the errors and resistance to reform that can result when an inordinate number of specialists populate a “wicked” learning environment
    • Medicine and similar “wicked” environments are “devilishly” hard. It will take generational change, as the entire approach to “how information is evaluated and how scientific thinking works” has to change. Need to de-specialize a bit and increase breadth. Statistical understanding requires more than “hitting buttons on a statistical program”
    • Freeman Dyson has said we need more birds in medicine. “Frogs are down on the ground looking at like a very narrow area of the ground, the birds are up. They don’t have a good definition on the ground, but they see the bigger picture. And I think we need to make the medical ecosystem more friendly to some of these birds who are looking at the outcomes we actually care about, not just those surrogate markers or did I fix the meniscus?”

Invest Like the Best: Andy Rachleff


About Andy: Partner at Benchmark Capital and CEO of Wealthfront

Benchmark Capital started in 1995 by 5 equal partners (including Bill Gurley)


  • Turn your opponent’s biggest strengths into weaknesses
    • The biggest competitor at the time was Kleiner Perkins and ‘the best venture capitalist that ever lived’ John Doerr. Benchmark would woo portfolio companies using a team approach since not all Kleiner Perkins companies had access to Doerr.
    • The second strength of KP they flipped was the promise of doing business with other portfolio companies. Benchmark painted this advantage as an obligation that companies would be free from if they joined Benchmark. Benchmark took a backseat to the portfolio companies’ management and did not demand to be the chairman of the board.
  • The other interesting thing they did was not allow the partners to ‘suck up the economics’ in the room. As soon as partners felt it was time to relax they needed to step aside for the younger team to be able to step up.
  • “Putting the gun in the other person’s hand”
    • Partner Bruce Dunlevie’s philosophy of dealing with people trustfully: if a person took advantage of him, he would not work with them again. This technique would usually engender trust and good faith in others

Product Market Fit

  • Products that are ‘bought not sold’. Delighted customers demand the product.
  • Running a business with such a product leaves lots of room for operational error and explains how a “25 year old can run a billion-dollar business”
  • The first book on the topic was Steve Blank’s “The Four Steps to the Epiphany”, which his eventual student Eric Ries would update and improve with “The Lean Startup”. These books used the scientific method to approach business
    • A ‘value’ hypothesis needs to be proven
    • A ‘growth’ hypothesis is validated if growth is exponential and organic (ie word of mouth).
      • Growth hacking via experiments and A/B testing.
    • Typical businesses focus on the who, where and what and iterate on the what. Great technology companies ramp a new technology by finding the ‘who’. This is often not obvious and leads to non-consensus outcomes. This is now commonly understood (Me: reminds me of ‘theory of demand aggregation’)

His role as operator vs investor

  • Now as CEO of Wealthfront vs an investor, a few points:
    • The skills aren’t necessarily transferrable
    • He speaks less on boards realizing how little perspective he has compared to management
  • “Crossing the Chasm” by Geoffrey Moore first book that discussed product adoption cycle and diffusion of innovation

Wealthfront features that grabbed my attention

  • Peer reviewed rules based strategies
    • Tax loss harvesting (added 1.8% pa). Automating it in software allows more consistent application of decades-old strategy (Me: Twitter discussions suggest this is highly overstated)
    • Tax loss harvesting within an index adds 25-50 bps pa. This includes selling index components that had losses and buying correlated names to maintain exposure
    • Portfolio line of credit leveraging risk-based margining. For accounts >100k this provides access to cheap loans
    • No hedge funds or expensive alts because of the Groucho Marx problem: “I don’t want to be a member of any club that will have me”. The best institutional investors are long term, not performance chasing (ie endowments and charitable foundations). The worst of the funds can’t access them, so those funds would be the only ones open to listing on retail platforms. Classic adverse selection.

Business strategy not always best self strategy

  • In business, amplifying what you excel at has a better payoff than improving weaknesses. He asserts that this is also professionally true at the career level since differentiating expertise is a large determinant of a person’s value-add. He mentions that this is not the same strategy one should employ in their personal life, where boosting your weaknesses as a person is very valuable. In professional life, learning from success can certainly be more important than learning from failure. “I’m not hiring you because of what you can’t do. [I’m hiring you] because you have learned some tricks!”
  • Well-rounded people are interesting to talk to but not necessarily the best teammates in a business.

Invest Like the Best: Brad Stulberg


About Brad: Performance coach and author of Peak Performance


  • Studying brains we find that people can summon extreme abilities if they have core beliefs which override the fear responses in their brain (ie lifting a car off of a person trapped underneath). The importance lies in having core beliefs or purpose.
  • The Growth Equation: Stress + rest = growth
    • Need the right amount of stress/stimuli. Too much stress is overwhelming, too little leads to no adapting.
    • Just manageable challenges are those which are just outside your comfort zone. It’s self-defeating to onboard too many of these at once.
  • Mechanics of creativity
    1. Immersion: this work is stress
    2. Incubation: stepping away (this is the rest)
    3. Creative insight…this tends to happen after a period of rest…end of a vacation, in the shower, taking a walk.
  • Studies show deep work cycles are most effective in 45 to 90 min blocks followed by 15-20 min breaks.

Practical Tips

  • Whether you perform better in the am or pm is largely biologically determined. Manage your energy not time. If you are better at focused work in the am, then reserve that time for that.
  • Time in nature or outside is shown to improve stress, physical, cognitive markers.
  • Study of air force cadet squadrons showed that group performance was more influenced by the lowest common denominator as opposed to the leading performer. The group has more to gain from eliminating bad attitudes than from enhancing leadership.
  • Fatigue happens in the brain, not the body. Your central nervous system slows you down. How to overcome this? When interviewing peak performers you find that they are thinking about a larger purpose than what they are doing. So a marathoner thinking of his family can override his brain’s fatigue signals.

Invest Like the Best: Jason Karp


About Jason: Founder and CIO of Tourbillon Capital Partners

Growth of private markets

  • Smaller supply of scalable opportunities and increased competition
    • Fewer companies going public, and companies staying private for longer. The number of listed companies is about 50% of what it was 20 years ago (mostly due to M&A and lack of IPOs as opposed to bankruptcy)
    • Unicorns able to get unprecedented amount of private funding
  • Increasingly competitive public markets for short term performance.
    • Data from prime brokers shows that over 90% of flows driven by non-discretionary accounts (CTA, systematic, quant, passive).
      • He argues that this has caused large dislocation opportunities in valuations over the past 5 years, whereas historically values have typically taken no longer than 3-5 years to converge to fair valuations.
    • Short term pressures on advisors (quarterly performance, fee compression) incentivize a move away from mark-to-market and from the costs of wooing performance-chasing allocators.
    • Information edges are gone or not scalable
      • Data points include the sheer number of funds and books referencing ‘value’ investing.
      • Growth of quant platforms like Quantopian.
      • The growth of data sets favors short term trading which is the domain of quants.
      • Ratio of sell side analysts to public stocks is at an all-time high
      • Reg FD neutralized many active managers’ edge b/c large commission paying funds could no longer get privileged info

How to compete in a competitive, expensive market

  • Look for real business growth. If a company is growing faster than its implied multiple there is a margin of safety that can protect an investment in the event of multiple contraction
  • The deep value game is difficult since it’s a basket of adverse selection. Only a small minority will turn around their distressed situation. It’s easier to look for ‘value’ names that are growing than it is to pick ‘deep’ value names to revert
  • Align investors with long term horizon as a structural edge. The only way to profit from longer-term dislocations. The current environment has a historic high correlation of growth to momentum which is a trend that continues to pay off, in turn, reinforcing the exit of value flows and increase in quant/momentum flows setting up a historic relative value opportunity.
  • Understand cyclicality and stickiness- He calibrates the riskiness of a business with the nature of its sales. Stickier businesses are less risky (ie high consumer daily engagement)
    • Fashion is unpredictable
    • Cyclicals are dependent on GDP
    • Staples are more dependable

Invest Like the Best: Dan Egan


About Dan: Managing Director of Behavioral Finance and Investing at Betterment

Uses insights from behavioral economics to nudge more adaptive behavior

Design a better dashboard

  • speed bumps on mobile platform to discourage impulsive trading
  • ‘tax impact preview’; the magnitude seems to be less influential than the outright presence of this speedbump; interesting side note is that this has a larger effect in Census areas known to be Republican
  • When messaging entire user base, they would sometimes prompt action (ie selling stock) in users who would have done nothing; they learned to send messages to people who were about to commit an action. This improved efficiency of the message by eliminating ‘false positives’
  • Displaying information which aligns focus with objectives instead of just defaulting to emotionally charged performance metrics or even the red/green triggers we are used to seeing
  • Recent related WSJ article pointing out the tyranny of what is displayed to you:

Designing a system informed by your beliefs when you are rational to pre-empt decisions you might make when emotional

  • Cranky judges: parole decisions are heavily influenced by the time since the judge’s last meal. For such important decisions, we need a better system. Doctors’ diagnoses are subject to the same effect!
  • Building custom tailored indices aligned with people’s stated goals

Notes from Invest Like the Best: Richard Craib


About Richard: Founder and CEO of Numerai; hedge fund which crowdsources machine learning algos

What does Numerai do?

Numerai runs open contests where it supplies scrubbed, unlabelled financial data and respondents submit ‘targets’. They do not need to submit their algorithms. In contrast with Quantopian, the system doesn’t rely on trust. Respondents preserve their IP, which encourages collaboration, but it also means the submissions have zero interpretability.

Problem is set up in a specific way, data is highly normalized, and doesn’t seem to be too non-stationary. Respondents compete on the predictive value of their machine learning algos.

Important criteria for crowdsourcing to be effective

  • Diversity of opinion
  • Non-overlapping (ie low correlation)
  • Decentralized


  • Model Edge
    • Users are using more unique methods than just neural nets, random forests, and support vector machines.
    • There are diminishing returns to model improvements, but because you can crowdsource rather than hire, it is cost effective to seek the best signals
  • Data Edge
    • Two Sigma claims that extremely clean data is a huge driver of edge
    • RenTec reportedly does not use machine learning, and much of their edge is surmised to be a long, unique data history. In fact the data may be >> the talent. [Me: Interesting as a moat to think of unique data sets trading firms can have: executions, unstructured data such as voice trades that are passed on vs executed]
    • They do not introduce the complexity of unstructured data
    • Data normalizing is important as well as normalizing the targets to risk-adjusted returns
  • Ensembling the models via staking
    • The info content in how users staked their entries with NMR crypto exceeded the conclusions from Numerai’s extensive research into how to ensemble the models to construct portfolios. Craib, being friends with many of the leaders in the crypto space, saw that bitcoin could be used to trustlessly pay contestants. NMR evolved as a smart contract to enable the entire staking mechanism, providing Numerai with a powerful ‘skin in the game’ filter. It also served to discourage spam since there is a cost to submit an entry.
      • The Sharpe ratio of the strategies with staking was about 2 vs 1.5 for unstaked. The ‘floor Sharpe’ was 1 since the data set was already high quality
    • They pay about 5k per week in prizes. $8mm to date.
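
(A minimal sketch of what weighting an ensemble by stake could look like. This is an illustrative assumption, not Numerai's actual method: the function name, the user names, and the simple stake-proportional average are all hypothetical. The idea is that a model with more ‘skin in the game’ behind it gets more say in the combined signal.)

```python
def stake_weighted_ensemble(predictions, stakes):
    """Average each user's predictions, weighted by that user's share of total stake."""
    total_stake = sum(stakes.values())
    if total_stake <= 0:
        raise ValueError("at least one entry must carry a positive stake")
    n = len(next(iter(predictions.values())))
    combined = [0.0] * n
    for user, preds in predictions.items():
        weight = stakes.get(user, 0.0) / total_stake  # unstaked entries get zero weight
        for i, p in enumerate(preds):
            combined[i] += weight * p
    return combined

# Hypothetical entries: two users each predict two targets.
predictions = {"user_a": [0.2, 0.8], "user_b": [0.6, 0.4]}
stakes = {"user_a": 3.0, "user_b": 1.0}  # user_a has staked three times as much
ensemble = stake_weighted_ensemble(predictions, stakes)
print(ensemble)
```

With these numbers user_a carries 75% of the weight, so the ensemble lands at roughly 0.3 and 0.7, closer to user_a's predictions.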

Notes from Invest Like the Best: Michael Kitces


About Michael: Leading expert on financial planning and building advisories


Financial Advisory Fee Model History

Commission Model

  • Until 1975: high commissions. Stockbrokers made $200 in 1975 dollars per execution; fees were set by price controls in the aftermath of the 1920s free-for-all, when clients were ripped off during the bull market preceding the Great Depression.
  • 1975: May 1st, “May Day”: deregulation of stock commissions. Some brokers thought they could raise prices, but Bay Area-based Charles Schwab used computers to undercut them, and over the next 20 years commissions fell 90%.

Fee Model

  • 1980s to 2011: Mutual fund model rises as stockbrokers go out of business. Independent broker dealers rise as the creation of financial products is unbundled from distribution. Financial advisors would recommend mutual funds that were not manufactured by the wirehouse they worked for, creating less conflict of interest than under the stockbroker model. Mutual fund assets rise from 1/2 trillion to 5 trillion! Advisors lobbied for 12b-1 fees, which allowed them to charge a recurring fee on assets to support their advisory business since commission dollars were now unsustainably low.

Internet Era

  • Technology platforms allow funds to be distributed direct-to-consumer. Etrade: “It’s so easy a baby can do it”
  • 1998: Schwab One Source Program pioneers the no-load fund
  • Advisor model becomes the AUM model, where once again the value prop to consumers improved: advisors were now constructing diversified portfolios tailored to the client’s goals instead of jamming them into a single loaded fund. The rise of the fee-based account and RIAs.
  • Mutual funds are in decline as advisors are no longer incentivized to sell them and because of their inferior structure relative to ETFs. Competing advisors actually wanted to cut high fee funds from client portfolios. We are witnessing the acceleration of this process now as we are actually seeing net flows out of mutual funds in aggregate. It took 15 years to get to this point; he expects another 10-15 to finish the trend.
  • Advisors are disintermediating funds: “wholesale transfer pricing”. When there is competition up and down the value chain, the owner of the relationship, in this case the advisor, stands to survive.
  • Roboadvisors have competed for the assets of self-directed investors much more than they have displaced human advisors. Ironically, they were the biggest threat to Schwab and Vanguard, and it turns out those firms were the first to respond with robo-advising of their own

The new model: Barbell

  1. Economies of scale and tech. From 1995 until now fees dropped yet another 90%. Commoditized fund and ETF fees going to zero. Large firms like Fidelity, Schwab, Vanguard are ok with this because as fund managers they capture AUM fees on the back end.
  2. Niche advisors that add value in ways that often have nothing to do with asset management.

Understanding the New Model

  • Advisor fees average about 1%, although closer to 75bps on a weighted basis (larger accounts get discounts).
  • Robo advisors started at 25 bps which was a venture backed hypothesis price. They have been steadily raising fees and it looks to settle in around 35 bps. So the gap between a robo-advisor and a full service human advisor is about 40 bps.

Advisors recognize they need to add enough value to justify the premium.

How are they doing it?

  • Comprehensive financial planning (taxes, estate)
  • Upgrade talent. The bar is going up dramatically in terms of credentials. A basic license is no longer a viable qualification to compete.
  • Specialized expertise: most advisors advise about 100 clients and the top 20% are 80% of the revenue, meaning you can serve tiny niches
    • Niches:
      • Seniors
        • Social security timing (when to take it)
        • Medicare and health plan guidance
      • Narrow cohorts
        • Doctors at a certain hospital with complex hospital negotiation process
        • Competitive bass fishermen (endorsements and prize money guidance)
        • Expats from certain countries

Does the data support advisor fees declining or them fighting to add more value at constant revenue?

Instead of the average fee declining, we are seeing advisor profit margins decline as they add value and costs to ‘defend the 1%’ fee. Profits are not exploding despite markets at record highs

Michael’s view on flat fee versus AUM fees

On AUM fees:

“the only business model that I’m aware of anywhere in the history of any industry where your average revenue per client automatically goes up at a real rate above inflation simply by keeping your client, because your [fee] lifts with the return of the markets, and the return of the markets is generally risk premium plus inflation, so we get this natural lift in a world where every other industry that ever existed has to actually go back to their people to ask for an increase and get them to buy in”

This makes it difficult for the flat fee business to compete.
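
(A worked sketch of that ‘natural lift’. Every number is an assumption for illustration: a $1mm account, a 1% AUM fee, 7% market returns, 2% inflation, and a flat fee that starts at the same dollar amount and is raised only with inflation. It ignores fee drag, contributions, and withdrawals.)

```python
def aum_vs_flat(assets=1_000_000, aum_fee=0.01, market_return=0.07,
                inflation=0.02, years=10):
    """Return a list of (year, AUM fee revenue, flat fee revenue) tuples."""
    flat_fee = assets * aum_fee  # flat fee starts at the same dollar amount
    rows = []
    for year in range(years + 1):
        rows.append((year, assets * aum_fee, flat_fee))
        assets *= 1 + market_return  # AUM revenue rides the market automatically
        flat_fee *= 1 + inflation    # flat fee only keeps pace with inflation
    return rows

rows = aum_vs_flat()
for year, aum_revenue, flat_revenue in rows:
    print(f"year {year:2d}: AUM ${aum_revenue:,.0f}   flat ${flat_revenue:,.0f}")
```

Under these assumptions both advisors start at $10,000 per year. Ten years later the AUM advisor collects roughly $19,700 without ever asking for a raise, while the flat fee advisor collects roughly $12,200.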

But why are flat fees the future?

This will be the dominant structure in 20 years because the size of the addressable market will widen.

Currently, only about 1/3 of households have over $100k in investable assets. But 1/2 of these assets are in 401k plans which cannot be advised. This leaves less than 20%.

Of that proportion,

  • 1/3 are DIYers
  • 1/3 are “validators”. They have an idea of what they want to do and need some guidance, but are unwilling to pay an AUM fee for validation
  • 1/3 are “delegators” and thus great clients

Current AUM model only addresses about 7% of the population
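
(A quick check of the funnel arithmetic, using only the fractions quoted in these notes. Treating the 401k share of assets as if it were a share of households is a simplification the notes already make; the variable names are mine.)

```python
from fractions import Fraction

households_over_100k = Fraction(1, 3)  # households with over $100k investable
share_outside_401k = Fraction(1, 2)    # half the assets sit in 401k plans
delegators = Fraction(1, 3)            # the third willing to hand over management

advisable = households_over_100k * share_outside_401k  # the "less than 20%" step
addressable = advisable * delegators                    # strict product of the fractions
upper_bound = Fraction(1, 5) * delegators               # using the rounded 20% figure

print(f"advisable: {float(advisable):.1%}")
print(f"addressable: {float(addressable):.1%} to {float(upper_bound):.1%}")
```

The strict product is 1/18, or about 5.6%. Starting from the rounded 20% gives 1/15, or about 6.7%, which matches the "about 7%" figure.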

So fee-for-service has an opportunity because 50% of the population need financial guidance but have neither the asset level nor the will to delegate to an advisor.

He says the model may shift from “1% of assets to 1% of income”