Notes from Invest Like the Best: Richard Craib


About Richard: Founder and CEO of Numerai, a hedge fund that crowdsources machine learning algos

What does Numerai do?

Numerai runs open contests in which it supplies scrubbed, unlabelled financial data and respondents submit ‘targets’. They do not need to submit their algorithms. In contrast to Quantopian, the system doesn’t rely on trust. You preserve your IP, which encourages collaboration, but the setup also has zero interpretability.

The problem is set up in a specific way: the data is highly normalized and doesn’t seem to be too non-stationary. Respondents compete on the predictive value of their machine learning algos.
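
The podcast does not say how the data is normalized. One common approach, assumed here purely for illustration, is to replace each raw feature value with its rank scaled into (0, 1), so that every feature lands on the same scale and outliers lose their pull. The function name `rank_normalize` and the sample values are hypothetical.

```python
def rank_normalize(values):
    """Map raw values to evenly spaced ranks in (0, 1).

    Illustrative assumption only, not Numerai's documented pipeline.
    Ties are broken by original position (sorted() is stable).
    """
    n = len(values)
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    normalized = [0.0] * n
    for rank, idx in enumerate(order):
        # Centre each rank inside its bucket so results stay strictly in (0, 1).
        normalized[idx] = (rank + 0.5) / n
    return normalized


raw_feature = [10.0, 30.0, 20.0, 1000.0]  # made-up values with one outlier
print(rank_normalize(raw_feature))  # [0.125, 0.625, 0.375, 0.875]
```

The outlier 1000.0 ends up at 0.875, no further from its neighbour than any other value.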

Important criteria for crowdsourcing to be effective

  • Diversity of opinion
  • Non-overlapping (i.e. low correlation)
  • Decentralized
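
A rough way to check the low-correlation criterion is to measure the average pairwise correlation between submitted predictions. This is a minimal sketch, not something described in the podcast: the model names and prediction values are made up, and `mean_pairwise_correlation` is a hypothetical helper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(submissions):
    """Average Pearson correlation over every pair of submissions."""
    pairs = list(combinations(submissions.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical predictions from three contestants on the same four rows.
submissions = {
    "model_a": [0.10, 0.40, 0.35, 0.80],
    "model_b": [0.20, 0.50, 0.30, 0.90],
    "model_c": [0.90, 0.10, 0.60, 0.20],
}
print(mean_pairwise_correlation(submissions))
```

A value near 1 would mean the crowd is mostly submitting the same signal; the lower the average, the more an ensemble has to gain from combining them.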


  • Model Edge
    • Users are using more unique methods than just neural nets, random forests, and support vector machines.
    • There are diminishing returns to model improvements, but because you can crowdsource rather than hire, it is cost-effective to seek the best signals
  • Data Edge
    • Two Sigma claims that extremely clean data is a huge driver of edge
    • RenTec reportedly does not use machine learning, and much of their edge is surmised to be a long, unique data history. In fact the data may be >> the talent. [Me: Interesting as a moat to think of unique data sets trading firms can have: executions, unstructured data such as voice trades that are passed on vs executed]
    • They do not introduce the complexity of unstructured data
    • Data normalizing is important as well as normalizing the targets to risk-adjusted returns
  • Ensembling the models via staking
    • The info content in how users staked their entries with NMR crypto exceeded the conclusions from Numerai’s extensive research into how to ensemble the models to construct portfolios. Craib, being friends with many of the leaders in the crypto space, saw that bitcoin could be used to trustlessly pay contestants. NMR evolved as a smart contract to enable the entire staking mechanism, providing Numerai with a powerful ‘skin in the game’ filter. It also served to discourage spam, since there is a cost to submit an entry.
      • The Sharpe of the strategies with staking was about 2 vs 1.5 for unstaked. The ‘floor Sharpe’ was 1 since the data set was already high quality
    • They pay about 5k per week in prizes. $8mm to date.
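
To make the staking idea concrete, here is a minimal sketch of weighting each model's predictions by the amount staked on it, plus a basic annualized Sharpe calculation. These notes do not describe Numerai's actual ensembling or portfolio construction, so everything here is an assumption for illustration: the function names, the stake amounts, the weekly return series, and the choice of 52 periods per year.

```python
from math import sqrt
from statistics import mean, stdev


def stake_weighted_ensemble(predictions, stakes):
    """Average predictions row by row, weighting each model by its stake.

    Models with zero stake contribute nothing (the 'skin in the game' filter).
    """
    total_stake = sum(stakes[name] for name in predictions)
    if total_stake <= 0:
        raise ValueError("at least one model must have a positive stake")
    n_rows = len(next(iter(predictions.values())))
    return [
        sum(stakes[name] * preds[i] for name, preds in predictions.items())
        / total_stake
        for i in range(n_rows)
    ]


def sharpe_ratio(returns, periods_per_year=52):
    """Annualized Sharpe ratio: mean over sample stdev, scaled by sqrt(periods).

    Assumes a zero risk-free rate and at least two non-identical returns.
    """
    return mean(returns) / stdev(returns) * sqrt(periods_per_year)


# Hypothetical predictions and stakes for three contestants.
predictions = {
    "model_a": [0.2, 0.8],
    "model_b": [0.6, 0.4],
    "model_c": [0.9, 0.1],
}
stakes = {"model_a": 3.0, "model_b": 1.0, "model_c": 0.0}  # model_c is unstaked

print(stake_weighted_ensemble(predictions, stakes))

# Made-up weekly returns, only to show the Sharpe calculation.
weekly_returns = [0.004, -0.002, 0.006, 0.001, 0.003]
print(sharpe_ratio(weekly_returns))
```

With these stakes the ensemble leans three to one toward model_a and ignores model_c entirely.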
