From Flirting With Models host Corey Hoffstein:
My guest in this episode is Kai Wu, CEO and founder of Sparkline Capital. Kai is a pioneer in the measurement of intangible value. Using machine learning, he tackles unstructured data sources like patent filings, earnings transcripts, LinkedIn network connections, and GitHub code repositories to try to measure value across the four key pillars of Brand, Intellectual Property, Network, and Human Capital.
We discuss why intangibles are important, how they differ from the traditional factor zoo, the opportunities and risks of unstructured data, and how even big data can have small data problems within it.
Finally, we discuss Kai’s most recent applications of his research to the world of crypto.
A word on my notes:
These are just interesting bits that stood out to me, not a comprehensive summary. Kai and Corey pick over many nuanced questions related to unstructured data, meta-problems in data analysis, distinctions between Kai’s “4 pillars”, and techniques. I encourage you to listen to the whole episode to appreciate the depth that both of them are able to bring to the discussion. Kai has thought about these problems deeply and Corey, despite being an outsider, asks extremely pointed questions reflecting his own deep appreciation for the pitfalls of number-crunching.
To take advantage of machine learning truly requires rather large investments. Alternative data and the infrastructure required to support it can be very expensive. And even worse, the prohibitive item here really is getting the right people to run it. Machine learning is complicated and has many pitfalls. And it’s also a relatively new field, so the pool of experienced folks is pretty small.
I actually wrote a paper in May 2019 called Machine Learning in the Investment Management Age. In this paper, I outlined three ways to apply machine learning to the industry:
The father of value investing, Ben Graham, wrote Security Analysis in the 1930s. The world was very different. The big companies were railroads and industrial firms. Buying stocks below book value was a reliable way to make money. Fast forward to today. We have Google and Apple, which don’t use tangible capital to generate earnings. They rely on intangibles. We have these four pillars at Sparkline: brand, intellectual property, network, and human capital.
These are the pillars most firms rely on today. Our research has shown that intangible capital has grown from basically 0% to 60 to 80% of the capital stock of the S&P 500. Meanwhile, the efficacy of traditional value metrics like trailing earnings or book value has declined. Baruch Lev and Feng Gu, in their excellent book The End of Accounting, show that the R-squared of using book value and earnings to explain market caps cross-sectionally used to be 90% in 1950, and it had fallen to around 50% by 2010, and that was 10 years ago.
So I’m not the first person to argue that value investors need to incorporate intangible assets into their assessment of corporate value. But as far as I can tell, we are the first firm to use machine learning and unstructured data to measure this value. For example, we use LinkedIn data to track the flow of human capital from company to company, or Twitter to measure the brand perception of firms. These datasets require using machine learning to take the unstructured data and form it into factors for each of these four pillars, which we can then use to trade. So basically, we have two big insights at the firm.
By combining these two insights, we hope to help investors access the opportunities in these undervalued intangible assets.
There are a dozen or so researchers who have written about how to incorporate intangibles into measures of book value. While they each have slightly different approaches, the common theme is that they all rely on accounting data to measure intangible assets. To be more specific, they focus on two particular line items in the accounting statements: R&D and SG&A.
The idea is that R&D and SG&A are expensed rather than capitalized. For example, if I were to spend $10 million building a factory to manufacture a new drug that I developed, that capex is capitalized and goes on my balance sheet. On the other hand, $10 million of R&D to develop the drug that will then be manufactured is considered a cost that comes out of income. This inconsistency means that investments in intangible capital are treated not as an asset but as an expense. So, led by Baruch Lev, who we mentioned just a second ago, a lot of different researchers have now decided to treat intangible investments the same way they do tangible investments, in other words, to build balance sheet assets for intellectual property and brand.
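[Kris: To make the capitalization idea concrete, here is a minimal Python sketch of one common way to turn expensed spending into a balance sheet asset: each year the existing stock is written down by an amortization rate and the new spending is added. The 20% rate, the spending figures, and the reported book value are my own illustrative assumptions, not numbers from Kai or from any of the papers he mentions.]

```python
def capitalize_spending(annual_spend, amortization_rate):
    """Accumulate expensed spending into a capital stock, year by year."""
    stock = 0.0
    stocks = []
    for spend in annual_spend:
        # Last year's stock is written down, then this year's spending is added.
        stock = stock * (1.0 - amortization_rate) + spend
        stocks.append(stock)
    return stocks


# Hypothetical firm: $10 million of R&D per year for five years.
rd_spend = [10.0, 10.0, 10.0, 10.0, 10.0]

# ASSUMPTION: a 20% annual amortization rate, chosen only for illustration.
rd_capital = capitalize_spending(rd_spend, amortization_rate=0.20)

# ASSUMPTION: a made-up reported book value, in $ millions.
reported_book = 50.0
adjusted_book = reported_book + rd_capital[-1]

print(round(rd_capital[-1], 2))   # 33.62
print(round(adjusted_book, 2))    # 83.62
```

Under these assumptions, five years of expensed R&D shows up as roughly $33.6 million of extra book value, which is what makes the firm look cheaper on an adjusted price-to-book basis.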
If you take price-to-book plus capitalized R&D, you end up with a slightly more comprehensive version of a value factor. This adds somewhere between one and four points of excess return each year. The problem for us, though, is that value is still in a deep drawdown notwithstanding. So, while these are very sensible adjustments, they’re not a panacea. I think the limitations are twofold.
Corey: As more and more firms adopt NLP tools to rapidly trade news releases and earnings transcripts, how do you outrun the adversarial issue where CEOs may now get coached against using specific words and phrases, or coached to use specific words and phrases?
Kai’s answer confirms just how much of an arms-race market communication can be!
I love this question. Look, investing is like poker. It’s a game theoretic endeavor. One of my favorite papers is actually called How To Talk When A Machine Is Listening. And it has a really interesting finding. So there’s this dictionary called the Loughran and McDonald dictionary. It consists of a bunch of lists of words, like positive and negative keywords. And the key is that it’s adapted to the finance industry. It was created by two finance professors specifically to classify financial jargon. It was published in 2011 and quickly became widely used in natural language processing. The paper How To Talk When A Machine Is Listening found that companies started to avoid using the negative Loughran and McDonald words in their 10-Ks and 10-Qs after this dictionary was published. So yeah, this is a very real thing. As investors watch and try to make sense of unstructured data and deceit in general, CEOs will try to manipulate the narrative to their advantage.
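[Kris: A minimal Python sketch of how a word-list tone score works, and why it is so easy to game. The word lists below are tiny placeholders I made up for illustration, not the actual Loughran and McDonald lists, and the `tone_score` function is my own simplification rather than anything Kai describes using.]

```python
import re

# ASSUMPTION: illustrative placeholder lists, NOT the real dictionary.
NEGATIVE_WORDS = {"loss", "decline", "impairment", "litigation", "adverse"}
POSITIVE_WORDS = {"growth", "improve", "strong", "gain", "achieve"}


def tone_score(text):
    """Return (positive count - negative count) / total word count."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    return (pos - neg) / len(words)


# The same bad news, stated bluntly and then with the listed words avoided.
blunt = "We recorded a loss and an impairment amid adverse conditions"
coached = "We recorded a shortfall and a writedown amid challenging conditions"

print(tone_score(blunt))    # -0.3
print(tone_score(coached))  # 0.0
```

The two sentences carry the same message, but the second one scores as neutral simply because it steers around the listed words. That is the arms race in miniature.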
The way we deal with this is we define three buckets of data with varying levels of susceptibility to such a manipulation.
Porting our model into crypto was actually pretty seamless. Brand and human capital matter just as much for Web3 as for Web2 organizations. So we were really able to just apply the framework wholesale with no modifications. The big difference in crypto is that the data sources are different. But because Web3 is being built in the open, in many ways, crypto is actually an even more attractive area to apply this framework. So we focused on three different data sets.
I think what makes us confident in this strategy and gets us all excited about it is this is an inefficient frontier asset class, and very few other investors, if any, are approaching it with systematic valuations. So it just stands to reason that there might be some alpha here.
You’re right that in general, the token economics are a bit different from that of equities… Many of these projects are using tokens as a method of financing their growth, but they want to avoid technically calling them equity securities from a regulatory standpoint. That doesn’t diminish the actual value in these tokens. Let’s take the example of Ethereum. ETH is a utility token. It is required if you want to use the Ethereum network. Therefore, the value of ETH is a function of the demand for the Ethereum network. This logic applies to any other token, whether it’s a video game, a decentralized exchange or a blockchain. The value of tokens will be a function of demand for the underlying project. So our framework attempts to establish what is the fundamental attraction of these underlying projects. So in this way, we’re actually much more similar to venture capitalists. We think about these projects as early stage startups. They may not have monetized their projects or their users or whatever yet, but if they have a lot of users, a robust development community and a strong brand, it certainly does bode well for their ability to flourish ultimately, which of course would somehow filter down to the token investors profiting.
[Kris: This response sparked a thought for me. A casino requires its chips to be able to play. But the chips themselves never increase in value even though they provide utility in the form of “access to entertainment and gambling”. And for poker pros, the chips are literally an on-ramp to their professional “business”. And still the chips do not increase in value. The analogy is weak since the casino can always produce more chips but it’s just a reminder that the value of a “token” is not just a function of its user base, but its supply, the incentive to increase its supply, and the alternatives. If a user can just cash out because there’s a quality competing casino or blockchain, it acts as a limitation on any token’s value.]
Dogecoin, which is a joke, derives its main value from its brand. A lot of people think it’s funny, they like it. It’s kind of fun to play with. So its primary pillar is brand.
On the infrastructure side, you have things like Filecoin for decentralized storage.
Decentralized exchanges: similar to how the NYSE and CME derive value from the fact that many buyers and sellers want to aggregate liquidity on their platform, the same goes for Uniswap and Sushi.
So yeah, very much the same concept here. What we’re trying to look for, same as with equities, are firms where you have a bit of everything. What we’ve discovered is that simply having one pillar is generally insufficient for success. I always give the example of Wozniak & Jobs. You have technology and IP, but you really need marketing as well. So what we’re looking for in crypto organizations, stocks, whatever, the asset class doesn’t matter, is strength on all of the advantages, or, you know, as much as possible.
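[Kris: One way to picture "a bit of everything" beats "one strong pillar" is to compare a plain average of pillar scores with a geometric mean, which punishes any weak pillar. This is purely my own toy illustration in Python. The scores are invented, and nothing here is Sparkline's actual scoring method.]

```python
import math

PILLARS = ("brand", "intellectual_property", "network", "human_capital")


def simple_average(scores):
    """Arithmetic mean of the four pillar scores."""
    return sum(scores[p] for p in PILLARS) / len(PILLARS)


def balanced_score(scores):
    """Geometric mean of pillar scores in (0, 1]; one weak pillar drags it down."""
    return math.prod(scores[p] for p in PILLARS) ** (1.0 / len(PILLARS))


# ASSUMPTION: hypothetical scores on a 0-to-1 scale, made up for illustration.
one_pillar = {"brand": 0.9, "intellectual_property": 0.1,
              "network": 0.1, "human_capital": 0.1}
all_round = {"brand": 0.3, "intellectual_property": 0.3,
             "network": 0.3, "human_capital": 0.3}

for name, scores in (("one_pillar", one_pillar), ("all_round", all_round)):
    print(name, round(simple_average(scores), 3), round(balanced_score(scores), 3))
```

Both hypothetical projects have the same simple average, but the geometric mean ranks the all-round project well above the one that leans on a single pillar.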