Most early-stage venture capitalists use very little data when investing. It is largely a world of intuition, mutual relationships, and in some cases sector knowledge or thesis development.
But that tide is shifting. Today there was an article on how Steve Blank now thinks accelerators should go the “Moneyball” route. As “big data” gets more popular, the idea of using a quantitative approach to make better early-stage investment decisions is becoming more common. As Rob Go recently wrote, there is a strong rise in the number of VCs employing data scientists, and a couple have even made a firm-wide bet on being entirely data-driven.
Unfortunately, most of the press coverage of this trend falls into a very surface-level narrative about “quantitative vs qualitative” investing. That is, early-stage investing is about gut belief and vision, and you better not try to rely on numbers to make a decision. This is a false conflict, and it blinds us to the fact that data, properly applied, doesn’t deter our intuitive senses; it informs them.
"Quants vs the Scouts"
I think the best illustration of how the “qualitative vs quantitative” argument is false is to examine how Moneyball in baseball actually worked. The story depicted in the book was, partly for dramatic purposes, presented as a “quants vs intuition” situation. But that’s really an oversimplification.
There is no question that Moneyball has transformed baseball. While the book depicted the transformation of the Oakland A’s, it was famously the Boston Red Sox that used quant approaches to baseball scouting to bring the Curse of the Bambino to an end. I was living in Boston at the time and, being intellectually curious about their approach, I spent quite a bit of time talking to some of the quants involved with the Red Sox to really understand what they had done. I think it can help illustrate a way forward for VCs that will help both startups and investors.
The truth is that both the grizzled old-school scouts and the new-school finance nerds use statistics heavily in their evaluation of players. Scouts were using numbers like ERA, pitch speed, and the “impact toolbox” of observational stats like hitting for power, combined with their own judgement. Meanwhile, quants were looking at new sets of data like on-base percentage. Incredible storytelling aside, Moneyball was really just about the evaluation of two things:
a) whether the use of a host of new statistics like on-base percentage would be a helpful addition to the more traditional stats in evaluating talent.
b) whether a “statistics only” approach could bear better fruit than the mixture of qualitative and quantitative analysis that was the standard.
The answer to (a) turned out to be yes: new ways of looking at stats allowed formerly overlooked players to be unearthed. Those that adopted these new metrics faster had a serious upper hand, until inevitably most teams caught up with the tactic.
And in the case of (b), whether these new quant methods would perform well as the only signal, the results are equally clear. As Nate Silver details in his book “The Signal and the Noise,” using only these new metrics as a way of evaluating talent was a disaster. These models consistently underperformed the hybrid quant/qual approach of the best scouts.
Pure “moneyball” doesn’t even work in baseball!
The reason qualitative evaluation is still so important is that in baseball, venture investing, product management, and in fact any activity, there is always a gap between what we know we can measure and what is actually happening. And while we constantly strive to find more objective truth, we must use qualitative assessment to best approximate the areas we have not yet found a way to measure.
The fact that we still find this hybrid approach to be the most performant method in baseball, which is possibly the most measured activity we have outside of pure finance, should give us some indication of how well a purely quantitative approach would perform in early stage venture investing.
Where Moneyball VC likely won’t work
Niels Bohr famously quipped, “prediction is very difficult, especially if it’s about the future.” That said, our ability to predict is not evenly distributed. To state the obvious, the big question here is whether startups are an environment where the conditions exist for us to model. Whether any area is ripe for forecasting depends on three criteria, and I’ll extend the baseball metaphor just a little bit longer for comparison.
1) What are our inputs? In baseball, the data piece is fairly straightforward: there are hundreds (even thousands) of players, with thousands of pieces of data each, and their stats are completely public. 
In startups the inputs are relatively sparse and often inaccurate. While there is preliminary research indicating that models like Multiple Criteria Decision Analysis (MCDA) might be helpful in deciding which startups will succeed, that’s only true if we have data that actually correlates with success. Suffice it to say, Alexa ratings and Karma scores don’t get you there. We have no easy means of even starting with the kind of consistent data that baseball scouts, economic forecasters, and weathermen start with.
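To make the MCDA idea concrete, here is a minimal sketch of its simplest form, a weighted sum over criteria. The criteria names, weights, and scores below are hypothetical placeholders of mine, not taken from the research mentioned above, and the output is only as good as the inputs, which is exactly the problem.

```python
# Minimal weighted-sum MCDA sketch.
# ASSUMPTION: the criteria and weights are invented for illustration only.
WEIGHTS = {"team": 0.40, "market": 0.35, "traction": 0.25}


def mcda_score(scores, weights=WEIGHTS):
    """Combine per-criterion scores (each in the range 0..1) into one number."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def rank(startups, weights=WEIGHTS):
    """Return startup names ordered from highest to lowest combined score."""
    return sorted(
        startups,
        key=lambda name: mcda_score(startups[name], weights),
        reverse=True,
    )


# Hypothetical startups with hypothetical scores.
startups = {
    "A": {"team": 0.9, "market": 0.5, "traction": 0.2},
    "B": {"team": 0.5, "market": 0.8, "traction": 0.6},
}
print(rank(startups))
```

Note that shifting the weights toward "team" would flip the ranking here, so the choice of weights is itself a qualitative judgment.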
2) What are our outputs? The second thing we need is enough positive outcomes (i.e., exits) with the relevant data to draw causal conclusions. In consumer technology, for instance, there have been 10 companies in the last 5 years with over $1B in M&A or IPO value. Compare this to the 750 players who are succeeding in the majors each year and you begin to see why, as Mike Greenfield says, "big data beats small data."
So far we have a dearth of information on both of these axes. We have much less data, and much less reliable data, than baseball does. And we have fewer results because so few billion-dollar companies are being created.
Since the ultimate goal is a large-scale exit, this sets a significantly higher bar than picking who is going to be a great major league player next year. A more comparable statistical challenge in baseball would be trying to pick who is going to be in the Hall of Fame, 5 years from now, by staring at somewhat spotty data on thousands of minor league players. Not particularly ripe territory.
And that is before we take our third criterion into account.
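A rough way to see how much the number of positive outcomes matters: suppose some trait shows up in 70% of the successes we can observe. The 70% figure is a made-up assumption for illustration; only the 10 and the 750 come from the numbers above. The sketch below uses a Wilson score interval, one common choice, to show how uncertain that share is at each sample size.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# ASSUMPTION: a hypothetical trait seen in 70% of observed successes.
small = wilson_interval(7, 10)      # 10 billion-dollar exits
large = wilson_interval(525, 750)   # 750 major league players

for label, (low, high) in (("10 exits", small), ("750 players", large)):
    print(f"{label}: {low:.0%} to {high:.0%}")
```

With 10 outcomes the plausible range runs from roughly 40% to 89%, which is too wide to act on. With 750 it narrows to roughly 67% to 73%.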
3) The math is constantly changing. One of the most common mistakes people make in statistical forecasting is to assume that all statistics are like physics, a fundamentally unchanging set of rules that might be unknown but are fixed.
In areas where this is true, such as weather prediction (which is based on actual physics, of course), we have improved our forecasting accuracy immensely over the last 20 years. But in areas where the relationships between the numbers change, we have made little progress.
One area that has such a problem is the assessment of young founders. Many investors try to take a read on a founder’s leadership skills, persistence, determination, and understanding of their own strengths and weaknesses. But, as Arthur Jensen discusses in a brilliant book on brain development called The G Factor, a person’s mental abilities don’t actually start to settle until their mid to late 20s.
And sure enough, baseball scouts have developed similar theories that the mental part of the game is not worth evaluating deeply before the age of 25. This is another reason why evaluating minor league talent is so much harder than evaluating major league talent. You just have no idea if that person is going to develop into a great on-field leader because their brain is still changing too aggressively and unpredictably. With 23-year-old founders, you’ve got that same problem.
A second example is when externalities change all the data, best represented by black swan events. These are rare events that, by their very definition, could not be included in any data set. An earlier startup of mine, Ambient Devices, was started just prior to 9/11, a categorical black swan event that drastically changed our ability to raise money, sell product, and partner with some of the financial institutions we were talking to at the time. The opening of the Facebook platform was a similar event: it affected a whole generation of startups but could not have been predicted even a couple of years beforehand.
My friend Rob Coneybeer, an investor at Shasta Ventures, used this analogy, “Moneyball VC is like trying to predict the next great player when every time you play the game you are changing the number of outs, innings, and number of players on the field.”
So let’s step back a moment. Imagine we were an awesome quant forecaster, looking to make our mark on the world and revolutionize some industry the way baseball forecasting was changed. If we made a list of hundreds of potential markets, then rated them for ripeness based on these key criteria, it is likely that early-stage venture capital would be in the bottom third of that list.
That said, I still believe there are some places where quantitative analysis could help investors and entrepreneurs. I’ll try to detail some early thinking on that tomorrow in Where “Moneyball VC” may work.
(Thanks to Keith R, Andrew P and Mike G for reading drafts of this and helping form my thinking)
 What’s most important to both quantitative and qualitative assessment is the caliber of the inputs. The St. Louis Cardinals, which had one of the most legendary scouting programs in baseball, have transformed their program into a quant-based approach over the last decade. However, they have not replaced their scouts out in the field with armies of wonks pivoting tables in Excel all day. In fact, they have increased the number of scouts on the team to complement the addition of quants. They understood that the rise of deeper statistical analysis is not mutually exclusive with qualitative analysis. Also see http://www.baseballprospectus.com/article.php?articleid=20928
 From Jacob Mullins’s analysis: http://techcrunch.com/2013/03/02/how-do-you-build-a-1b-consumer-company/. I added Tumblr, whose acquisition occurred after that piece was published.
 The G Factor - http://www.amazon.com/The-Factor-Evolution-Behavior-Intelligence/dp/0275961036