The problem with analyzing Unicorns

Aileen Lee wrote a very interesting piece on TechCrunch in which she lays out some of the analysis she has done on “Unicorns,” or startups that have entered the $1b club.

It’s a rarefied club, to be sure. In fact, it’s rarefied enough that I would call into question any conclusions one might draw from Aileen’s analysis. The long version of why drawing them is not a good idea was the subject of my last post, but the short version is this: 39.

The starting data set is 39 companies. Perhaps it’s a bit more than that as people expand the list, but it’s a pretty small number no matter what, and it is hard to reach any definitive conclusions with a number that small.

Anyone who has done a controlled study has seen many situations where the first 10 results you get point in the exact opposite direction of the later conclusion. For instance, right now the conclusion is that nearly half of the co-founders in “Unicorn” companies have worked together in school. But, if the four unicorns from this year happen to have met after college, then that number could drop to 40%.

What would a theoretical 20% drop in a single year tell us? If we follow the original line of thinking that produced the analysis, then we would say that there is a new trend to follow! Don’t invest in folks who have worked together in school; it’s on the downswing! But of course the likely reality is that we were making assumptions based on a statistical anomaly.
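To put a rough number on how wobbly a percentage from 39 companies is, here is a minimal sketch that computes an approximate 95% Wilson score interval for a proportion. The count of 19 out of 39 is my own stand-in for “nearly half,” not a figure from Aileen’s analysis, and the function name is just for illustration.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a proportion (z=1.96 is ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Assumption: 19 of 39 stands in for "nearly half"; it is not a reported figure.
low, high = wilson_interval(19, 39)
print(f"plausible range: {low:.0%} to {high:.0%}")
```

Under that assumption the plausible range is wide enough to contain both “nearly half” and 40%, so a move between the two is not evidence of a trend.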

The other issue is one of selection bias. This becomes most obvious when you see that non-whites and women are under-represented on the list, or that the Top 10 Universities are over-represented. It may actually be that a CEO from an “underdog” University has a higher percentage chance of building a billion dollar company, and simply that fewer of them get venture funding in the first place.
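Here is a small sketch of how that could happen. Every number in it is invented purely for illustration (the only thing borrowed from the real list is the total of 39), so treat it as a thought experiment rather than data.

```python
# Hypothetical numbers, made up for illustration only.
funded = {"top-10 school": 1000, "underdog school": 100}   # founders who got funded
unicorns = {"top-10 school": 30, "underdog school": 9}     # of those, became unicorns

total_unicorns = sum(unicorns.values())
share_of_list = {g: unicorns[g] / total_unicorns for g in funded}
success_rate = {g: unicorns[g] / funded[g] for g in funded}

for group in funded:
    print(f"{group}: {share_of_list[group]:.0%} of the list, "
          f"{success_rate[group]:.0%} of funded founders succeeded")
```

In this made-up world the top-10 group dominates the list while the underdog group has the higher success rate, which is exactly the pattern that looking only at the list would hide.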

As we talk about all this data, let’s just keep it in the proper context. No one should be investing based on these stats. I don’t think that’s what Aileen is suggesting, but I worry that other investors are going to start taking these as markers for where they should be putting their money to work. And if you are a founder, I wouldn’t suggest taking any steps at all based on this data.

As an example, when Spark and USV made the first several rounds of investment in Tumblr, David Karp was a solo founder from a non-elite school who was drastically below the average age for a “Unicorn CEO.” He would not have passed any test put in place by this kind of thinking.

Instead of trying to draw conclusions from this data, I would treat this type of research the way you should treat a Malcolm Gladwell book (and to be clear, I think he’s one of the best storytellers of our time). These pieces are fascinating in the way that a biography of Jeff Bezos or Steve Jobs is fascinating. They give us a window into a state of the world, but they are largely anecdotes. Speaking in percentages doesn’t change that fact.

Posted by nabeel