Evan Soltas
Aug 28, 2015

What Ails the American Startup?

For all the hoopla about Silicon Valley, the data are clear: These are rough times to be a young business in America. In the early 1980s, about 12 percent of all firms were less than a year old. In 2012, however, only 8 percent were.

This raises a good question: What's going on? Why are new firms struggling to gain a foothold? Data from the Business Dynamics Statistics of the US Census offer an interesting answer: The problem isn't with the startups. It's with the economy in which they are starting up.

To reach that conclusion, though, we first need to learn a little bit about entrepreneurship in America. You've probably heard the factoid that 9 out of 10 restaurants fail in their first year -- it's false, but never mind -- and actually, only about a quarter of all new firms go bust in their first year. Five years later, 45 percent of firms have survived. It's a pattern, technically called a "survival function," that has repeated itself since at least 1977, when the Census began collecting this data, as the next graph shows.
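For readers who want to see what a survival function is in concrete terms, here is a minimal Python sketch. The function name `survival_function` and the cohort counts are hypothetical. Only the roughly 75 percent first-year and 45 percent five-year survival shares come from the figures quoted above, and the in-between values are made up for illustration.

```python
def survival_function(counts_by_age):
    """Share of a cohort still alive at each age.

    counts_by_age[a] is the number of firms from one founding cohort
    that are alive a years after founding (a = 0 is the founding year).
    """
    if not counts_by_age or counts_by_age[0] <= 0:
        raise ValueError("need a positive firm count at age 0")
    initial = counts_by_age[0]
    return [count / initial for count in counts_by_age]


# Hypothetical cohort of 1,000 new firms; ages 2-4 are illustrative only.
example_cohort = [1000, 750, 640, 560, 500, 450]
shares = survival_function(example_cohort)

for age, share in enumerate(shares):
    print(f"age {age}: {share:.0%} of the cohort still alive")
```

Running the same calculation on each founding cohort in the Census data and comparing the curves is one way to check how stable the pattern is over time.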

Let's take that survival function for granted, then, and focus on two specific phenomena. The first is a year-level effect: something that hits all firms in a given year the same amount, no matter when they were founded. The second is a cohort-level effect: something that hits firms founded in a given year the same amount and sticks permanently with that cohort of firms. (Economists: Scroll to the end of the post for the modeling details.)

You might think of the first as a cyclical or structural shock to the economy and the second as whether it was just a big or small "class" of new firms that year. Using the Census data, we can track the number of firms in each cohort for their first five years of existence, allowing us to disentangle the cohort and year effects. We can answer the question: Are the startups getting worse? Or is survival getting harder?

I find that about half of the decline in new firms from 1977 to 2012 can be ascribed to the year-level effect, and that there has been no average change in the cohort-level effect over the same period. The startups aren't that much worse, essentially, but the economy is much harsher towards them. With the same cohort strength but the prior economy, we would have about 200,000 more startups per year -- and about 700,000 more firms less than five years old. Since the US has about 5 million firms, that's a substantial change.

We can compare the actual decline to a counterfactual without the year-level effects:

Here are a few more graphs to make sense of this. The first shows the cohort-level effect, and you should notice the lack of a downward trend, but also the strong cyclicality, which shows the "smothered in the cradle" effect of recessions on new firm formation. High cohort effects can be thought of as years in which lots of startups launched successfully, whereas low cohort effects are bad years, with few successful launches.

The second shows the year-level effect, and you should notice the persistent downward trend, indicating that, for any given firm, survival is becoming harder.

I've also taken the change in the year-level effect, so that we can see more clearly when survival has become harder. What we see, clearly, are two bloodbaths -- the 1980 and 2008 recessions -- and then a slow decline between them, without any obvious cyclicality.

There's a big takeaway here: The decline in new firms seems to be driven by changes that are making new firm survival more difficult in general, not just a decline in the cohort size itself.

*   *   *

Technical explanation

Let $n_{ft}$ be the log number of firms founded in year $f$ and alive in year $t$. I specify the model:

$$n_{ft} = b_f + b_t + b_{t-f} + e_{ft},$$

where all the $b$ terms are OLS coefficients and $e$ is an error term. Then $b_f$ can be thought of as a cohort-level effect, $b_t$ as a year-level effect, and $b_{t-f}$ as a survival function. Note that this isn't actually a survival model but rather more of a quick-and-dirty test with panel-data techniques, and if $b_t$ increases year-over-year, the model doesn't make any sense. (Fortunately, this isn't a problem for our data set.)
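A regression of this shape can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the code behind the post: the function names and the synthetic panel are hypothetical, and the real input would be the Census firm counts by founding year and observation year. One caveat is that age equals year minus founding year, so the three sets of dummies are linearly dependent and the coefficients are not uniquely pinned down. The sketch leans on `np.linalg.lstsq`, which returns the minimum-norm least-squares solution. That is one possible normalization, and the post does not say which one was used.

```python
import numpy as np


def fit_cohort_year_age(records):
    """OLS of log firm counts on cohort, year, and age dummies.

    records: iterable of (founding_year, observation_year, firm_count).
    Returns the three sets of effects plus fitted and actual log counts.
    """
    records = list(records)
    cohorts = sorted({f for f, t, _ in records})
    years = sorted({t for f, t, _ in records})
    ages = sorted({t - f for f, t, _ in records})
    n_c, n_y = len(cohorts), len(years)

    X = np.zeros((len(records), n_c + n_y + len(ages)))
    y = np.empty(len(records))
    for i, (f, t, count) in enumerate(records):
        X[i, cohorts.index(f)] = 1.0                  # cohort-level effect
        X[i, n_c + years.index(t)] = 1.0              # year-level effect
        X[i, n_c + n_y + ages.index(t - f)] = 1.0     # survival (age) effect
        y[i] = np.log(count)

    # The dummies are collinear, so lstsq picks the minimum-norm solution.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {
        "cohort": dict(zip(cohorts, coef[:n_c])),
        "year": dict(zip(years, coef[n_c:n_c + n_y])),
        "age": dict(zip(ages, coef[n_c + n_y:])),
        "fitted": X @ coef,
        "y": y,
    }


def make_synthetic_panel():
    """Hypothetical data: six cohorts, each tracked from age 0 to age 5."""
    survival = [1.0, 0.75, 0.65, 0.57, 0.50, 0.45]   # illustrative shares
    records = []
    for f in range(2000, 2006):
        for age, share in enumerate(survival):
            t = f + age
            log_n = 10.0 + 0.1 * ((f % 3) - 1) - 0.02 * (t - 2000) + np.log(share)
            records.append((f, t, float(np.exp(log_n))))
    return records


result = fit_cohort_year_age(make_synthetic_panel())
print("largest absolute residual:", np.abs(result["fitted"] - result["y"]).max())
```

Because of the collinearity, the fitted values are well defined, but the split of any linear trend among the cohort, year, and age effects depends on the normalization chosen. Anyone replicating the analysis should settle that choice before reading trends off the coefficients.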

My cleaned dataset is available here.