Evan Soltas
May 3, 2014

Why Participation Is Down

There have been many attempts to answer this question: Is the decline in the U.S. labor force participation rate structural or cyclical? Or, more precisely, to what extent is it either one?

And there have been so many attempts because it really is an important question. Think about the economy as a big machine that takes three inputs -- technology, labor, and capital -- and produces output. The drop in the labor force means that the U.S. has forfeited, perhaps permanently, that labor input and whatever marginal output it would have yielded. A simple calculation[1] suggests that the share of output lost is about three percent; more in-depth calculations from Reifschneider, Wascher, and Wilcox (2013) place it at the center of their estimate of a seven-percent drop in potential output. That's a lot. You don't blow three percent of GDP, let alone seven, every day.
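For the curious, here is a rough sketch of where the three-percent figure comes from. It assumes, as in footnote 1, a Cobb-Douglas production function with a labor share of 0.6 and a 5-percent drop in labor input, with capital and technology held fixed. The function name is mine, purely for illustration.

```python
def output_ratio(labor_ratio, labor_share=0.6):
    """Ratio of new to old output under Cobb-Douglas, Y = A * K**(1 - s) * L**s,
    when only labor changes (capital K and technology A held constant)."""
    return labor_ratio ** labor_share


# A 5-percent drop in labor input means labor is 0.95 of its old level.
drop = 1 - output_ratio(0.95)
print(f"Output falls by about {drop:.1%}")
```

Because output scales with labor raised to the power 0.6, a small percentage drop in labor translates into roughly 0.6 times that drop in output.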

Another reason that economists keep coming back to the labor force participation rate is that, ominously, it keeps falling. Not only does that render much of the research overtaken by events, but also the data presents a challenge to reports that see the decline as cyclical and transitory.

I'd also say that the research continues[2] because it hasn't settled on a single analytical framework. That's not necessarily a bad thing at all, as disagreement over methods forces researchers to reconcile differences in results rather than herd around a single conclusion. Yet to a certain degree it reflects dissatisfaction with the methods offered so far.

In this post I take an approach that is mostly[3] new to the question of the decline in the labor force participation rate but will be familiar to most labor economists: the Blinder-Oaxaca decomposition of a probit model for the labor force participation decision. I use microdata from the March 2007 and 2013 supplements to the Current Population Survey, downloaded from IPUMS. I conclude that, of the 2.8-percentage-point decline in the labor force participation rate over that six-year period, more than half (1.7 percentage points) can be explained by underlying changes in demography, though a substantial fraction (1.1 percentage points) cannot.

The method

For the majority of my audience that has no idea what a Blinder-Oaxaca decomposition is, here's a quick 101. It's a statistical technique invented by Alan Blinder and Ronald Oaxaca in 1973 that takes the change in a variable and determines how much of it can be explained by a set of other variables in a model and how much can't. (Note: What comes next gets rather mathy, but you can skip down to "My idea..." if math isn't your thing.)

For example, Blinder and Oaxaca both wanted to understand why people differ in their earnings. Let's say that you think pay is determined by a bunch of factors, like your education, work experience, occupation, and so on. Let's put all of those factors into a matrix X, which contains data on lots of people. Let's put all of their earnings into another matrix Y. Then we can estimate the impact of all of those factors by an ordinary least squares regression:

Y = Xβ + Ε

where β is the matrix of coefficients, which reflects the impacts of the factors, and Ε is a matrix of residuals.

Now here's the innovation from Blinder and Oaxaca: If we want to understand a change in Y between two periods, then in the context of our model, there can only be two things going on. Either X or β could have changed -- that is, there could have been an underlying change in the determinants X of pay Y, or there could have been a change in the impacts of factors, reflected in β. We can express that idea as:

ΔY = Y_a - Y_b = (X_a - X_b)β_b + X_a(β_a - β_b)

where the "a" subscripts are for the "after" period and the "b" subscripts for the "before" period. You can think of the first term as the explained share, changes in the composition of the independent variables. And you can think of the second term as the unexplained share, changes in effects.
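To make the two terms concrete, here is a minimal sketch of the decomposition for a one-variable wage regression. The data (years of schooling and hourly pay for five people in each period) are made up for illustration, and the helper names are mine. It is a toy version of the idea, not the code behind the results in this post.

```python
from statistics import mean


def ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def blinder_oaxaca(x_b, y_b, x_a, y_a):
    """Split the change in mean y into explained and unexplained parts."""
    int_b, slope_b = ols(x_b, y_b)
    int_a, slope_a = ols(x_a, y_a)
    # Explained: change in average characteristics, valued at "before" coefficients.
    # (The constant's characteristic is always 1, so it drops out of this term.)
    explained = (mean(x_a) - mean(x_b)) * slope_b
    # Unexplained: change in coefficients, valued at "after" characteristics.
    unexplained = (int_a - int_b) + mean(x_a) * (slope_a - slope_b)
    return explained, unexplained


# Hypothetical data: schooling (years) and hourly pay in each period.
school_b, pay_b = [10, 12, 12, 14, 16], [20, 24, 25, 29, 33]
school_a, pay_a = [12, 12, 14, 16, 18], [22, 23, 27, 32, 35]

explained, unexplained = blinder_oaxaca(school_b, pay_b, school_a, pay_a)
total = mean(pay_a) - mean(pay_b)
print(f"total {total:.2f} = explained {explained:.2f} + unexplained {unexplained:.2f}")
```

Because a least-squares line with an intercept always passes through the sample means, the two pieces add up exactly to the change in average pay.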

Now, my application of this model is a little bit more complex, because we're trying to explain a binary variable. "Are you in the labor force?" can get an answer of yes or no. So I've used something called a probit model, which allows us to estimate the probability that you answer yes or no to that question, given your characteristics. Our changes in the probabilities can also be divided in just the same way into changes in characteristics and changes in the effects of characteristics.
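Here is a sketch of how the same split can work for probabilities. It is one common way to decompose a probit, not necessarily the exact routine I ran. The coefficients below are invented rather than estimated, and the three characteristics (a constant, an age-65-plus flag, and a school-enrollment flag) are placeholders.

```python
from statistics import NormalDist, mean

PHI = NormalDist().cdf  # standard normal CDF, the probit link


def mean_participation(people, beta):
    """Average predicted probability PHI(x . beta) over a group of people."""
    return mean(PHI(sum(b * x for b, x in zip(beta, row))) for row in people)


# Each row: (constant, aged 65 or over, enrolled in school). Hypothetical people.
people_b = [(1, 0, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 0, 0)]
people_a = [(1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 0), (1, 0, 1)]

# Hypothetical probit coefficients for the "before" and "after" periods.
beta_b = (1.0, -1.8, -0.9)
beta_a = (0.9, -1.8, -0.9)

# Explained: swap in the "after" people but keep the "before" coefficients.
explained = mean_participation(people_a, beta_b) - mean_participation(people_b, beta_b)
# Unexplained: keep the "after" people and swap in the "after" coefficients.
unexplained = mean_participation(people_a, beta_a) - mean_participation(people_a, beta_b)
total = mean_participation(people_a, beta_a) - mean_participation(people_b, beta_b)

print(f"total {total:+.3f} = explained {explained:+.3f} + unexplained {unexplained:+.3f}")
```

The counterfactual in the middle, the "after" people scored with the "before" coefficients, is what lets the total change split cleanly into the two parts.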

These techniques might seem exotic or advanced to newcomers. To economists, though, they're standard practice. So much so that it's surprising that I was not able to find a single piece of research that did what I think should be the first cut at answering this oh-so-important question about the decline in labor force participation. 

My idea, to be sure, was pretty simple. Here, I'll explain it without the math. Create a model that includes everything you think might be relevant to the decision of whether to participate in the labor force or not. Find data on an "after" period (March 2013) and a "before" period (March 2007). Then see what change in the labor force participation rate the model predicts. But, whatever you do, don't tell the model that a recession happened between 2007 and 2013. Include everything you think might explain the labor force participation decision in a structural capacity -- but nothing else.

My dataset is the March 2007 and 2013 supplements to the Current Population Survey. That gives me a sample size of roughly 150,000 people for both years. To predict whether or not each of these people is in the labor force, I had data on lots of different things: their age, sex, race, marital status, health status, disability status, education, whether they are currently enrolled in school, whether they're a war veteran, whether they have young children at home, and whether they're on welfare.

It turns out that all this information is enough to make a good guess at whether you actually are in the labor force or not. On average, the model gets it right 81 percent of the time, assuming that you think of predictions of 50 percent and above as a "yes" and below 50 percent as a "no."
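The "gets it right" figure is just a share of correct yes/no calls. Here is a minimal sketch of that scoring rule, with made-up probabilities and outcomes; the function name is mine.

```python
def accuracy(probabilities, in_labor_force, threshold=0.5):
    """Share of people whose predicted yes/no matches their actual status.

    A predicted probability at or above the threshold counts as a "yes".
    """
    calls = [p >= threshold for p in probabilities]
    hits = sum(call == bool(actual) for call, actual in zip(calls, in_labor_force))
    return hits / len(calls)


# Hypothetical predictions and actual labor force status (1 = in, 0 = out).
predicted = [0.9, 0.6, 0.5, 0.4, 0.2]
actual = [1, 0, 1, 0, 1]
share_correct = accuracy(predicted, actual)
print(f"Share correct: {share_correct:.0%}")
```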

And I've deliberately gone out of my way to include common narratives about why the labor force participation rate has fallen. The aging and retirement of the Baby Boomers. The rise in worker disability. The rise in college enrollment. Furthermore, the unexplained share of this method will identify the specific areas of unexplained changes -- for instance, if women en masse suddenly have decided to stop working (and it turns out they haven't), this method will point at that issue. So one of the huge advantages to this approach is that it allows us to do a bunch of tests of specific theories one-by-one and say whether they hold water or not. 

The results

The headline result is that 1.7 percentage points of the decline in the labor force participation rate are explained by changes in the demographic composition of the population, and that 1.1 percentage points are left unexplained. The 95-percent confidence intervals on those figures are that between 1.4 and 1.9 percentage points are explained and between 0.8 and 1.4 percentage points are unexplained. 

This is a good place to note that I've made my .do files available here, so that you can go home and replicate this work, as I know you're all dying to do.

What matters to explaining the decline in the labor force participation rate? One thing above all else: aging, which explains 1.3 percentage points of the drop. The next most important: enrollment in school, which explains 0.8 percentage points of the drop. Remember that individual explanations can sum to more than the total, because there are other changes that partially offset. For example, the rise in educational attainment, which comes from this enrollment, explains a 0.6-percentage-point rise in the labor force participation rate, because the well-educated work like crazy.

What matters less? The rise of disability, which explains 0.2 percentage points of the drop. The decline in the birthrate during the recession, which would suggest a 0.1-percentage-point increase, since fewer people are tied down at home with four-year-olds. 

And what just straight up doesn't matter? Changes in the share of people on welfare, disability aside. Changes in health, after accounting for disability and age. Changes in the sex and race composition of the labor force.
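Here is a quick tally of the itemized contributions, using only the figures quoted above. The sign convention is mine: negative numbers push the participation rate down, positive numbers push it up, all in percentage points.

```python
# Contributions to the change in the participation rate, in percentage points.
contributions = {
    "aging": -1.3,
    "school enrollment": -0.8,
    "educational attainment": +0.6,
    "disability": -0.2,
    "fewer young children at home": +0.1,
}

gross_declines = sum(v for v in contributions.values() if v < 0)
offsets = sum(v for v in contributions.values() if v > 0)
net = gross_declines + offsets

print(f"declines {gross_declines:.1f}, offsets {offsets:+.1f}, net {net:.1f}")
```

The declines alone add up to 2.3 points, more than the 1.7 points explained in total, and the offsets bring the itemized net to about 1.6. I haven't itemized the remaining tenth of a point here; it may simply be rounding.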

It also turns out that there's no single category that absorbs most of the unexplained share. In fact, the model puts almost all of the unexplained share into a constant. Which basically means that the model is saying, "Whatever your background, take what your probability of being in the labor force was in 2007 and mark it down by some amount for your 2013 probability." I found this compelling evidence that what our model says is unexplained really is the business cycle, and not some omitted structural explanation.

Here, also, is maybe a conclusion you wanted: What does the model predict the labor force participation rate is in March 2013 based on these changes in composition? 64.7 percent, as compared to an actual rate of 63.5 percent. Perhaps this makes you view my conclusion differently, if "less than half cyclical" sounded dour. This wouldn't be a trivial amount of recovery, as you can see in this graph. The black dot not on the line indicates the March 2013 counterfactual.


I've been meaning to write a post on this for a long time. It is the analytical challenge of our era for economists. It's taken me so long to put together an estimate because I wanted an approach I could defend.

One valuable side note is that the change in the working-age labor force participation rate is probably a good rule of thumb for the change in the overall structural labor force participation rate. The drop is about the same as predicted. Which makes sense: These are people whose labor force decision should not be sensitive to the business cycle. They're in the working period of their lives.

I should also mention some shortcomings of this analysis. One of them is that I've only used data from two months, the March 2007 and 2013 CPS supplements. This was mainly out of convenience, as that was the data available on IPUMS, the database I linked to earlier.

Another concern is the obvious endogeneity problem with education. That is, if the economy's terrible, that affects your decision of whether to work now or to go back to school. But note that this problem is insoluble without a model of how the economy affects education decisions, something well beyond the scope of my work here. What my work suggests, though, is that this exercise is worthwhile. Since you get a year older every year, there's not a lot of mystery to the aging-working link. But, since we know now that education decisions were actually important to driving down overall labor force participation, maybe we should go back and think about it carefully.

A final concern is that a lot of the prior research I looked at includes what are called "cohort effects," that is, you think about labor force participation evolving differently for different generations of people, based on their pre-recession starting age. I don't do that in this model. If cohorts matter, this approach will miss it.

Part of my hope in writing this post, whether or not you agree with the overall conclusion, is to enlighten people about the explanatory power of all the theories on the table. If you're on the right, and walk away from this post saying, "Gosh, I wasn't convinced that the decline in the labor force participation rate is partly cyclical, but wow, maybe it really isn't all about more people on welfare," I'll take that as a victory. Or, if you're on the left, and think, "Gosh, I wasn't convinced that the decline in the labor force participation rate is more than half structural, but wow, maybe aging is a bigger part of the story than I thought," I'll also take that as a victory. And, for sure, this won't be the last word. There are many other compelling approaches, each with its advantages and disadvantages. But I think this is an important one that needs to be added to the conversation.

If you have questions, I'm happy to answer them in the comments.


1: Assume that GDP is described by a Cobb-Douglas aggregate production function with a labor share of 0.6, consistent with U.S. levels. Then, holding capital and technology constant, you would predict that a 5-percent drop in the labor force participation rate would cause a 3-percent drop in output.

2: You can find a good literature review in Erceg and Levin (2003).

3: There is an exception, Hotchkiss and Rios-Avila (2013). But it does something I think is not good, which is that it includes a measure of labor-market conditions. My approach differs importantly in that I don't include one because I want to see the conclusions of the model without telling it about the recession. I have some other concerns about the particular measure they've chosen and whether we really can include it in the model if it is codetermined with labor force participation.

Update: I've made my fully cleaned up .dta file available for direct download here.


Further results:

Alan Reynolds of the Cato Institute asked me to try repeating the decomposition with broader measures of welfare programs -- the one I used originally was narrow, i.e. TANF, and Reynolds wanted SNAP (food stamps), Medicare, and Medicaid.

Following other ideas in the comments, I also included cubic and quartic terms in age, so as to better approximate the curve of the LFPR in age. I found that inclusion of the extra age terms didn't do much.

I found that the increase in the fraction receiving public health insurance was an important explanatory variable for the decline of the labor force participation rate: It explains about 0.6 percentage points. I found the increase from SNAP was rather small: It explains 0.2 percentage points. In the new specification, fully 2.5 percentage points of the 2.8-percentage-point drop in the LFPR is explained by changes in the composition of the workforce.

I would strongly caution Alan, or anyone really, against interpreting this as a causal result. Don't conclude that because Obama expanded Medicaid and food stamps, those new recipients aren't working any more. I imagine that most of this growth was the result of the business cycle. The causal pathway probably goes from unemployment to those programs. I am aware Medicaid expanded permanently, but there is no way to disentangle this.