A New Generation of Econ Data?
In the beginning, there was no economics data. Greats like Adam Smith wrote treatises on political economy in the equivalent of near-total darkness. Later economists such as Vilfredo Pareto and Alfred Marshall introduced mathematical foundations, changing the direction of what had been a very qualitative philosophical endeavor. After the Great Depression, Paul Samuelson and John Hicks consolidated Keynes' work into the modern field of macroeconomics -- and they received critical (and I might argue significantly under-appreciated) support from econometricians and statisticians like Simon Kuznets.
Kuznets developed the United States' program of national income accounting -- from which the ubiquitous measure of GDP comes -- and more broadly, he put heavy emphasis upon data collection. That enabled empirical analysis and complemented economics' ever more quantitative bent.
Call Kuznets' revolution the First Generation of economics data. Much of it was low-frequency, with figures released on yearly and quarterly bases. Only some data, largely from labor markets and prices, came out with greater frequency. In large part, data was supplied by government bureaus of statistics and industry groups -- a highly centralized model of collection and distribution. And the supply of data was scarce, with each figure an expensive undertaking.
I think we are approaching a Second Generation of economics data. The model is changing, a trend driven by information technology. It's not just the Internet; it's the increasing fraction of economic and social interactions that take place in venues from which data can be collected. With this technological assist, economics will get data with increasing frequency -- monthly, weekly, or even daily. And the high-cost First Generation model is giving way to cheaper decentralized tools, like MIT's Billion Prices Project. Data is becoming plentiful and cheap. And these are early-stage projects compared to where we are going.
Justin Wolfers, one of the economists whose work you need to be following today, recently wrote this on "Big Think":
Economics is in the midst of a massive and radical change. It used to be that we had little data, and no computing power, so the role of economic theory was to “fill in” for where facts were missing. Today, every interaction we have in our lives leaves behind a trail of data. Whatever question you are interested in answering, the data to analyze it exists on someone’s hard drive, somewhere. This background informs how I think about the future of economics.

Specifically, the tools of economics will continue to evolve and become more empirical. Economic theory will become a tool we use to structure our investigation of the data. Equally, economics is not the only social science engaged in this race: our friends in political science and sociology use similar tools; computer scientists are grappling with “big data” and machine learning; and statisticians are developing new tools. Whichever field adapts best will win. I think it will be economics.

One such example is Google's N-grams book search tool, which I've highlighted on the blog in the past to show how it can be used to analyze cultural and linguistic change, such as the death of the long s. Such tools will greatly further the field of economic history. Alongside this, web search frequency is providing new measures. There are many caveats, but the New York Fed's "Liberty Street" blog wrote that such Internet data has implications for forecasting and "now-casting" lower-frequency official data.
Here's one example of a potential application. For obvious reasons, Google searches of the keywords "layoff" and "layoffs" are highly correlated with the Bureau of Labor Statistics' measure of private layoffs. Similar forecasting is already done by the Centers for Disease Control and Prevention using searches for flu-related terms. Economics can become increasingly real-time using such tools.
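To make the idea concrete, here is a minimal sketch of how such a now-cast might work. The data below is entirely synthetic -- a stand-in for a real search-frequency index and an official layoffs series, which the sketch does not fetch -- and the one-variable regression is only an illustration of the approach, not the method any agency actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly series standing in for an official layoffs figure
# and a search-frequency index for "layoffs". Illustration only.
months = 36
official = 100 + np.cumsum(rng.normal(0, 5, months))   # slow-moving official series
search_index = 0.8 * official + rng.normal(0, 3, months)  # search interest tracks it with noise

# Pearson correlation between the search index and the official series.
corr = np.corrcoef(search_index, official)[0, 1]
print(f"correlation: {corr:.2f}")

# A naive "now-cast": regress the official figure on the search index
# using all but the latest month, then predict the latest month before
# the official number is released.
slope, intercept = np.polyfit(search_index[:-1], official[:-1], 1)
nowcast = slope * search_index[-1] + intercept
print(f"now-cast: {nowcast:.1f}, actual: {official[-1]:.1f}")
```

The appeal is timeliness: the search index is available daily, while the official figure arrives with a lag, so even a crude regression like this can give an early read on the number.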