Comments and Discussion
Steven N. Durlauf and Jeffrey C. Fuhrer

Steven N. Durlauf:

This ambitious paper tackles an extraordinarily difficult question: what is the role of formal statistical models in evaluating economic policies? In particular, the paper studies the use of such models by central banks in various capacities. Although the paper addresses a wide range of issues and provides a fair amount of qualitative description of how central banks use large-scale models, its main contributions are twofold. First, it provides an evaluation of the forecasting performance of the Federal Reserve. Second, it addresses several broad questions concerning the appropriate ways of using statistical models in the policy process. I will deal with each of these components in turn.

Sims' evaluation of the forecasting accuracy of the Federal Reserve provides some useful additions to a long-standing literature, most recently represented by a paper by Christina Romer and David Romer.1 Sims compares both the judgmental and the model-based forecasts of the Federal Reserve with two alternatives: naïve forecasts and a consensus forecast from the private sector. What is new in Sims' comparison relative to that of Romer and Romer is the attention to the relative virtues of the judgmental and the model-based forecasts. The main claims Sims makes are, first, that the Federal Reserve forecasts well, especially when forecasting inflation; second, that the informational content of the different forecasts is highly correlated, so that strong claims of the superiority of one forecast over another should be treated as suspect; and third, that there does not appear to be strong evidence that the judgmental forecasts of the Federal Reserve are superior (as measured by the root mean square forecast error) to its model-based forecasts.

Although these points are well taken, the analysis succeeds less well in giving a clear understanding of the differences between the model-based forecasts and the forecasts that embody subjective judgments. One limitation is that the procedures used for forecast comparison are not well chosen if one's objective is to go beyond crude summary measures of relative forecast accuracy to an understanding of why forecasts differ. Root mean square error is certainly a sensible single summary statistic for comparing forecasts, but like all such summaries it is limited. In my view, an additional useful way of comparing two forecasts is to identify the periods when they diverge relatively sharply and compare their behavior at those times. I suspect that differences between the Federal Reserve forecasts and the private sector forecasts are larger during shifts across business cycle regimes than in other periods. Put differently, the fact that two forecasts are approximately equally accurate in periods when they are close to each other is not informative about their relative performance when they are far apart. Presumably what one is interested in is whether one forecast performs better than the other precisely when the differences between them are relatively large.
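To make the suggestion concrete, here is a minimal sketch in Python of such a conditional comparison. The forecast series, the variable names, and the top-decile divergence threshold are all hypothetical illustrations, not anything drawn from Sims' analysis.

    import numpy as np

    def rmse(forecast, actual):
        """Root mean square forecast error."""
        return np.sqrt(np.mean((forecast - actual) ** 2))

    # Placeholder data: realized values plus two noisy forecasts of them,
    # standing in for the Federal Reserve and private sector series.
    rng = np.random.default_rng(0)
    actual = rng.normal(2.0, 1.0, size=200)
    fed = actual + rng.normal(0.0, 0.5, size=200)
    private = actual + rng.normal(0.0, 0.5, size=200)

    # Unconditional comparison: one summary statistic per forecast.
    print("overall   Fed:", rmse(fed, actual), " private:", rmse(private, actual))

    # Conditional comparison: restrict attention to the periods in which
    # the two forecasts diverge sharply (here, the top decile of |Fed - private|).
    gap = np.abs(fed - private)
    mask = gap >= np.quantile(gap, 0.9)
    print("divergent Fed:", rmse(fed[mask], actual[mask]),
          " private:", rmse(private[mask], actual[mask]))

The point of the exercise is simply that the unconditional and conditional root mean square errors answer different questions: two forecasts can be indistinguishable on the first measure while one clearly dominates on the second.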

Further, it would seem that a deeper evaluation of the forecast differences should address in greater detail the relationship between forecasts and their use, especially if one is interested in how subjective judgment affects forecasts. In a 1997 paper (which, curiously, Sims does not reference), David Reifschneider, David Stockton, and David Wilcox give three justifications for the use of judgmental over model-based forecasts: first, the ability of the former to use "potentially valuable information contained in monthly and weekly data" not incorporated into the model; second, the integration of "extramodel information and anecdotal evidence into the forecast"; and third, the ability to address model uncertainty: "the judgmental approach . . . enables the staff to examine a range of econometric specifications—both structural and reduced form—in producing the forecast rather than relying on a single specification enshrined in the 'staff model.'"2 These are all plausible reasons for using judgment, and all would seem relevant to evaluating the effectiveness and value of subjective judgment in Federal Reserve forecasting. Although Sims' paper gives some attention to the question of information asymmetries, virtually none is given to these other explanations for why judgmental forecasts deviate from model-based ones. Now, some of these reasons may not be identifiable from available data, but...
