  • Measuring Power in International Relations
  • Caleb Pomeroy and Michael Beckley

To the Editors (Caleb Pomeroy writes):

In "The Power of Nations: Measuring What Matters," Michael Beckley persuasively argues that aggregate measures of power systematically exaggerate the capabilities of relatively populous countries.1 Traditional measures of capabilities (namely, gross domestic product (GDP) and Composite Index of National Capability (CINC) scores) fail to deduct state liabilities, including production, welfare, and security costs. Beckley's proposed solution is to replace measures of aggregate capabilities with a measure of GDP multiplied by GDP per capita (GDP × GDPPC), thus penalizing countries with large populations. The variable has strong theoretical appeal and case-level support. For quantitative international relations scholars, however, the most convincing argument for adopting this variable comes from its model fit improvements in the "majority of studies published in leading journals over the past five years," as noted in the article's summary. I raise two issues with such a claim.

First, the variable that Beckley proposes does not appear to improve model fits in the majority of replicated studies. He soundly reasons that if the replacement of GDP or CINC scores with his GDP × GDPPC variable improves model fit, then the proposed variable better explains the outcome of interest. As evidence, Beckley compares Akaike information criterion (AIC) scores and tallies improvements for GDP versus GDP × GDPPC separately from those for CINC versus GDP × GDPPC in a given study's replication.2 This approach to model selection is relatively unorthodox, however. A sounder approach consists of a simultaneous comparison among all three models. Unless the proposed variable outperforms both of the existing variables, at least one of the existing variables suffices.

Beckley rightly points out that models specified with GDP × GDPPC exhibit lower AICs than models specified with CINC scores in seventeen of the twenty-four studies that he examines, and lower AICs than models specified with GDP in eleven of the twenty-four. In only ten of the studies, however, does this measure exhibit an AIC lower than both the GDP and the CINC specifications. Furthermore, these differences must meet some threshold in order to support a conclusion of significant fit improvement, typically a difference of three or four.3 When subjected to a more traditional model selection procedure (an AIC difference of at least three and the simultaneous outperformance of both GDP and CINC scores), the measure yields superior model fits in six of the twenty-four replicated studies. The sample size of twenty-four studies inhibits generalizations, but this reanalysis provides a corrective to the claim that GDP × GDPPC improves fit in the majority of replicated studies.
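
The selection rule described here (the proposed measure must beat both existing measures, and by an AIC margin of at least three) can be sketched in a few lines of Python. This is only an illustration: the class name `StudyAIC`, the function `proposed_wins`, and every AIC value below are hypothetical and are not drawn from the replicated studies.

```python
from dataclasses import dataclass


@dataclass
class StudyAIC:
    """AIC scores for one replicated study under three capability measures."""
    gdp: float
    cinc: float
    gdp_x_gdppc: float


def proposed_wins(study: StudyAIC, threshold: float = 3.0) -> bool:
    """True if GDP x GDPPC beats BOTH rivals by at least `threshold` AIC points."""
    # Lower AIC is better, so the rival to beat is the smaller of the two.
    best_rival = min(study.gdp, study.cinc)
    return best_rival - study.gdp_x_gdppc >= threshold


# Hypothetical AIC values, invented purely for illustration.
studies = [
    StudyAIC(gdp=1012.4, cinc=1020.1, gdp_x_gdppc=1005.0),  # beats both by more than 3
    StudyAIC(gdp=998.0, cinc=1010.5, gdp_x_gdppc=1003.2),   # beats CINC only
    StudyAIC(gdp=1500.0, cinc=1499.0, gdp_x_gdppc=1497.5),  # beats both, but margin under 3
]

wins = sum(proposed_wins(s) for s in studies)
print(f"{wins} of {len(studies)} hypothetical studies favor GDP x GDPPC")
```

Comparing against the better of the two rivals in a single step is what distinguishes this rule from tallying the GDP and CINC comparisons separately.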

Second, Beckley's variable introduces potential inferential complications. His proposed measure is equivalent to GDP-squared divided by population. This intuition implies a theory that is quadratic in GDP with the desire to control for population. A more traditional model for such a theory would specify population separately from GDP. Because GDP enters as a squared term, the first-order GDP term should also be specified. These steps would preserve Beckley's intuition but avoid violations of model hierarchy; furthermore, this specification helps isolate the explanatory work done by population versus GDP.4
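
The equivalence and the two competing specifications can be written out as follows. The outcome $y_i$, the coefficients, the error terms, and the linear functional form are assumed notation for illustration; the replicated studies use a variety of estimators.

```latex
% Equivalence: GDPPC is GDP divided by population.
\[
\text{GDP} \times \text{GDPPC}
  = \text{GDP} \times \frac{\text{GDP}}{\text{Population}}
  = \frac{\text{GDP}^2}{\text{Population}}
\]

% Specification with the single composite term.
\[
y_i = \beta_0 + \beta_1 \frac{\text{GDP}_i^2}{\text{Population}_i} + \varepsilon_i
\]

% Hierarchical alternative: population entered separately,
% with the first-order GDP term alongside its square.
\[
y_i = \alpha_0 + \alpha_1 \text{Population}_i + \alpha_2 \text{GDP}_i
      + \alpha_3 \text{GDP}_i^2 + u_i
\]
```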

As evidence, this correspondence replicated each study that utilized a linear capabilities term according to Beckley's approach (i.e., the replacement of the traditional variable with GDP × GDPPC).5 The models were then respecified with (1) the more traditional battery of population + GDP + GDP-squared, and (2) GDP × GDPPC alongside the variable's square root (i.e., to approximate and control for main effects). Both specifications yield significant fit improvements over the proposed variable in three of the five studies.6 These improvements are noteworthy, because AIC penalizes models with additional parameters. If one's theory is quadratic in GDP with the desire to control for population, these results suggest that a sounder specification consists of population + GDP + GDP-squared.
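
A minimal sketch of this kind of respecification, assuming ordinary least squares and entirely synthetic data, might look like the following. The helper `ols_aic`, the variable names, and the data-generating process are hypothetical; the replicated studies use their own data and estimators, so the numbers printed here say nothing about the actual results.

```python
import numpy as np


def ols_aic(y, *columns):
    """Return (AIC, RSS) for an OLS fit of y on an intercept plus the given columns.

    Uses the Gaussian form AIC = n * ln(RSS / n) + 2k, dropping constants that are
    shared by all models fit to the same y. k counts coefficients incl. intercept.
    """
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, k = X.shape
    return float(n * np.log(rss / n) + 2 * k), rss


# Synthetic data, invented purely for illustration (arbitrary units).
rng = np.random.default_rng(0)
n_obs = 200
gdp = rng.lognormal(mean=0.0, sigma=0.5, size=n_obs)
pop = rng.lognormal(mean=0.0, sigma=0.5, size=n_obs)
power = gdp**2 / pop  # GDP x GDPPC, i.e. GDP squared over population
y = 0.5 * gdp + 0.3 * gdp**2 - 0.4 * pop + rng.normal(0.0, 1.0, size=n_obs)

# Baseline: the composite term alone.
aic_base, rss_base = ols_aic(y, power)
# (1) population + GDP + GDP-squared.
aic_hier, rss_hier = ols_aic(y, pop, gdp, gdp**2)
# (2) the composite term alongside its square root.
aic_sqrt, rss_sqrt = ols_aic(y, power, np.sqrt(power))

for label, aic in [("GDP x GDPPC", aic_base),
                   ("pop + GDP + GDP^2", aic_hier),
                   ("GDP x GDPPC + sqrt", aic_sqrt)]:
    print(f"{label:>20}: AIC = {aic:.1f}")
```

Because each added regressor raises the `2k` penalty, a richer specification earns a lower AIC only if it reduces the residual sum of squares by enough to offset that penalty.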

Quoting Joseph Nye, Beckley points out that power is like love, "easier to experience than to define or measure" (p. 8). This correspondence echoes Beckley's theoretical critique of GDP and CINC scores. His article's compelling case studies highlight the measure's utility as a...