
Reviewed by: Daniel J. Dutton
Statistical Models and Causal Inference: A Dialogue with the Social Sciences by David A. Freedman. Edited by David Collier, Jasjeet S. Sekhon, and Philip B. Stark. New York: Cambridge University Press, 2010, 399 pp.

This volume is a collection of works by the late University of California, Berkeley statistician David A. Freedman, compiled shortly before his death and edited by his colleagues. Professor Freedman worked on problems in applied statistics and was particularly concerned with how statistical methods are used in disciplines such as epidemiology, political science, and economics. His work in this text uses mathematics and statistical theory to undo common misconceptions that have emerged in the social sciences around model design and model diagnostic testing. The topics range from valuable elementary arguments, proved with sophisticated theory, to seemingly dated or esoteric treatments of specific topics that may interest only a minority of researchers. The valuable sections easily outweigh the skippable digressions into less popular topics. Freedman’s recommendations for practice are always needed in our fields, which sometimes gloss over the details behind the assumptions we use in modeling.

Part 1, “Statistical Modeling: Foundations and Limitations,” presents Freedman’s skeptical attitude toward the use of “causal” models in statistics. After the first chapter’s introduction to the difference between Bayesian and frequentist statistics, Freedman lays out his thoughts on various established practices in the social sciences. For example, the practice of imagining a population that does not exist on its own (e.g., treating the homeless population as a process generating homeless people, of which all the homeless people in a city are a sample) is sharply criticized, as are meta-analyses (“Just say no”). Freedman also decries assuming independence of observations without regard to the social context that can make them dependent, and his recommendations for practice are to avoid imputation procedures and to understand the assumptions required by models. He closes with the importance of topic-specific understanding when seeking out causal relationships and the futility of using certain forms of data to answer causal questions.

Part 2, “Studies in Political Science, Public Policy, and Epidemiology,” despite its promising title, is decidedly weak because of its inclusion of strange topics. For example, the discussion of the methods used in the 2000 United States Census seems out of place, though the lessons of that chapter have some applicability to large datasets in general. The discussion of the United States Geological Survey and the prediction of earthquakes is more valuable as a consideration of probability than as a dialogue with the social sciences. Two chapters deal with conventional wisdom concerning the Intersalt study on hypertension and with survival analysis, respectively, and are must-reads for epidemiologists.

Part 3, “New Developments: Progress or Regress?,” gets to the root of Freedman’s argument against the spread of uninformed modeling in the social sciences. The two motifs of Freedman’s work, that model diagnostics are no substitute for topic knowledge and that intellectual wrangling of models is no substitute for better data gathering, are ubiquitous in this section. These motifs appear in discussions of endogeneity in probit models, particularly the inadequacy of Heckman two-step procedures. For epidemiologists, there is an informative discussion of logistic regression, which is shown to be inconsistent when used to analyze randomized experiments and is compared with a more general test.

Part 4, “Shoe Leather Revisited,” concludes by reiterating Freedman’s cautions against letting models supersede the reasoning process that he claims has produced great discoveries throughout modern history (such as Snow’s cholera study). The section also stresses the importance of what Freedman calls “qualitative knowledge.” Such knowledge includes informal reasoning, insight into the possible mechanisms or processes generating the data, and the substantive understanding that comes from expending effort on the question under investigation.

Freedman’s arguments are well taken and, in the opinion of this reviewer, are classic and well-worn critiques of which most practitioners are aware on some level. The added value of this text is its explicit statement, through mathematics and argument, of how wrong things can go when the assumptions behind...
