21. Handling Missing Data
- Russell Sage Foundation
21 HANDLING MISSING DATA

THERESE D. PIGOTT
Loyola University, Chicago

CONTENTS

21.1 Introduction
21.2 Types of Missing Data
    21.2.1 Missing Studies
    21.2.2 Missing Effect Sizes
    21.2.3 Missing Descriptor Variables
21.3 Reasons for Missing Data in Meta-Analysis
    21.3.1 Missing Completely at Random
    21.3.2 Missing at Random
    21.3.3 Not Missing at Random
21.4 Commonly Used Methods for Missing Data in Meta-Analysis
    21.4.1 Complete Case Analysis
        21.4.1.1 Complete Case Analysis Example
    21.4.2 Available Case Analysis
        21.4.2.1 Available Case Analysis Example
    21.4.3 Single-Value Imputation with the Complete Case Mean
        21.4.3.1 Imputing the Complete Case Mean
            21.4.3.1.1 Complete Case Mean Imputation Example
        21.4.3.2 Single-Value Imputation with Conditional Means
            21.4.3.2.1 Regression Imputation Example
    21.4.4 Missing Effect Sizes
    21.4.5 Summary of Simple Methods for Missing Data
21.5 Model-Based Methods for Missing Data
    21.5.1 Maximum Likelihood Methods Using the EM Algorithm
        21.5.1.1 Maximum Likelihood Methods
        21.5.1.2 Example of Maximum Likelihood Methods
        21.5.1.3 Other Applications of Maximum Likelihood Methods

21.1 INTRODUCTION

This chapter discusses what researchers can do when studies are missing the information needed for meta-analysis. Despite careful evaluation of coding decisions, researchers will find that studies in a research synthesis invariably differ in the types and quality of the information reported. Here I examine the types of missing data that occur in a research synthesis, and discuss strategies synthesists can use when faced with missing data. Problems caused by missing data can never be entirely alleviated. As such, the first strategy for addressing missing data should be to contact the study authors. Doing so remains a viable strategy in some cases.
When it fails, the next strategy is to use statistical missing data methods to check the sensitivity of results to different assumptions about the distribution of the hypothetically complete data and the reasons for the missing data. We can think of these methods as a trade-off: if observations are missing in the data, then the cost is the need to make assumptions about the data and the mechanism that causes the missing data, assumptions that are difficult to verify in practice. Synthesists can feel more confident about results that are robust to different assumptions about the missing data. On the other hand, results that differ depending on the missing data assumptions may indicate that the sample of studies does not provide enough evidence for robust inference.

Joseph Schafer and John Graham pointed out that the main goal of statistical methods for missing data is not to recover or estimate the missing values but to make valid inferences about a population of interest (2002). They thus noted that the missing data method is embedded in the particular model or testing procedure the analyst is using. My goal in this chapter is to introduce methods for estimating effect size models when either effect sizes or predictors in the model are missing from primary studies.

21.2 TYPES OF MISSING DATA

Synthesists encounter missing data in three major areas: missing studies, missing effect sizes, and missing study descriptor variables. Although the reasons for missing observations in any of these three areas vary, each type of missing data presents difficulties for the synthesist.

21.2.1 Missing Studies

A number of mechanisms lead to studies being missing from a research synthesis. Researchers in both medicine and the social sciences have documented the bias in published literature toward statistically significant results (for example, Rosenthal 1979; Hemminki 1980; Smith 1980; Begg and Berlin 1988).
In this case, studies that find statistically nonsignificant results are less likely to appear in the published literature (see chapters 6 and 23, this volume). Another reason studies may be missing in a synthesis is lack of accessibility; some studies are unpublished reports that are not identifiable through commonly used search engines or are not easily accessed by synthesists. For example, Matthias Egger and George Davey (1997) and Peter Jüni et al. (2002) demonstrated that studies published in languages other than English may...