Regression with Missing X's: A Review

RJA Little - Journal of the American Statistical Association, 1992 - Taylor & Francis
Abstract
The literature of regression analysis with missing values of the independent variables is reviewed. Six classes of procedures are distinguished: complete case analysis, available case methods, least squares on imputed data, maximum likelihood, Bayesian methods, and multiple imputation. Methods are compared and illustrated when missing data are confined to one independent variable, and extensions to more general patterns are indicated. Attention is paid to the performance of methods when the missing data are not missing completely at random. Least squares methods that fill in missing X's using only data on the X's are contrasted with likelihood-based methods that use data on the X's and Y. The latter approach is preferred and provides methods for elaboration of the basic normal linear regression model. It is suggested that more widely distributed software is needed that advances beyond complete-case analysis, available-case analysis, and naive imputation methods. Bayesian simulation methods and multiple imputation are reviewed; these provide fruitful avenues for future research.
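The contrast between complete-case analysis and least squares on imputed data can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating model, the coefficient values, the missingness rate, and all function names are assumptions chosen for illustration. It covers the case the abstract highlights, where missing values are confined to one independent variable (here x1) and are missing completely at random, and it fills in x1 from a regression on x2 alone, that is, using only data on the X's.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on X (X already holds an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate(n=2000, miss_rate=0.3, seed=0):
    """Hypothetical data: y = 1 + 2*x1 + 0.5*x2 + noise, with x1 missing completely at random."""
    rng = np.random.default_rng(seed)
    x2 = rng.normal(size=n)
    x1 = 0.6 * x2 + rng.normal(size=n)
    y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)
    x1_obs = x1.copy()
    x1_obs[rng.random(n) < miss_rate] = np.nan  # missingness independent of the data
    return x1_obs, x2, y

def complete_case(x1, x2, y):
    """Complete-case analysis: drop every row where x1 is missing, then fit OLS."""
    keep = ~np.isnan(x1)
    X = np.column_stack([np.ones(keep.sum()), x1[keep], x2[keep]])
    return ols(X, y[keep])

def impute_from_x(x1, x2, y):
    """Least squares on imputed data: fill missing x1 with its fitted value from
    a regression of x1 on x2 (estimated on complete cases), then fit OLS on all rows."""
    keep = ~np.isnan(x1)
    Z = np.column_stack([np.ones(len(x2)), x2])
    gamma = ols(Z[keep], x1[keep])
    x1_filled = np.where(keep, x1, Z @ gamma)
    X = np.column_stack([np.ones(len(y)), x1_filled, x2])
    return ols(X, y)

x1, x2, y = simulate()
beta_cc = complete_case(x1, x2, y)
beta_imp = impute_from_x(x1, x2, y)
print("complete case:      ", np.round(beta_cc, 3))
print("imputed from x2 only:", np.round(beta_imp, 3))
```

Both fits return coefficients in the order intercept, x1, x2. The sketch compares point estimates only. It does not address standard errors, and it does not implement the likelihood-based, Bayesian, or multiple imputation methods the review covers, which also use the information in Y.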