Appendix F
Methodology and Research Design

This appendix provides a review of the methodology and research design used in this book. It describes the statistics employed in this analysis and raises concerns pertinent to the application of these techniques. Citations of statistical sources are included for reference and for further explanation.

OVERVIEW OF STATISTICS EMPLOYED

Correlation measures the degree to which two variables are linearly related. Unlike the covariance, the correlation coefficient neutralizes any differences in scale (e.g., X might be measured in dollars and Y in percent) that might exist between the two variables. The correlation coefficient always varies between −1.0 and +1.0, with the extremes representing perfectly linear relationships. If X and Y are independent, then the correlation is zero.

Multiple regression is a type of linear model that enables us to measure the effects of several independent (i.e., exogenous) variables upon a dependent variable. By estimating the parameters of the linear model, this technique provides us with four important pieces of information: 1) the amount of unit change in Y (the dependent variable) attributable to each of the independent variables (e.g., the beta parameters or the amount of variance in Y explained by each of the independent variables); 2) the partial correlation coefficients; 3) the amount of variance in Y explained by the combined effects of the independent variables (the multiple correlation coefficient); and 4) the statistical significance of the model. The equation for multiple regression with two independent variables, X and Z, may be expressed as

Y = a + b_X X + b_Z Z + e,

where a is the intercept (as if plotting a line), b_X is the beta coefficient of X (the slope of the line of X on Y), b_Z is the beta coefficient of Z (the slope of the line of Z on Y), and e is the random error term. When we standardize the model, we set the intercept a (alpha) at zero (Wonnacott and Wonnacott 1977, pp. 359–365).

Generally, regression equations require continuous or interval-level variables in order to measure how unit change in the independent variables affects unit change in the dependent variable. Thus, party and New Deal realignment are converted to 0/1 variables, and indices and scores are calculated to gauge biographical status and ideology. If one has ordinal or nominal data, then three other techniques are more appropriate: 1) analysis of variance techniques, such as Multiple Classification Analysis (MCA); 2) contingency table analysis techniques, such as chi-squared or Leo Goodman's ECTA statistics; and 3) logistic regression techniques.

Political party is a dichotomous variable that generally should not be used as a dependent variable in a regression equation. One option would be to use the partisanship score, which is a continuous-level variable; however, it is problematic as a subsequent predictor of the full employment score because both are composed of roll call votes. The more statistically sound approach is to use LOGIT or PROBIT, which are regression techniques designed to handle dichotomous dependent variables.
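The choice between OLS and LOGIT for a 0/1 dependent variable can be made concrete in modern statistical software. The following Python sketch fits both models to a dichotomous party indicator; the data are simulated and the variable names (status, ideology) are hypothetical stand-ins for the book's actual measures, not its roll call data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical data: a 0/1 party indicator as the dependent variable
# and two continuous predictors standing in for a biographical-status
# index and an ideology score.
n = 430  # roughly the size of a House membership
status = rng.normal(0, 1, n)
ideology = rng.normal(0, 1, n)

# Simulated outcome with a fairly even 0/1 split, mirroring the
# balanced party distribution discussed in the text.
p = 1 / (1 + np.exp(-(0.2 + 1.0 * ideology + 0.3 * status)))
party = rng.binomial(1, p)

X = sm.add_constant(np.column_stack([status, ideology]))

# OLS treats the 0/1 outcome as a linear probability model.
ols_fit = sm.OLS(party, X).fit()

# LOGIT models the log-odds of the same outcome.
logit_fit = sm.Logit(party, X).fit(disp=False)

print(ols_fit.params)    # intercept, b_status, b_ideology (probability scale)
print(logit_fit.params)  # same structure, on the log-odds scale
```

When the 0/1 outcome is fairly evenly split, the two fits imply similar effects near the center of the data, which is consistent with the similarity between the LOGIT and OLS results reported below.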
Using LOGIT or PROBIT nonetheless poses problems for the overall set of structural equations, because the equations must be parallel; that is, they must use identical methods of calculation. I have run the regression equations predicting political party using LOGIT and using ordinary least squares (OLS). The results are quite similar, due largely to the fact that the distribution of party is not skewed; in other words, there is a fairly even balance of Democrats and Republicans in the 79th Congress. Thus, I am presenting the OLS results for the sake of consistency with the overall model.

As is apparent from the equations, regression techniques assume an additive model, and thus the exogenous variables must be independent of each other. Given the nature of social science data and its accompanying problems of measurement, it is sometimes difficult to have purely independent exogenous variables. Multicollinearity, which exists when some of the exogenous variables are correlated with each other, is not uncommon. Multicollinearity may occur in instances where the exogenous variables are not truly causally related but are correlated because of definitional and measurement problems. Or the model may be misspecified, such as by omitting variables. In resolving the problem of multicollinearity, one should bear in mind the following caveat: "If minor changes in the model...
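A common first check for multicollinearity is to inspect the pairwise correlations among the exogenous variables; variance inflation factors (VIFs) extend this to joint linear dependence among several regressors. The following Python sketch illustrates both checks on simulated data; the variable names (status, seniority, ideology) are hypothetical and the deliberate overlap between two of them is built in for demonstration, not taken from the book's data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)

# Hypothetical exogenous variables; 'status' and 'seniority' are
# constructed to be correlated, mimicking the definitional and
# measurement overlap described in the text.
n = 430
status = rng.normal(0, 1, n)
seniority = 0.8 * status + 0.6 * rng.normal(0, 1, n)
ideology = rng.normal(0, 1, n)

X = pd.DataFrame({"status": status,
                  "seniority": seniority,
                  "ideology": ideology})

# Pairwise correlations flag simple two-variable overlap.
print(X.corr().round(2))

# VIFs flag joint linear dependence; by a common rule of thumb,
# values well above 10 are a warning sign (conventions vary).
exog = sm.add_constant(X)
for i, name in enumerate(exog.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(exog.values, i), 2))
```

Neither diagnostic resolves multicollinearity by itself; as the caveat quoted above suggests, the remedy depends on whether the overlap reflects measurement problems or a misspecified model.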