Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models

TF Jaeger - Journal of Memory and Language, 2008 - Elsevier
This paper identifies several serious problems with the widespread use of ANOVAs for the analysis of categorical outcome variables such as forced-choice variables, question-answer accuracy, choice in production (e.g. in syntactic priming research), et cetera. I show that even after applying the arcsine-square-root transformation to proportional data, ANOVA can yield spurious results. I discuss conceptual issues underlying these problems and alternatives provided by modern statistics. Specifically, I introduce ordinary logit models (i.e. logistic regression), which are well suited to analyzing categorical data and offer many advantages over ANOVA. Unfortunately, ordinary logit models do not include random effect modeling. To address this issue, I describe mixed logit models (Generalized Linear Mixed Models for binomially distributed outcomes, Breslow and Clayton [Breslow, N. E. & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 88(421), 9–25]), which combine the advantages of ordinary logit models with the ability to account for random subject and item effects in one step of analysis. Throughout the paper, I use a psycholinguistic data set to compare the different statistical methods.
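
As a rough illustration of the two scales the abstract contrasts (a sketch for orientation, not code or data from the paper), the snippet below computes the arcsine-square-root transform and the logit (log-odds) for a few proportions. The function names and the example proportions are assumptions chosen here. The arcsine-square-root value stays bounded between 0 and π/2, whereas the logit is unbounded, which is the scale on which logistic regression is linear.

```python
import math


def arcsine_sqrt(p):
    """Arcsine-square-root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def logit(p):
    """Log-odds of a proportion p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical proportions, chosen only to show how the scales differ
# as values approach the upper bound of 1.
for p in (0.5, 0.9, 0.99):
    print(f"p={p}: arcsine-sqrt={arcsine_sqrt(p):.3f}, logit={logit(p):.3f}")
```

Near the boundaries, equal steps in proportion correspond to increasingly large steps in log-odds, while the arcsine-square-root scale remains capped.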