Alternative transformations to handle extreme values of the dependent variable

JB Burbidge, L Magee, AL Robb - Journal of the American …, 1988 - Taylor & Francis
Journal of the American Statistical Association, 1988, Taylor & Francis
Abstract
Transformations that could be used to reduce the influence of extreme observations of dependent variables, which can assume either sign, on regression coefficient estimates are studied in this article. Two that seem reasonable on a priori grounds, the extended Box-Cox (BC) and the inverse hyperbolic sine (IHS), are evaluated in detail. One feature is that the log-likelihood function for IHS is defined for zero values of the dependent variable, which is not true of BC. The double-length regression technique (Davidson and MacKinnon 1984) is used to perform hypothesis tests of one transformation against the other using Canadian data on household net worth. These tests support the use of IHS instead of BC for this data set. Empirical investigators in economics often work with a logged dependent variable (taking the natural logarithm of a data series is, of course, a special case of BC) to reduce the weight their particular estimation procedure might otherwise attach to extreme values of the dependent variable. Logging dependent and independent variables has the added attraction that slope coefficients may be interpreted as elasticities. In the event that the dependent variable assumes nonpositive values, some researchers (e.g., Diamond and Hausman 1984; King and Dicks-Mireaux 1982) have dropped the nonpositive values and others have added a constant so that each observation is positive. An alternative transformation, which is defined for any real number, is the IHS transformation, $\sinh^{-1}(x) = \log\left(x + (x^2 + 1)^{1/2}\right)$. This was proposed in Johnson (1949) and is just as easy to employ as the BC transformation.
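As a rough illustration of the closing formula, the following Python sketch evaluates the IHS transformation at negative, zero, and positive values and compares it with the natural logarithm. The function name `ihs` and the sample net-worth figures are hypothetical and are not taken from the article. The sketch applies the formula to |x| and restores the sign afterwards (IHS is an odd function); this is an implementation choice made here to avoid floating-point cancellation for large negative inputs, not something the abstract prescribes.

```python
import math


def ihs(x: float) -> float:
    """Inverse hyperbolic sine, sinh^{-1}(x) = log(x + (x^2 + 1)^(1/2)).

    The closed form is evaluated on |x| and the sign is restored afterwards,
    which is mathematically equivalent because sinh^{-1} is odd.
    """
    magnitude = math.log(abs(x) + math.sqrt(x * x + 1.0))
    return math.copysign(magnitude, x)


# Hypothetical net-worth values of either sign, including zero.
sample = [-50000.0, -1.0, 0.0, 1.0, 50000.0]

for value in sample:
    # The natural log is only defined for strictly positive values.
    log_value = math.log(value) if value > 0 else float("nan")
    print(f"x={value:>10.1f}  ihs={ihs(value):>9.4f}  log={log_value:>9.4f}")
```

For large positive x the transformation behaves like log(2x), so it compresses extreme values much as a log would while remaining defined at zero and for negative values.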