Univariate and multivariate skewness and kurtosis for measuring nonnormality: Prevalence, influence and estimation
Abstract
Nonnormality of univariate data has been examined extensively (Blanca et al., Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9(2), 78–84, 2013; Micceri, Psychological Bulletin, 105(1), 156, 1989). Less is known about the potential nonnormality of multivariate data, although multivariate analysis is commonly used in psychological and educational research. Using univariate and multivariate skewness and kurtosis as measures of nonnormality, this study examined 1,567 univariate distributions and 254 multivariate distributions collected from authors of articles published in Psychological Science and the American Educational Research Journal. We found that 74 % of univariate distributions and 68 % of multivariate distributions deviated from normal distributions. In a simulation study using typical values of skewness and kurtosis that we collected, we found that, under some conditions, the resulting Type I error rates were 17 % in a t-test and 30 % in a factor analysis. Hence, we argue that it is time to routinely report skewness and kurtosis along with other summary statistics such as means and variances. To facilitate future reporting of skewness and kurtosis, we provide a tutorial on how to compute univariate and multivariate skewness and kurtosis with SAS, SPSS, R, and a newly developed Web application.
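As a rough illustration of the quantities named in the abstract, the following Python sketch computes moment-based univariate skewness and excess kurtosis, and Mardia's (1970) multivariate skewness and kurtosis. It is not the authors' tutorial code, which targets SAS, SPSS, R, and a Web application. The function names `univariate_skew_kurt` and `mardia` are hypothetical, and the sketch assumes complete data and the simple 1/n moment estimators, so its values may differ from the bias-adjusted estimates that some packages report by default (see Joanes & Gill, 1998).

```python
import numpy as np


def univariate_skew_kurt(x):
    """Moment-based sample skewness and excess kurtosis (1/n estimators).

    Hypothetical helper for illustration; assumes no missing values.
    """
    x = np.asarray(x, dtype=float)
    d = x - x.mean()              # deviations from the sample mean
    m2 = np.mean(d ** 2)          # second central moment
    m3 = np.mean(d ** 3)          # third central moment
    m4 = np.mean(d ** 4)          # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0   # 0 for a normal distribution
    return skewness, excess_kurtosis


def mardia(X):
    """Mardia's multivariate skewness (b1p) and kurtosis (b2p).

    X is an (n, p) array with rows as observations. Uses the 1/n
    covariance matrix, which is assumed to be nonsingular.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    D = X - X.mean(axis=0)                 # centered data
    S = D.T @ D / n                        # 1/n covariance matrix
    # G[i, j] = (x_i - xbar)' S^{-1} (x_j - xbar)
    G = D @ np.linalg.solve(S, D.T)
    b1p = np.sum(G ** 3) / n ** 2          # multivariate skewness
    b2p = np.mean(np.diag(G) ** 2)         # multivariate kurtosis
    return b1p, b2p
```

With a single variable (p = 1), `mardia` reduces to the squared univariate skewness and to the excess kurtosis plus 3, which gives a quick consistency check between the two functions. Under multivariate normality, b2p is expected to be close to p(p + 2) in large samples.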
References
Anscombe, F.J., & Glynn, W.J. (1983). Distribution of the Kurtosis Statistic b2 for Normal Samples. Biometrika, 70(1), 227. doi:10.2307/2335960.
Becker, M., & Klößner, S. (2016). PearsonDS: Pearson Distribution System. Retrieved from https://CRAN.R-project.org/package=PearsonDS (R package version 0.98).
Blanca, M.J., Arnau, J., López-Montiel, D., Bono, R., & Bendayan, R. (2013). Skewness and kurtosis in real data samples. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9(2), 78–84. doi:10.1027/1614-2241/a000057.
Bliss, C. (1967). Statistical Tests of Skewness and Kurtosis. In Statistics in Biology: Statistical Methods for Research in the Natural Sciences (Vol. 1, pp. 140–146). New York: McGraw-Hill Book Company.
Box, G.E., & Cox, D.R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211–252. Retrieved 2016-07-26, from http://www.jstor.org/stable/2984418.
Corder, G.W., & Foreman, D.I. (2014). Nonparametric statistics: A step-by-step approach. Wiley.
D’Agostino, R.B. (1970). Transformation to Normality of the Null Distribution of g1. Biometrika, 57(3), 679. doi:10.2307/2334794.
DeCarlo, L. (1997a). Mardia’s multivariate skew (b1p) and multivariate kurtosis (b2p). Retrieved from http://www.columbia.edu/ld208/Mardia.sps.
DeCarlo, L. (1997b). On the meaning and use of kurtosis. Psychological Methods, 2(3), 292. doi:10.1037/1082-989X.2.3.292.
Huber, P.J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (Vol. 1, pp. 221–233).
Joanes, D., & Gill, C. (1998). Comparing measures of sample skewness and kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician), 47(1), 183–189. doi:10.1111/1467-9884.00122.
Komsta, L., & Novomestky, F. (2015). moments: Moments, cumulants, skewness, kurtosis and related tests. Retrieved from https://CRAN.R-project.org/package=moments (R package version 0.14).
Levine, D.W., & Dunlap, W.P. (1982). Power of the F test with skewed data: Should one transform or not? Psychological Bulletin, 92(1), 272. doi:10.1037/0033-2909.92.1.272.
Mardia, K.V. (1970). Measures of Multivariate Skewness and Kurtosis with Applications. Biometrika, 57(3), 519. doi:10.2307/2334770.
Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., & Leisch, F. (2015). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. Retrieved from https://CRAN.R-project.org/package=e1071 (R package version 1.6-7).
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105(1), 156. doi:10.1037/0033-2909.105.1.156.
Palmer, E.M., Horowitz, T.S., Torralba, A., & Wolfe, J.M. (2011). What are the shapes of response time distributions in visual search? Journal of Experimental Psychology: Human Perception and Performance, 37(1), 58–71. doi:10.1037/a0020747.
R Core Team (2016). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36. http://www.jstatsoft.org/v48/i02/.
Sakia, R.M. (1992). The Box-Cox Transformation Technique: A Review. The Statistician, 41(2), 169. doi:10.2307/2348250.
Satorra, A., & Bentler, P. (1988). Scaling corrections for statistics in covariance structure analysis (UCLA Statistics Series 2). Los Angeles: University of California at Los Angeles, Department of Psychology.
Scheffe, H. (1959). The analysis of variance. New York: Wiley.
Shohat, J. (1929). Inequalities for Moments of Frequency Functions and for Various Statistical Constants. Biometrika, 21(1/4), 361. doi:10.2307/2332566.
Tabachnick, B.G., & Fidell, L.S. (2012). Using Multivariate Statistics (6th ed.). Pearson.
Wang, L., Zhang, Z., McArdle, J.J., & Salthouse, T.A. (2008). Investigating Ceiling Effects in Longitudinal Data Analysis. Multivariate Behavioral Research, 43(3), 476–496. doi:10.1080/00273170802285941.
Yanagihara, H., & Yuan, K.H. (2005). Four improved statistics for contrasting means by correcting skewness and kurtosis. British Journal of Mathematical and Statistical Psychology, 58(2), 209–237. doi:10.1348/000711005X64060.
Yuan, K.H., & Bentler, P.M. (1998). Normal theory based test statistics in structural equation modelling. British Journal of Mathematical and Statistical Psychology, 51(2), 289–309.
Yuan, K.H., Bentler, P.M., & Zhang, W. (2005). The Effect of Skewness and Kurtosis on Mean and Covariance Structure Analysis: The Univariate Case and Its Multivariate Implication. Sociological Methods & Research, 34(2), 240–258. doi:10.1177/0049124105280200.
Yuan, K.H., & Zhang, Z. (2012). Robust structural equation modeling with missing data and auxiliary variables. Psychometrika, 77(4), 803–826. doi:10.1007/s11336-012-9282-4.
Zhang, Z., & Yuan, K.H. (2012). WebSEM: Conducting structural equation modelling online. Notre Dame, IN. Retrieved from https://websem.psychstat.org.