A generic all-purpose transformation for multivariate modeling through copulas

Manoj Bahuguna¹, Ravindra Khattree²
¹ Department of Mathematics and Statistics, Oakland University, Rochester, USA
² Department of Mathematics and Statistics, and Center for Data Science and Big Data Analytics, Oakland University, Rochester, USA

Abstract

Copulas have been used in various applications in the biomedical sciences and finance. We propose copulas as generic, all-purpose transformations that enable one to apply standard multivariate procedures more efficiently and with better statistical properties and results. More specifically, we consider the problem of transforming any continuous data to multivariate normality, using copulas as the device that defines the transformation. Such a transformation lets us model a variety of problems involving non-normal data with classical multivariate statistical techniques. We evaluate and illustrate various applications, including regression, multicollinearity, principal component analysis, factor analysis, partial least squares modeling, and structural equation modeling, where analyses using the appropriate copula transformations yield substantial improvements in implementation, interpretation, and prediction, as well as in the corresponding models. Many datasets available in the literature are analyzed, and these amply demonstrate the power of the approach.
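
The abstract does not spell out the specific construction, so the following is only an illustrative sketch of one common way to transform continuous data toward normality in the spirit of a Gaussian copula: each variable is mapped through its empirical distribution function and then through the standard normal quantile function. The function names `normal_scores` and `copula_transform`, the `rank / (n + 1)` scaling, and the simulated data are assumptions made for illustration and are not taken from the paper. Note also that this step makes each margin approximately normal; joint normality additionally requires that the dependence structure be well described by a Gaussian copula.

```python
import random
from statistics import NormalDist


def normal_scores(column):
    """Map one continuous variable to approximate standard-normal scores.

    Step 1: pseudo-observations u_i = rank_i / (n + 1), an empirical-CDF
            estimate kept strictly inside (0, 1).
    Step 2: z_i = Phi^{-1}(u_i), the standard normal quantile of u_i.
    Ties get no special treatment (continuous data is assumed).
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def copula_transform(rows):
    """Apply the normal-scores map to every column of a row-wise dataset."""
    columns = list(zip(*rows))
    transformed = [normal_scores(list(col)) for col in columns]
    return [list(row) for row in zip(*transformed)]


# Hypothetical right-skewed bivariate data, used only to exercise the sketch.
random.seed(0)
data = []
for _ in range(200):
    x = random.expovariate(1.0)
    y = x ** 2 + random.expovariate(2.0)
    data.append([x, y])

z = copula_transform(data)  # same shape as data, margins roughly N(0, 1)
```

The transformed rows in `z` could then be passed to a classical procedure such as regression or principal component analysis. Because the map is monotone within each variable, the rank ordering of the original observations is preserved.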
