Block-diagonal test for high-dimensional covariance matrices

TEST - Volume 32 - Pages 447-466 - 2022
Jiayu Lai1, Xiaoyi Wang2, Kaige Zhao1, Shurong Zheng1
1School of Mathematics and Statistics and KLAS, Northeast Normal University, Changchun, China
2Center for Statistics and Data Science, Beijing Normal University, Zhuhai, China

Abstract

Testing the structure of a high-dimensional covariance matrix plays an important role in financial stock analysis, genetic sequence analysis, and many other fields. This paper focuses on testing whether a covariance matrix is block-diagonal in the high-dimensional setting. Several existing test procedures address this problem, but they rely on a normality assumption, on the assumption of exactly two diagonal blocks, or on assumptions about the dimensions of the sub-blocks. To relax these assumptions, we develop a testing framework based on U-statistics and establish the asymptotic distributions of the U-statistics under the null hypothesis and under local alternatives. In addition, we develop a test procedure for alternatives with different levels of sparsity. Finally, a simulation study and a real data analysis demonstrate the performance of the proposed methods.
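To make the U-statistic idea concrete, the following is a minimal sketch and not the statistic defined in the paper, whose exact form the abstract does not give. It assumes mean-zero observations and a known block partition, and it estimates the sum of squared covariances outside the diagonal blocks, a quantity that is zero exactly when the covariance matrix is block-diagonal. The names `offdiag_sq_sum`, `off_block_ustat`, and `blocks` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def offdiag_sq_sum(G):
    """Sum of squared off-diagonal entries of a square matrix G."""
    return float(np.sum(G ** 2) - np.sum(np.diag(G) ** 2))


def off_block_ustat(X, blocks):
    """U-statistic for the sum of squared off-block covariances.

    Assumption (not from the paper): the rows of X are i.i.d. with mean zero.
    For i != j, E[(x_i' x_j)^2] = tr(Sigma^2), and the same identity holds
    within each block, so averaging the difference over all ordered pairs
    i != j is unbiased for ||Sigma||_F^2 - sum_k ||Sigma_kk||_F^2.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Squared inner products over all coordinates, pairs i != j only.
    total = offdiag_sq_sum(X @ X.T)
    # Squared inner products restricted to each diagonal block.
    within = sum(offdiag_sq_sum(X[:, idx] @ X[:, idx].T) for idx in blocks)
    return (total - within) / (n * (n - 1))


# Illustrative data: independent standard normal coordinates, so any block
# partition satisfies the block-diagonal null hypothesis.
rng = np.random.default_rng(0)
blocks = [np.arange(0, 3), np.arange(3, 8)]
X_null = rng.standard_normal((200, 8))
print(off_block_ustat(X_null, blocks))
```

Under the null hypothesis the printed value should fluctuate around zero. Turning it into a test would require a variance estimate and a limiting distribution, which is what the paper's asymptotic theory supplies and which this sketch does not attempt.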

Keywords

#covariance matrix #block-diagonal test #U-statistics #hypothesis #high dimension
