Bodnar, Taras and Dette, Holger and Parolya, Nestor (2019): Testing for independence of large dimensional vectors. Published in: The Annals of Statistics, Vol. 47, No. 5 (3 August 2019): pp. 2977–3008.

Abstract
In this paper, new tests for the independence of two high-dimensional vectors are investigated. We consider the case where the dimension of the vectors increases with the sample size and propose multivariate analysis of variance-type statistics for the hypothesis of a block diagonal covariance matrix. The asymptotic properties of the new test statistics are investigated under the null hypothesis and the alternative hypothesis using random matrix theory. For this purpose, we study the weak convergence of linear spectral statistics of central and (conditionally) noncentral Fisher matrices. In particular, a central limit theorem for linear spectral statistics of large dimensional (conditionally) noncentral Fisher matrices is derived which is then used to analyse the power of the tests under the alternative. The theoretical results are illustrated by means of a simulation study where we also compare the new tests with several alternatives, in particular with the commonly used corrected likelihood ratio test. It is demonstrated that the latter test does not keep its nominal level if the dimension of one subvector is relatively small compared to the dimension of the other subvector. On the other hand, the tests proposed in this paper provide a reasonable approximation of the nominal level in such situations. Moreover, we observe that one of the proposed tests is most powerful under a variety of correlation scenarios.
Item Type:  MPRA Paper 

Original Title:  Testing for independence of large dimensional vectors 
Language:  English 
Keywords:  Testing for independence, large dimensional covariance matrix, noncentral Fisher random matrix, linear spectral statistics, asymptotic normality 
Subjects:  C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C12 - Hypothesis Testing: General; C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C18 - Methodological Issues: General 
Item ID:  97997 
Depositing User:  Dr. Nestor Parolya 
Date Deposited:  08 Jan 2020 14:23 
Last Modified:  08 Jan 2020 14:23 
URI:  https://mpra.ub.uni-muenchen.de/id/eprint/97997 