Efficient Estimation of Approximate Factor Models

Bai, Jushan and Liao, Yuan (2012): Efficient Estimation of Approximate Factor Models.

There is a more recent version of this item available.


Abstract

We study the estimation of a high-dimensional approximate factor model in the presence of both cross-sectional dependence and heteroskedasticity. The classical method of principal components analysis (PCA) does not efficiently estimate the factor loadings or common factors because it essentially treats the idiosyncratic errors as homoskedastic and cross-sectionally uncorrelated. Efficient estimation therefore requires estimating a large error covariance matrix. We assume the model to be conditionally sparse and propose two approaches to estimating the common factors and factor loadings; both are based on maximizing a Gaussian quasi-likelihood and involve regularizing a large sparse covariance matrix. In the first approach the factor loadings and the error covariance are estimated separately, while in the second approach they are estimated jointly. We carry out an extensive asymptotic analysis and, in particular, develop the inferential theory for the two-step estimation. Because the proposed approaches take the large error covariance matrix into account, they produce more efficient estimators than the classical PCA methods or methods based on a strict factor model.
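The contrast the abstract draws between PCA and covariance-weighted estimation can be illustrated with a minimal sketch. This is not the authors' estimator: it replaces the quasi-likelihood maximization with a simple plug-in recipe (PCA, then soft thresholding of the residual covariance with a constant threshold, then a GLS-type update of the factors). The function names, the threshold `tau`, and the normalization F'F/T = I are assumptions made for illustration only, and the thresholded matrix is not guaranteed to be positive definite for every `tau`.

```python
import numpy as np

def pca_factors(X, r):
    """PCA estimates for a T x N panel X with r factors.

    Returns F (T x r) normalized so that F'F/T = I, and loadings L (N x r).
    """
    T = X.shape[0]
    # eigh returns eigenvalues in ascending order, so reverse the columns
    _, eigvec = np.linalg.eigh(X @ X.T)
    F = np.sqrt(T) * eigvec[:, ::-1][:, :r]
    L = X.T @ F / T
    return F, L

def soft_threshold_cov(U, tau):
    """Soft-threshold the off-diagonal entries of the residual covariance.

    U is the T x N residual matrix; tau is a hypothetical tuning constant.
    The diagonal (the heteroskedastic variances) is left untouched.
    """
    T = U.shape[0]
    S = U.T @ U / T
    S_thr = np.sign(S) * np.maximum(np.abs(S) - tau, 0.0)
    np.fill_diagonal(S_thr, np.diag(S))
    return S_thr

def gls_factors(X, L, Sigma_u):
    """GLS-type factor estimates: f_t = (L' S^-1 L)^-1 L' S^-1 x_t."""
    A = np.linalg.solve(Sigma_u, L)          # S^-1 L, shape N x r
    return X @ A @ np.linalg.inv(L.T @ A)    # shape T x r
```

With `Sigma_u` set to the identity matrix, `gls_factors` returns the PCA factors again, which is one way to see that PCA implicitly treats the errors as homoskedastic and uncorrelated; passing the thresholded covariance instead reweights the cross-sectional units.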

Item Type:	MPRA Paper
Original Title:	Efficient Estimation of Approximate Factor Models
Language:	English
Keywords:	High dimensionality; unknown factors; principal components; sparse matrix; conditional sparse; thresholding; cross-sectional correlation; penalized maximum likelihood; adaptive lasso; heteroskedasticity
Subjects:	C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C31 - Cross-Sectional Models ; Spatial Models ; Treatment Effect Models ; Quantile Regressions ; Social Interaction Models
	C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C33 - Panel Data Models ; Spatio-temporal Models
	C - Mathematical and Quantitative Methods > C0 - General > C01 - Econometrics
Item ID:	41558
Depositing User:	Yuan Liao
Date Deposited:	26 Sep 2012 14:27
Last Modified:	08 Oct 2019 21:11
URI:	https://mpra.ub.uni-muenchen.de/id/eprint/41558

Available Versions of this Item

Efficient Estimation of Approximate Factor Models. (deposited 26 Sep 2012 14:27) [Currently Displayed]
- Efficient Estimation of Approximate Factor Models via Regularized Maximum Likelihood. (deposited 01 Oct 2012 13:40)

All papers reproduced by permission. Reproduction and distribution subject to the approval of the copyright owners.
