Bai, Jushan and Liao, Yuan (2012): Efficient Estimation of Approximate Factor Models.
There is a more recent version of this item available. 

Abstract
We study the estimation of a high-dimensional approximate factor model in the presence of both cross-sectional dependence and heteroskedasticity. The classical method of principal components analysis (PCA) does not efficiently estimate the factor loadings or common factors because it essentially treats the idiosyncratic errors as homoskedastic and cross-sectionally uncorrelated. Efficient estimation requires estimating a large error covariance matrix. We assume the model to be conditionally sparse, and propose two approaches to estimating the common factors and factor loadings; both are based on maximizing a Gaussian quasi-likelihood and involve regularizing a large sparse covariance matrix. In the first approach the factor loadings and the error covariance are estimated separately, while in the second approach they are estimated jointly. Extensive asymptotic analysis has been carried out. In particular, we develop the inferential theory for the two-step estimation. Because the proposed approaches take into account the large error covariance matrix, they produce more efficient estimators than the classical PCA methods or methods based on a strict factor model.
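The following is a minimal sketch, loosely inspired by the idea described in the abstract, and not the authors' estimator. It assumes the number of factors `r` is known, uses plain PCA as a first step, soft-thresholds the off-diagonal entries of the residual covariance with a hand-picked level `tau`, and then re-estimates the factors by generalized least squares with the PCA loadings held fixed. It does not maximize the Gaussian quasi-likelihood. The function names, the simulated data, and the value of `tau` are all hypothetical choices made for illustration.

```python
import numpy as np

def pca_factors(Y, r):
    """PCA estimates for an N x T panel Y: loadings (N x r), factors (T x r)."""
    N, T = Y.shape
    # eigh returns eigenvalues in ascending order, so reverse to get the top r.
    _, eigvecs = np.linalg.eigh(Y.T @ Y)
    F = np.sqrt(T) * eigvecs[:, ::-1][:, :r]   # normalized so that F'F / T = I
    L = Y @ F / T
    return L, F

def soft_threshold_cov(U, tau):
    """Sample covariance of N x T residuals U with soft-thresholded off-diagonals."""
    S = U @ U.T / U.shape[1]
    S_thr = np.sign(S) * np.maximum(np.abs(S) - tau, 0.0)
    np.fill_diagonal(S_thr, np.diag(S))        # variances are left untouched
    return S_thr

def gls_factors(Y, L, Sigma_u):
    """GLS factors given loadings: F_t = (L' S^-1 L)^-1 L' S^-1 y_t, as T x r."""
    SiL = np.linalg.solve(Sigma_u, L)          # S^-1 L, shape N x r
    return np.linalg.solve(L.T @ SiL, SiL.T @ Y).T

# Simulated panel with heteroskedastic idiosyncratic errors (illustrative only).
rng = np.random.default_rng(0)
N, T, r = 30, 200, 2
F0 = rng.standard_normal((T, r))
L0 = rng.standard_normal((N, r))
sigma = rng.uniform(0.5, 3.0, size=N)          # unit-specific error scales
Y = L0 @ F0.T + sigma[:, None] * rng.standard_normal((N, T))

L_pca, F_pca = pca_factors(Y, r)                         # step 1: PCA
U_hat = Y - L_pca @ F_pca.T                              # residuals
Sigma_u = soft_threshold_cov(U_hat, tau=0.3)             # step 2: regularize
F_gls = gls_factors(Y, L_pca, Sigma_u)                   # step 3: reweight
```

With `Sigma_u` set to the identity matrix, `gls_factors` returns the PCA factors unchanged, which is one way to see that PCA implicitly weights all cross-sectional units equally. Soft-thresholding does not guarantee a positive definite matrix, so a practical implementation would need to check for that.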
Item Type:  MPRA Paper 

Original Title:  Efficient Estimation of Approximate Factor Models 
Language:  English 
Keywords:  High dimensionality; unknown factors; principal components; sparse matrix; conditional sparse; thresholding; cross-sectional correlation; penalized maximum likelihood; adaptive lasso; heteroskedasticity 
Subjects:  C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models; Multiple Variables > C31 - Cross-Sectional Models; Spatial Models; Treatment Effect Models; Quantile Regressions; Social Interaction Models
           C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models; Multiple Variables > C33 - Models with Panel Data; Longitudinal Data; Spatial Time Series
           C - Mathematical and Quantitative Methods > C0 - General > C01 - Econometrics 
Item ID:  41558 
Depositing User:  Yuan Liao 
Date Deposited:  26. Sep 2012 14:27 
Last Modified:  13. Feb 2013 19:07 
URI:  http://mpra.ub.uni-muenchen.de/id/eprint/41558 
Available Versions of this Item
 Efficient Estimation of Approximate Factor Models. (deposited 26. Sep 2012 14:27) [Currently Displayed]