Endogeneity in ultrahigh dimension

Fan, Jianqing and Liao, Yuan (2012): Endogeneity in ultrahigh dimension.

Abstract

Most papers on high-dimensional statistics are based on the assumption that none of the regressors are correlated with the regression error, namely, they are exogenous. Yet, endogeneity arises easily in high-dimensional regression due to a large pool of regressors, and this causes the inconsistency of the penalized least-squares methods and possible false scientific discoveries. A necessary condition for model selection of a very general class of penalized regression methods is given, which allows us to prove formally the inconsistency claim. To cope with the possible endogeneity, we construct a novel penalized focused generalized method of moments (FGMM) criterion function and offer a new optimization algorithm. The FGMM is not a smooth function. To establish its asymptotic properties, we first study the model selection consistency and an oracle property for a general class of penalized regression methods. These results are then used to show that the FGMM possesses an oracle property even in the presence of endogenous predictors, and that the solution is also near global minimum under the over-identification assumption. Finally, we also show how the semi-parametric efficiency of estimation can be achieved via a two-step approach.
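
The inconsistency claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' FGMM procedure; it is a one-regressor toy example under assumed settings (the names `simulate`, `cov`, `rho`, and the instrument `w` are hypothetical and do not come from the paper). A regressor whose true coefficient is zero is made correlated with the error. Least squares then reports a clearly nonzero slope (in population, rho / (1 + rho^2), about 0.49 for rho = 0.8), whereas a simple moment condition based on an exogenous instrument recovers a slope near zero.

```python
import random


def simulate(n=20000, beta_true=0.0, rho=0.8, seed=0):
    """Draw (w, x, y) where x is endogenous: x = w + rho * eps.

    w is an exogenous instrument, eps is the regression error, and
    y = beta_true * x + eps. All settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [wi + rho * ei for wi, ei in zip(w, eps)]
    y = [beta_true * xi + ei for xi, ei in zip(x, eps)]
    return w, x, y


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


w, x, y = simulate()

# Least squares uses the moment E[x * (y - beta * x)] = 0, which fails
# here because x is correlated with the error.
beta_ols = cov(x, y) / cov(x, x)

# The instrument-based estimator uses E[w * (y - beta * x)] = 0, which
# holds because w is independent of the error.
beta_iv = cov(w, y) / cov(w, x)

print(f"true beta = 0.0, least squares = {beta_ols:.3f}, instrument-based = {beta_iv:.3f}")
```

In this toy setting the least-squares slope stays away from zero no matter how large the sample is, which is the mechanism by which an unimportant but endogenous regressor can be falsely selected.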

Item Type:	MPRA Paper
Original Title:	Endogeneity in ultrahigh dimension
Language:	English
Keywords:	Focused GMM, Sparsity recovery, Endogenous variables, Oracle property, Conditional moment restriction, Estimating equation, Over-identification, Global minimization, Semi-parametric efficiency
Subjects:	C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C13 - Estimation: General; C - Mathematical and Quantitative Methods > C5 - Econometric Modeling > C52 - Model Evaluation, Validation, and Selection; C - Mathematical and Quantitative Methods > C0 - General > C01 - Econometrics
Item ID:	38698
Depositing User:	Yuan Liao
Date Deposited:	10 May 2012 01:43
Last Modified:	26 Sep 2019 21:46
URI:	https://mpra.ub.uni-muenchen.de/id/eprint/38698

All papers reproduced by permission. Reproduction and distribution subject to the approval of the copyright owners.
