Zhu, Ying (2013): Sparse Linear Models and Two-Stage Estimation in High-Dimensional Settings with Possibly Many Endogenous Regressors.
PDF: MPRA_paper_49846.pdf (953kB)
Abstract
This paper explores the validity of the two-stage estimation procedure for sparse linear models in high-dimensional settings with possibly many endogenous regressors. In particular, the number of endogenous regressors in the main equation and the number of instruments in the first-stage equations can grow with, and exceed, the sample size n. The analysis concerns the exact sparsity case: the maximum number of non-zero components in the parameter vectors of the first-stage equations, k1, and the number of non-zero components in the parameter vector of the second-stage equation, k2, are allowed to grow with n, but slowly compared to n. I consider the high-dimensional version of the two-stage least squares estimator, in which one obtains the fitted regressors from the first-stage regression by a least squares estimator with l_1-regularization (the Lasso or Dantzig selector) when the first-stage regression involves a large number of instruments relative to n, and then constructs a similar estimator using these fitted regressors in the second-stage regression. The main theoretical results of this paper are non-asymptotic bounds, from which I establish sufficient scaling conditions on the sample size for estimation consistency in l_2-norm and for variable-selection consistency (i.e., the two-stage high-dimensional estimators correctly select the non-zero coefficients in the main equation with high probability). Allowing the number of regressors in the main equation to exceed n raises a technical issue in the two-stage estimation: the so-called "restricted eigenvalue (RE) condition" for estimation consistency and the "mutual incoherence (MI) condition" for selection consistency must be verified, and this paper provides analysis to verify these RE and MI conditions. Depending on the underlying assumptions that are imposed, the upper bounds on the l_2-error and the sample size required to obtain these consistency results differ by factors involving k1 and/or k2. Simulations are conducted to gain insight into the finite-sample performance of the high-dimensional two-stage estimator.
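The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`lasso_cd`, `two_stage_lasso`, `simulate`), the coordinate-descent solver, the simulated design, and the penalty levels `lam1` and `lam2` are all assumptions chosen for illustration. It covers only the Lasso variant, not the Dantzig selector.

```python
import numpy as np


def soft_threshold(x, t):
    """Scalar soft-thresholding operator used by the Lasso update."""
    return np.sign(x) * max(abs(x) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - Xb||_2^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-column scaling
    resid = y.astype(float).copy()         # residual y - Xb for b = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:           # skip all-zero columns
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != b[j]:
                resid -= X[:, j] * (new - b[j])
                b[j] = new
    return b


def two_stage_lasso(y, X, Z, lam1, lam2):
    """First stage: Lasso of each regressor on the instruments Z.
    Second stage: Lasso of y on the fitted regressors."""
    Pi_hat = np.column_stack(
        [lasso_cd(Z, X[:, j], lam1) for j in range(X.shape[1])]
    )
    X_hat = Z @ Pi_hat
    beta_hat = lasso_cd(X_hat, y, lam2)
    return beta_hat, Pi_hat


def simulate(n=200, p=30, d=10, seed=0):
    """Hypothetical sparse design: k1 = 2 instruments per regressor, k2 = 3."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, p))
    Pi = np.zeros((p, d))
    for j in range(d):
        Pi[j, j] = 1.0
        Pi[j + d, j] = 1.0
    eta = rng.standard_normal((n, d))
    X = Z @ Pi + eta
    beta = np.zeros(d)
    beta[:3] = [1.0, -1.0, 1.0]
    # The structural error is correlated with eta, so X is endogenous.
    eps = 0.5 * eta[:, 0] + 0.5 * rng.standard_normal(n)
    y = X @ beta + eps
    return y, X, Z, beta


if __name__ == "__main__":
    y, X, Z, beta = simulate()
    beta_hat, _ = two_stage_lasso(y, X, Z, lam1=0.2, lam2=0.3)
    print("selected coefficients:", np.flatnonzero(beta_hat))
    print("l2 error:", np.linalg.norm(beta_hat - beta))
```

The penalty levels here are fixed by hand; the paper's results concern how such choices and the sample size must scale with k1, k2, and the dimensions, which this sketch does not attempt to reproduce.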
Item Type: | MPRA Paper |
---|---|
Original Title: | Sparse Linear Models and Two-Stage Estimation in High-Dimensional Settings with Possibly Many Endogenous Regressors |
Language: | English |
Keywords: | High-dimensional statistics; Lasso; sparse linear models; endogeneity; two-stage estimation |
Subjects: | C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General<br>C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C13 - Estimation: General<br>C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models; Multiple Variables > C31 - Cross-Sectional Models; Spatial Models; Treatment Effect Models; Quantile Regressions; Social Interaction Models<br>C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models; Multiple Variables > C36 - Instrumental Variables (IV) Estimation |
Item ID: | 49846 |
Depositing User: | Ms Ying Zhu |
Date Deposited: | 18 Sep 2013 12:45 |
Last Modified: | 09 Oct 2019 04:52 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/49846 |