Sparse Linear Models and l1−Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments

Zhu, Ying (2015): Sparse Linear Models and l1−Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments.

This is the latest version of this item.


Abstract

We explore the validity of the 2-stage least squares estimator with l1−regularization in both stages, for linear models where the number of endogenous regressors in the main equation and the number of instruments in the first-stage equations can both exceed the sample size, and where the regression coefficients belong to lq−“balls” for q in [0, 1], covering both the exact and the approximate sparsity cases. Standard high-level assumptions on the Gram matrix for l2−consistency require careful verification in the two-stage procedure, for which we provide a detailed analysis. We establish finite-sample bounds and conditions under which our estimator achieves l2−consistency and variable selection consistency. Practical guidance for choosing the regularization parameters is provided.
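The two-stage procedure described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the paper's implementation: the solver (plain proximal gradient, ISTA), the function names `lasso_ista` and `l1_regularized_2sls`, the simulated design, and the rule-of-thumb penalties of order sqrt(log(dimension)/n) are all assumptions made here and are not taken from the paper's own guidance on regularization parameters.

```python
import numpy as np

def soft_threshold(A, t):
    """Elementwise soft-thresholding, the proximal map of t * ||.||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def lasso_ista(X, Y, lam, n_iter=500):
    """Approximately minimise (1/(2n))||Y - X B||^2 + lam * sum|B| by ISTA.

    Y may be a vector or a matrix; each column of Y is a separate Lasso problem.
    """
    n, k = X.shape
    Y2 = Y.reshape(n, -1)
    B = np.zeros((k, Y2.shape[1]))
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the smooth part
    if L > 0.0:  # an all-zero design leaves B at zero
        for _ in range(n_iter):
            grad = X.T @ (X @ B - Y2) / n
            B = soft_threshold(B - grad / L, lam / L)
    return B.reshape((k,) + Y.shape[1:])

def l1_regularized_2sls(y, X, Z, lam1, lam2, n_iter=500):
    """Hypothetical helper: Lasso in the first stage, Lasso in the second."""
    # Stage 1: regress every endogenous regressor on the instruments.
    Pi_hat = lasso_ista(Z, X, lam1, n_iter)        # shape (d, p)
    X_hat = Z @ Pi_hat                             # fitted regressors
    # Stage 2: regress the outcome on the fitted regressors.
    beta_hat = lasso_ista(X_hat, y, lam2, n_iter)  # shape (p,)
    return beta_hat, Pi_hat

# Simulated example (all values are illustrative assumptions):
# more instruments (d) and more regressors (p) than observations (n).
rng = np.random.default_rng(0)
n, d, p = 100, 120, 110
Z = rng.normal(size=(n, d))
Pi = np.zeros((d, p))
Pi[np.arange(p), np.arange(p)] = 1.0         # one strong instrument per regressor
V = 0.5 * rng.normal(size=(n, p))            # first-stage errors
X = Z @ Pi + V
beta = np.zeros(p)
beta[:3] = 1.0                               # exactly sparse coefficients (q = 0)
eps = 0.8 * V[:, 0] + 0.3 * rng.normal(size=n)  # correlated with X[:, 0]: endogeneity
y = X @ beta + eps

lam1 = 0.5 * np.sqrt(np.log(d) / n)          # rule-of-thumb penalties (assumption)
lam2 = 0.5 * np.sqrt(np.log(p) / n)
beta_hat, Pi_hat = l1_regularized_2sls(y, X, Z, lam1, lam2)
print("nonzero second-stage coefficients:", int(np.count_nonzero(beta_hat)))
```

Solving all first-stage regressions in one matrix-valued ISTA loop keeps the sketch short; a production implementation would more likely use a dedicated Lasso solver and data-driven penalty choices.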

Item Type:	MPRA Paper
Original Title:	Sparse Linear Models and l1−Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments
Language:	English
Keywords:	High-dimensional statistics; Lasso; sparse linear models; endogeneity; two-stage estimation
Subjects:	C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General
	C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C13 - Estimation: General
	C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C31 - Cross-Sectional Models ; Spatial Models ; Treatment Effect Models ; Quantile Regressions ; Social Interaction Models
	C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C36 - Instrumental Variables (IV) Estimation
Item ID:	65703
Depositing User:	Ms Ying Zhu
Date Deposited:	23 Jul 2015 09:22
Last Modified:	30 Sep 2019 04:46
URI:	https://mpra.ub.uni-muenchen.de/id/eprint/65703

Available Versions of this Item

- Sparse Linear Models and Two-Stage Estimation in High-Dimensional Settings with Possibly Many Endogenous Regressors. (deposited 18 Sep 2013 12:45)
- Sparse Linear Models and l1−Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments. (deposited 23 Jul 2015 09:22) [Currently Displayed]

