Zhu, Ying (2015): Sparse Linear Models and l1-Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments.
This is the latest version of this item.

Abstract
We explore the validity of the 2-stage least squares estimator with l1-regularization in both stages, for linear models where the numbers of endogenous regressors in the main equation and instruments in the first-stage equations can exceed the sample size, and the regression coefficients belong to lq-“balls” for q in [0, 1], covering both exact and approximate sparsity cases. Standard high-level assumptions on the Gram matrix for l2-consistency require careful verifications in the two-stage procedure, for which we provide detailed analysis. We establish finite-sample bounds and conditions for our estimator to achieve l2-consistency and variable selection consistency. Practical guidance for choosing the regularization parameters is provided.
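The two-stage procedure described in the abstract can be illustrated with a minimal sketch, which is not taken from the paper. It assumes a plain coordinate-descent Lasso solver and fixed, user-supplied penalty levels; the function names `lasso_cd` and `l1_2sls` and the parameters `lam1` and `lam2` are hypothetical, and the paper's own guidance on choosing the regularization parameters is not reproduced here. The first stage runs a Lasso of each endogenous regressor on the instruments; the second stage runs a Lasso of the outcome on the first-stage fitted values.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||_2^2 + lam * ||b||_1 by coordinate descent."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n          # (1/n) * ||x_j||^2 for each column
    r = np.asarray(y, dtype=float).copy()      # residual y - X b (b starts at zero)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                       # an all-zero column keeps b_j = 0
            # correlation of column j with the partial residual that excludes b_j
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            b_new = soft_threshold(rho, lam) / col_sq[j]
            r += X[:, j] * (b[j] - b_new)      # keep the residual in sync with b
            b[j] = b_new
    return b


def l1_2sls(y, X, Z, lam1, lam2):
    """Illustrative l1-regularized 2SLS (assumed penalties lam1, lam2).

    Stage 1: Lasso of each endogenous regressor X[:, j] on the instruments Z.
    Stage 2: Lasso of y on the fitted values Z @ Pi_hat.
    """
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    Pi_hat = np.column_stack(
        [lasso_cd(Z, X[:, j], lam1) for j in range(X.shape[1])]
    )
    X_hat = Z @ Pi_hat
    beta_hat = lasso_cd(X_hat, y, lam2)
    return beta_hat, Pi_hat
```

Calling `l1_2sls(y, X, Z, lam1, lam2)` with an outcome vector `y`, an n-by-p matrix `X` of endogenous regressors, and an n-by-d matrix `Z` of instruments returns the second-stage coefficient estimate and the d-by-p matrix of first-stage coefficients. Both p and d may exceed n, since each stage is a penalized regression.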
Item Type:  MPRA Paper 

Original Title:  Sparse Linear Models and l1-Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments 
Language:  English 
Keywords:  High-dimensional statistics; Lasso; sparse linear models; endogeneity; two-stage estimation 
Subjects:
C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General
C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C13 - Estimation: General
C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C31 - Cross-Sectional Models ; Spatial Models ; Treatment Effect Models ; Quantile Regressions ; Social Interaction Models
C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C36 - Instrumental Variables (IV) Estimation
Item ID:  65703 
Depositing User:  Ms Ying Zhu 
Date Deposited:  23 Jul 2015 09:22 
Last Modified:  16 Mar 2019 14:30 
URI:  https://mpra.ub.uni-muenchen.de/id/eprint/65703 
Available Versions of this Item

Sparse Linear Models and Two-Stage Estimation in High-Dimensional Settings with Possibly Many Endogenous Regressors. (deposited 18 Sep 2013 12:45)
Sparse Linear Models and l1-Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments. (deposited 23 Jul 2015 09:22) [Currently Displayed]