Zhu, Ying (2015): Sparse Linear Models and l1−Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments.
Abstract
We explore the validity of the two-stage least squares (2SLS) estimator with l_{1}-regularization in both stages, for linear regression models in which the number of endogenous regressors in the main equation and the number of instruments in the first-stage equations can exceed the sample size, and the regression coefficients are sufficiently sparse. For this l_{1}-regularized 2SLS estimator, we establish finite-sample performance bounds. We then provide a simple practical method (with asymptotic guarantees) for choosing the regularization parameter. We show that, under quite standard conditions, this practical method can produce an l_{2}-consistent 2SLS estimator whose rate of convergence can be made arbitrarily close to the scaling of our finite-sample performance bounds.
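As a rough illustration of the two-stage idea described in the abstract, the sketch below runs a Lasso of each endogenous regressor on the instruments, then a Lasso of the outcome on the fitted regressors. It is a minimal sketch under stated assumptions, not the paper's procedure: the coordinate-descent solver, the helper names (`lasso_cd`, `l1_2sls`), the simulated design, and the penalty levels of order sqrt(log(dimension)/n) are all illustrative choices, and the paper's own rule for choosing the regularization parameter is not reproduced here.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=50):
    """Approximately minimise (1/(2n))||y - Xb||_2^2 + lam * ||b||_1
    by cyclic coordinate descent (hypothetical helper, fixed iteration count)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-column curvature
    r = y.copy()                           # running residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                   # skip all-zero columns
            r += X[:, j] * b[j]            # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]            # put updated coordinate back
    return b

def l1_2sls(y, X, Z, lam1, lam2):
    """Two-stage sketch: Lasso of each column of X on Z, then Lasso of y on Z @ Pi_hat."""
    Pi_hat = np.column_stack(
        [lasso_cd(Z, X[:, j], lam1) for j in range(X.shape[1])]
    )
    X_hat = Z @ Pi_hat                     # first-stage fitted regressors
    beta_hat = lasso_cd(X_hat, y, lam2)    # second-stage sparse coefficients
    return beta_hat, Pi_hat

# Simulated example (assumed design): more regressors and instruments than observations.
rng = np.random.default_rng(0)
n, d, p = 50, 60, 55
Z = rng.standard_normal((n, d))
Pi = np.zeros((d, p))
for j in range(p):
    Pi[j, j] = 1.0                         # each regressor is driven by one instrument
eps = rng.standard_normal(n)
V = 0.5 * eps[:, None] + 0.5 * rng.standard_normal((n, p))  # correlated with eps: endogeneity
X = Z @ Pi + V
beta = np.zeros(p)
beta[:3] = [1.0, -1.0, 0.5]                # sparse main-equation coefficients
y = X @ beta + eps

lam1 = 0.5 * np.sqrt(np.log(d) / n)        # assumed penalty scaling, not the paper's rule
lam2 = 0.5 * np.sqrt(np.log(p) / n)
beta_hat, Pi_hat = l1_2sls(y, X, Z, lam1, lam2)
print("nonzero second-stage coefficients:", np.flatnonzero(beta_hat))
```

The solver is written out in NumPy only so that the example is self-contained; any Lasso implementation with the same objective could be substituted in both stages.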
Item Type: | MPRA Paper |
---|---|
Original Title: | Sparse Linear Models and l1−Regularized 2SLS with High-Dimensional Endogenous Regressors and Instruments |
Language: | English |
Keywords: | High-dimensional statistics; Lasso; sparse linear models; endogeneity; two-stage estimation |
Subjects: | C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C13 - Estimation: General C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C31 - Cross-Sectional Models ; Spatial Models ; Treatment Effect Models ; Quantile Regressions ; Social Interaction Models C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C36 - Instrumental Variables (IV) Estimation |
Item ID: | 81217 |
Depositing User: | Ms Ying Zhu |
Date Deposited: | 10 Sep 2017 07:29 |
Last Modified: | 27 Sep 2019 14:02 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/81217 |