Leeb, Hannes and Pötscher, Benedikt M. and Ewald, Karl (2014): *On various confidence intervals post-model-selection.*


## Abstract

We compare several confidence intervals after model selection in the setting recently studied by Berk et al. (2013), where the goal is to cover not the true parameter but a certain non-standard quantity of interest that depends on the selected model. In particular, we compare the PoSI-intervals that are proposed in that reference with the 'naive' confidence interval, which is constructed as if the selected model were correct and fixed a priori (thus ignoring the presence of model selection). Overall, we find that the actual coverage probabilities of all these intervals deviate only moderately from the desired nominal coverage probability. This finding is in stark contrast to several papers in the existing literature, where the goal is to cover the true parameter.

Item Type: | MPRA Paper |
---|---|
Original Title: | On various confidence intervals post-model-selection |
Language: | English |
Keywords: | Confidence intervals, model selection |
Subjects: | C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General |
Item ID: | 58326 |
Depositing User: | Benedikt Poetscher |
Date Deposited: | 04 Sep 2014 18:18 |
Last Modified: | 27 Sep 2019 00:43 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/58326 |

## References

- D. W. K. Andrews and P. Guggenberger. Hybrid and size-corrected subsampling methods. Econometrica, 77, 721-762, 2009.
- R. Berk, L. Brown, A. Buja, K. Zhang, and L. Zhao. Valid post-selection inference. Ann. Statist., 41, 802-837, 2013.
- P. J. Bickel and K. A. Doksum. Mathematical Statistics: Basic Ideas and Selected Topics. Holden-Day, Oakland, 1977.
- L. D. Brown. The conditional level of Student's t test. Ann. Math. Stat., 38, 1068-1071, 1967.
- R. J. Buehler and A. P. Feddersen. Note on a conditional property of Student's t. Ann. Math. Stat., 34, 1098-1100, 1963.
- P. Craven and G. Wahba. Smoothing noisy data with spline functions. Estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math., 31, 377-403, 1978.
- T. K. Dijkstra and J. H. Veldkamp. Data-driven selection of regressors and the bootstrap. Lecture Notes in Econom. and Math. Systems, 307, 17-38, 1988.
- B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Ann. Statist., 32, 407-499, 2004.
- K. Ewald. On the influence of model selection on confidence regions for marginal associations in the linear model. Master's thesis, University of Vienna, 2012.
- P. Kabaila. Valid confidence intervals in regression after variable selection. Econometric Theory, 14, 463-482, 1998.
- P. Kabaila. The coverage properties of confidence regions after model selection. Int. Statist. Rev., 77, 405-414, 2009.
- P. Kabaila and H. Leeb. On the large-sample minimal coverage probability of confidence intervals after model selection. J. Amer. Statist. Assoc., 101, 619-629, 2006.
- H. Leeb. The distribution of a linear predictor after model selection: unconditional finite-sample distributions and asymptotic approximations. IMS Lecture Notes - Monograph Series, 49, 291-311, 2006.
- H. Leeb. Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process. Bernoulli, 14, 661-690, 2008.
- H. Leeb and B. M. Pötscher. The finite-sample distribution of post-model-selection estimators, and uniform versus non-uniform approximations. Econometric Theory, 19, 100-142, 2003.
- H. Leeb and B. M. Pötscher. Model selection and inference: Facts and fiction. Econometric Theory, 21, 21-59, 2005.
- H. Leeb and B. M. Pötscher. Can one estimate the conditional distribution of post-model-selection estimators? Ann. Statist., 34, 2554-2591, 2006a.
- H. Leeb and B. M. Pötscher. Performance limits for estimators of the risk or distribution of shrinkage-type estimators, and some general lower risk-bound results. Econometric Theory, 22, 69-97, 2006b.
- H. Leeb and B. M. Pötscher. Can one estimate the unconditional distribution of post-model-selection estimators? Econometric Theory, 24, 338-376, 2008a.
- H. Leeb and B. M. Pötscher. Model selection. In T. G. Andersen, R. A. Davis, J.-P. Kreiß, and Th. Mikosch, editors, Handbook of Financial Time Series, pages 785-821, New York, NY, 2008b. Springer.
- R. A. Olshen. The conditional level of the F-test. J. Amer. Statist. Assoc., 68, 692-698, 1973.
- B. M. Pötscher. Effects of model selection on inference. Econometric Theory, 7, 163-185, 1991.
- B. M. Pötscher. The distribution of model averaging estimators and an impossibility result regarding its estimation. IMS Lecture Notes - Monograph Series, 52, 113-129, 2006.
- B. M. Pötscher. Confidence sets based on sparse estimators are necessarily large. Sankhya, 71, 1-18, 2009.
- B. M. Pötscher and H. Leeb. On the distribution of penalized maximum likelihood estimators: The LASSO, SCAD, and thresholding. J. Multivariate Anal., 100, 2065-2082, 2009.
- B. M. Pötscher and U. Schneider. On the distribution of the adaptive LASSO estimator. J. Statist. Plann. Inference, 139, 2775-2790, 2009.
- B. M. Pötscher and U. Schneider. Confidence sets based on penalized maximum likelihood estimators in Gaussian regression. Electron. J. Statist., 4, 334-360, 2010.
- B. M. Pötscher and U. Schneider. Distributional results for thresholding estimators in high-dimensional Gaussian regression models. Electron. J. Statist., 5, 1876-1934, 2011.
- J. O. Rawlings. Applied Regression Analysis: A Research Tool. Springer Verlag, New York, NY, 1998.
- P. K. Sen. Asymptotic properties of maximum likelihood estimators based on conditional specification. Ann. Statist., 7, 1019-1033, 1979.
- P. K. Sen and E. A. K. Md. Saleh. On preliminary test and shrinkage M-estimation in linear models. Ann. Statist., 15, 1580-1592, 1987.
- J. W. Tukey. Discussion of 'Topics in the investigation of linear relations fitted by the method of least squares' by F. J. Anscombe. J. Roy. Statist. Soc. Ser. B, 29, 47-48, 1967.