Eisele, Martin and Zhu, Junyi (2013): Multiple imputation in a complex household survey - the German Panel on Household Finances (PHF): challenges and solutions.
Abstract
In this paper, we present a case study of imputation in a complex household survey - the first wave of the German Panel on Household Finances (PHF). A household wealth survey has to be built on a questionnaire with a rather complex logical structure, mainly because many wealth items have to be probed on both the extensive and the intensive margin. As a result, the number of potential predictors for each imputation model grows, and standard modelling faces more departures from its assumptions owing to, e.g., irregular missing patterns, interdependent logical constraints and data anomalies. Our model selection procedure borrows techniques from out-of-sample prediction to handle the overfitting often associated with the introduction of a large number of predictors. We also take measures to produce an ex ante evaluation of the modelling, which can be more efficient than the diagnostics commonly carried out after imputation in practice. Solutions to the difficulties arising from the real data and the questionnaire structure are also presented. In addition, we use the rich flagging information to develop various measures of item-nonresponse in order to assess the complication arising from this logical structure. We find that the information loss due to the contagion of item-nonresponse between variables is not serious in our imputed data.
Item Type: | MPRA Paper |
---|---|
Original Title: | Multiple imputation in a complex household survey - the German Panel on Household Finances (PHF): challenges and solutions |
Language: | English |
Keywords: | Multiple imputation, Model selection, Panel on household finance, item-nonresponse evaluation |
Subjects: | C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C15 - Statistical Simulation Methods: General; C - Mathematical and Quantitative Methods > C5 - Econometric Modeling > C52 - Model Evaluation, Validation, and Selection |
Item ID: | 57666 |
Depositing User: | Junyi Zhu |
Date Deposited: | 31 Jul 2014 13:52 |
Last Modified: | 27 Sep 2019 11:15 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/57666 |