Calzolari, Giorgio and Neri, Laura (2002): Imputation of continuous variables missing at random using the method of simulated scores. Published in: Compstat 2002, Proceedings in Computational Statistics, 15th Symposium held in Berlin. Ed. by W. Haerdle and B. Roenz. Heidelberg: Physica-Verlag (2002): pp. 389-394.
For multivariate datasets with missing values, we present a procedure of statistical inference and state its "optimal" properties. Two main assumptions are needed: (1) data are missing at random (MAR); (2) the data generating process is a multivariate normal linear regression. Disentangling the problem of convergence of the iterative estimation/imputation procedure, we show that the estimator is a "method of simulated scores" (a particular case of McFadden's "method of simulated moments"); thus the estimator is equivalent to maximum likelihood if the number of replications is sufficiently large, and the whole procedure can be considered an optimal parametric technique for imputation of missing data.
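As a rough illustration of what an iterative estimation/imputation loop can look like under the two assumptions above, here is a minimal Python sketch. It is not the authors' implementation: it assumes a single regression equation in which only `y` has missing values and `x` is fully observed, and the function name `iterative_impute`, its parameters, and the simulated data are all hypothetical. The simulated errors are drawn once and reused across iterations, and several replications are stacked in the estimation step.

```python
import numpy as np


def iterative_impute(x, y, n_rep=50, n_iter=500, tol=1e-10, seed=0):
    """Alternate OLS estimation and stochastic imputation of y given x.

    Hypothetical helper: y may contain NaN (missing), x is fully observed.
    """
    rng = np.random.default_rng(seed)
    miss = np.isnan(y)
    X = np.column_stack([np.ones_like(x), x])
    # Standard-normal draws are generated once and reused in every iteration,
    # so each iteration is a deterministic update of the parameters.
    z = rng.standard_normal((n_rep, miss.sum()))

    # Starting values from the complete cases only.
    beta, *_ = np.linalg.lstsq(X[~miss], y[~miss], rcond=None)
    sigma = np.std(y[~miss] - X[~miss] @ beta)

    y_fill = np.tile(y, (n_rep, 1))  # one completed data set per replication
    for _ in range(n_iter):
        # Imputation step: conditional mean plus a scaled simulated error.
        y_fill[:, miss] = X[miss] @ beta + sigma * z
        # Estimation step: OLS on the stacked completed data sets. Because X
        # is identical across replications, this equals OLS on the mean of y.
        beta_new, *_ = np.linalg.lstsq(X, y_fill.mean(axis=0), rcond=None)
        sigma_new = np.sqrt(np.mean((y_fill - X @ beta_new) ** 2))
        converged = (np.max(np.abs(beta_new - beta)) < tol
                     and abs(sigma_new - sigma) < tol)
        beta, sigma = beta_new, sigma_new
        if converged:
            break
    return beta, sigma, y_fill


# Simulated example (assumed values: intercept 1, slope 2, error s.d. 0.5).
rng = np.random.default_rng(42)
n = 2000
x = rng.standard_normal(n)
y = 1.0 + 2.0 * x + 0.5 * rng.standard_normal(n)
# MAR: the probability that y is missing depends only on the observed x.
p_miss = 0.6 / (1.0 + np.exp(-x))
y_obs = np.where(rng.random(n) < p_miss, np.nan, y)

beta, sigma, completed = iterative_impute(x, y_obs)
print(beta, sigma)
```

In this toy setting the estimates should land close to the values used to simulate the data, and `completed` holds one imputed copy of `y` per replication. Raising `n_rep` reduces the simulation noise that the fixed draws add to the estimates.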
|Item Type:||MPRA Paper|
|Original Title:||Imputation of continuous variables missing at random using the method of simulated scores|
|Keywords:||Simulated scores; missing data; estimation/imputation; structural form; reduced form|
|Subjects:||C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C15 - Statistical Simulation Methods: General|
|Depositing User:||Giorgio Calzolari|
|Date Deposited:||04. Jun 2010 20:12|
|Last Modified:||12. Feb 2013 19:09|
Calzolari G., Neri L. (2002): "A Method of Simulated Scores for Imputation of Continuous Variables Missing At Random", Quaderni del Dipartimento di Statistica "G.Parenti" No. 49, Universita' degli studi di Firenze.
Gourieroux C., Monfort A. (1996): Simulation-Based Econometric Methods. Oxford University Press.
Greene W. H. (2000): Econometric Analysis (fourth edition). Upper Saddle River, NJ: Prentice-Hall, Inc.
Hajivassiliou V., McFadden D. (1990): "The Method of Simulated Scores, with Application to Models of External Debt Crises", Cowles Foundation Discussion Paper No. 967, Yale University.
Horton N. J. and Lipsitz S. R. (2001): "Multiple Imputation in Practice: Comparison of Software Packages for Regression Models with Missing Variables", The American Statistician, 55, 244-254.
Little R. J. A., Rubin D. B. (1987): Statistical Analysis with Missing Data. New York: Wiley.
McFadden D. (1989): "A Method of Simulated Moments for Estimation of Discrete Response Models without Numerical Integration", Econometrica, 57, 995-1026.
Raghunathan T. E.: www.isr.umich.edu/src/smp/ive.
Raghunathan T. E., Lepkowski J., Van Hoewyk J., Solenberger P. (1997): "A Multivariate Technique for Imputing Missing Values Using a Sequence of Regression Models", Technical Report, Survey Methodology Program, Survey Research Center, ISR, University of Michigan.
Rubin D. B. (1976): "Inference with Missing Data", Biometrika, 63, 581-592.
Rubin D. B. (1978): "Multiple Imputations in Sample Surveys: A Phenomenological Bayesian Approach to Nonresponse", Proceedings of the Survey Research Methods Section of the American Statistical Association, 20-34, with discussion and reply.
Rubin D. B. (1987): Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
Rubin D. B. (2000): "The Broad Role of Multiple Imputations in Statistical Science", in Proceedings in Computational Statistics, 14th Symposium, Utrecht, The Netherlands, 2000, ed. by J. G. Bethlehem and P. G. M. van der Heijden. Vienna: Physica-Verlag, 3-14.
Schafer J. L. (1997): Analysis of Incomplete Multivariate Data. London: Chapman and Hall.
Thisted, R. A. (1988): Elements of Statistical Computing. New York: Chapman and Hall.