Arpino, Bruno and Mealli, Fabrizia (2008): The specification of the propensity score in multilevel observational studies.

Abstract
Propensity Score Matching (PSM) has become a popular approach to estimating causal effects. It relies on the assumption that selection into a treatment can be explained purely in terms of observable characteristics (the “unconfoundedness assumption”) and on the property that balancing on the propensity score is equivalent to balancing on the observed covariates. Several applications in the social sciences are characterized by a hierarchical data structure: units at the first level (e.g., individuals) clustered into groups (e.g., provinces). In this paper we explore the use of multilevel models for the estimation of the propensity score for such hierarchical data when one or more relevant cluster-level variables are unobserved. We compare this approach with alternatives, such as a single-level model with cluster dummies. Using Monte Carlo evidence, we show that multilevel specifications usually achieve reasonably good balance in unobserved cluster-level covariates and consequently reduce the omitted variable bias. This is also the case for the dummy model.
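As a rough illustration of the dummy-model idea in the abstract, the following Python sketch simulates clustered data and checks balance in an unobserved cluster-level covariate. It is not the authors' Monte Carlo design: the data-generating process, the names `u`, `weighted_gap`, `pooled` and `dummy`, and the cluster counts are all assumptions made here. Two simplifications are swapped in to keep it short. First, there are no individual-level covariates, so a logit with a full set of cluster dummies is saturated and its fitted score is just each cluster's treated share. Second, inverse-propensity weighting is used in place of matching.

```python
import math
import random

random.seed(0)

# Hypothetical clustered data: u is a cluster-level covariate that drives
# treatment uptake but is treated as unobserved by the analyst.
rows = []  # (cluster id, u, treated flag)
for j in range(30):
    u = random.gauss(0.0, 1.0)
    p_treat = 1.0 / (1.0 + math.exp(-u))  # true propensity depends only on u
    for _ in range(200):
        rows.append((j, u, 1 if random.random() < p_treat else 0))


def weighted_gap(rows, pscore):
    """Inverse-propensity-weighted mean of u, treated minus control."""
    num = {1: 0.0, 0: 0.0}
    den = {1: 0.0, 0: 0.0}
    for j, u, t in rows:
        p = pscore[j]
        if p <= 0.0 or p >= 1.0:
            continue  # cluster has no overlap; drop it from the comparison
        w = 1.0 / p if t == 1 else 1.0 / (1.0 - p)
        num[t] += w * u
        den[t] += w
    return num[1] / den[1] - num[0] / den[0]


clusters = sorted({j for j, _, _ in rows})

# Pooled specification: one common score that ignores the clustering.
overall = sum(t for _, _, t in rows) / len(rows)
pooled = {j: overall for j in clusters}

# Dummy specification (assumption: no other covariates, so the saturated
# logit's fitted score equals the treated share within each cluster).
size = {j: 0 for j in clusters}
treated = {j: 0 for j in clusters}
for j, _, t in rows:
    size[j] += 1
    treated[j] += t
dummy = {j: treated[j] / size[j] for j in clusters}

gap_pooled = weighted_gap(rows, pooled)
gap_dummy = weighted_gap(rows, dummy)
print("imbalance in u, pooled score:", round(gap_pooled, 3))
print("imbalance in u, dummy score: ", round(gap_dummy, 12))
```

In this stripped-down setting the dummy score balances `u` exactly by construction, while the pooled score leaves treated units with systematically higher `u`. With individual covariates, finite-sample matching, or a random-intercept model, balance would only be approximate, which is the situation the paper's simulations address.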
Item Type:  MPRA Paper 

Original Title:  The specification of the propensity score in multilevel observational studies 
Language:  English 
Keywords:  propensity score, multilevel studies, unconfoundedness, causal inference 
Subjects:  C - Mathematical and Quantitative Methods > C2 - Single Equation Models; Single Variables > C21 - Cross-Sectional Models; Spatial Models; Treatment Effect Models; Quantile Regressions
C - Mathematical and Quantitative Methods > C0 - General > C01 - Econometrics
Item ID:  17407 
Depositing User:  Bruno Arpino 
Date Deposited:  20. Sep 2009 10:35 
Last Modified:  17. Feb 2014 17:41 
URI:  http://mpra.ub.uni-muenchen.de/id/eprint/17407