López, Alberto (2011): The effect of microaggregation on regression results: an application to Spanish innovation data.
Microaggregation is a technique for masking confidential data by aggregation. This paper analyzes the extent to which microaggregated data can be used for rigorous empirical research. Taking an empirical approach, I use data from the Technological Innovation Panel (PITEC) and compare regression results obtained from the original and the anonymized data. PITEC is a new firm-level panel database on the innovative activities of Spanish firms, based on CIS data. I find that the microaggregation procedure used has a slight effect on the coefficient estimates and their estimated standard errors, especially when estimating linear models.
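To make the comparison described in the abstract concrete, the following is a minimal sketch of microaggregation by individual ranking (the variant named in the keywords) followed by a simple regression on original and masked data. It is not the procedure applied to PITEC: the group size `k=3`, the helper names `microaggregate` and `ols_slope`, and the synthetic data are all assumptions made for illustration.

```python
import random
from statistics import mean


def microaggregate(values, k=3):
    """Replace each value by the mean of its group of k rank-adjacent values.

    Hypothetical helper; k=3 is an assumed group size, not taken from the paper.
    """
    n = len(values)
    if n < k:
        raise ValueError("need at least k observations")
    # Indices of the observations, ordered by value.
    order = sorted(range(n), key=lambda i: values[i])
    masked = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        # Fold a short tail into the last group so no group has fewer than k members.
        if n - end < k:
            end = n
        group = order[start:end]
        group_mean = mean(values[i] for i in group)
        for i in group:
            masked[i] = group_mean
        start = end
    return masked


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Synthetic data (an assumption, not PITEC): y = 2x + noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [2.0 * a + rng.gauss(0, 1) for a in x]

# Individual ranking: each variable is sorted and aggregated separately.
x_masked = microaggregate(x, k=3)
y_masked = microaggregate(y, k=3)

slope_original = ols_slope(x, y)
slope_masked = ols_slope(x_masked, y_masked)
print(f"slope on original data: {slope_original:.3f}")
print(f"slope on masked data:   {slope_masked:.3f}")
```

Because each variable is ranked and averaged on its own, group means replace individual values without changing the variable's overall mean. Comparing the two printed slopes gives a rough sense of how much the masking moves an estimate in this toy setting.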
|Item Type:||MPRA Paper|
|Original Title:||The effect of microaggregation on regression results: an application to Spanish innovation data|
|Keywords:||Microaggregation; Individual ranking; Bias; Innovation data|
|Subjects:||O - Economic Development, Innovation, Technological Change, and Growth > O3 - Innovation ; Research and Development ; Technological Change ; Intellectual Property Rights > O30 - General<br>C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology ; Computer Programs > C80 - General|
|Depositing User:||Alberto López|
|Date Deposited:||21. Apr 2011 12:15|
|Last Modified:||24. Feb 2015 18:30|