Munich Personal RePEc Archive

Estimation bias due to duplicated observations: a Monte Carlo simulation

Sarracino, Francesco and Mikucka, Malgorzata (2016): Estimation bias due to duplicated observations: a Monte Carlo simulation.


This paper assesses how duplicate records affect the results of regression analysis of survey data, and it compares the effectiveness of five solutions for minimizing the risk of obtaining biased estimates. Results show that duplicate records create a considerable risk of obtaining biased estimates. The chance of obtaining unbiased estimates in the presence of a single sextuplet of identical observations is 41.6%. If the dataset contains about 10% duplicated observations, the probability of obtaining unbiased estimates falls to nearly 11%. Weighting the duplicate cases by the inverse of their multiplicity minimizes the bias when multiple doublets are present in the data. Our results demonstrate the risks of using data in the presence of non-unique observations and call for further research on strategies for analyzing affected data.
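To make the weighting idea in the abstract concrete, the following is a minimal sketch of weighting each record by the inverse of its multiplicity before fitting a regression. It is not the authors' simulation code: the toy data, the single-regressor model, and the helper names `inverse_multiplicity_weights` and `weighted_simple_ols` are assumptions introduced purely for illustration.

```python
from collections import Counter


def inverse_multiplicity_weights(rows):
    """Return one weight per row: 1 / (number of identical copies of that row)."""
    counts = Counter(rows)
    return [1.0 / counts[row] for row in rows]


def weighted_simple_ols(x, y, w):
    """Weighted least squares for y = a + b*x; returns (intercept, slope)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data: (x, y) records. The last record appears six times,
# mimicking a single sextuplet of identical observations.
unique_rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 3.0)]
rows = unique_rows[:-1] + [unique_rows[-1]] * 6

x = [r[0] for r in rows]
y = [r[1] for r in rows]
weights = inverse_multiplicity_weights(rows)

# Naive fit: every copy counts as a full observation.
naive = weighted_simple_ols(x, y, [1.0] * len(rows))

# Weighted fit: each copy of a duplicated record gets weight 1 / multiplicity.
weighted = weighted_simple_ols(x, y, weights)

# Reference fit on the data with the extra copies removed.
clean = weighted_simple_ols(
    [r[0] for r in unique_rows],
    [r[1] for r in unique_rows],
    [1.0] * len(unique_rows),
)

print("naive slope:   ", naive[1])
print("weighted slope:", weighted[1])
print("clean slope:   ", clean[1])
```

In this sketch the duplicates are exact copies, so the weighted coefficients coincide (up to floating-point error) with those from the deduplicated data, while the naive fit is pulled toward the repeated record. The sketch only shows the mechanics of the weights; it does not reproduce the paper's Monte Carlo design or its comparison of the five solutions.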

MPRA is a RePEc service hosted by the University Library LMU Munich.