Munich Personal RePEc Archive

The k-NN algorithm for compositional data: a revised approach with and without zero values present

Tsagris, Michail (2014): The k-NN algorithm for compositional data: a revised approach with and without zero values present. Published in: Journal of Data Science, Vol. 3, No. 12 (July 2014): pp. 519-534.

Abstract

In compositional data, an observation is a vector with non-negative components that sum to a constant, typically 1. Data of this type arise in many areas, including geology, archaeology, biology, economics and political science. The goal of this paper is to extend the taxicab metric and a newly suggested metric for compositional data by employing a power transformation. Both metrics are to be used in the k-nearest neighbours algorithm regardless of the presence of zeros. Examples with real data are exhibited.
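
As a rough illustration of the idea in the abstract, the sketch below runs k-nearest neighbours with a taxicab distance computed on power-transformed compositions. The abstract does not give the exact form of the transformation, so the version used here (raise each component to a power alpha, then rescale to sum to 1) is an assumption, as are the function names, the toy data and the values of k and alpha. It is not the paper's implementation. With alpha > 0, zero components stay at zero and need no special treatment.

```python
from collections import Counter


def power_transform(x, alpha):
    """Raise each component to the power alpha and rescale to sum to 1.

    Assumed form of the power transformation; zeros stay zero when alpha > 0.
    """
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    return [p / total for p in powered]


def taxicab(u, v):
    """Taxicab (L1) distance between two equal-length vectors."""
    return sum(abs(ui - vi) for ui, vi in zip(u, v))


def knn_predict(train_x, train_y, query, k=3, alpha=0.5):
    """Classify `query` by majority vote among its k nearest training points."""
    q = power_transform(query, alpha)
    # Pair each training label with its distance to the query, nearest first.
    dists = sorted(
        (taxicab(power_transform(x, alpha), q), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy compositions (each sums to 1); two of them contain a zero.
train_x = [[0.7, 0.2, 0.1], [0.6, 0.4, 0.0], [0.1, 0.3, 0.6], [0.0, 0.2, 0.8]]
train_y = ["A", "A", "B", "B"]
prediction = knn_predict(train_x, train_y, [0.65, 0.30, 0.05], k=3, alpha=0.5)
```

In practice, k and alpha would be tuning parameters, chosen for example by cross-validation.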

MPRA is a RePEc service hosted by
the Munich University Library in Germany.