Tsagris, Michail (2015): A novel, divergence based, regression for compositional data. Published in: Proceedings of the 28th Panhellenic Statistics Conference (17 April 2015): pp. 430-444.

Abstract
In compositional data, an observation is a vector with nonnegative components which sum to a constant, typically 1. Data of this type arise in many areas, such as geology, archaeology, biology, economics and political science, among others. The goal of this paper is to propose a new, divergence based, regression modelling technique for compositional data. To do so, a recently proved metric which is a special case of the Jensen-Shannon divergence is employed. A strong advantage of this new regression technique is that zeros are naturally handled. An example with real data and simulation studies are presented and are both compared with the log-ratio based regression suggested by Aitchison in 1986.
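To illustrate why zeros pose no difficulty for a divergence of this kind, the following is a minimal Python sketch of the standard Jensen-Shannon divergence between two compositions. It is not the paper's exact metric or fitting procedure, which the abstract does not spell out; the function name `js_divergence` and the example vectors are assumptions made for illustration. Terms with a zero component are skipped under the usual convention 0 log 0 = 0, whereas a log-ratio transformation would need the logarithm of a zero component, which is undefined.

```python
import math


def js_divergence(x, y):
    """Jensen-Shannon divergence (natural log) between two compositions.

    Hypothetical helper for illustration only. Both inputs are assumed
    to be nonnegative and to sum to 1.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        m = 0.5 * (xi + yi)  # component of the mixture (x + y) / 2
        # A zero component contributes nothing (0 * log 0 is taken as 0),
        # so no zero replacement is needed.
        if xi > 0:
            total += 0.5 * xi * math.log(xi / m)
        if yi > 0:
            total += 0.5 * yi * math.log(yi / m)
    return total


# An observed composition containing a zero, and a strictly positive fit.
observed = [0.5, 0.5, 0.0]
fitted = [0.4, 0.4, 0.2]
d = js_divergence(observed, fitted)  # finite despite the zero component
```

In a regression setting, a quantity of this kind would be summed over observations and minimised with respect to the regression parameters; that step is left out here because the abstract does not describe the model for the fitted values.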
Item Type:  MPRA Paper 

Original Title:  A novel, divergence based, regression for compositional data 
Language:  English 
Keywords:  compositional data, Jensen-Shannon divergence, regression, zero values, φ-divergence 
Subjects:  C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General 
Item ID:  72769 
Depositing User:  Mr Michail Tsagris 
Date Deposited:  29 Jul 2016 04:11 
Last Modified:  19 Oct 2019 04:06 
References:
Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society. Series B 44, 139-177.
Aitchison, J. (2003). The statistical analysis of compositional data, New Jersey: Reprinted by The Blackburn Press.
Endres, D. M. and Schindelin, J. E. (2003). A new metric for probability distributions. Information Theory, IEEE Transactions on 49, 1858-1860.
Gueorguieva, R., Rosenheck, R., and Zelterman, D. (2008). Dirichlet component regression and its applications to psychiatric data. Computational Statistics & Data Analysis 52, 5344-5355.
Jolliffe, I. T. (2005). Principal component analysis, New York: Springer-Verlag.
Kateri, M. and Agresti, A. (2010). A generalized regression model for a binary response. Statistics & Probability Letters 80, 89-95.
Kent, J. T. (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society. Series B 44, 71-80.
Kullback, S. (1997). Information theory and statistics, New York: Dover Publications.
Lancaster, H. (1965). The Helmert matrices. American Mathematical Monthly 72, 4-12.
Le, H. and Small, C. G. (1999). Multidimensional scaling of simplex shapes. Pattern Recognition 32, 1601-1613.
Maier, M. J. (2014). DirichletReg: Dirichlet Regression in R. R package version 0.52.
Martin-Fernandez, J., Hron, K., Templ, M., Filzmoser, P., and Palarea-Albaladejo, J. (2012). Model-based replacement of rounded zeros in compositional data: Classical and robust approaches. Computational Statistics & Data Analysis 56, 2688-2704.
Murteira, J. M. and Ramalho, J. J. (2014). Regression analysis of multivariate fractional data. Econometric Reviews (ahead-of-print) 1-38.
Neocleous, T., Aitken, C., and Zadora, G. (2011). Transformations for compositional data with zeros with an application to forensic evidence evaluation. Chemometrics and Intelligent Laboratory Systems 109, 77-85.
Österreicher, F. and Vajda, I. (2003). A new class of metric divergences on probability spaces and its applicability in statistics. Annals of the Institute of Statistical Mathematics 55, 639-653.
Scealy, J. L. and Welsh, A. H. (2011). Regression for compositional data by using distributions defined on the hypersphere. Journal of the Royal Statistical Society. Series B 73, 351-375.
Stephens, M. A. (1982). Use of the von Mises distribution to analyse continuous proportions. Biometrika 69, 197-203.
Templ, M., Hron, K., and Filzmoser, P. (2011). robCompositions: Robust estimation for compositional data. R package version 0.84.
Theil, H. (1967). Economics and information theory. Amsterdam: North-Holland publishing company.
URI:  https://mpra.ub.uni-muenchen.de/id/eprint/72769 