Tsagris, Michail (2015): A novel, divergence based, regression for compositional data. Published in: Proceedings of the 28th Panhellenic Statistics Conference (17 April 2015): pp. 430-444.
Abstract
In compositional data, an observation is a vector with non-negative components which sum to a constant, typically 1. Data of this type arise in many areas, such as geology, archaeology, biology, economics and political science, among others. The goal of this paper is to propose a new, divergence-based regression modelling technique for compositional data. To do so, it employs a recently proved metric that is a special case of the Jensen-Shannon divergence. A strong advantage of this new regression technique is that it handles zeros naturally. An example with real data and simulation studies are presented, and in both the new technique is compared with the log-ratio-based regression suggested by Aitchison in 1986.
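To illustrate why a Jensen-Shannon-type divergence can accommodate zero components, the following is a minimal Python sketch and not the paper's regression procedure. It assumes the standard definition of the Jensen-Shannon divergence with the convention 0 log 0 = 0; the function names `js_divergence` and `js_distance` and the example vectors are hypothetical, chosen only for illustration.

```python
import math


def js_divergence(x, y):
    """Jensen-Shannon divergence (natural log) between two compositions.

    Terms with a zero component are skipped, following the usual
    convention 0 * log(0) = 0, so zeros need no replacement.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        m = 0.5 * (xi + yi)  # component of the mid-point composition
        if xi > 0:
            total += 0.5 * xi * math.log(xi / m)
        if yi > 0:
            total += 0.5 * yi * math.log(yi / m)
    return total


def js_distance(x, y):
    """Square root of the divergence, which satisfies the metric axioms."""
    return math.sqrt(max(js_divergence(x, y), 0.0))


# Illustrative compositions (made up); the first has a zero component.
x = [0.5, 0.5, 0.0]
y = [0.2, 0.3, 0.5]
d = js_distance(x, y)  # finite, even though x contains a zero
```

By contrast, a log-ratio such as log(x[2] / x[0]) is undefined for `x` above, which is why log-ratio methods typically need some form of zero replacement first.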
Item Type: | MPRA Paper |
---|---|
Original Title: | A novel, divergence based, regression for compositional data |
Language: | English |
Keywords: | compositional data, Jensen-Shannon divergence, regression, zero values, φ-divergence |
Subjects: | C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General |
Item ID: | 72769 |
Depositing User: | Mr Michail Tsagris |
Date Deposited: | 29 Jul 2016 04:11 |
Last Modified: | 19 Oct 2019 04:06 |
References: | Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society. Series B 44, 139-177. Aitchison, J. (2003). The statistical analysis of compositional data, New Jersey: Reprinted by The Blackburn Press. Endres, D. M. and Schindelin, J. E. (2003). A new metric for probability distributions. Information Theory, IEEE Transactions on 49, 1858-1860. Gueorguieva, R., Rosenheck, R., and Zelterman, D. (2008). Dirichlet component regression and its applications to psychiatric data. Computational Statistics & Data Analysis 52, 5344-5355. Jolliffe, I. T. (2005). Principal component analysis, New York: Springer-Verlag. Kateri, M. and Agresti, A. (2010). A generalized regression model for a binary response. Statistics & Probability Letters 80, 89-95. Kent, J. T. (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society. Series B 44, 71-80. Kullback, S. (1997). Information theory and statistics, New York: Dover Publications. Lancaster, H. (1965). The Helmert matrices. American Mathematical Monthly 72, 4-12. Le, H. and Small, C. G. (1999). Multidimensional scaling of simplex shapes. Pattern Recognition 32, 1601-1613. Maier, M. J. (2014). DirichletReg: Dirichlet Regression in R. R package version 0.5-2. Martin-Fernandez, J., Hron, K., Templ, M., Filzmoser, P., and Palarea-Albaladejo, J. (2012). Model-based replacement of rounded zeros in compositional data: Classical and robust approaches. Computational Statistics & Data Analysis 56, 2688-2704. Murteira, J. M. and Ramalho, J. J. (2014). Regression analysis of multivariate fractional data. Econometric Reviews (ahead-of-print) 1-38. Neocleous, T., Aitken, C., and Zadora, G. (2011). Transformations for compositional data with zeros with an application to forensic evidence evaluation. Chemometrics and Intelligent Laboratory Systems 109, 77-85. Osterreicher, F. and Vajda, I. (2003). A new class of metric divergences on probability spaces and its applicability in statistics. Annals of the Institute of Statistical Mathematics 55, 639-653. Scealy, J. L. and Welsh, A. H. (2011). Regression for compositional data by using distributions defined on the hypersphere. Journal of the Royal Statistical Society. Series B 73, 351-375. Stephens, M. A. (1982). Use of the von Mises distribution to analyse continuous proportions. Biometrika 69, 197-203. Templ, M., Hron, K., and Filzmoser, P. (2011). robCompositions: Robust estimation for compositional data. R package version 0.8-4. Theil, H. (1967). Economics and information theory. Amsterdam: North-Holland publishing company. |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/72769 |