Tsagris, Michail T. and Preston, Simon and Wood, Andrew T.A. (2011): A data-based power transformation for compositional data. Published in: Proceedings of the 4th International Workshop on Compositional Data Analysis, Girona, Spain (May 2011)
Abstract
Compositional data analysis is carried out either by neglecting the compositional constraint and applying standard multivariate data analysis, or by transforming the data using the logs of the ratios of the components. In this work we examine a more general transformation which includes both approaches as special cases. It is a power transformation and involves a single parameter, α. The transformation has two equivalent versions. The first is the stay-in-the-simplex version. This expression is the power transformation as defined by Aitchison (1986). The second version, which is a linear transformation of the stay-in-the-simplex version, is a Box-Cox type transformation. We call the second version the isometric α-transformation because of the multiplication with the Helmert sub-matrix. We discuss a parametric way of estimating the value of α, which is maximization of its profile likelihood (assuming multivariate normality of the transformed data), and the equivalence between the two versions is exhibited. Other ways include maximization of the correct classification probability in discriminant analysis and maximization of the pseudo-R² in linear regression. We examine the relationship between the transformation, the raw data approach and the isometric log-ratio transformation. Furthermore, we also define a suitable family of metrics corresponding to the family of α-transformations and consider the corresponding family of Fréchet means.
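The abstract names the two versions of the transformation but does not state their formulas. The following is a minimal sketch, assuming the form usually given for this transformation: the stay-in-the-simplex version maps a composition x with D parts to u_i = x_i^α / Σ_j x_j^α, and the isometric version is H (D u − 1) / α, where H is the (D−1)×D Helmert sub-matrix. Under that assumption, α = 1 gives a linear transformation of the raw data and α → 0 gives the isometric log-ratio transformation. The function names `helmert_sub`, `power_transform` and `alpha_transform` are hypothetical and chosen only for illustration.

```python
import math


def helmert_sub(D):
    """Return the (D-1) x D Helmert sub-matrix as a list of rows.

    Rows are orthonormal and each row sums to zero
    (i.e. the first row of the full Helmert matrix is dropped).
    """
    rows = []
    for i in range(1, D):
        scale = 1.0 / math.sqrt(i * (i + 1))
        rows.append([scale] * i + [-i * scale] + [0.0] * (D - i - 1))
    return rows


def power_transform(x, alpha):
    """Stay-in-the-simplex version (assumed form): x_i^alpha / sum_j x_j^alpha."""
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    return [p / total for p in powered]


def alpha_transform(x, alpha):
    """Isometric version (assumed form): H (D u - 1) / alpha.

    For alpha == 0 the limiting case is used: H applied to the
    centred log-ratios, which is the isometric log-ratio transformation.
    """
    D = len(x)
    if alpha == 0:
        logs = [math.log(xi) for xi in x]
        mean_log = sum(logs) / D
        w = [l - mean_log for l in logs]
    else:
        u = power_transform(x, alpha)
        w = [(D * ui - 1.0) / alpha for ui in u]
    return [sum(h * wi for h, wi in zip(row, w)) for row in helmert_sub(D)]


if __name__ == "__main__":
    x = [0.2, 0.3, 0.5]  # illustrative composition, not data from the paper
    for a in (1.0, 0.5, 0.0):
        print(a, alpha_transform(x, a))
```

Choosing α from the data, as the abstract describes, would then amount to evaluating a criterion (for example the profile log-likelihood under multivariate normality) over a grid of α values and keeping the maximizer; that step is not shown here.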
Item Type: | MPRA Paper |
---|---|
Original Title: | A data-based power transformation for compositional data |
English Title: | A data-based power transformation for compositional data |
Language: | English |
Keywords: | Compositional data, power transformation, alpha, Fréchet mean |
Subjects: | C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology ; Computer Programs > C89 - Other |
Item ID: | 53068 |
Depositing User: | Mr Michail Tsagris |
Date Deposited: | 20 Jan 2014 17:21 |
Last Modified: | 27 Sep 2019 21:04 |
References: | Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society, Series B 44(2), 139-177. Aitchison, J. (1983). Principal component analysis of compositional data. Biometrika 70(1), 57-65. Aitchison, J. (1986). The Statistical Analysis of Compositional Data. Monographs on Statistics and Applied Probability. Chapman & Hall Ltd., London (UK). (Reprinted in 2003 with additional material by The Blackburn Press). 416 p. Aitchison, J. (1989). Measures of location of compositional data sets. Mathematical Geology 21(7), 787-790. Aitchison, J. (1992). On criteria for measure of compositional difference. Mathematical Geology 24(4), 365-379. Aitchison, J. (1999). Logratios and Natural Laws in Compositional Data Analysis. Mathematical Geology 31(5), 563-580. Aitchison, J., Barceló-Vidal, C., Martín-Fernández, J.A. and Pawlowsky-Glahn, V. (2000). Logratio Analysis and Compositional Distance. Mathematical Geology 32(3), 271-275. Barceló-Vidal, C., Pawlowsky, V. and Grunsky, E. (1996). Some aspects of transformations of compositional data and the identification of outliers. Mathematical Geology 28(4), 501-518. Baxter, M.J. (1995). Standardization and transformation in principal component analysis, with applications to archaeometry. Applied Statistics 44(4), 513-527. Baxter, M.J. (2001). Statistical modelling of artefact compositional data. Archaeometry 43(1), 131-147. Baxter, M.J., Beardah, C.C., Cool, H.E.M. and Jackson, C.M. (2005). Compositional data analysis of some alkaline glasses. Mathematical Geology 37(2), 183-196. Baxter, M.J. and Freestone, I.C. (2006). Log-ratio compositional data analysis in archaeometry. Archaeometry 48(3), 511-531. Beardah, C.C., Baxter, M.J., Cool, H.E.M. and Jackson, C.M. (2003). Compositional data analysis of archaeological glass: problems and possible solutions. Proceedings of the Compositional Data Analysis Workshop, CODAWORK 2003. Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G. and Barceló-Vidal, C. (2003). Isometric logratio transformations for compositional data analysis. Mathematical Geology 35(3), 279-300. Lancaster, H.O. (1965). The Helmert matrices. American Mathematical Monthly 72(1), 4-12. Sharp, W.E. (2006). The graph median - A stable alternative measure of central tendency for compositional data sets. Mathematical Geology 38(2), 221-229. Srivastava, D.K., Boyett, J.M., Jackson, C.W., Tong, X. and Rai, S.N. (2007). A comparison of permutation Hotelling's T² test and log-ratio test for analyzing compositional data. Communications in Statistics - Theory and Methods 36(2), 415-431. Tsagris, M., Preston, S. and Wood, A.T.A. (2011). Semi-parametric methods for compositional data analysis involving a family of power transformations. In preparation. |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/53068 |