Tsagris, Michail T. and Preston, Simon and Wood, Andrew T.A. (2011): A data-based power transformation for compositional data. Published in: Proceedings of the 4th International Workshop on Compositional Data Analysis, Girona, Spain (May 2011)

Abstract
Compositional data analysis is carried out either by neglecting the compositional constraint and applying standard multivariate data analysis, or by transforming the data using the logs of the ratios of the components. In this work we examine a more general transformation which includes both approaches as special cases. It is a power transformation and involves a single parameter, α. The transformation has two equivalent versions. The first is the stay-in-the-simplex version, which is the power transformation as defined by Aitchison (1986). The second version, which is a linear transformation of the stay-in-the-simplex version, is a Box-Cox type transformation. We call the second version the isometric α-transformation because of the multiplication with the Helmert sub-matrix. We discuss a parametric way of estimating the value of α, namely maximization of its profile likelihood (assuming multivariate normality of the transformed data), and the equivalence between the two versions is exhibited. Other ways include maximization of the correct classification probability in discriminant analysis and maximization of the pseudo-R² in linear regression. We examine the relationship between the α-transformation, the raw data approach and the isometric log-ratio transformation. Furthermore, we also define a suitable family of metrics corresponding to the family of α-transformations and consider the corresponding family of Fréchet means.
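The abstract does not state the formulas, so the following is only a minimal sketch of how the two versions might be computed. It assumes the form commonly given for the α-transformation, z = H (D u − 1) / α, where u is the power-transformed composition, D is the number of components and H is the (D−1)×D Helmert sub-matrix. The function names `helmert_submatrix`, `power_transform` and `alpha_transform` are illustrative and are not taken from the paper.

```python
import numpy as np

def helmert_submatrix(D):
    """(D-1) x D Helmert sub-matrix: orthonormal rows, each orthogonal to the vector of ones."""
    H = np.zeros((D - 1, D))
    for i in range(1, D):
        scale = np.sqrt(i * (i + 1))
        H[i - 1, :i] = 1.0 / scale   # first i entries are equal
        H[i - 1, i] = -i / scale     # next entry makes the row sum to zero
    return H

def power_transform(x, alpha):
    """Stay-in-the-simplex version: raise each component to alpha, then rescale to sum to 1."""
    x = np.asarray(x, dtype=float)
    p = x ** alpha
    return p / p.sum()

def alpha_transform(x, alpha):
    """Assumed isometric alpha-transformation: H (D u - 1) / alpha, with the log-ratio limit at alpha = 0."""
    x = np.asarray(x, dtype=float)
    D = x.size
    H = helmert_submatrix(D)
    if alpha == 0:
        # Limiting case: centred log-ratio multiplied by H (an isometric log-ratio transformation).
        logx = np.log(x)
        return H @ (logx - logx.mean())
    return H @ ((D * power_transform(x, alpha) - 1.0) / alpha)

x = np.array([0.2, 0.3, 0.5])
z_raw = alpha_transform(x, 1.0)   # linear transformation of the raw composition
z_ilr = alpha_transform(x, 0.0)   # isometric log-ratio limit
```

Under these assumptions, α = 1 gives a linear map of the raw composition and α → 0 recovers the isometric log-ratio transformation, which matches the two special cases the abstract mentions. Strictly positive components are required for the α = 0 case.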
Item Type:  MPRA Paper 

Original Title:  A data-based power transformation for compositional data 
English Title:  A data-based power transformation for compositional data 
Language:  English 
Keywords:  Compositional data, power transformation, alpha, Fréchet mean 
Subjects:  C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology; Computer Programs > C89 - Other 
Item ID:  53068 
Depositing User:  Mr Michail Tsagris 
Date Deposited:  20 Jan 2014 17:21 
Last Modified:  17 Jun 2016 00:32 
References:
Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society, Series B 44(2), 139-177.
Aitchison, J. (1983). Principal component analysis of compositional data. Biometrika 70(1), 57-65.
Aitchison, J. (1986). The Statistical Analysis of Compositional Data. Monographs on Statistics and Applied Probability. Chapman & Hall Ltd., London (UK). (Reprinted in 2003 with additional material by The Blackburn Press). 416 p.
Aitchison, J. (1989). Measures of location of compositional data sets. Mathematical Geology 21(7), 787-790.
Aitchison, J. (1992). On criteria for measure of compositional difference. Mathematical Geology 24(4), 365-379.
Aitchison, J. (1999). Logratios and Natural Laws in Compositional Data Analysis. Mathematical Geology 31(5), 563-580.
Aitchison, J., Barceló-Vidal, C., Martín-Fernández, J.A. and Pawlowsky-Glahn, V. (2000). Logratio Analysis and Compositional Distance. Mathematical Geology 32(3), 271-275.
Barceló-Vidal, C., Pawlowsky, V. and Grunsky, E. (1996). Some aspects of transformations of compositional data and the identification of outliers. Mathematical Geology 28(4), 501-518.
Baxter, M.J. (1995). Standardization and transformation in principal component analysis, with applications to archaeometry. Applied Statistics 44(4), 513-527.
Baxter, M.J. (2001). Statistical modelling of artefact compositional data. Archaeometry 43(1), 131-147.
Baxter, M.J., Beardah, C.C., Cool, H.E.M. and Jackson, C.M. (2005). Compositional data analysis of some alkaline glasses. Mathematical Geology 37(2), 183-196.
Baxter, M.J. and Freestone, I.C. (2006). Log-ratio compositional data analysis in archaeometry. Archaeometry 48(3), 511-531.
Beardah, C.C., Baxter, M.J., Cool, H.E.M. and Jackson, C.M. (2003). Compositional data analysis of archaeological glass: problems and possible solutions. Proceedings of the Compositional Data Analysis Workshop, CODAWORK 2003.
Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G. and Barceló-Vidal, C. (2003). Isometric logratio transformations for compositional data analysis. Mathematical Geology 35(3), 279-300.
Lancaster, H.O. (1965). The Helmert matrices. American Mathematical Monthly 72(1), 4-12.
Sharp, W.E. (2006). The graph median - A stable alternative measure of central tendency for compositional data sets. Mathematical Geology 38(2), 221-229.
Srivastava, D.K., Boyett, J.M., Jackson, C.W., Tong, X. and Rai, S.N. (2007). A comparison of permutation Hotelling's T2 test and log-ratio test for analyzing compositional data. Communications in Statistics - Theory and Methods 36(2), 415-431.
Tsagris, M., Preston, S. and Wood, A.T.A. (2011). Semi-parametric methods for compositional data analysis involving a family of power transformations. In preparation.
URI:  https://mpra.ub.uni-muenchen.de/id/eprint/53068 