Buzzigoli, Lucia and Giusti, Antonio (2006): From Marginals to Array Structure with the Shuttle Algorithm. Published in: Journal of Symbolic Data Analysis, Vol. 4, No. 1 (June 2006): pp. 1-14.
Abstract
Many statistical problems require analyzing the structure of an unknown n-dimensional array given its marginal distributions. The usual approach is linear programming, which becomes computationally expensive when the original array is large. Alternative solutions have been proposed in the literature, mainly in search of less time-consuming algorithms. One of these is the shuttle algorithm, introduced by Buzzigoli and Giusti [1] to calculate lower and upper bounds for the elements of an n-way array starting from the complete set of its (n-1)-way marginals. The algorithm is very easy to implement in a matrix language and has interesting properties and potential applications. The paper presents the algorithm, analyses its properties and describes its disadvantages. It also suggests possible applications in several statistical fields, in particular Symbolic Data Analysis, and reports the results of simulations on randomly generated arrays.
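
The abstract does not spell out the procedure, so the following is only a minimal sketch of iterative bound tightening of the kind it describes, for a 3-way array of non-negative counts and its complete set of 2-way marginals. The update rule (a cell equals its line total minus the other cells on that line) and the names `marginal_lines` and `propagate_bounds` are assumptions made for illustration, not the authors' implementation.

```python
from itertools import product


def marginal_lines(table):
    """List every (cells, total) line obtained by summing a 3-way array along one axis."""
    shape = (len(table), len(table[0]), len(table[0][0]))
    lines = []
    for axis in range(3):
        other_axes = [a for a in range(3) if a != axis]
        for fixed in product(*(range(shape[a]) for a in other_axes)):
            cells = []
            for pos in range(shape[axis]):
                index = [0, 0, 0]
                index[axis] = pos
                for a, value in zip(other_axes, fixed):
                    index[a] = value
                cells.append(tuple(index))
            total = sum(table[i][j][k] for i, j, k in cells)
            lines.append((cells, total))
    return lines


def propagate_bounds(lines, max_sweeps=100):
    """Tighten lower/upper bounds on non-negative cells until nothing changes."""
    cells = {c for line, _ in lines for c in line}
    lower = dict.fromkeys(cells, 0)
    upper = dict.fromkeys(cells, float("inf"))
    for _ in range(max_sweeps):
        changed = False
        for line, total in lines:
            for c in line:
                others = [o for o in line if o != c]
                # The cell equals the total minus the other cells on its line,
                # so their lower bounds cap it and their upper bounds floor it.
                new_upper = total - sum(lower[o] for o in others)
                new_lower = total - sum(upper[o] for o in others)
                if new_upper < upper[c]:
                    upper[c], changed = new_upper, True
                if new_lower > lower[c]:
                    lower[c], changed = new_lower, True
        if not changed:
            break
    return lower, upper


# Hypothetical 2x2x2 table, used only to generate marginals for the demo.
table = [[[1, 0], [2, 3]], [[0, 4], [1, 2]]]
lines = marginal_lines(table)
lower, upper = propagate_bounds(lines)
for cell in sorted(lower):
    print(cell, lower[cell], upper[cell])
```

Each sweep can only raise a lower bound or reduce an upper bound, so the true cell values always stay inside the reported intervals. The sketch makes no claim that the resulting bounds are the sharpest ones obtainable.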
Item Type: | MPRA Paper |
---|---|
Original Title: | From Marginals to Array Structure with the Shuttle Algorithm |
Language: | English |
Keywords: | Shuttle algorithm, Linear programming, Statistical disclosure control, Linked tables, Zero restrictions |
Subjects: | C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General<br>C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C15 - Statistical Simulation Methods: General<br>C - Mathematical and Quantitative Methods > C4 - Econometric and Statistical Methods: Special Topics > C44 - Operations Research; Statistical Decision Theory<br>C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology; Computer Programs > C88 - Other Computer Software |
Item ID: | 49245 |
Depositing User: | Prof. Antonio Giusti |
Date Deposited: | 23 Aug 2013 13:57 |
Last Modified: | 26 Sep 2019 08:20 |
References: | 1. Buzzigoli, L. and Giusti, A.: An Algorithm to Calculate the Lower and Upper Bounds of the Elements of an Array Given its Marginals. Working Paper 70. Dipartimento di Statistica "Giuseppe Parenti", Firenze (1996) 2. Billard, L. and Diday, E.: From the Statistics of Data to the Statistics of Knowledge: Symbolic Data Analysis. Journal of the American Statistical Association, 98, 462, 470-487 (2003) 3. Brickman, L.: Mathematical Introduction to Linear Programming and Game Theory. Springer-Verlag, New York (1989) 4. de Carvalho, F.D., Dellaert, N.P. and de Sanches Osorio, M.: Statistical Disclosure in Two-Dimensional Tables: General Tables. Journal of the American Statistical Association, 89, 1547-1557 (1994) 5. Cox, L.H.: Suppression Methodology and Statistical Disclosure Control. Journal of the American Statistical Association, 75, 377-385 (1980) 6. Cox, L.H.: Network Models for Complementary Cell Suppression. Journal of the American Statistical Association, 90, 1453-1462 (1995) 7. Chowdhury, S.D., Duncan, G.T., Krishnan, R., Roehrig, S.F. and Mukherjee, S.: Disclosure Detection in Multivariate Categorical Databases: Auditing Confidentiality Protection Through Two New Matrix Operators. Management Science, 45, 1710-1723 (1999) 8. Roehrig, S.F., Padman, R., Duncan, G.T. and Krishnan, R.: Disclosure Detection in Multiple Linked Categorical Datafiles: A Unified Network Approach. In: Statistical Data Protection - Proceedings of the Conference, 131-147. Eurostat, Luxembourg (1998) 9. Duncan, G.T., Fienberg, S.E., Krishnan, R., Padman, R. and Roehrig, S.F.: Disclosure Limitation Methods and Information Loss for Tabular Data. In: Doyle et al. (eds.): Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies. Elsevier Science, Amsterdam (2001) 10. Buzzigoli, L. and Giusti, A.: Statistical Disclosure Control Problems for Linked Tables. Italian Journal of Applied Statistics, 10, 443-458 (1998) 11. Buzzigoli, L. and Giusti, A.: An Algorithm to Calculate the Lower and Upper Bounds of the Elements of an Array Given its Marginals. In: Statistical Data Protection - Proceedings of the Conference, 131-147. Eurostat, Luxembourg (1999) 12. Roehrig, S.F.: Auditing Disclosure in Multi-Way Tables With Cell Suppression: Simplex and Shuttle Solutions. Paper presented at: Joint Statistical Meeting 1999, August 5-12, Baltimore (1999) 13. Dobra, A. and Fienberg, S.E.: Bounds for Cell Entries in Contingency Tables Given Marginal Totals and Decomposable Graphs. Proceedings of the National Academy of Sciences, 97, 11185-11192 (2000) 14. Dobra, A.: Computing Sharp Integer Bounds for Entries in Contingency Tables Given a Set of Fixed Marginals. Technical Report, Department of Statistics. Carnegie Mellon University, Pittsburgh (2001) 15. Chen, Y., Dinwoodie, I.H. and Sullivant, S.: Sequential Importance Sampling for Multiway Tables. The Annals of Statistics, 34, 1, in press (2006) 16. Cox, L.H.: Bounds on Entries in 3-Dimensional Contingency Tables Subject to Given Marginal Totals. In: Domingo-Ferrer, J. (ed.): Inference Control in Statistical Databases: From Theory to Practice. Springer-Verlag, Heidelberg (2002) 17. Fienberg, S.E.: Fréchet and Bonferroni Bounds for Multi-way Tables of Counts With Applications to Disclosure Limitation. In: Statistical Data Protection - Proceedings of the Conference, 115-129. Eurostat, Luxembourg (1999) 18. Federal Committee on Statistical Methodology: Report on Statistical Disclosure Limitation Methodology. Statistical Policy Working Paper 22, Statistical Policy Office, Office of Information and Regulatory Affairs. Office of Management and Budget, Washington D.C. (1994) 19. Willenborg, L.C.R.J. and de Waal, A.G.: Statistical Disclosure Control in Practice. Lecture Notes in Statistics. Springer-Verlag, New York (1996) 20. Domingo-Ferrer, J. (ed.): Proceedings of the SDP'98. IOS Press, Luxembourg (1998) 21. Doyle, P., Lane, J.I., Theeuwes, J.J.M. and Zayatz, L.V. (eds.): Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies. Elsevier Science, Amsterdam (2001) 22. de Vries, R.E.: Disclosure Control of Tabular Data Using Subtables. Netherlands Central Bureau of Statistics, Voorburg (1993) 23. Armitage, P., Merret, K., Lyons, A. and Tame, E.: Neighbourhood Statistics in England and Wales: Disclosure Control Problems and Solutions. In: Work Session on Statistical Data Confidentiality. Part 2. Monographs of Official Statistics. Eurostat, Luxembourg (2004) 24. Dobra, A., Karr, A.F. and Sanil, A.P.: Preserving Confidentiality of High-dimensional Tabulated Data: Statistical and Computational Issues. Technical Report no. 130, National Institute of Statistical Sciences (2002) 25. Karr, A.F., Dobra, A. and Sanil, A.P.: Table Servers Protect Confidentiality in Tabular Data Releases. Communications of the ACM, 46, 1 (2003) 26. Fréchet, M.: Sur les tableaux de corrélation dont les marges sont données. Ann. Univ. Lyon Sect. A, Ser. 3, 14, 53-77 (1951) 27. Bonferroni, C.E.: Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R. Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 8, 1-62 (1936) 28. Rizzi, A.: Osservazioni sulle classi di Fréchet delle funzioni di ripartizione a più variabili. Bollettino Unione Matematica Italiana, 12, 269-277 (1957) 29. Dall'Aglio, G.: Sulle distribuzioni doppie con margini assegnati soggette a delle limitazioni. Giornale dell'Istituto Italiano degli Attuari, XXIII-XXIV, 94-105 (1960) 30. Dall'Aglio, G.: Les fonctions extrêmes de la classe de Fréchet à 3 dimensions. Publ. de l'Institut de Statistique de l'Université de Paris, 9, 175-188 (1961) 31. Dall'Aglio, G., Kotz, S. and Salinetti, G.: Advances in Probability Distributions with Given Marginals. Kluwer Academic Publishers, Dordrecht (1990) 32. Rüschendorf, L.: Bounds for Distributions With Multivariate Marginals. In: Stochastic Orders and Decision under Risk. IMS Lecture Notes - Monograph Series, 19, 285-310 (1991) 33. Genest, C., Quesada Molina, J.J. and Rodriguez Lallena, J.A.: De l'impossibilité de construire des lois à marges multidimensionnelles données à partir de copules. C.R. Acad. Sci. Paris, t. 320, Série I, 723-726 (1995) 34. Rüschendorf, L.: Developments on Fréchet Bounds. In: Rüschendorf, L., Schweizer, B. and Taylor, M.D. (eds.): Distributions with Fixed Marginals and Related Topics. IMS Lecture Notes - Monograph Series, 28, 273-296 (1996) 35. Galambos, J. and Simonelli, I.: Bonferroni-type Inequalities with Applications. Springer-Verlag, New York (1996) 36. Fienberg, S.E. and Makov, U.E.: Confidentiality, Uniqueness, and Disclosure Limitation for Categorical Data. Journal of Official Statistics, 14, 385-397 (1998) 37. Dobra, A.: Statistical Tools for Disclosure Limitation in Multi-Way Contingency Tables. PhD Thesis. Carnegie Mellon University, Pittsburgh (2002) 38. Diday, E.: An Introduction to Symbolic Data Analysis and the Sodas Software. The Electronic Journal of Symbolic Data Analysis, 0, 0 (2002) 39. Buzzigoli, L. and Giusti, A.: Disclosure Control on Multi-Way Tables by Means of the Shuttle Algorithm: Extensions and Experiments. In: Compstat 2000 - Proceedings in Computational Statistics, 229-234. Physica-Verlag, Heidelberg (2000) 40. Buzzigoli, L. and Giusti, A.: Shuttle Algorithm and Simplex Method: Some Experiments. Poster presented at COMPSTAT 2004, August 23-27, Prague (2004) 41. Cox, L.H.: Some Remarks on Research Directions in Statistical Data Protection. In: Statistical Data Protection - Proceedings of the Conference, 163-176. Eurostat, Luxembourg (1999) |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/49245 |