Tseng, Yen-hsuan (2025): Recovering Unobserved Network Links from Aggregated Relational Data: Discussions on Bayesian Latent Surface Modeling and Penalized Regression.
Abstract
Accurate network data are essential in fields such as economics, finance, sociology, epidemiology, and computer science. However, real-world constraints often prevent researchers from collecting a complete adjacency matrix, compelling them to rely on partial or aggregated information. One widespread example is Aggregated Relational Data (ARD), where respondents or institutions merely report the number of links they have to nodes possessing certain traits, rather than enumerating all neighbors explicitly.

This dissertation provides an in-depth examination of two major frameworks for reconstructing networks from ARD: the Bayesian latent surface model and frequentist penalized regression approaches. We supplement the original discussion with additional theoretical considerations on identifiability, consistency, and potential misreporting mechanisms. We also incorporate robust estimation techniques and references to privacy-preserving strategies such as differential privacy. By embedding nodes in a hyperspherical space, the Bayesian method captures geometric distance-based link formation, while the penalized regression approach casts unknown edges in a high-dimensional optimization problem, enabling scalability and the incorporation of covariates. Simulations explore the effects of trait design, measurement error, and sample size. Real-world applications illustrate the potential for partially observed networks in domains like financial risk, social recommendation systems, and epidemic contact tracing, complementing the original text with deeper investigations of large-scale inference challenges.

Our aim is to show that even though ARD may be coarser than full adjacency data, it retains substantial information about network structures, allowing reasonably accurate inference at scale. We conclude by discussing how adaptive trait selection, hybrid geometry-penalty methods, and privacy-aware data sharing can further advance this field. This enhanced treatment underscores the practical relevance and theoretical rigor of ARD-based network inference.
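To make the definition of ARD in the abstract concrete, the following is a minimal sketch of how ARD counts arise from a fully observed network. It is an illustration only, not the paper's code: the function name `aggregate_relational_data`, the toy adjacency matrix, and the binary trait matrix are all assumptions introduced here. Each respondent's ARD entry is simply the number of their neighbors that carry a given trait, which is the matrix product of the adjacency matrix and the trait indicator matrix.

```python
import numpy as np


def aggregate_relational_data(adjacency, traits):
    """Return ARD counts (hypothetical helper, not from the paper).

    adjacency: (n, n) 0/1 matrix, adjacency[i, j] = 1 if i is linked to j.
    traits:    (n, K) 0/1 matrix, traits[j, k] = 1 if node j has trait k.
    Result:    (n, K) matrix whose (i, k) entry counts i's neighbors with trait k.
    """
    adjacency = np.asarray(adjacency)
    traits = np.asarray(traits)
    return adjacency @ traits


# Toy undirected network with 4 nodes (assumed data, for illustration only).
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
])

# Two binary traits per node (assumed data).
traits = np.array([
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 0],
])

ard = aggregate_relational_data(adjacency, traits)
print(ard)
```

In practice the researcher observes only a matrix like `ard` (plus the traits), not `adjacency`; the two frameworks discussed in the paper address the inverse problem of recovering information about the links from these counts.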
Item Type: | MPRA Paper |
---|---|
Original Title: | Recovering Unobserved Network Links from Aggregated Relational Data: Discussions on Bayesian Latent Surface Modeling and Penalized Regression |
Language: | English |
Keywords: | Aggregated Relational Data (ARD); Network Inference; Bayesian Latent Surface Model (BLSM); Penalized Regression; Hyperspherical Embedding; Differential Privacy; Federated Learning; Privacy-Preserving Networks; Robust Estimation; Misreporting in Networks; High-Dimensional Optimization; Sparse Networks; Social Recommendation Systems; Financial Interbank Networks; Epidemic Contact Tracing |
Subjects: | C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C38 - Classification Methods ; Cluster Analysis ; Principal Components ; Factor Models C - Mathematical and Quantitative Methods > C5 - Econometric Modeling > C55 - Large Data Sets: Modeling and Analysis C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology ; Computer Programs > C81 - Methodology for Collecting, Estimating, and Organizing Microeconomic Data ; Data Access D - Microeconomics > D8 - Information, Knowledge, and Uncertainty > D85 - Network Formation and Analysis: Theory |
Item ID: | 123164 |
Depositing User: | Mr Yen-hsuan Tseng |
Date Deposited: | 04 Jan 2025 14:21 |
Last Modified: | 04 Jan 2025 14:21 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/123164 |