Riveros-Gavilanes, John Michael (2025): Municipality synthetic Gini index for Colombia: A machine learning approach.
Abstract
This paper presents two synthetic estimates of the Gini coefficient at the municipality level for Colombia for the years 2000-2020. The methodology relies on several machine learning models, from which the best one is selected for imputing the data. This yields two Random Forest models: the first is characterized by Dominant Fixed Effects, while the second contains a set of Dominant Varying Factors. Based on these estimations, the Synthetic Gini Coefficients from both models are inspected, and public links are generated to access them. The Dominant Fixed Effects model is rather "stiff" in contrast to the Varying Factors model. Hence, researchers are advised to use the Synthetic Gini Coefficient with Varying Factors, because it exhibits greater variability over time than the Dominant Fixed Effects model.
Item Type: | MPRA Paper |
---|---|
Original Title: | Municipality synthetic Gini index for Colombia: A machine learning approach |
English Title: | Municipality synthetic Gini index for Colombia: A machine learning approach |
Language: | English |
Keywords: | Gini; Machine learning; Random forest; estimation; synthetic; economics |
Subjects: | C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology; Computer Programs > C80 - General<br>H - Public Economics > H7 - State and Local Government; Intergovernmental Relations<br>O - Economic Development, Innovation, Technological Change, and Growth > O1 - Economic Development > O10 - General<br>P - Economic Systems > P1 - Capitalist Systems > P19 - Other |
Item ID: | 123561 |
Depositing User: | John Michael Riveros Gavilanes |
Date Deposited: | 07 Feb 2025 11:35 |
Last Modified: | 07 Feb 2025 11:36 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/123561 |