Aggarwal, Sakshi (2023): Machine Learning algorithms, perspectives, and real-world application: Empirical evidence from United States trade data.
Abstract
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without being explicitly programmed. It is one of today’s most rapidly growing technical fields, lying at the crossroads of computer science and statistics, and at the core of artificial intelligence (AI) and data science. Several types of machine learning algorithms exist in this area, including supervised, unsupervised, semi-supervised, and reinforcement learning. Recent progress in ML has been driven both by the development of new learning algorithms and theory, and by the ongoing explosion in the availability of vast amounts of data (commonly known as “big data”) and low-cost computation. Data-intensive ML-based methods have been adopted throughout science, technology, and commerce, leading to more evidence-based decision-making across many walks of life, including finance, manufacturing, international trade, economics, education, healthcare, marketing, policymaking, and data governance. The present paper provides a comprehensive view of these machine learning algorithms, which can be applied to enhance the intelligence and capabilities of an application. Moreover, the paper uses a clustering algorithm (unsupervised learning) to identify clusters of similar industries in the United States that collectively account for more than 85 percent of the economy’s aggregate export and import flows over the period 2002-2021. Four cluster labels are used: low investment (LL), category 1 medium investment (HL), category 2 medium investment (LH), and high investment (HH). The empirical results indicate that machinery and electrical equipment is classified as a high investment sector due to its efficient production mechanism. The analysis further underlines the need for upstream value chain integration through skill augmentation and innovation, especially in low investment industries.
Overall, this paper aims to explain the trends of ML approaches and their applicability in various real-world domains, as well as serve as a reference point for academia, industry professionals and policymakers particularly from a technical, ethical, and regulatory point of view.
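The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the sector names, the two features (assumed here to be an export measure and an import measure), the numeric values, the deterministic farthest-point initialisation, and the rule that maps each centroid to an LL/HL/LH/HH label (high or low relative to the midpoint of the centroid range on each axis) are all assumptions made for illustration.

```python
from math import dist

# Hypothetical sectors with made-up (export, import) values; NOT data from the paper.
sectors = {
    "sector_a": (1.0, 1.2), "sector_b": (1.4, 0.8),
    "sector_c": (9.0, 1.0), "sector_d": (8.6, 1.5),
    "sector_e": (1.1, 9.2), "sector_f": (0.9, 8.7),
    "sector_g": (9.3, 9.1), "sector_h": (8.8, 8.9),
}

def farthest_point_init(points, k):
    """Deterministic seeding: repeatedly pick the point farthest from the chosen seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda j: dist(point, centroids[j]))

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign points to nearest centroid, then recompute means."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iter):
        assignment = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]

def quadrant_labels(centroids):
    """Assumed labelling rule: H/L per axis relative to the midpoint of the centroid range."""
    xs = [c[0] for c in centroids]
    ys = [c[1] for c in centroids]
    x_mid = (min(xs) + max(xs)) / 2
    y_mid = (min(ys) + max(ys)) / 2
    return [("H" if x >= x_mid else "L") + ("H" if y >= y_mid else "L") for x, y in centroids]

names = list(sectors)
points = [sectors[name] for name in names]
centroids, assignment = kmeans(points, k=4)
labels = quadrant_labels(centroids)
sector_labels = {name: labels[a] for name, a in zip(names, assignment)}

for name in names:
    print(name, sector_labels[name])
```

The farthest-point seeding is used only so that the toy example is reproducible without a random seed; real trade data would normally call for feature scaling and a check on the choice of four clusters.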
Item Type: | MPRA Paper |
---|---|
Original Title: | Machine Learning algorithms, perspectives, and real-world application: Empirical evidence from United States trade data |
English Title: | Machine Learning algorithms, perspectives, and real-world application: Empirical evidence from United States trade data |
Language: | English |
Keywords: | Machine learning, Artificial intelligence, Clustering, K-means, International trade |
Subjects: | F - International Economics > F1 - Trade > F14 - Empirical Studies of Trade |
Item ID: | 116579 |
Depositing User: | Miss Sakshi Aggarwal |
Date Deposited: | 04 Mar 2023 09:21 |
Last Modified: | 04 Mar 2023 09:21 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/116579 |