Jafari Kang, Masood and Zohoori, Sepideh and Abbasi, Elahe and Li, Yueqing and Hamidi, Maryam (2019): Predicting the price of second-hand vehicles using data mining techniques. Published in: IISE Annual Conference Proceedings (2 November 2020)
Abstract
Electronic commerce ("e-commerce") has grown rapidly in recent years, making it possible to record information such as price, location, customer reviews, search history, discount options, and competitors' prices. With access to such a rich source of data, companies can analyze their users' behavior to improve both customer satisfaction and revenue. This study aims to estimate the price of used light vehicles listed on Divar, a popular Iranian website for trading second-hand goods. First, salient features were extracted from the description column using three methods: Bag of Words (BOW), Latent Dirichlet Allocation (LDA), and Hierarchical Dirichlet Process (HDP). Second, a multiple linear regression model was fit to predict the product price from its attributes and the extracted features. The actuals-predictions correlation, the min-max index, and the mean absolute percentage error (MAPE) were used to validate the proposed methods. The results showed that the BOW model performed best, with an adjusted R-squared of 0.7841.
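To make the feature-extraction and validation steps concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact definitions of the indices, so the code assumes the common ones: Pearson correlation between actual and predicted prices, the mean of min(actual, predicted) / max(actual, predicted) for the min-max index, and the mean absolute percentage error for MAPE. The function names, toy descriptions, vocabulary, and prices are all hypothetical.

```python
from collections import Counter
from math import sqrt


def bag_of_words(descriptions, vocabulary):
    """Count how often each vocabulary term occurs in each description."""
    rows = []
    for text in descriptions:
        counts = Counter(text.lower().split())
        rows.append([counts[term] for term in vocabulary])
    return rows


def correlation(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


def min_max_accuracy(actual, predicted):
    """Mean of min/max ratios; 1.0 means every prediction is exact."""
    ratios = [min(a, p) / max(a, p) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical listing descriptions and a hand-picked vocabulary.
descriptions = [
    "clean car new tires",
    "accident damaged needs repair",
    "new paint clean interior",
]
vocabulary = ["clean", "new", "damaged"]
features = bag_of_words(descriptions, vocabulary)

# Hypothetical actual and predicted prices (arbitrary units).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print("BOW features:", features)
print("correlation:", correlation(actual, predicted))
print("min-max accuracy:", min_max_accuracy(actual, predicted))
print("MAPE:", mape(actual, predicted))
```

In a full pipeline, the rows of `features` would be appended to each listing's structured attributes before fitting the regression. The `adjusted_r2` helper shows how a reported adjusted R-squared relates to the plain R-squared, given the number of observations and predictors.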
Item Type: | MPRA Paper |
---|---|
Original Title: | Predicting the price of second-hand vehicles using data mining techniques |
English Title: | Predicting the price of second-hand vehicles using data mining techniques |
Language: | English |
Keywords: | Text mining, Topic modeling, BOW, LDA, HDP, Linear regression |
Subjects: | C - Mathematical and Quantitative Methods > C5 - Econometric Modeling C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology ; Computer Programs Y - Miscellaneous Categories > Y1 - Data: Tables and Charts > Y10 - Data: Tables and Charts |
Item ID: | 103933 |
Depositing User: | Masood Jafari Kang |
Date Deposited: | 10 Nov 2020 17:21 |
Last Modified: | 10 Nov 2020 17:21 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/103933 |