Pyzhov, Vladislav and Pyzhov, Stanislav (2017): Comparison of methods of data mining techniques for the predictive accuracy.
Abstract
This paper builds on the work of Yeh and Lien (2009). In that paper, the authors used a payment data set from an important bank in Taiwan. To build a model, the whole sample was divided into two subsets, a training set and a testing set, so that each model could be trained on the first and then evaluated on the second. Our motivation was to see whether the same result would be obtained if the models were applied repeatedly to different data sets. To do so, Monte Carlo simulation was used to generate these sets.
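The abstract does not say how the Monte Carlo data sets were generated. One common reading is repeated random hold-out: redraw the training/testing split many times and look at the spread of the resulting accuracies. The sketch below illustrates that idea under stated assumptions only. The data are synthetic (not the Taiwan payment data), the classifier is a plain k-nearest-neighbors vote, and all function names and parameters (`make_synthetic_data`, `holdout_accuracy`, `monte_carlo_accuracy`, `train_fraction`, `n_repeats`) are hypothetical.

```python
import random
import statistics


def make_synthetic_data(n, rng):
    """Hypothetical stand-in data: two noisy 2-D classes (not the Taiwan data set)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 * label  # class 0 around (0, 0), class 1 around (2, 2)
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    return data


def knn_predict(train, point, k=5):
    """Classify `point` by majority vote among its k nearest training points."""
    nearest = sorted(
        train,
        key=lambda row: (row[0][0] - point[0]) ** 2 + (row[0][1] - point[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def holdout_accuracy(data, rng, train_fraction=0.5):
    """One random training/testing split; returns accuracy on the testing set."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train, test = shuffled[:cut], shuffled[cut:]
    correct = sum(knn_predict(train, p) == y for p, y in test)
    return correct / len(test)


def monte_carlo_accuracy(data, n_repeats=20, seed=0):
    """Repeat the random split n_repeats times and collect the accuracies."""
    rng = random.Random(seed)
    return [holdout_accuracy(data, rng) for _ in range(n_repeats)]


if __name__ == "__main__":
    data = make_synthetic_data(200, random.Random(42))
    scores = monte_carlo_accuracy(data)
    print(f"mean accuracy: {statistics.mean(scores):.3f}")
    print(f"std deviation: {statistics.stdev(scores):.3f}")
```

Reporting the mean together with the standard deviation across repeats shows how much a single split's accuracy can vary. To compare several models in the same way, each one would be evaluated on the same sequence of splits.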
Item Type: | MPRA Paper |
---|---|
Original Title: | Comparison of methods of data mining techniques for the predictive accuracy. |
Language: | English |
Keywords: | Monte-Carlo, Data Mining, Neural Networks, k-nearest neighbors, Logistic regression, Random Forest. |
Subjects: | C - Mathematical and Quantitative Methods > C5 - Econometric Modeling > C53 - Forecasting and Prediction Methods ; Simulation Methods C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology ; Computer Programs > C81 - Methodology for Collecting, Estimating, and Organizing Microeconomic Data ; Data Access C - Mathematical and Quantitative Methods > C8 - Data Collection and Data Estimation Methodology ; Computer Programs > C87 - Econometric Software |
Item ID: | 79326 |
Depositing User: | Mr. Vladislav Pyzhov |
Date Deposited: | 27 May 2017 04:43 |
Last Modified: | 26 Sep 2019 18:02 |
References: | Ji-Hyun Kim. "Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap." Computational Statistics & Data Analysis, 53(11):3735–3745, 2009. Ron Kohavi et al. "A study of cross-validation and bootstrap for accuracy estimation and model selection." In IJCAI, volume 14, pages 1137–1145. Stanford, CA, 1995. Max Kuhn and Kjell Johnson. "Applied predictive modeling", volume 26. Springer, 2013. Gordon S Linoff and Michael JA Berry. "Data mining techniques: for marketing, sales, and customer relationship management." John Wiley & Sons, 2011. Daniel Mennitt, Kirk Sherrill, and Kurt Fristrup. "A geospatial model of ambient sound pressure levels in the contiguous United States." The Journal of the Acoustical Society of America, 135(5):2746–2764, 2014. I-Cheng Yeh and Che-hui Lien. "The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients." Expert Systems with Applications, 36(2):2473–2480, 2009. |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/79326 |