Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model

Bibliographic Details
Main Author: Bingamawa, Muhammad Tosan
Format: Thesis
Language: English
Published: 2017
Online Access: http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf
http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf
Description
Summary: Credit is one of the business models that provides significant growth. As the number of new credit applicants and financial markets grows, the likelihood of credit problems also rises. It is therefore important for a financial institution to conduct a preliminary selection of credit applicants. Credit scoring is one of the models a financial institution uses to perform this preliminary selection of potential customers. One of the most common techniques for developing a credit scoring model is the data mining classification task. However, this technique has difficulty classifying data with an imbalanced distribution, because imbalance may lead the classifier to assign all of the data to the majority class and perform poorly on the minority class. Credit data also have an imbalanced distribution. Therefore, classifying credit data with an imbalanced distribution using an inappropriate technique may lead the classifier to produce a wrong decision for a financial institution. In this study, several methods for handling the imbalanced data problem are identified. An improved credit scoring model for imbalanced data in a financial institution, based on the SMOTE-SVM model, is also proposed. The study is conducted in five phases, which are data collection, data pre-processing, feature selection, classification, validation, and evaluation. The experiments with the SMOTE-SVM model consider different data ratios and numbers of nearest neighbours used in SMOTE. The experimental results show that accuracy and performance improve as the data are balanced using the SMOTE-SVM model. Performance measurement using 10-fold cross-validation and a confusion matrix shows that the SMOTE-SVM model can correctly classify most of the data in each class, with good accuracy, class precision, and class recall. Based on this result, the SMOTE-SVM model is believed to be effective in handling imbalanced data for credit scoring classification.
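
The two ideas the summary relies on, SMOTE oversampling with a chosen data ratio and number of nearest neighbours, and per-class precision and recall read from a confusion matrix, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `smote` and `class_precision_recall`, the parameter `target_ratio`, and the toy applicant data are all assumptions made for illustration, and the SVM training step is omitted. In practice a library such as imbalanced-learn (SMOTE) together with scikit-learn (SVC) would likely be used instead.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Sketch of SMOTE: interpolate between a minority point and one of
    its k nearest minority neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot exceed the number of other points
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def class_precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (made up): 8 "good" applicants and 2 "bad" ones.
good = [(float(i), 1.0) for i in range(8)]
bad = [(0.0, 5.0), (2.0, 7.0)]

# target_ratio = minority size / majority size after oversampling (assumed name).
target_ratio = 1.0
n_new = int(target_ratio * len(good)) - len(bad)
bad_balanced = bad + smote(bad, n_new, k=1)

# A classifier that assigns every applicant to the majority class looks
# accurate overall but never detects the minority class.
y_true = ["good"] * 8 + ["bad"] * 2
y_pred = ["good"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bad_precision, bad_recall = class_precision_recall(y_true, y_pred, "bad")
```

In this toy case the all-majority classifier reaches 0.8 accuracy with zero recall on the "bad" class, which is why the summary reports class precision and class recall alongside accuracy. Varying `target_ratio` and `k` corresponds to the data ratio and nearest-neighbour settings the experiments consider.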