Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model

Credit is one of the business models that provides significant growth. With the growth of new credit applicants and financial markets, the likelihood of credit problems also increases. Thus, it becomes important for a financial institution to conduct a preliminary selection of the cr...

Full description

Bibliographic Details
Main Author: Bingamawa, Muhammad Tosan
Format: Thesis
Language: English
Published: 2017
Subjects:
Online Access: http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf
http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf
id my-utem-ep.20759
record_format uketd_dc
spelling my-utem-ep.20759 2022-02-17T11:04:08Z Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model 2017 Bingamawa, Muhammad Tosan Q Science (General) QA76 Computer software Credit is one of the business models that provides significant growth. With the growth of new credit applicants and financial markets, the likelihood of credit problems also increases. Thus, it becomes important for a financial institution to conduct a preliminary selection of credit applicants. To do so, credit scoring has become one of the models used by financial institutions to perform a preliminary selection of potential customers. One of the most common techniques used to develop a credit scoring model is the data mining classification task. However, this technique has difficulty classifying data with an imbalanced distribution, because imbalanced data may lead the classifier to assign all of the data to the majority class and perform poorly on the minority class. Credit data used in credit scoring also have an imbalanced distribution. Therefore, classifying credit data with an imbalanced distribution using an inappropriate technique may lead the classification to produce a wrong decision for a financial institution. In this study, several methods for handling the imbalanced data problem are identified. Moreover, an improvement of the credit scoring model for imbalanced data in a financial institution using the SMOTE-SVM model is proposed. This study is conducted in five phases, which are data collection, data pre-processing, feature selection, classification, validation, and evaluation. The experiments using the SMOTE-SVM model are conducted by considering different data ratios and numbers of nearest neighbours used in SMOTE. The experimental results show that accuracy and performance improve as the data are balanced using the SMOTE-SVM model. The performance measurement using 10-fold cross validation and a confusion matrix shows that the SMOTE-SVM model can correctly classify most of the data in each class, with good results for accuracy, class precision, and class recall. Based on this result, the SMOTE-SVM model is believed to be effective in handling imbalanced data for credit scoring classification. 2017 Thesis http://eprints.utem.edu.my/id/eprint/20759/ http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf text en public http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf text en validuser http://libraryopac.utem.edu.my/webopac20/Record/0000106535 mphil masters Universiti Teknikal Malaysia Melaka Faculty of Information and Communication Technology
institution Universiti Teknikal Malaysia Melaka
collection UTeM Repository
language English
topic Q Science (General)
QA76 Computer software
spellingShingle Q Science (General)
QA76 Computer software
Bingamawa, Muhammad Tosan
Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model
description Credit is one of the business models that provides significant growth. With the growth of new credit applicants and financial markets, the likelihood of credit problems also increases. Thus, it becomes important for a financial institution to conduct a preliminary selection of credit applicants. To do so, credit scoring has become one of the models used by financial institutions to perform a preliminary selection of potential customers. One of the most common techniques used to develop a credit scoring model is the data mining classification task. However, this technique has difficulty classifying data with an imbalanced distribution, because imbalanced data may lead the classifier to assign all of the data to the majority class and perform poorly on the minority class. Credit data used in credit scoring also have an imbalanced distribution. Therefore, classifying credit data with an imbalanced distribution using an inappropriate technique may lead the classification to produce a wrong decision for a financial institution. In this study, several methods for handling the imbalanced data problem are identified. Moreover, an improvement of the credit scoring model for imbalanced data in a financial institution using the SMOTE-SVM model is proposed. This study is conducted in five phases, which are data collection, data pre-processing, feature selection, classification, validation, and evaluation. The experiments using the SMOTE-SVM model are conducted by considering different data ratios and numbers of nearest neighbours used in SMOTE. The experimental results show that accuracy and performance improve as the data are balanced using the SMOTE-SVM model. The performance measurement using 10-fold cross validation and a confusion matrix shows that the SMOTE-SVM model can correctly classify most of the data in each class, with good results for accuracy, class precision, and class recall. Based on this result, the SMOTE-SVM model is believed to be effective in handling imbalanced data for credit scoring classification.
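The abstract mentions varying the data ratio and the number of nearest neighbours used in SMOTE. The sketch below illustrates only that oversampling step, in plain Python. It is not taken from the thesis: the function names `smote` and `oversample_to_ratio`, the parameters `ratio`, `k`, and `seed`, and the choice to pick base points at random are assumptions made for illustration, and the SVM classifier and the 10-fold cross validation are deliberately left out.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Simplified variant: base points are drawn at random rather than
    generating a fixed number of synthetic points per minority sample.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def oversample_to_ratio(majority, minority, ratio=1.0, k=5, seed=0):
    """Add synthetic minority points until minority/majority reaches `ratio`."""
    target = int(round(ratio * len(majority)))
    needed = max(0, target - len(minority))
    return minority + smote(minority, needed, k=k, seed=seed)
```

Calling `oversample_to_ratio(majority, minority, ratio=1.0, k=5)` would return the original minority samples followed by enough synthetic ones to match the majority class size. In an experiment like the one described, the oversampling would normally be applied to the training folds only, so that synthetic points do not leak into the validation data.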
format Thesis
qualification_name Master of Philosophy (M.Phil.)
qualification_level Master's degree
author Bingamawa, Muhammad Tosan
author_facet Bingamawa, Muhammad Tosan
author_sort Bingamawa, Muhammad Tosan
title Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model
title_short Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model
title_full Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model
title_fullStr Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model
title_full_unstemmed Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model
title_sort enhancing accuracy of credit scoring classification with imbalance data using synthetic minority oversampling technique-support vector machine (smote-svm) model
granting_institution Universiti Teknikal Malaysia Melaka
granting_department Faculty of Information and Communication Technology
publishDate 2017
url http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf
http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf
_version_ 1747834000527851520