Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model
Credit is one of the business models that provides significant growth. With the growth of new credit applicants and financial markets, the likelihood of credit problems also becomes higher. Thus, it becomes important for a financial institution to conduct a preliminary selection to the cr...
Main Author: | Bingamawa, Muhammad Tosan |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2017 |
Subjects: | Q Science (General) QA76 Computer software |
Online Access: | http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf |
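The abstract for this record notes that a classifier trained on imbalanced data may assign every record to the majority class, and that results are reported as accuracy, class precision, and class recall from a confusion matrix. The following is a minimal sketch of why accuracy alone is misleading in that situation. The labels, the 95/5 split, and the function name `class_metrics` are hypothetical illustrations and are not taken from the thesis.

```python
def class_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical toy data: 95 "good" and 5 "bad" applicants.
y_true = ["good"] * 95 + ["bad"] * 5
# A degenerate classifier that assigns every applicant to the majority class.
y_pred = ["good"] * 100

accuracy, precision, recall = class_metrics(y_true, y_pred, positive="bad")
print(accuracy, precision, recall)  # 0.95 0.0 0.0
```

Accuracy is 95% even though no "bad" applicant is detected, which is why per-class precision and recall are the more informative measures on imbalanced credit data.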
id |
my-utem-ep.20759 |
---|---|
record_format |
uketd_dc |
spelling |
my-utem-ep.20759 2022-02-17T11:04:08Z Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model 2017 Bingamawa, Muhammad Tosan Q Science (General) QA76 Computer software Credit is one of the business models that provides significant growth. With the growth of new credit applicants and financial markets, the likelihood of credit problems also becomes higher. Thus, it becomes important for a financial institution to conduct a preliminary selection of credit applicants. To do that, credit scoring is one of the models used by a financial institution to perform a preliminary selection of potential customers. One of the most common techniques used to develop a credit scoring model is the data mining classification task. However, this technique has difficulty classifying data with an imbalanced distribution, because imbalanced data may lead the classifier to assign all of the data to the majority class and perform poorly on the minority class. Credit data also have an imbalanced distribution. Therefore, classifying credit data with an imbalanced distribution using an inappropriate technique may lead the classification to give a financial institution a wrong decision. In this study, several methods for handling the imbalanced data problem are identified. Moreover, an improvement of the credit scoring model for imbalanced data in a financial institution using the SMOTE-SVM model is proposed. This study is conducted in five phases, which are data collection, data pre-processing, feature selection, classification, validation, and evaluation. For the experiments using the SMOTE-SVM model, different data ratios and numbers of nearest neighbours in SMOTE are considered. The experimental results show that accuracy and performance improve as the data are balanced using the SMOTE-SVM model. Performance measurement using 10-fold cross-validation and a confusion matrix shows that the SMOTE-SVM model can correctly classify most of the data in each class, with good accuracy, class precision, and class recall. Based on this result, the SMOTE-SVM model is believed to be effective in handling imbalanced data for credit scoring classification. 2017 Thesis http://eprints.utem.edu.my/id/eprint/20759/ http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf text en public http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf text en validuser http://libraryopac.utem.edu.my/webopac20/Record/0000106535 mphil masters Universiti Teknikal Malaysia Melaka Faculty of Information and Communication Technology |
institution |
Universiti Teknikal Malaysia Melaka |
collection |
UTeM Repository |
language |
English |
topic |
Q Science (General) QA76 Computer software |
spellingShingle |
Q Science (General) QA76 Computer software Bingamawa, Muhammad Tosan Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model |
description |
Credit is one of the business models that provides significant growth. With the growth of new credit applicants and financial markets, the likelihood of credit problems also becomes higher. Thus, it becomes important for a financial institution to conduct a preliminary selection of credit applicants. To do that, credit scoring is one of the models used by a financial institution to perform a preliminary selection of potential customers. One of the most common techniques used to develop a credit scoring model is the data mining classification task. However, this technique has difficulty classifying data with an imbalanced distribution, because imbalanced data may lead the classifier to assign all of the data to the majority class and perform poorly on the minority class. Credit data also have an imbalanced distribution. Therefore, classifying credit data with an imbalanced distribution using an inappropriate technique may lead the classification to give a financial institution a wrong decision. In this study, several methods for handling the imbalanced data problem are identified. Moreover, an improvement of the credit scoring model for imbalanced data in a financial institution using the SMOTE-SVM model is proposed. This study is conducted in five phases, which are data collection, data pre-processing, feature selection, classification, validation, and evaluation. For the experiments using the SMOTE-SVM model, different data ratios and numbers of nearest neighbours in SMOTE are considered. The experimental results show that accuracy and performance improve as the data are balanced using the SMOTE-SVM model. Performance measurement using 10-fold cross-validation and a confusion matrix shows that the SMOTE-SVM model can correctly classify most of the data in each class, with good accuracy, class precision, and class recall. Based on this result, the SMOTE-SVM model is believed to be effective in handling imbalanced data for credit scoring classification. |
format |
Thesis |
qualification_name |
Master of Philosophy (M.Phil.) |
qualification_level |
Master's degree |
author |
Bingamawa, Muhammad Tosan |
author_facet |
Bingamawa, Muhammad Tosan |
author_sort |
Bingamawa, Muhammad Tosan |
title |
Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model |
title_short |
Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model |
title_full |
Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model |
title_fullStr |
Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model |
title_full_unstemmed |
Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model |
title_sort |
enhancing accuracy of credit scoring classification with imbalance data using synthetic minority oversampling technique-support vector machine (smote-svm) model |
granting_institution |
Universiti Teknikal Malaysia Melaka |
granting_department |
Faculty of Information and Communication Technology |
publishDate |
2017 |
url |
http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf |
_version_ |
1747834000527851520 |
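
The description above says the experiments vary the data ratio and the number of nearest neighbours used in SMOTE. As a rough illustration of what those two settings control, here is a simplified sketch of SMOTE-style oversampling using only the Python standard library. It is not the thesis's implementation: the function name `smote_sketch`, the sample points, and the choice to pick base samples at random are all assumptions made for illustration, and the SVM training step is omitted.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and one of their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class feature vectors (two numeric features each).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_synthetic=6, k=3)
print(len(new_points))  # 6
```

In this sketch, `n_synthetic` plays the role of the data ratio (how many minority records are added relative to the majority class) and `k` is the nearest-neighbour count. Each synthetic point lies on the line segment between two existing minority samples, so no values outside the observed minority range are created.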