Enhancing Accuracy Of Credit Scoring Classification With Imbalance Data Using Synthetic Minority Oversampling Technique-Support Vector Machine (SMOTE-SVM) Model

Credit is one of the business models that provides significant growth. With the growth of new credit applicants and financial markets, the likelihood of credit problems also becomes higher. Thus, it becomes important for a financial institution to conduct a preliminary selection of the cr...

Bibliographic Details
Main Author: Bingamawa, Muhammad Tosan
Format: Thesis
Language: English
Published: 2017
Subjects:
Online Access: http://eprints.utem.edu.my/id/eprint/20759/1/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%2024%20Pages.pdf
http://eprints.utem.edu.my/id/eprint/20759/2/Enhancing%20Accuracy%20Of%20Credit%20Scoring%20Classification%20With%20Imbalance%20Data%20Using%20Synthetic%20Minority%20Oversampling%20Technique-Support%20Vector%20Machine%20%28SMOTE-SVM%29%20Model%20-%20Muhammad%20Tosan%20Bingamawa.pdf
id my-utem-ep.20759
record_format uketd_dc
spelling my-utem-ep.20759 2022-02-17T11:04:08Z http://libraryopac.utem.edu.my/webopac20/Record/0000106535
institution Universiti Teknikal Malaysia Melaka
collection UTeM Repository
topic Q Science (General)
QA76 Computer software
description Credit is one of the business models that provides significant growth. As the number of new credit applicants and financial markets grows, the likelihood of credit problems also rises. It is therefore important for a financial institution to screen credit applicants in advance, and credit scoring is one of the models used for this preliminary selection of potential customers. One of the most common techniques for developing a credit scoring model is data mining classification. However, classification has difficulty with imbalanced data distributions: the classifier may assign all of the data to the majority class and perform poorly on the minority class. Credit data also have an imbalanced distribution, so classifying them with an inappropriate technique may lead to wrong decisions for a financial institution. This study identifies several methods for handling the imbalanced data problem and proposes an improved credit scoring model for imbalanced data in a financial institution using the SMOTE-SVM model. The study is conducted in five phases: data collection, data pre-processing, feature selection, classification, and validation and evaluation. The SMOTE-SVM experiments consider different data ratios and different numbers of nearest neighbours in SMOTE. The results show that accuracy and overall performance improve as the data are balanced using the SMOTE-SVM model. Performance measurement using 10-fold cross-validation and a confusion matrix shows that the SMOTE-SVM model correctly classifies most of the data in each class, with good accuracy, class precision, and class recall. Based on these results, the SMOTE-SVM model is believed to be effective in handling imbalanced data for credit scoring classification.
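The oversampling idea and the confusion-matrix measures named in the description can be illustrated with a minimal sketch. It is not the thesis's implementation: the record does not state which tool or code was used, the SVM training step is left out, and the function names `smote_oversample` and `class_metrics`, their parameters, and the sample values are hypothetical. The sketch assumes numeric features and Euclidean distance.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolation (basic SMOTE idea).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def class_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall for one class from confusion-matrix counts.

    Precision and recall are reported as 0.0 when their denominator is zero.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical minority-class samples with two numeric features each.
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_oversample(minority_samples, n_synthetic=6, k=2)

# A classifier that labels every one of 100 cases as the majority class
# (90 majority, 10 minority) scores 0.9 accuracy but 0.0 minority recall.
all_majority = class_metrics(tp=0, fp=0, fn=10, tn=90)
```

The second example shows why accuracy alone can hide the failure mode the description mentions: a classifier that assigns everything to the majority class still looks accurate, while class recall exposes the problem.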
qualification_name Master of Philosophy (M.Phil.)
qualification_level Master's degree
granting_institution Universiti Teknikal Malaysia Melaka
granting_department Faculty of Information and Communication Technology