Credit Card Fraud Detection Using New Preprocessing And Hybrid Machine Learning Techniques

Bibliographic Details
Main Author: Gasim, Esraa Faisal Malik
Format: Thesis
Language: English
Published: 2023
Online Access: http://eprints.usm.my/60174/1/ESRAA%20FAISAL%20MALIK%20GASIM%20-%20TESIS%20cut.pdf
Description
Summary: One of the significant problems in credit card fraud detection is the growing prevalence of highly imbalanced data. A high ratio of majority-class to minority-class samples can lead to misleading results, because conventional machine learning algorithms assume a roughly equal class distribution. The first contribution of this research is a new preprocessing technique that combines cost-sensitive learning with data-level resampling to improve classification performance on highly imbalanced datasets. The technique consists of three phases. In the first phase, several data-level resampling techniques, such as SMOTE-ENN, SMOTE-TOMEK, SMOTE-OSS, SMOTE-RUS, and ROS-RUS with their default parameters, are compared to identify the best-performing one. In the second phase, cost-sensitive learning is applied with different ratios to determine the best range of ratios for use in the third phase. In the third phase, the resampling percentage is fine-tuned to avoid losing crucial information or generating repetitive synthetic data that could cause overfitting, and the cost-sensitive learning ratio is fine-tuned to set the misclassification cost of the minority class. Compared with conventional resampling techniques, the new preprocessing technique had a positive impact on the F1-measure and the misclassification rate. Furthermore, the negative effect of financial crimes on financial institutions has grown dramatically over the years. The second contribution of this research is the development of multiple hybrid machine learning models to enhance the detection of fraudulent activity in the credit card fraud detection domain.
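
To make the ideas in the summary concrete, the following is a minimal sketch, not the thesis's implementation. It shows one of the named resampling schemes (ROS-RUS: random oversampling of the minority class followed by random undersampling of the majority class) with tunable percentages, a simple cost-sensitive weighting, and the two evaluation measures mentioned (F1-measure and misclassification rate). All function names, parameters (`over_pct`, `keep_pct`, `cost_ratio`), and the toy data are assumptions made for illustration; the thesis's actual parameter choices are not given in this record.

```python
import random
from collections import Counter


def ros_rus(X, y, minority_label=1, over_pct=1.0, keep_pct=0.5, seed=0):
    """Hypothetical ROS-RUS sketch.

    over_pct: extra minority samples to add by duplication, as a fraction
              of the minority size (1.0 doubles the minority class).
    keep_pct: fraction of majority samples to keep.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    # Random oversampling: duplicate minority rows chosen with replacement.
    extra = [rng.choice(minority) for _ in range(round(len(minority) * over_pct))]
    # Random undersampling: keep a random subset of majority rows.
    kept = rng.sample(majority, round(len(majority) * keep_pct))
    idx = minority + extra + kept
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]


def cost_weights(y, minority_label=1, cost_ratio=10.0):
    """Per-sample weights: a minority error costs cost_ratio times more."""
    return [cost_ratio if label == minority_label else 1.0 for label in y]


def f1_and_error(y_true, y_pred, positive=1):
    """Return (F1-measure for the positive class, misclassification rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    error = sum(t != p for t, p in pairs) / len(pairs)
    return f1, error


# Toy data (assumed): 10 fraud rows (label 1) among 1000 transactions.
X = list(range(1000))
y = [1] * 10 + [0] * 990
X_res, y_res = ros_rus(X, y, over_pct=2.0, keep_pct=0.1)
class_counts = Counter(y_res)
weights = cost_weights(y_res, cost_ratio=5.0)
```

In a sketch like this, `over_pct` and `keep_pct` play the role of the resampling percentage tuned in the third phase, and `cost_ratio` plays the role of the cost-sensitive learning ratio; the weights could be passed to any classifier that accepts per-sample weights. Setting `over_pct` too high repeats the same minority rows many times, while setting `keep_pct` too low discards majority information, which is the trade-off the summary describes.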