Automated model selection for corporation credit risk assessment using machine learning / Zulkifli Halim

Credit risk assessment is the procedure by which investors or lenders predict the chance of loan default in order to measure risk. A wrong decision places the institution at risk. Corporation Credit Risk Assessment (CCRA) depends on the financial indicators representing the companies' status at...

Bibliographic Details
Main Author: Halim, Zulkifli
Format: Thesis
Language: English
Published: 2023
Online Access: https://ir.uitm.edu.my/id/eprint/88795/1/88795.pdf
id my-uitm-ir.88795
record_format uketd_dc
spelling my-uitm-ir.88795 2024-01-17T02:42:30Z https://ir.uitm.edu.my/id/eprint/88795/ text en public phd doctoral
institution Universiti Teknologi MARA
collection UiTM Institutional Repository
language English
advisor Mohamed Shuhidan, Shuhaida (Dr.)
description Credit risk assessment is the procedure by which investors or lenders predict the chance of loan default in order to measure risk. A wrong decision places the institution at risk. Corporation Credit Risk Assessment (CCRA) depends on financial indicators that represent a company's status at a given time. Machine learning is now widely used across applications, including the financial domain. The global trend in CCRA research shows that the use of machine learning and deep learning techniques is expanding rapidly. These techniques have outperformed traditional approaches in many CCRA studies. Machine learning model selection is an iterative process of exploring, evaluating, and improving algorithms. Selecting an optimal model for a particular domain is rigid, challenging, and complicated. The no free lunch theorem implies that no particular algorithm or combination of features will always produce considerably better outcomes than others. Hence, selecting the optimal model raises two questions: the data characteristics for CCRA and the best-practice machine learning pipelines. The data characteristics for CCRA include the features used and the data dimension. This study used thirteen features, comprising ten financial ratios (FR), two macroeconomic variables, and the company's age. The features were selected based on the extensive CCRA literature worldwide. This study also investigates the significance of the data dimension in CCRA (single or multi-dimensional) and of the correlation between features. For the best-practice machine learning pipelines, various machine learning models are used to discover the best model for CCRA. This study proposes an automated model selection based on the exhaustive search algorithm, which by itself caused timeout and memory leak issues. The proposed automated model selection solves these issues by automatically writing all results to CSV files to reduce memory consumption. The study samples are PN17-status companies. Through the automated model selection, 176 models are created across the experiment settings. The models are based on four machine learning algorithms (logistic regression, support vector machine, decision tree, and neural network), two ensemble techniques (adaptive boosting and bootstrap aggregation), and three deep learning algorithms (recurrent neural network, long short-term memory (LSTM), and gated recurrent unit (GRU)). In addition, this study proposes two hybrid LSTM-GRU-based models: LSTM-GRU Double Stack (LGDS) and LSTM-GRU Alternate Double Stack (LGADS). The proposed automated model selection found that the LGADS model, on multi-dimensional data with FR-only features and without a feature correlation setup, outperformed the other models with the highest accuracy and F1 score. The LGADS model achieved 84.2% for both measurements. This study contributes to the body of knowledge by proposing an automated machine learning model selection for CCRA. The study could be extended in scope by adding more financial ratios, since the FR features are significant for CCRA, and by adding more samples to produce better results.
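The abstract describes an exhaustive search whose results are written to CSV files rather than held in memory. The following is a minimal Python sketch of that general idea, not the thesis's actual implementation. The function names, the `evaluate` callback, and the example grid are all assumptions made for illustration. The grid values only echo terms from the abstract and do not reproduce the 176 experiment settings.

```python
import csv
import itertools
from pathlib import Path
from typing import Callable, Dict, Iterable, Sequence


def exhaustive_search_to_csv(
    grid: Dict[str, Sequence[str]],
    evaluate: Callable[[Dict[str, str]], Dict[str, float]],
    out_path: Path,
    metric_names: Iterable[str] = ("accuracy", "f1"),
) -> int:
    """Evaluate every combination in `grid`, writing one CSV row per run.

    `evaluate` is a hypothetical callback that trains and scores one model
    for a given setting and returns a dict keyed by `metric_names`.
    """
    setting_names = list(grid)
    fieldnames = setting_names + list(metric_names)
    runs = 0
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        # itertools.product is lazy: the full list of combinations is never
        # materialised, and no per-run result is kept after it is written.
        for values in itertools.product(*(grid[name] for name in setting_names)):
            setting = dict(zip(setting_names, values))
            scores = evaluate(setting)
            writer.writerow({**setting, **scores})
            handle.flush()  # push the row out before the next model is built
            runs += 1
    return runs


def best_row(out_path: Path, metric: str = "f1") -> Dict[str, str]:
    """Stream the CSV back and return the row with the highest `metric`."""
    with out_path.open(newline="") as handle:
        return max(csv.DictReader(handle), key=lambda row: float(row[metric]))


if __name__ == "__main__":
    # Illustrative grid only; the real experiment settings are not given here.
    example_grid = {
        "algorithm": ["logistic_regression", "lstm", "gru", "lgads"],
        "dimension": ["single", "multi"],
        "features": ["fr_only", "all"],
    }

    def placeholder_evaluate(setting: Dict[str, str]) -> Dict[str, float]:
        # Stand-in for real training; returns constant scores.
        return {"accuracy": 0.0, "f1": 0.0}

    total = exhaustive_search_to_csv(
        example_grid, placeholder_evaluate, Path("results.csv")
    )
    print(f"wrote {total} rows")
```

Because each row is written as soon as a model is scored, memory use stays roughly constant regardless of how many settings are searched, and a run that times out still leaves the completed results on disk.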
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Halim, Zulkifli
title Automated model selection for corporation credit risk assessment using machine learning / Zulkifli Halim
granting_institution Universiti Teknologi MARA (UiTM)
granting_department Faculty of Computer and Mathematical Sciences
publishDate 2023
url https://ir.uitm.edu.my/id/eprint/88795/1/88795.pdf
_version_ 1794192163548954624