Prediction of breast cancer diagnosis using machine learning in Malaysian women

Breast cancer is the most prevalent cancer in the world and the main cause of cancer mortality in twelve regions of the world. There is therefore a need for efficient screening and diagnosis of the disease. This thesis aims to explore the use of machine learning (ML) for breast cancer risk est...

Bibliographic Details
Main Author: Mokhtar, Tengku Muhammad Hanis Tengku
Format: Thesis
Language: English
Published: 2024
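Subjects: R Medicine; RA440-440.87 Study and teaching. Research; RC254-282 Neoplasms. Tumors. Oncology (including Cancer)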
Online Access: http://eprints.usm.my/60999/1/TENGKU%20MUHAMMAD%20HANIS%20BIN%20TENGKU%20MOKHTAR-E.pdf
id my-usm-ep.60999
record_format uketd_dc
spelling my-usm-ep.60999 2024-08-21T03:20:41Z Prediction of breast cancer diagnosis using machine learning in Malaysian women 2024-03 Mokhtar, Tengku Muhammad Hanis Tengku R Medicine RA440-440.87 Study and teaching. Research RC254-282 Neoplasms. Tumors. Oncology (including Cancer) 2024-03 Thesis http://eprints.usm.my/60999/ http://eprints.usm.my/60999/1/TENGKU%20MUHAMMAD%20HANIS%20BIN%20TENGKU%20MOKHTAR-E.pdf application/pdf en public phd doctoral Universiti Sains Malaysia Pusat Pengajian Sains Kesihatan
institution Universiti Sains Malaysia
collection USM Institutional Repository
language English
topic R Medicine
description Breast cancer is the most prevalent cancer in the world and the main cause of cancer mortality in twelve regions of the world. There is therefore a need for efficient screening and diagnosis of the disease. This thesis explores the use of machine learning (ML) for breast cancer risk estimation and prediction. It comprises six interrelated projects, presented in Chapters 2 to 7. Chapter 2 presents an overview of breast cancer research in Malaysia, using a bibliometric analysis to describe research activity in the field. This project revealed no dominant research area in breast cancer research in Malaysia and identified precision medicine and deep learning as two growing research themes. Chapter 3 explores the most cited global research on breast cancer and ML, again using bibliometric analysis. This project found strong interest in the application of ML to breast cancer over the last three decades. Three frequently used ML algorithms were deep learning, support vector machine (SVM), and cluster analysis. Chapter 4 investigates factors influencing mammographic density among Asian women, including Malaysian women. The study used multiple imputation to handle missing data and logistic regression to analyse the data. Five factors were found to affect mammographic density: age, number of children, body mass index, menopausal status, and Breast Imaging-Reporting and Data System (BI-RADS) classification. Chapter 5 explores the use of patient registration records and ML for breast cancer risk estimation. The ML model developed in this chapter could be used as an over-the-counter (OTC) screening model for women attending breast clinics. Eight ML algorithms were explored, and k-nearest neighbour (kNN) models performed significantly better than the other seven. Chapter 6 presents a meta-analysis of ML models for breast cancer classification, which sought to establish the diagnostic accuracy of ML applied to mammographic data. It found that neural networks, deep learning, tree-based models, and SVM performed well on mammographic data for breast cancer detection. The study established the good diagnostic accuracy of ML in this area, further supporting its use, especially in screening and as a supplementary diagnostic tool. Lastly, Chapter 7 explores an ensemble of pre-trained networks for breast abnormality classification using digital mammograms. Thirteen pre-trained networks were considered as candidates for the ensemble. Each network was fine-tuned, and the top-performing networks were used to build the ensemble model. The ensemble performed well in classifying normal and suspicious mammograms. In conclusion, this thesis highlights the potential of ML in breast cancer risk estimation and prediction. Its findings contribute to the growing body of literature on ML in breast cancer research and provide insights for future research in this area.
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Mokhtar, Tengku Muhammad Hanis Tengku
title Prediction of breast cancer diagnosis using machine learning in Malaysian women
granting_institution Universiti Sains Malaysia
granting_department Pusat Pengajian Sains Kesihatan
publishDate 2024
url http://eprints.usm.my/60999/1/TENGKU%20MUHAMMAD%20HANIS%20BIN%20TENGKU%20MOKHTAR-E.pdf