Enhanced k-nearest neighbours classification performance based on segmentation and imputation of missing data

Classifying data or objects for diagnosis in magnetic resonance images is important in image segmentation, especially for data that is difficult to identify, namely low-grade tumors or cerebrospinal fluid (CSF). The aim of this thesis is to address the aforementioned problems associated with mis...

Bibliographic Details
Main Author: Saeed, Soobia
Format: Thesis
Language: English
Published: 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/102976/1/SoobiaSaeedPSC2022.pdf.pdf
id my-utm-ep.102976
record_format uketd_dc
spelling my-utm-ep.102976 2023-10-12T08:33:13Z 2022 Thesis http://eprints.utm.my/102976/ application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150554 phd doctoral
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers. Computer science
description Classifying data or objects for diagnosis in magnetic resonance imaging (MRI) is important in image segmentation, especially for data that is difficult to identify, namely low-grade tumors or cerebrospinal fluid (CSF). The aim of this thesis is to address the problems associated with missing data in MRI images and with noisy MRI images, which require more processing time. The thesis focuses on brain tumor segmentation and CSF classification in four-dimensional MRI images. Three datasets, called the Light Field Database (LFD), with improved image accuracy and increased resolution have been created. A hybrid k-nearest neighbours (k-NN) framework with time complexity is proposed. It consists of three techniques: GrabCut support vector machine (GCSVM) and scale invariant feature transform (SIFT); hidden Markov model of k-mean clustering (HMkC) and k-NN; and correlation matrices of discrete Fourier transform (CM-DFT). Firstly, the GCSVM and SIFT technique combines three methods: GrabCut, support vector machine (SVM) and scale invariant feature transform. The technique achieves 99.9% SVM accuracy, a maximum flow of 4606 for GrabCut segmentation, 50625 image-pixel nodes and 50168 edges, and a computational time of 2.29 seconds. For SIFT on the LFD datasets, the distance value in the segmentation is 1.464, 1.215 and 1.23 for dataset-I, dataset-II and dataset-III respectively, and the computational time is 1.47 seconds, 1.88 seconds and 1.35 seconds respectively. Secondly, the HMkC and k-NN technique resolves the classification problem using the Iterated Condition Mode (ICMM) with the k-mean clustering algorithm and the k-NN algorithm. Its accuracy, sensitivity, specificity and computational time are 99.83%, 99.99%, 99.8% and 14.9 seconds respectively. Thirdly, the CM-DFT technique resolves the missing data imputation problem by using cross-correlation of lagged hybrid k-NN with DFT (Hk-NN-DFT) to enhance the MRI images. The technique generates the non-missing values in terms of multiplication of 1100-3000 and achieves 99.84% accuracy for missing data in the image. After the missing ratio of dataset-I, II and III is retrieved, the missing ratio of the imputed data in the images is 0.9815, with a computational time of 1.533 seconds. Together, these three techniques improve the proposed hybrid k-NN framework so that brain tumors (low-grade tumors) and CSF in images can be classified easily.
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Saeed, Soobia
title Enhanced k-nearest neighbours classification performance based on segmentation and imputation of missing data
granting_institution Universiti Teknologi Malaysia
granting_department Faculty of Engineering - School of Computing
publishDate 2022
url http://eprints.utm.my/102976/1/SoobiaSaeedPSC2022.pdf.pdf
_version_ 1783729231024357376