Enhanced data clustering and classification using auto-associative neural networks and self organizing maps

This thesis presents a number of investigations leading to the introduction of novel applications of intelligent algorithms in the fields of informatics and analytics. The research aims to develop novel methodologies for dimension reduction and clustering of highly non-linear multidimensional data. Improv...

Bibliographic Details
Main Author: Mohd. Zin, Zalhan
Format: Thesis
Language: English
Published: 2016
Subjects: T Technology (General)
Online Access: http://eprints.utm.my/id/eprint/78096/1/ZalhanMohdZinPMJIT20161.pdf
id my-utm-ep.78096
record_format uketd_dc
spelling my-utm-ep.78096 2018-07-23T05:34:40Z 2016-03 Thesis http://eprints.utm.my/id/eprint/78096/ application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:92255 phd doctoral
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic T Technology (General)
description This thesis presents a number of investigations leading to the introduction of novel applications of intelligent algorithms in the fields of informatics and analytics. The research aims to develop novel methodologies for dimension reduction and clustering of highly non-linear multidimensional data. Improvements over existing methodologies are based on two fundamental approaches. The first makes novel structural re-arrangements by hybridising two conventional intelligent algorithms, Auto-Associative Neural Networks (AANN) and Self Organizing Maps (SOM), to improve data clustering. The second enhances data clustering and classification performance by introducing novel fundamental algorithmic changes, known as M3-SOM, to the data processing and training procedure of conventional SOM. Both approaches are tested, benchmarked and analysed on three datasets (Iris Flowers, Italian Olive Oils and Wine) through case studies on dimension reduction, clustering and classification of complex, non-linear data. The study of AANN alone shows that this non-linear algorithm is able to reduce the dimensions of the three datasets efficiently. This paves the way towards structurally hybridising AANN as the dimension reduction method with SOM as the clustering method (AANNSOM) to enhance data clustering. The hybrid AANNSOM is then introduced and applied to cluster the Iris Flowers, Italian Olive Oils and Wine datasets. Compared to SOM, the hybrid methodology improves data clustering accuracy, reduces quantisation errors and decreases computational time in all case studies. However, the topographic errors were inconsistent across the studies, and it remains difficult for both AANNSOM and SOM to provide additional inherent information about the datasets, such as the exact position of a data point in a cluster.
Therefore, M3-SOM, a novel methodology based on the SOM training algorithm, is proposed, developed and studied on the same datasets. M3-SOM improved data clustering and classification accuracy in all three case studies when compared to conventional SOM. It is also able to obtain inherent information about the position of one data point or "sub-cluster" relative to other data points or sub-clusters within the same class in the Iris Flowers and Wine datasets. Nevertheless, it has difficulty achieving the same level of performance when clustering the Italian Olive Oils data because of the high number of data classes. Overall, it can be concluded that both methodologies improve data clustering and classification performance and help discover inherent information inside multidimensional data.
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Mohd. Zin, Zalhan
title Enhanced data clustering and classification using auto-associative neural networks and self organizing maps
granting_institution Universiti Teknologi Malaysia, Malaysia-Japan International Institute of Technology
granting_department Malaysia-Japan International Institute of Technology
publishDate 2016
url http://eprints.utm.my/id/eprint/78096/1/ZalhanMohdZinPMJIT20161.pdf
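The description compares SOM variants by quantisation error and topographic error. As a rough illustration of what those two measures capture, the following is a minimal NumPy sketch of a conventional SOM with both error measures. It is not the thesis's AANNSOM or M3-SOM. The function names, the 4x4 grid, the linear decay schedules, the 8-neighbourhood definition of adjacent units, and the synthetic three-cluster data are all assumptions made for this example.

```python
import numpy as np


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM online; return (weights, grid coordinates)."""
    rng = np.random.default_rng(seed)
    # Grid position of every unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Assumed initialisation: weights start as randomly drawn samples.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed linear decay with a floor
            # Best-matching unit: the unit whose weight vector is closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull every unit towards x, weighted by its grid distance to the BMU.
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights, grid


def quantisation_error(data, weights):
    """Mean Euclidean distance from each sample to its best-matching unit."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())


def topographic_error(data, weights, grid):
    """Fraction of samples whose best and second-best units are not adjacent on the grid."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    first, second = order[:, 0], order[:, 1]
    # Assumption: units are adjacent if they differ by at most one grid step
    # in each direction (8-neighbourhood).
    gap = np.abs(grid[first] - grid[second]).max(axis=1)
    return float(np.mean(gap > 1))


# Synthetic stand-in data: three Gaussian clusters in four dimensions.
rng = np.random.default_rng(1)
centres = np.array([[0.0, 0.0, 0.0, 0.0], [3.0, 3.0, 3.0, 3.0], [0.0, 3.0, 0.0, 3.0]])
data = np.vstack([c + 0.3 * rng.standard_normal((50, 4)) for c in centres])

weights, grid = train_som(data)
print("quantisation error:", quantisation_error(data, weights))
print("topographic error:", topographic_error(data, weights, grid))
```

A lower quantisation error means the map units sit closer to the data, while a lower topographic error means neighbouring inputs tend to land on neighbouring units. The two can move independently, which is consistent with the description's remark that topographic errors were inconsistent even where quantisation errors improved.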