Improved clustering using robust and classical principal component
The k-means algorithm is a popular data clustering algorithm. It aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. Finding the appropriate number of clusters for a given data set...
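The nearest-mean rule in the summary above can be illustrated with one assignment step followed by one mean update. This is a minimal sketch using NumPy; the function names `assign_to_nearest_mean` and `update_means` and the four-point data set are assumptions made for illustration and do not come from the thesis.

```python
import numpy as np

def assign_to_nearest_mean(X, means):
    """Return, for each row of X, the index of the closest cluster mean."""
    # Pairwise Euclidean distances, shape (n_observations, k).
    dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return dists.argmin(axis=1)

def update_means(X, labels, k):
    """Recompute each cluster mean from its current members.

    Assumes every cluster has at least one member.
    """
    return np.array([X[labels == j].mean(axis=0) for j in range(k)])

# Hypothetical toy data: two points near the origin, two near (10, 10).
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
means = np.array([[1.0, 1.0], [9.0, 9.0]])   # arbitrary starting means

labels = assign_to_nearest_mean(X, means)    # -> [0, 0, 1, 1]
means = update_means(X, labels, k=2)         # -> [[0, 0.5], [10, 10.5]]
```

Repeating these two steps until the means stop changing gives the usual iterative form of k-means.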
Main Author: | Hassn, Ahmed Kadom |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2017 |
Subjects: | Algorithms |
Online Access: | http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf |
id |
my-upm-ir.70922 |
---|---|
record_format |
uketd_dc |
spelling |
my-upm-ir.70922 2022-07-07T03:07:15Z Improved clustering using robust and classical principal component 2017-06 Hassn, Ahmed Kadom Algorithms 2017-06 Thesis http://psasir.upm.edu.my/id/eprint/70922/ http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf text en public masters Universiti Putra Malaysia Algorithms Fitrianto, Anwar |
institution |
Universiti Putra Malaysia |
collection |
PSAS Institutional Repository |
language |
English |
advisor |
Fitrianto, Anwar |
topic |
Algorithms |
description |
The k-means algorithm is a popular data clustering algorithm. It aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. Finding the appropriate number of clusters for a given data set is generally a trial-and-error process, which is made more difficult by the subjective nature of deciding what constitutes ‘correct’ clustering. When the dimension of the data is large, it is often difficult to apply the k-means clustering algorithm because of its long computation time. To remedy this problem, we propose to integrate principal component analysis (PCA), which is useful for reducing the dimensionality of a dataset, with the k-means clustering algorithm. We call our proposed method k-means by principal components (pc1). In this study, the kernels created by the k-means method are replaced with kernels created by the PCA method, which reduces the dimensionality of the data. The results of the study show that k-means by PCA is faster and more efficient than the classical k-means algorithm. Both the classical k-means algorithm and the k-means by PCA algorithm are very sensitive to the presence of outliers. Hence, k-means by robust PCA is developed to rectify the problem of outliers in the dataset. The findings indicate that, in the absence of outliers, the two methods, k-means by PCA and k-means by robust PCA, perform equally well. Nonetheless, k-means by robust PCA is less affected by outliers than k-means by classical PCA. |
format |
Thesis |
qualification_level |
Master's degree |
author |
Hassn, Ahmed Kadom |
title |
Improved clustering using robust and classical principal component |
granting_institution |
Universiti Putra Malaysia |
publishDate |
2017 |
url |
http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf |
_version_ |
1747812937331900416 |
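The idea in the description of running k-means on principal component scores rather than on the raw variables can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names `pca_scores` and `kmeans`, the synthetic 20-dimensional data, and the choice of two components are all hypothetical, and the record does not give the thesis's exact procedure. The robust PCA estimator is not specified in the record either, so only the classical PCA variant is shown.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project X onto its leading principal components (classical PCA via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=100, seed=0):
    """Plain iterative k-means; returns (labels, means)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct observations chosen at random.
    means = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each observation to the cluster with the nearest mean.
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute means; an empty cluster keeps its previous mean.
        new_means = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else means[j]
            for j in range(k)
        ])
        if np.allclose(new_means, means):
            break
        means = new_means
    return labels, means

# Synthetic data (an assumption): two groups of 100 points in 20 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 20)),
    rng.normal(3.0, 1.0, size=(100, 20)),
])

scores = pca_scores(X, n_components=2)   # reduce 20 variables to 2 scores
labels, _ = kmeans(scores, k=2)          # cluster in the reduced space
```

Clustering the two-column score matrix means each distance computation involves 2 coordinates instead of 20, which is the kind of saving the description attributes to dimensionality reduction. Because the sample mean and the SVD are both influenced by extreme observations, this classical version would be expected to share the outlier sensitivity noted in the description.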