Comparative analysis of K-Means and K-Medoids for clustering exam questions / Nurul Zafirah Mokhtar

Bibliographic Details
Main Author: Mokhtar, Nurul Zafirah
Format: Thesis
Language: English
Published: 2016
Online Access: https://ir.uitm.edu.my/id/eprint/55833/1/55833.pdf
Description
Summary: Clustering is increasingly needed as a technique for providing better groupings in response to several problems. Clustering dynamic data poses a challenge in identifying and forming groups. This unsupervised learning usually leads to undirected knowledge discovery. A cluster detection algorithm searches for clusters of data that are similar to one another by using similarity measures. Determining which algorithm yields the best clusters can be an issue. K-Means and k-Medoids are popular techniques in clustering. Grouping exam questions is a difficult task because it has to deal with the attributes and parameters of the data. The two techniques may also produce different outcomes: depending on the parameters and attributes of the data, the results obtained from k-Means and k-Medoids can vary. Each selected attribute and parameter undergoes several data mining processes, from pre-processing through to analysis of the data. The attributes and parameters used to group the questions are marks, cognitive level and the topic of each question. The results are then compared to determine which technique produces higher accuracy. This paper presents a comparative analysis of both algorithms on different data clusters to lay out the strengths and weaknesses of each. The exam questions are grouped into low, medium and high level questions. Data from the ITS570 course was used in the study, and a set of cluster rules holding the centroid and medoid values for the two algorithms was produced at the end of the project. The study found that k-Medoids produced higher accuracy, 0.11% higher than k-Means. In conclusion, for this type of data, the k-Medoids algorithm showed higher accuracy than k-Means.
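
To make the difference between the two techniques concrete, the following is a minimal sketch, not the thesis's implementation. It assumes questions are encoded as numeric triples (marks, cognitive level, topic id), uses Euclidean distance, and uses the simple alternating variant of k-Medoids rather than PAM. The toy data, the function names and the choice of initial centres are all invented for illustration; the record does not state how the thesis encoded its attributes or which k-Medoids variant it used.

```python
import math

# Hypothetical toy data: (marks, cognitive_level, topic_id) per exam question.
# These values are invented for illustration and are NOT from the ITS570 data set.
QUESTIONS = [
    (2, 1, 1), (3, 1, 2), (2, 2, 1),      # intended "low" questions
    (5, 3, 2), (6, 3, 3), (5, 4, 3),      # intended "medium" questions
    (10, 5, 4), (12, 6, 4), (10, 6, 5),   # intended "high" questions
]


def dist(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def k_means(points, centers, iters=100):
    """Lloyd-style k-Means: each center becomes the mean of its members."""
    centers = [tuple(float(v) for v in c) for c in centers]
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def k_medoids(points, medoids, iters=100):
    """Alternating k-Medoids: each center becomes the member with the
    smallest total distance to the other members of its cluster."""
    medoids = list(medoids)
    for _ in range(iters):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(len(medoids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_medoids.append(
                    min(members, key=lambda c: sum(dist(c, m) for m in members)))
            else:
                new_medoids.append(medoids[j])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Assumed initialisation: one starting point taken from each intended group.
START = [QUESTIONS[0], QUESTIONS[3], QUESTIONS[6]]
centroids, kmeans_labels = k_means(QUESTIONS, START)
medoids, kmedoids_labels = k_medoids(QUESTIONS, START)

print("k-Means centroids: ", centroids)
print("k-Medoids medoids: ", medoids)
```

The key design difference is visible in the update step: a k-Means centroid is an average and need not coincide with any real question, whereas a medoid is always one of the questions in the data. On this small, well-separated toy set both methods recover the same three groups; differences of the kind reported in the summary would only be expected on real data.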