Algorithm Development of Bidirectional Agglomerative Hierarchical Clustering Using AVL Tree with Visualization
Summary: | In recent years, the dramatic rise in internet use and the general improvement in
technology have transformed societies into ones that depend strongly on
information and knowledge. The growth of information resources, along with the
accelerating rate of technological change, has produced massive amounts of data and
information that often exceed the ability to handle and manage them. Therefore, there
is now a demand for faster approaches to handling voluminous data, approaches that
improve on the time complexity of traditional hierarchical methods when facing huge
collections of data and a growing flood of information. In addition, user involvement in
data mining is needed, whereby the user interacts with the process by
exploiting the power of human sight and the human brain for analyzing and
exploring data. Clustering is an analysis technique for discovering interesting
distributions and patterns in the data set. The objects within a cluster are more similar
to each other than the objects in different clusters. This research proposed a
bidirectional agglomerative hierarchical clustering algorithm. The proposed algorithm
is fundamentally similar to conventional agglomerative hierarchical clustering
algorithms designed to partition a collection of objects into subsets sharing similar
attributes. Analyzing large data sets with traditional methods has
moved from being tedious to being computationally expensive. The traditional methods
are usually not scalable to very large data sets, with an O(n²) computational cost. However,
the proposed algorithm adapts an AVL tree approach to cluster the objects to the left and
right of the median/root. The computational cost is significantly reduced to O(log n), which
is efficient for huge amounts of data. Thus, clustering using the bidirectional hierarchical
approach will reduce computational cost. This research demonstrated the
agglomerative algorithm's performance based on complexity parameters such as
execution time and the number of clusters needed to merge all data points/objects into
one cluster. As part of the experimental validation, real data sets were used to measure
the effectiveness and efficiency of the proposed algorithm. The study shows a
73.4% improvement over the traditional approach. The demand for visual and
interactive analysis tools is particularly pressing in this information age, where the user
needs to analyze and observe large amounts of data to grasp valuable knowledge. This
research also proposed a visual cluster approach to visualize the knowledge extracted
by the data mining algorithm using the AVL tree approach. The visualization prototype was
evaluated by postgraduate students, who were interviewed using the Technology
Acceptance Model as the instrument. The results revealed that the visualization is useful,
easy to use, and gives user satisfaction. |
---|
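
The summary does not spell out the algorithm, so the following is only a minimal Python sketch of one possible reading of "cluster the objects to the left and right of the median/root", assuming one-dimensional numeric data. The function name `bidirectional_merge_order` is hypothetical, a sorted list stands in for the in-order traversal of an AVL tree, and the sketch does not attempt to reproduce the reported O(log n) cost or the 73.4% improvement.

```python
from typing import List, Tuple


def bidirectional_merge_order(values: List[float]) -> List[Tuple[float, str]]:
    """Grow one cluster outward from the median, merging left or right each step.

    Illustrative assumption only: this is not the thesis author's algorithm.
    """
    pts = sorted(values)  # stand-in for an AVL tree's in-order traversal
    if not pts:
        return []

    mid = len(pts) // 2           # the median plays the role of the tree root
    left, right = mid - 1, mid + 1
    lo = hi = pts[mid]            # current cluster boundaries
    order = [(pts[mid], "root")]

    while left >= 0 or right < len(pts):
        # Gap between the cluster boundary and the next candidate on each side.
        d_left = lo - pts[left] if left >= 0 else float("inf")
        d_right = pts[right] - hi if right < len(pts) else float("inf")

        if d_left <= d_right:     # ties are resolved toward the left side
            lo = pts[left]
            order.append((lo, "left"))
            left -= 1
        else:
            hi = pts[right]
            order.append((hi, "right"))
            right += 1

    return order
```

For example, `bidirectional_merge_order([10, 1, 4, 7, 20])` starts at the median 7 and then absorbs 4, 1, 10, and 20 in that order. After the initial sort, each merge step inspects only the two boundary candidates, which is the intuition behind avoiding the pairwise distance comparisons of the conventional approach.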