Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream
As the number of connected devices rises, real-time processing of data streams has garnered significant attention and interest within the scientific community. Clustering, known for its versatility in real-time data stream processing and independence from labeled instances, is a suitable method for...
Main Author: | Abdulateef, Alaa Fareed |
---|---|
Format: | Thesis |
Language: | eng eng eng |
Published: | 2023 |
Subjects: | QA Mathematics |
Online Access: | https://etd.uum.edu.my/11343/1/permission%20to%20deposit-not%20allow-s902134_0001.pdf https://etd.uum.edu.my/11343/2/s902134_01.pdf https://etd.uum.edu.my/11343/3/s902134_02.pdf |
id | my-uum-etd.11343 |
---|---|
record_format | uketd_dc |
spelling | my-uum-etd.11343 2024-10-06T15:51:05Z Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream 2023 Abdulateef, Alaa Fareed Yusof, Yuhanis Yasin, Azman Awang Had Salleh Graduate School of Arts & Sciences Awang Had Salleh Graduate School of Art & Sciences QA Mathematics As the number of connected devices rises, real-time processing of data streams has garnered significant attention within the scientific community. Clustering, known for its versatility in real-time data stream processing and independence from labeled instances, is a suitable method for analyzing evolving data streams. Existing clustering algorithms for outlier detection encounter significant challenges due to insufficient data pre-processing methods and the absence of a suitable data summarization framework for effective data stream clustering. This research introduces the Adaptive Grid-Meshed-Buffer Stream Clustering Algorithm (AGMB), which addresses these weaknesses and improves outlier detection. The AGMB algorithm is built upon three algorithms: 1) Grid-Multi-Buffer Stream Clustering (GMBSC), 2) Cautious Grid-Multi-Buffer Stream Clustering (C-GMBSC) and 3) Adaptive-Grid-Multi-Buffer Stream Clustering (A-GMBSC). The GMBSC addresses inadequate data pre-processing, grid projection, and buffering issues, while the C-GMBSC includes an outlier elimination strategy that cautiously eliminates detected outlier data points before they turn into normal data. The A-GMBSC adds an adaptive density threshold, which maintains a cluster model of normal data. The effectiveness of the final outcome, the AGMB, is validated through an experimental evaluation conducted on eight synthetic and real-life datasets. The evaluation used various metrics related to outlier detection and clustering quality. The results indicate that the AGMB algorithm outperformed existing benchmark algorithms on the predefined evaluation criteria, with an overall accuracy of 72% compared with only 11% for the benchmark algorithms. Hence, the empirical evidence highlights the superiority and practical relevance of the proposed algorithm in tackling outlier detection in evolving data streams. This may be useful for real-world applications such as IoT-based surveillance systems and customer behaviour analytics systems. 2023 Thesis https://etd.uum.edu.my/11343/ https://etd.uum.edu.my/11343/1/permission%20to%20deposit-not%20allow-s902134_0001.pdf text eng staffonly https://etd.uum.edu.my/11343/2/s902134_01.pdf text eng staffonly https://etd.uum.edu.my/11343/3/s902134_02.pdf text eng staffonly other doctoral Universiti Utara Malaysia |
institution | Universiti Utara Malaysia |
collection | UUM ETD |
language | eng eng eng |
advisor | Yusof, Yuhanis; Yasin, Azman |
topic | QA Mathematics |
spellingShingle | QA Mathematics Abdulateef, Alaa Fareed Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream |
description | As the number of connected devices rises, real-time processing of data streams has garnered significant attention within the scientific community. Clustering, known for its versatility in real-time data stream processing and independence from labeled instances, is a suitable method for analyzing evolving data streams. Existing clustering algorithms for outlier detection encounter significant challenges due to insufficient data pre-processing methods and the absence of a suitable data summarization framework for effective data stream clustering. This research introduces the Adaptive Grid-Meshed-Buffer Stream Clustering Algorithm (AGMB), which addresses these weaknesses and improves outlier detection. The AGMB algorithm is built upon three algorithms: 1) Grid-Multi-Buffer Stream Clustering (GMBSC), 2) Cautious Grid-Multi-Buffer Stream Clustering (C-GMBSC) and 3) Adaptive-Grid-Multi-Buffer Stream Clustering (A-GMBSC). The GMBSC addresses inadequate data pre-processing, grid projection, and buffering issues, while the C-GMBSC includes an outlier elimination strategy that cautiously eliminates detected outlier data points before they turn into normal data. The A-GMBSC adds an adaptive density threshold, which maintains a cluster model of normal data. The effectiveness of the final outcome, the AGMB, is validated through an experimental evaluation conducted on eight synthetic and real-life datasets. The evaluation used various metrics related to outlier detection and clustering quality. The results indicate that the AGMB algorithm outperformed existing benchmark algorithms on the predefined evaluation criteria, with an overall accuracy of 72% compared with only 11% for the benchmark algorithms. Hence, the empirical evidence highlights the superiority and practical relevance of the proposed algorithm in tackling outlier detection in evolving data streams. This may be useful for real-world applications such as IoT-based surveillance systems and customer behaviour analytics systems. |
format | Thesis |
qualification_name | other |
qualification_level | Doctorate |
author | Abdulateef, Alaa Fareed |
author_facet | Abdulateef, Alaa Fareed |
author_sort | Abdulateef, Alaa Fareed |
title | Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream |
title_short | Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream |
title_full | Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream |
title_fullStr | Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream |
title_full_unstemmed | Adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream |
title_sort | adaptive grid-meshed-buffer clustering algorithm for outlier detection in evolving data stream |
granting_institution | Universiti Utara Malaysia |
granting_department | Awang Had Salleh Graduate School of Arts & Sciences |
publishDate | 2023 |
url | https://etd.uum.edu.my/11343/1/permission%20to%20deposit-not%20allow-s902134_0001.pdf https://etd.uum.edu.my/11343/2/s902134_01.pdf https://etd.uum.edu.my/11343/3/s902134_02.pdf |
_version_ | 1813495770945421312 |
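
The description mentions grid projection and an adaptive density threshold but gives no algorithmic detail. The following is a minimal illustrative sketch of those two general ideas only; it is not the AGMB, GMBSC, C-GMBSC, or A-GMBSC algorithm from the thesis. The class name `GridDensitySketch`, its parameters (`cell_size`, `factor`, `warmup`), and the thresholding rule (a fraction of the mean occupancy of non-empty cells) are all assumptions made for this example.

```python
import math
from collections import defaultdict


class GridDensitySketch:
    """Toy grid-based stream outlier detector (hypothetical, not AGMB)."""

    def __init__(self, cell_size=1.0, factor=0.5, warmup=20):
        self.cell_size = cell_size      # assumed: side length of each grid cell
        self.factor = factor            # assumed: fraction of mean density required
        self.warmup = warmup            # assumed: points absorbed before flagging starts
        self.counts = defaultdict(int)  # grid cell -> number of points seen so far
        self.seen = 0                   # total number of points seen so far

    def cell_of(self, point):
        # Grid projection: map each coordinate to an integer cell index.
        return tuple(math.floor(x / self.cell_size) for x in point)

    def threshold(self):
        # Adaptive density threshold: tracks the mean occupancy of non-empty cells,
        # so it moves as the stream evolves instead of being fixed in advance.
        if not self.counts:
            return 0.0
        return self.factor * self.seen / len(self.counts)

    def update(self, point):
        """Insert a point and return True if its cell is currently sparse."""
        cell = self.cell_of(point)
        self.counts[cell] += 1
        self.seen += 1
        if self.seen <= self.warmup:
            return False
        return self.counts[cell] < self.threshold()


# Usage: 30 points that all fall in one dense cell, then one far-away point.
det = GridDensitySketch(cell_size=1.0, factor=0.5, warmup=20)
flags = [det.update((0.1 * (i % 10), 0.05 * (i % 10))) for i in range(30)]
is_outlier = det.update((10.5, 10.5))
```

In this sketch the far-away point lands in a cell holding a single point, well below the threshold derived from the dense cell, so it is flagged. A fuller design would also need the buffering and cautious outlier elimination steps that the description attributes to the thesis, which are not modelled here.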