An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents

Event Detection (ED) works on identifying events from various types of data. Building an ED model for news text documents greatly helps decision-makers in various disciplines in improving their strategies. However, identifying and summarizing events from such data is a non-trivial task due to the la...

Bibliographic Details
Main Author: Al-Dyani, Wafa Zubair Abdullah
Format: Thesis
Language: eng
Published: 2022
Subjects:
Online Access: https://etd.uum.edu.my/10228/1/s901775_01.pdf
https://etd.uum.edu.my/10228/2/s901775_02.pdf
id my-uum-etd.10228
record_format uketd_dc
spelling my-uum-etd.10228 2023-01-16T03:51:34Z An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents 2022 Al-Dyani, Wafa Zubair Abdullah Kabir Ahmad, Farzana Kamaruddin, Siti Sakira Awang Had Salleh Graduate School of Arts & Sciences QA Mathematics 2022 Thesis https://etd.uum.edu.my/10228/ https://etd.uum.edu.my/10228/1/s901775_01.pdf text eng 2025-04-25 staffonly https://etd.uum.edu.my/10228/2/s901775_02.pdf text eng public other doctoral Universiti Utara Malaysia
institution Universiti Utara Malaysia
collection UUM ETD
language eng
advisor Kabir Ahmad, Farzana
Kamaruddin, Siti Sakira
topic QA Mathematics
spellingShingle QA Mathematics
Al-Dyani, Wafa Zubair Abdullah
An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents
description Event Detection (ED) identifies events from various types of data. Building an ED model for news text documents helps decision-makers in various disciplines improve their strategies. However, identifying and summarizing events from such data is a non-trivial task because of the large volume of published heterogeneous news text documents. Such documents create a high-dimensional feature space that affects the overall performance of the baseline methods in an ED model. To address this problem, this research presents an enhanced ED model that includes improved methods for the crucial phases of the ED model, such as Feature Selection (FS), ED, and summarization. This work focuses on the FS problem by automatically detecting events through a novel wrapper FS method based on the Adapted Binary Bat Algorithm (ABBA) and the Adapted Markov Clustering Algorithm (AMCL), termed ABBA-AMCL. These adaptive techniques were developed to overcome premature convergence in the Binary Bat Algorithm (BBA) and the fast convergence rate of the Markov Clustering Algorithm (MCL). Furthermore, this study proposes four summarization methods to generate informative summaries. The enhanced ED model was tested on 10 benchmark datasets and 2 Facebook news datasets. The effectiveness of ABBA-AMCL was compared with 8 FS methods based on meta-heuristic algorithms and 6 graph-based ED methods. The empirical and statistical results showed that ABBA-AMCL surpassed the other methods on most datasets. The key representative features demonstrated that the ABBA-AMCL method successfully detects real-world events from the Facebook news datasets, with a Precision of 0.96 and a Recall of 1 for dataset 11, and a Precision of 1 and a Recall of 0.76 for dataset 12. To conclude, the novel ABBA-AMCL presented in this research has successfully bridged the research gap and resolved the curse of dimensionality in the high-dimensional feature space of heterogeneous news text documents. Hence, the enhanced ED model can organize news documents into distinct events and provide policymakers with valuable information for decision making.
format Thesis
qualification_name other
qualification_level Doctorate
author Al-Dyani, Wafa Zubair Abdullah
title An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents
title_sort enhanced binary bat and markov clustering algorithms to improve event detection for heterogeneous news text documents
granting_institution Universiti Utara Malaysia
granting_department Awang Had Salleh Graduate School of Arts & Sciences
publishDate 2022
url https://etd.uum.edu.my/10228/1/s901775_01.pdf
https://etd.uum.edu.my/10228/2/s901775_02.pdf
_version_ 1776103770007011328