An enhanced term weighting scheme method of identifying and extracting terms for ontology learning and development

Social media is crucial in facilitating the Disaster Management (DM) communication process. However, the knowledge representation of DM Social Media (DMSM) is inadequate, especially in ontology representation. Given the huge volume of unstructured DMSM text, information extraction for ontology develop...

Bibliographic Details
Main Author: Muhammad, Mahmud
Format: Thesis
Language: eng
Published: 2023
Subjects:
Online Access: https://etd.uum.edu.my/10738/1/kebenaran%20mendeposit-membenarkan-s95772.pdf
https://etd.uum.edu.my/10738/2/s95772_01.pdf
id my-uum-etd.10738
record_format uketd_dc
institution Universiti Utara Malaysia
collection UUM ETD
language eng
advisor Yasin, Azman
Omar, Mazni
topic B Philosophy (General)
description Social media is crucial in facilitating the Disaster Management (DM) communication process. However, the knowledge representation of DM Social Media (DMSM) is inadequate, especially in ontology representation. Given the huge volume of unstructured DMSM text, information extraction for ontology development is achieved through text mining. However, existing work on text mining-based ontology development utilizes a well-known unsupervised scheme, TF-IDF, which ignores document distribution and leads to high feature dimensionality. The main objectives of the study are to improve ontology development by enhancing a supervised term weighting scheme (TWS) and to develop a DMSM ontology. The enhancement is achieved by identifying the existing supervised TWS and giving higher weightage to the positive category instead of the negative one, which results in the removal of irrelevant terms. The study is conducted by gathering DMSM scientific publications, performing pre-processing, and calculating the eight selected supervised TWS. All the schemes assigned high weightage to the negative category instead of the positive category. An enhancement is performed by introducing a positive term frequency ratio and a positive category ratio, whereby the enhanced schemes extract terms relevant to the positive category. The DMSM ontology is generated and evaluated using a gold-standard-based evaluation method that covers syntactic comparison, ontology design, and evaluation of the learned ontology. The results show good scores for TF.IDFEC-based.Enhanced and TF.RF.Enhanced, with 93.33% and 91.03% for precision, 80.8% and 78.02% for recall, and 0.87 and 0.84 for F-measure, respectively. Theoretically, this study contributes an enhanced supervised TWS that emphasizes the classification information of a corpus, which reduces feature dimensionality and boosts the importance of words that are distributed between the positive and negative classes. Practically, the enhanced scheme provides an improved technique for ontology developers to extract relevant terms from unstructured scientific publication text, especially for the DMSM domain.
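The description reports precision, recall, and F-measure, and refers to supervised schemes such as TF.RF that weight a term by how it is distributed across the positive and negative categories. The Python sketch below is illustrative only. It assumes the commonly cited relevance frequency factor, rf = log2(2 + a / max(1, c)), where a and c are the numbers of positive-category and negative-category documents containing the term. It does not reproduce the thesis's enhanced schemes, whose formulas are not given in this record, and the function names are hypothetical. It also checks that the reported F-measures follow from the reported precision and recall under the usual harmonic-mean definition.

```python
import math


def relevance_frequency(pos_docs_with_term: int, neg_docs_with_term: int) -> float:
    """Baseline rf factor (assumed standard form, not the thesis's enhancement).

    a = positive-category documents containing the term,
    c = negative-category documents containing the term.
    """
    return math.log2(2 + pos_docs_with_term / max(1, neg_docs_with_term))


def tf_rf(term_frequency: float, pos_docs_with_term: int, neg_docs_with_term: int) -> float:
    """Weight of a term in one document: raw term frequency times rf."""
    return term_frequency * relevance_frequency(pos_docs_with_term, neg_docs_with_term)


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the description, written as fractions.
reported = {
    "TF.IDFEC-based.Enhanced": (0.9333, 0.808),
    "TF.RF.Enhanced": (0.9103, 0.7802),
}

for scheme, (precision, recall) in reported.items():
    print(f"{scheme}: F-measure = {f_measure(precision, recall):.2f}")
```

Rounded to two decimals, the computed values are 0.87 and 0.84, matching the figures in the description. In the rf factor, a term that appears only in negative-category documents gets rf = 1, so its weight is never boosted above its raw term frequency.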
format Thesis
qualification_name other
qualification_level Doctorate
author Muhammad, Mahmud
title An enhanced term weighting scheme method of identifying and extracting terms for ontology learning and development
granting_institution Universiti Utara Malaysia
granting_department Awang Had Salleh Graduate School of Arts & Sciences
publishDate 2023
url https://etd.uum.edu.my/10738/1/kebenaran%20mendeposit-membenarkan-s95772.pdf
https://etd.uum.edu.my/10738/2/s95772_01.pdf