Concept And Relation Extraction Framework For Ontology Learning

Bibliographic Details
Main Author: Al-Aswadi, Fatima Nadeem Salem
Format: Thesis
Language: English
Published: 2023
Online Access: http://eprints.usm.my/60149/1/FATIMA%20NADEEM%20SALEM%20AL-ASWADI%20-%20TESIS%20cut.pdf
Description
Summary: Extracting valuable knowledge and representing it in a machine-understandable form is considered one of the main challenges in the semantic web and knowledge engineering fields. The explosive growth of textual data is coupled with an increasing demand for ontologies. Ontology Learning (OL) from text is a process that aims to automatically or semi-automatically extract knowledge from text and represent it in a machine-understandable form. An ontology is a core scheme that represents knowledge as a set of concepts and their relationships within a domain. Extracting the concepts and their relations is the backbone of an OL system. Existing OL systems have several limitations: they are inefficient at extracting relevant concepts, especially for large-length datasets; they depend on a large number of predefined patterns to extract relations; and they extract very few types of relations. In this thesis, a framework called the Concept and Relation Extraction Framework (CREF) is proposed. It consists of four main stages: enhancing the pre-processing method, developing a methodology for the concept extraction task, proposing a new text representation approach for relations, and improving the relation extraction method. The first stage proposes a new set of Concept Extraction stopwords (CE-stopwords) for scientific publications, while the second stage introduces a new Domain Time Relevance (DTR) metric and proposes a Developed Concept Extraction Method based on DTR (DTR-DCEM).