Effectiveness of simple terminological ontology to support document retrieval in a specialized domain / Seyed Abolfazle Moosavifar

The research investigated the proposition that a simple terminological ontology supported by general purpose lexical resources and aided by information retrieval and natural language processing techniques can effectively annotate and retrieve documents in a specialised knowledge domain. This is addr...

Bibliographic Details
Main Author: Moosavifar, Seyed Abolfazle
Format: Thesis
Language: English
Published: 2014
Subjects: Database management
Online Access: https://ir.uitm.edu.my/id/eprint/25805/2/25805.pdf
id my-uitm-ir.25805
record_format uketd_dc
institution Universiti Teknologi MARA
collection UiTM Institutional Repository
language English
topic Database management
description The research investigated the proposition that a simple terminological ontology, supported by general-purpose lexical resources and aided by information retrieval and natural language processing techniques, can effectively annotate and retrieve documents in a specialised knowledge domain. This addresses evidence from a recent survey, which reported low satisfaction with the retrieval of documents in personal collections. A common but robust approach in this area is keyword-based retrieval. The weakness of keyword-based retrieval is its inability to ‘understand’ the meaning (semantics) of the keywords. The ontology approach is introduced as a way to support semantic retrieval. However, constructing an ontology is difficult for laymen, especially for specialised domain areas. Therefore, the use of a simple terminological ontology (constructed from an intuitive understanding of the domain) is proposed in this research. The research objectives are to introduce new algorithms for ontology-based automatic annotation, retrieval and ranking of documents, and to check the reliability of WordNet in providing lexical support for (simple terminological) ontology-based document retrieval. To achieve these objectives, the Boolean IR model was extended by incorporating four coefficients to adjust the term weights: two to deal with word significance and word coherence in multi-word terms, one to account for the matching type (exact or synonym), and one to factor in the category weight. To determine retrieval effectiveness, the results of ontology-based retrieval were evaluated against conventional retrieval and validated against expert retrieval. The results of the ontology-based automatic annotation were evaluated against expert annotation. In addition, the reliability of using WordNet to provide lexical support was tested during the annotation and retrieval process.
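The four coefficients named above can be illustrated with a minimal Python sketch. The abstract does not give the formulas or values, so everything here is an assumption made for illustration: the multiplicative combination, the `MATCH_WEIGHT` values, the 0.5 coherence penalty, and the names `OntologyTerm`, `word_significance`, `word_coherence`, `term_weight` and `rank` are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass

# Assumed values: the abstract names the matching type (exact or synonym)
# as a factor but does not say how it is weighted.
MATCH_WEIGHT = {"exact": 1.0, "synonym": 0.7}


@dataclass
class OntologyTerm:
    words: tuple            # the term as a tuple of words, e.g. ("query", "expansion")
    category_weight: float  # weight of the ontology category the term belongs to
    synonyms: tuple = ()    # each synonym is itself a tuple of words


def word_significance(term_words, tokens):
    """Fraction of the term's words that occur anywhere in the document."""
    present = sum(1 for w in term_words if w in tokens)
    return present / len(term_words)


def word_coherence(term_words, tokens):
    """1.0 if the words appear as a contiguous phrase, otherwise 0.5 (assumed penalty)."""
    n = len(term_words)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) == tuple(term_words):
            return 1.0
    return 0.5


def term_weight(term, tokens):
    """Best weight over the exact form and every synonym of the term."""
    best = 0.0
    candidates = (("exact", (term.words,)), ("synonym", term.synonyms))
    for match_type, forms in candidates:
        for words in forms:
            weight = (word_significance(words, tokens)
                      * word_coherence(words, tokens)
                      * MATCH_WEIGHT[match_type]
                      * term.category_weight)
            best = max(best, weight)
    return best


def rank(docs, terms):
    """Score each document by summed term weights; drop zero scores; sort descending."""
    scores = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        scores[doc_id] = sum(term_weight(t, tokens) for t in terms)
    ranked = [(d, s) for d, s in scores.items() if s > 0]
    return sorted(ranked, key=lambda pair: -pair[1])
```

Under these assumptions, a document containing the phrase "query expansion" outranks one that mentions both words separately, and a synonym match contributes less than an exact match in the same category.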
The research found that synonyms from WordNet, selected with the correct senses, can help improve (simple terminological) ontology-based annotation and retrieval of documents in a specialised domain area. The research also found that (simple terminological) ontology-based retrieval supported by selected synonyms from WordNet can recall all documents that are retrieved using keyword-based retrieval, with reasonable precision. Evaluation of the retrieval with the help of domain experts also supported this result. The results also indicated that there are few common tags between the automatic and expert annotations. There were issues with the expert annotations; nonetheless, if we regard the expert annotation as paramount, then we suggest semi-automatic annotation of the documents in order to improve the results of ontology-based retrieval. Future researchers can use our research ideas (e.g. annotation and retrieval algorithms; assignment of weights to ontology terms) to make further progress in the field of semantic information retrieval. System designers can draw on our research findings (e.g. type of lexical support) to decide on methods for improving retrieval in personal collections.
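The comparisons described in the abstract (ontology-based against keyword-based or expert retrieval, and automatic against expert annotation) can be expressed with standard set-based measures. The sketch below is illustrative only: the abstract does not state which measures the thesis used, so precision, recall and Jaccard overlap are assumptions, and the function names and document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a reference set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def tag_overlap(auto_tags, expert_tags):
    """Jaccard overlap between automatic and expert tags (an assumed measure)."""
    auto, expert = set(auto_tags), set(expert_tags)
    union = auto | expert
    return len(auto & expert) / len(union) if union else 0.0


# Hypothetical example: the ontology-based run returns every document that
# the keyword-based run found, plus two more.
keyword_hits = {"d1", "d2"}
ontology_hits = {"d1", "d2", "d3", "d4"}
precision, recall = precision_recall(ontology_hits, keyword_hits)
```

In this example the recall against the keyword-based results is 1.0 and the precision is 0.5, which is the kind of relationship the abstract describes. A low `tag_overlap` value would correspond to the finding that automatic and expert annotations share few tags.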
format Thesis
qualification_level Master's degree
author Moosavifar, Seyed Abolfazle
title Effectiveness of simple terminological ontology to support document retrieval in a specialized domain / Seyed Abolfazle Moosavifar
granting_institution Universiti Teknologi MARA
granting_department Faculty of Computer and Mathematical Sciences
publishDate 2014
url https://ir.uitm.edu.my/id/eprint/25805/2/25805.pdf