Incorporating stemming algorithm in the Malay information retrieval that employs Thesaurus approach / Mohd Rosmadi Mokhtar

This project incorporates the ROA stemming algorithm with the thesaurus approach by Rapizal. It is an opportunity to find out whether combining stemming with a thesaurus will improve retrieval effectiveness and efficiency. Advances in information technology have made it possible for a wide range of text-bas...

Bibliographic Details
Main Author: Mokhtar, Mohd Rosmadi
Format: Thesis
Language: English
Published: 2001
Subjects:
Online Access: https://ir.uitm.edu.my/id/eprint/98015/1/98015.pdf
id my-uitm-ir.98015
record_format uketd_dc
institution Universiti Teknologi MARA
collection UiTM Institutional Repository
language English
advisor Abu Bakar, Zainab
topic Analysis
description This project incorporates the ROA stemming algorithm with the thesaurus approach by Rapizal. It is an opportunity to find out whether combining stemming with a thesaurus improves retrieval effectiveness and efficiency. Advances in information technology have made it possible for a wide range of text-based information to be searched and retrieved online, locally or from remote hosts anywhere in the world. This popularity is due to technology that advances rapidly from day to day. Malay has many word variants that share the same meaning. To overcome this word-variant problem, computational techniques have been developed that transform both the user's search terms and the database words into a single canonical form. These are known as conflation methods. One well-known conflation method is the stemming algorithm, which is used to identify morphological variants. Stemming algorithms are language dependent. They have proven successful at reducing words with the same stem to a common form, as evidenced by the work of many researchers. Unfortunately, stemming cannot conflate different words that possess the same meaning. Such words can only be conflated by a thesaurus, which can handle hierarchic, synonymic, and morphological relationships. Creating a thesaurus for a given subject requires extensive manual effort and a high level of skill. Nevertheless, to solve this problem, the thesaurus, another language-dependent conflation method, is used, because it can represent all types of relationships that exist between words. The information retrieval thesaurus typically contains a list of terms, where a term is either a single word or a phrase. The relationships between the terms are also included to assist in coordinating indexing and retrieval.
The study found that incorporating the stemming algorithm with the thesaurus successfully increases the number of retrieved and relevant documents for Malay query words, but on the other hand reduces efficiency.
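The two conflation steps described above (stemming, then thesaurus lookup) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the affix lists are a toy subset and are not the ROA rule set, the one-entry thesaurus is hypothetical and is not Rapizal's, and the names `stem`, `conflate`, `build_index`, and `search` are invented for this sketch.

```python
import re

# Toy affix lists: illustrative assumptions only, NOT the ROA rule set.
PREFIXES = ("ber", "di", "ter")
SUFFIXES = ("kan", "an", "nya")

# Hypothetical mini-thesaurus: maps a stem to its preferred term.
THESAURUS = {"motokar": "kereta"}


def stem(word, min_len=3):
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    word = word.lower()
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def conflate(word):
    """Stem first, then map synonyms to one canonical term."""
    stemmed = stem(word)
    return THESAURUS.get(stemmed, stemmed)


def tokens(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Inverted index from conflated term to the set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for tok in tokens(text):
            index.setdefault(conflate(tok), set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents matching any conflated query term."""
    hits = set()
    for tok in tokens(query):
        hits |= index.get(conflate(tok), set())
    return sorted(hits)


DOCS = {
    1: "Dia berkereta ke pejabat",
    2: "Motokar itu baharu",
    3: "Buku itu baharu",
}
INDEX = build_index(DOCS)
print(search(INDEX, "keretanya"))  # prints [1, 2]
```

An exact-match search for "keretanya" would return nothing here. Stemming recovers document 1 and the thesaurus entry recovers document 2, at the cost of extra processing for every indexed and queried word, which is in line with the trade-off the abstract reports.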
format Thesis
qualification_level Bachelor degree
author Mokhtar, Mohd Rosmadi
title Incorporating stemming algorithm in the Malay information retrieval that employs Thesaurus approach / Mohd Rosmadi Mokhtar
granting_institution Universiti Teknologi MARA (UiTM)
granting_department Faculty of Information Technology and Quantitative Sciences
publishDate 2001
url https://ir.uitm.edu.my/id/eprint/98015/1/98015.pdf