To enhance existing Malay stemming algorithm starting with the letter 'D' / Mohd Nazril Hafez Mohd Supandi

This thesis concerns a Malay-language document retrieval system. A stemming algorithm, a database of translated Quran documents, and electronic root dictionaries are used in the study. The performance of a Malay stemming algorithm is tested on words beginning with 'd'...

Bibliographic Details
Main Author: Mohd Supandi, Mohd Nazril Hafez
Format: Thesis
Language: English
Published: 2000
Subjects: QA Mathematics
Online Access: https://ir.uitm.edu.my/id/eprint/98014/1/98014.pdf
id my-uitm-ir.98014
record_format uketd_dc
institution Universiti Teknologi MARA
collection UiTM Institutional Repository
language English
advisor Abu Bakar, Zainab
topic QA Mathematics
description This thesis concerns a Malay-language document retrieval system. A stemming algorithm, a database of translated Quran documents, and electronic root dictionaries are used in the study. The performance of a Malay stemming algorithm is tested on words beginning with 'd' in four experiments. The first uses the original data collections. The second adds new words to the dictionary; in addition, the total values for the 'a', 'k' and 'm' dictionaries are modified in the header file "dcvarnew.h". The third adds affix rule formats to "rule.txt". The fourth adds new code to differentiate between the affix rules "di+an" and "di+kan". The main objective is to minimize unstemmed, understemmed and overstemmed words, spelling exceptions, and other problems that occur when words beginning with 'd' are stemmed. The objective is achieved once the best rule order for stemming these words is found. This involves two rule-order combinations used together: 1234 as the primary combination and 2341 as the secondary. Every word is first stemmed with the 1234 combination; if the program cannot stem a word correctly, it switches to the secondary combination, 2341. These experiments can serve as a benchmark for future research on the Malay language. Furthermore, the work can help those who want to learn about particular subjects in the Al-Quran, since the document retrieval system automatically retrieves all relevant documents in response to users' queries.
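The primary/secondary rule-order idea in the description can be illustrated with a small Python sketch. It is not the thesis program: the abstract does not say what rule classes 1 to 4 stand for, so the mapping in `RULES`, the toy dictionary `ROOTS`, and the helper names are all assumptions made for illustration. The sketch applies the first rule that fits in the 1234 order, accepts the result only if it is a dictionary root, and otherwise retries in the 2341 order, which is also where "di+kan" is told apart from "di+an".

```python
# Toy root dictionary (hypothetical); the thesis uses electronic root dictionaries.
ROOTS = {"makan", "baca", "jalan", "dengar", "dagang", "datang"}

# Assumed meaning of rule classes 1-4 (the abstract does not define them).
# Each rule is a (prefix, suffix) pair to strip.
RULES = {
    "1": [("di", "")],     # prefix di- only
    "2": [("di", "kan")],  # di+kan
    "3": [("di", "an")],   # di+an
    "4": [("", "an")],     # suffix -an only
}


def strip_affixes(word, prefix, suffix):
    """Return word without prefix/suffix, or None if the rule does not fit."""
    if len(word) <= len(prefix) + len(suffix):
        return None
    if word.startswith(prefix) and word.endswith(suffix):
        return word[len(prefix):len(word) - len(suffix)]
    return None


def apply_first_rule(word, order):
    """Apply the first rule that fits, trying rule classes in the given order."""
    for rule_class in order:
        for prefix, suffix in RULES[rule_class]:
            candidate = strip_affixes(word, prefix, suffix)
            if candidate is not None:
                return candidate
    return None


def stem(word, primary="1234", secondary="2341"):
    """Stem with the primary order; fall back to the secondary order
    when the primary result is not a dictionary root."""
    if word in ROOTS:
        return word
    for order in (primary, secondary):
        candidate = apply_first_rule(word, order)
        if candidate in ROOTS:
            return candidate
    return word  # left unstemmed


for w in ["dimakan", "dibacakan", "dijalankan", "dagangan"]:
    print(w, "->", stem(w))
```

In this sketch "dimakan" is resolved by the primary order (stripping only "di" gives the root "makan"), whereas "dibacakan" fails under 1234 because stripping "di" leaves "bacakan", and it succeeds under 2341, where the "di+kan" rule is tried first.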
format Thesis
qualification_level Bachelor degree
author Mohd Supandi, Mohd Nazril Hafez
title To enhance existing Malay stemming algorithm starting with the letter 'D' / Mohd Nazril Hafez Mohd Supandi
granting_institution Universiti Teknologi MARA (UiTM)
granting_department Faculty of Information Technology and Quantitative Sciences