The study of stemming algorithm on Malay words that begin with alphabets P, Q, Y, and Z from the translated Al-Quran / Suriani Mat

This thesis concerns a retrieval system for Malay-language documents. The study uses a stemming algorithm, Malay translations of the Quran and root-word dictionaries. The performance of the Malay stemming algorithm is tested on words beginning with the letters 'p', 'q'...

Bibliographic Details
Main Author: Mat, Suriani
Format: Thesis
Language: English
Published: 2001
Subjects: Analysis
Online Access: https://ir.uitm.edu.my/id/eprint/98195/1/98195.pdf
id my-uitm-ir.98195
record_format uketd_dc
institution Universiti Teknologi MARA
collection UiTM Institutional Repository
language English
advisor Abu Bakar, Zainab
topic Analysis
description This thesis concerns a retrieval system for Malay-language documents. The study uses a stemming algorithm, Malay translations of the Quran and root-word dictionaries. The performance of the Malay stemming algorithm is tested on words beginning with the letters 'p', 'q', 'y' and 'z' in five experiments. The first experiment uses the original data collection. In the second experiment, new words are added to the dictionary and the total values for 'i', 'm', 'p', 'q', 'y' and 'z' in the header file "dcvarnew.h" are modified accordingly; in addition, affix rule formats are added to the file "rule.txt" and misspelled words are corrected. In the third experiment, the positions of the rules in "rule.txt" are changed. In the fourth experiment, words that have more than one root, words in the old spelling and spoken words are deleted from the dictionary, after which the total values for 'k', 'm', 'n' and 'p' in "dcvarnew.h" are corrected again and new code is added to the module 'ubahejaan'. In the fifth experiment, the spoken word is deleted from the dictionary, the total value for 'p' in "dcvarnew.h" is corrected, and an alternative rule is introduced to handle the words pengawal, pengawalan and perangan. The objective of the project is achieved once the best rule order for stemming words beginning with 'p', 'q', 'y' and 'z' is found. This order uses two combinations together: 1234 as the primary combination and 3124 as the secondary one. Every word is first stemmed with the 1234 combination; if the program cannot stem the word correctly, it switches to the 3124 combination. These experiments can serve as a benchmark for future research on the Malay language.
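The primary/secondary rule-order idea in the description can be illustrated with a minimal sketch. It is not the thesis code: the record does not say what the digits 1 to 4 stand for, so the sketch assumes they index four hypothetical rule groups (prefix-suffix pair, prefix, suffix, and an empty placeholder group). The root dictionary, the affix rules and all function names are toy assumptions, and Python is used only for brevity.

```python
# Toy root dictionary and affix rules: illustrative assumptions, not the thesis data.
ROOTS = {"makan", "pesan"}

# Each rule is (prefix, suffix); an empty string means "nothing to strip".
# The meaning given to groups 1-4 here is hypothetical.
RULE_GROUPS = {
    1: [("pe", "an")],  # prefix + suffix pair
    2: [("pe", "")],    # prefix only
    3: [("", "an")],    # suffix only
    4: [],              # placeholder group, left empty in this sketch
}

PRIMARY_ORDER = (1, 2, 3, 4)    # the "1234" combination
SECONDARY_ORDER = (3, 1, 2, 4)  # the "3124" combination


def strip_affixes(word, prefix, suffix):
    """Return word without prefix/suffix, or None if the rule does not apply."""
    if len(word) <= len(prefix) + len(suffix):
        return None
    if not (word.startswith(prefix) and word.endswith(suffix)):
        return None
    return word[len(prefix):len(word) - len(suffix)]


def stem_with_order(word, order):
    """Apply the first matching rule found when scanning groups in `order`.

    Returns the root if the stripped form is in ROOTS, otherwise None.
    """
    if word in ROOTS:
        return word
    for group in order:
        for prefix, suffix in RULE_GROUPS[group]:
            candidate = strip_affixes(word, prefix, suffix)
            if candidate is not None:
                # The first applicable rule decides the outcome for this order.
                return candidate if candidate in ROOTS else None
    return None


def stem(word):
    """Try the primary order, then the secondary order; else return word unchanged."""
    for order in (PRIMARY_ORDER, SECONDARY_ORDER):
        root = stem_with_order(word, order)
        if root is not None:
            return root
    return word
```

With these toy rules, `stem("pemakanan")` is resolved by the primary order (the pair rule strips pe- and -an), whereas `stem("pesanan")` fails under 1234 because the pair rule leaves "san", and is only resolved after the switch to 3124, where the suffix rule runs first. A word that no rule resolves is returned unchanged.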
format Thesis
qualification_level Bachelor degree
author Mat, Suriani
title The study of stemming algorithm on Malay words that begin with alphabets P, Q, Y, and Z from the translated Al-Quran / Suriani Mat
granting_institution Universiti Teknologi MARA (UiTM)
granting_department Faculty of Computer and Mathematical Sciences
publishDate 2001
url https://ir.uitm.edu.my/id/eprint/98195/1/98195.pdf