To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan

This thesis studies a Malay stemming algorithm applied to words beginning with the letter "S". The test document is a Malay translation of the Quran. A Malay stemming algorithm known as Rules-Application-Order (RAO) is applied in the exp...

Bibliographic Details
Main Author: Jantan, Rohana
Format: Thesis
Language: English
Published: 2000
Subjects:
Online Access: https://ir.uitm.edu.my/id/eprint/98222/1/98222.PDF
id my-uitm-ir.98222
record_format uketd_dc
spelling my-uitm-ir.98222 2024-08-05T03:41:48Z To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan 2000 Jantan, Rohana Malaysia This thesis studies a Malay stemming algorithm applied to words beginning with the letter "S". The test document is a Malay translation of the Quran. A Malay stemming algorithm known as Rules-Application-Order (RAO) is applied in the experiments, together with dictionaries of Malay root words and combinations of morphological rules. The performance of the algorithm is evaluated by applying it to the "S" words and removing different combinations of prefixes. Each "S" word, or the stemmed word that results, is checked against the dictionaries. If the word exists there, stemming stops for that word. The resulting words are then analyzed, and the percentages for each combination are compared to find the best combination of prefixes. The results show that overstemming, understemming and unstemmed words remain a problem. Of a total of 411 unique "S" words, 0.73% are overstemmed, 0.73% are understemmed and 2.68% are left unstemmed. The algorithm must therefore be modified to improve its stemming performance for Malay words. 2000 Thesis https://ir.uitm.edu.my/id/eprint/98222/ https://ir.uitm.edu.my/id/eprint/98222/1/98222.PDF text en public degree Universiti Teknologi MARA (UiTM) Faculty of Information Technology And Quantitative Sciences
institution Universiti Teknologi MARA
collection UiTM Institutional Repository
language English
topic Malaysia
spellingShingle Malaysia
Jantan, Rohana
To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan
description This thesis studies a Malay stemming algorithm applied to words beginning with the letter "S". The test document is a Malay translation of the Quran. A Malay stemming algorithm known as Rules-Application-Order (RAO) is applied in the experiments, together with dictionaries of Malay root words and combinations of morphological rules. The performance of the algorithm is evaluated by applying it to the "S" words and removing different combinations of prefixes. Each "S" word, or the stemmed word that results, is checked against the dictionaries. If the word exists there, stemming stops for that word. The resulting words are then analyzed, and the percentages for each combination are compared to find the best combination of prefixes. The results show that overstemming, understemming and unstemmed words remain a problem. Of a total of 411 unique "S" words, 0.73% are overstemmed, 0.73% are understemmed and 2.68% are left unstemmed. The algorithm must therefore be modified to improve its stemming performance for Malay words.
format Thesis
qualification_level Bachelor degree
author Jantan, Rohana
author_facet Jantan, Rohana
author_sort Jantan, Rohana
title To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan
title_short To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan
title_full To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan
title_fullStr To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan
title_full_unstemmed To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan
title_sort to study the performance of stemming algorithm on malay words beginning with the letter "s" / rohana jantan
granting_institution Universiti Teknologi MARA (UiTM)
granting_department Faculty of Information Technology And Quantitative Sciences
publishDate 2000
url https://ir.uitm.edu.my/id/eprint/98222/1/98222.PDF
_version_ 1811768897922138112
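
The dictionary-checked prefix removal described in the abstract can be illustrated with a minimal Python sketch. It is not the RAO rule set from the thesis: the prefix list, the tiny root dictionary, and the function names `stem` and `percentage` are hypothetical and chosen only for illustration. The last lines check that the reported percentages are consistent with 3, 3 and 11 words out of 411, which is an inference from the figures rather than a count stated in the record.

```python
# Illustrative sketch only; PREFIXES and ROOTS are assumed sample data,
# not the morphological rules or dictionaries used in the thesis.
PREFIXES = ["se"]                                   # assumed prefix rules, tried in order
ROOTS = {"buah", "kali", "orang", "sekolah", "satu"}  # assumed root-word dictionary


def stem(word, roots=ROOTS, prefixes=PREFIXES):
    """Return (result, status) for one word.

    The word is looked up first; if it is already a root, stemming stops.
    Otherwise each prefix is stripped in turn and the remainder is looked
    up. A word with no dictionary match is returned as "unstemmed".
    """
    word = word.lower()
    if word in roots:
        return word, "root"
    for prefix in prefixes:
        if word.startswith(prefix):
            candidate = word[len(prefix):]
            if candidate in roots:
                return candidate, "stemmed"
    return word, "unstemmed"


def percentage(count, total):
    """Share of `count` in `total`, as a percentage rounded to 2 places."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    for w in ["sebuah", "sekolah", "seorang", "sxyz"]:
        print(w, stem(w))
    # 3 of 411 and 11 of 411 reproduce the reported 0.73% and 2.68%.
    print(percentage(3, 411), percentage(11, 411))
```

Looking the word up before stripping anything is what keeps a root such as "sekolah" from losing its initial "se", which is one way overstemming can arise when a root happens to begin with a prefix string.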