Stemming Algorithm in Searching Malay Text
Stemming is one of the processes that can be used to improve the performance of a search engine. It reduces variant word forms to a common form. This project evaluates the retrieval effectiveness of a stemming algorithm in searching and retrieving relevant Malay Web pages based on the user's natural query w...
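As an illustration of what "reducing variant word forms to a common form" can look like for Malay, here is a minimal sketch of dictionary-checked affix stripping. It is not the algorithm evaluated in the thesis: the root list `ROOTS`, the affix lists `PREFIXES` and `SUFFIXES`, and the function name `stem` are hypothetical and far too small for real use, and the sketch ignores the spelling changes that some Malay prefixes cause.

```python
# Hypothetical, tiny root-word dictionary; a real stemmer needs a full lexicon.
ROOTS = {"makan", "ajar", "baca", "main"}

# Assumed affix lists for illustration only. The empty string means "no affix".
PREFIXES = ("", "mem", "me", "ber", "bel", "di", "pe", "pel", "ter")
SUFFIXES = ("", "kan", "an", "i")


def stem(word: str) -> str:
    """Return a dictionary root obtained by stripping one prefix and/or one
    suffix, or the lowercased word unchanged if no candidate is a known root."""
    word = word.lower()
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        for suffix in SUFFIXES:
            if not word.endswith(suffix):
                continue
            # Slice off the prefix and suffix; an overlapping pair simply
            # yields an empty candidate, which is never in ROOTS.
            candidate = word[len(prefix):len(word) - len(suffix)]
            if candidate in ROOTS:
                return candidate
    return word


if __name__ == "__main__":
    for w in ("makanan", "dimakan", "membaca", "pelajaran", "komputer"):
        print(w, "->", stem(w))
```

Checking each candidate against a root dictionary keeps a root such as "makan" from being cut further just because it happens to end in a suffix-like string.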
Main Author: | Rizauddin, Saian |
---|---|
Format: | Thesis |
Language: | eng |
Published: | 2004 |
Subjects: | QA76 Computer software |
Online Access: | https://etd.uum.edu.my/1409/1/RIZAUDDIN_B._SAIAN.pdf https://etd.uum.edu.my/1409/2/1.RIZAUDDIN_B._SAIAN.pdf |
id | my-uum-etd.1409 |
---|---|
record_format | uketd_dc |
institution | Universiti Utara Malaysia |
collection | UUM ETD |
language | eng |
topic | QA76 Computer software |
spellingShingle | QA76 Computer software Rizauddin, Saian Stemming Algorithm in Searching Malay Text |
description | Stemming is one of the processes that can be used to improve the performance of a search engine. It reduces variant word forms to a common form. This project evaluates the retrieval effectiveness of a stemming algorithm in searching and retrieving relevant Malay Web pages based on the user's natural query words. The retrieved Web pages are weighted and ranked using an Inverse Document Frequency function. Retrieval effectiveness is measured using standard recall and precision. The experiments show that searching with stemming improves retrieval effectiveness compared with searching without a stemming algorithm. |
format | Thesis |
qualification_name | masters |
qualification_level | Master's degree |
author | Rizauddin, Saian |
author_facet | Rizauddin, Saian |
author_sort | Rizauddin, Saian |
title | Stemming Algorithm in Searching Malay Text |
title_short | Stemming Algorithm in Searching Malay Text |
title_full | Stemming Algorithm in Searching Malay Text |
title_fullStr | Stemming Algorithm in Searching Malay Text |
title_full_unstemmed | Stemming Algorithm in Searching Malay Text |
title_sort | stemming algorithm in searching malay text |
granting_institution | Universiti Utara Malaysia |
granting_department | Faculty of Information Technology |
publishDate | 2004 |
url | https://etd.uum.edu.my/1409/1/RIZAUDDIN_B._SAIAN.pdf https://etd.uum.edu.my/1409/2/1.RIZAUDDIN_B._SAIAN.pdf |
_version_ | 1747827140351492096 |
spelling | my-uum-etd.1409 2013-07-24T12:11:49Z Stemming Algorithm in Searching Malay Text 2004 Rizauddin, Saian Faculty of Information Technology Faculty of Information Technology QA76 Computer software 2004 Thesis https://etd.uum.edu.my/1409/ https://etd.uum.edu.my/1409/1/RIZAUDDIN_B._SAIAN.pdf application/pdf eng validuser https://etd.uum.edu.my/1409/2/1.RIZAUDDIN_B._SAIAN.pdf application/pdf eng public masters masters Universiti Utara Malaysia |
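
The record's description mentions ranking retrieved pages with an Inverse Document Frequency function and measuring effectiveness with recall and precision. The sketch below shows one common way to do both. The exact weighting formula used in the thesis is not given in this record, so the `log(N / df)` form is an assumption, and the function names `idf`, `rank`, and `precision_recall` and the toy documents are hypothetical.

```python
import math


def idf(term, docs):
    """Inverse document frequency, assumed here as log(N / df).
    docs is a list of sets of (already stemmed) index terms."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0


def rank(query_terms, docs):
    """Return indices of documents matching at least one query term,
    ordered by the summed IDF of the matched terms (highest first)."""
    scored = []
    for index, doc in enumerate(docs):
        matched = [t for t in set(query_terms) if t in doc]
        if matched:
            scored.append((sum(idf(t, docs) for t in matched), index))
    # Sort by descending score; ties fall back to document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy collection of three "pages", each a set of stemmed terms.
    docs = [{"makan", "nasi"}, {"makan", "baca"}, {"main", "bola"}]
    ranking = rank(["baca", "makan"], docs)
    print(ranking)
    print(precision_recall(ranking, {1}))
```

In a comparison like the one the description reports, the same queries would be run against stemmed and unstemmed indexes, and the resulting recall and precision values compared.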