Stemming Algorithm in Searching Malay Text

Stemming is one of the processes that can be used to improve performance of a search engine. It reduces the variant word forms to common forms. This project evaluates the retrieval effectiveness of stemming algorithm in searching and retrieving relevant Malay Web pages based on user natural query w...

Bibliographic Details
Main Author:	Rizauddin, Saian
Format:	Thesis
Language:	eng
Published:	2004
Subjects:	QA76 Computer software
Online Access:	https://etd.uum.edu.my/1409/1/RIZAUDDIN_B._SAIAN.pdf https://etd.uum.edu.my/1409/2/1.RIZAUDDIN_B._SAIAN.pdf

id	my-uum-etd.1409
record_format	uketd_dc
institution	Universiti Utara Malaysia
collection	UUM ETD
language	eng
topic	QA76 Computer software
description	Stemming is one of the processes that can improve the performance of a search engine: it reduces variant word forms to a common form. This project evaluates the retrieval effectiveness of a stemming algorithm in searching for and retrieving relevant Malay Web pages based on the user's natural-language query words. The retrieved Web pages are weighted and ranked using the Inverse Document Frequency function, and retrieval effectiveness is measured using standard recall and precision. The experiments show that searching with stemming improves retrieval effectiveness compared with searching without stemming.
format	Thesis
qualification_name	masters
qualification_level	Master's degree
author	Rizauddin, Saian
author_facet	Rizauddin, Saian
author_sort	Rizauddin, Saian
title	Stemming Algorithm in Searching Malay Text
title_short	Stemming Algorithm in Searching Malay Text
title_full	Stemming Algorithm in Searching Malay Text
title_fullStr	Stemming Algorithm in Searching Malay Text
title_full_unstemmed	Stemming Algorithm in Searching Malay Text
title_sort	stemming algorithm in searching malay text
granting_institution	Universiti Utara Malaysia
granting_department	Faculty of Information Technology
publishDate	2004
url	https://etd.uum.edu.my/1409/1/RIZAUDDIN_B._SAIAN.pdf https://etd.uum.edu.my/1409/2/1.RIZAUDDIN_B._SAIAN.pdf
_version_	1747827140351492096
spelling	my-uum-etd.1409 2013-07-24T12:11:49Z Stemming Algorithm in Searching Malay Text 2004 Rizauddin, Saian Faculty of Information Technology QA76 Computer software Thesis https://etd.uum.edu.my/1409/ https://etd.uum.edu.my/1409/1/RIZAUDDIN_B._SAIAN.pdf application/pdf eng validuser https://etd.uum.edu.my/1409/2/1.RIZAUDDIN_B._SAIAN.pdf application/pdf eng public masters Universiti Utara Malaysia
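
The pipeline summarised in the description (stem the words, rank matching pages by Inverse Document Frequency, then measure recall and precision) can be illustrated with a small sketch. This is not the algorithm used in the thesis, which this record does not reproduce. The affix lists, the minimum stem length, the smoothed IDF formula log(1 + N/df) and the three sample documents are all assumptions chosen for illustration. The toy stemmer also ignores Malay spelling changes at prefix boundaries (for example, "menulis" from "tulis").

```python
import math
from collections import Counter

# Hypothetical affix lists for illustration only; a real Malay stemmer
# needs a much larger rule set and usually a root-word dictionary.
PREFIXES = ("mem", "men", "ber", "di", "me")
SUFFIXES = ("kan", "an", "i")
MIN_STEM = 4  # assumed minimum stem length, to limit over-stripping


def stem(word):
    """Strip at most one prefix and one suffix from a word."""
    word = word.lower()
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= MIN_STEM:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= MIN_STEM:
            word = word[:-len(s)]
            break
    return word


def tokens(text, use_stem):
    """Split text on whitespace, optionally stemming each word."""
    words = text.lower().split()
    return [stem(w) for w in words] if use_stem else words


def search(query, docs, use_stem=True):
    """Return ids of documents sharing a term with the query, ranked by
    the summed IDF of the shared terms (ties broken by document id)."""
    doc_terms = {d: set(tokens(t, use_stem)) for d, t in docs.items()}
    df = Counter(term for terms in doc_terms.values() for term in terms)
    n = len(docs)
    q_terms = set(tokens(query, use_stem))
    scores = {}
    for d, terms in doc_terms.items():
        hits = q_terms & terms
        if hits:
            # Smoothed IDF (an assumption) so a term found in every
            # document still contributes a positive weight.
            scores[d] = sum(math.log(1 + n / df[t]) for t in hits)
    return sorted(scores, key=lambda d: (-scores[d], d))


def precision_recall(retrieved, relevant):
    """Standard precision and recall for one query."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up sample collection: d1 and d2 are treated as relevant to "baca".
DOCS = {
    "d1": "dia membaca buku",
    "d2": "bahan bacaan pelajar",
    "d3": "harga makanan naik",
}

if __name__ == "__main__":
    relevant = {"d1", "d2"}
    for use_stem in (True, False):
        found = search("baca", DOCS, use_stem=use_stem)
        print(use_stem, found, precision_recall(found, relevant))
```

On this made-up collection the query "baca" matches d1 and d2 once "membaca" and "bacaan" are both stemmed to "baca", and matches nothing without stemming. That mirrors, on a toy scale only, the kind of comparison the description reports.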
