Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik

Anaphora resolution (AR) is the process of resolving the entity to which a pronominal anaphor refers. The phenomenon occurs in every language, and resolving it requires human experts or specific rules. AR can improve language processing applications such as question answering, text mining, do...

Bibliographic Details
Main Author:	Noorhuzaimi@Karimah, Mohd Noor
Format:	Thesis
Language:	English
Published:	2016
Subjects:	PL Languages and literatures of Eastern Asia, Africa, Oceania
Online Access:	http://umpir.ump.edu.my/id/eprint/25341/1/Resolusi%20anafora%20artikel%20Bahasa%20Melayu%20berasaskan%20pengetahuan%20terhad.pdf

id	my-ump-ir.25341
record_format	uketd_dc
institution	Universiti Malaysia Pahang Al-Sultan Abdullah
collection	UMPSA Institutional Repository
language	English
topic	PL Languages and literatures of Eastern Asia, Africa, Oceania
description	Anaphora resolution (AR) is the process of resolving the entity to which a pronominal anaphor refers. The phenomenon occurs in every language, and resolving it requires human experts or specific rules. AR can improve language processing applications such as question answering, text mining, document summarization, and information extraction. Various studies have been carried out on AR, but most of them targeted languages such as English, Japanese, and Norwegian. Very little, and almost no, research effort has focused on AR for the Malay language. The aim of this research is therefore to resolve anaphora in Malay text using a knowledge-poor approach and a semantic class labelling model. To achieve this aim, a framework for Malay AR was developed as a guide to solving this phenomenon in the Malay language. The type of usage of the pronoun nya is determined using a set of rules, a set of similar words, and word filtering generated from the semantic class labelling model. This process is important because nya is the most frequently used pronoun in Malay text, amounting to 68%, whereas the other pronouns mostly depend on the sociological status of the referent entity, or antecedent. Determining the antecedent candidates is also an important process. Antecedent candidates can be proper nouns or nouns. To determine whether proper nouns are suitable candidates, two main processes are needed: (1) entity recognition for proper nouns that contain the word 'dan' and the comma symbol (,); and (2) determination of the semantic label of each retrieved candidate in order to establish its sociological status. The research used part of the name gazetteers for people, organizations, locations, and positions. Testing was conducted on 60 Malay articles with different classes of proper nouns. The results were compared with benchmark data tagged by a Malay linguist. The results show average precision and recall values of 85% and 90%, respectively. The proposed knowledge-poor AR framework for Malay text shows an 18.79% increase in success rate compared with the generic approach proposed by Mitkov and Lappin.
format	Thesis
qualification_name	Doctor of Philosophy (PhD.)
qualification_level	Doctorate
author	Noorhuzaimi@Karimah, Mohd Noor
title	Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik
granting_institution	Universiti Kebangsaan Malaysia
granting_department	Fakulti Teknologi dan Sains Maklumat
publishDate	2016
url	http://umpir.ump.edu.my/id/eprint/25341/1/Resolusi%20anafora%20artikel%20Bahasa%20Melayu%20berasaskan%20pengetahuan%20terhad.pdf
_version_	1783732093163929600
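
The two candidate-preparation steps named in the description, splitting proper-noun phrases that contain 'dan' or a comma and then assigning each candidate a semantic label from name gazetteers, might look roughly like the sketch below. It is an illustration only, not the thesis's implementation: the gazetteer entries, the label names, and the functions `split_coordinated`, `semantic_label`, and `label_candidates` are all hypothetical, and the real system's rules are not given in this record.

```python
import re

# Hypothetical mini-gazetteers for illustration; the gazetteers used in the
# thesis (people, organization, location, position) are not reproduced here.
GAZETTEERS = {
    "person": {"Ali", "Siti Aminah"},
    "organization": {"Universiti Kebangsaan Malaysia"},
    "location": {"Kuala Lumpur", "Pahang"},
    "position": {"Perdana Menteri"},
}

# Split on a comma or on the standalone lowercase word 'dan' (Malay for 'and').
# The word boundaries keep names such as 'Ramadan' intact.
_SEPARATOR = re.compile(r"\s*(?:,|\bdan\b)\s*")


def split_coordinated(phrase):
    """Break a coordinated proper-noun phrase into individual candidates."""
    return [part for part in _SEPARATOR.split(phrase) if part]


def semantic_label(candidate):
    """Return the gazetteer class of a candidate, or 'unknown' if absent."""
    for label, names in GAZETTEERS.items():
        if candidate in names:
            return label
    return "unknown"


def label_candidates(phrase):
    """Pair every candidate in the phrase with its semantic label."""
    return [(c, semantic_label(c)) for c in split_coordinated(phrase)]


if __name__ == "__main__":
    print(label_candidates("Ali, Siti Aminah dan Perdana Menteri"))
```

A simple exact-match lookup is used here for brevity; a practical system would presumably also need to handle spelling variants, titles, and names that are missing from the gazetteers.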

Similar Items