Information Extraction From Malay Vehicle Advertisement Text Using Natural Language Processing Techniques.

Some vehicle advertisements are published as text documents, so a reader must read the entire document and understand its content before extracting the important information. This takes more time than using a system that can extract the important infor...

Bibliographic Details
Main Author: Norfadila, Mahrom
Format: Thesis
Language: eng
Published: 2007
Subjects: QA76 Computer software
Online Access: https://etd.uum.edu.my/111/1/norfadila.pdf
https://etd.uum.edu.my/111/2/norfadila.pdf
id my-uum-etd.111
record_format uketd_dc
institution Universiti Utara Malaysia
collection UUM ETD
language eng
topic QA76 Computer software
description Some vehicle advertisements are published as text documents, so a reader must read the entire document and understand its content before extracting the important information. This takes more time than using a system that extracts the important information automatically, without the reader having to read the whole document. In this study, a prototype system was developed to help a reader extract important information from Malay vehicle advertisements by applying natural language processing (NLP) techniques. The NLP techniques used focus on syntactic processing.
format Thesis
qualification_name masters
qualification_level Master's degree
author Norfadila, Mahrom
title Information Extraction From Malay Vehicle Advertisement Text Using Natural Language Processing Techniques.
granting_institution Universiti Utara Malaysia
granting_department College of Arts and Sciences (CAS)
publishDate 2007
url https://etd.uum.edu.my/111/1/norfadila.pdf
https://etd.uum.edu.my/111/2/norfadila.pdf
_version_ 1747826822099238912
spelling my-uum-etd.111 2013-07-24T12:05:38Z Information Extraction From Malay Vehicle Advertisement Text Using Natural Language Processing Techniques. 2007-12-10 Norfadila, Mahrom College of Arts and Sciences (CAS) Faculty of Information Technology QA76 Computer software 2007-12 Thesis https://etd.uum.edu.my/111/ https://etd.uum.edu.my/111/1/norfadila.pdf application/pdf eng validuser https://etd.uum.edu.my/111/2/norfadila.pdf application/pdf eng public masters masters Universiti Utara Malaysia