Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction

Ontology learning refers to an automatic extraction of ontology to produce the ontology learning layer cake which consists of five kinds of output: terms, concepts, taxonomy relations, non-taxonomy relations and axioms. Term extraction is a prerequisite for all aspects of ontology learning. It is th...

Full description

Saved in:
Bibliographic Details
Main Author: Syafrullah, Mohammad
Format: Thesis
Language:English
Published: 2015
Subjects:
Online Access:http://eprints.utm.my/id/eprint/78211/1/MohammadSyafrullahPFC2015.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
id my-utm-ep.78211
record_format uketd_dc
spelling my-utm-ep.782112018-07-30T08:39:38Z Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction 2015-09 Syafrullah, Mohammad QA75 Electronic computers. Computer science Ontology learning refers to an automatic extraction of ontology to produce the ontology learning layer cake which consists of five kinds of output: terms, concepts, taxonomy relations, non-taxonomy relations and axioms. Term extraction is a prerequisite for all aspects of ontology learning. It is the automatic mining of complete terms from the input document. Another important part of ontology is taxonomy, or the hierarchy of concepts. It presents a tree view of the ontology and shows the inheritance between subconcepts and superconcepts. In this research, two methods were proposed for improving the performance of the extraction result. The first method uses particle swarm optimization in order to optimize the weights of features. The advantage of particle swarm optimization is that it can calculate and adjust the weight of each feature according to the appropriate value, and here it is used to improve the performance of term and taxonomy extraction. The second method uses a hybrid technique that uses multi-objective particle swarm optimization and fuzzy systems that ensures that the membership functions and fuzzy system rule sets are optimized. The advantage of using a fuzzy system is that the imprecise and uncertain values of feature weights can be tolerated during the extraction process. This method is used to improve the performance of taxonomy extraction. In the term extraction experiment, five extracted features were used for each term from the document. These features were represented by feature vectors consisting of domain relevance, domain consensus, term cohesion, first occurrence and length of noun phrase. For taxonomy extraction, matching Hearst lexico-syntactic patterns in documents and the web, and hypernym information form WordNet were used as the features that represent each pair of terms from the texts. These two proposed methods are evaluated using a dataset that contains documents about tourism. For term extraction, the proposed method is compared with benchmark algorithms such as Term Frequency Inverse Document Frequency, Weirdness, Glossary Extraction and Term Extractor, using the precision performance evaluation measurement. For taxonomy extraction, the proposed methods are compared with benchmark methods of Feature-based and weighting by Support Vector Machine using the f-measure, precision and recall performance evaluation measurements. For the first method, the experiment results concluded that implementing particle swarm optimization in order to optimize the feature weights in terms and taxonomy extraction leads to improved accuracy of extraction result compared to the benchmark algorithms. For the second method, the results concluded that the hybrid technique that uses multi-objective particle swarm optimization and fuzzy systems leads to improved performance of taxonomy extraction results when compared to the benchmark methods, while adjusting the fuzzy membership function and keeping the number of fuzzy rules to a minimum number with a high degree of accuracy. 2015-09 Thesis http://eprints.utm.my/id/eprint/78211/ http://eprints.utm.my/id/eprint/78211/1/MohammadSyafrullahPFC2015.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:94655 phd doctoral Universiti Teknologi Malaysia, Faculty of Computing Faculty of Computing
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers
Computer science
spellingShingle QA75 Electronic computers
Computer science
Syafrullah, Mohammad
Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
description Ontology learning refers to an automatic extraction of ontology to produce the ontology learning layer cake which consists of five kinds of output: terms, concepts, taxonomy relations, non-taxonomy relations and axioms. Term extraction is a prerequisite for all aspects of ontology learning. It is the automatic mining of complete terms from the input document. Another important part of ontology is taxonomy, or the hierarchy of concepts. It presents a tree view of the ontology and shows the inheritance between subconcepts and superconcepts. In this research, two methods were proposed for improving the performance of the extraction result. The first method uses particle swarm optimization in order to optimize the weights of features. The advantage of particle swarm optimization is that it can calculate and adjust the weight of each feature according to the appropriate value, and here it is used to improve the performance of term and taxonomy extraction. The second method uses a hybrid technique that uses multi-objective particle swarm optimization and fuzzy systems that ensures that the membership functions and fuzzy system rule sets are optimized. The advantage of using a fuzzy system is that the imprecise and uncertain values of feature weights can be tolerated during the extraction process. This method is used to improve the performance of taxonomy extraction. In the term extraction experiment, five extracted features were used for each term from the document. These features were represented by feature vectors consisting of domain relevance, domain consensus, term cohesion, first occurrence and length of noun phrase. For taxonomy extraction, matching Hearst lexico-syntactic patterns in documents and the web, and hypernym information form WordNet were used as the features that represent each pair of terms from the texts. These two proposed methods are evaluated using a dataset that contains documents about tourism. For term extraction, the proposed method is compared with benchmark algorithms such as Term Frequency Inverse Document Frequency, Weirdness, Glossary Extraction and Term Extractor, using the precision performance evaluation measurement. For taxonomy extraction, the proposed methods are compared with benchmark methods of Feature-based and weighting by Support Vector Machine using the f-measure, precision and recall performance evaluation measurements. For the first method, the experiment results concluded that implementing particle swarm optimization in order to optimize the feature weights in terms and taxonomy extraction leads to improved accuracy of extraction result compared to the benchmark algorithms. For the second method, the results concluded that the hybrid technique that uses multi-objective particle swarm optimization and fuzzy systems leads to improved performance of taxonomy extraction results when compared to the benchmark methods, while adjusting the fuzzy membership function and keeping the number of fuzzy rules to a minimum number with a high degree of accuracy.
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Syafrullah, Mohammad
author_facet Syafrullah, Mohammad
author_sort Syafrullah, Mohammad
title Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
title_short Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
title_full Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
title_fullStr Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
title_full_unstemmed Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
title_sort hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
granting_institution Universiti Teknologi Malaysia, Faculty of Computing
granting_department Faculty of Computing
publishDate 2015
url http://eprints.utm.my/id/eprint/78211/1/MohammadSyafrullahPFC2015.pdf
_version_ 1747817934239039488