Biological data integration and web services composition using semantic web and artificial intelligence planning

Bibliographic Details
Main Author: Remli, Muhammad Akmal
Format: Thesis
Published: 2014
Description
Summary: Systems biology research relies on data integration and retrieval, which are performed during the early stages of biological research to help biologists understand a wide range of biological data, including gene and pathway data. In data retrieval, web services are used to retrieve integrated data from a repository through web service composition, which combines multiple services whenever a requirement cannot be fulfilled by a single service. Many of the planning algorithms used to detect and generate composition plans focus only on sequential composition and thus neglect concurrent composition. In addition, existing data integration methods cater to heterogeneous biological data in general, without taking into consideration data for a specific bacterium. This research aimed to develop a data integration and retrieval approach using the semantic web and web service composition. In this approach, a semantic web and transformation method was applied to integrate protein, gene and pathway data of a specific bacterium (Lactococcus lactis) from several resources. The integrated data are expressed in Resource Description Framework (RDF) format and published in an RDF repository. For web service composition, an Artificial Intelligence (AI) planning algorithm was developed to automate the composition by framing it as an AI planning problem. The proposed planning algorithms extend the Hierarchical Task Network (HTN) approach to the context of concurrent service composition. The integrated results were evaluated in terms of connected instances of biological data by querying the RDF repository. In an experimental comparison with existing algorithms, the proposed algorithms were able to detect and generate concurrent plans.
The research has extended data integration and retrieval approaches by combining them using a semantic web presented in RDF format, and has illustrated that the proposed algorithms can be applied effectively for data retrieval in the context of concurrent web service composition.
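
To make the distinction between sequential and concurrent composition concrete, the following is a minimal Python sketch. It is not the thesis's HTN-based algorithm, which the summary does not describe in detail. Instead, it uses a simple forward-chaining layering: services whose inputs are already available are grouped into one stage, and the services in a stage do not depend on each other, so they could run concurrently. The service names and data types (`get_gene`, `get_protein`, `get_pathway`, `merge_report`, and so on) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """A hypothetical web service described only by the data it consumes and produces."""
    name: str
    inputs: frozenset
    outputs: frozenset


def plan_concurrent(services, known, goal):
    """Return a list of stages; the services within a stage can run concurrently.

    Forward chaining: at each step, every service whose inputs are already
    known is placed in the same stage. This is an illustrative stand-in,
    not an HTN planner, and it does not prune services that are runnable
    but unnecessary for the goal.
    """
    known = set(known)
    remaining = list(services)
    stages = []
    while not goal <= known:
        ready = [s for s in remaining if s.inputs <= known]
        if not ready:
            raise ValueError("goal cannot be reached with the given services")
        stages.append(sorted(s.name for s in ready))
        for s in ready:
            known |= s.outputs
        remaining = [s for s in remaining if s not in ready]
    return stages


# Hypothetical services over gene, protein and pathway data.
services = [
    Service("get_gene", frozenset({"organism"}), frozenset({"gene_id"})),
    Service("get_protein", frozenset({"gene_id"}), frozenset({"protein"})),
    Service("get_pathway", frozenset({"gene_id"}), frozenset({"pathway"})),
    Service("merge_report", frozenset({"protein", "pathway"}), frozenset({"report"})),
]

plan = plan_concurrent(services, known={"organism"}, goal={"report"})
print(plan)
```

In this example the second stage contains both `get_pathway` and `get_protein`, since each needs only the gene identifier. A planner limited to sequential composition would order the same four services one after another.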