Enzyme sub-functional class prediction using multibiological knowledge feature representation and twin support vector machine

Bibliographic Details
Main Author: Guramad Singh, Sharon Kaur
Format: Thesis
Language: English
Published: 2013
Subjects:
Online Access: http://eprints.utm.my/id/eprint/40668/5/SharonKaurGuramadSinghMFSKSM201.pdf
Description
Summary: Computational structural biology has advanced considerably, particularly through the continued development of high-throughput methods for predicting enzyme sub-functional classes. Prior knowledge of enzyme sub-functional classes has been applied in numerous important predictive tasks that address the structural and functional features of enzymes. However, insufficient sequence-structure knowledge, a lack of known enzyme sub-functional classes, and low-identity sequences have led to inaccurate feature representation and an imbalanced distribution of enzyme sub-functional classes, both of which contribute to poor prediction results. This research therefore proposes a derived feature vector, known as APH, that is based on multi-biological knowledge and consolidates amino acid composition, dipeptide composition, hydrophobicity and hydrophilicity. A Support Vector Machine assigns and classifies every protein sequence into its respective vector, which enhances the sequence-structure knowledge and overcomes inaccurate feature representation. A Twin Support Vector Machine then classifies the enzyme sub-functional class and addresses the imbalanced distribution of enzyme sub-functional classes. A bio-inspired kernel function was also introduced to improve the overall enzyme sub-functional class prediction. The results were evaluated on accuracy, sensitivity, specificity and the Matthews correlation coefficient. Statistical and biological validation using the t-test and Gene Ontology showed that the experimental results achieved an accuracy of more than 98%. The findings show that the proposed method could assist in predicting enzyme biological function, protein structure and function, and protein structural class, and hence provide guidance in designing novel drugs to cure disease.
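The summary names the components of the APH feature vector but not how they are combined. The Python sketch below shows one plausible way to build such a vector, under several assumptions that do not come from the thesis: the function names are hypothetical, the components are simply concatenated, hydrophobicity is taken as the mean Kyte-Doolittle value, and hydrophilicity as the mean Hopp-Woods value. The thesis may use different scales, weights or encodings.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs: AA, AC, AD, ...
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

# Assumption: Kyte-Doolittle hydropathy scale for hydrophobicity.
HYDROPHOBICITY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: Hopp-Woods scale for hydrophilicity.
HYDROPHILICITY = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def amino_acid_composition(seq):
    """Fraction of each of the 20 standard residues in the sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each of the 400 adjacent residue pairs."""
    total = len(seq) - 1
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        counts[seq[i:i + 2]] += 1
    return [counts[d] / total for d in DIPEPTIDES]


def mean_scale(seq, scale):
    """Average of a per-residue property over the whole sequence."""
    return sum(scale[a] for a in seq) / len(seq)


def aph_features(seq):
    """Concatenate the four components into one 422-dimensional vector."""
    seq = seq.upper()
    if len(seq) < 2 or any(a not in HYDROPHOBICITY for a in seq):
        raise ValueError("need at least two standard amino acid residues")
    return (
        amino_acid_composition(seq)
        + dipeptide_composition(seq)
        + [mean_scale(seq, HYDROPHOBICITY), mean_scale(seq, HYDROPHILICITY)]
    )
```

Under these assumptions, each sequence maps to 20 composition values, 400 dipeptide values and two averaged physicochemical values. A vector of this form could then be passed to a classifier such as a support vector machine.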