Identification of informative subpathways and genes using improved differential expression analysis for pathways method

Bibliographic Details
Main Author: Nasarudin, Nurul Athirah
Format: Thesis
Language: English
Published: 2020
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/96325/1/NurulAthirahMSC2021.pdf.pdf
id my-utm-ep.96325
record_format uketd_dc
spelling my-utm-ep.96325 2022-07-17T07:18:31Z Identification of informative subpathways and genes using improved differential expression analysis for pathways method 2020 Nasarudin, Nurul Athirah QA75 Electronic computers. Computer science 2020 Thesis http://eprints.utm.my/id/eprint/96325/ http://eprints.utm.my/id/eprint/96325/1/NurulAthirahMSC2021.pdf.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:143455 masters Universiti Teknologi Malaysia Faculty of Engineering - School of Computing
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers. Computer science
description Pathway-based analysis extracts biological knowledge by considering the features of a whole pathway. However, most such analyses have shortcomings, notably low sensitivity to the data, which can cause important information to be missed. For this reason, attention has shifted from pathway-based to subpathway-based analysis, which is considered more relevant for understanding biological reactions. This shift is supported by several studies that have traced pathway abnormalities to specific regions involved in the etiology of diseases. Subpathway-based analysis has also been found to be more effective and sensitive than whole-pathway analysis. Consequently, many tools have been developed to address the inadequate interpretation of biological systems. Differential Expression Analysis for Pathways (DEAP) is one subpathway-based method; it identifies local regions perturbed by complex diseases in large pathway data. However, the method has shown low performance in identifying informative pathways and subpathways. Hence, this research proposes a modified DEAP method (termed iDEAP) that enhances the identification of perturbed subpathways in pathway activities and aims at higher performance in detecting differentially expressed pathways. First, a search algorithm adapted from the DMSP algorithm was incorporated into DEAP to search for informative subpathways. Second, the relation among subpathways was taken into account by averaging the maximum absolute value (termed the DEAP score) to emphasize the reaction among subpathways, so that informative pathways can be identified efficiently. Three gene expression data sets were used in this study (head and neck tumour, colorectal cancer and breast cancer). The results are reported as the number of differentially expressed pathways (head and neck tumour: 81 pathways; colorectal cancer: 78 pathways; breast cancer: 95 pathways) and suggest that the proposed method performed better than previous work. Moreover, when the genes selected from the results were evaluated by accuracy under 10-fold cross-validation, the proposed method showed higher accuracy for colorectal cancer (90%) and breast cancer (94%). Finally, the top five significant pathways and the selected genes were validated biologically against the literature.
format Thesis
qualification_level Master's degree
author Nasarudin, Nurul Athirah
title Identification of informative subpathways and genes using improved differential expression analysis for pathways method
granting_institution Universiti Teknologi Malaysia
granting_department Faculty of Engineering - School of Computing
publishDate 2020
url http://eprints.utm.my/id/eprint/96325/1/NurulAthirahMSC2021.pdf.pdf
_version_ 1747818659021062144
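
The abstract describes taking the maximum absolute value as the DEAP score and averaging it to account for the relation among subpathways, but it does not give the formulas. The following is only a minimal sketch of that aggregation idea, not the thesis's implementation: the function names `max_abs_score` and `averaged_score` are hypothetical, the per-path scores are made-up numbers, and the assumption that each subpathway contributes one maximum absolute path score to the average is an interpretation of the abstract.

```python
from statistics import mean


def max_abs_score(path_scores):
    """Return the signed path score with the largest absolute value."""
    return max(path_scores, key=abs)


def averaged_score(subpathway_path_scores):
    """Average the maximum absolute path score over all subpathways.

    Assumption: each inner list holds signed expression scores for the
    paths of one subpathway; how those scores are computed is not shown.
    """
    return mean(abs(max_abs_score(scores)) for scores in subpathway_path_scores)


# Hypothetical signed scores for the paths of three subpathways.
subpathways = [
    [0.5, -2.0, 1.0],
    [1.5, 0.25],
    [-0.5, -1.0],
]

print(max_abs_score(subpathways[0]))  # strongest path of the first subpathway
print(averaged_score(subpathways))    # average over the three subpathways
```

Averaging over subpathways, rather than keeping a single maximum for the whole pathway, lets several moderately perturbed regions contribute to the pathway score instead of only the most extreme one.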