Comprehensive assessment of DNA content feature using machine learning approach

Bibliographic Details
Main Author: Sina, Nazeri
Format: Thesis
Language: English
Published: 2016
Subjects:
Online Access: http://ir.unimas.my/id/eprint/20987/1/Sina%20Nazeri.pdf
Description
Summary: Transcription factor protein-DNA interactions play a key role in gene regulation. Identifying the regulatory elements, or motifs, bound by transcription factor proteins is critical for understanding gene regulatory networks and diseases, and for medical benefit. Computational motif analysis, specifically of the distal regulatory elements known as enhancers, is notoriously difficult. First, only a limited choice of features associated with enhancers is available for machine learning tasks. Second, the discriminative features that describe enhancer regions are poorly understood, so no prior knowledge can be used in the design of a recognition system. Finally, different developmental stages and different cell lines activate different subsets of enhancers, which makes it hard for computational methods to reach conclusive results about the discriminative feature set used to model active enhancers. Epigenetic and chromatin landmarks have been employed with great success to infer the locations of enhancer regions, because their locations correlate strongly with those regions. K-mer features are one prominent approach to representing DNA content.
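As an illustration of the k-mer representation mentioned in the summary, the following is a minimal Python sketch that turns a DNA sequence into a fixed-length vector of k-mer frequencies. It is not taken from the thesis: the function names `kmer_counts` and `kmer_feature_vector`, the choice to skip windows containing non-ACGT characters, and the normalisation to frequencies are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count overlapping k-mers in a DNA sequence.

    Hypothetical helper for illustration. Windows containing any
    character other than A, C, G, or T (for example N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_feature_vector(sequence, k):
    """Return k-mer frequencies in lexicographic order (length 4**k).

    Assumption: counts are normalised by the number of valid windows,
    so sequences of different lengths give comparable vectors.
    """
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]
```

Vectors of this kind could then be passed to any standard classifier to separate candidate enhancer sequences from background sequences. Note that the vector length grows as 4**k, so larger values of k quickly produce sparse, high-dimensional features.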