Quantum-inspired differential evolution for gene feature selection from single-cell expression data

Bibliographic Details
Main Author: Ng, Grace Yee Lin
Format: Thesis
Published: 2023
Description
Summary: Single-cell RNA sequencing (scRNA-seq) has emerged as a state-of-the-art technology for gene expression studies, offering the ability to discover cellular heterogeneity, i.e., variations in cell-to-cell expression. In contrast to earlier technologies such as microarray and bulk RNA sequencing, which provide the average expression of pooled cells, scRNA-seq provides expression profiles for individual cells. Thus, it is important to identify the type of each cell to make sense of scRNA-seq gene expression data. Cell type identification is also a key step in reducing data dimensionality and facilitating downstream analyses such as differential expression studies, genetic marker selection, and spatial transcriptomics studies. However, scRNA-seq data contain technical variation that can affect downstream interpretation. Therefore, gene selection, often referred to as feature selection in data science, plays an important role in selecting informative genes from scRNA-seq data. Feature selection methods are categorised into filter-, wrapper-, and embedded-based approaches. In the existing literature, filter- and embedded-based approaches are widely applied to scRNA-seq gene selection tasks, whereas wrapper-based approaches, which give promising results in other domains, remain underexplored for selecting gene features from scRNA-seq data. This study contributes a new wrapper-based gene selection method that selects an optimal subset of genes while reducing the number of genes in scRNA-seq gene expression datasets. Quantum-inspired Differential Evolution (QDE) wrapped with a classification method was introduced and tested on twelve well-known scRNA-seq datasets to identify cell types. All experiments in this study were conducted using a 5-fold cross-validation strategy.
The QDE was combined with different machine-learning (ML) classifiers, namely logistic regression, decision tree, support vector machine (SVM) with linear and radial basis function kernels, and extreme learning machine.
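
The wrapper idea described in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation, and several parts are assumptions made only for illustration: each candidate is encoded as a vector of angles where sin^2(angle) is the probability of selecting a gene, the fitness subtracts a small penalty for the fraction of genes kept, and a simple nearest-centroid classifier stands in for the classifiers named above so that the example needs only the Python standard library. The names `make_toy_data`, `nearest_centroid_accuracy`, `observe`, and `qde_select` are hypothetical.

```python
import math
import random


def make_toy_data(n_cells=20, n_genes=4, seed=0):
    """Toy 'expression matrix': gene 0 separates two cell types, the rest is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_cells):
        label = "A" if i % 2 == 0 else "B"
        row = [rng.random() for _ in range(n_genes)]
        row[0] = 0.0 if label == "A" else 10.0
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_accuracy(X, y, mask, k=5):
    """k-fold cross-validated accuracy using only genes where mask[j] == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # an empty gene subset cannot classify anything
    n, correct = len(X), 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        sums, counts = {}, {}
        for i in train:
            s = sums.setdefault(y[i], [0.0] * len(cols))
            for c, j in enumerate(cols):
                s[c] += X[i][j]
            counts[y[i]] = counts.get(y[i], 0) + 1
        centroids = {lab: [v / counts[lab] for v in s] for lab, s in sums.items()}
        for i in test:
            pred = min(
                centroids,
                key=lambda lab: sum(
                    (X[i][j] - centroids[lab][c]) ** 2 for c, j in enumerate(cols)
                ),
            )
            correct += pred == y[i]
    return correct / n


def observe(thetas, rng):
    """Collapse angles to a binary gene mask: P(select gene j) = sin^2(theta_j)."""
    return [1 if rng.random() < math.sin(t) ** 2 else 0 for t in thetas]


def qde_select(X, y, pop_size=10, generations=20, F=0.5, CR=0.9, penalty=0.01, seed=0):
    """DE/rand/1/bin on the angle vectors; fitness is evaluated on observed masks."""
    rng = random.Random(seed)
    n_genes = len(X[0])

    def fitness(mask):
        # Assumed objective: CV accuracy minus a small penalty on subset size.
        return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask) / n_genes

    pop = [[rng.uniform(0, math.pi / 2) for _ in range(n_genes)] for _ in range(pop_size)]
    masks = [observe(t, rng) for t in pop]
    scores = [fitness(m) for m in masks]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([p for p in range(pop_size) if p != i], 3)
            j_rand = rng.randrange(n_genes)  # guarantees one mutated coordinate
            trial = [
                pop[a][j] + F * (pop[b][j] - pop[c][j])
                if (rng.random() < CR or j == j_rand)
                else pop[i][j]
                for j in range(n_genes)
            ]
            trial_mask = observe(trial, rng)
            trial_score = fitness(trial_mask)
            if trial_score >= scores[i]:  # greedy one-to-one replacement
                pop[i], masks[i], scores[i] = trial, trial_mask, trial_score
    best = max(range(pop_size), key=lambda i: scores[i])
    return masks[best], scores[best]
```

Calling `qde_select(*make_toy_data())` returns a binary gene mask and its penalised cross-validated score. In a setting closer to the summary, the nearest-centroid stand-in would be replaced by one of the listed classifiers and the toy matrix by a real scRNA-seq expression matrix.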