Speech-based depression recognition for Bahasa Malaysia speakers using deep learning models

Bibliographic Details
Main Author: Ezzi, Mugahed Al Ezzi Ahmed (Author)
Format: Thesis
Language: English
Published: Kuala Lumpur : Kulliyyah of Engineering, International Islamic University Malaysia, 2021
Online Access: http://studentrepo.iium.edu.my/handle/123456789/11007
Description
Summary: Depression is a highly prevalent mental disorder that negatively affects individuals, family members, society, and the economy. Traditional clinical diagnosis methods are subjective, complicated, and require extensive participation of experts. Furthermore, the severe shortage of psychiatrists relative to the population in Malaysia delays patients in seeking treatment and leads to poor compliance with follow-up. The social stigma of visiting psychiatric clinics also prevents patients from seeking early treatment. Automatic depression detection using speech signals is a promising biometric for depression because it is fast, convenient, and non-invasive. However, current machine learning algorithms have not yet achieved high accuracy and robust results. Moreover, existing research and approaches offer minimal support for Bahasa Malaysia. This research attempts to develop an end-to-end deep learning model to classify depression from Bahasa Malaysia speech using a dataset that we collected from clinically depressed and healthy Bahasa Malaysia speakers. The dataset was collected via an online platform, with participants using their mobile phones to record their read and spontaneous speech and to report their depression status. Depression status is identified by the Patient Health Questionnaire (PHQ-9), the Malay Beck Depression Inventory-II (Malay BDI-II), and subjects’ declaration of a Major Depressive Disorder diagnosis by a trained clinician. The dataset consists of 42 female and 11 male depressed participants, and 68 female and 9 male healthy participants. However, this study focuses on the female data only because the male data are insufficient. We provide a detailed implementation of the deep learning model using two approaches: raw audio input and acoustic features input. Multiple combinations of speech types were analyzed using various deep neural network models.
Additionally, robust feature selection was analyzed for the acoustic features input before it was passed to the deep learning models. After hyperparameter tuning, raw audio input from the combination of female read and female spontaneous speech achieved an accuracy of 91% using the AttCRNN model. In comparison, robust acoustic features input from female spontaneous speech achieved an accuracy of 85% using the RNN model. These results could be improved with a larger dataset. In addition, male and gender-independent models could be studied further.
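As a rough reference point for the accuracies quoted in the summary, the participant counts given there imply a particular class balance. The following is a minimal sketch of that arithmetic only; the helper name `majority_baseline` is illustrative and not taken from the thesis, and the thesis's actual evaluation protocol (for example, whether accuracy is computed per recording or per participant) is not stated in this record, so the figure is a participant-level assumption.

```python
# Participant counts as reported in the summary above.
counts = {
    ("female", "depressed"): 42,
    ("male", "depressed"): 11,
    ("female", "healthy"): 68,
    ("male", "healthy"): 9,
}


def majority_baseline(counts, gender):
    """Accuracy obtained by always predicting the larger class for one gender.

    Hypothetical helper for illustration; assumes one label per participant.
    """
    sizes = [n for (g, _status), n in counts.items() if g == gender]
    return max(sizes) / sum(sizes)


female_baseline = majority_baseline(counts, "female")
male_total = sum(n for (g, _status), n in counts.items() if g == "male")

print(f"Female majority-class baseline: {female_baseline:.3f}")
print(f"Male participants in total: {male_total}")
```

Under these assumptions, the small male subgroup is consistent with the summary's decision to restrict the study to the female data.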
Item Description: Abstracts in English and Arabic.
"A thesis submitted in fulfilment of the requirement for the degree of Master of Science in Engineering." --On title page.
Physical Description: xv, 92 leaves : colour illustrations ; 30 cm.
Bibliography: Includes bibliographical references (leaves 64-69).