Speech-based depression recognition for Bahasa Malaysia speakers using deep learning models

Depression is a mental disorder of high prevalence, leading to a negative effect on individuals, family members, society, and the economy. Traditional clinical diagnosis methods are subjective, complicated, and require extensive participation of experts. Furthermore, the severe shortage in psychiatr...


Bibliographic Details
Main Author: Ezzi, Mugahed Al Ezzi Ahmed (Author)
Format: Thesis
Language: English
Published: Kuala Lumpur : Kulliyyah of Engineering, International Islamic University Malaysia, 2021
Subjects:
Online Access: http://studentrepo.iium.edu.my/handle/123456789/11007
LEADER 044110000a22004090004500
008 220428s2021 my a f m 000 0 eng d
040 |a UIAM  |b eng  |e rda 
041 |a eng 
043 |a a-my--- 
050 0 0 |a TK7895.S65 
100 1 |a Ezzi, Mugahed Al Ezzi Ahmed   |9 7719  |e author 
245 1 0 |a Speech-based depression recognition for Bahasa Malaysia speakers using deep learning models /  |c by Mugahed Al Ezzi Ahmed Ezzi 
264 1 |a Kuala Lumpur :  |b Kulliyyah of Engineering, International Islamic University Malaysia,  |c 2021 
300 |a xv, 92 leaves :  |b colour illustrations ;  |c c30cm. 
336 |2 rdacontent  |a text 
337 |2 rdamedia  |a unmediated 
337 |2 rdamedia  |a computer 
338 |2 rdacarrier  |a volume 
338 |2 rdacarrier  |a online resource 
347 |2 rdaft  |a text file  |b PDF 
500 |a Abstracts in English and Arabic. 
500 |a "A thesis submitted in fulfilment of the requirement for the degree of Master of Science in Engineering." --On title page.  
502 |a Thesis (MSENG)--International Islamic University Malaysia, 2021. 
504 |a Includes bibliographical references (leaves 64-69). 
520 |a Depression is a highly prevalent mental disorder that negatively affects individuals, family members, society, and the economy. Traditional clinical diagnosis methods are subjective, complicated, and require extensive participation of experts. Furthermore, the severe shortage of psychiatrists relative to the population in Malaysia delays patients in seeking treatment and leads to poor compliance with follow-up. In addition, the social stigma of visiting psychiatric clinics prevents patients from seeking early treatment. Automatic depression detection using speech signals is a promising biometric for depression because it is fast, convenient, and non-invasive. However, current machine learning algorithms have not yet achieved high accuracy and robust results. Moreover, existing research and approaches offer minimal support for Bahasa Malaysia. This research attempts to develop an end-to-end deep learning model to classify depression from Bahasa Malaysia speech using our dataset collected from clinically depressed and healthy Bahasa Malaysia speakers. The dataset was collected via an online platform, with participants using their mobile phones to record their read and spontaneous speech and to report their depression status. Depression status is identified by the Patient Health Questionnaire (PHQ-9), the Malay Beck Depression Inventory-II (Malay BDI-II), and subjects’ declaration of a Major Depressive Disorder diagnosis by a trained clinician. The dataset consists of 42 female and 11 male depressed participants, and 68 female and 9 male healthy participants. However, this study focuses on female data only due to insufficient data. We provide a detailed implementation of the deep learning model using two approaches: raw audio input and acoustic features input. Multiple combinations of speech types were analyzed using various deep neural network models. 
Additionally, an analysis of robust feature selection was carried out on the acoustic features input before proceeding to the deep learning models. After hyperparameter tuning, raw audio input from the combination of female read and female spontaneous speech achieved an accuracy of 91% using the AttCRNN model. In comparison, robust acoustic features input from female spontaneous speech achieved an accuracy of 85% using an RNN model. These results could be improved with a larger dataset. In addition, male and gender-independent models could be studied further. 
650 0 |a Automatic speech recognition  |9 4162 
650 0 |a Deep learning (Machine learning) 
655 |a Theses, IIUM local 
690 |a Dissertations, Academic  |x Department of Mechatronics Engineering  |z IIUM  |9 1666 
700 0 |a Nik Nur Wahidah Nik Hashim  |e degree supervisor  |9 7720 
700 0 |a Hasan Firdaus Mohd Zaki  |e degree supervisor  |9 4188 
710 2 |a  International Islamic University Malaysia.  |b Department of Mechatronics Engineering  |9 7721 
856 4 |u http://studentrepo.iium.edu.my/handle/123456789/11007 
900 |a sz-asbh 
942 |2 lcc  |n 0  |c THESIS 
999 |c 502872  |d 534289 
952 |0 0  |1 0  |2 lcc  |4 0  |6 T T K7895 S65 E00099S 02021  |7 3  |8 IIUMTHESIS  |9 982088  |a IIUM  |b IIUM  |c THESIS  |d 2022-07-15  |g 0.00  |o t TK 7895 S65 E99S 2021  |p 11100437199  |r 1900-01-02  |t 1  |v 0.00  |y THESIS