Multinomial logistic regression probability ratio-based feature vectors for Malay vowel recognition

Vowel recognition is the part of an automatic speech recognition (ASR) system that classifies speech signals into vowel groups. The performance of Malay vowel recognition (MVR), like that of any multiclass classification problem, depends largely on the feature vectors (FVs). FVs such as Mel-frequency Cepstral Coe...

Bibliographic Details
Main Author: Atanda, Abdulwahab Funsho
Format: Thesis
Language: eng
Published: 2021
Subjects: QA273-280 Probabilities. Mathematical statistics; QA299.6-433 Analysis
Online Access: https://etd.uum.edu.my/9212/1/s95101_01.pdf
https://etd.uum.edu.my/9212/2/s95101_02.pdf
https://etd.uum.edu.my/9212/3/s95101_references.docx
https://etd.uum.edu.my/9212/5/depositpermission-allow-not%20allow_s95101.pdf
id my-uum-etd.9212
record_format uketd_dc
spelling my-uum-etd.9212 2022-05-09T08:23:16Z Multinomial logistic regression probability ratio-based feature vectors for Malay vowel recognition 2021 Atanda, Abdulwahab Funsho Mohd Yusof, Shahrul Azmi Husni, Husniza Awang Had Salleh Graduate School of Arts & Sciences QA273-280 Probabilities. Mathematical statistics QA299.6-433 Analysis 2021 Thesis https://etd.uum.edu.my/9212/ https://etd.uum.edu.my/9212/1/s95101_01.pdf text eng public https://etd.uum.edu.my/9212/2/s95101_02.pdf text eng public https://etd.uum.edu.my/9212/3/s95101_references.docx text eng public https://etd.uum.edu.my/9212/5/depositpermission-allow-not%20allow_s95101.pdf text eng staffonly other doctoral Universiti Utara Malaysia
institution Universiti Utara Malaysia
collection UUM ETD
language eng
advisor Mohd Yusof, Shahrul Azmi
Husni, Husniza
topic QA273-280 Probabilities. Mathematical statistics
QA299.6-433 Analysis
description Vowel recognition is the part of an automatic speech recognition (ASR) system that classifies speech signals into vowel groups. The performance of Malay vowel recognition (MVR), like that of any multiclass classification problem, depends largely on the feature vectors (FVs). FVs such as Mel-frequency Cepstral Coefficients (MFCC) have produced high error rates because they carry poor phoneme information. Classifier-transformed probabilistic features have proved a better alternative for conveying phoneme information. However, the high dimensionality of the probabilistic features introduces additional complexity that degrades ASR performance. This study aims to improve MVR performance by proposing an algorithm that transforms MFCC FVs into a new set of features using Multinomial Logistic Regression (MLR), thereby reducing the dimensionality of the probabilistic features. The study was carried out in four phases: pre-processing and feature extraction, generation of the best regression coefficients, feature transformation, and performance evaluation. The speech corpus consists of 1953 samples of the five Malay vowels /a/, /e/, /i/, /o/ and /u/, recorded from students of two public universities in Malaysia. Two algorithms were developed: DBRCs and FELT. The DBRCs algorithm (determining the best regression coefficients) obtains the best set of regression coefficients (RCs) from the extracted 39-MFCC FVs through a resampling and data-swapping approach. The FELT algorithm transforms the 39-MFCC FVs into FELT FVs using a logistic transformation method. Vowel recognition rates of FELT and 39-MFCC FVs were compared using four classification techniques: Artificial Neural Network, MLR, Linear Discriminant Analysis, and k-Nearest Neighbour. Classification results showed that FELT FVs outperform 39-MFCC FVs in MVR. Depending on the classifier used, FELT improved on MFCC by 1.48% to 11.70%. Furthermore, FELT significantly improved the recognition accuracy of the vowels /o/ and /u/, by 5.13% and 8.04% respectively. This study contributes two algorithms: one for determining the best set of RCs and one for generating FELT FVs from MFCC. The FELT FVs eliminate the need for a separate dimensionality reduction step while delivering comparable performance. FELT FVs improved MVR for all five vowels, especially /o/ and /u/. The improved MVR performance will spur the development of Malay speech-based systems, especially for the Malaysian community.
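The abstract does not spell out the DBRCs or FELT procedures, so the following is only a rough sketch of the general idea it describes: fit a multinomial logistic regression on 39-dimensional feature vectors and use the fitted coefficients to map each vector to a much shorter probability-based vector. Everything in it is an assumption made for illustration. The data are synthetic stand-ins for MFCC features, the function names `fit_mlr` and `to_log_ratios` are hypothetical, and the choice of log probability ratios against the last class is a guess prompted by the thesis title, not the author's documented method.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum before exponentiating for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_mlr(X, y, n_classes, lr=0.05, epochs=500):
    """Fit multinomial logistic regression by batch gradient descent.

    Returns a (n_features + 1, n_classes) coefficient matrix; row 0 is the intercept.
    """
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    Y = np.eye(n_classes)[y]                  # one-hot targets
    W = np.zeros((Xb.shape[1], n_classes))
    for _ in range(epochs):
        P = softmax(Xb @ W)
        W -= lr * Xb.T @ (P - Y) / X.shape[0]  # gradient of the mean cross-entropy
    return W

def to_log_ratios(X, W, eps=1e-12):
    """Map each feature vector to log probability ratios against the last class."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    P = softmax(Xb @ W)
    return np.log(P[:, :-1] + eps) - np.log(P[:, -1:] + eps)

# Synthetic stand-in for 39-dimensional MFCC vectors of five vowel classes.
rng = np.random.default_rng(0)
n_classes, n_features, n_per_class = 5, 39, 40
centres = rng.normal(scale=2.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centres])
y = np.repeat(np.arange(n_classes), n_per_class)

# Standardise each feature so the fixed learning rate behaves well.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

W = fit_mlr(Xs, y, n_classes)
Z = to_log_ratios(Xs, W)

# The reference class has log ratio 0, so append a zero column to recover predictions.
pred = np.argmax(np.hstack([Z, np.zeros((Z.shape[0], 1))]), axis=1)
accuracy = float((pred == y).mean())

print(Xs.shape, "->", Z.shape)  # (200, 39) -> (200, 4)
```

In this sketch the transformed vectors have one dimension fewer than the number of classes, which illustrates how a classifier-based transformation can yield compact features without a separate dimensionality reduction step. The resampling and data-swapping selection of coefficients mentioned in the abstract is not reproduced here.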
format Thesis
qualification_name other
qualification_level Doctorate
author Atanda, Abdulwahab Funsho
title Multinomial logistic regression probability ratio-based feature vectors for Malay vowel recognition
granting_institution Universiti Utara Malaysia
granting_department Awang Had Salleh Graduate School of Arts & Sciences
publishDate 2021
url https://etd.uum.edu.my/9212/1/s95101_01.pdf
https://etd.uum.edu.my/9212/2/s95101_02.pdf
https://etd.uum.edu.my/9212/3/s95101_references.docx
https://etd.uum.edu.my/9212/5/depositpermission-allow-not%20allow_s95101.pdf