Isolated English alphabet speech recognition using wavelet cepstral coefficients and neural network

Speech recognition has many applications in various fields. One of the most important phases in speech recognition is feature extraction, in which relevant information is extracted from the speech signal. However, two important issues that affect feature extraction are noise r...

Bibliographic Details
Main Author: Adam, Tarmizi
Format: Thesis
Language: English
Published: 2014
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/78047/1/TarmiziAdamMFC20141.pdf
id my-utm-ep.78047
record_format uketd_dc
spelling my-utm-ep.78047 2018-07-23T05:33:14Z 2014-03 Thesis http://eprints.utm.my/id/eprint/78047/ http://eprints.utm.my/id/eprint/78047/1/TarmiziAdamMFC20141.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:83709 masters Universiti Teknologi Malaysia, Faculty of Computing
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers
Computer science
description Speech recognition has many applications in various fields. One of the most important phases in speech recognition is feature extraction, in which relevant information is extracted from the speech signal. However, two important issues that affect feature extraction are noise robustness and high feature dimension. Existing feature extraction methods that use fixed-window processing and spectral analysis, such as the Mel-Frequency Cepstral Coefficient (MFCC), do not adequately address the robustness and high feature dimension problems. This research proposes using the Discrete Wavelet Transform (DWT) in place of the Discrete Fourier Transform (DFT) when calculating the cepstrum coefficients, producing a newly proposed feature, the Wavelet Cepstral Coefficient (WCC). The DWT is used in order to gain the advantages of the wavelet in analyzing non-stationary signals. The WCC is computed frame by frame. Each speech frame is decomposed using the DWT, and the log energy of its coefficients is taken. The final stage of the WCC computation takes the Discrete Cosine Transform (DCT) of these log energies to form the WCC. The WCC are then fed into a Neural Network (NN) for classification. To test the proposed WCC, a series of experiments was conducted on the TI-ALPHA dataset to compare its performance with that of the MFCC. The experiments were conducted under several noise levels, using Additive White Gaussian Noise (AWGN), and with varying numbers of coefficients, for both speaker-dependent and speaker-independent tasks. The results show that the WCC withstands noisy conditions better than the MFCC, especially with a small number of features, for both speaker-dependent and speaker-independent tasks. The best result under a noisy condition of 25 dB shows that 30 WCC coefficients using Daubechies 12 achieved a 71.79% recognition rate, compared with only 37.62% using MFCC under the same constraint.
The main contribution of this research is the development of the WCC features, which perform better than the MFCC under noisy signals and with a reduced number of feature coefficients.
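The pipeline described above (frame, DWT, log energy, DCT) can be illustrated with a minimal Python sketch. This is not the thesis implementation. It assumes a Haar wavelet in place of the Daubechies 12 wavelet named in the abstract, one log energy per DWT subband, and hypothetical values for frame length, hop size, and decomposition depth, none of which are stated in this record. The function names (haar_dwt_step, subband_log_energies, dct_ii, wcc, add_awgn) are illustrative only.

```python
import math
import random


def haar_dwt_step(x):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(x) - 1, 2)
    approx = [(x[i] + x[i + 1]) / s for i in pairs]
    detail = [(x[i] - x[i + 1]) / s for i in pairs]
    return approx, detail


def subband_log_energies(frame, levels):
    """Log energy of each subband, coarsest approximation first.

    Assumption: one energy per subband; the record does not say how the
    thesis groups the DWT coefficients.
    """
    eps = 1e-12  # avoids log(0) on silent subbands
    approx = list(frame)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    bands = [approx] + details[::-1]
    return [math.log(sum(c * c for c in band) + eps) for band in bands]


def dct_ii(values):
    """Unnormalized DCT-II of a short sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n)
            for i, v in enumerate(values))
        for k in range(n)
    ]


def wcc(signal, frame_len=256, hop=128, levels=5):
    """Frame-by-frame wavelet cepstral features (hypothetical parameters)."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        features.append(dct_ii(subband_log_energies(frame, levels)))
    return features


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise scaled to a target SNR in dB."""
    power = sum(s * s for s in signal) / len(signal)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, noise_std) for s in signal]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic 440 Hz tone at an assumed 8 kHz sampling rate.
    clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(1024)]
    noisy = add_awgn(clean, 25, rng)
    feats = wcc(noisy)
    print(len(feats), len(feats[0]))
```

With these assumed settings, a 1024-sample signal yields 7 frames of 6 coefficients each (5 detail subbands plus 1 approximation subband). The resulting vectors would then be passed to a classifier such as the neural network mentioned in the abstract. How the thesis arrives at 30 coefficients is not described in this record.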
format Thesis
qualification_level Master's degree
author Adam, Tarmizi
title Isolated English alphabet speech recognition using wavelet cepstral coefficients and neural network
granting_institution Universiti Teknologi Malaysia, Faculty of Computing
granting_department Faculty of Computing
publishDate 2014
url http://eprints.utm.my/id/eprint/78047/1/TarmiziAdamMFC20141.pdf