Implementation of feature extraction and classification for speech dysfluencies
Speech is prone to disruption by involuntary dysfluent events, especially repetitions and prolongations of sounds, syllables, and words, which lead to dysfluency in communication. Traditionally, speech-language pathologists manually count and classify occurrences of dysfluencies in the flow of speech. Ho...
Saved in:
Main Author: Lim, Sin Chee
Format: Thesis
Language: English
Subjects: Speech dysfluencies; Speech Communication
Online Access: http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/21614/1/Full%20text.pdf http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/21614/2/p.%201-24.pdf
id |
my-unimap-21614 |
record_format |
uketd_dc |
spelling |
my-unimap-216142012-11-05T06:46:49Z Implementation of feature extraction and classification for speech dysfluencies Lim, Sin Chee Speech is prone to disruption by involuntary dysfluent events, especially repetitions and prolongations of sounds, syllables, and words, which lead to dysfluency in communication. Traditionally, speech-language pathologists manually count and classify occurrences of dysfluencies in the flow of speech. However, this type of assessment is subjective, inconsistent, time-consuming, and prone to error. Over the last three decades, many research works have sought to automate the conventional assessments with various approaches, such as speech signal analysis, personal variables, acoustic analysis of the speech signal, and artificial intelligence techniques. From previous works, it can be concluded that feature extraction methods and classification techniques play important roles in this research field. Therefore, in this work, several feature extraction methods, namely Short Time Fourier Transform (STFT), Mel-Frequency Cepstral Coefficients (MFCC), and Linear Predictive Coding (LPC) based parameterization, were proposed to extract the salient features of the two types of dysfluencies. By applying these feature extraction methods to each signal, a total of seven acoustical features are extracted: STFT, MFCC, and five acoustical features from LPC-based parameterization, namely Linear Predictive Coefficients (LPC), Linear Predictive Cepstral Coefficients (LPCC), Weighted Linear Predictive Cepstral Coefficients (WLPCC), First Order Temporal Derivatives (FOTD), and Second Order Temporal Derivatives (SOTD). The acoustical features extracted from the signal are used as input parameters for the classifiers. 
Both linear and nonlinear classifiers, namely Linear Discriminant Analysis (LDA), k-Nearest Neighbor (kNN), and Least-Squares Support Vector Machines (LSSVM) with a linear kernel (SLIN) and a Radial Basis Function kernel (SRBF), were suggested to classify the two types of dysfluencies. To evaluate the effectiveness of the different feature extraction methods and classification techniques, a standard database, University College London's Archive of Stuttered Speech (UCLASS), is used. The reliability of the classification accuracy is ensured by adopting two validation schemes, namely conventional validation and ten-fold cross-validation. For further analysis, parameter selection for the respective classifiers and parameter variations, namely the order of the LPC-based parameterization, the coefficient controlling the degree of pre-emphasis filtering, and the frame length and overlap percentages in the signal pre-processing, are investigated. The results show that the highest classification accuracy is achieved by the STFT features with the SLIN classifier. By observing the classification accuracies obtained from the different acoustical features and classifiers, it can be concluded that the correlation between acoustical features and classifiers must be evaluated in order to achieve the best classification accuracy. In conclusion, the proposed feature extraction methods and classifiers can be used for speech dysfluency classification. Finally, a Graphical User Interface for this work is developed in MATLAB® based on the results achieved in the experiments. 
Universiti Malaysia Perlis (UniMAP) 2011 Thesis en http://dspace.unimap.edu.my/123456789/21614 http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/21614/1/Full%20text.pdf ffa170e8bb9305596442ea34464ac3ec http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/21614/2/p.%201-24.pdf 5a544f10267a2115f05aa1beddd2ab66 http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/21614/3/license.txt 8b9bcae6cbdaa805119b774a20ae34bb Speech dysfluencies Speech Communication School of Mechatronic Engineering |
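The pre-processing and spectral feature steps named in the abstract (pre-emphasis filtering, framing with a chosen frame length and overlap percentage, and STFT magnitude features) can be sketched as follows. This is a minimal Python/NumPy illustration, not the thesis's MATLAB implementation; the pre-emphasis coefficient of 0.95, the 16 kHz sampling rate, the 20 ms frame length, and the 50% overlap are assumed example values.

```python
import numpy as np

def preemphasis(signal, alpha=0.95):
    # First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_signal(signal, frame_len, hop):
    # Slice the signal into overlapping frames via an index matrix
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def stft_features(signal, fs=16000, frame_ms=20, overlap=0.5):
    # Pre-emphasize, frame, window, then take the magnitude spectrum per frame
    frame_len = int(fs * frame_ms / 1000)
    hop = int(frame_len * (1 - overlap))
    frames = frame_signal(preemphasis(signal), frame_len, hop)
    frames = frames * np.hamming(frame_len)        # taper each frame
    return np.abs(np.fft.rfft(frames, axis=1))     # one spectrum per row
```

Varying `alpha`, `frame_ms`, and `overlap` here corresponds to the pre-emphasis, frame-length, and overlap-percentage variations the abstract says were investigated.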
institution |
Universiti Malaysia Perlis |
collection |
UniMAP Institutional Repository |
language |
English |
topic |
Speech dysfluencies Speech Communication |
spellingShingle |
Speech dysfluencies Speech Communication Lim, Sin Chee Implementation of feature extraction and classification for speech dysfluencies |
description |
Speech is prone to disruption by involuntary dysfluent events, especially repetitions
and prolongations of sounds, syllables, and words, which lead to dysfluency in
communication. Traditionally, speech-language pathologists manually count and
classify occurrences of dysfluencies in the flow of speech. However, this type of
assessment is subjective, inconsistent, time-consuming, and prone to error. Over the
last three decades, many research works have sought to automate the conventional
assessments with various approaches, such as speech signal analysis, personal
variables, acoustic analysis of the speech signal, and artificial intelligence
techniques. From previous works, it can be concluded that feature extraction methods
and classification techniques play important roles in this research field. Therefore,
in this work, several feature extraction methods, namely Short Time Fourier Transform
(STFT), Mel-Frequency Cepstral Coefficients (MFCC), and Linear Predictive Coding
(LPC) based parameterization, were proposed to extract the salient features of the
two types of dysfluencies. By applying these feature extraction methods to each
signal, a total of seven acoustical features are extracted: STFT, MFCC, and five
acoustical features from LPC-based parameterization, namely Linear Predictive
Coefficients (LPC), Linear Predictive Cepstral Coefficients (LPCC), Weighted Linear
Predictive Cepstral Coefficients (WLPCC), First Order Temporal Derivatives (FOTD),
and Second Order Temporal Derivatives (SOTD). The acoustical features extracted from
the signal are used as input parameters for the classifiers. Both linear and
nonlinear classifiers, namely Linear Discriminant Analysis (LDA), k-Nearest Neighbor
(kNN), and Least-Squares Support Vector Machines (LSSVM) with a linear kernel (SLIN)
and a Radial Basis Function kernel (SRBF), were suggested to classify the two types
of dysfluencies. To evaluate the effectiveness of the different feature extraction
methods and classification techniques, a standard database, University College
London's Archive of Stuttered Speech (UCLASS), is used. The reliability of the
classification accuracy is ensured by adopting two validation schemes, namely
conventional validation and ten-fold cross-validation. For further analysis,
parameter selection for the respective classifiers and parameter variations, namely
the order of the LPC-based parameterization, the coefficient controlling the degree
of pre-emphasis filtering, and the frame length and overlap percentages in the signal
pre-processing, are investigated. The results show that the highest classification
accuracy is achieved by the STFT features with the SLIN classifier. By observing the
classification accuracies obtained from the different acoustical features and
classifiers, it can be concluded that the correlation between acoustical features and
classifiers must be evaluated in order to achieve the best classification accuracy.
In conclusion, the proposed feature extraction methods and classifiers can be used
for speech dysfluency classification. Finally, a Graphical User Interface for this
work is developed in MATLAB® based on the results achieved in the experiments. |
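As an illustration of the classification and validation stage described above, the sketch below implements one of the named classifiers, k-Nearest Neighbor (kNN), together with ten-fold cross-validation, in Python/NumPy rather than the thesis's MATLAB. The choice of k = 3, the random seed, and the synthetic two-class data in the usage example are assumptions for demonstration only.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    # Euclidean distance from every test sample to every training sample
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]       # indices of the k closest
    votes = y_train[nearest]                     # their labels
    # Majority vote per test sample (labels must be non-negative ints)
    return np.array([np.bincount(v).argmax() for v in votes])

def ten_fold_cv(X, y, k=3, seed=0):
    # Shuffle once, split into 10 folds, hold each fold out in turn
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    folds = np.array_split(order, 10)
    accs = []
    for i in range(10):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(10) if j != i])
        pred = knn_predict(X[train], y[train], X[test], k)
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))                  # mean accuracy over folds
```

In the thesis's setting, `X` would hold the extracted acoustical feature vectors and `y` the dysfluency class (repetition vs. prolongation); averaging accuracy over the ten held-out folds is what gives the reliability the abstract attributes to ten-fold cross-validation.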
format |
Thesis |
author |
Lim, Sin Chee |
author_facet |
Lim, Sin Chee |
author_sort |
Lim, Sin Chee |
title |
Implementation of feature extraction and classification for speech dysfluencies |
title_short |
Implementation of feature extraction and classification for speech dysfluencies |
title_full |
Implementation of feature extraction and classification for speech dysfluencies |
title_fullStr |
Implementation of feature extraction and classification for speech dysfluencies |
title_full_unstemmed |
Implementation of feature extraction and classification for speech dysfluencies |
title_sort |
implementation of feature extraction and classification for speech dysfluencies |
granting_institution |
Universiti Malaysia Perlis (UniMAP) |
granting_department |
School of Mechatronic Engineering |
url |
http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/21614/1/Full%20text.pdf http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/21614/2/p.%201-24.pdf |
_version_ |
1747836773598232576 |