Feature extraction design for embedded neural network urban sound classifier

Urban sound research has become a hot topic in recent years for observing city growth and for surveillance applications through noise source identification. However, sound identification is challenging because multiple sound sources are blended. There are also new sounds that are unclassifi...

Bibliographic Details
Main Author:	Lim, Chin Shen
Format:	Thesis
Language:	English
Published:	2021
Subjects:	TK Electrical engineering Electronics Nuclear engineering
Online Access:	http://eprints.utm.my/id/eprint/99482/1/LimChinShenMKE2021.pdf

id	my-utm-ep.99482
record_format	uketd_dc
spelling	my-utm-ep.99482 2023-02-27T07:36:20Z Feature extraction design for embedded neural network urban sound classifier 2021 Lim, Chin Shen TK Electrical engineering. Electronics Nuclear engineering 2021 Thesis http://eprints.utm.my/id/eprint/99482/ http://eprints.utm.my/id/eprint/99482/1/LimChinShenMKE2021.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:149770 masters Universiti Teknologi Malaysia Faculty of Engineering - School of Electrical Engineering
institution	Universiti Teknologi Malaysia
collection	UTM Institutional Repository
language	English
topic	TK Electrical engineering Electronics Nuclear engineering
spellingShingle	TK Electrical engineering Electronics Nuclear engineering Lim, Chin Shen Feature extraction design for embedded neural network urban sound classifier
description	Urban sound research has become a hot topic in recent years for observing city growth and for surveillance applications through noise source identification. However, sound identification is challenging because multiple sound sources are blended. There are also new sounds that recent studies have not classified, as regions of the city become more developed. In recent work on audio classification, sound features are extracted from an image of the time-frequency representation, otherwise known as a spectrogram. This project aims to design a noise-robust neural network urban sound classifier that is implemented on an embedded system. Two feature extractors that convert audio to an image are explored and compared to produce better features for urban sound. Mel Frequency Cepstral Coefficient (MFCC) is commonly used in sound classifiers with good results, while Gammatone Frequency Cepstral Coefficient (GFCC) is an emerging feature extractor said to be better at extracting features from noisy data. UrbanSound8K, which contains 8732 labelled sounds classified into eight classes, is used as the dataset. Noise at different decibel levels was added to the dataset to simulate the actual urban sound scenario and to explore the noise robustness of the two feature extractors. To classify urban sound, the audio is converted into an image. Therefore, a Convolutional Neural Network (CNN) model is employed because it is one of the best machine learning models for images. Since the design focuses on embedded system applications, the lightweight CNN model MobileNetV2 is used in this project. The feature extractor and the neural network model are developed using the Python language and the TensorFlow library. The experimental results show that MFCC outperforms GFCC in classification accuracy by an average of 14.34% across all SNR levels. MFCC is also more robust to noise in the dataset, with drops in accuracy of 2.75% and 2.87% at 30 dB and 10 dB noise respectively compared to the noiseless baseline, whereas GFCC has drops of 6.18% and 3.87% at 30 dB and 10 dB respectively.
format	Thesis
qualification_level	Master's degree
author	Lim, Chin Shen
author_facet	Lim, Chin Shen
author_sort	Lim, Chin Shen
title	Feature extraction design for embedded neural network urban sound classifier
title_short	Feature extraction design for embedded neural network urban sound classifier
title_full	Feature extraction design for embedded neural network urban sound classifier
title_fullStr	Feature extraction design for embedded neural network urban sound classifier
title_full_unstemmed	Feature extraction design for embedded neural network urban sound classifier
title_sort	feature extraction design for embedded neural network urban sound classifier
granting_institution	Universiti Teknologi Malaysia
granting_department	Faculty of Engineering - School of Electrical Engineering
publishDate	2021
url	http://eprints.utm.my/id/eprint/99482/1/LimChinShenMKE2021.pdf
_version_	1776100601972654080
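
The abstract in this record describes adding noise to the dataset at different levels (30 dB and 10 dB are reported) to test robustness. As a rough illustration of that step, the sketch below mixes noise into a clip at a target signal-to-noise ratio. It is not the thesis's code: the use of white Gaussian noise, the reading of the reported levels as SNR values, and the helper names `add_noise_at_snr` and `measured_snr_db` are all assumptions made for this example.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise at the requested SNR in dB.

    Hypothetical helper: the noise type is an assumption, not taken from the thesis.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, 1.0, size=signal.shape)
    # Rescale the drawn noise so its measured power matches the target power.
    noise *= np.sqrt(noise_power / np.mean(noise ** 2))
    return signal + noise


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB actually present in `noisy`, given the clean signal."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))


if __name__ == "__main__":
    sample_rate = 22050  # assumed sample rate for the synthetic test tone
    t = np.arange(sample_rate) / sample_rate
    clean = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)  # one second of a 440 Hz tone
    rng = np.random.default_rng(0)
    for snr in (30, 10):
        noisy = add_noise_at_snr(clean, snr, rng)
        print(f"target {snr} dB, measured {measured_snr_db(clean, noisy):.2f} dB")
```

In a pipeline like the one the abstract outlines, each noisy clip would then go through the MFCC or GFCC extractor before classification. Recorded urban background noise could be substituted for the Gaussian noise by replacing the `rng.normal` draw with a noise recording of the same length.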

Similar Items