Feature extraction design for embedded neural network urban sound classifier

Bibliographic Details
Main Author: Lim, Chin Shen
Format: Thesis
Language: English
Published: 2021
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/id/eprint/99482/1/LimChinShenMKE2021.pdf
id my-utm-ep.99482
record_format uketd_dc
spelling my-utm-ep.99482 2023-02-27T07:36:20Z Feature extraction design for embedded neural network urban sound classifier 2021 Lim, Chin Shen TK Electrical engineering. Electronics Nuclear engineering 2021 Thesis http://eprints.utm.my/id/eprint/99482/ http://eprints.utm.my/id/eprint/99482/1/LimChinShenMKE2021.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:149770 masters Universiti Teknologi Malaysia Faculty of Engineering - School of Electrical Engineering
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic TK Electrical engineering. Electronics Nuclear engineering
description Urban sound research has attracted growing interest in recent years because identifying noise sources supports city growth monitoring and surveillance applications. Sound identification is challenging, however, because multiple sound sources are blended together, and as cities become more developed, new sounds appear that recent studies have not yet classified. In recent audio classification work, sound features are extracted from an image of the signal's time-frequency representation, known as a spectrogram. This project aims to design a noise-robust neural network classifier for urban sounds that is implemented on an embedded system. Two feature extractors that convert audio to an image are explored and compared to find the better features for urban sound. Mel Frequency Cepstral Coefficient (MFCC) is widely used in sound classifiers with good results, while Gammatone Frequency Cepstral Coefficient (GFCC) is an emerging feature extractor reported to perform better on noisy data. UrbanSound8K, which contains 8732 labelled sounds in ten classes, is used as the dataset. Noise at different decibel levels was added to the dataset to simulate real urban sound conditions and to evaluate the noise robustness of the two feature extractors. Because the audio is converted into an image for classification, a Convolutional Neural Network (CNN) model is employed, as CNNs are among the best-performing machine learning models for images. Because the design targets embedded system applications, the lightweight CNN model MobileNetV2 is used. The feature extractors and the neural network model are developed in Python with the TensorFlow library. The experimental results show that MFCC outperforms GFCC in classification accuracy by an average of 14.34% across all SNR levels. MFCC is also more robust to noise: relative to the noiseless baseline, its accuracy drops by 2.75% and 2.87% at the 30 dB and 10 dB noise levels respectively, whereas GFCC's accuracy drops by 6.18% and 3.87% at the same levels.
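The noise-injection step mentioned in the description can be illustrated with a minimal sketch. It rests on assumptions that the record does not confirm: that the 30 dB and 10 dB levels are signal-to-noise ratios, that the added noise is white Gaussian, and that a synthetic tone can stand in for a dataset clip. The function names add_noise_at_snr and measured_snr_db are hypothetical and are not taken from the thesis.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to `snr_db` (assumed noise model)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, 1.0, size=signal.shape)
    # Rescale this noise realisation so its power matches the target exactly.
    noise *= np.sqrt(noise_power / np.mean(noise ** 2))
    return signal + noise


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB of `noisy` relative to the known `clean` signal."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))


# Synthetic stand-in for an audio clip: one second of a 440 Hz tone at 22050 Hz.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)

rng = np.random.default_rng(0)
noisy_30 = add_noise_at_snr(clean, 30.0, rng)  # mild noise
noisy_10 = add_noise_at_snr(clean, 10.0, rng)  # heavy noise

print(round(measured_snr_db(clean, noisy_30), 2))
print(round(measured_snr_db(clean, noisy_10), 2))
```

Under these assumptions, each noisy copy would then pass through the MFCC or GFCC extractor before classification, so that accuracy at each noise level can be compared against the noiseless baseline.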
format Thesis
qualification_level Master's degree
author Lim, Chin Shen
title Feature extraction design for embedded neural network urban sound classifier
granting_institution Universiti Teknologi Malaysia
granting_department Faculty of Engineering - School of Electrical Engineering
publishDate 2021
url http://eprints.utm.my/id/eprint/99482/1/LimChinShenMKE2021.pdf
_version_ 1776100601972654080