Digital speech watermarking for online speaker recognition systems

Speaker recognition is popular and feasible for online applications such as the telephone or network. However, low recognition performance and various vulnerable slots in online speaker recognition systems are two main problems. Although some of these slots can be secured by digital speech watermark...

Bibliographic Details
Main Author:	Nematollahi, Mohammad Ali
Format:	Thesis
Language:	English
Published:	2015
Subjects:
Online Access:	http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158IR.pdf

id	my-upm-ir.65614
record_format	uketd_dc
spelling	my-upm-ir.65614 2018-10-03T01:40:23Z Digital speech watermarking for online speaker recognition systems 2015-06 Nematollahi, Mohammad Ali 2015-06 Thesis http://psasir.upm.edu.my/id/eprint/65614/ http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158IR.pdf text en public doctoral Universiti Putra Malaysia
institution	Universiti Putra Malaysia
collection	PSAS Institutional Repository
language	English
topic
spellingShingle	Nematollahi, Mohammad Ali Digital speech watermarking for online speaker recognition systems
description	Speaker recognition is popular and feasible for online applications such as the telephone or network. However, low recognition performance and various vulnerable slots in online speaker recognition systems are two main problems. Although some of these slots can be secured by digital speech watermarking, applying a robust watermark can still seriously degrade the recognition performance of online speaker recognition systems. The main aim of this thesis was to improve the security of the communication channel, the robustness, and the recognition performance of online speaker recognition systems by applying digital speech watermarking. In this thesis, a Multi-Factor Authentication (MFA) method was used that combines a PIN and a voice biometric through the watermarks. For this reason, a double digital speech watermarking scheme was developed to embed semi-fragile and robust watermarks simultaneously in the speech signal, providing tamper detection and proof of ownership respectively. For the blind semi-fragile digital speech watermarking technique, the Discrete Wavelet Packet Transform (DWPT) and Quantization Index Modulation (QIM) were used to embed the watermark in an angle of the wavelet sub-bands where more speaker-specific information was available. For watermarking the encrypted PIN in the voice, a blind and robust digital speech watermarking technique was used that applies DWPT and multiplication. The PIN was embedded by manipulating the amplitude of the wavelet sub-bands where less speaker-specific information was available. A frame selection technique was also applied to weigh the amount of speaker-specific information available inside the speech frames. In the developed frame selection technique, Linear Predictive Analysis (LPA) was applied to separate the system features (formants) and source features (residual errors) of the speech frames. Then, a frequency-weighted function was used to quantify the formants. High-order correlation and high-order statistics were used for weighting the residual errors. Frames with lower weights could be ignored by online speaker recognition systems but used for digital speech watermarking. The TIMIT, MIT, and MOBIO speech corpora were used to evaluate the developed systems. The experimental results showed that the combination of DWPT and multiplication for robust digital speech watermarking was more robust than other robust watermarking techniques, such as Discrete Wavelet Transform (DWT) with Singular Value Decomposition (SVD) and Lifting Wavelet Transform (LWT) with SVD, against attacks such as filtering, additive noise, compression, re-quantization, resampling, and other signal processing attacks. Furthermore, this technique caused less degradation in speaker verification and identification performance, 1.16% and 2.52% respectively. For the semi-fragile watermark, the degradation in speaker verification and identification was 0.39% and 0.97% respectively, which can be ignored. Twenty percent of the speech frames could be watermarked without seriously degrading recognition performance. The identification rate and Equal Error Rate (EER) were improved to 100% and 0% respectively by applying digital speech watermarking. In conclusion, digital speech watermarking can enhance the security of online speaker recognition systems against spoofing and communication attacks while improving recognition performance.
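Illustrative note: the abstract names Quantization Index Modulation (QIM) as the embedding step of the semi-fragile watermark but does not give its details. The following is a minimal sketch of generic scalar QIM, not the thesis's implementation. The function names, the step size, and the sample values standing in for sub-band angles are all assumptions made for illustration.

```python
def qim_embed(value: float, bit: int, step: float) -> float:
    """Quantize value onto the lattice assigned to bit (0 or 1).

    Bit 0 uses multiples of step; bit 1 uses the same lattice
    shifted by step / 2 (a common scalar QIM construction).
    """
    offset = bit * step / 2.0
    return step * round((value - offset) / step) + offset


def qim_extract(value: float, step: float) -> int:
    """Blind extraction: return the bit whose lattice is nearest to value."""
    d0 = abs(value - qim_embed(value, 0, step))
    d1 = abs(value - qim_embed(value, 1, step))
    return 0 if d0 <= d1 else 1


# Hypothetical host values (stand-ins for sub-band angles) and payload bits.
angles = [0.31, 1.02, -0.47, 2.20]
bits = [1, 0, 1, 1]
step = 0.2  # assumed quantization step

marked = [qim_embed(a, b, step) for a, b in zip(angles, bits)]
recovered = [qim_extract(m, step) for m in marked]
```

In this construction the extractor needs only the step size, not the original signal, which is what "blind" refers to. A larger step tolerates more distortion before bits flip (changes below step / 4 are absorbed) at the cost of a larger change to the host value (at most step / 2).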
format	Thesis
qualification_level	Doctorate
author	Nematollahi, Mohammad Ali
author_facet	Nematollahi, Mohammad Ali
author_sort	Nematollahi, Mohammad Ali
title	Digital speech watermarking for online speaker recognition systems
title_short	Digital speech watermarking for online speaker recognition systems
title_full	Digital speech watermarking for online speaker recognition systems
title_fullStr	Digital speech watermarking for online speaker recognition systems
title_full_unstemmed	Digital speech watermarking for online speaker recognition systems
title_sort	digital speech watermarking for online speaker recognition systems
granting_institution	Universiti Putra Malaysia
publishDate	2015
url	http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158IR.pdf
_version_	1747812350132486144

Similar Items