DNN-Based Score Calibration with Multitask Learning for Noise Robust Speaker Verification

Zhili Tan, Man Wai Mak, Brian Kan Wing Mak

Research output: Journal article publication; Journal article; Academic research; peer-review

5 Citations (Scopus)

Abstract

This paper proposes and investigates several deep neural network (DNN) based score compensation, transformation, and calibration algorithms for enhancing the noise robustness of i-vector speaker verification systems. Unlike conventional calibration methods where the required score shift is a linear function of SNR or log-duration, the DNN approach learns the complex relationship between the score shifts and the combination of i-vector pairs and uncalibrated scores. Furthermore, with the flexibility of DNNs, it is possible to explicitly train a DNN to recover the clean scores without having to estimate the score shifts. To alleviate the overfitting problem, multitask learning is applied to incorporate auxiliary information such as SNRs and speaker IDs of training utterances into the DNN. Experiments on NIST 2012 SRE show that score calibration derived from multitask DNNs can significantly improve the performance of the conventional score-shift approach, especially under noisy conditions.
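As a point of reference for the conventional baseline mentioned in the abstract, the following is a minimal sketch of a score shift modelled as a linear function of SNR. It is not the authors' implementation: the function names, the toy numbers, and the sign convention (shift = noisy score minus clean score, calibrated score = noisy score minus predicted shift) are all assumptions made for illustration. The DNN-based methods in the paper replace this linear map with a learned nonlinear one that also takes the i-vector pair and the uncalibrated score as input.

```python
# Hypothetical sketch: linear SNR-dependent score-shift calibration.
# All names and numbers are illustrative assumptions, not from the paper.

def fit_linear_shift(snrs, shifts):
    """Least-squares fit of shift ~ a * snr + b (simple linear regression)."""
    n = len(snrs)
    mean_x = sum(snrs) / n
    mean_y = sum(shifts) / n
    var_x = sum((x - mean_x) ** 2 for x in snrs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(snrs, shifts))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def calibrate(noisy_score, snr, a, b):
    """Subtract the predicted shift from the uncalibrated (noisy) score."""
    return noisy_score - (a * snr + b)


# Toy training data: the same trial scored under clean and noisy conditions.
snrs = [0.0, 6.0, 15.0]          # SNR in dB (assumed values)
clean_scores = [2.0, 2.0, 2.0]   # scores with clean test utterances
noisy_scores = [0.5, 1.1, 2.0]   # scores with noisy test utterances

# Assumed convention: shift = noisy score - clean score.
shifts = [n - c for n, c in zip(noisy_scores, clean_scores)]

a, b = fit_linear_shift(snrs, shifts)
calibrated = [calibrate(s, snr, a, b) for s, snr in zip(noisy_scores, snrs)]
```

Because the toy shifts were constructed to lie exactly on a line, the calibrated scores coincide with the clean ones here; with real data the linear fit leaves a residual, which is the gap a nonlinear model is meant to close.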
Original language: English
Article number: 8249870
Pages (from-to): 700-712
Number of pages: 13
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 26
Issue number: 4
Publication status: Published - 1 Apr 2018

Keywords

  • Deep learning
  • multi-task learning
  • noise robustness
  • score calibration
  • speaker verification

ASJC Scopus subject areas

  • Signal Processing
  • Media Technology
  • Instrumentation
  • Acoustics and Ultrasonics
  • Linguistics and Language
  • Electrical and Electronic Engineering
  • Speech and Hearing
