A comparison of various adaptation methods for speaker verification with limited enrollment data

Man Wai Mak, Roger Hsiao, Brian Mak

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

21 Citations (Scopus)

Abstract

One key factor that hinders the widespread deployment of speaker verification technologies is the requirement of long enrollment utterances to guarantee a low error rate during verification. To gain user acceptance of speaker verification technologies, adaptation algorithms that can enroll speakers with short utterances are essential. To this end, this paper applies kernel eigenspace-based MLLR (KEMLLR) for speaker enrollment and compares its performance against three state-of-the-art model adaptation techniques: maximum a posteriori (MAP), maximum-likelihood linear regression (MLLR), and reference speaker weighting (RSW). The techniques were compared under the NIST2001 SRE framework, with enrollment data varying from 2 to 32 seconds. Experimental results show that KEMLLR is most effective for short enrollment utterances (between 2 and 4 seconds) and that MAP performs better when long utterances (32 seconds) are available.
Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings
Volume: 1
Publication status: Published - 1 Dec 2006
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 - Toulouse, France
Duration: 14 May 2006 to 19 May 2006

Conference

Conference: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006
Country: France
City: Toulouse
Period: 14/05/06 to 19/05/06

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
