Incorporating MAP estimation and covariance transform for SVM based speaker recognition

Cheung Chi Leung, Donglai Zhu, Kong Aik Lee, Bin Ma, Haizhou Li

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

In this paper, we apply the Constrained Maximum a Posteriori Linear Regression (CMAPLR) transformation to the Universal Background Model (UBM) when characterizing each speaker with a supervector. We incorporate the covariance transformation parameters into the supervector in addition to the mean transformation parameters, adopting the Maximum Likelihood Linear Regression (MLLR) covariance transformation. We also present the auxiliary function maximization involved in Maximum Likelihood (ML) and Maximum a Posteriori (MAP) estimation. Our experiments on the 2006 NIST Speaker Recognition Evaluation (SRE) corpus show that the two proposed techniques, MAP estimation of the transform and inclusion of the covariance transform in the supervector, provide substantial performance improvement.
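As a rough illustration of the supervector idea described in the abstract, the sketch below concatenates the parameters of a mean transform and a covariance transform into one flat vector that could be passed to an SVM. It is not the authors' implementation: the function names, the parameterisation of the mean transform as a matrix `mean_A` plus bias `mean_b`, the single square covariance matrix `cov_H`, and the ordering of the entries are all assumptions made for illustration, and the estimation of the transforms themselves (ML or MAP) is not shown.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def flatten(matrix: Matrix) -> List[float]:
    """Flatten a matrix row by row into a list of floats."""
    return [float(value) for row in matrix for value in row]


def check_square(matrix: Matrix, dim: int, name: str) -> None:
    """Raise ValueError unless `matrix` is dim x dim."""
    if len(matrix) != dim or any(len(row) != dim for row in matrix):
        raise ValueError(f"{name} must be {dim} x {dim}")


def build_supervector(mean_A: Matrix,
                      mean_b: Sequence[float],
                      cov_H: Matrix) -> List[float]:
    """Stack transform parameters into a single supervector.

    Hypothetical layout (an assumption, not taken from the paper):
    [entries of mean_A, entries of mean_b, entries of cov_H].
    """
    dim = len(mean_b)
    check_square(mean_A, dim, "mean_A")
    check_square(cov_H, dim, "cov_H")
    return flatten(mean_A) + [float(v) for v in mean_b] + flatten(cov_H)
```

For a feature dimension d, this layout yields a vector of length 2d² + d per transform. A system using several regression classes would concatenate one such block per class.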

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher: International Speech Communication Association
Pages: 2318-2321
Number of pages: 4
Publication status: Published - Sept 2010
Externally published: Yes

Keywords

  • MAPLR
  • MLLR
  • Speaker adaptation
  • Speaker recognition

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
