Environment adaptation for robust speaker verification

Kwok Kwong Yiu, Man Wai Mak, Sun Yuan Kung

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

18 Citations (Scopus)


In speaker verification over public telephone networks, utterances can be obtained from different types of handsets. Different handsets may introduce different degrees of distortion to the speech signals. This paper attempts to combine a handset selector with (1) handset-specific transformations and (2) handset-dependent speaker models to reduce the effect caused by the acoustic distortion. Specifically, a number of Gaussian mixture models are independently trained to identify the most likely handset given a test utterance; then during recognition, the speaker model and background model are either transformed by MLLR-based handset-specific transformation or respectively replaced by a handset-dependent speaker model and a handset-dependent background model whose parameters were adapted by reinforced learning to fit the new environment. Experimental results based on 150 speakers of the HTIMIT corpus show that environment adaptation based on both MLLR and reinforced learning outperforms the classical CMS, Hnorm and Tnorm approaches, with MLLR adaptation achieves the best performance.
Original languageEnglish
Title of host publicationEUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
PublisherInternational Speech Communication Association
Number of pages4
Publication statusPublished - 1 Jan 2003
Event8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
Duration: 1 Sep 20034 Sep 2003


Conference8th European Conference on Speech Communication and Technology, EUROSPEECH 2003

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Linguistics and Language
  • Communication


Dive into the research topics of 'Environment adaptation for robust speaker verification'. Together they form a unique fingerprint.

Cite this