Speaker verification from coded telephone speech using stochastic feature transformation and handset identification

Eric W.M. Yu, Man Wai Mak, Sun Yuan Kung

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

8 Citations (Scopus)

Abstract

A handset compensation technique for speaker verification from coded telephone speech is proposed. The technique combines handset selectors with stochastic feature transformation to reduce the acoustic mismatch between different handsets and different speech coders. Coder-dependent GMM-based handset selectors are trained to identify the handset most likely used by each claimant. Stochastic feature transformations are then applied to remove the acoustic distortion introduced by the coder and the handset. Experimental results show that the proposed technique outperforms the cepstral mean subtraction (CMS) approach and significantly reduces the error rates under six different coders with bit rates ranging from 2.4 kb/s to 64 kb/s. A strong correlation between speech quality and verification performance is also observed.
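
The two-stage pipeline described in the abstract (select the most likely handset, then transform the features) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the handset names, the toy single-component GMM parameters, and the per-dimension affine form of the transformation (`scale` and `offset`) are all assumptions made for the example, and in practice such parameters would be estimated from training data.

```python
import numpy as np


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D), weights: (M,), means: (M, D), variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalisation term
    )                                                        # (T, M)
    log_weighted = log_comp + np.log(weights)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(log_frame.mean())


def select_handset(frames, selectors):
    """Return the name of the handset GMM that best explains the frames."""
    scores = {
        name: gmm_avg_loglik(frames, *params)
        for name, params in selectors.items()
    }
    return max(scores, key=scores.get)


def transform(frames, scale, offset):
    """Apply an assumed per-dimension affine feature transformation."""
    return frames * scale + offset


# Hypothetical selectors: one single-component GMM per handset (toy values).
selectors = {
    "handset_a": (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]])),
    "handset_b": (np.array([1.0]), np.array([[3.0, 3.0]]), np.array([[1.0, 1.0]])),
}

# Hypothetical transformation parameters per handset (toy values).
transforms = {
    "handset_a": (np.ones(2), np.zeros(2)),
    "handset_b": (np.ones(2), np.array([-3.0, -3.0])),
}

# Synthetic "distorted" feature frames that resemble handset_b.
rng = np.random.default_rng(0)
frames = rng.normal(loc=3.0, scale=1.0, size=(200, 2))

chosen = select_handset(frames, selectors)
compensated = transform(frames, *transforms[chosen])
print(chosen, compensated.mean(axis=0))
```

In a coder-dependent setup like the one the abstract mentions, one such set of selectors and transformations would presumably be kept per speech coder; the sketch above shows a single set only.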
Original language: English
Title of host publication: Advances in Multimedia Information Processing - PCM 2002 - 3rd IEEE Pacific Rim Conference on Multimedia, Proceedings
Publisher: Springer Verlag
Pages: 598-606
Number of pages: 9
ISBN (Print): 3540002626, 9783540002628
Publication status: Published - 1 Jan 2002
Event: 3rd IEEE Pacific Rim Conference on Multimedia, PCM 2002 - Hsinchu, Taiwan
Duration: 16 Dec 2002 to 18 Dec 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2532
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd IEEE Pacific Rim Conference on Multimedia, PCM 2002
Country/Territory: Taiwan
City: Hsinchu
Period: 16/12/02 to 18/12/02

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
