A new approach to channel robust speaker verification via constrained stochastic feature transformation

Kwok Kwong Yiu, Man Wai Mak, Ming Cheung Cheung, Sun Yuan Kung

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

2 Citations (Scopus)

Abstract

This paper proposes a constrained stochastic feature transformation algorithm for robust speaker verification. The algorithm computes the feature transformation parameters based on the statistical difference between a test utterance and a composite GMM formed by combining the speaker and background models. The transformation is then applied to the test utterance so that it fits the clean speaker model and the background model before verification. Because the transformation is implicitly constrained, the transformed features can fit both models simultaneously. Experimental results based on the 2001 NIST evaluation set show that the proposed algorithm achieves significant improvements in both equal error rate and minimum detection cost when compared to cepstral mean subtraction and Z-norm. The performance of the proposed transformation approach is also slightly better than that of the short-time Gaussianization method proposed in [1].
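As a rough illustration of the estimation step summarized above, the following sketch fits a bias-only transformation y = x + b to a test utterance by EM against a composite diagonal-covariance GMM. It is not the authors' implementation: the bias-only form, the equal weighting of speaker and background components (`spk_share=0.5`), and all function names are assumptions made for this example; the abstract does not specify the transformation's form or how the composite model is built.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log density of each frame under each diagonal Gaussian: (T, D) -> (T, M)."""
    diff = x[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))

def composite_gmm(spk, bkg, spk_share=0.5):
    """Pool speaker and background mixtures into one GMM.

    spk and bkg are (weights, means, variances) tuples. The 50/50 split
    is an assumption of this sketch, not a detail taken from the paper.
    """
    w = np.concatenate([spk_share * spk[0], (1.0 - spk_share) * bkg[0]])
    mu = np.vstack([spk[1], bkg[1]])
    var = np.vstack([spk[2], bkg[2]])
    return w, mu, var

def estimate_bias(x, gmm, n_iter=10):
    """EM estimate of b maximizing the likelihood of x + b under the GMM."""
    w, mu, var = gmm
    inv_var = 1.0 / var
    b = np.zeros(x.shape[1])
    for _ in range(n_iter):
        # E-step: component posteriors for the currently transformed frames.
        log_p = np.log(w) + log_gauss_diag(x + b, mu, var)
        log_p -= log_p.max(axis=1, keepdims=True)
        gamma = np.exp(log_p)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: closed-form, variance-weighted mean of (mu_j - x_t).
        num = np.einsum('tm,tmd->d', gamma,
                        (mu[None] - x[:, None]) * inv_var[None])
        den = np.einsum('tm,md->d', gamma, inv_var)
        b = num / den
    return b
```

Because the posteriors are computed over the pooled components, a single bias is pulled toward values that suit the speaker and background models jointly, which is one simple way to read the "implicit constraint" mentioned in the abstract. The transformed features `x + estimate_bias(x, gmm)` would then be scored against each model separately.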
Original language: English
Title of host publication: 8th International Conference on Spoken Language Processing, ICSLP 2004
Publisher: International Speech Communication Association
Pages: 1753-1756
Number of pages: 4
Publication status: Published - 1 Jan 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - International Convention Center, Jeju, Jeju Island, Korea, Republic of
Duration: 4 Oct 2004 to 8 Oct 2004

Conference

Conference: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 4/10/04 to 8/10/04

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
