Adaptive conditional pronunciation modeling using articulatory features for speaker verification

Ka Yee Leung, Man Wai Mak, Manhung Siu, Sun Yuan Kung

Research output: Chapter in book / Conference proceeding · Conference article published in proceeding or book · Academic research · Peer-reviewed

Abstract

This paper proposes an articulatory feature-based conditional pronunciation modeling (AFCPM) technique for speaker verification. The technique models the pronunciation behavior of speakers by linking the actual phones produced by the speakers to the state of articulation during speech production. Speaker models consisting of conditional probabilities of two articulatory classes are adapted from a set of universal background models (UBMs) using the maximum a posteriori (MAP) adaptation technique. This adaptation approach prevents over-fitting the speaker models when the amount of speaker data is insufficient for direct estimation. Experimental results show that the adaptation technique enhances the discriminating power of speaker models by establishing a tighter coupling between the speaker models and the UBMs. Results also show that fusing the scores of an AFCPM-based system and a conventional spectral-based system achieves a significantly lower error rate than either system alone, suggesting that AFCPM and spectral features are complementary.
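The two operations the abstract names, MAP adaptation of discrete conditional probabilities from the UBMs and fusion of the AFCPM and spectral scores, can be made concrete with a short sketch. The sketch below assumes a relevance-MAP style interpolation between the speaker's observed counts and the UBM probabilities; the relevance factor r, the fusion weight w, and the class/phone labels are illustrative assumptions, not details taken from the paper.

    from collections import Counter

    def map_adapt(speaker_counts, ubm_probs, r=16.0):
        """MAP-adapt discrete conditional probabilities P(articulatory class | phone).

        speaker_counts: Counter of class observations for one conditioning phone.
        ubm_probs: dict mapping class -> UBM probability for the same phone.
        r: relevance factor controlling how strongly the UBM prior is trusted
           (an assumed value; the paper's actual setting may differ).
        """
        n = sum(speaker_counts.values())
        alpha = n / (n + r)  # data-dependent adaptation coefficient
        return {c: alpha * (speaker_counts[c] / n if n else 0.0)
                   + (1.0 - alpha) * p
                for c, p in ubm_probs.items()}

    def fuse_scores(s_afcpm, s_spectral, w=0.5):
        """Linear fusion of the AFCPM and spectral-based verification scores.
        w is a hypothetical fusion weight; in practice it would be tuned
        on development data."""
        return w * s_afcpm + (1.0 - w) * s_spectral

    # Example: adapting P(manner class | phone 'aa') for one speaker.
    ubm = {"vowel": 0.7, "fricative": 0.2, "stop": 0.1}
    obs = Counter({"vowel": 9, "fricative": 1})
    print(map_adapt(obs, ubm))
    print(fuse_scores(-0.3, 0.8))

Under these assumptions, sparse speaker data (small n) keeps the adapted model close to the UBM, while abundant data lets the speaker's own statistics dominate, which is the over-fitting safeguard the abstract describes.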
Original language: English
Title of host publication: 2004 International Symposium on Chinese Spoken Language Processing - Proceedings
Pages: 61-64
Number of pages: 4
Publication status: Published - 1 Dec 2004
Event: 2004 International Symposium on Chinese Spoken Language Processing - Hong Kong, China
Duration: 15 Dec 2004 - 18 Dec 2004

Conference

Conference: 2004 International Symposium on Chinese Spoken Language Processing
Country/Territory: Hong Kong
City: Hong Kong
Period: 15/12/04 - 18/12/04

ASJC Scopus subject areas

  • General Engineering
