Abstract
Although articulatory feature-based conditional pronunciation models (AFCPMs) can capture the pronunciation characteristics of speakers, they require one discrete density function for each phoneme, which may lead to inaccurate models when the amount of training data is limited. This paper proposes a phonetic-class based AFCPM in which the density functions in speaker models are conditioned on phonetic classes instead of phonemes. Phonemes are mapped to phonetic classes by (1) vector quantizing the phoneme-dependent universal background models, (2) grouping phonemes according to the classical phoneme tree, or (3) combining (1) and (2). A new scoring method that uses a support vector machine (SVM) to combine the scores of the phonetic-class models is also proposed. Evaluations based on the 2000 NIST SRE show that the proposed approach can effectively solve the data sparseness problem encountered in conventional AFCPMs.
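As a rough illustration of mapping option (1), the sketch below clusters toy phoneme-dependent background models into phonetic classes. It is not the paper's implementation: the abstract does not state the quantizer, the distance measure, or the number of classes, so plain k-means with squared Euclidean distance is an assumption, and the function names (`kmeans`, `map_phonemes_to_classes`) and the three-bin toy densities are hypothetical.

```python
import random


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns one cluster index per vector."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid under squared Euclidean distance.
        for i, v in enumerate(vectors):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[i] for i in range(len(vectors)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def map_phonemes_to_classes(phoneme_ubms, num_classes, seed=0):
    """Map each phoneme to a class index by clustering its density vector.

    phoneme_ubms: dict of phoneme label -> flattened discrete density
    (an assumed representation of a phoneme-dependent background model).
    """
    phonemes = sorted(phoneme_ubms)
    labels = kmeans([phoneme_ubms[p] for p in phonemes], num_classes, seed=seed)
    return dict(zip(phonemes, labels))


# Toy densities over three hypothetical articulatory-feature bins.
toy_ubms = {
    "aa": [0.7, 0.2, 0.1],
    "ae": [0.6, 0.3, 0.1],
    "s": [0.1, 0.2, 0.7],
    "z": [0.1, 0.3, 0.6],
}
classes = map_phonemes_to_classes(toy_ubms, num_classes=2)
```

With such a mapping, a speaker model would need one density function per class rather than per phoneme, so the training frames of all phonemes in a class are pooled. That pooling is how the class-based model addresses the limited-data problem described above.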
Original language | English |
---|---|
Title of host publication | International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007 |
Pages | 621-624 |
Number of pages | 4 |
Volume | 1 |
Publication status | Published - 1 Dec 2007 |
Event | 8th Annual Conference of the International Speech Communication Association, Interspeech 2007 - Antwerp, Belgium; Duration: 27 Aug 2007 → 31 Aug 2007 |
Conference
Conference | 8th Annual Conference of the International Speech Communication Association, Interspeech 2007 |
---|---|
Country/Territory | Belgium |
City | Antwerp |
Period | 27/08/07 → 31/08/07 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Modelling and Simulation
- Linguistics and Language
- Communication