Abstract
State-of-the-art speaker verification systems are based on the total variability model to compactly represent the acoustic space. However, short-duration utterances contain only limited phonetic content, potentially resulting in an incomplete representation being captured by the total variability model, and thus in poor speaker verification performance. In this paper, a technique to incorporate component-wise local acoustic variability information into the speaker verification framework is proposed. Specifically, Gaussian Probabilistic Linear Discriminant Analysis (G-PLDA) of the supervector space, with a block-diagonal covariance assumption, is used in conjunction with the traditional total variability model. Experimental results obtained using the NIST SRE 2010 dataset show that incorporating the proposed method leads to relative improvements of 20.48% and 18.99% in the 3-second condition for male and female speech, respectively.
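For orientation, the two models named in the abstract are commonly written as below. This is a sketch in standard textbook notation, not necessarily the notation or exact formulation used in the paper: the symbols $\mathbf{T}$, $\mathbf{w}$, $\mathbf{V}$, $\mathbf{y}_s$, the component count $C$, and the reading of the block-diagonal assumption as one covariance block per mixture component are all assumptions made here for illustration.

```latex
% Total variability model: the utterance supervector M is the background
% model mean supervector m plus a low-rank offset; w is the i-vector.
\mathbf{M} = \mathbf{m} + \mathbf{T}\mathbf{w},
\qquad \mathbf{w} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})

% Simplified Gaussian PLDA for the r-th utterance of speaker s:
% a shared speaker factor y_s plus a Gaussian residual.
\mathbf{x}_{s,r} = \boldsymbol{\mu} + \mathbf{V}\mathbf{y}_s + \boldsymbol{\epsilon}_{s,r},
\qquad \mathbf{y}_s \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),
\quad \boldsymbol{\epsilon}_{s,r} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})

% Block-diagonal covariance (assumed here: one block per mixture component),
% so that variability is modelled locally, component by component.
\boldsymbol{\Sigma} = \operatorname{diag}\!\left(\boldsymbol{\Sigma}_1, \dots, \boldsymbol{\Sigma}_C\right)
```

Under this reading, the block-diagonal structure keeps a supervector-space model tractable, because each block can be estimated and inverted separately rather than as one very large full covariance matrix.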
Original language | English |
---|---|
Pages (from-to) | 1502-1506 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Volume | 2017-August |
Publication status | Published - Aug 2017 |
Externally published | Yes |
Event | 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017 - Stockholm, Sweden; Duration: 20 Aug 2017 → 24 Aug 2017 |
Keywords
- I-vector
- Probabilistic LDA
- Short duration
- Speaker verification
- Supervector
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modelling and Simulation