Abstract
Using GMM-supervectors as the input to SVM classifiers (namely, GMM-SVM) is one of the promising approaches to text-independent speaker verification. However, one unaddressed issue of this approach is the severe imbalance between the numbers of speaker-class utterances and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique, namely utterance partitioning with acoustic vector resampling (UP-AVR), to mitigate the data imbalance problem. Specifically, the sequence order of acoustic vectors in an enrollment utterance is first randomized; then the randomized sequence is partitioned into a number of segments. Each of these segments is then used to produce a GMM-supervector via MAP adaptation and mean vector concatenation. A desired number of speaker-class supervectors can be produced by repeating this randomization and partitioning process a number of times. Experimental evaluations suggest that UP-AVR can reduce the EER of GMM-SVM systems by about 10%.
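
The randomize-then-partition step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `up_avr_segments`, its parameters, and the choice of nearly equal-sized contiguous segments are assumptions made here. MAP adaptation and mean vector concatenation, which turn each segment into a GMM-supervector, are outside the scope of the sketch.

```python
import random


def up_avr_segments(frames, num_partitions, num_resamplings, seed=None):
    """Shuffle the acoustic vectors of one utterance and split them into segments.

    Hypothetical helper: `frames` is a sequence of acoustic vectors (one per
    frame). Returns num_partitions * num_resamplings segments; each segment
    would then be passed to MAP adaptation to produce one supervector.
    """
    if num_partitions < 1 or num_resamplings < 1:
        raise ValueError("num_partitions and num_resamplings must be >= 1")
    if len(frames) < num_partitions:
        raise ValueError("need at least one frame per partition")

    rng = random.Random(seed)  # local generator, so results are reproducible
    segments = []
    for _ in range(num_resamplings):
        # Step 1: randomize the sequence order of the acoustic vectors.
        shuffled = list(frames)
        rng.shuffle(shuffled)

        # Step 2: partition the randomized sequence into contiguous segments
        # whose lengths differ by at most one frame (an assumption here).
        base, extra = divmod(len(shuffled), num_partitions)
        start = 0
        for i in range(num_partitions):
            size = base + (1 if i < extra else 0)
            segments.append(shuffled[start:start + size])
            start += size
    return segments


# Toy utterance: 10 frames of 2-dimensional vectors.
toy_frames = [(float(t), float(t) + 0.5) for t in range(10)]
toy_segments = up_avr_segments(toy_frames, num_partitions=4, num_resamplings=3, seed=0)
```

With 4 partitions and 3 resamplings, the call yields 12 segments from a single enrollment utterance, which is how the repetition increases the number of speaker-class training examples.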
Original language | English |
---|---|
Title of host publication | Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 |
Pages | 1449-1452 |
Number of pages | 4 |
Publication status | Published - 1 Dec 2010 |
Event | 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan. Duration: 26 Sept 2010 → 30 Sept 2010 |
Conference
Conference | 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 |
---|---|
Country/Territory | Japan |
City | Makuhari, Chiba |
Period | 26/09/10 → 30/09/10 |
ASJC Scopus subject areas
- Language and Linguistics
- Speech and Hearing