Abstract
The i-vector/PLDA framework has gained huge popularity in text-independent speaker verification. This approach, however, lacks the ability to represent the reliability of i-vectors, so it performs poorly when presented with utterances of arbitrary duration. To address this problem, a method called uncertainty propagation (UP) was proposed to explicitly model the reliability of an i-vector through an utterance-dependent loading matrix. However, the utterance-dependent matrix greatly complicates the evaluation of likelihood scores, making PLDA with UP (PLDA-UP for short) far more computationally intensive than conventional PLDA. In this paper, we propose grouping i-vectors with similar reliability and replacing the utterance-dependent loading matrices in each group with a single representative matrix. This arrangement allows us to pre-compute a set of representative matrices that cover all possible i-vectors, thereby greatly reducing the computational cost of PLDA-UP while preserving its ability to discriminate between i-vectors of differing reliability. Experiments on NIST 2012 SRE show that the proposed method performs as well as PLDA-UP while requiring only 3.18% of its scoring time.
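The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how reliability is measured or how the representative matrix is chosen, so the sketch assumes (a) the trace of each i-vector's posterior covariance as a scalar reliability proxy, (b) quantile-based groups, and (c) the Cholesky factor of the group-mean covariance as the representative loading matrix. The function names `build_representatives` and `lookup` are hypothetical.

```python
import numpy as np

def build_representatives(post_covs, n_groups):
    """Pre-compute one representative loading matrix per reliability group.

    post_covs: array of shape (N, D, D), posterior covariances of N i-vectors.
    Returns (edges, reps): group boundaries and an (n_groups, D, D) array.
    """
    # Assumption: use the trace of the posterior covariance as the
    # reliability proxy (larger trace = less reliable i-vector).
    scores = np.trace(post_covs, axis1=1, axis2=2)
    # Assumption: quantile boundaries, so groups hold similar counts.
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_groups + 1)[1:-1])
    labels = np.searchsorted(edges, scores, side="right")
    reps = []
    for g in range(n_groups):
        members = post_covs[labels == g]
        # Fall back to the global mean if a group happens to be empty.
        mean_cov = members.mean(axis=0) if len(members) else post_covs.mean(axis=0)
        # Representative matrix U_g satisfies U_g @ U_g.T == mean_cov.
        reps.append(np.linalg.cholesky(mean_cov))
    return edges, np.stack(reps)

def lookup(post_cov, edges, reps):
    """Map a new i-vector's posterior covariance to its group's matrix."""
    g = int(np.searchsorted(edges, np.trace(post_cov), side="right"))
    return g, reps[g]
```

At scoring time, any term that depends only on the loading matrix could then be computed once per group rather than once per utterance, which is where the saving described in the abstract would come from under these assumptions.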
Original language | English |
---|---|
Pages (from-to) | 503-515 |
Number of pages | 13 |
Journal | Computer Speech and Language |
Volume | 45 |
Publication status | Published - Sept 2017 |
Keywords
- Duration mismatch
- i-Vector/PLDA
- Speaker verification
- Uncertainty Propagation
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Human-Computer Interaction