Abstract
We previously developed a Fishervoice framework that maps JFA-mean supervectors into a compressed discriminant subspace using nonparametric Fisher's discriminant analysis. Cosine distance scoring (CDS) on these Fishervoice-projected vectors (denoted f-vectors) was shown to outperform classical joint factor analysis (JFA). Unlike the i-vector approach, in which channel variability is suppressed in the classification stage, the Fishervoice framework suppresses channel variability when the f-vectors are constructed. In this paper, we investigate whether channel variability can be further suppressed by performing Gaussian probabilistic linear discriminant analysis (PLDA) in the classification stage. We also use random subspace sampling to enrich the speaker-discriminative information in the f-vectors. Experiments on NIST SRE10 show that PLDA significantly improves the speaker verification performance of Fishervoice, giving a 14.4% relative reduction in minDCF (from 0.526 to 0.450).
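As a rough illustration of the CDS baseline and of the reported relative reduction, the following minimal Python sketch scores two vectors by cosine similarity and recomputes the minDCF improvement from the two figures quoted in the abstract. The function names, the three-dimensional vectors, and the decision threshold are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the sketch does not reproduce the Fishervoice projection or PLDA scoring.

```python
import math


def cosine_score(enrol, test):
    """Cosine similarity between two equal-length vectors."""
    if len(enrol) != len(test):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(enrol, test))
    norm = math.sqrt(sum(a * a for a in enrol)) * math.sqrt(sum(b * b for b in test))
    if norm == 0.0:
        raise ValueError("cosine score is undefined for a zero vector")
    return dot / norm


def relative_decrease(before, after):
    """Relative reduction of a cost, as a fraction of the starting value."""
    return (before - after) / before


# Hypothetical stand-ins for an enrolment and a test f-vector (assumed values).
enrol_vec = [0.2, -0.5, 0.1]
test_vec = [0.25, -0.4, 0.05]

score = cosine_score(enrol_vec, test_vec)
accept = score >= 0.5  # threshold is an assumed placeholder, not from the paper

# minDCF values quoted in the abstract: 0.526 (CDS) and 0.450 (PLDA).
min_dcf_gain = relative_decrease(0.526, 0.450)
print(f"relative minDCF reduction: {min_dcf_gain:.1%}")
```

The last line recovers the 14.4% figure, since (0.526 - 0.450) / 0.526 is approximately 0.144.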
Original language | English |
---|---|
Pages (from-to) | 1130-1134 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 1 Jan 2014 |
Event | 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Max Atria at Singapore Expo, Singapore, Singapore; Duration: 14 Sept 2014 → 18 Sept 2014 |
Keywords
- Fishervoice
- Joint factor analysis
- Probabilistic linear discriminant analysis
- Random sampling
- Supervector
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modelling and Simulation