PLDA in the i-supervector space for text-independent speaker verification

Ye Jiang, Kong Aik Lee, Longbiao Wang

Research output: Journal article publicationJournal articleAcademic researchpeer-review

14 Citations (Scopus)

Abstract

In this paper, we advocate the use of the uncompressed form of i-vector and depend on subspace modeling using probabilistic linear discriminant analysis (PLDA) in handling the speaker and session (or channel) variability. An i-vector is a low-dimensional vector containing both speaker and channel information acquired from a speech segment. When PLDA is used on an i-vector, dimension reduction is performed twice: first in the i-vector extraction process and second in the PLDA model. Keeping the full dimensionality of the i-vector in the i-supervector space for PLDA modeling and scoring would avoid unnecessary loss of information. We refer to the uncompressed i-vector as the i-supervector. The drawback in using the i-supervector with PLDA is the inversion of large matrices in the estimation of the full posterior distribution, which we show can be solved rather efficiently by portioning large matrices into smaller blocks. We also introduce the Gaussianized rank-norm, as an alternative to whitening, for feature normalization prior to PLDA modeling. We found that the i-supervector performs better during normalization. A better performance is obtained by combining the i-supervector and i-vector at the score level. Furthermore, we also analyze the computational complexity of the i-supervector system, compared with that of the i-vector, at four different stages of loading matrix estimation, posterior extraction, PLDA modeling, and PLDA scoring.

Original languageEnglish
Article number29
Pages (from-to)1-13
Number of pages13
JournalEurasip Journal on Audio, Speech, and Music Processing
Volume2014
Issue number1
DOIs
Publication statusPublished - 1 Dec 2014
Externally publishedYes

Keywords

  • I-supervector
  • I-vector
  • Probabilistic linear discriminant analysis
  • Speaker verification

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'PLDA in the i-supervector space for text-independent speaker verification'. Together they form a unique fingerprint.

Cite this