Variational Domain Adversarial Learning With Mutual Information Maximization for Speaker Verification

Youzhi Tu, Man Wai Mak, Jen-Tzung Chien

Research output: Journal article publication > Journal article > Academic research > peer-review

34 Citations (Scopus)

Abstract

Domain mismatch is a common problem in speaker verification (SV) and often causes performance degradation. For systems that rely on a Gaussian PLDA backend to suppress channel variability, performance is further limited if no Gaussianity constraint is imposed on the learned embeddings. This paper proposes an information-maximized variational domain adversarial neural network (InfoVDANN) that incorporates an InfoVAE into domain adversarial training (DAT) to reduce domain mismatch while meeting the Gaussianity requirement of the PLDA backend. Specifically, DAT is applied to produce speaker-discriminative and domain-invariant features, while the InfoVAE performs variational regularization on the embedded features so that they follow a Gaussian distribution. Another benefit of the InfoVAE is that it avoids the posterior collapse seen in VAEs by preserving the mutual information between the embedded features and the training set, so that extra speaker information is retained in the features. Experiments on both SRE16 and SRE18-CMN2 show that the InfoVDANN outperforms the recent VDANN, which suggests that increasing the mutual information between the embedded features and the input features enables the InfoVDANN to extract extra speaker information that is otherwise unavailable.
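The variational regularization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it only shows two divergence terms commonly used to push embeddings toward a standard Gaussian, namely a per-sample KL term and a maximum mean discrepancy (MMD) term of the kind used in InfoVAE-style objectives. The function names, the RBF bandwidth, and the weights `kl_weight` and `mmd_weight` are assumptions made for illustration, and the speaker classifier, domain adversary, and decoder are omitted.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """Batch mean of KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)
    return float(np.mean(kl))


def rbf_kernel(a, b, bandwidth):
    """RBF kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(z, prior_samples, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_zz = rbf_kernel(z, z, bandwidth).mean()
    k_pp = rbf_kernel(prior_samples, prior_samples, bandwidth).mean()
    k_zp = rbf_kernel(z, prior_samples, bandwidth).mean()
    return float(k_zz + k_pp - 2.0 * k_zp)


def variational_regularizer(mu, log_var, z, prior_samples,
                            kl_weight=1.0, mmd_weight=1.0):
    """Weighted sum of the two terms; the weights are hypothetical."""
    return (kl_weight * kl_to_standard_normal(mu, log_var)
            + mmd_weight * mmd_squared(z, prior_samples))


# Toy usage: embeddings shifted away from the origin are penalized more
# than embeddings that already look like draws from N(0, I).
rng = np.random.default_rng(0)
prior = rng.standard_normal((200, 4))
z_gaussian = rng.standard_normal((200, 4))
z_shifted = z_gaussian + 3.0
print(mmd_squared(z_gaussian, prior), mmd_squared(z_shifted, prior))
```

In a full system, a term of this kind would be added to the speaker classification and domain adversarial losses. The MMD term acts on the aggregate distribution of embeddings rather than on each posterior separately, which is one common way of keeping a Gaussianity constraint without discarding information about the input.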
Original language: English
Article number: 9124672
Pages (from-to): 2013-2024
Number of pages: 12
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 28
Publication status: Published - Jun 2020

Keywords

  • Speaker verification (SV)
  • domain adaptation
  • domain adversarial training
  • mutual information
  • variational autoencoder

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
