Multisource I-Vectors Domain Adaptation Using Maximum Mean Discrepancy Based Autoencoders

Wei Wei Lin, Man Wai Mak, Jen Tzung Chien

Research output: Journal article publication › Journal article › Academic research › peer-review

40 Citations (Scopus)

Abstract

Like many machine learning tasks, the performance of speaker verification (SV) systems degrades when training and test data come from very different distributions. What's more, both training and test data themselves could be composed of heterogeneous subsets. These multisource mismatches are detrimental to SV performance. This paper proposes incorporating maximum mean discrepancy (MMD) into the loss function of autoencoders to reduce these mismatches. MMD is a nonparametric method for measuring the distance between two probability distributions. With a properly chosen kernel, MMD can match up to infinite moments of data distributions. We generalize MMD to measure the discrepancies of multiple distributions. We call the generalized MMD domainwise MMD. Using domainwise MMD as an objective function, we propose two autoencoders, namely nuisance-attribute autoencoder (NAE) and domain-invariant autoencoder (DAE), for multisource i-vector adaptation. NAE encodes the features that cause most of the multisource mismatch measured by domainwise MMD. DAE directly encodes the features that minimize the multisource mismatch. Using these MMD-based autoencoders as a preprocessing step for PLDA training, we achieve a relative improvement of 19.2% EER on the NIST 2016 SRE compared to PLDA without adaptation. We also found that MMD-based autoencoders are more robust to unseen domains. In the domain robustness experiments, MMD-based autoencoders show 6.8% and 5.2% improvements over IDVC on female and male Cantonese speakers, respectively.
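
As a rough illustration of the quantity the abstract describes, the sketch below estimates squared MMD between two samples with a Gaussian kernel and then sums it over all pairs of domains. The abstract does not give the paper's definition of domainwise MMD, so the pairwise sum, the function names (`rbf_kernel`, `mean_kernel`, `mmd2`, `pairwise_mmd2`), the kernel width `sigma`, and the toy 2-D vectors standing in for i-vectors are all assumptions made for illustration, not the authors' formulation.

```python
import math
from itertools import combinations


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(xs, ys, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    return (mean_kernel(xs, xs, sigma)
            + mean_kernel(ys, ys, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma))


def pairwise_mmd2(domains, sigma=1.0):
    """Sum of squared MMD over every pair of domains.

    Assumption: this pairwise sum is only one plausible way to extend
    MMD to several domains; it is not taken from the paper.
    """
    return sum(mmd2(a, b, sigma) for a, b in combinations(domains, 2))


# Hypothetical toy data: three "domains" of 2-D vectors.
domain_a = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
domain_b = [[0.1, 0.0], [0.0, 0.2], [-0.2, -0.1]]   # close to domain_a
domain_c = [[3.0, 3.1], [2.9, 3.2], [3.1, 2.8]]     # shifted far away

near = mmd2(domain_a, domain_b)
far = mmd2(domain_a, domain_c)
total = pairwise_mmd2([domain_a, domain_b, domain_c])
print(near, far, total)
```

In this toy setup the shifted domain yields a larger discrepancy than the nearby one, which is the kind of mismatch an MMD term in an autoencoder loss would penalize. A practical version would operate on batches of real i-vectors with an automatic-differentiation library so the term can be minimized by gradient descent.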

Original language: English
Article number: 8445620
Pages (from-to): 2412-2422
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 26
Issue number: 12
Publication status: Published - Dec 2018

Keywords

  • domain adaptation
  • i-vectors
  • maximum mean discrepancy
  • speaker verification

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
