Reducing Domain Mismatch by Maximum Mean Discrepancy Based Autoencoders

Weiwei Lin, Man Wai Mak, Longxin Li, Jen Tzung Chien

Research output: Unpublished conference presentation (presented paper, abstract, poster) › Conference presentation (not published in journal/proceeding/book) › Academic research › peer-review

16 Citations (Scopus)

Abstract

Domain mismatch, caused by the discrepancy between training and test data, can severely degrade the performance of speaker verification (SV) systems. Moreover, both the training and the test data may themselves be composed of heterogeneous subsets, with each subset corresponding to one sub-domain. These multi-source mismatches can further degrade SV performance. This paper proposes incorporating maximum mean discrepancy (MMD) into the loss function of autoencoders to reduce these mismatches. Specifically, we generalize MMD to measure the discrepancies among multiple distributions. We call this generalized MMD domain-wise MMD. Using domain-wise MMD as an objective function, we derive a domain-invariant autoencoder (DAE) for multi-source i-vector adaptation. The DAE directly encodes the features that minimize the multi-source mismatch. By replacing the original i-vectors with these domain-invariant feature vectors for PLDA training, we reduce the EER by 11.8% in NIST 2016 SRE when compared to PLDA without adaptation.
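The abstract does not state the formula for domain-wise MMD, so the following is only a minimal sketch of one plausible reading: the biased estimate of the squared MMD between two sample sets under a Gaussian kernel, summed over every unordered pair of domains. The kernel choice, the `gamma` value, the pairwise-sum form, the function names, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import math
from itertools import combinations


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs (x, y), x from xs and y from ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    return (mean_kernel(xs, xs, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma)
            + mean_kernel(ys, ys, gamma))


def domain_wise_mmd2(domains, gamma=1.0):
    """Assumed generalization: sum of squared MMDs over all domain pairs."""
    return sum(mmd2(a, b, gamma) for a, b in combinations(domains, 2))


# Hypothetical toy "domains" of 2-D feature vectors (not real i-vectors).
d1 = [[0.0, 0.0], [0.1, -0.1]]
d2 = [[0.0, 0.1], [0.1, 0.0]]
d3 = [[3.0, 3.0], [3.1, 2.9]]  # deliberately shifted away from d1 and d2

print(domain_wise_mmd2([d1, d2]))      # two nearby domains
print(domain_wise_mmd2([d1, d2, d3]))  # larger once the shifted domain is added
```

In an autoencoder setting, a term of this kind would be evaluated on the encoded features of each domain and added to the reconstruction loss, so that minimizing the total loss pulls the encoded distributions of the domains toward one another.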

Original language: English
Pages: 162-167
Number of pages: 6
Publication status: Published - Jun 2018
Event: 2018 Speaker and Language Recognition Workshop, ODYSSEY 2018 - Les Sables d'Olonne, France
Duration: 26 Jun 2018 to 29 Jun 2018

Conference

Conference: 2018 Speaker and Language Recognition Workshop, ODYSSEY 2018
Country/Territory: France
City: Les Sables d'Olonne
Period: 26/06/18 to 29/06/18

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Human-Computer Interaction
