Robust Speaker Verification Using Deep Weight Space Ensemble

Research output: Journal article publication › Journal article › Academic research › peer-review

Abstract

Domain shift is one of the most challenging problems in speaker verification (SV). Although numerous methods have been proposed to address domain shift, most approaches optimize performance in one domain at the expense of the others. As a result, each domain requires a dedicated model to obtain the best performance. However, deploying multiple models is resource-demanding and impractical, particularly when the deployment domains are not known in advance. Recent studies on deep neural networks (DNNs) suggest that, in the low-error region of a DNN's weight space, there exists a linear path connecting a base model and a fine-tuned model. This finding inspires us to combine the strengths of the fine-tuned models and the base models to solve challenging SV problems. Specifically, we aim to develop models that can handle 1) mixed text-dependent (TD) and text-independent (TI) SV, where the speech content can be either constrained or unconstrained, 2) cross-channel SV, where the recording can be 16 kHz high-fidelity microphone speech or 8 kHz telephone speech, and 3) bilingual SV, where the enrollment and test speech can each be in either of two languages. Using a weight space ensemble, we show that performance on these tasks improves substantially, with a 39.6% improvement in mixed TD and TI SV, a 17.4% improvement in bilingual SV, and an 18.4% improvement in cross-channel SV. Moreover, we show that the weight space ensemble can also enhance performance in the target domain, thanks to the regularization effect of the interpolation.
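
The interpolation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common form of a weight space ensemble: a convex combination of the base and fine-tuned parameters, theta = (1 - alpha) * theta_base + alpha * theta_ft. The function name `interpolate_weights`, the coefficient `alpha`, and the toy parameter dictionaries are hypothetical and are not taken from the article.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This is an illustrative stand-in for a real state dict.
Weights = Dict[str, List[float]]


def interpolate_weights(base: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * base + alpha * fine_tuned, parameter by parameter.

    alpha = 0 gives back the base model, alpha = 1 the fine-tuned model.
    Assumption: both models share the same architecture and parameter names.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if base.keys() != fine_tuned.keys():
        raise ValueError("both models must share the same parameter names")

    merged: Weights = {}
    for name, base_values in base.items():
        tuned_values = fine_tuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy values, chosen only to make the arithmetic easy to follow.
base = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
fine_tuned = {"layer.weight": [4.0, 6.0], "layer.bias": [3.0]}

midpoint = interpolate_weights(base, fine_tuned, alpha=0.5)
print(midpoint)
```

In such a scheme, `alpha` would presumably be tuned on held-out data from the domains of interest; how the article selects it is not stated in the abstract.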

Original language: English
Pages (from-to): 802-812
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 31
Publication status: Published - Jan 2023

Keywords

  • domain adaptation
  • domain shift
  • robust speaker recognition
  • weight space ensemble

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
