Learning Domain-Invariant Transformation for Speaker Verification

Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang, Hui Chen

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

5 Citations (Scopus)

Abstract

In real-world applications, automatic speaker verification (ASV) faces domain shift caused by mismatched intrinsic and extrinsic factors, such as recording device and speaking style, which leads to unsatisfactory performance. To address this, we propose the meta generalized transformation, which uses meta-learning to build a domain-invariant embedding space. Specifically, the transformation module is encouraged to learn domain generalization knowledge by executing meta-optimization on meta-train and meta-test sets that are designed to simulate domain shift. Furthermore, distribution optimization is incorporated to supervise the metric structure of the embeddings. For the transformation module, we investigate various instantiations and observe that the multilayer perceptron with gating (gMLP) is the most effective, given its extrapolation capability. Experimental results in cross-genre and cross-dataset settings demonstrate that the meta generalized transformation dramatically improves the robustness of ASV systems to domain shift and outperforms state-of-the-art methods.
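The meta-train / meta-test idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a first-order approximation of the meta-gradient, uses a single scalar `w` as a stand-in for the transformation module, and replaces the speaker-verification loss with a toy squared error. The names `split_domains`, `meta_step`, `toy_grad`, and the domain labels and target values are all hypothetical and chosen only for illustration.

```python
import random

# Made-up per-domain optima for the toy loss 0.5 * (w - target)^2.
# The labels are illustrative placeholders, not the paper's genres.
TOY_TARGETS = {"domain_a": 1.0, "domain_b": 3.0, "domain_c": 2.0}


def split_domains(domains, n_meta_test, rng):
    """Hold out n_meta_test randomly chosen domains as meta-test.

    The remaining domains form meta-train, so every episode
    simulates a shift between seen and unseen domains.
    """
    if not 0 < n_meta_test < len(domains):
        raise ValueError("need at least one domain on each side of the split")
    shuffled = list(domains)
    rng.shuffle(shuffled)
    return shuffled[n_meta_test:], shuffled[:n_meta_test]


def toy_grad(w, domains):
    """Gradient of the mean toy loss over the given domains."""
    return sum(w - TOY_TARGETS[d] for d in domains) / len(domains)


def meta_step(w, grad_fn, meta_train, meta_test, inner_lr=0.1, outer_lr=0.1):
    """One first-order meta-update of the scalar parameter w."""
    g_train = grad_fn(w, meta_train)
    # Virtual update using meta-train domains only.
    w_adapted = w - inner_lr * g_train
    # Check how the virtually updated parameter does on held-out domains.
    g_test = grad_fn(w_adapted, meta_test)
    # First-order approximation: the dependence of w_adapted on w is ignored.
    return w - outer_lr * (g_train + g_test)


if __name__ == "__main__":
    rng = random.Random(0)
    w = 0.0
    for _ in range(200):
        train, test = split_domains(list(TOY_TARGETS), 1, rng)
        w = meta_step(w, toy_grad, train, test)
    print(w)
```

The update combines the meta-train gradient with the gradient measured on held-out domains after a virtual step, so the parameter is pushed toward values that also work on domains it did not adapt to. A full second-order variant would differentiate through the virtual step, which needs an automatic differentiation framework and is omitted here.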

Original language: English
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7177-7181
Number of pages: 5
ISBN (Electronic): 9781665405409
DOIs
Publication status: Published - May 2022
Externally published: Yes
Event: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore
Duration: 23 May 2022 to 27 May 2022

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May
ISSN (Print): 1520-6149

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Virtual, Online
Period: 23/05/22 to 27/05/22

Keywords

  • domain-invariant
  • meta generalized transformation
  • meta-learning
  • speaker verification

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
