Noise-Disentanglement Metric Learning for Robust Speaker Verification

Yao Sun, Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang

Research output: Journal article publication › Conference article › Academic research › peer-review


Automatic speaker verification (ASV) suffers from performance degradation in noisy environments. To address this problem, we propose noise-disentanglement metric learning, which reduces speaker-irrelevant noise components and builds a noise-invariant embedding space. Specifically, a disentanglement module, comprising a speaker encoder and a reconstruction module, is dedicated to decoupling speech signals. The speaker encoder disentangles speaker-related components, and the reconstruction module strengthens the model's ability to constrain noise information by reconstructing the signal. In addition, distribution optimization is introduced to supervise the spatial structure of speaker embeddings in noisy environments. Experiments on VoxCeleb1 indicate that the proposed method improves the performance of the speaker verification system in both clean and noisy conditions.
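The abstract does not give the loss functions, so the following is only a minimal sketch of how a reconstruction term and an embedding-level metric term might be combined. It is not the authors' formulation. The function names, the use of mean squared error for reconstruction, the cosine-based consistency term between clean and noisy embeddings, and the weights `w_rec` and `w_metric` are all assumptions made for illustration.

```python
import math

# Illustrative sketch only: every function and weight below is hypothetical
# and is NOT taken from the paper.

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def reconstruction_loss(reconstructed, target):
    """Mean squared error between reconstructed and target features (assumed choice)."""
    return sum((r - t) ** 2 for r, t in zip(reconstructed, target)) / len(target)

def embedding_consistency_loss(clean_emb, noisy_emb):
    """1 - cosine similarity: zero when the clean and noisy embeddings are aligned."""
    return 1.0 - cosine_similarity(clean_emb, noisy_emb)

def total_loss(clean_emb, noisy_emb, reconstructed, target, w_rec=1.0, w_metric=1.0):
    """Weighted sum of the two assumed terms."""
    return (w_rec * reconstruction_loss(reconstructed, target)
            + w_metric * embedding_consistency_loss(clean_emb, noisy_emb))
```

In this sketch, a noisy utterance whose embedding points in the same direction as its clean counterpart contributes nothing to the metric term, which is one simple way to encourage a noise-invariant embedding space.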

Original language: English
Article number: 10096848
Pages (from-to): 1
Number of pages: 5
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publication status: Published - 5 May 2023
Externally published: Yes
Event: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 → 10 Jun 2023


  • disentangled representation learning
  • metric learning
  • noise robustness
  • speaker verification

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

