Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding

Rui Wang, Liping Chen, Kong Aik Lee, Zhen Hua Ling

Research output: Journal article publicationConference articleAcademic researchpeer-review

2 Citations (Scopus)

Abstract

Voice anonymization has been developed as a technique for preserving privacy by replacing the speaker's voice in a speech signal with that of a pseudo-speaker, thereby obscuring the original voice attributes from machine recognition and human perception. In this paper, we focus on altering the voice attributes against machine recognition while retaining human perception. We referred to this as the asynchronous voice anonymization. To this end, a speech generation framework incorporating a speaker disentanglement mechanism is employed to generate the anonymized speech. The speaker attributes are altered through adversarial perturbation applied on the speaker embedding, while human perception is preserved by controlling the intensity of perturbation. Experiments conducted on the LibriSpeech dataset showed that the speaker attributes were obscured with their human perception preserved for 60.71% of the processed utterances. Audio samples can be found in.

Original languageEnglish
Pages (from-to)4443-4447
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
DOIs
Publication statusPublished - Sept 2024
Event25th Interspeech Conferece 2024 - Kos Island, Greece
Duration: 1 Sept 20245 Sept 2024

Keywords

  • adversarial perturbation on speaker embedding
  • asynchronous anonymization
  • human perception preservation
  • voice privacy

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Fingerprint

Dive into the research topics of 'Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding'. Together they form a unique fingerprint.

Cite this