Adversarial Speech for Voice Privacy Protection from Personalized Speech Generation

Shihao Chen, Liping Chen, Jie Zhang, Kong Aik Lee, Zhenhua Ling, Lirong Dai

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

4 Citations (Scopus)

Abstract

The rapid progress in personalized speech generation technology, including personalized text-to-speech (TTS) and voice conversion (VC), poses a challenge in distinguishing between generated and real speech for human listeners, resulting in an urgent demand in protecting speakers' voices from malicious misuse. In this regard, we propose a speaker protection method based on adversarial attacks. The proposed method perturbs speech signals by minimally altering the original speech while rendering downstream speech generation models unable to accurately generate the voice of the target speaker. For validation, we employ the open-source pre-trained YourTTS model for speech generation and protect the target speaker's speech in the white-box scenario. Automatic speaker verification (ASV) evaluations were carried out on the generated speech as the assessment of the voice protection capability. Our experimental results show that we successfully perturbed the speaker encoder of the YourTTS model using the gradient-based I-FGSM adversarial perturbation method. Furthermore, the adversarial perturbation is effective in preventing the YourTTS model from generating the speech of the target speaker. Audio samples can be found in https://voiceprivacy.github.io/Adeversarial-Speech-with-YourTTS.

Original languageEnglish
Title of host publication2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages11411-11415
Number of pages5
ISBN (Electronic)9798350344851
DOIs
Publication statusPublished - 14 Apr 2024
Event49th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, Korea, Republic of
Duration: 14 Apr 202419 Apr 2024

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)1520-6149

Conference

Conference49th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Country/TerritoryKorea, Republic of
CitySeoul
Period14/04/2419/04/24

Keywords

  • adversary attack
  • Personalized speech generation
  • text-to-speech
  • voice conversion
  • voice privacy

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Adversarial Speech for Voice Privacy Protection from Personalized Speech Generation'. Together they form a unique fingerprint.

Cite this