
Investigation of Perception Inconsistency in Speaker Embedding for Asynchronous Voice Anonymization

  • Rui Wang
  • Liping Chen
  • Kong Aik Lee
  • Zhengpeng Zha
  • Zhenhua Ling

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

Abstract

In speech generation frameworks that represent the speaker attribute with an embedding vector, asynchronous voice anonymization can be achieved by modifying the speaker embedding derived from the original speech. However, the inconsistency between machine and human perception of the speaker attribute within the speaker embedding remains unexplored, which limits the performance of asynchronous voice anonymization. To this end, this study investigates the inconsistency by modifying the speaker embedding in the speech generation process. Experiments on the FACodec and Diff-HierVC speech generation models reveal a subspace whose removal alters machine perception of the speaker attribute in the generated speech while preserving human perception of it. Building on these findings, an asynchronous voice anonymization method is developed that achieves a 100% human perception preservation rate while obscuring machine perception. Audio samples can be found at https://voiceprivacy.github.io/speaker-embedding-eigen-decomposition/.
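
To make the idea of "removing a subspace" from a speaker embedding concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the subspace is identified or how its removal is carried out, so the sketch assumes the subspace is given as a set of spanning vectors and that removal means projecting the embedding onto the orthogonal complement of that subspace. The function names (`orthonormalize`, `remove_subspace`) and the toy 4-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def remove_subspace(embedding, subspace_vectors):
    """Project `embedding` onto the orthogonal complement of the subspace.

    Assumption: "removal" is modeled as orthogonal projection; the paper
    may define it differently.
    """
    basis = orthonormalize(subspace_vectors)
    out = list(embedding)
    for b in basis:
        c = dot(embedding, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out


# Hypothetical toy data: a 4-dimensional "speaker embedding" and a
# 2-dimensional subspace to remove. Real embeddings are much larger.
embedding = [0.5, -1.0, 2.0, 0.25]
subspace = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
modified = remove_subspace(embedding, subspace)
```

After the call, `modified` has no component left along any direction in the removed subspace, while components outside it are unchanged. In the setting the abstract describes, the modified embedding would then be passed to the speech generation model in place of the original one.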

Original language: English
Title of host publication: 2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2074-2079
Number of pages: 6
ISBN (Electronic): 9798331572068
DOIs
Publication status: Published - 2025
Event: 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2025 - Singapore, Singapore
Duration: 22 Oct 2025 to 24 Oct 2025

Publication series

Name: 2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2025

Conference

Conference: 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2025
Country/Territory: Singapore
City: Singapore
Period: 22/10/25 to 24/10/25

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Hardware and Architecture
  • Signal Processing
