TY - GEN
T1 - LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation
AU - Luong, Hieu Thi
AU - Li, Haoyang
AU - Zhang, Lin
AU - Lee, Kong Aik
AU - Chng, Eng Siong
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025/4
Y1 - 2025/4
N2 - Previous fake speech datasets were constructed from a defender's perspective to develop countermeasure (CM) systems without considering the diverse motivations of attackers. To better align with real-life scenarios, we created LlamaPartialSpoof, a 130-hour dataset that contains both fully and partially fake speech, using a large language model (LLM) and voice cloning technologies to evaluate the robustness of CMs. By examining valuable information for both attackers and defenders, we identify several key vulnerabilities in current CM systems, which can be exploited to enhance attack success rates, including biases toward certain text-to-speech models or concatenation methods. Our experimental results indicate that current fake speech detection systems struggle to generalize to unseen scenarios, achieving a best performance of 24.49% equal error rate.
AB - Previous fake speech datasets were constructed from a defender's perspective to develop countermeasure (CM) systems without considering the diverse motivations of attackers. To better align with real-life scenarios, we created LlamaPartialSpoof, a 130-hour dataset that contains both fully and partially fake speech, using a large language model (LLM) and voice cloning technologies to evaluate the robustness of CMs. By examining valuable information for both attackers and defenders, we identify several key vulnerabilities in current CM systems, which can be exploited to enhance attack success rates, including biases toward certain text-to-speech models or concatenation methods. Our experimental results indicate that current fake speech detection systems struggle to generalize to unseen scenarios, achieving a best performance of 24.49% equal error rate.
KW - dataset
KW - deepfake
KW - fake speech detection
KW - large language model
KW - voice cloning
UR - https://www.scopus.com/pages/publications/105009694746
U2 - 10.1109/ICASSP49660.2025.10888070
DO - 10.1109/ICASSP49660.2025.10888070
M3 - Conference article published in proceeding or book
AN - SCOPUS:105009694746
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 1
EP - 5
LA - English
T2 - 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Y2 - 6 April 2025 through 11 April 2025
ER -