TY - GEN
T1 - Parameter-efficient Fine-tuning of Speaker-Aware Dynamic Prompts for Speaker Verification
AU - Li, Zhe
AU - Mak, Man Wai
AU - Lee, Hung Yi
AU - Meng, Helen
N1 - Publisher Copyright:
© 2024 International Speech Communication Association. All rights reserved.
PY - 2024/9
Y1 - 2024/9
AB - Prompt tuning can effectively reduce the number of tunable parameters in pre-trained Transformers. However, it is weak at capturing speaker traits because the prompts can easily overfit the adaptation utterances, resulting in poor generalization to unseen speakers. This paper introduces a prompt pool comprising learnable prompts to tackle this issue. Unlike the traditional method, which learns a fixed set of prompts for each training utterance, our method uses a dynamic selection strategy to select the best-matching prompts from a pool for tuning, so that each prompt is tuned by its closely matched speaker. The objective is to make the prompts in the pool form speaker clusters, enhancing speaker prediction in the downstream classifier while maintaining the plasticity of the pre-trained Transformers. Our experiments on language mismatch in speaker verification demonstrate that the dynamic prompt pool provides a memory- and computation-efficient solution for fine-tuning pre-trained Transformers.
KW - parameter-efficient tuning
KW - pre-trained Transformer
KW - prompt pool
KW - prompt tuning
KW - speaker verification
UR - https://www.scopus.com/pages/publications/85214797325
U2 - 10.21437/Interspeech.2024-295
DO - 10.21437/Interspeech.2024-295
M3 - Conference article published in proceeding or book
AN - SCOPUS:85214797325
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 2675
EP - 2679
LA - English
T2 - 25th Interspeech Conference 2024
Y2 - 1 September 2024 through 5 September 2024
ER -