TY - GEN
T1 - Cross-Domain Adaptation in Distance Space for Speaker Verification
AU - Yi, Lu
AU - Mak, Man Wai
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023/11
Y1 - 2023/11
AB - Significant performance degradation often occurs when a well-trained speaker verification system is applied to an unseen domain. Data augmentation and domain adaptation are two common approaches to tackling this problem. However, data augmentation would not be helpful when language mismatch rather than environment noise causes the domain shift. Domain adaptation also suffers from a label mismatch problem, making feature distribution alignment unreliable. We propose incorporating a distance metric space into model adaptation to address these issues. The idea is to align not only the embeddings across domains but also the distributions of their pairwise distances, resulting in embeddings tolerant to domain shift. To validate the idea, we used the non-Chinese utterances in VoxCeleb2 and the Chinese utterances in CN-Celeb2 as the source and target domain training data, respectively. Results show that the alignments reduce the EER on the CN-Celeb1 test set by 15.2%.
KW - distance alignment
KW - distance metric space
KW - domain shift
KW - speaker embedding
UR - http://www.scopus.com/inward/record.url?scp=85180004960&partnerID=8YFLogxK
U2 - 10.1109/APSIPAASC58517.2023.10317589
DO - 10.1109/APSIPAASC58517.2023.10317589
M3 - Conference article published in proceeding or book
AN - SCOPUS:85180004960
T3 - 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
SP - 2238
EP - 2243
BT - 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Y2 - 31 October 2023 through 3 November 2023
ER -