TY - GEN
T1 - Deep Discriminative Embedding with Ranked Weight for Speaker Verification
AU - Zhou, Dao
AU - Wang, Longbiao
AU - Lee, Kong Aik
AU - Liu, Meng
AU - Dang, Jianwu
N1 - Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020/11
Y1 - 2020/11
N2 - Deep speaker-embedding neural networks trained with a discriminative loss function are widely known to be effective for the speaker verification task. Notably, the angular margin softmax loss and its variants were proposed to promote intra-class compactness. However, it is worth noting that these methods are not effective enough at enhancing inter-class separability. In this paper, we present a ranked weight loss which explicitly encourages intra-class compactness and enhances inter-class separability simultaneously. During neural network training, the most attention is given to the target speaker in order to encourage intra-class compactness. Its nearest neighbor, which has the greatest impact on correct classification, receives the second most attention, while the least attention is paid to its farthest neighbor. Experimental results on VoxCeleb1, CN-Celeb and the Speakers in the Wild (SITW) core-core condition show that the proposed ranked weight loss achieves state-of-the-art performance.
AB - Deep speaker-embedding neural networks trained with a discriminative loss function are widely known to be effective for the speaker verification task. Notably, the angular margin softmax loss and its variants were proposed to promote intra-class compactness. However, it is worth noting that these methods are not effective enough at enhancing inter-class separability. In this paper, we present a ranked weight loss which explicitly encourages intra-class compactness and enhances inter-class separability simultaneously. During neural network training, the most attention is given to the target speaker in order to encourage intra-class compactness. Its nearest neighbor, which has the greatest impact on correct classification, receives the second most attention, while the least attention is paid to its farthest neighbor. Experimental results on VoxCeleb1, CN-Celeb and the Speakers in the Wild (SITW) core-core condition show that the proposed ranked weight loss achieves state-of-the-art performance.
KW - Inter-class separability
KW - Intra-class compactness
KW - Speaker embedding
KW - Speaker verification
UR - http://www.scopus.com/inward/record.url?scp=85097052504&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-63823-8_10
DO - 10.1007/978-3-030-63823-8_10
M3 - Conference article published in proceeding or book
AN - SCOPUS:85097052504
SN - 9783030638221
T3 - Communications in Computer and Information Science
SP - 79
EP - 86
BT - Neural Information Processing - 27th International Conference, ICONIP 2020, Proceedings
A2 - Yang, Haiqin
A2 - Pasupa, Kitsuchart
A2 - Leung, Andrew Chi-Sing
A2 - Kwok, James T.
A2 - Chan, Jonathan H.
A2 - King, Irwin
PB - Springer Science and Business Media Deutschland GmbH
T2 - 27th International Conference on Neural Information Processing, ICONIP 2020
Y2 - 18 November 2020 through 22 November 2020
ER -