Probabilistic Back-ends for Online Speaker Recognition and Clustering

Alexey Sholokhov, Nikita Kuzmin, Kong Aik Lee, Eng Siong Chng

Research output: Journal article publicationConference articleAcademic researchpeer-review

1 Citation (Scopus)

Abstract

This paper focuses on multi-enrollment speaker recognition which naturally occurs in the task of online speaker clustering, and studies the properties of different scoring back-ends in this scenario. First, we show that popular cosine scoring suffers from poor score calibration with a varying number of enrollment utterances. Second, we propose a simple replacement for cosine scoring based on an extremely constrained version of probabilistic linear discriminant analysis (PLDA). The proposed model improves over the cosine scoring for multi-enrollment recognition while keeping the same performance in the case of one-to-one comparisons. Finally, we consider an online speaker clustering task where each step naturally involves multi-enrollment recognition. We propose an online clustering algorithm allowing us to take benefits from the PLDA model such as the ability to handle uncertainty and better score calibration. Our experiments demonstrate the effectiveness of the proposed algorithm.

Original languageEnglish
Article number10097032
JournalICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
DOIs
Publication statusPublished - 5 May 2023
Externally publishedYes
Event48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 4 Jun 202310 Jun 2023

Keywords

  • online speaker clustering
  • speaker verification

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Probabilistic Back-ends for Online Speaker Recognition and Clustering'. Together they form a unique fingerprint.

Cite this