Abstract
The Gaussian mixture model (GMM) and the support vector machine (SVM) have become popular classifiers in text-independent speaker recognition. A GMM-supervector characterizes a speaker's voice with the parameters of a GMM, which include mean vectors, covariance matrices, and mixture weights. The GMM-supervector SVM benefits from both the GMM and SVM frameworks to achieve state-of-the-art performance. The conventional Kullback-Leibler (KL) kernel in the GMM-supervector SVM classifier limits the adaptation of the GMM to the mean values and leaves the covariances unchanged. In this letter, we introduce the GMM-UBM mean interval (GUMI) concept based on the Bhattacharyya distance. This leads to a new kernel for the SVM classifier. Compared with the KL kernel, the new kernel allows us to exploit information not only from the means but also from the covariances. We demonstrate the effectiveness of the new kernel on the 2006 National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) dataset.
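As a rough illustration of the idea summarized above, the following is a minimal Python sketch of one plausible reading of a GUMI-style supervector: each Gaussian component contributes the speaker-versus-UBM mean shift scaled by a Bhattacharyya-style average of the speaker and UBM covariances, so both means and covariances enter the SVM representation. The function names, diagonal-covariance assumption, and exact normalization here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gumi_supervector(means, covs, ubm_means, ubm_covs):
    """Build a GUMI-style supervector from adapted GMM parameters.

    means, ubm_means : (C, D) arrays of component means
    covs,  ubm_covs  : (C, D) arrays of diagonal covariances
    Each component contributes ((cov + ubm_cov)/2)^(-1/2) * (mean - ubm_mean),
    so the covariance information is retained alongside the mean shift.
    """
    avg_cov = 0.5 * (covs + ubm_covs)             # Bhattacharyya-style averaged covariance
    scaled = (means - ubm_means) / np.sqrt(avg_cov)
    return scaled.reshape(-1)                     # concatenate the C components into one vector

def linear_kernel(sv_a, sv_b):
    """Linear SVM kernel between two supervectors (inner product)."""
    return float(np.dot(sv_a, sv_b))

# Toy usage with hypothetical numbers: 2 components, 3-dimensional features.
ubm_means = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
ubm_covs = np.ones((2, 3))
spk_means = ubm_means + 0.1    # adapted means of one speaker's GMM
spk_covs = ubm_covs * 1.2      # adapted covariances
sv = gumi_supervector(spk_means, spk_covs, ubm_means, ubm_covs)
print(sv.shape, linear_kernel(sv, sv))
```

In this sketch, the resulting supervectors would simply be fed to a linear SVM; by contrast, a KL-style supervector would scale the mean shift by the UBM covariance alone, discarding the adapted covariance.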
| Original language | English |
| --- | --- |
| Pages (from-to) | 49-52 |
| Number of pages | 4 |
| Journal | IEEE Signal Processing Letters |
| Volume | 16 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - Dec 2008 |
Keywords
- Gaussian mixture model
- National Institute of Standards and Technology (NIST) evaluation
- speaker recognition
- supervector
- support vector machine
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering
- Applied Mathematics