TY - GEN
T1 - The estimation and kernel metric of spectral correlation for text-independent speaker verification
AU - Wang, Eryu
AU - Lee, Kong Aik
AU - Ma, Bin
AU - Li, Haizhou
AU - Guo, Wu
AU - Dai, Lirong
PY - 2010/9
Y1 - 2010/9
N2 - Gaussian mixture models (GMMs) are commonly used in text-independent speaker verification for modeling the spectral distribution of speech. Recent studies have shown the effectiveness of characterizing speaker information using just the mean vectors of the GMM in conjunction with a support vector machine (SVM). This paper advocates the use of spectral correlation captured by covariance matrices, and investigates its effectiveness compared to and in complement with the mean vectors. We examine two approaches, namely, homoscedastic and heteroscedastic modeling, in estimating the spectral correlation. We introduce two kernel metrics, namely, Frobenius angle and log-Euclidean inner product, for measuring the similarity between speech utterances in terms of spectral correlation. An experiment conducted on the NIST 2006 speaker verification task shows that a relative improvement of approximately 10% is achieved by using the spectral correlation in conjunction with the mean vectors.
KW - Frobenius angle
KW - Gaussian mixture model
KW - Log-Euclidean distance
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=79959858464&partnerID=8YFLogxK
M3 - Conference article published in proceeding or book
AN - SCOPUS:79959858464
T3 - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
SP - 1065
EP - 1068
BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
PB - International Speech Communication Association
ER -