TY - GEN
T1 - Factored covariance modeling for text-independent speaker verification
AU - Wang, Eryu
AU - Lee, Kong Aik
AU - Ma, Bin
AU - Li, Haizhou
AU - Guo, Wu
AU - Dai, Lirong
PY - 2011/7
Y1 - 2011/7
N2 - Gaussian mixture models (GMMs) are commonly used to model the spectral distribution of speech signals for text-independent speaker verification. Mean vectors of the GMM, used in conjunction with a support vector machine (SVM), have been shown to be effective in characterizing speaker information. In addition to the mean vectors, covariance matrices capture the correlation between spectral features, which also carries salient information about speaker identity. This paper investigates the use of local correlation between different dimensions of the acoustic vector using factor analysis and a linear Gaussian model. A log-Euclidean inner product kernel is used to measure the similarity between two speech utterances represented as covariance matrices. Experiments carried out on the NIST 2006 speaker verification tasks show promising results.
AB - Gaussian mixture models (GMMs) are commonly used to model the spectral distribution of speech signals for text-independent speaker verification. Mean vectors of the GMM, used in conjunction with a support vector machine (SVM), have been shown to be effective in characterizing speaker information. In addition to the mean vectors, covariance matrices capture the correlation between spectral features, which also carries salient information about speaker identity. This paper investigates the use of local correlation between different dimensions of the acoustic vector using factor analysis and a linear Gaussian model. A log-Euclidean inner product kernel is used to measure the similarity between two speech utterances represented as covariance matrices. Experiments carried out on the NIST 2006 speaker verification tasks show promising results.
KW - covariance modeling
KW - factor analysis
KW - Gaussian mixture model
KW - log-Euclidean
KW - support vector machine
UR - http://www.scopus.com/inward/record.url?scp=80051656183&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2011.5947443
DO - 10.1109/ICASSP.2011.5947443
M3 - Conference article published in proceeding or book
AN - SCOPUS:80051656183
SN - 9781457705397
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4856
EP - 4859
BT - 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings
T2 - 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
Y2 - 22 May 2011 through 27 May 2011
ER -