The estimation and kernel metric of spectral correlation for text-independent speaker verification

Eryu Wang, Kong Aik Lee, Bin Ma, Haizhou Li, Wu Guo, Lirong Dai

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

2 Citations (Scopus)

Abstract

Gaussian mixture models (GMMs) are commonly used in text-independent speaker verification for modeling the spectral distribution of speech. Recent studies have shown the effectiveness of characterizing speaker information using just the mean vectors of the GMM in conjunction with a support vector machine (SVM). This paper advocates the use of spectral correlation captured by covariance matrices, and investigates its effectiveness both compared with and as a complement to the mean vectors. We examine two approaches to estimating the spectral correlation, namely homoscedastic and heteroscedastic modeling. We introduce two kernel metrics, namely the Frobenius angle and the log-Euclidean inner product, for measuring the similarity between speech utterances in terms of spectral correlation. An experiment conducted on the NIST 2006 speaker verification task shows that a relative improvement of approximately 10% is achieved by using the spectral correlation in conjunction with the mean vectors.
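
The abstract names the two kernel metrics but does not give their formulas. As a rough illustration only, the sketch below uses the textbook definitions: the cosine of the Frobenius angle, tr(AᵀB) / (‖A‖_F ‖B‖_F), and the log-Euclidean inner product, tr(log(A) log(B)), for symmetric positive-definite covariance matrices A and B. The function names are hypothetical, and the exact forms used in the paper may differ (for example in normalization or in how per-mixture covariances are combined).

```python
import numpy as np


def frobenius_angle_kernel(a, b):
    """Cosine of the Frobenius angle between two matrices (assumed definition)."""
    inner = np.sum(a * b)  # Frobenius inner product, equal to tr(a.T @ b)
    return inner / (np.linalg.norm(a, "fro") * np.linalg.norm(b, "fro"))


def spd_logm(a):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(a)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_kernel(a, b):
    """Log-Euclidean inner product tr(log(a) @ log(b)) (assumed definition)."""
    return np.sum(spd_logm(a) * spd_logm(b))


def random_spd(dim, rng):
    """Hypothetical helper: a random symmetric positive-definite test matrix."""
    x = rng.standard_normal((dim, dim))
    return x @ x.T + dim * np.eye(dim)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cov_a = random_spd(4, rng)
    cov_b = random_spd(4, rng)
    print("Frobenius angle kernel:", frobenius_angle_kernel(cov_a, cov_b))
    print("Log-Euclidean kernel:  ", log_euclidean_kernel(cov_a, cov_b))
```

Both functions are symmetric in their arguments. The Frobenius angle kernel is bounded by 1 in magnitude, while the log-Euclidean inner product is unbounded and vanishes when either matrix is the identity.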

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher: International Speech Communication Association
Pages: 1065-1068
Number of pages: 4
Publication status: Published - Sept 2010
Externally published: Yes

Publication series

Name: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

Keywords

  • Frobenius angle
  • Gaussian mixture model
  • Log-Euclidean distance
  • Support vector machine

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
