High-level speaker verification via articulatory-feature based sequence kernels and SVM

Shi Xiong Zhang, Man Wai Mak

Research output: Journal article publication, Conference article, Academic research, peer-review

5 Citations (Scopus)

Abstract

Articulatory-feature based conditional pronunciation models (AFCPMs) are capable of capturing the pronunciation variations among different speakers and are good for high-level speaker recognition. However, the likelihood-ratio scoring method of AFCPMs is based on a decision boundary created by training the target speaker model and universal background model (UBM) separately. Therefore, the method does not fully utilize the discriminative information available in the training data. To fully harness the discriminative information, this paper proposes training a support vector machine (SVM) for computing the verification scores. More precisely, the models of target speakers, individual background speakers, and claimants are converted to AF-supervectors, which form the inputs to an AF-based kernel of the SVM for computing verification scores. Results show that the proposed AF-kernel scoring is complementary to likelihood-ratio scoring, leading to better performance when the two scoring methods are combined. Further performance enhancement was also observed when the AF scores were combined with acoustic scores derived from a GMM-UBM system.
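The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four-dimensional "supervectors" are made-up toy values, a plain dot-product kernel stands in for the AF-based kernel (whose form is not given here), the SVM is trained with a simple Pegasos-style subgradient method, the likelihood-ratio scores are placeholder numbers, and the weighted-sum fusion with weight `alpha` is an assumed rule.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient training of a linear SVM (stand-in for the AF kernel SVM)."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)          # last entry acts as the bias term
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)        # decaying step size
            xi = X[i] + [1.0]            # augment with a constant for the bias
            violated = y[i] * dot(w, xi) < 1.0
            shrink = 1.0 - 1.0 / t       # regularization shrinkage, equals 1 - eta * lam
            w = [shrink * wj for wj in w]
            if violated:                 # hinge loss is active for this example
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def svm_score(w, x):
    return dot(w, x + [1.0])

def fuse(svm, llr, alpha=0.5):
    """Assumed fusion rule: weighted sum of SVM score and likelihood-ratio score."""
    return alpha * svm + (1.0 - alpha) * llr

# Toy supervectors (hypothetical values): one target speaker, several background speakers.
target_sv = [0.9, 0.1, 0.8, 0.2]
background_svs = [
    [0.10, 0.90, 0.20, 0.80],
    [0.20, 0.80, 0.10, 0.90],
    [0.15, 0.85, 0.25, 0.75],
]
X = [target_sv] + background_svs
y = [1] + [-1] * len(background_svs)     # target is the positive class
w = train_linear_svm(X, y)

# Claimant supervectors: one resembling the target, one resembling the background.
claimant_true = [0.85, 0.15, 0.75, 0.25]
claimant_imp = [0.10, 0.90, 0.20, 0.80]
svm_true = svm_score(w, claimant_true)
svm_imp = svm_score(w, claimant_imp)

# Placeholder likelihood-ratio scores; in practice these would come from the AFCPM models.
llr_true, llr_imp = 0.4, -0.3
fused_true = fuse(svm_true, llr_true)
fused_imp = fuse(svm_imp, llr_imp)

print("SVM scores  :", svm_true, svm_imp)
print("Fused scores:", fused_true, fused_imp)
```

In this toy setting the claimant resembling the target receives the higher SVM score and the higher fused score; a real system would choose the fusion weight and decision threshold on held-out data.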
Original language: English
Pages (from-to): 1393-1396
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 1 Dec 2008
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: 22 Sep 2008 to 26 Sep 2008

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems
