Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification

Man Wai Mak, Wei Rao

Research output: Journal article publication › Journal article › Academic research › peer-review

35 Citations (Scopus)

Abstract

Recent research has demonstrated the merit of combining Gaussian mixture models (GMMs) and support vector machines (SVMs) for text-independent speaker verification. However, one unaddressed issue in this GMM-SVM approach is the imbalance between the numbers of speaker-class and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique, utterance partitioning with acoustic vector resampling (UP-AVR), to mitigate this data imbalance problem. Briefly, the order of the acoustic vectors in an enrollment utterance is first randomized, and the randomized sequence is then partitioned into a number of segments. Each segment is used to produce a GMM supervector via MAP adaptation and mean vector concatenation. The randomization and partitioning steps are repeated several times to produce enough speaker-class supervectors for training an SVM. Experimental evaluations on the NIST 2002 and 2004 SRE suggest that UP-AVR can reduce the error rate of GMM-SVM systems.
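The resampling procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names `up_avr_segments` and `map_supervector`, the parameter names, the diagonal-covariance universal background model (UBM), and the mean-only relevance-MAP update with a relevance factor of 16 are all assumptions made for the example. The abstract specifies only that each segment yields a supervector "via MAP adaptation and mean vector concatenation".

```python
import numpy as np


def up_avr_segments(frames, n_partitions=4, n_resamplings=2, rng=None):
    """Shuffle the frame order, split it into segments, and repeat.

    frames: (T, D) array of acoustic vectors from one enrollment utterance.
    Returns a list of n_partitions * n_resamplings arrays of frames.
    (Hypothetical helper; names and defaults are illustrative.)
    """
    rng = np.random.default_rng(rng)
    segments = []
    for _ in range(n_resamplings):
        order = rng.permutation(len(frames))           # randomize sequence order
        for idx in np.array_split(order, n_partitions):  # partition the shuffle
            segments.append(frames[idx])
    return segments


def map_supervector(frames, ubm_weights, ubm_means, ubm_vars, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (an assumed
    variant), followed by concatenation of the adapted means.

    ubm_weights: (C,), ubm_means: (C, D), ubm_vars: (C, D).
    Returns a supervector of length C * D.
    """
    diff = frames[:, None, :] - ubm_means[None, :, :]            # (T, C, D)
    log_p = -0.5 * (np.sum(diff ** 2 / ubm_vars, axis=2)
                    + np.sum(np.log(2.0 * np.pi * ubm_vars), axis=1))
    log_p = log_p + np.log(ubm_weights)                          # (T, C)
    log_p -= log_p.max(axis=1, keepdims=True)                    # stabilize exp
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)                      # posteriors

    n = post.sum(axis=0)                                         # soft counts (C,)
    first = post.T @ frames                                      # (C, D)
    data_means = first / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                       # adaptation weight
    adapted = alpha * data_means + (1.0 - alpha) * ubm_means
    return adapted.reshape(-1)                                   # concatenate means


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    utterance = rng.normal(size=(200, 3))   # synthetic stand-in for real features
    weights = np.full(4, 0.25)              # toy 4-component UBM (made up)
    means = rng.normal(size=(4, 3))
    variances = np.ones((4, 3))

    segs = up_avr_segments(utterance, n_partitions=4, n_resamplings=2, rng=1)
    supervectors = np.stack(
        [map_supervector(s, weights, means, variances) for s in segs]
    )
    print(supervectors.shape)
```

In this sketch one enrollment utterance yields `n_partitions * n_resamplings` speaker-class supervectors instead of one, which is how the technique enlarges the minority class before SVM training. Because the frames are shuffled before splitting, each repetition produces different segments from the same utterance.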
Original language: English
Pages (from-to): 119-130
Number of pages: 12
Journal: Speech Communication
Volume: 53
Issue number: 1
Publication status: Published - 1 Jan 2011

Keywords

  • Data imbalance
  • GMM-supervectors (GSV)
  • GMM-SVM
  • Random resampling
  • Speaker verification
  • Support vector machine
  • Utterance partitioning

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
