Joint speaker and lexical modeling for short-term characterization of speaker

Guangsen Wang, Kong Aik Lee, Trung Hieu Nguyen, Hanwu Sun, Bin Ma

Research output: Journal article publicationConference articleAcademic researchpeer-review

13 Citations (Scopus)

Abstract

For speech utterances of very short duration, speaker characterization has shown strong dependency on the lexical content. In this context, speaker verification is always performed by analyzing and matching speaker pronunciation of individual words, syllables, or phones. In this paper, we advocate the use of hidden Markov model (HMM) for joint modeling of speaker characteristic and lexical content. We then develop a scoring model that scores only the speaker part rather than the joint speakerlexical component leading to a better speaker verification performance. Experiments were conducted on the text-prompted task of RSR2015 and the RedDots datasets. In the RSR2015, the prompted texts are limited to random sequences of digits. The RedDots dataset dictates an unconstrained scenario where the prompted texts are free-text sentences. Both RSR2015 and RedDots datasets are publicly available.

Original languageEnglish
Pages (from-to)415-419
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume08-12-September-2016
DOIs
Publication statusPublished - Sept 2016
Externally publishedYes
Event17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States
Duration: 8 Sept 201616 Sept 2016

Keywords

  • Hidden Markov model
  • Speaker adaptation
  • Speaker verification
  • Text-dependent
  • Text-prompted

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Fingerprint

Dive into the research topics of 'Joint speaker and lexical modeling for short-term characterization of speaker'. Together they form a unique fingerprint.

Cite this