A Class-dependent Background Model for Speech Signal Feature Extraction

Yuechi Jiang, H. F. Frank Leung

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

2 Citations (Scopus)

Abstract

The Universal Background Model (UBM) has been successfully applied to many speech signal classification tasks, such as speaker recognition and microphone recognition. A UBM is used to form a Gaussian Supervector (GSV) or an i-vector, either of which serves as a feature vector representing a piece of speech signal. In this paper, we propose another background model, the Class-dependent Background Model (CBM), which makes use of class labels. The UBM is a purely generative model, while the CBM can be both generative and discriminative. Under some conditions, a CBM can take less time to construct than a UBM. We also compare the performance of the UBM and the CBM as the background model used to form GSVs and i-vectors for speaker recognition, microphone recognition, and telephone session recognition. Experimental results show that the CBM performs very well and, in most cases, can outperform the UBM.
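The abstract does not say how a GSV is formed from a background model, so the following is only a minimal sketch of one common construction, not the authors' method. It assumes the background model is a diagonal-covariance Gaussian mixture, adapts only the component means to one utterance by MAP adaptation, and concatenates the adapted means. The function names, the plain-list data layout, and the relevance factor of 16 are all assumptions made for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Responsibility of each background-model component for frame x."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_supervector(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the component means to `frames` and concatenate them.

    `relevance` is an assumed relevance factor; it is not taken from the paper.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp                      # soft frame counts per component
    sums = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted frame sums
    for x in frames:
        for c, g in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += g
            for d in range(dim):
                sums[c][d] += g * x[d]
    supervector = []
    for c in range(n_comp):
        alpha = counts[c] / (counts[c] + relevance)  # data-vs-prior weight
        for d in range(dim):
            data_mean = sums[c][d] / counts[c] if counts[c] > 0 else means[c][d]
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[c][d])
    return supervector
```

With no frames, the supervector is simply the background model's own means. As frames accumulate, each adapted mean moves from the background mean toward the utterance's data, and the resulting vector has length (number of components) × (feature dimension).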

Original language: English
Title of host publication: 2018 IEEE 23rd International Conference on Digital Signal Processing, DSP 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538668115
Publication status: Published - 19 Nov 2018
Event: 23rd IEEE International Conference on Digital Signal Processing, DSP 2018 - Shanghai, China
Duration: 19 Nov 2018 to 21 Nov 2018

Publication series

Name: International Conference on Digital Signal Processing, DSP
Volume: 2018-November

Conference

Conference: 23rd IEEE International Conference on Digital Signal Processing, DSP 2018
Country/Territory: China
City: Shanghai
Period: 19/11/18 to 21/11/18

Keywords

  • class-dependent background model
  • Gaussian supervector
  • i-vector
  • speech signal classification
  • universal background model

ASJC Scopus subject areas

  • Signal Processing
