Robust voice activity detection for interview speech in NIST speaker recognition evaluation

Man Wai Mak, Hon Bill Yu

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

16 Citations (Scopus)

Abstract

The introduction of interview speech in recent NIST Speaker Recognition Evaluations (SREs) has necessitated the development of robust voice activity detectors (VADs) that can operate at very low signal-to-noise ratios (SNRs). This paper highlights the characteristics of interview speech files in NIST SREs and discusses the difficulties of detecting speech and non-speech segments in these files. To alleviate these difficulties, this paper proposes a VAD that uses noise reduction as a pre-processing step. A strategy to avoid the undesirable effects of impulsive signals and sinusoidal background signals on the VAD is also proposed. The proposed VAD is compared with the VAD in the ETSI-AMR speech coder on the task of removing silence regions from interview speech files. The results show that the proposed VAD is more robust at detecting speech segments under very low SNR, leading to a significant performance gain in Common Conditions 1-4 of the NIST 2008 SRE.
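As a rough illustration of the general idea of noise reduction as a pre-processing step before voice activity detection, the sketch below applies basic magnitude spectral subtraction and then thresholds frame energy. It is not the authors' algorithm: the function names, frame sizes, spectral floor, and 6 dB margin are all hypothetical choices, the noise spectrum is assumed to be estimable from the first few frames (assumed to contain no speech), and the handling of impulsive and sinusoidal background signals mentioned in the abstract is not modeled.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def spectral_subtract(frames, noise_frames=10, floor=0.01):
    """Basic magnitude spectral subtraction.

    Assumption: the first `noise_frames` frames contain only background noise,
    so their average magnitude spectrum serves as the noise estimate.
    """
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the noise estimate, keeping a small spectral floor so that
    # no bin becomes negative.
    return np.maximum(mag - noise_mag, floor * mag)

def energy_vad(x, sr, frame_ms=25, hop_ms=10, noise_frames=10, margin_db=6.0):
    """Return one boolean speech/non-speech decision per frame.

    The threshold (hypothetical choice) sits `margin_db` above the largest
    denoised energy observed in the assumed noise-only frames.
    """
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop)
    clean_mag = spectral_subtract(frames, noise_frames)
    energy_db = 10.0 * np.log10((clean_mag ** 2).sum(axis=1) + 1e-12)
    threshold = energy_db[:noise_frames].max() + margin_db
    return energy_db > threshold
```

Calling `energy_vad(samples, sr)` on a mono waveform yields a boolean array with one entry per 10 ms hop. A fixed leading-frame noise estimate is the simplest option and would likely need to be replaced by an adaptive estimate for long, non-stationary recordings such as interview sessions.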
Original language: English
Title of host publication: APSIPA ASC 2010 - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
Pages: 64-71
Number of pages: 8
Publication status: Published - 1 Dec 2010
Event: 2nd Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2010 - Biopolis, Singapore
Duration: 14 Dec 2010 to 17 Dec 2010

Conference

Conference: 2nd Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2010
Country/Territory: Singapore
City: Biopolis
Period: 14/12/10 to 17/12/10

Keywords

  • Far-field microphone
  • NIST speaker recognition evaluations
  • Noise reduction
  • Speaker verification
  • Spectral subtraction
  • Voice activity detection

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
