Building a highly accurate Mandarin speech recognizer

Mei Yuh Hwang, Gang Peng, Wen Wang, Arlo Faria, Aaron Heidel, Mari Ostendorf

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

18 Citations (Scopus)

Abstract

We describe a highly accurate large-vocabulary continuous Mandarin speech recognizer, a collaborative effort among four research organizations. In particular, we build two acoustic models (AMs) with significant differences but similar accuracy for the purposes of cross adaptation and system combination. This paper elaborates on the main differences between the two systems, where one recognizer incorporates a discriminatively trained feature while the other utilizes a discriminative feature transformation. Additionally, we present an improved acoustic segmentation algorithm and topic-based language model (LM) adaptation. Coupled with increased acoustic training data, we reduced the character error rate (CER) on the DARPA GALE 2006 evaluation set from 18.4% to 15.3%.
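For readers unfamiliar with the metric, the sketch below shows the conventional definition of CER: the character-level edit distance between reference and hypothesis, divided by the reference length. It is an illustration only and not the scoring tool used for the GALE evaluation. The function names and the sample sentence are hypothetical, and text normalization is assumed to have been done beforehand. The last helper computes the relative reduction implied by the two CER figures quoted in the abstract.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """CER = (substitutions + deletions + insertions) / reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, improved):
    """Fraction of the baseline error rate that was removed."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical example: one substituted character in a six-character reference.
    print(character_error_rate("今天天气很好", "今天天汽很好"))
    # CER figures taken from the abstract (in percent).
    print(relative_reduction(18.4, 15.3))
```

Under this definition, the drop from 18.4% to 15.3% is 3.1 points absolute, or roughly 17% relative.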
Original language: English
Title of host publication: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Proceedings
Pages: 490-495
Number of pages: 6
Publication status: Published - 1 Dec 2007
Externally published: Yes
Event: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007 - Kyoto, Japan
Duration: 9 Dec 2007 to 13 Dec 2007

Conference

Conference: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007
Country: Japan
City: Kyoto
Period: 9/12/07 to 13/12/07

Keywords

  • Acoustic segmentation
  • Character error rates
  • Discriminative features
  • LM adaptation
  • Mandarin
  • Multi-layer perceptrons
  • Out-of-vocabulary

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Software
  • Artificial Intelligence
