An Innovative Prosody Modeling Method for Chinese Speech Recognition

Research output: Journal article publication › Journal article › Academic research › peer-review

6 Citations (Scopus)

Abstract

This paper presents an innovative method for prosody modeling in Chinese speech recognition. The method first evaluates the reliability of the prosodic information; the recognition system then uses this reliability to dynamically tune the balance between the spectral scores and the prosodic scores. The basic idea is to use prosodic knowledge according to its reliability: the higher the reliability, the more the prosodic information contributes to recognition. As a result, the method incorporates more knowledge into the recognition system without introducing extra errors. Experimental results showed that the method reduced the word error rate by as much as 52.9% and 46.0% (relative) for Mandarin and Cantonese digit string recognition tasks, respectively. When tone information was incorporated into Cantonese Large Vocabulary Continuous Speech Recognition (LVCSR) via the proposed method, a 20.16% relative character error rate reduction was obtained.
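
The reliability-weighted balance described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: the linear log-domain combination, the reliability range of 0 to 1, and the names `combined_score`, `rescore`, and `max_weight` are all assumptions made for illustration, and the scores are invented toy values.

```python
def combined_score(spectral_score, prosodic_score, reliability, max_weight=1.0):
    """Combine log-domain scores, scaling the prosodic term by its reliability.

    Assumption: reliability lies in [0, 1] and the combination is linear.
    The paper may use a different reliability measure and weighting rule.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    weight = max_weight * reliability  # higher reliability, larger prosodic weight
    return spectral_score + weight * prosodic_score


def rescore(hypotheses, reliability):
    """Return the hypothesis with the best combined score."""
    return max(
        hypotheses,
        key=lambda h: combined_score(h["spectral"], h["prosodic"], reliability),
    )


# Toy hypotheses with invented log scores (hypothetical values).
hypotheses = [
    {"label": "A", "spectral": -10.0, "prosodic": -5.0},
    {"label": "B", "spectral": -10.5, "prosodic": -1.0},
]

# With unreliable prosody the spectral score alone decides the ranking.
print(rescore(hypotheses, reliability=0.0)["label"])  # prints "A"
# With reliable prosody the prosodic evidence changes the ranking.
print(rescore(hypotheses, reliability=0.9)["label"])  # prints "B"
```

In this sketch a reliability of zero leaves the spectral ranking untouched, which mirrors the abstract's claim that unreliable prosodic information should not introduce extra errors.
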
Original language: English
Pages (from-to): 129-140
Number of pages: 12
Journal: International Journal of Speech Technology
Volume: 7
Issue number: 2-3
Publication status: Published - 1 Apr 2004
Externally published: Yes

Keywords

  • Chinese dialects
  • Context-dependent
  • Prosody modeling
  • Speech recognition

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Human-Computer Interaction
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
