The impact of speech recognition errors on the effectiveness of spoken Cantonese query retrieval

Research output: Chapter in book / Conference proceeding, Conference article published in proceeding or book, Academic research, peer-review

Abstract

This paper examines the impact of speech recognition errors on the effectiveness of spoken Cantonese query retrieval. One of the largest test collections provided by NTCIR for evaluating Chinese information retrieval is used. The retrieval system uses one of the best-performing models (2-Poisson) and the robust bigram indexing strategy. If there are no syllable recognition errors, the errors in converting spelling (called pinyin) to characters degrade performance by 3.9 percentage points, which is not statistically significant. With syllable recognition errors, performance drops by 10.2 percentage points, which is statistically significant. We improved our system by merging the /n/ and /l/ phone labels and retraining the syllable-to-text conversion routines. With the improved system, performance drops by only 6.4 percentage points.
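The two techniques named in the abstract, merging the /n/ and /l/ labels and bigram indexing, can be illustrated with a minimal sketch. This is not the authors' implementation: the Jyutping-style romanized syllables, the function names `merge_n_l` and `char_bigrams`, and the choice to normalize syllable strings (rather than recognizer phone labels) are all assumptions made for illustration.

```python
def merge_n_l(syllable: str) -> str:
    """Map an initial 'n' to 'l' so both labels fall into one class.

    Assumes Jyutping-style romanization (hypothetical input format).
    The velar nasal 'ng' is a different sound and is left untouched.
    """
    if syllable.startswith("n") and not syllable.startswith("ng"):
        return "l" + syllable[1:]
    return syllable


def char_bigrams(text: str) -> list[str]:
    """Return the overlapping character bigrams used as index terms."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


if __name__ == "__main__":
    # Syllables of a hypothetical recognized query.
    print([merge_n_l(s) for s in ["nei5", "ngo5", "lei5"]])
    # Index terms for a short Chinese string.
    print(char_bigrams("香港大學"))
```

Merging the two labels means that a confusion between /n/ and /l/ no longer counts as a distinct syllable, and overlapping bigrams avoid the need for word segmentation when indexing Chinese text.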
Original language: English
Title of host publication: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004
Pages: 210-213
Number of pages: 4
Publication status: Published - 1 Dec 2004
Event: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004 - Hong Kong, China, Hong Kong
Duration: 20 Oct 2004 to 22 Oct 2004

Conference

Conference: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004
Country/Territory: Hong Kong
City: Hong Kong, China
Period: 20/10/04 to 22/10/04

ASJC Scopus subject areas

  • Engineering(all)
