Abstract
This paper investigates automatic segmentation of speech recordings for broadcast news (BN) and broadcast conversation (BC) speech recognition. Our previous segmentation algorithm often exhibited a high rate of deletion errors, where some speech segments were misclassified as non-speech and thus were never passed on to the recognizer. Unlike our previous segmentation models, which only differentiated between speech and non-speech segments, the new segmenter applies phonetic knowledge, representing speech with multiple models for different types of speech segments. Moreover, the "pronunciation" of the speech segment has been modified to loosen the minimum duration constraint. This method makes use of language-specific knowledge while keeping the number of models low to achieve fast segmentation. Experimental results show that the new segmenter significantly outperforms our previous segmenter, particularly in reducing deletion errors.
Original language | English |
---|---|
Title of host publication | International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007 |
Pages | 2580-2583 |
Number of pages | 4 |
Volume | 4 |
Publication status | Published - 1 Dec 2007 |
Externally published | Yes |
Event | 8th Annual Conference of the International Speech Communication Association, Interspeech 2007 - Antwerp, Belgium, 27 Aug 2007 → 31 Aug 2007 |
Conference
Conference | 8th Annual Conference of the International Speech Communication Association, Interspeech 2007 |
---|---|
Country/Territory | Belgium |
City | Antwerp |
Period | 27/08/07 → 31/08/07 |
Keywords
- Automatic acoustic segmentation
- Broadcast conversation
- Broadcast news
- Machine translation
- Speech recognition
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Modelling and Simulation
- Linguistics and Language
- Communication