Abstract
This paper presents a methodology for synthesizing speech from surface electromyogram (SEMG) signals recorded from the cheek and chin. Simultaneously recorded speech and SEMG signals are blocked into frames and transformed into features: linear predictive coding (LPC) coefficients for speech and short-time Fourier transform coefficients for SEMG. A neural network converts the SEMG features into speech features on a frame-by-frame basis, and the converted speech features are used to reconstruct the original speech. Feature selection, the conversion methodology, and experimental results are discussed. The results show that phoneme-based feature extraction and frame-based feature conversion can be applied to SEMG-based continuous speech synthesis.
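The pipeline described in the abstract — frame blocking, LPC features for speech, short-time Fourier magnitudes for SEMG, and a learned frame-by-frame mapping — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame length, hop size, LPC order, and the least-squares map (a stand-in for the paper's neural network) are all assumptions, and the signals are synthetic noise.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Block a 1-D signal into overlapping frames (frame sizes are assumed)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def lpc_coeffs(frame, order=10):
    """LPC coefficients via autocorrelation + Levinson-Durbin recursion."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1 : len(frame) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err
        new_a = a.copy()
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]  # prediction coefficients of A(z) = 1 + sum a_j z^-j

def semg_features(frame):
    """Short-time Fourier magnitude spectrum of one windowed SEMG frame."""
    return np.abs(np.fft.rfft(frame * np.hanning(len(frame))))

def fit_linear_map(X, Y):
    """Least-squares frame-wise map; a toy stand-in for the neural network."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

# Synthetic stand-ins for simultaneously recorded SEMG and speech.
rng = np.random.default_rng(0)
semg = rng.standard_normal(4096)
speech = rng.standard_normal(4096)

X = np.stack([semg_features(f) for f in frame_signal(semg)])   # SEMG features
Y = np.stack([lpc_coeffs(f) for f in frame_signal(speech)])    # speech features
W = fit_linear_map(X, Y)
Y_hat = X @ W  # converted speech features, one row per frame
```

In the paper the converted LPC features would then drive an LPC synthesis filter to reconstruct the speech waveform; that resynthesis step is omitted here.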
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings of the Fifth IEEE International Symposium on Signal Processing and Information Technology |
| Pages | 749-754 |
| Number of pages | 6 |
| Publication status | Published - 1 Dec 2005 |
| Event | 5th IEEE International Symposium on Signal Processing and Information Technology - Athens, Greece, 18 Dec 2005 → 21 Dec 2005 |
Conference
| Conference | 5th IEEE International Symposium on Signal Processing and Information Technology |
| --- | --- |
| Country/Territory | Greece |
| City | Athens |
| Period | 18/12/05 → 21/12/05 |
Keywords
- LPC
- Short-time Fourier transform
- Neural networks
- SEMG
- Speech synthesis
ASJC Scopus subject areas
- General Engineering