On the analysis and evaluation of prosody conversion techniques

Berrak Sisman, Grandee Lee, Haizhou Li, Kay Chen Tan

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

8 Citations (Scopus)

Abstract

Voice conversion is the process of modifying characteristics of a source speaker's speech, such as spectrum and/or prosody, so that it sounds as if it were spoken by another speaker. In this paper, we study the evaluation of prosody transformation, in particular the evaluation of fundamental frequency (F0) conversion. F0 is an essential prosody feature that should be handled in a comprehensive voice conversion framework. So far, converted prosody features have been evaluated mainly using the Pearson correlation coefficient and root mean square error (RMSE). Unfortunately, these measures do not explicitly capture the F0 alignment between the source and target signals. We believe that an evaluation measure that takes the time alignment of F0 into account is needed to provide a new perspective. Therefore, in this paper, we study a new technique to assess the accuracy of prosody transformation. In our experiments with different prosody transformation techniques, we report that the proposed evaluation approach achieves results consistent with the baseline evaluation metrics.
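As an illustration of the two baseline measures named in the abstract, the following minimal sketch computes RMSE and the Pearson correlation coefficient between a converted and a target F0 contour. It is not the evaluation technique proposed in the paper, which the abstract does not specify. The function name `f0_rmse_and_pcc`, the convention that 0 marks an unvoiced frame, the assumption that both contours are already frame-aligned, and the sample values are all hypothetical.

```python
import math


def f0_rmse_and_pcc(f0_converted, f0_target):
    """Return (RMSE in Hz, Pearson correlation) over frames voiced in both contours.

    Assumptions (not from the paper): contours are frame-aligned,
    equal in length, and use 0 to mark unvoiced frames.
    """
    if len(f0_converted) != len(f0_target):
        raise ValueError("contours must have the same number of frames")

    # Keep only frames where both contours are voiced.
    pairs = [(c, t) for c, t in zip(f0_converted, f0_target) if c > 0 and t > 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two frames voiced in both contours")

    # Root mean square error between converted and target F0.
    rmse = math.sqrt(sum((c - t) ** 2 for c, t in pairs) / n)

    # Pearson correlation: covariance normalised by both standard deviations.
    mean_c = sum(c for c, _ in pairs) / n
    mean_t = sum(t for _, t in pairs) / n
    cov = sum((c - mean_c) * (t - mean_t) for c, t in pairs)
    var_c = sum((c - mean_c) ** 2 for c, _ in pairs)
    var_t = sum((t - mean_t) ** 2 for _, t in pairs)
    if var_c == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant contour")
    pcc = cov / math.sqrt(var_c * var_t)

    return rmse, pcc


# Hypothetical contours in Hz; the converted one is the target shifted up by 5 Hz.
target = [0.0, 120.0, 130.0, 140.0, 0.0, 150.0]
converted = [0.0, 125.0, 135.0, 145.0, 110.0, 155.0]
rmse, pcc = f0_rmse_and_pcc(converted, target)
print(f"RMSE = {rmse:.2f} Hz, PCC = {pcc:.3f}")
```

In this example the converted contour differs from the target by a constant offset, so the correlation is perfect while the RMSE is not zero. Both measures compare frame i with frame i only, so neither says anything about how well the contours are aligned in time.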

Original language: English
Title of host publication: Proceedings of the 2017 International Conference on Asian Language Processing, IALP 2017
Editors: Rong Tong, Minghui Dong, Yanfeng Lu, Yue Zhang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 44-47
Number of pages: 4
ISBN (Electronic): 9781538619803
Publication status: Published - 21 Feb 2018
Externally published: Yes
Event: 21st International Conference on Asian Language Processing, IALP 2017 - Singapore, Singapore
Duration: 5 Dec 2017 to 7 Dec 2017

Publication series

Name: Proceedings of the 2017 International Conference on Asian Language Processing, IALP 2017
Volume: 2018-January

Conference

Conference: 21st International Conference on Asian Language Processing, IALP 2017
Country/Territory: Singapore
City: Singapore
Period: 5/12/17 to 7/12/17

Keywords

  • Prosody evaluation
  • prosody transformation
  • voice conversion

ASJC Scopus subject areas

  • Language and Linguistics
  • Artificial Intelligence
  • Computer Networks and Communications
  • Human-Computer Interaction
  • Signal Processing
