Adversarial data augmentation network for speech emotion recognition

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

21 Citations (Scopus)

Abstract

Insufficient data is a common issue in training deep learning models. With the introduction of generative adversarial networks (GANs), data augmentation has become a promising solution to this problem. This paper investigates whether data augmentation can help improve speech emotion recognition. Unlike conventional GANs, we train a GAN with an autoencoder, where the input to the discriminator comes from the bottleneck layer of the autoencoder and the output of the generator. The synthetic samples can be obtained from the decoder, using the output of the generator as the decoder's input. The combined network, namely adversarial data augmentation network (ADAN), can generate samples that share common latent representation with the real data. Evaluations on EmoDB and IEMOCAP show that using OpenSmile features as input, the ADAN can produce augmented data that make an ordinary SVM classifier outperforms an RNN classifier with local attention and make a DNN competitive to some state-of-The art systems.

Original languageEnglish
Title of host publication2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages529-534
Number of pages6
ISBN (Electronic)9781728132488
DOIs
Publication statusPublished - 18 Nov 2019
Event2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 - Lanzhou, China
Duration: 18 Nov 201921 Nov 2019

Publication series

Name2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019

Conference

Conference2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Country/TerritoryChina
CityLanzhou
Period18/11/1921/11/19

ASJC Scopus subject areas

  • Information Systems

Fingerprint

Dive into the research topics of 'Adversarial data augmentation network for speech emotion recognition'. Together they form a unique fingerprint.

Cite this