Deep language: A comprehensive deep learning approach to end-to-end language recognition

Trung Ngo Trong, Ville Hautamäki, Kong Aik Lee

Research output: Unpublished conference presentation (presented paper, abstract, poster) | Conference presentation (not published in journal/proceeding/book) | Academic research | peer-review

21 Citations (Scopus)

Abstract

This work explores the use of various Deep Neural Network (DNN) architectures for an end-to-end language identification (LID) task. Deep learning has been shown to significantly improve the state of the art in many domains, including speech recognition, computer vision and genomics. As an end-to-end system, deep learning removes the burden of hand-crafting the feature extraction that conventional LID approaches require. This versatility is achieved by training a very deep network to learn distributed representations of speech features with multiple levels of abstraction. In this paper, we show that an end-to-end deep learning system can be used to recognize language from speech utterances of various lengths. Our results show that a combination of three deep architectures (feed-forward network, convolutional network and recurrent network) achieves the best performance compared to other network designs. Additionally, we compare our network's performance to a state-of-the-art BNF-based i-vector system on the NIST 2015 Language Recognition Evaluation corpus. Key to our approach is that we effectively address computational and regularization issues within the network structure, which allows us to build a deeper architecture than any previous DNN approach to the language recognition task.
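
The abstract names three building blocks (convolutional, recurrent and feed-forward) and utterances of varying length, but it does not say how the blocks are combined. The sketch below is only an illustration under stated assumptions, not the authors' architecture: it stacks a 1-D convolution over time, a simple Elman-style recurrence whose final state summarizes the utterance, and a feed-forward softmax layer. All function names, layer sizes and the random, untrained weights are hypothetical.

```python
import math
import random

def conv1d_relu(frames, kernels, bias):
    """Valid 1-D convolution over time, then ReLU.
    frames: T x D, kernels: C x K x D, returns (T-K+1) x C."""
    k = len(kernels[0])
    out = []
    for t in range(len(frames) - k + 1):
        row = []
        for c, kern in enumerate(kernels):
            s = bias[c]
            for i in range(k):
                s += sum(w * x for w, x in zip(kern[i], frames[t + i]))
            row.append(max(0.0, s))
        out.append(row)
    return out

def rnn_last_state(seq, w_in, w_rec):
    """Elman recurrence; the final hidden state has a fixed size
    regardless of how many time steps the input has."""
    h = [0.0] * len(w_rec)
    for x in seq:
        h = [math.tanh(sum(a * b for a, b in zip(w_in[j], x)) +
                       sum(a * b for a, b in zip(w_rec[j], h)))
             for j in range(len(h))]
    return h

def dense_softmax(h, w_out, b_out):
    """Feed-forward output layer producing one probability per language."""
    logits = [sum(a * b for a, b in zip(w_out[j], h)) + b_out[j]
              for j in range(len(w_out))]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]

def make_model(n_feats, n_channels, kernel, n_hidden, n_langs, seed=0):
    """Hypothetical model with random (untrained) weights."""
    rng = random.Random(seed)
    return {
        "kernels": [rand_matrix(rng, kernel, n_feats)
                    for _ in range(n_channels)],
        "conv_bias": [0.0] * n_channels,
        "w_in": rand_matrix(rng, n_hidden, n_channels),
        "w_rec": rand_matrix(rng, n_hidden, n_hidden),
        "w_out": rand_matrix(rng, n_langs, n_hidden),
        "b_out": [0.0] * n_langs,
    }

def predict(model, frames):
    """Convolution -> recurrence -> feed-forward softmax."""
    conv = conv1d_relu(frames, model["kernels"], model["conv_bias"])
    h = rnn_last_state(conv, model["w_in"], model["w_rec"])
    return dense_softmax(h, model["w_out"], model["b_out"])

# Two synthetic "utterances" of different lengths (10 and 30 frames).
rng = random.Random(1)
model = make_model(n_feats=8, n_channels=4, kernel=3, n_hidden=6, n_langs=5)
short_utt = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(10)]
long_utt = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(30)]
probs_short = predict(model, short_utt)
probs_long = predict(model, long_utt)
```

Because the recurrence reduces the time axis to a single hidden state, both utterances yield a probability vector of the same size (one entry per language). A trained system would learn these weights from labeled speech rather than drawing them at random.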

Original language: English
Pages: 109-116
Number of pages: 8
Publication status: Published - Jun 2016
Externally published: Yes
Event: Speaker and Language Recognition Workshop, Odyssey 2016 - Bilbao, Spain
Duration: 21 Jun 2016 - 24 Jun 2016

Conference

Conference: Speaker and Language Recognition Workshop, Odyssey 2016
Country/Territory: Spain
City: Bilbao
Period: 21/06/16 - 24/06/16

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Human-Computer Interaction
