Sparse coding of total variability matrix

Longting Xu, Kong Aik Lee, Haizhou Li, Zhen Yang

Research output: Journal article publication, Conference article, Academic research, peer-review

5 Citations (Scopus)

Abstract

In text-independent speaker verification, representing variable-length, information-rich speech utterances as fixed-dimensional vectors, for instance in the form of i-vectors, has been shown to be effective. An i-vector is a low-dimensional vector in the so-called total variability space, which is represented by a tall, thin rectangular matrix. Taking each row of the total variability matrix as a random vector, we look into the redundancy in representing the total variability space. We show that the total variability matrix is compressible and that this characteristic can be exploited to reduce the memory and computational requirements of i-vector extraction. We also show that existing sparse coding and dictionary learning techniques can be easily adapted for this purpose. Experiments on the NIST SRE'10 dataset confirm that the total variability matrix can be represented with a smaller matrix without affecting performance.
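The idea of coding each row of a tall, thin matrix as a sparse combination of a few dictionary atoms can be illustrated with a small sketch. This is not the authors' implementation: the matrix sizes, the synthetic data, the fixed random dictionary `D`, and the plain orthogonal matching pursuit routine `omp` are all assumptions chosen for illustration, and the dictionary learning step mentioned in the abstract is omitted.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit.

    Approximates y (length M) as a combination of at most n_nonzero
    rows (atoms) of the dictionary D (shape K x M).
    """
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[0])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = np.abs(D @ residual)
        if support:
            scores[support] = -np.inf  # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Refit all selected atoms jointly by least squares.
        A = D[support].T  # shape (M, len(support))
        sol, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ sol
    coef[support] = sol
    return coef


# Hypothetical sizes: a 300 x 50 stand-in for the total variability matrix,
# a dictionary of 60 atoms, and 2 non-zero coefficients per row.
rng = np.random.default_rng(0)
n_rows, M, K, s = 300, 50, 60, 2

# Assumed dictionary: random unit-norm atoms (a learned one would be used in practice).
D = rng.standard_normal((K, M))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# Synthetic matrix whose rows are exactly s-sparse in D.
codes = np.zeros((n_rows, K))
for i in range(n_rows):
    idx = rng.choice(K, size=s, replace=False)
    codes[i, idx] = rng.standard_normal(s)
T = codes @ D

# Sparse-code every row of T against the dictionary.
codes_hat = np.vstack([omp(D, row, s) for row in T])
T_hat = codes_hat @ D
rel_err = np.linalg.norm(T - T_hat) / np.linalg.norm(T)

# Rough storage comparison: dense entries versus dictionary entries
# plus one value and one index per non-zero coefficient.
dense_storage = T.size
sparse_storage = D.size + 2 * int(np.count_nonzero(codes_hat))
print(f"relative error: {rel_err:.3e}")
print(f"dense entries: {dense_storage}, sparse entries: {sparse_storage}")
```

In this toy setting the rows are sparse by construction, so the storage saving is guaranteed; whether a real total variability matrix compresses comparably is the empirical question the paper addresses.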

Original language: English
Pages (from-to): 1022-1026
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2015-January
Publication status: Published - Sept 2015
Externally published: Yes
Event: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany
Duration: 6 Sept 2015 - 10 Sept 2015

Keywords

  • I-vector
  • Sparse coding
  • Speaker verification

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
