The NEC-TT 2018 speaker verification system

Kong Aik Lee, Hitoshi Yamamoto, Koji Okabe, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda

Research output: Journal article publication, Conference article, Academic research, peer-review

13 Citations (Scopus)

Abstract

This paper describes the NEC-TT speaker verification system for the 2018 NIST speaker recognition evaluation (SRE'18). We present the details of the data partitioning, x-vector speaker embedding, data augmentation, speaker diarization, and domain adaptation techniques used in the NEC-TT SRE'18 speaker verification system. For the speaker embedding front-end, we found that the amount and diversity of training data are essential to improving the robustness of the x-vector extractor. In our submission, this was achieved with data augmentation and mixed-bandwidth training. For the multi-speaker test scenario, we show that x-vector based speaker diarization is promising and holds potential for future research. For the scoring back-end, we used two variants of probabilistic linear discriminant analysis (PLDA), namely the Gaussian PLDA and the heavy-tailed PLDA. We show that correlation alignment (CORAL) and CORAL+ unsupervised PLDA adaptation are effective in dealing with domain mismatch.
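
As an illustration of the correlation alignment idea mentioned in the abstract, the following is a minimal sketch of generic feature-level CORAL: out-of-domain embeddings are whitened with their own covariance and re-colored with the in-domain covariance. It is not the NEC-TT implementation. The function names, the regularization constant `reg`, and the shift to the in-domain mean are assumptions made for this example, and CORAL+ (which adapts the PLDA model parameters) is not shown.

```python
import numpy as np


def _sym_power(cov, power, eps=1e-8):
    """Raise a symmetric positive semi-definite matrix to a real power."""
    vals, vecs = np.linalg.eigh(cov)
    vals = np.clip(vals, eps, None)  # guard against tiny or negative eigenvalues
    return (vecs * vals ** power) @ vecs.T


def coral_transform(out_domain, in_domain, reg=1e-3):
    """Align second-order statistics of out-of-domain vectors to in-domain ones.

    out_domain, in_domain: arrays of shape (num_vectors, dim).
    reg: hypothetical diagonal regularizer added to both covariances.
    """
    dim = out_domain.shape[1]
    c_out = np.cov(out_domain, rowvar=False) + reg * np.eye(dim)
    c_in = np.cov(in_domain, rowvar=False) + reg * np.eye(dim)

    whiten = _sym_power(c_out, -0.5)   # removes out-of-domain correlations
    recolor = _sym_power(c_in, 0.5)    # imposes in-domain correlations

    # Assumption of this sketch: center on the out-of-domain mean and
    # shift to the in-domain mean after re-coloring.
    centered = out_domain - out_domain.mean(axis=0)
    return centered @ whiten @ recolor + in_domain.mean(axis=0)
```

In a typical unsupervised setup, the in-domain statistics would come from unlabeled target-domain embeddings, and the transformed out-of-domain embeddings would then be used to train the PLDA back-end.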

Original language: English
Pages (from-to): 4355-4359
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2019-September
Publication status: Published - Sept 2019
Externally published: Yes
Event: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 - Graz, Austria
Duration: 15 Sept 2019 - 19 Sept 2019

Keywords

  • Benchmark evaluation
  • Speaker recognition

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
