Towards PLDA-RBM based speaker recognition in mobile environment: Designing stacked/deep PLDA-RBM systems

Andreas Nautsch, Hong Hao, Themos Stafylakis, Christian Rathgeb, Christoph Busch

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

The vast majority of text-independent speaker recognition systems rely on intermediate-sized vectors (i-vectors), which are compared by probabilistic linear discriminant analysis (PLDA). This paper proposes a PLDA-like approach with restricted Boltzmann machines (RBMs) for i-vector based speaker recognition: two deep architectures are presented and examined, which aim at suppressing channel effects and recovering speaker-discriminative information with back-ends trained on a small dataset. Experiments are carried out on the MOBIO SRE'13 database, a challenging and publicly available dataset for mobile speaker recognition with limited amounts of training data. The experiments show that the proposed system outperforms the baseline i-vector/PLDA approach, with relative gains of 31% on female and 9% on male speakers in terms of half total error rate (HTER).
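
The reported gains are relative reductions in half total error rate. As a rough illustration of how such figures are typically derived, the Python sketch below computes HTER as the mean of the false acceptance and false rejection rates at a fixed threshold, and then the relative gain over a baseline. The function names, the threshold, and the toy scores are assumptions made for this example; they are not taken from the paper, whose evaluation protocol is not described on this page.

```python
def half_total_error_rate(target_scores, nontarget_scores, threshold):
    """HTER = (FAR + FRR) / 2 at a fixed decision threshold."""
    # False rejection: a genuine (target) trial scored below the threshold.
    frr = sum(s < threshold for s in target_scores) / len(target_scores)
    # False acceptance: an impostor (non-target) trial scored at or above it.
    far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return (far + frr) / 2


def relative_gain(baseline_hter, proposed_hter):
    """Relative reduction of the error rate with respect to the baseline."""
    return (baseline_hter - proposed_hter) / baseline_hter


# Toy scores, invented for illustration only (not results from the paper).
targets = [2.1, 1.4, 0.3, 1.8]
nontargets = [-1.2, 0.6, -0.4, -2.0]
hter = half_total_error_rate(targets, nontargets, threshold=0.5)
```

In practice the threshold is usually fixed on a development set and then applied unchanged to the evaluation set, so that the reported HTER reflects a decision rule chosen without access to the evaluation scores.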
Original language: English
Title of host publication: Proceedings of 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Number of pages: 5
Publisher: IEEE
Publication date: 2016
Publication status: Published - 2016
Event: 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing - Shanghai, China
Duration: 20 Mar 2016 to 25 Mar 2016
Conference number: 41

Conference

Conference: 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing
Number: 41
Country/Territory: China
City: Shanghai
Period: 20/03/2016 to 25/03/2016
Series: IEEE International Conference on Acoustics, Speech and Signal Processing. Proceedings
ISSN: 1520-6149

Keywords

  • Deep learning
  • MOBIO
  • PLDA-RBM
  • Speaker recognition
