awesome-speaker-embedding

A curated list of speaker-embedding, speaker-verification, and speaker-identification resources.
https://github.com/ranchlai/awesome-speaker-embedding

Challenges
Code/Tools/Frameworks/Libraries
- VGGVox
- SincNet
- 3D CNN
- GE2E
- asv-subtools
- Resemblyzer - derives a high-level representation of a voice through a deep learning model (referred to as the voice encoder).
- Res2Net
- voxceleb_trainer
- pytorch_xvectors - x-vectors.
- DeepSpeaker - an End-to-End Neural Speaker Embedding System.
- voxceleb - an audio-visual dataset consisting of short clips of human speech, extracted from interview videos uploaded to YouTube.
- Triplet-loss
- kaldi
- Winning solutions of Challenges
  - REPORT - asr/kaldi/tree/master/egs/voxceleb) for data-aug
  - REPORT
  - REPORT
  - REPORT
  - REPORT
- More-recent papers
  - Attention Back-end - back-end, model: TDNN, ResNet, data: CN-Celeb
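
The Triplet-loss entry in the list above refers to a training objective that pulls an anchor embedding toward a same-speaker (positive) embedding and pushes it away from a different-speaker (negative) one. The sketch below is a generic, plain-Python illustration of that objective, not the code from any listed repository; the function names, the Euclidean distance, and the `margin=0.2` default are assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (hypothetical helper, for illustration only).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D "embeddings"; real speaker embeddings have hundreds of dimensions.
anchor = [0.0, 0.0]
same_speaker = [0.0, 1.0]
other_speaker = [3.0, 4.0]

easy = triplet_loss(anchor, same_speaker, other_speaker)   # negative is far away
hard = triplet_loss(anchor, other_speaker, same_speaker)   # roles swapped
```

In practice the listed frameworks compute such losses on batches of tensors and may use cosine similarity instead of Euclidean distance; the scalar version here only shows the shape of the objective.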
Must-read papers
- 02\
- 03\
- 04\
- 05\
- 06\
- 07\ - based/time-delay/multi-class, softmax + cross-entropy loss
- 08\ - x-vector paper from Johns Hopkins, based on TDNN, improved by adding noise and reverberation for augmentation
- 09\ - vector paper from Johns Hopkins
- 10\
- 11\ - vector paper from Johns Hopkins
- 12\ - norm paper, useful for score normalization
- 01\
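
Entry 12 above is listed as useful for score normalization. As a rough illustration of what score normalization does, the sketch below implements one common variant, symmetric normalization (S-norm), on top of cosine scoring. It is an assumption that this matches the variant in the listed paper; the function names and the toy cohort are hypothetical, and plain Python lists stand in for real embeddings.

```python
import math
from statistics import mean, pstdev


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def s_norm(enroll, test, cohort):
    """Symmetric score normalization of a cosine trial score.

    The raw score is standardized twice: once with the statistics of the
    enrollment embedding scored against the cohort, once with those of the
    test embedding, and the two results are averaged.
    """
    raw = cosine(enroll, test)
    enroll_scores = [cosine(enroll, c) for c in cohort]
    test_scores = [cosine(test, c) for c in cohort]
    z_enroll = (raw - mean(enroll_scores)) / pstdev(enroll_scores)
    z_test = (raw - mean(test_scores)) / pstdev(test_scores)
    return 0.5 * (z_enroll + z_test)


# Toy cohort of "impostor" embeddings (hypothetical values).
cohort = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = [1.0, 0.2]
b = [0.9, 0.3]    # close to a: plausible same-speaker trial
c = [-1.0, 0.5]   # far from a: plausible impostor trial

target_score = s_norm(a, b, cohort)
impostor_score = s_norm(a, c, cohort)
```

A real cohort would contain hundreds or thousands of embeddings from speakers outside the evaluation set; with a cohort whose scores have zero variance the division above would fail, so production code usually guards against that.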
Benchmarks (not very accurate)
- Voxceleb1 - [VoxCeleb1-E](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/list_test_all2.txt) and [VoxCeleb1-H](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/list_test_hard2.txt).
- report - nano|2020
- report
- report - |2020|
- link - |2021|
- report
- report
- report - |2020|
Must-read technical reports
- VOXSRC 2019 reports
Datasets
- TIMIT - non-free
- NIST SRE - non-free
- AIShell-1
- AIShell-2 - non-free for commercial use
- AIShell-3
- AIShell-4
- HI-MIA - far-field text-dependent speaker verification and keyword spotting
- SITW
- Voxceleb 1&2
- Cn-Celeb 1&2 - multi-genre speaker dataset in the wild; utterances are from Chinese celebrities.
Great Talks / Tutorials
- X-vectors: Neural Speech Embeddings for Speaker Recognition - Daniel Garcia-Romero, 2020
- 2020 Academic Symposium on Voiceprint Recognition Research and Applications (2020声纹识别研究与应用学术讨论会)