Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
https://github.com/wq2012/awesome-diarization
-
Other learning materials
-
Tech blogs
- Speaker Diarization: Separation of Multiple Speakers in an Audio File
- Speaker Diarization with Kaldi
- Who spoke when! How to Build your own Speaker Diarization Module
- Literature Review For Speaker Change Detection
- Halil Erdoğan
-
Online courses
-
Video tutorials
- pyannote audio: neural building blocks for speaker diarization
- Google's Diarization System: Speaker Diarization with LSTM
- Fully Supervised Speaker Diarization: Say Goodbye to clustering
- Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection
- Speaker Diarization: Optimal Clustering and Learning Speaker Embeddings
- Robust Speaker Diarization for Meetings: the ICSI system
- 【机器之心 & 博文视点】Introduction to Voiceprint Technology | Lecture 2: Speaker Diarization and Other Applications
-
-
Products
- Recorder app
- Google Cloud Speech-to-Text API
- Watson Speech To Text API
- Speaker Diarization API
- Tingwu (听悟)
- Azure Conversation Transcription API
-
-
Publications
-
Other
- A study of the cosine distance-based mean shift for telephone speech diarization
- Stream-based speaker segmentation using speaker factors and eigenvoices
- Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation
- End-to-end speaker segmentation for overlap-aware resegmentation
- DIVE: End-to-end Speech Diarization via Iterative Speaker Embedding
- DOVER-Lap: A method for combining overlap-aware diarization outputs
- Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks
- An End-to-End Speaker Diarization Service for improving Multimedia Content Access
- Spot the conversation: speaker diarisation in the wild
- Speaker Diarization with Region Proposal Network
- Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario
- Overlap-aware diarization: resegmentation using neural end-to-end overlapped speech detection
- Speaker diarization using latent space clustering in generative adversarial network
- A study of semi-supervised speaker diarization system using GAN mixture model
- Learning deep representations by multilayer bootstrap networks for speaker diarization
- Enhancements for Audio-only Diarization Systems
- LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization
- Meeting Transcription Using Virtual Microphone Arrays
- Speaker diarisation using 2D self-attentive combination of embeddings
- Speaker Diarization with Lexical Information
- Neural speech turn segmentation and affinity propagation for speaker diarization
- Multimodal Speaker Segmentation and Diarization using Lexical and Acoustic Cues via Sequence to Sequence Neural Networks
- Joint Speaker Diarization and Recognition Using Convolutional and Recurrent Neural Networks
- Speaker Diarization with LSTM
- Speaker diarization using deep neural network embeddings
- Speaker diarization using convolutional neural network for statistics accumulation refinement
- pyannote.metrics: a toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems
- Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks
- Speaker Diarization using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings
- A Speaker Diarization System for Studying Peer-Led Team Learning Groups
- Diarization resegmentation in the factor analysis subspace
- Speaker diarization with PLDA i-vector scoring and unsupervised calibration
- Artificial neural network features for speaker diarization
- Unsupervised methods for speaker diarization: An integrated and iterative approach
- PLDA-based Clustering for Speaker Diarization of Broadcast Streams
- Speaker diarization of meetings based on speaker role n-gram models
- Speaker Diarization for Meeting Room Audio
- An overview of automatic speaker diarization systems
- A spectral clustering approach to speaker diarization
- AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
-
Special topics
- Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR
- A Review of Speaker Diarization: Recent Advances with Deep Learning
- A review on speaker diarization systems and approaches
- Speaker diarization: A review of recent research
- DiarizationLM: Speaker Diarization Post-Processing with Large Language Models
- Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach
- Lexical speaker error correction: Leveraging language models for speaker diarization error correction
- TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization
- Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
- End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings
- Supervised online diarization with sample mean loss for multi-domain data
- Discriminative Neural Clustering for Speaker Diarisation
- End-to-End Neural Speaker Diarization with Permutation-Free Objectives
- End-to-End Neural Speaker Diarization with Self-attention
- Fully Supervised Speaker Diarization
- A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings
- Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection
- Joint Speech Recognition and Speaker Diarization via Sequence Transduction
- Says who? Deep learning models for joint speech recognition, segmentation and diarization
- Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox
- Online Speaker Diarization with Relation Network
- VoiceID on the Fly: A Speaker Recognition System that Learns from Scratch
- M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge
- The Hitachi-JHU DIHARD III system: Competitive end-to-end neural diarization and x-vector clustering systems combined by DOVER-Lap
- ODESSA at Albayzin Speaker Diarization Challenge 2018
- Joint Discriminative Embedding Learning, Speech Activity and Overlap Detection for the DIHARD Challenge
- DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization
- End-to-End Audio-Visual Neural Speaker Diarization
- MSDWild: Multi-modal Speaker Diarization Dataset in the Wild
- DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors
- AVA-AVD: Audio-Visual Speaker Diarization in the Wild
-
-
Software
-
Framework
- SIDEKIT for diarization (s4d)
- LIUM SpkDiarization
- kaldi-asr (Bash) - Example scripts for speaker diarization on a portion of CALLHOME used in the 2000 NIST speaker recognition evaluation.
- Alize LIA_SpkSeg
-
Evaluation
- md-eval.pl - Copies include md-eval-v21.pl (https://github.com/jitendrab/btp/blob/master/c_code/single_diag_gaussian_no_viterbi/md-eval-v21.pl) from jitendra (https://github.com/jitendrab) and md-eval-22.pl (https://github.com/nryant/dscore/blob/master/scorelib/md-eval-22.pl) from nryant (https://github.com/nryant).
- Sequence Match Accuracy
- DiarizationLM (Python) - Implements Word Error Rate (WER), Word Diarization Error Rate (WDER), and concatenated minimum-permutation Word Error Rate (cpWER).
-
Clustering
- sklearn.cluster (Python) - scikit-learn clustering algorithms.
-
-
Datasets
-
Diarization datasets
- 2000 NIST Speaker Recognition Evaluation - Ground truth: Disk-6 (Switchboard) (https://github.com/google/speaker-id/tree/master/publications/LstmDiarization/evaluation/NIST_SRE2000/Disk6_ground_truth), Disk-8 (CALLHOME) (https://github.com/google/speaker-id/tree/master/publications/LstmDiarization/evaluation/NIST_SRE2000/Disk8_ground_truth). Languages: multiple. Price: $2400.00. Evaluation Plan (https://www.nist.gov/sites/default/files/documents/2017/09/26/spk-2000-plan-v1.0.htm_.pdf).
- 2003 NIST Rich Transcription Evaluation Data
- The ICSI Meeting Corpus
- The AMI Meeting Corpus
- Fisher English Training Speech Part 1 Speech
- Fisher English Training Part 2, Speech
- CALLHOME American English Speech - ...id/blob/master/publications/LstmDiarization/evaluation/CALLHOME_American_English/ch109_whitelist.txt
-
Speaker embedding training sets
- VCTK
- TIMIT
- LibriSpeech - Large-scale (1000 hours) corpus of read English speech.
- Multilingual LibriSpeech (MLS) - English, German, Dutch, Spanish, French, Italian, Portuguese, Polish.
- LibriVox
- The Spoken Wikipedia Corpora
- BookTubeSpeech - From YouTube videos in which people share their opinions on books. The dataset can be downloaded using BookTubeSpeech-download (https://github.com/wq2012/BookTubeSpeech-download).
- DeepMine
-
Augmentation noise sources
-
-