Projects in Awesome Lists tagged with speaker-diarization

https://github.com/speechbrain/speechbrain

A PyTorch-based Speech Toolkit

asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition

Last synced: 16 Dec 2024

https://github.com/espnet/espnet

End-to-End Speech Processing Toolkit

chainer deep-learning end-to-end kaldi machine-translation pytorch singing-voice-synthesis speaker-diarization speech-enhancement speech-recognition speech-separation speech-synthesis speech-translation spoken-language-understanding text-to-speech voice-conversion

Last synced: 16 Dec 2024

https://github.com/modelscope/funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.

audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper

Last synced: 17 Dec 2024

https://github.com/pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

overlapped-speech-detection pretrained-models pytorch speaker-change-detection speaker-diarization speaker-embedding speaker-recognition speaker-verification speech-activity-detection speech-processing voice-activity-detection

Last synced: 17 Dec 2024

https://github.com/mahmoudashraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 17 Dec 2024

https://github.com/linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

asr attention-is-all-you-need attention-mechanism attention-model attention-network attention-seq2seq attention-visualization deep-learning machine-learning multilingual-models python python3 pytorch speaker-diarization speech speech-processing speech-recognition speech-to-text transformers whisper

Last synced: 17 Dec 2024

https://github.com/google/uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

clustering machine-learning speaker-diarization speaker-recognition supervised-clustering supervised-learning uis-rnn

Last synced: 18 Dec 2024

https://github.com/purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

asr ctranslate2 diarization faster-whisper openai speaker-diarization speech-recognition speech-to-text subtitles transcriber uvr vocal-extractor whisper whisper-faster whisperx

Last synced: 20 Dec 2024

https://github.com/modelscope/3d-speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

3d-speaker campplus cnceleb eres2net language-identification modelscope rdino speaker-diarization speaker-verification voxceleb

Last synced: 20 Dec 2024

https://github.com/juanmc2005/diart

A Python package to build AI-powered real-time audio applications

deep-learning real-time speaker-diarization speaker-embedding streaming-audio transcription voice-activity-detection

Last synced: 20 Dec 2024

https://github.com/transcriptionstream/transcriptionstream

Turnkey self-hosted offline transcription and diarization service with LLM summary

automation diarization llm mistral-7b ollama speaker-diarization speech-recognition transcription whisper whisperx

Last synced: 20 Dec 2024

https://github.com/yinruiqing/pyannote-whisper

asr chatgpt meeting-summarization pyannote speaker-diarization whisper

Last synced: 20 Dec 2024

https://github.com/wq2012/spectralcluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

auto-tune clustering constrained-clustering machine-learning python speaker-diarization spectral-clustering unsupervised-clustering unsupervised-learning

Last synced: 19 Dec 2024

https://github.com/wenet-e2e/wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

asv campplus cnceleb dino ecapa-tdnn eres2net nist-sre plda production-ready pytorch repvgg resnet self-supervised-learning speaker-diarization speaker-recognition speaker-verification ssl tdnn voxceleb xvector

Last synced: 16 Dec 2024

https://github.com/revdotcom/reverb

Open source inference code for Rev's model

asr asr-model canary deeplearning diarization docker huggingface neural-network open-source opensource pyannote rev revai speaker-diarization speech-recognition speech-to-text speechrecognition wenet whisper

Last synced: 15 Dec 2024

https://github.com/manojpamk/pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

speaker-diarization speaker-embeddings speaker-recognition speaker-verification

Last synced: 11 Nov 2024

https://github.com/IBM-Cloud/chatbot-watson-android

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

android android-studio chatbot conversation conversation-service dialog entity ibm-cloud ibm-cloud-solutions ibm-watson ibm-watson-services intent java speaker-diarization speaker-recognition speech watson watson-services workspace

Last synced: 18 Nov 2024

https://github.com/yufan-aslp/AliMeeting

This project is associated with the ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) and provides participants with baseline systems for speech recognition and speaker diarization in conference scenarios.

aishell-4 alimeeting asr challenge m2met multi-speaker-asr speaker-diarization

Last synced: 28 Nov 2024

https://github.com/nezhar/speech-condenser

A tool for summarizing dialogues from videos or audio

asr speach-recognition speaker-diarization speaker-identification summarization

Last synced: 01 Nov 2024

https://github.com/vidyasagarmsc/watbot

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

android android-studio assistant chatbot cognitive-services conversation conversation-service dialog entity ibm-cloud intent speaker-diarization speaker-labels speaker-recognition speech speech-to-text text-to-speech watson watson-assistant-service workspace

Last synced: 17 Nov 2024

https://github.com/wq2012/simpleder

A lightweight library to compute Diarization Error Rate (DER).

diarization machine-learning metrics speaker-diarization speech-processing speech-recognition

Last synced: 07 Nov 2024

https://github.com/clement-pages/gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

annotation-processing annotation-tool audio gradio gradio-custom-component interspeech2024 pyannote speaker-diarization speech-processing

Last synced: 20 Dec 2024

https://github.com/picovoice/falcon

On-device speaker diarization powered by deep learning

deep-learning diarization on-device speaker-diarization speaker-recognition

Last synced: 19 Dec 2024

https://github.com/ubclaunchpad/minutes

🔭 Speaker diarization via transfer learning

library machine-learning python speaker-diarization speech transfer-learning ubc

Last synced: 19 Nov 2024

https://github.com/linto-ai/linto-diarization

Speaker diarization service

asr linto speaker-diarization speaker-identification

Last synced: 07 Nov 2024

https://github.com/juanmc2005/rttm-viewer

Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way

plotly rttm speaker-diarization visualization

Last synced: 13 Oct 2024

https://github.com/wq2012/vb_diarization

VB Diarization with Eigenvoice and HMM Priors, refactored

machine-learning speaker-diarization speech-processing speech-recognition

Last synced: 14 Oct 2024

https://github.com/shashikg/x-vector-based-speaker-diarization

Course project for EE698R (2020-21 Sem 2). An x-vector-based speaker diarization system with an autoencoder-based clustering method. Also supports spectral and KMeans clustering methods.

deep-clustering deep-learning python pytroch speaker-diarization

Last synced: 27 Oct 2024

https://github.com/ciscodevnet/vo-id

audio-processing pytorch speaker-diarization speaker-identification speaker-recognition speaker-verification

Last synced: 16 Nov 2024

https://github.com/juanmc2005/csda

Companion repository for the paper "Continual Self-supervised Domain Adaptation for End-to-end Speaker Diarization"

continual-learning domain-adaptation end-to-end pytorch pytorch-lightning self-supervised-learning speaker-diarization

Last synced: 07 Nov 2024

https://github.com/gorkemkaramolla/whisper-run

Faster Whisper with Speaker Diarization

distil-whisper faster-whisper openai pyannote speaker-diarization speech-recognition transcription whisper whisper-large

Last synced: 09 Oct 2024

https://github.com/mmxgn/smooth-convex-kl-nmf

Repository holding various implementations of specific NMF methods for speaker diarization

nmf nonnegative-matrix-factorization smoothness sparsity speaker-diarization

Last synced: 05 Nov 2024

https://github.com/maxhollmann/lium-diarization-editor

A very simple viewer/editor for LIUM speaker diarizations.

lium speaker-diarization

Last synced: 02 Dec 2024

https://github.com/scionoftech/speaker_diarization

Speaker diarization using spectralcluster and deep learning

clustering speaker-diarization speech-recognition

Last synced: 07 Nov 2024

https://github.com/elmiraghorbani/gpt-speaker-diarization

Conversational speaker diarization using OpenAI language models (GPT-4) and OpenAI Whisper.

asr diarization gpt-4 openai speaker-diarization speech-recognition speech-to-text voice-activity-detection whisper youtube-dl

Last synced: 29 Nov 2024

https://github.com/nicknaskida/insanely-fast-whisper

Incredibly fast Whisper-large-v3 with speaker diarization

diarization speaker-diarization transfromers whisper whisper-ai whisper-faster whisper-large

Last synced: 26 Sep 2024

https://github.com/aeronjl/transcribe

Python package for accurate audio transcription with speaker diarisation

audio-transcription gpt speaker-diarization whisper

Last synced: 09 Oct 2024

https://github.com/flo-bit/youtube-speaker-separation

Simple Python script that outputs separate audio files for each speaker in a YouTube video, using Whisper on Replicate

speaker-diarization speech-to-text text-to-speech voice-cloning whisper youtube

Last synced: 19 Dec 2024

https://github.com/katagaki/firesidesubtitles

Video transcription, speaker diarization, and face detection in Python.

audio dnn face-detection openai openai-whisper opencv python speaker-diarization transcription video

Last synced: 19 Nov 2024

https://github.com/mathusanm6/amaze-voice-lab

The goal of this research project is to control the movements of characters in a maze game using real-time voice commands, such as saying "Up", "Down", "Left", or "Right" out loud.

asr automatic-speech-recognition game java maze research speaker-diarization speaker-recognition voice-recognition