Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with speaker-diarization
A curated list of projects in awesome lists tagged with speaker-diarization.
https://github.com/speechbrain/speechbrain
A PyTorch-based Speech Toolkit
asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition
Last synced: 16 Dec 2024
https://github.com/espnet/espnet
End-to-End Speech Processing Toolkit
chainer deep-learning end-to-end kaldi machine-translation pytorch singing-voice-synthesis speaker-diarization speech-enhancement speech-recognition speech-separation speech-synthesis speech-translation spoken-language-understanding text-to-speech voice-conversion
Last synced: 16 Dec 2024
https://github.com/modelscope/funasr
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper
Last synced: 17 Dec 2024
https://github.com/modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper
Last synced: 29 Oct 2024
https://github.com/pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
overlapped-speech-detection pretrained-models pytorch speaker-change-detection speaker-diarization speaker-embedding speaker-recognition speaker-verification speech-activity-detection speech-processing voice-activity-detection
Last synced: 17 Dec 2024
https://github.com/mahmoudashraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
asr speaker-diarization speech speech-recognition speech-to-text whisper
Last synced: 17 Dec 2024
https://github.com/MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
asr speaker-diarization speech speech-recognition speech-to-text whisper
Last synced: 31 Oct 2024
https://github.com/linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
asr attention-is-all-you-need attention-mechanism attention-model attention-network attention-seq2seq attention-visualization deep-learning machine-learning multilingual-models python python3 pytorch speaker-diarization speech speech-processing speech-recognition speech-to-text transformers whisper
Last synced: 17 Dec 2024
https://github.com/google/uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
clustering machine-learning speaker-diarization speaker-recognition supervised-clustering supervised-learning uis-rnn
Last synced: 18 Dec 2024
https://github.com/purfview/whisper-standalone-win
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
asr ctranslate2 diarization faster-whisper openai speaker-diarization speech-recognition speech-to-text subtitles transcriber uvr vocal-extractor whisper whisper-faster whisperx
Last synced: 20 Dec 2024
https://github.com/modelscope/3d-speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
3d-speaker campplus cnceleb eres2net language-identification modelscope rdino speaker-diarization speaker-verification voxceleb
Last synced: 20 Dec 2024
https://github.com/juanmc2005/diart
A Python package to build AI-powered real-time audio applications
deep-learning real-time speaker-diarization speaker-embedding streaming-audio transcription voice-activity-detection
Last synced: 20 Dec 2024
https://github.com/transcriptionstream/transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
automation diarization llm mistral-7b ollama speaker-diarization speech-recognition transcription whisper whisperx
Last synced: 20 Dec 2024
https://github.com/yinruiqing/pyannote-whisper
asr chatgpt meeting-summarization pyannote speaker-diarization whisper
Last synced: 20 Dec 2024
https://github.com/wq2012/spectralcluster
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
auto-tune clustering constrained-clustering machine-learning python speaker-diarization spectral-clustering unsupervised-clustering unsupervised-learning
Last synced: 19 Dec 2024
https://github.com/wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
asv campplus cnceleb dino ecapa-tdnn eres2net nist-sre plda production-ready pytorch repvgg resnet self-supervised-learning speaker-diarization speaker-recognition speaker-verification ssl tdnn voxceleb xvector
Last synced: 16 Dec 2024
https://github.com/revdotcom/reverb
Open source inference code for Rev's model
asr asr-model canary deeplearning diarization docker huggingface neural-network open-source opensource pyannote rev revai speaker-diarization speech-recognition speech-to-text speechrecognition wenet whisper
Last synced: 15 Dec 2024
https://github.com/manojpamk/pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
speaker-diarization speaker-embeddings speaker-recognition speaker-verification
Last synced: 11 Nov 2024
https://github.com/IBM-Cloud/chatbot-watson-android
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
android android-studio chatbot conversation conversation-service dialog entity ibm-cloud ibm-cloud-solutions ibm-watson ibm-watson-services intent java speaker-diarization speaker-recognition speech watson watson-services workspace
Last synced: 18 Nov 2024
https://github.com/yufan-aslp/AliMeeting
The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) and provides participants with baseline systems for speech recognition and speaker diarization in conference scenarios.
aishell-4 alimeeting asr challenge m2met multi-speaker-asr speaker-diarization
Last synced: 28 Nov 2024
https://github.com/nezhar/speech-condenser
A tool for summarizing dialogues from videos or audio
asr speach-recognition speaker-diarization speaker-identification summarization
Last synced: 01 Nov 2024
https://github.com/vidyasagarmsc/watbot
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
android android-studio assistant chatbot cognitive-services conversation conversation-service dialog entity ibm-cloud intent speaker-diarization speaker-labels speaker-recognition speech speech-to-text text-to-speech watson watson-assistant-service workspace
Last synced: 17 Nov 2024
https://github.com/wq2012/simpleder
A lightweight library to compute Diarization Error Rate (DER).
diarization machine-learning metrics speaker-diarization speech-processing speech-recognition
Last synced: 07 Nov 2024
https://github.com/clement-pages/gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
annotation-processing annotation-tool audio gradio gradio-custom-component interspeech2024 pyannote speaker-diarization speech-processing
Last synced: 20 Dec 2024
https://github.com/picovoice/falcon
On-device speaker diarization powered by deep learning
deep-learning diarization on-device speaker-diarization speaker-recognition
Last synced: 19 Dec 2024
https://github.com/ubclaunchpad/minutes
Speaker diarization via transfer learning
library machine-learning python speaker-diarization speech transfer-learning ubc
Last synced: 19 Nov 2024
https://github.com/linto-ai/linto-diarization
Speaker diarization service
asr linto speaker-diarization speaker-identification
Last synced: 07 Nov 2024
https://github.com/juanmc2005/rttm-viewer
Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way
plotly rttm speaker-diarization visualization
Last synced: 13 Oct 2024
https://github.com/wq2012/vb_diarization
VB Diarization with Eigenvoice and HMM Priors, refactored
machine-learning speaker-diarization speech-processing speech-recognition
Last synced: 14 Oct 2024
https://github.com/shashikg/x-vector-based-speaker-diarization
Course project for EE698R (2020-21 Sem 2). An X-Vector based speaker diarization system with an AutoEncoder-based clustering method. Also supports spectral and KMeans clustering methods.
deep-clustering deep-learning python pytroch speaker-diarization
Last synced: 27 Oct 2024
https://github.com/juanmc2005/csda
Companion repository for the paper "Continual Self-supervised Domain Adaptation for End-to-end Speaker Diarization"
continual-learning domain-adaptation end-to-end pytorch pytorch-lightning self-supervised-learning speaker-diarization
Last synced: 07 Nov 2024
https://github.com/gorkemkaramolla/whisper-run
Faster Whisper with Speaker Diarization
distil-whisper faster-whisper openai pyannote speaker-diarization speech-recognition transcription whisper whisper-large
Last synced: 09 Oct 2024
https://github.com/mmxgn/smooth-convex-kl-nmf
Repository holding various implementations of specific NMF methods for speaker diarization
nmf nonnegative-matrix-factorization smoothness sparsity speaker-diarization
Last synced: 05 Nov 2024
https://github.com/maxhollmann/lium-diarization-editor
A very simple viewer/editor for LIUM speaker diarizations.
Last synced: 02 Dec 2024
https://github.com/scionoftech/speaker_diarization
Speaker diarization using spectralcluster and deep learning
clustering speaker-diarization speech-recognition
Last synced: 07 Nov 2024
https://github.com/elmiraghorbani/gpt-speaker-diarization
Conversational speaker diarization using OpenAI language models (GPT-4) and OpenAI Whisper.
asr diarization gpt-4 openai speaker-diarization speech-recognition speech-to-text voice-activity-detection whisper youtube-dl
Last synced: 29 Nov 2024
https://github.com/nicknaskida/insanely-fast-whisper
Incredibly fast Whisper-large-v3 with speaker diarization
diarization speaker-diarization transfromers whisper whisper-ai whisper-faster whisper-large
Last synced: 26 Sep 2024
https://github.com/aeronjl/transcribe
Python package for accurate audio transcription with speaker diarisation
audio-transcription gpt speaker-diarization whisper
Last synced: 09 Oct 2024
https://github.com/flo-bit/youtube-speaker-separation
Simple Python script that outputs separate audio files for each speaker in a YouTube video, using Whisper on Replicate
speaker-diarization speech-to-text text-to-speech voice-cloning whisper youtube
Last synced: 19 Dec 2024
https://github.com/katagaki/firesidesubtitles
Video transcription, speaker diarization, and face detection in Python.
audio dnn face-detection openai openai-whisper opencv python speaker-diarization transcription video
Last synced: 19 Nov 2024
https://github.com/mathusanm6/amaze-voice-lab
The goal of this research project is to control the movements of characters in a maze game using real-time spoken commands such as "Up", "Down", "Left", or "Right".
asr automatic-speech-recognition game java maze research speaker-diarization speaker-recognition voice-recognition
Last synced: 13 Dec 2024
https://github.com/nicknaskida/cog-whisper-diarization
Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote
diarization openai-whisper pyannote replicate speaker-diarization whisper whisper-faster whisperx
Last synced: 27 Sep 2024
https://github.com/mtwn105/audio-intel
AudioIntel - Audio/Video Intelligence, Transcripts, Summary, and much more
ai assemblyai audio audio-processing diarization lemur sonet speaker-diarization speaker-recognition speech-recognition speech-to-text transcript
Last synced: 17 Dec 2024
https://github.com/aeronjl/transcribe-streamlit
Streamlit user interface for transcribing conversations with speaker diarisation
audio-transcription speaker-diarization streamlit
Last synced: 15 Nov 2024
https://github.com/yinruiqing/annotation_generator
Annotation generator for diarization tasks
e2e-speaker-diarization speaker-diarization target-speaker-vad
Last synced: 19 Dec 2024
https://github.com/mikeesto/gemini-transcribe
Transcribe audio and video files with speaker diarization and logically grouped timestamps
gemini-flash speaker-diarization speech-to-text sveltekit transcription
Last synced: 19 Dec 2024