Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with speaker-diarization

A curated list of projects in awesome lists tagged with speaker-diarization.

https://github.com/modelscope/funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper

Last synced: 17 Dec 2024

https://github.com/pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

overlapped-speech-detection pretrained-models pytorch speaker-change-detection speaker-diarization speaker-embedding speaker-recognition speaker-verification speech-activity-detection speech-processing voice-activity-detection

Last synced: 17 Dec 2024

https://github.com/mahmoudashraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 17 Dec 2024

https://github.com/google/uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

clustering machine-learning speaker-diarization speaker-recognition supervised-clustering supervised-learning uis-rnn

Last synced: 18 Dec 2024

https://github.com/modelscope/3d-speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

3d-speaker campplus cnceleb eres2net language-identification modelscope rdino speaker-diarization speaker-verification voxceleb

Last synced: 20 Dec 2024

https://github.com/juanmc2005/diart

A Python package to build AI-powered real-time audio applications

deep-learning real-time speaker-diarization speaker-embedding streaming-audio transcription voice-activity-detection

Last synced: 20 Dec 2024

https://github.com/transcriptionstream/transcriptionstream

Turnkey self-hosted offline transcription and diarization service with LLM summary

automation diarization llm mistral-7b ollama speaker-diarization speech-recognition transcription whisper whisperx

Last synced: 20 Dec 2024

https://github.com/wq2012/spectralcluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

auto-tune clustering constrained-clustering machine-learning python speaker-diarization spectral-clustering unsupervised-clustering unsupervised-learning

Last synced: 19 Dec 2024

https://github.com/manojpamk/pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

speaker-diarization speaker-embeddings speaker-recognition speaker-verification

Last synced: 11 Nov 2024

https://github.com/yufan-aslp/AliMeeting

The project is associated with the recently launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) and provides participants with baseline systems for speech recognition and speaker diarization in conference scenarios.

aishell-4 alimeeting asr challenge m2met multi-speaker-asr speaker-diarization

Last synced: 28 Nov 2024

https://github.com/nezhar/speech-condenser

A tool for summarizing dialogues from videos or audio

asr speach-recognition speaker-diarization speaker-identification summarization

Last synced: 01 Nov 2024

https://github.com/vidyasagarmsc/watbot

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

android android-studio assistant chatbot cognitive-services conversation conversation-service dialog entity ibm-cloud intent speaker-diarization speaker-labels speaker-recognition speech speech-to-text text-to-speech watson watson-assistant-service workspace

Last synced: 17 Nov 2024

https://github.com/wq2012/simpleder

A lightweight library to compute Diarization Error Rate (DER).

diarization machine-learning metrics speaker-diarization speech-processing speech-recognition

Last synced: 07 Nov 2024

https://github.com/clement-pages/gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

annotation-processing annotation-tool audio gradio gradio-custom-component interspeech2024 pyannote speaker-diarization speech-processing

Last synced: 20 Dec 2024

https://github.com/picovoice/falcon

On-device speaker diarization powered by deep learning

deep-learning diarization on-device speaker-diarization speaker-recognition

Last synced: 19 Dec 2024

https://github.com/ubclaunchpad/minutes

Speaker diarization via transfer learning

library machine-learning python speaker-diarization speech transfer-learning ubc

Last synced: 19 Nov 2024

https://github.com/juanmc2005/rttm-viewer

Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way

plotly rttm speaker-diarization visualization

Last synced: 13 Oct 2024

https://github.com/wq2012/vb_diarization

VB Diarization with Eigenvoice and HMM Priors, refactored

machine-learning speaker-diarization speech-processing speech-recognition

Last synced: 14 Oct 2024

https://github.com/shashikg/x-vector-based-speaker-diarization

Course project for EE698R (2020-21 Sem 2). An x-vector-based speaker diarization system with an autoencoder-based clustering method. Also supports spectral and k-means clustering methods.

deep-clustering deep-learning python pytroch speaker-diarization

Last synced: 27 Oct 2024

https://github.com/juanmc2005/csda

Companion repository for the paper "Continual Self-supervised Domain Adaptation for End-to-end Speaker Diarization"

continual-learning domain-adaptation end-to-end pytorch pytorch-lightning self-supervised-learning speaker-diarization

Last synced: 07 Nov 2024

https://github.com/mmxgn/smooth-convex-kl-nmf

Repository holding various implementations of specific NMF methods for speaker diarization

nmf nonnegative-matrix-factorization smoothness sparsity speaker-diarization

Last synced: 05 Nov 2024

https://github.com/maxhollmann/lium-diarization-editor

A very simple viewer/editor for LIUM speaker diarizations.

lium speaker-diarization

Last synced: 02 Dec 2024

https://github.com/scionoftech/speaker_diarization

Speaker diarization using spectralcluster and deep learning

clustering speaker-diarization speech-recognition

Last synced: 07 Nov 2024

https://github.com/elmiraghorbani/gpt-speaker-diarization

Conversational speaker diarization using OpenAI language models (GPT-4) and OpenAI Whisper.

asr diarization gpt-4 openai speaker-diarization speech-recognition speech-to-text voice-activity-detection whisper youtube-dl

Last synced: 29 Nov 2024

https://github.com/aeronjl/transcribe

Python package for accurate audio transcription with speaker diarisation

audio-transcription gpt speaker-diarization whisper

Last synced: 09 Oct 2024

https://github.com/flo-bit/youtube-speaker-separation

Simple Python script that outputs separate audio files for each speaker in a YouTube video, using Whisper on Replicate

speaker-diarization speech-to-text text-to-speech voice-cloning whisper youtube

Last synced: 19 Dec 2024

https://github.com/katagaki/firesidesubtitles

Video transcription, speaker diarization, and face detection in Python.

audio dnn face-detection openai openai-whisper opencv python speaker-diarization transcription video

Last synced: 19 Nov 2024

https://github.com/mathusanm6/amaze-voice-lab

The goal of this research project is to control the movements of characters in a maze game using real-time voice commands such as "Up", "Down", "Left", or "Right" spoken aloud.

asr automatic-speech-recognition game java maze research speaker-diarization speaker-recognition voice-recognition

Last synced: 13 Dec 2024

https://github.com/nicknaskida/cog-whisper-diarization

Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote

diarization openai-whisper pyannote replicate speaker-diarization whisper whisper-faster whisperx

Last synced: 27 Sep 2024

https://github.com/aeronjl/transcribe-streamlit

Streamlit user interface for transcribing conversations with speaker diarisation

audio-transcription speaker-diarization streamlit

Last synced: 15 Nov 2024

https://github.com/mikeesto/gemini-transcribe

Transcribe audio and video files with speaker diarization and logically grouped timestamps

gemini-flash speaker-diarization speech-to-text sveltekit transcription

Last synced: 19 Dec 2024