Projects in Awesome Lists tagged with speaker-recognition

https://github.com/nvidia/nemo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

asr deeplearning generative-ai large-language-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts

Last synced: 12 May 2025

https://github.com/speechbrain/speechbrain

A PyTorch-based Speech Toolkit

asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition

Last synced: 13 May 2025

https://github.com/pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

overlapped-speech-detection pretrained-models pytorch speaker-change-detection speaker-diarization speaker-embedding speaker-recognition speaker-verification speech-activity-detection speech-processing voice-activity-detection

Last synced: 13 May 2025

https://github.com/google/uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

clustering machine-learning speaker-diarization speaker-recognition supervised-clustering supervised-learning uis-rnn

Last synced: 14 May 2025

https://github.com/mravanelli/sincnet

SincNet is a neural architecture for efficiently processing raw audio samples.

artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering neural-networks python pytorch signal-processing speaker-identification speaker-recognition speaker-verification speech-processing speech-recognition timit waveform

Last synced: 16 May 2025

https://github.com/clovaai/voxceleb_trainer

In defence of metric learning for speaker recognition

metric-learning speaker-recognition speaker-verification voxceleb

Last synced: 16 May 2025

https://github.com/yeyupiaoling/voiceprintrecognition-pytorch

This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.

arcface ecapa-tdnn pytorch speaker-recognition voice-recognition

Last synced: 15 May 2025

https://github.com/athena-team/athena

An open-source implementation of a sequence-to-sequence based speech processing engine

asr ctc deployment sequence-to-sequence speaker-recognition speech-recognition speech-synthesis tensorflow transformer tts unsupervised-learning wfst

Last synced: 28 Nov 2024

https://github.com/wenet-e2e/wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

asv campplus cnceleb dino ecapa-tdnn eres2net nist-sre plda production-ready pytorch redimnet repvgg resnet speaker-diarization speaker-recognition speaker-verification ssl voxceleb wavlm xvector

Last synced: 16 May 2025

https://github.com/astorfi/3d-convolutional-speaker-recognition

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

3d convolutional-neural-networks deep-learning speaker-recognition

Last synced: 04 Apr 2025

https://github.com/cvqluu/Angular-Penalty-Softmax-Losses-Pytorch

Angular penalty loss functions in PyTorch (ArcFace, SphereFace, Additive Margin, CosFace)

am-softmax arcface embedding face-recognition face-verification fashion-mnist fmnist-dataset loss-function loss-functions metric-learning normface pytorch speaker-recognition sphereface

Last synced: 03 Apr 2025

https://github.com/speechbrain/speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

beamforming deep-learning deeplearning librispeech neural-network neural-networks speaker-identification speaker-recognition speaker-verification speech speech-analysis speech-api speech-emotion-recognition speech-processing speech-recognition speech-recognizer speech-separation speech-to-text speechrecognition timit

Last synced: 04 May 2025

https://github.com/yeyupiaoling/voiceprintrecognition-tensorflow

Voiceprint recognition implemented with TensorFlow

arcface speaker-recognition tensorflow voice-recognition

Last synced: 06 Apr 2025

https://github.com/manojpamk/pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

speaker-diarization speaker-embeddings speaker-recognition speaker-verification

Last synced: 26 Apr 2025

https://github.com/samirpaulb/real-time-voice-translator

A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.

deep-translator final-year-project googletranslator gtts gui linguasync machine-learning ml playsound python real-time-transcription speaker-recognition speech-to-speech speech-to-text speechrecognition text-to-speech tkinter translates-audio translation voice-translator

Last synced: 12 May 2025

https://github.com/yeyupiaoling/voiceprintrecognition-paddlepaddle

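This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports multiple data preprocessing methods such as MelSpectrogram, Spectrogram, MFCC, and Fbank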

arcface ecapa-tdnn paddlepaddle speaker-recognition voice-recognition

Last synced: 12 Apr 2025

https://github.com/vita-group/autospeech

[InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei Zha, Zhangyang Wang

automl autospeech neural-architecture-search pytorch speaker-recognition

Last synced: 13 Apr 2025

https://github.com/atul-anand-jha/speaker-identification-python

Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library

python-2 speaker-identification speaker-recognition

Last synced: 10 May 2025

https://github.com/IBM-Cloud/chatbot-watson-android

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

android android-studio chatbot conversation conversation-service dialog entity ibm-cloud ibm-cloud-solutions ibm-watson ibm-watson-services intent java speaker-diarization speaker-recognition speech watson watson-services workspace

Last synced: 13 May 2025

https://github.com/oscarknagg/voicemap

Identifying people from small audio fragments

convolutional-neural-networks machine-learning speaker-identification speaker-recognition

Last synced: 02 May 2025

https://github.com/lihanghang/casr-demo

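A Flask-based web demo system for Chinese automatic speech recognition, covering speech recognition, speech synthesis, and speaker recognition (voiceprint recognition).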

baidu-aip casr-demo ctc flask-application gmm pyaudio speaker-recognition speech-to-text

Last synced: 19 Dec 2024

https://github.com/yeyupiaoling/voiceprintrecognition-keras

A voiceprint recognition model implemented with Keras

deep-learning kersa speaker-recognition tensorflow voice-recognition

Last synced: 22 Mar 2025

https://github.com/jefflai108/pytorch-kaldi-neural-speaker-embeddings

Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.

kaldi learnable-dictionary-encoding pytorch speaker-identification speaker-recognition speaker-verification speech-processing

Last synced: 27 Nov 2024

https://github.com/Anwarvic/Speaker-Recognition

This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1

gmm gmm-ubm i-vector identity-vector identity-verification sidekit speaker-identification speaker-recognition speaker-verification ubm

Last synced: 27 Nov 2024

https://github.com/georgygospodinov/speech_course

Deep Learning for Speech

asr deep-learning keyword-spotting self-supervised-learning speaker-recognition speech-recognition tts

Last synced: 07 Apr 2025

https://github.com/seongmin-kye/meta-SR

PyTorch implementation of Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs (Interspeech, 2020)

meta-learning short-utterances speaker-recognition speaker-verification

Last synced: 11 May 2025

https://github.com/grausof/keras-sincnet

Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)

artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering keras machine-learning neural-network speaker-recognition speaker-verification speech-processing speech-recognition tensorflow timit waveform

Last synced: 13 Apr 2025

https://github.com/vidyasagarmsc/watbot

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

android android-studio assistant chatbot cognitive-services conversation conversation-service dialog entity ibm-cloud intent speaker-diarization speaker-labels speaker-recognition speech speech-to-text text-to-speech watson watson-assistant-service workspace

Last synced: 22 Apr 2025

https://github.com/cyrta/voxceleb

Mirror of the VoxCeleb dataset, a large-scale speaker identification dataset

corpus dataset speaker speaker-identification speaker-recognition speaker-verification speech

Last synced: 11 May 2025

https://github.com/zycv/openspeaker

OpenSpeaker is a completely independent and open source speaker recognition project. It provides the entire process of speaker recognition including multi-platform deployment and model optimization.

speaker-recognition speaker-verification voiceprint-recognition

Last synced: 08 May 2025

https://github.com/wadaboa/titanet

Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO

d-vectors ml4cv nvidia speaker-embeddings speaker-identification speaker-recognition speaker-verification titanet unibo

Last synced: 21 Mar 2025

https://github.com/andi611/mockingjay-speech-representation

Official Implementation of Mockingjay in PyTorch

apc feature-extraction mockingjay phone-classification phoneme-prediction pytorch pytorch-implementation representation-learning sentiment-classification speaker-classification speaker-recognition speech speech-representation

Last synced: 13 Apr 2025

https://github.com/superkogito/voice-based-speaker-identification

Speaker identification using voice MFCCs and GMM

gaussian-mixture-models gmm machine-learning mel-frequencies mel-frequency-cepstral-coefficients mfcc scikit-learn scikit-learn-python signal speaker-identification speaker-recognition speech vocal voice

Last synced: 04 Apr 2025

https://github.com/wq2012/speakerrecognitionfromscratch

Final project for the Speaker Recognition course on Udemy, 机器之心, 深蓝学院 and 语音之家

attention-mechanism deep-learning librispeech lstm neural-network pytorch speaker-recognition speaker-recognition-systems transformer transformer-models

Last synced: 12 Apr 2025

https://github.com/maxhollmann/voxceleb-luigi

Luigi pipeline to download VoxCeleb(2) audio from YouTube and extract speaker segments

luigi speaker-embedding speaker-recognition speaker-verification voxceleb

Last synced: 02 Dec 2024

https://github.com/picovoice/falcon

On-device speaker diarization powered by deep learning

deep-learning diarization on-device speaker-diarization speaker-recognition

Last synced: 31 Mar 2025

https://github.com/zycv/speaker-recognition-based-on-deep-learning-an-overview

This repo lists the reference papers of "Speaker Recognition Based on Deep Learning: An Overview"

deep-learning speaker-recognition speaker-verification

Last synced: 23 Mar 2025

https://github.com/ttop32/wav2vec2-live-japanese-translator

Real-time Japanese speech recognition translator using wav2vec2

asr audio automatic-speech-recognition fine-tuning huggingface japanese live pyaudio pyqt5 pytorch real-time speaker-recognition speech-to-text spoken-language-understanding stt translation translator voice voice-recognition wav2vec2

Last synced: 29 Apr 2025

https://github.com/picovoice/eagle

On-device speaker recognition engine powered by deep learning

speaker-embedding speaker-identification speaker-recognition

Last synced: 09 Apr 2025

https://github.com/mycrazycracy/tf-kaldi-speaker

Neural speaker recognition/verification system based on Kaldi and TensorFlow

kaldi kaldi-asr machine-learning neural-network speaker-identification speaker-recognition speaker-verification speech-processing tensorflow

Last synced: 03 May 2025

https://github.com/alifarazz/september

An offline text-independent speaker recognition system

ansi-c cross-platform gtk3 openal speaker-recognition sptk

Last synced: 06 Apr 2025

https://github.com/luan78zaoha/kaldi-timit-sre-ivector

Develops a speaker recognition model based on i-vectors using the TIMIT database

chinese i-vector kaldi speaker-recognition speaker-verification sre

Last synced: 11 Mar 2025

https://github.com/zabir-nabil/audioperm

A Python library for generating different permutations of audible segments from audio files.

audio-augmentation audio-classification audio-processing augmentation speaker-recognition speech-augmentation

Last synced: 02 Dec 2024

https://github.com/clivern/wit-java

🗿Java Library For Wit.ai

bots chatbot chatbots machine-learning natural-language-processing nlp speaker-recognition wit-ai

Last synced: 19 Apr 2025

https://github.com/ciscodevnet/vo-id

audio-processing pytorch speaker-diarization speaker-identification speaker-recognition speaker-verification

Last synced: 14 Apr 2025

https://github.com/prmelehan/speaker-recognition

Recognizing a speaker using Deep Learning

cnn deep-learning keras python scipy speaker-recognition

Last synced: 30 Apr 2025

https://github.com/fabioravila/recognito-csharp

Text Independent Speaker Recognition in CSharp, based on recognito in Java

csharp offline recognito-csharp speaker-identification speaker-recognition

Last synced: 04 May 2025

https://github.com/musa11971/manhuw

Recognizing and identifying Quran reciters from audio recordings.

librosa machine-learning python quran speaker-identification speaker-recognition

Last synced: 22 Apr 2025

https://github.com/dinhanhx/automatic_speaker_recognition

A repo for the USTH Digital Signal Processing 2020 Group 3 project. It's quite obvious in the title.

datasets digital-signal-processing dsp gmm human machine-learning mfcc-features python python-3 python3 signal-processing sklearn speaker-recognition voice voice-recognition wav-files

Last synced: 22 Feb 2025

https://github.com/nicolay-r/book-persona-retriever

Workflow for literature character personality profiling 📚 which relies solely on book content 📖

books dialogs dialogue-systems human-level-attributes parlai personality-traits pipeline project-gutenberg ranking-system response-generation sampling speaker-recognition

Last synced: 19 Dec 2024

https://github.com/smorodov/kaldi_vosk_win_cmake

CMake-based Kaldi + Vosk + microphone speech recognition example

kaldi speaker-recognition speech-recognition speech-to-text voice-recognition vosk

Last synced: 10 Apr 2025

https://github.com/jackaduma/speaker_recognition_models.pytorch

PyTorch implementations of speaker recognition / speaker verification models

deep-learning deep-speaker pytorch recognition speaker-identification speaker-recognition speaker-verification speech-recognition voice voice-assistant voice-recognition voiceprint voiceprint-recognition

Last synced: 26 Feb 2025

https://github.com/zabir-nabil/tf2-speaker-recognition

Speaker recognition in TensorFlow 2

deep-learning speaker-identification speaker-recognition speaker-verification speech-processing tensorflow tensorflow2

Last synced: 29 Jan 2025

https://github.com/spaceshaman/spaceassistant

Voice assistant built with Vue.js. Easy to hack and extend.

ai ai-assistant artificial-intelligence assistant chatbot chatgpt gpt gpt-3 gpt-4 gpt3 gpt4 openai personal-assistant speaker-recognition speech-recognition virtual-assistant voice-assistant vue

Last synced: 05 Apr 2025

https://github.com/zabir-nabil/speaker-verification-gmm

Speaker verification using Gaussian Mixture Model (GMM)

gaussian-mixture-models speaker-identification speaker-identity speaker-recognition speaker-recognition-gmm speaker-verification speaker-verification-gmm timit-dataset

Last synced: 24 Mar 2025

https://github.com/limdongjin/ignkafasr

Real-Time In-memory Speaker Verification and Speech Recognition Project using apache ignite, apache kafka, speechbrain, whisper, stomp, spring webflux, kubernetes(k8s)

apache-ignite apache-kafka asr audio-recorder google-kubernetes-engine k8s kubernetes speaker-recognition speaker-verification speech-recognition speechbrain springframework stomp stompwebsocket webflux whisper

Last synced: 11 Mar 2025

https://github.com/paulcampbell/zimzimmer

(Who am I?)

machine-learning realtime-predict speaker-recognition tensorflow

Last synced: 13 Mar 2025

https://github.com/jpzinn654/speaker-diarization-portuguese

This project implements speaker diarization for Portuguese audio using WhisperX for transcription and pyannote.audio's Speaker-Diarization 3.1 for speaker separation. It includes a Flask UI for easy file upload, transcription, and speaker identification.

flask gender-detection portuguese-language speaker-diarization speaker-recognition speech-recognition transcription whisper

Last synced: 22 Mar 2025

https://github.com/antonako1/audiowhisper

Microphone audio playback

batch csharp headphones microphone microphone-audio-capture nsis powershell speaker speaker-recognition windows

Last synced: 06 Apr 2025

https://github.com/manohara-ai/speaker_recognition

Recognizing speakers using a fully connected CNN with the help of spectrograms

cnn machine-learning machine-learning-algorithms pytorch speaker-recognition spectrogram

Last synced: 31 Mar 2025

https://github.com/micyg/matrixspy

Matrix display that recognizes a speaker by voice

led-display led-matrix raspberry-pi respeaker-6mics-array speaker-identification speaker-recognition

Last synced: 18 Feb 2025

https://github.com/anurag1101/robot_speaker

A simple, interactive Python-based text-to-speech application that converts user input into spoken words using pyttsx3. Type any text, hear it spoken aloud, and enjoy a friendly farewell when you’re done!

libraries modules python python3 speaker-recognition speech

Last synced: 08 Apr 2025

https://github.com/slinusc/speaker_recognition_evaluation

Investigating Layer-Specific Performance in Speaker Recognition with XLS-R Architecture

hidden-states speaker-recognition xls-r

Last synced: 24 Feb 2025

https://github.com/mtwn105/audio-intel

AudioIntel - Audio/Video Intelligence, Transcripts, Summary, and much more

ai assemblyai audio audio-processing diarization lemur sonet speaker-diarization speaker-recognition speech-recognition speech-to-text transcript

Last synced: 04 Apr 2025

https://github.com/mahshid1378/nemo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

asr deeplearning generative-ai large-langage-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts

Last synced: 08 Apr 2025

https://github.com/jakubvojvoda/speaker-recognition

Speaker Recognition based on GMM

audio-processing gaussian-mixture-model gmm matlab signal-processing speaker-recognition

Last synced: 14 May 2025

https://github.com/mathusanm6/amaze-voice-lab

The goal of this research project was to control the movements of characters in a maze game using real-time voice commands, such as saying "Up", "Down", "Left", or "Right" out loud.

asr automatic-speech-recognition game java maze research speaker-diarization speaker-recognition voice-recognition

Last synced: 31 Mar 2025