Projects in Awesome Lists tagged with voice-recognition
A curated list of projects in awesome lists tagged with voice-recognition.
https://github.com/paddlepaddle/paddlespeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper
Last synced: 12 May 2025
https://github.com/speechbrain/speechbrain
A PyTorch-based Speech Toolkit
asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition
Last synced: 13 May 2025
https://github.com/alphacep/vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
android asr deep-learning deep-neural-networks deepspeech google-speech-to-text ios kaldi offline privacy python raspberry-pi speaker-identification speaker-verification speech-recognition speech-to-text speech-to-text-android stt voice-recognition vosk
Last synced: 12 May 2025
https://github.com/snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
onnx onnx-runtime onnxruntime pytorch speech speech-processing vad voice-activity-detection voice-commands voice-control voice-detection voice-recognition
Last synced: 13 May 2025
https://github.com/theajack/cnchar
🇨🇳 A full-featured Chinese character utility library (pinyin, strokes, radicals, idioms, speech, visualization, and more)
chinese-characters draw pinyin speak spell-stroke voice-recognition
Last synced: 23 Apr 2025
https://github.com/coqui-ai/stt
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition
Last synced: 14 May 2025
https://github.com/collabora/whisperlive
A nearly-live implementation of OpenAI's Whisper.
dictation obs openai tensorrt tensorrt-llm text-to-speech translation voice-recognition whisper whisper-tensorrt
Last synced: 09 Apr 2025
https://github.com/react-native-voice/voice
🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)
android ios react-native speech-recognition voice-recognition
Last synced: 14 May 2025
https://github.com/react-native-community/voice
🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)
android ios react-native speech-recognition voice-recognition
Last synced: 09 Mar 2025
https://github.com/wenkesj/react-native-voice
🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)
android ios react-native speech-recognition voice-recognition
Last synced: 21 Feb 2025
https://github.com/jim-schwoebel/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
audio-dataset audio-datasets data dataset datasets noise voice voice-activity-detection voice-assistant voice-chat voice-commands voice-computing voice-control voice-conversion voice-dataset voice-datasets voice-recognition voice-synthesis
Last synced: 26 Mar 2025
https://github.com/coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
speech-emotion-recognition speech-processing speech-recognition speech-separation speech-synthesis speech-to-text stt text-to-speech tts voice-activity-detection voice-cloning voice-recognition
Last synced: 26 Mar 2025
https://github.com/yeyupiaoling/voiceprintrecognition-pytorch
This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
arcface ecapa-tdnn pytorch speaker-recognition voice-recognition
Last synced: 15 May 2025
https://github.com/ggeop/python-ai-assistant
Python AI assistant 🧠
ai google-speech-recognition google-speech-to-text linux-assistant mongodb nlp nlp-machine-learning nltk pymongo python python35 sklearn voice-activity-detection voice-assistant voice-chat voice-commands voice-control voice-recognition voice-recognition-experiment wolfram-language
Last synced: 14 May 2025
https://github.com/mycroftai/mycroft-precise
A lightweight, simple-to-use, RNN wake word listener
embedded-systems hotword-detection keyword-spotting raspberry-pi speech-recognition voice-control voice-recognition wake-word-detection
Last synced: 15 May 2025
https://github.com/wstxda/plugin-voicegpt
Use ChatGPT voice chat as your device digital assistant
ai android chatgpt launcher material-design openai plugin shortcuts tools voice voice-assistant voice-recognition
Last synced: 15 May 2025
https://github.com/alexylem/jarvis
Jarvis.sh is a simple configurable multi-lang assistant.
assistant home-automation jarvis jasper personal-assistant raspberry-pi sarah voice-commands voice-control voice-recognition
Last synced: 24 Nov 2024
https://github.com/picovoice/rhino
On-device Speech-to-Intent engine powered by deep learning
entity-resolution intent-inference natural-language-understanding nlu on-device slot-filling slu speech-recognition spoken-language-understanding voice-assistant voice-command voice-command-control voice-commands voice-control voice-recognition voice-ui voice-user-interface vui
Last synced: 14 May 2025
https://github.com/evancohen/sonus
💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
alexa hotword-detection keyword-spotting node speech speech-recognition speech-to-text stt voice-control voice-recognition
Last synced: 16 May 2025
https://github.com/picovoice/picovoice
On-device voice assistant platform powered by deep learning
natural-language-understanding nlu on-device speech-recoginition voice-assistant voice-command voice-commands voice-interface voice-recognition voice-user-interface wake-word-detection
Last synced: 13 Apr 2025
https://github.com/Picovoice/cheetah
On-device streaming speech-to-text engine powered by deep learning
asr automatic-speech-recognition online-speech-recognition speech-recognition speech-to-text streaming-speech-to-text stt transcription voice-recognition
Last synced: 04 May 2025
https://github.com/Picovoice/speech-to-text-benchmark
Speech-to-text benchmark framework
aws-transcribe cheetah deep-learning deep-neural-networks deepspeech edge-ai google-speech-to-text mozilla-deepspeech offline picovoice pocketsphinx privacy speech-recognition speech-to-text voice-recognition
Last synced: 21 Nov 2024
https://github.com/algolia/voice-overlay-ios
🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI
chatbots conversation conversational-bots conversational-interface conversational-ui input instant-search instantsearch ios objective-c overlay permissions search speech-recognition speech-to-text swift voice voice-assistant voice-recognition voicetext
Last synced: 09 Dec 2024
https://github.com/adrianhajdin/project_news_alan_ai
In this video, we're going to build a Conversational Voice Controlled React News Application using Alan AI. Alan AI is a revolutionary speech recognition software that allows you to add voice capabilities to your applications.
react react-project reactjs voice-assistant voice-recognition
Last synced: 05 Apr 2025
https://github.com/cay-zhang/swiftspeech
A speech recognition framework designed for SwiftUI.
audio combine ios speech-recognition swift swiftui user-voice voice-recognition
Last synced: 05 Apr 2025
https://github.com/hackingbeauty/react-mic
Record audio from a user's microphone and display a cool visualization.
audio-recorder audio-visualizer microphone mp3-audio reactjs record-audio speech-recognition-apps speech-to-text voice voice-activated voice-app voice-applications voice-recognition wav-audio
Last synced: 14 May 2025
https://github.com/reriiasu/speech-to-text
Real-time transcription using faster-whisper
faster-whisper openai speech-recognition speech-to-text voice-recognition whisper
Last synced: 05 Apr 2025
https://github.com/picovoice/leopard
On-device speech-to-text engine powered by deep learning
asr automatic-speech-recognition on-device speech-recognition speech-to-text stt transcription voice-recognition voice-to-text
Last synced: 14 May 2025
https://github.com/hollance/tensorflow-ios-example
Source code for my blog post "Getting started with TensorFlow on iOS"
ios logistic-regression machine-learning metal tensorflow voice-recognition
Last synced: 07 Apr 2025
https://github.com/rcbyron/hey-athena-client
Your personal voice assistant
alexa assistant cortana cross-platform siri voice voice-commands voice-control voice-recognition
Last synced: 07 Apr 2025
https://github.com/shamspias/customizable-gpt-chatbot
A dynamic, scalable AI chatbot built with Django REST framework, supporting custom training from PDFs, documents, websites, and YouTube videos. Leveraging OpenAI's GPT-3.5, Pinecone, FAISS, and Celery for seamless integration and performance.
artificial-intelligence autogpt chatbot conversational-ai data-preprocessing django django-rest-framework gpt-3 gpt-voice langchain langchain-python longchain machine-learning natural-language-processing nlp python voice-chat voice-recognition voice-to-text voice-transcription
Last synced: 05 Apr 2025
https://github.com/jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
data data-cleaning encryption-decryption featurization generation machine-learning python3 security server transcription visualization voice voice-activity-detection voice-assistant voice-computing voice-control voice-recognition voice-recording wake-word-detection
Last synced: 06 Apr 2025
https://github.com/nikorasu/livewhisper
A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.
ai assistant chatbot dictation numpy openai openai-whisper python sounddevice speech-recognition speech-to-text terminal text-to-speech transcription translation tts voice voice-assistant voice-recognition whisper
Last synced: 06 Apr 2025
https://github.com/dictation-toolbox/Caster
Dragonfly-Based Voice Programming and Accessibility Toolkit
accessibility accessibility-automation dragonfly grammars open-source programming python rsi voice voice-commands voice-control voice-programming voice-recognition
Last synced: 03 Apr 2025
https://github.com/yeyupiaoling/voiceprintrecognition-tensorflow
Voiceprint recognition implemented with TensorFlow
arcface speaker-recognition tensorflow voice-recognition
Last synced: 06 Apr 2025
https://github.com/Adri6336/gpt-voice-conversation-chatbot
Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
ai chatbot chatgpt cli conversational conversational-ai conversational-bots customizable elevenlabs gpt-3 gpt-4 memory openai personalized python speech-recognition speech-to-text tts user-friendly voice-recognition
Last synced: 07 Apr 2025
https://github.com/fulldecent/fdsoundactivatedrecorder
Start recording when the user speaks
audio ios listen sound swift voice-commands voice-control voice-recognition
Last synced: 06 Apr 2025
https://github.com/gtreshchev/RuntimeSpeechRecognizer
Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI's Whisper technology via whisper.cpp.
audio-processing openai speech-detection speech-processing speech-recognition speech-to-text ue4 ue4-plugin ue5 ue5-plugin unreal-engine unreal-engine-4 unreal-engine-5 voice-recognition whis whisper whisper-ai whisper-cpp
Last synced: 08 Apr 2025
https://github.com/yeyupiaoling/voiceprintrecognition-paddlepaddle
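This project uses a variety of advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and supports several data preprocessing methods, including MelSpectrogram, Spectrogram, MFCC, and Fbank.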
arcface ecapa-tdnn paddlepaddle speaker-recognition voice-recognition
Last synced: 12 Apr 2025
https://github.com/algolia/voice-overlay-android
🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI
android chatbots conversation conversational-bots conversational-interface conversational-ui input instant-search instantsearch overlay permission permissions permissions-android search speech-recognition speech-to-text stt voice voice-assistant voice-recognition
Last synced: 14 Dec 2024
https://github.com/themanyone/whisper_dictation
Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM.
ai assistant-chat-bots assistive-technology client client-server coding continuous dictation hands-free launcher server speech-recognition stable-diffusion stable-diffusion-webui star-trek voice-assistant voice-control voice-recognition whisper-api whisper-cpp
Last synced: 04 Apr 2025
https://github.com/voqal/voqal
Voice native AI agent for the builders of tomorrow
accessibility assistive-technology coding-assistant natural-language-programming speech-to-code talonvoice voice voice-assistant voice-coding voice-commands voice-programming voice-recognition voice-user-interface
Last synced: 05 Apr 2025
https://github.com/botbahlul/pyautosrt
PySimpleGUI based DESKTOP APP to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file
auto-caption auto-subtitle captions ffmpeg google-translate google-translate-api pysimplegui python speech-recognition srt-subtitle subriptext subtitle voice-recognition
Last synced: 06 Apr 2025
https://github.com/manekinekko/angular-search-experience
Algolia + Angular = 🔥🔥🔥
algolia angular autocomplete chatbot dialogflow firebase infinite-scroll material-design pwa search voice-assistant voice-recognition
Last synced: 03 Jan 2025
https://github.com/alexxit/streamassist
Home Assistant custom component that allows you to turn almost any camera and almost any speaker into a local voice assistant
hacs home-assistant voice-assistant voice-control voice-recognition
Last synced: 23 Jan 2025
https://github.com/small-cactus/M.I.L.E.S
M.I.L.E.S, a GPT-4-Turbo voice assistant, self-adapts its prompts and AI model, can play any Spotify song, adjusts system and Spotify volume, performs calculations, browses the web and internet, searches global weather, delivers date and time, autonomously chooses and retains long-term memories. Available for macOS and Windows.
ai ai-assistant calculator chatbot customization electron-app gpt-4-turbo home-assistant-integration jarvis-assistant large-language-models llm macos memory openai spotify tts ui voice-assistant voice-recognition windows
Last synced: 06 Mar 2025
https://github.com/yeyupiaoling/voiceprintrecognition-keras
A voiceprint recognition model implemented with Keras
deep-learning kersa speaker-recognition tensorflow voice-recognition
Last synced: 22 Mar 2025
https://github.com/lee-b/kobold_assistant
Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using local AI models such as Llama 2 and Whisper
ai-assistant ai-assistants android desktop koboldai koboldcpp linux llama llm local-ai mobile oobabooga speech-to-text text-generation-webui text-to-speech voice-assistant voice-recognition
Last synced: 17 Feb 2025
https://github.com/jakecyr/chatgpt-voice-assistant
A chatbot that integrates OpenAI Whisper, Chat Completions and Voice Generation. Also provides the option to use free transcription / TTS options.
artificial-intelligence chatgpt machine-learning python voice-assistant voice-recognition
Last synced: 05 Apr 2025
https://github.com/jakkra/smartmirror
My MagicMirror running on a Raspberry Pi
home-automation iot led-strip magic-mirror magicmirror node nodejs philips-hue raspberry-pi react smart-home smart-mirror smartmirror snowboy voice-recognition
Last synced: 16 Apr 2025
https://github.com/at16k/at16k
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
asr asr-model automatic-speech-recognition pretrained-models speech-analysis speech-api speech-recognition speech-recognizer speech-to-text voice-commands voice-recognition
Last synced: 22 Nov 2024
https://github.com/adiksondev/youtranslate
Takes a youtube video, clones the voice and re-creates that video in a different language
ai collaborate elevenlabs-api github localization-tool translation voice-cloning voice-recognition youtube
Last synced: 07 May 2025
https://github.com/antirek/voicer
AGI-server voice recognizer for #Asterisk
agi asr asterisk dialplan google javascript recognition voice voice-assistant voice-commands voice-control voice-recognition yandex
Last synced: 10 Apr 2025
https://github.com/eddyverbruggen/nativescript-speech-recognition
💬 Speech to text, using the awesome engines readily available on the device.
nativescript nativescript-plugin siri speech-recognition speech-to-text voice-recognition
Last synced: 08 Feb 2025
https://github.com/botbahlul/crx-live-translate
Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video streaming then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE!
auto-caption auto-subtitle browser-extension chrome edge google-translate-api javascript speech-recognition speech-to-text voice-recognition webkit-speech-recognition webkitspeechrecognition
Last synced: 29 Jan 2025
https://github.com/umesh-01/python-assistant
Python Assistant (PA) is a voice command based assistant service written in Python 3.9+. It can recognize human speech or voice, talk to user and execute basic commands.
ai-assistants google-recognition nlp openweathermap-api pycharm-ide python python-assistant python-automation python39 pyttsx3 speech-recognition text-to-speech virtual-assistant voice-assistant voice-commands voice-recognition web-scraping wikipedia-search wolfram-alpha
Last synced: 08 Feb 2025
https://github.com/franck-dernoncourt/asr_benchmark
Program to benchmark various speech recognition APIs
asr benchmark speech-recognition voice-recognition
Last synced: 13 Apr 2025
https://github.com/jim-schwoebel/voice_gender_detection
♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).
gender-classification gender-detection machine-learning machine-learning-model machine-learning-modeling machine-learning-practice machine-learning-tutorial neurolex surveylex tutorial voice voice-activity-detection voice-assistant voice-commands voice-computing voice-control voice-recognition workshop-materials
Last synced: 17 Dec 2024
https://github.com/shhossain/banglaspeech2text
BanglaSpeech2Text: An open-source offline speech-to-text package for Bangla language. Fine-tuned on the latest whisper speech to text model for optimal performance.
bangla bangla-asr bangla-automatic-speech-recognition bangla-speech-recognition bangla-speech-to-text bangla-voice-recognition deep-learning hacktoberfest machine-learning pytorch speech speech-recognition speech-to-text transformer voice-recognition whisper whisper-model
Last synced: 05 Apr 2025
https://github.com/nexxeln/spotify-voice-control
Voice control for Spotify through the terminal
music python speech-recognition spotify spotify-api voice-commands voice-recognition
Last synced: 30 Apr 2025
https://github.com/fewieden/MMM-voice
Offline Voice Recognition Module for MagicMirror²
magicmirror offline pocketsphinx privacy speech-recognition speech-to-text speech2text voice-recognition
Last synced: 20 Nov 2024
https://github.com/j3soon/whisper-to-input
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text. Supports English, Chinese, Japanese, and other languages, including mixed-language speech.
android android-ime automatic-speech-recognition chinese-speech-recognition ime keyboard kotlin openai openai-api speech speech-recognition speech-to-text virtual-keyboard voice voice-recognition whisper
Last synced: 09 Apr 2025
https://github.com/hecomi/node-julius
Node.js module for voice recognition using Julius
julius node-js voice-recognition
Last synced: 07 Apr 2025
https://github.com/matteo-convertino/vosk-build-model
How to create your own model for vosk
deep-learning deep-neural-networks guide kaldi speech-recognition tutorial voice-recognition vosk walkthrough
Last synced: 10 Apr 2025
https://github.com/bunyaminergen/callytics
Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers.
denoising diarization forced-alignment llama3 llm openai opensource sentiment-analysis speech-emotion-recognition speech-processing speech-recognition speech-to-text summary topic-modeling transcription voice-activity-detection voice-recognition
Last synced: 03 Apr 2025
https://github.com/markparker5/stark
S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit
cross-platform framework natural-language natural-language-processing natural-language-understanding python python3 speech-processing speech-recognition voice voice-assistant voice-commands voice-control voice-interface voice-recognition
Last synced: 28 Apr 2025
https://github.com/shamspias/chatgpt-voice-chatbot-telegram
ChatGPT Voice Chatbot Telegram is a Python and Flask-based GitHub repository that enables users to communicate with an AI chatbot using voice-to-text and text-to-voice technologies powered by OpenAI. The repository provides a flexible and customizable solution for building advanced voice-enabled chatbots using natural language processing.
celery chatbot chatgpt dall-e flask gpt-3 openjourney python telegram-bot telegram-voice-chat text-to-speech text-to-speech-python3 tts voice-chat voice-conversion voice-recognition voice-to-text whisper
Last synced: 04 Dec 2024
https://github.com/botbahlul/autosrt
A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file
auto-caption auto-subtitle captions ffmpeg google-translate-api python speech-recognition speechrecognition srt-subtitle subriptext subtitle voice-recognition voicerecognition
Last synced: 11 Apr 2025
https://github.com/felipefacundes/brasiltts
Brasil TTS is a set of Brazilian Portuguese speech synthesizers that read screens aloud for people with visual impairments. It converts text to audio, giving blind and low-vision users access to the content shown on screen. Although the main target audience of text-to-speech systems such as Brasil TTS is people with visual impairments, this kind of program can also be used by people with dyslexia and other reading difficulties, people with severe speech impairments, and pre-literate children. Besides serving as an assistive technology tool, speech synthesizers can also have educational and entertainment applications.
accessibility brasil brasiltts sistema-de-fala text-to-speech tts voice voice-assistant voice-recognition voz
Last synced: 24 Mar 2025
https://github.com/aronweiler/assistant
An intelligent AI assistant that can do anything!
ai database large-language-models llama2 llamacpp llm open-ai open-ai-api openai pgvector polly-voice postgres postgresql python streamlit transcription voice-assistant voice-recognition whisper
Last synced: 24 Jan 2025
https://github.com/vinitshahdeo/online-debate-system
🗣 Uses the Google voice recognition API and sentiment analysis to classify speech as "For the motion" or "Against the motion" 💬 📢
debate-system php sentiment-analysis speech-to-text voice-recognition
Last synced: 10 Apr 2025
https://github.com/mobilequickie/amazonspeechtranslator
End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.
amazon-cognito amazon-polly amazon-translate aws-mobilehub aws-sdk-ios interpreter mobile-development speech-recognition speech-recognizer speech-synthesis speech-to-text swift4 text-to-speech translation translation-api voice-recognition youtube-video
Last synced: 13 Feb 2025
https://github.com/opensrc0/fe-pilot
A React UI library for Advanced Web Features
advance-js autofill current-location live-location live-location-tracker mobile-navbar navigator phonebook react reactjs scanner share text-to-speech voice voice-assistant voice-recognition
Last synced: 03 Jan 2025
https://github.com/geoffsmith82/symposium2023
Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot. Source code and Documentation for my 2023 ADUG Symposium Talk.
ai artificial-intelligence claude-3-haiku claude-3-opus claude-3-sonnet computer-vision gpt gpt-4 gpt-4o image-to-text oauth2 palm palm2 speech-to-text text-to-image text-to-speech translation voice-recognition websockets
Last synced: 26 Feb 2025
https://github.com/keyvan-m-sadeghi/assister
Private Open General Assistant Platform
artificial-intelligence assistant assistant-chat-bots chatbot nlp voice voice-recognition
Last synced: 15 Apr 2025
https://github.com/iamaziz/llm-voice-bot
Speak (speech-to-text) to LLMs (Ollama) in any language - Streamlit app
llms ollama streamlit voice-bot voice-recognition
Last synced: 07 May 2025
https://github.com/spokestack/spokestack-ios
Spokestack: give your iOS app a voice interface!
asr hacktoberfest ios natural-language-understanding speech-api speech-processing speech-recognition speech-synthesis speech-to-text swift tensorflow text-to-speech vad voice-activity-detection voice-assistant voice-recognition voice-synthesis wakeword wakeword-activation
Last synced: 23 Jan 2025
https://github.com/madzadev/voice-cue
📣 Find sentiments, tags, entities, and actions in your voice recordings instantly
audio audio-analysis audio-processing speech speech-recognition speech-to-text transcript voice voice-recognition
Last synced: 22 Nov 2024
https://github.com/fleschutz/talk2windows
Control your Windows desktop by voice commands.
powershell serenade voice voice-assistant voice-commands voice-control voice-recognition
Last synced: 05 May 2025
https://github.com/ttop32/wav2vec2-live-japanese-translator
Real-time Japanese speech recognition translator using wav2vec2
asr audio automatic-speech-recognition fine-tuning huggingface japanese live pyaudio pyqt5 pytorch real-time speaker-recognition speech-to-text spoken-language-understanding stt translation translator voice voice-recognition wav2vec2
Last synced: 29 Apr 2025
https://github.com/picovoice/octopus
On-device Speech-to-Index engine powered by deep learning
audio-search information-retrieval search speech-recognition speech-search speech-to-index speech-to-text voice-recognition voice-search
Last synced: 09 Apr 2025
https://github.com/lipsurf/plugins
Community plugins for LipSurf.
accessibility alexa browser chrome-extension google hacktoberfest shortcuts siri voice voice-as-an-interface voice-commands voice-control voice-recognition
Last synced: 03 May 2025