Projects in Awesome Lists tagged with voice-recognition

https://github.com/paddlepaddle/paddlespeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 12 May 2025

https://github.com/PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 24 Mar 2025

https://github.com/speechbrain/speechbrain

A PyTorch-based Speech Toolkit

asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition

Last synced: 13 May 2025

https://github.com/alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

android asr deep-learning deep-neural-networks deepspeech google-speech-to-text ios kaldi offline privacy python raspberry-pi speaker-identification speaker-verification speech-recognition speech-to-text speech-to-text-android stt voice-recognition vosk

Last synced: 12 May 2025

https://github.com/snakers4/silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

onnx onnx-runtime onnxruntime pytorch speech speech-processing vad voice-activity-detection voice-commands voice-control voice-detection voice-recognition

Last synced: 13 May 2025

https://github.com/theajack/cnchar

🇨🇳 A full-featured Chinese character utility library (pinyin, strokes, radicals, idioms, speech, visualization, etc.)

chinese-characters draw pinyin speak spell-stroke voice-recognition

Last synced: 23 Apr 2025

https://github.com/coqui-ai/stt

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition

Last synced: 14 May 2025

https://github.com/coqui-ai/STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition

Last synced: 15 Mar 2025

https://github.com/collabora/whisperlive

A nearly-live implementation of OpenAI's Whisper.

dictation obs openai tensorrt tensorrt-llm text-to-speech translation voice-recognition whisper whisper-tensorrt

Last synced: 09 Apr 2025

https://github.com/collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper.

dictation obs openai tensorrt tensorrt-llm text-to-speech translation voice-recognition whisper whisper-tensorrt

Last synced: 07 Apr 2025

https://github.com/react-native-voice/voice

:microphone: React Native Voice Recognition library for iOS and Android (Online and Offline Support)

android ios react-native speech-recognition voice-recognition

Last synced: 14 May 2025

https://github.com/react-native-community/voice

:microphone: React Native Voice Recognition library for iOS and Android (Online and Offline Support)

android ios react-native speech-recognition voice-recognition

Last synced: 09 Mar 2025

https://github.com/wenkesj/react-native-voice

:microphone: React Native Voice Recognition library for iOS and Android (Online and Offline Support)

android ios react-native speech-recognition voice-recognition

Last synced: 21 Feb 2025

https://github.com/jim-schwoebel/voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

audio-dataset audio-datasets data dataset datasets noise voice voice-activity-detection voice-assistant voice-chat voice-commands voice-computing voice-control voice-conversion voice-dataset voice-datasets voice-recognition voice-synthesis

Last synced: 26 Mar 2025

https://github.com/coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

speech-emotion-recognition speech-processing speech-recognition speech-separation speech-synthesis speech-to-text stt text-to-speech tts voice-activity-detection voice-cloning voice-recognition

Last synced: 26 Mar 2025

https://github.com/yeyupiaoling/voiceprintrecognition-pytorch

This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++; more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.

arcface ecapa-tdnn pytorch speaker-recognition voice-recognition

Last synced: 15 May 2025

https://github.com/ggeop/python-ai-assistant

Python AI assistant 🧠

ai google-speech-recognition google-speech-to-text linux-assistant mongodb nlp nlp-machine-learning nltk pymongo python python35 sklearn voice-activity-detection voice-assistant voice-chat voice-commands voice-control voice-recognition voice-recognition-experiment wolfram-language

Last synced: 14 May 2025

https://github.com/ggeop/Python-ai-assistant

Python AI assistant 🧠

ai google-speech-recognition google-speech-to-text linux-assistant mongodb nlp nlp-machine-learning nltk pymongo python python35 sklearn voice-activity-detection voice-assistant voice-chat voice-commands voice-control voice-recognition voice-recognition-experiment wolfram-language

Last synced: 07 Apr 2025

https://github.com/mycroftai/mycroft-precise

A lightweight, simple-to-use, RNN wake word listener

embedded-systems hotword-detection keyword-spotting raspberry-pi speech-recognition voice-control voice-recognition wake-word-detection

Last synced: 15 May 2025

https://github.com/MycroftAI/mycroft-precise

A lightweight, simple-to-use, RNN wake word listener

embedded-systems hotword-detection keyword-spotting raspberry-pi speech-recognition voice-control voice-recognition wake-word-detection

Last synced: 02 Apr 2025

https://github.com/wstxda/plugin-voicegpt

Use ChatGPT voice chat as your device digital assistant

ai android chatgpt launcher material-design openai plugin shortcuts tools voice voice-assistant voice-recognition

Last synced: 15 May 2025

https://github.com/alexylem/jarvis

Jarvis.sh is a simple configurable multi-lang assistant.

assistant home-automation jarvis jasper personal-assistant raspberry-pi sarah voice-commands voice-control voice-recognition

Last synced: 24 Nov 2024

https://github.com/picovoice/rhino

On-device Speech-to-Intent engine powered by deep learning

entity-resolution intent-inference natural-language-understanding nlu on-device slot-filling slu speech-recognition spoken-language-understanding voice-assistant voice-command voice-command-control voice-commands voice-control voice-recognition voice-ui voice-user-interface vui

Last synced: 14 May 2025

https://github.com/Picovoice/rhino

On-device Speech-to-Intent engine powered by deep learning

entity-resolution intent-inference natural-language-understanding nlu on-device slot-filling slu speech-recognition spoken-language-understanding voice-assistant voice-command voice-command-control voice-commands voice-control voice-recognition voice-ui voice-user-interface vui

Last synced: 15 Mar 2025

https://github.com/evancohen/sonus

:speech_balloon: /so.nus/ STT (speech to text) for Node with offline hotword detection

alexa hotword-detection keyword-spotting node speech speech-recognition speech-to-text stt voice-control voice-recognition

Last synced: 16 May 2025

https://github.com/picovoice/picovoice

On-device voice assistant platform powered by deep learning

natural-language-understanding nlu on-device speech-recoginition voice-assistant voice-command voice-commands voice-interface voice-recognition voice-user-interface wake-word-detection

Last synced: 13 Apr 2025

https://github.com/Picovoice/picovoice

On-device voice assistant platform powered by deep learning

natural-language-understanding nlu on-device speech-recoginition voice-assistant voice-command voice-commands voice-interface voice-recognition voice-user-interface wake-word-detection

Last synced: 02 Apr 2025

https://github.com/Picovoice/cheetah

On-device streaming speech-to-text engine powered by deep learning

asr automatic-speech-recognition online-speech-recognition speech-recognition speech-to-text streaming-speech-to-text stt transcription voice-recognition

Last synced: 04 May 2025

https://github.com/picovoice/cheetah

On-device streaming speech-to-text engine powered by deep learning

asr automatic-speech-recognition online-speech-recognition speech-recognition speech-to-text streaming-speech-to-text stt transcription voice-recognition

Last synced: 13 Apr 2025

https://github.com/Picovoice/speech-to-text-benchmark

Speech-to-text benchmark framework

aws-transcribe cheetah deep-learning deep-neural-networks deepspeech edge-ai google-speech-to-text mozilla-deepspeech offline picovoice pocketsphinx privacy speech-recognition speech-to-text voice-recognition

Last synced: 21 Nov 2024

https://github.com/picovoice/speech-to-text-benchmark

Speech-to-text benchmark framework

aws-transcribe cheetah deep-learning deep-neural-networks deepspeech edge-ai google-speech-to-text mozilla-deepspeech offline picovoice pocketsphinx privacy speech-recognition speech-to-text voice-recognition

Last synced: 04 Apr 2025

https://github.com/algolia/voice-overlay-ios

🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

chatbots conversation conversational-bots conversational-interface conversational-ui input instant-search instantsearch ios objective-c overlay permissions search speech-recognition speech-to-text swift voice voice-assistant voice-recognition voicetext

Last synced: 09 Dec 2024

https://github.com/adrianhajdin/project_news_alan_ai

In this video, we're going to build a Conversational Voice Controlled React News Application using Alan AI. Alan AI is a revolutionary speech recognition software that allows you to add voice capabilities to your applications.

react react-project reactjs voice-assistant voice-recognition

Last synced: 05 Apr 2025

https://github.com/cay-zhang/swiftspeech

A speech recognition framework designed for SwiftUI.

audio combine ios speech-recognition swift swiftui user-voice voice-recognition

Last synced: 05 Apr 2025

https://github.com/Cay-Zhang/SwiftSpeech

A speech recognition framework designed for SwiftUI.

audio combine ios speech-recognition swift swiftui user-voice voice-recognition

Last synced: 24 Mar 2025

https://github.com/hackingbeauty/react-mic

Record audio from a user's microphone and display a cool visualization.

audio-recorder audio-visualizer microphone mp3-audio reactjs record-audio speech-recognition-apps speech-to-text voice voice-activated voice-app voice-applications voice-recognition wav-audio

Last synced: 14 May 2025

https://github.com/reriiasu/speech-to-text

Real-time transcription using faster-whisper

faster-whisper openai speech-recognition speech-to-text voice-recognition whisper

Last synced: 05 Apr 2025

https://github.com/picovoice/leopard

On-device speech-to-text engine powered by deep learning

asr automatic-speech-recognition on-device speech-recognition speech-to-text stt transcription voice-recognition voice-to-text

Last synced: 14 May 2025

https://github.com/hollance/tensorflow-ios-example

Source code for my blog post "Getting started with TensorFlow on iOS"

ios logistic-regression machine-learning metal tensorflow voice-recognition

Last synced: 07 Apr 2025

https://github.com/hollance/TensorFlow-iOS-Example

Source code for my blog post "Getting started with TensorFlow on iOS"

ios logistic-regression machine-learning metal tensorflow voice-recognition

Last synced: 09 May 2025

https://github.com/rcbyron/hey-athena-client

Your personal voice assistant

alexa assistant cortana cross-platform siri voice voice-commands voice-control voice-recognition

Last synced: 07 Apr 2025

https://github.com/shamspias/customizable-gpt-chatbot

A dynamic, scalable AI chatbot built with Django REST framework, supporting custom training from PDFs, documents, websites, and YouTube videos. Leveraging OpenAI's GPT-3.5, Pinecone, FAISS, and Celery for seamless integration and performance.

artificial-intelligence autogpt chatbot conversational-ai data-preprocessing django django-rest-framework gpt-3 gpt-voice langchain langchain-python longchain machine-learning natural-language-processing nlp python voice-chat voice-recognition voice-to-text voice-transcription

Last synced: 05 Apr 2025

https://github.com/jim-schwoebel/voicebook

🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).

data data-cleaning encryption-decryption featurization generation machine-learning python3 security server transcription visualization voice voice-activity-detection voice-assistant voice-computing voice-control voice-recognition voice-recording wake-word-detection

Last synced: 06 Apr 2025

https://github.com/nikorasu/livewhisper

A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.

ai assistant chatbot dictation numpy openai openai-whisper python sounddevice speech-recognition speech-to-text terminal text-to-speech transcription translation tts voice voice-assistant voice-recognition whisper

Last synced: 06 Apr 2025

https://github.com/Nikorasu/LiveWhisper

A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.

ai assistant chatbot dictation numpy openai openai-whisper python sounddevice speech-recognition speech-to-text terminal text-to-speech transcription translation tts voice voice-assistant voice-recognition whisper

Last synced: 29 Apr 2025

https://github.com/dictation-toolbox/Caster

Dragonfly-Based Voice Programming and Accessibility Toolkit

accessibility accessibility-automation dragonfly grammars open-source programming python rsi voice voice-commands voice-control voice-programming voice-recognition

Last synced: 03 Apr 2025

https://github.com/yeyupiaoling/voiceprintrecognition-tensorflow

Voiceprint recognition implemented with TensorFlow

arcface speaker-recognition tensorflow voice-recognition

Last synced: 06 Apr 2025

https://github.com/Adri6336/gpt-voice-conversation-chatbot

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

ai chatbot chatgpt cli conversational conversational-ai conversational-bots customizable elevenlabs gpt-3 gpt-4 memory openai personalized python speech-recognition speech-to-text tts user-friendly voice-recognition

Last synced: 07 Apr 2025

https://github.com/adri6336/gpt-voice-conversation-chatbot

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

ai chatbot chatgpt cli conversational conversational-ai conversational-bots customizable elevenlabs gpt-3 gpt-4 memory openai personalized python speech-recognition speech-to-text tts user-friendly voice-recognition

Last synced: 06 Apr 2025

https://github.com/fulldecent/fdsoundactivatedrecorder

Start recording when the user speaks

audio ios listen sound swift voice-commands voice-control voice-recognition

Last synced: 06 Apr 2025

https://github.com/gtreshchev/RuntimeSpeechRecognizer

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on Whisper OpenAI technology, whisper.cpp.

audio-processing openai speech-detection speech-processing speech-recognition speech-to-text ue4 ue4-plugin ue5 ue5-plugin unreal-engine unreal-engine-4 unreal-engine-5 voice-recognition whis whisper whisper-ai whisper-cpp

Last synced: 08 Apr 2025

https://github.com/fulldecent/FDSoundActivatedRecorder

Start recording when the user speaks

audio ios listen sound swift voice-commands voice-control voice-recognition

Last synced: 09 Dec 2024

https://github.com/gtreshchev/runtimespeechrecognizer

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on Whisper OpenAI technology, whisper.cpp.

audio-processing openai speech-detection speech-processing speech-recognition speech-to-text ue4 ue4-plugin ue5 ue5-plugin unreal-engine unreal-engine-4 unreal-engine-5 voice-recognition whis whisper whisper-ai whisper-cpp

Last synced: 19 Feb 2025

https://github.com/yeyupiaoling/voiceprintrecognition-paddlepaddle

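This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports multiple data preprocessing methods including MelSpectrogram, Spectrogram, MFCC, and Fbank.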

arcface ecapa-tdnn paddlepaddle speaker-recognition voice-recognition

Last synced: 12 Apr 2025

https://github.com/algolia/voice-overlay-android

🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

android chatbots conversation conversational-bots conversational-interface conversational-ui input instant-search instantsearch overlay permission permissions permissions-android search speech-recognition speech-to-text stt voice voice-assistant voice-recognition

Last synced: 14 Dec 2024

https://github.com/themanyone/whisper_dictation

Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM.

ai assistant-chat-bots assistive-technology client client-server coding continuous dictation hands-free launcher server speech-recognition stable-diffusion stable-diffusion-webui star-trek voice-assistant voice-control voice-recognition whisper-api whisper-cpp

Last synced: 04 Apr 2025

https://github.com/voqal/voqal

Voice native AI agent for the builders of tomorrow

accessibility assistive-technology coding-assistant natural-language-programming speech-to-code talonvoice voice voice-assistant voice-coding voice-commands voice-programming voice-recognition voice-user-interface

Last synced: 05 Apr 2025

https://github.com/botbahlul/pyautosrt

PySimpleGUI based DESKTOP APP to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file

auto-caption auto-subtitle captions ffmpeg google-translate google-translate-api pysimplegui python speech-recognition srt-subtitle subriptext subtitle voice-recognition

Last synced: 06 Apr 2025

https://github.com/botbahlul/PyAutoSRT

PySimpleGUI based DESKTOP APP to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file

auto-caption auto-subtitle captions ffmpeg google-translate google-translate-api pysimplegui python speech-recognition srt-subtitle subriptext subtitle voice-recognition

Last synced: 24 Mar 2025

https://github.com/manekinekko/angular-search-experience

Algolia + Angular = 🔥🔥🔥

algolia angular autocomplete chatbot dialogflow firebase infinite-scroll material-design pwa search voice-assistant voice-recognition

Last synced: 03 Jan 2025

https://github.com/alexxit/streamassist

Home Assistant custom component that allows you to turn almost any camera and almost any speaker into a local voice assistant

hacs home-assistant voice-assistant voice-control voice-recognition

Last synced: 23 Jan 2025

https://github.com/AlexxIT/StreamAssist

Home Assistant custom component that allows you to turn almost any camera and almost any speaker into a local voice assistant

hacs home-assistant voice-assistant voice-control voice-recognition

Last synced: 05 Apr 2025

https://github.com/small-cactus/M.I.L.E.S

M.I.L.E.S, a GPT-4-Turbo voice assistant, self-adapts its prompts and AI model, can play any Spotify song, adjusts system and Spotify volume, performs calculations, browses the web and internet, searches global weather, delivers date and time, autonomously chooses and retains long-term memories. Available for macOS and Windows.

ai ai-assistant calculator chatbot customization electron-app gpt-4-turbo home-assistant-integration jarvis-assistant large-language-models llm macos memory openai spotify tts ui voice-assistant voice-recognition windows

Last synced: 06 Mar 2025

https://github.com/yeyupiaoling/voiceprintrecognition-keras

A voiceprint recognition model implemented with Keras

deep-learning kersa speaker-recognition tensorflow voice-recognition

Last synced: 22 Mar 2025

https://github.com/lee-b/kobold_assistant

Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using local AI models such as LLama 2 and Whisper

ai-assistant ai-assistants android desktop koboldai koboldcpp linux llama llm local-ai mobile oobabooga speech-to-text text-generation-webui text-to-speech voice-assistant voice-recognition

Last synced: 17 Feb 2025

https://github.com/jakecyr/chatgpt-voice-assistant

A chatbot that integrates OpenAI Whisper, Chat Completions and Voice Generation. Also provides the option to use free transcription / TTS options.

artificial-intelligence chatgpt machine-learning python voice-assistant voice-recognition

Last synced: 05 Apr 2025

https://github.com/jakkra/smartmirror

My MagicMirror running on a Raspberry Pi

home-automation iot led-strip magic-mirror magicmirror node nodejs philips-hue raspberry-pi react smart-home smart-mirror smartmirror snowboy voice-recognition

Last synced: 16 Apr 2025

https://github.com/at16k/at16k

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

asr asr-model automatic-speech-recognition pretrained-models speech-analysis speech-api speech-recognition speech-recognizer speech-to-text voice-commands voice-recognition

Last synced: 22 Nov 2024

https://github.com/jakkra/SmartMirror

My MagicMirror running on a Raspberry Pi

home-automation iot led-strip magic-mirror magicmirror node nodejs philips-hue raspberry-pi react smart-home smart-mirror smartmirror snowboy voice-recognition

Last synced: 14 Mar 2025

https://github.com/adiksondev/youtranslate

Takes a YouTube video, clones the voice and re-creates that video in a different language

ai collaborate elevenlabs-api github localization-tool translation voice-cloning voice-recognition youtube

Last synced: 07 May 2025

https://github.com/antirek/voicer

AGI-server voice recognizer for #Asterisk

agi asr asterisk dialplan google javascript recognition voice voice-assistant voice-commands voice-control voice-recognition yandex

Last synced: 10 Apr 2025

https://github.com/eddyverbruggen/nativescript-speech-recognition

:speech_balloon: Speech to text, using the awesome engines readily available on the device.

nativescript nativescript-plugin siri speech-recognition speech-to-text voice-recognition

Last synced: 08 Feb 2025

https://github.com/botbahlul/crx-live-translate

Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video streaming then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE!

auto-caption auto-subtitle browser-extension chrome edge google-translate-api javascript speech-recognition speech-to-text voice-recognition webkit-speech-recognition webkitspeechrecognition

Last synced: 29 Jan 2025

https://github.com/umesh-01/python-assistant

Python Assistant (PA) is a voice-command-based assistant service written in Python 3.9+. It can recognize human speech, talk to the user, and execute basic commands.

ai-assistants google-recognition nlp openweathermap-api pycharm-ide python python-assistant python-automation python39 pyttsx3 speech-recognition text-to-speech virtual-assistant voice-assistant voice-commands voice-recognition web-scraping wikipedia-search wolfram-alpha

Last synced: 08 Feb 2025

https://github.com/franck-dernoncourt/asr_benchmark

Program to benchmark various speech recognition APIs

asr benchmark speech-recognition voice-recognition

Last synced: 13 Apr 2025

https://github.com/jim-schwoebel/voice_gender_detection

♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).

gender-classification gender-detection machine-learning machine-learning-model machine-learning-modeling machine-learning-practice machine-learning-tutorial neurolex surveylex tutorial voice voice-activity-detection voice-assistant voice-commands voice-computing voice-control voice-recognition workshop-materials

Last synced: 17 Dec 2024

https://github.com/shhossain/banglaspeech2text

BanglaSpeech2Text: An open-source offline speech-to-text package for Bangla language. Fine-tuned on the latest whisper speech to text model for optimal performance.

bangla bangla-asr bangla-automatic-speech-recognition bangla-speech-recognition bangla-speech-to-text bangla-voice-recognition deep-learning hacktoberfest machine-learning pytorch speech speech-recognition speech-to-text transformer voice-recognition whisper whisper-model

Last synced: 05 Apr 2025

https://github.com/nexxeln/spotify-voice-control

Voice control for Spotify through the terminal

music python speech-recognition spotify spotify-api voice-commands voice-recognition

Last synced: 30 Apr 2025

https://github.com/fewieden/MMM-voice

Offline Voice Recognition Module for MagicMirror²

magicmirror offline pocketsphinx privacy speech-recognition speech-to-text speech2text voice-recognition

Last synced: 20 Nov 2024

https://github.com/j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text. Supports English, Chinese, Japanese, etc., and even mixed languages.

android android-ime automatic-speech-recognition chinese-speech-recognition ime keyboard kotlin openai openai-api speech speech-recognition speech-to-text virtual-keyboard voice voice-recognition whisper

Last synced: 09 Apr 2025

https://github.com/hecomi/node-julius

Node.js module for voice recognition using Julius

julius node-js voice-recognition

Last synced: 07 Apr 2025

https://github.com/matteo-convertino/vosk-build-model

How to create your own model for vosk

deep-learning deep-neural-networks guide kaldi speech-recognition tutorial voice-recognition vosk walkthrough

Last synced: 10 Apr 2025

https://github.com/bunyaminergen/callytics

Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers.

denoising diarization forced-alignment llama3 llm openai opensource sentiment-analysis speech-emotion-recognition speech-processing speech-recognition speech-to-text summary topic-modeling transcription voice-activity-detection voice-recognition

Last synced: 03 Apr 2025

https://github.com/markparker5/stark

S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit

cross-platform framework natural-language natural-language-processing natural-language-understanding python python3 speech-processing speech-recognition voice voice-assistant voice-commands voice-control voice-interface voice-recognition

Last synced: 28 Apr 2025

https://github.com/shamspias/chatgpt-voice-chatbot-telegram

ChatGPT Voice Chatbot Telegram is a Python and Flask-based GitHub repository that enables users to communicate with an AI chatbot using voice-to-text and text-to-voice technologies powered by OpenAI. The repository provides a flexible and customizable solution for building advanced voice-enabled chatbots using natural language processing.

celery chatbot chatgpt dall-e flask gpt-3 openjourney python telegram-bot telegram-voice-chat text-to-speech text-to-speech-python3 tts voice-chat voice-conversion voice-recognition voice-to-text whisper

Last synced: 04 Dec 2024

https://github.com/botbahlul/autosrt

A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file

auto-caption auto-subtitle captions ffmpeg google-translate-api python speech-recognition speechrecognition srt-subtitle subriptext subtitle voice-recognition voicerecognition

Last synced: 11 Apr 2025

https://github.com/felipefacundes/brasiltts

Brasil TTS is a set of Brazilian Portuguese speech synthesizers that read screens for people with visual impairments. It converts text to audio, giving blind or low-vision users access to the content shown on screen. Although the main target audience of text-to-speech systems such as Brasil TTS is people with visual impairments, this kind of program can also be used by people with dyslexia and other reading difficulties, people with severe speech impairments, and pre-literate children. Besides being an assistive technology tool, speech synthesizers can also have educational and entertainment applications.

accessibility brasil brasiltts sistema-de-fala text-to-speech tts voice voice-assistant voice-recognition voz

Last synced: 24 Mar 2025

https://github.com/aronweiler/assistant

An intelligent AI assistant that can do anything!

ai database large-language-models llama2 llamacpp llm open-ai open-ai-api openai pgvector polly-voice postgres postgresql python streamlit transcription voice-assistant voice-recognition whisper

Last synced: 24 Jan 2025

https://github.com/vinitshahdeo/online-debate-system

:speaking_head: Using the Google voice recognition API to predict "For the motion" and "Against the motion" using sentiment analysis :speech_balloon: :loudspeaker:

debate-system php sentiment-analysis speech-to-text voice-recognition

Last synced: 10 Apr 2025

https://github.com/mobilequickie/amazonspeechtranslator

End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.

amazon-cognito amazon-polly amazon-translate aws-mobilehub aws-sdk-ios interpreter mobile-development speech-recognition speech-recognizer speech-synthesis speech-to-text swift4 text-to-speech translation translation-api voice-recognition youtube-video

Last synced: 13 Feb 2025

https://github.com/opensrc0/fe-pilot

A React UI library for Advanced Web Features

advance-js autofill current-location live-location live-location-tracker mobile-navbar navigator phonebook react reactjs scanner share text-to-speech voice voice-assistant voice-recognition

Last synced: 03 Jan 2025

https://github.com/geoffsmith82/symposium2023

Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot. Source code and Documentation for my 2023 ADUG Symposium Talk.

ai artificial-intelligence claude-3-haiku claude-3-opus claude-3-sonnet computer-vision gpt gpt-4 gpt-4o image-to-text oauth2 palm palm2 speech-to-text text-to-image text-to-speech translation voice-recognition websockets