An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with voice-recognition

A curated list of projects in awesome lists tagged with voice-recognition.

https://github.com/paddlepaddle/paddlespeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 12 May 2025

https://github.com/theajack/cnchar

🇨🇳 A full-featured Chinese character utility library (pinyin, strokes, radicals, idioms, speech, visualization, and more)

chinese-characters draw pinyin speak spell-stroke voice-recognition

Last synced: 23 Apr 2025

https://github.com/coqui-ai/stt

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition

Last synced: 14 May 2025

https://github.com/react-native-voice/voice

🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)

android ios react-native speech-recognition voice-recognition

Last synced: 14 May 2025

https://github.com/react-native-community/voice

🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)

android ios react-native speech-recognition voice-recognition

Last synced: 09 Mar 2025

https://github.com/wenkesj/react-native-voice

🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)

android ios react-native speech-recognition voice-recognition

Last synced: 21 Feb 2025

https://github.com/yeyupiaoling/voiceprintrecognition-pytorch

This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.

arcface ecapa-tdnn pytorch speaker-recognition voice-recognition

Last synced: 15 May 2025

https://github.com/evancohen/sonus

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

alexa hotword-detection keyword-spotting node speech speech-recognition speech-to-text stt voice-control voice-recognition

Last synced: 16 May 2025

https://github.com/adrianhajdin/project_news_alan_ai

In this video, we're going to build a Conversational Voice Controlled React News Application using Alan AI. Alan AI is a revolutionary speech recognition software that allows you to add voice capabilities to your applications.

react react-project reactjs voice-assistant voice-recognition

Last synced: 05 Apr 2025

https://github.com/cay-zhang/swiftspeech

A speech recognition framework designed for SwiftUI.

audio combine ios speech-recognition swift swiftui user-voice voice-recognition

Last synced: 05 Apr 2025

https://github.com/hollance/tensorflow-ios-example

Source code for my blog post "Getting started with TensorFlow on iOS"

ios logistic-regression machine-learning metal tensorflow voice-recognition

Last synced: 07 Apr 2025

https://github.com/shamspias/customizable-gpt-chatbot

A dynamic, scalable AI chatbot built with Django REST framework, supporting custom training from PDFs, documents, websites, and YouTube videos. Leveraging OpenAI's GPT-3.5, Pinecone, FAISS, and Celery for seamless integration and performance.

artificial-intelligence autogpt chatbot conversational-ai data-preprocessing django django-rest-framework gpt-3 gpt-voice langchain langchain-python longchain machine-learning natural-language-processing nlp python voice-chat voice-recognition voice-to-text voice-transcription

Last synced: 05 Apr 2025

https://github.com/Adri6336/gpt-voice-conversation-chatbot

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

ai chatbot chatgpt cli conversational conversational-ai conversational-bots customizable elevenlabs gpt-3 gpt-4 memory openai personalized python speech-recognition speech-to-text tts user-friendly voice-recognition

Last synced: 07 Apr 2025

https://github.com/gtreshchev/RuntimeSpeechRecognizer

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI's Whisper technology, using whisper.cpp.

audio-processing openai speech-detection speech-processing speech-recognition speech-to-text ue4 ue4-plugin ue5 ue5-plugin unreal-engine unreal-engine-4 unreal-engine-5 voice-recognition whis whisper whisper-ai whisper-cpp

Last synced: 08 Apr 2025

https://github.com/yeyupiaoling/voiceprintrecognition-paddlepaddle

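This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and supports multiple data preprocessing methods such as MelSpectrogram, Spectrogram, MFCC, and Fbank.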

arcface ecapa-tdnn paddlepaddle speaker-recognition voice-recognition

Last synced: 12 Apr 2025

https://github.com/botbahlul/pyautosrt

PySimpleGUI based DESKTOP APP to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file

auto-caption auto-subtitle captions ffmpeg google-translate google-translate-api pysimplegui python speech-recognition srt-subtitle subriptext subtitle voice-recognition

Last synced: 06 Apr 2025

https://github.com/alexxit/streamassist

Home Assistant custom component that allows you to turn almost any camera and almost any speaker into a local voice assistant

hacs home-assistant voice-assistant voice-control voice-recognition

Last synced: 23 Jan 2025

https://github.com/small-cactus/M.I.L.E.S

M.I.L.E.S, a GPT-4-Turbo voice assistant, self-adapts its prompts and AI model, can play any Spotify song, adjusts system and Spotify volume, performs calculations, browses the web and internet, searches global weather, delivers date and time, autonomously chooses and retains long-term memories. Available for macOS and Windows.

ai ai-assistant calculator chatbot customization electron-app gpt-4-turbo home-assistant-integration jarvis-assistant large-language-models llm macos memory openai spotify tts ui voice-assistant voice-recognition windows

Last synced: 06 Mar 2025

https://github.com/lee-b/kobold_assistant

Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using local AI models such as Llama 2 and Whisper

ai-assistant ai-assistants android desktop koboldai koboldcpp linux llama llm local-ai mobile oobabooga speech-to-text text-generation-webui text-to-speech voice-assistant voice-recognition

Last synced: 17 Feb 2025

https://github.com/jakecyr/chatgpt-voice-assistant

A chatbot that integrates OpenAI Whisper, Chat Completions and Voice Generation. Also provides the option to use free transcription / TTS options.

artificial-intelligence chatgpt machine-learning python voice-assistant voice-recognition

Last synced: 05 Apr 2025

https://github.com/at16k/at16k

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

asr asr-model automatic-speech-recognition pretrained-models speech-analysis speech-api speech-recognition speech-recognizer speech-to-text voice-commands voice-recognition

Last synced: 22 Nov 2024

https://github.com/adiksondev/youtranslate

Takes a youtube video, clones the voice and re-creates that video in a different language

ai collaborate elevenlabs-api github localization-tool translation voice-cloning voice-recognition youtube

Last synced: 07 May 2025

https://github.com/eddyverbruggen/nativescript-speech-recognition

💬 Speech to text, using the awesome engines readily available on the device.

nativescript nativescript-plugin siri speech-recognition speech-to-text voice-recognition

Last synced: 08 Feb 2025

https://github.com/botbahlul/crx-live-translate

Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video streaming then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE!

auto-caption auto-subtitle browser-extension chrome edge google-translate-api javascript speech-recognition speech-to-text voice-recognition webkit-speech-recognition webkitspeechrecognition

Last synced: 29 Jan 2025

https://github.com/umesh-01/python-assistant

Python Assistant (PA) is a voice command based assistant service written in Python 3.9+. It can recognize human speech or voice, talk to user and execute basic commands.

ai-assistants google-recognition nlp openweathermap-api pycharm-ide python python-assistant python-automation python39 pyttsx3 speech-recognition text-to-speech virtual-assistant voice-assistant voice-commands voice-recognition web-scraping wikipedia-search wolfram-alpha

Last synced: 08 Feb 2025

https://github.com/franck-dernoncourt/asr_benchmark

Program to benchmark various speech recognition APIs

asr benchmark speech-recognition voice-recognition

Last synced: 13 Apr 2025

https://github.com/shhossain/banglaspeech2text

BanglaSpeech2Text: An open-source offline speech-to-text package for the Bangla language. Fine-tuned on the latest Whisper speech-to-text model for optimal performance.

bangla bangla-asr bangla-automatic-speech-recognition bangla-speech-recognition bangla-speech-to-text bangla-voice-recognition deep-learning hacktoberfest machine-learning pytorch speech speech-recognition speech-to-text transformer voice-recognition whisper whisper-model

Last synced: 05 Apr 2025

https://github.com/j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text. Supports English, Chinese, Japanese, etc., and even mixed languages.

android android-ime automatic-speech-recognition chinese-speech-recognition ime keyboard kotlin openai openai-api speech speech-recognition speech-to-text virtual-keyboard voice voice-recognition whisper

Last synced: 09 Apr 2025

https://github.com/hecomi/node-julius

Node.js module for voice recognition using Julius

julius node-js voice-recognition

Last synced: 07 Apr 2025

https://github.com/bunyaminergen/callytics

Callytics is an advanced call analytics solution that leverages speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.

denoising diarization forced-alignment llama3 llm openai opensource sentiment-analysis speech-emotion-recognition speech-processing speech-recognition speech-to-text summary topic-modeling transcription voice-activity-detection voice-recognition

Last synced: 03 Apr 2025

https://github.com/shamspias/chatgpt-voice-chatbot-telegram

ChatGPT Voice Chatbot Telegram is a Python and Flask-based GitHub repository that enables users to communicate with an AI chatbot using voice-to-text and text-to-voice technologies powered by OpenAI. The repository provides a flexible and customizable solution for building advanced voice-enabled chatbots using natural language processing.

celery chatbot chatgpt dall-e flask gpt-3 openjourney python telegram-bot telegram-voice-chat text-to-speech text-to-speech-python3 tts voice-chat voice-conversion voice-recognition voice-to-text whisper

Last synced: 04 Dec 2024

https://github.com/botbahlul/autosrt

A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file

auto-caption auto-subtitle captions ffmpeg google-translate-api python speech-recognition speechrecognition srt-subtitle subriptext subtitle voice-recognition voicerecognition

Last synced: 11 Apr 2025

https://github.com/felipefacundes/brasiltts

Brasil TTS is a set of Brazilian Portuguese speech synthesizers that read screens aloud for people with visual impairments. It converts text to audio, giving blind and low-vision users access to on-screen content. Although the main audience for text-to-speech systems such as Brasil TTS is people with visual impairments, this kind of program can also be used by people with dyslexia and other reading difficulties, people with severe speech impairments, and pre-literate children. Besides being an assistive technology tool, speech synthesizers can also have educational and entertainment applications.

accessibility brasil brasiltts sistema-de-fala text-to-speech tts voice voice-assistant voice-recognition voz

Last synced: 24 Mar 2025

https://github.com/vinitshahdeo/online-debate-system

🗣️ Uses the Google voice recognition API and sentiment analysis to classify speech as "For the motion" or "Against the motion" 💬 📢

debate-system php sentiment-analysis speech-to-text voice-recognition

Last synced: 10 Apr 2025

https://github.com/mobilequickie/amazonspeechtranslator

End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.

amazon-cognito amazon-polly amazon-translate aws-mobilehub aws-sdk-ios interpreter mobile-development speech-recognition speech-recognizer speech-synthesis speech-to-text swift4 text-to-speech translation translation-api voice-recognition youtube-video

Last synced: 13 Feb 2025

https://github.com/geoffsmith82/symposium2023

Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot. Source code and Documentation for my 2023 ADUG Symposium Talk.

ai artificial-intelligence claude-3-haiku claude-3-opus claude-3-sonnet computer-vision gpt gpt-4 gpt-4o image-to-text oauth2 palm palm2 speech-to-text text-to-image text-to-speech translation voice-recognition websockets

Last synced: 26 Feb 2025

https://github.com/iamaziz/llm-voice-bot

Speak (speech-to-text) to LLMs (Ollama) in any language - Streamlit app

llms ollama streamlit voice-bot voice-recognition

Last synced: 07 May 2025

https://github.com/madzadev/voice-cue

📣 Find sentiments, tags, entities, and actions in your voice recordings instantly

audio audio-analysis audio-processing speech speech-recognition speech-to-text transcript voice voice-recognition

Last synced: 22 Nov 2024