Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Whisper
Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder transformer trained on roughly 680,000 hours of multilingual, multitask supervised audio collected from the web. It can transcribe speech in many languages, translate speech into English, and identify the spoken language. OpenAI released the model code and weights as open source under the MIT license, so Whisper can be run locally or accessed through hosted APIs.
- GitHub: https://github.com/topics/whisper
- Repo: https://github.com/openai/whisper
- Created by: OpenAI
- Released: September 2022
- Related Topics: machine-learning, artificial-intelligence, language-modeling
- Last updated: 2025-02-11 00:33:31 UTC
https://github.com/roman01la/sub-deep
Transcribe and translate audio with AI
deepl transcribe translate whisper
Last synced: 30 Dec 2024
https://github.com/xaionaro-go/speech
A speech-to-text (with translation) library for Go; currently uses Whisper (runs locally if needed; no API keys required)
ai converter go golang library module package speech speech-recognition speech-to-text text whisper
Last synced: 13 Jan 2025
https://github.com/chinese-soup/cbot-telegram-whisper
Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.
bot cpu-inference golang openai speech-recognition speech-to-text whisper whisper-cpp whispercpp
Last synced: 17 Jan 2025
https://github.com/egorsmkv/star-adapt-uk
Fork of https://github.com/YUCHEN005/STAR-Adapt with some modifications for Ukrainian.
asr speech-recognition ukrainian whisper
Last synced: 19 Dec 2024
https://github.com/nri12/filter_voice
A project that filters and mutes chosen keywords in videos
Last synced: 19 Dec 2024
https://github.com/oov/aviutl_subtitler
A plugin for transcribing with Whisper in an AviUtl + Extended Editing environment
Last synced: 19 Dec 2024
https://github.com/canaxs/whisper-core
An application where users can make rumor-based news and earn money in return.
mysql panel spring spring-boot whisper
Last synced: 19 Dec 2024
https://github.com/breadrock1/audio-to-text
A simple backend project that uses whisper-rs.
actix-web audio-to-text rust swagger-ui whisper
Last synced: 10 Jan 2025
https://github.com/doctorpok42/pheere-app
Pheere is a simple virtual assistant
ai chatgpt desktop-app elevenlabs nextjs scss tauri ts virtual-assistant whisper
Last synced: 10 Jan 2025
https://github.com/tranbavinhson/eth-decentralized-chat
Decentralized chat app by Ethereum Whisper protocol + Vuejs
ethereum vue vuejs whisper whisper-protocol
Last synced: 26 Dec 2024
https://github.com/ayeshaaaaaaaaa/ai-powered-video-analysis-with-object-detection-and-detailed-scene-narratives
AI-driven video analysis system that extracts and transcribes audio with Whisper, detects objects using YOLO, and generates comprehensive scene descriptions with GPT-2. The project combines transcriptions and object detections to produce detailed, context-aware video narratives.
bart gpt2 video-analysis whisper yolov8
Last synced: 02 Jan 2025
https://github.com/fukuro-kun/wortweber
Wortweber is an open-source project under development that explores real-time speech transcription with AI technology. It serves as a learning and experimentation platform for speech recognition in German and English.
Last synced: 17 Jan 2025
https://github.com/becomingbabyman/eunoia-desktop
Local desktop transcription and search for Apple Voice Memos and videos
search second-brain transcription videos voice-memos whisper
Last synced: 25 Dec 2024
https://github.com/oussemabenhassena5/notegen-with-llama-and-whisper
AI-powered YouTube video notes generator
ai llama3 python whisper youtube-api
Last synced: 07 Feb 2025
https://github.com/valiantlynx/custom-whisper-api
This project provides a custom API wrapper for the open-source Whisper model using FastAPI. It allows you to integrate Whisper into your applications for automatic speech recognition (ASR) tasks.
ai docker-compose fastapi python whisper
Last synced: 10 Jan 2025
https://github.com/huuquyet/phowhisper-next
Demo of VinAI's PhoWhisper models, built with Transformers.js + Next.js
nextjs onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper
Last synced: 19 Dec 2024
https://github.com/bbc-esq/whisper-solo-with-gui
OpenAI's Whisper program with a simple lightweight GUI.
pyqt pyqt6 pyqt6-gui transcribe transcribe-audio-files translate whisper
Last synced: 11 Jan 2025
https://github.com/maylad31/colab-codes
Some useful Colab files
clip colab-notebook speech-recognition whisper zero-shot-classification
Last synced: 11 Jan 2025
https://github.com/malexandersalazar/casey
Casey is a Voice-Activated AI Companion for Mental Wellbeing & Content Creation #BuildWithAI
agentic-ai content-creation groq large-language-models python wellbeing whisper
Last synced: 11 Feb 2025
https://github.com/lazauk/aoai-entraidauth-sdkv1
Authenticating with Entra ID (former Azure AD) to access Azure OpenAI models in Python SDK v1.x
ai authentication azure azure-active-directory dall-e embeddings entra-id gpt openai whisper
Last synced: 12 Jan 2025
https://github.com/nicknaskida/insanely-fast-whisper
Incredibly fast Whisper-large-v3 with speaker diarization
diarization speaker-diarization transfromers whisper whisper-ai whisper-faster whisper-large
Last synced: 19 Jan 2025
https://github.com/tracywong117/ai-learning-material-from-video
Supports subtitling, translation, and RAG to generate language-learning material from video.
ai auto-subtitle gpt-translate groq groq-api rag subtitles-generator translate whisper
Last synced: 19 Jan 2025
https://github.com/thewh1teagle/whisper.zig
Transcribe audio with Whisper in Zig
Last synced: 24 Jan 2025
https://github.com/brentwong-kiel1997/ai_language_school_based_on_django_and_openai
Django and OpenAI API example use case
django gpt-4 openai openai-api whisper
Last synced: 08 Feb 2025
https://github.com/drankush/voxrad
VOXRAD is a voice transcription application for radiologists leveraging locally deployed ASR and LLM models.
desktop-app ffmpeg gemini gpt llm macos medical-informatics multimodal natural-language-processing nlp openai openai-api productivity python radiology reporting transcription voice-recognition whisper windows
Last synced: 31 Jan 2025
https://github.com/stnderror/robotron
🤖 A personal robot assistant for Telegram
assistant bot dall-e gpt-35-turbo openai telegram-bot whisper
Last synced: 25 Jan 2025
https://github.com/mikeesto/whispercpp-android
An Android app using whisper.cpp to do voice-to-text transcriptions
android kotlin speech-to-text whisper whisper-cpp
Last synced: 09 Feb 2025
https://github.com/aitor-alvarez/large-speech-models
Fine-tuning Multilingual Large Speech Recognition Models: Wav2vec and Whisper
arabic-speech-recognition asr asr-model finetuning-wav2vec finetuning-whisper large-speech-models speech-recognition-model wav2vec2 whisper
Last synced: 25 Jan 2025
https://github.com/winstxnhdw/capgen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
asr automatic-speech-recognition caddy ctranslate2 docker fastapi huggingface huggingface-spaces uvicorn-gunicorn whisper
Last synced: 23 Oct 2024
https://github.com/jowadev/interview
Interview is an interactive application crafted to empower both students and professionals in honing their skills for job interviews.
interview-preparation job-interviews nextjs professional students whisper
Last synced: 07 Feb 2025
https://github.com/marquesafonso/multilang-asr-captioner
A multilingual automatic speech recognition and video captioning tool using faster-whisper. Supports real-time translation to English. Runs on consumer-grade CPUs.
automatic-speech-recognition captioning-videos faster-whisper whisper
Last synced: 24 Oct 2024
https://github.com/Op27/meeting_minutes_generator
This Python application automates the process of generating meeting minutes from an audio recording. It uses the Whisper library for transcription and the OpenAI GPT models for summarizing content, then outputs the result in a Word document.
ai audio-processing document-automation meeting-minutes openai python speech-recognition text-summarization transcription whisper
Last synced: 24 Oct 2024
https://github.com/nerdimite/meetsy-app
Frontend for the Workshop on Building an End-to-End AI Meeting Assistant
gpt-3 nextjs sentence-transformers tailwindcss whisper
Last synced: 24 Oct 2024
https://github.com/juanestban/whisper-tnode
cli ts typescript whisper whisper-cpp whisper-ia whisper-node whisper-node-ts
Last synced: 21 Dec 2024
https://github.com/sumitesh9/localizedwhisper
An initiative to make OpenAI Whisper more localized by adding support for more languages.
albanian albanian-language huggingface openai speech speech-to-text whisper
Last synced: 02 Jan 2025
https://github.com/gabriellopesdesouza2002/funcspy
Functions to help you develop any program or script you want
automation chatbot dall-e email email-library ocr openai-api openai-chatgpt openai-whisper pdf pdf-tools python regex selenium selenium-webdriver whisper
Last synced: 30 Oct 2024
https://github.com/jgw96/speech-to-text-web-toolkit
Making Speech-To-Text on the web easy, both local and in the cloud
ai lit transformersjs webcomponents whisper
Last synced: 01 Feb 2025
https://github.com/cris-m/langgraph_examples
duckduckgo kokoro langgraph llama3-2 whisper
Last synced: 18 Jan 2025
https://github.com/jojasadventure/whisper-client
A very simple Python-based client for Whisper-compatible endpoints
desktop-app dictation faster-whisper macos productivity python speech-to-text stt whisper
Last synced: 08 Feb 2025
https://github.com/chaoticbyte/audio-summarize
An audio summarizer (faster-whisper and BART glued together)
ai ai-summarizer audio bart ctranslate2 faster-whisper nlp speech-to-text summarization whisper
Last synced: 08 Feb 2025
https://github.com/team-mansumugang/mansumugang-backend
The Spring Boot application for the Mansumugang service.
aws github-actions jpa jpa-hibernate spring-boot whisper
Last synced: 08 Feb 2025
https://github.com/ivanrj7j/transcription
This project transcribes audio using whisper and provides an api
ai api flask transcription whisper
Last synced: 08 Feb 2025
https://github.com/aeronjl/transcribe
Python package for accurate audio transcription with speaker diarisation
audio-transcription gpt speaker-diarization whisper
Last synced: 08 Feb 2025
https://github.com/niqifan007/openai-tts-stt-streamlit
A GUI, built with Streamlit, for the OpenAI API's TTS (text-to-speech) and STT (speech-to-text) endpoints, with a history feature
openai openai-api streamlit stt-gui tts tts-gui whisper whisper-api
Last synced: 08 Feb 2025
https://github.com/shtirmann/v2t
Telegram bot which automatically transcribes all voice and video messages to text.
ai aiogram faster-whisper python telegram-bot telegram-bot-python voice-to-text whisper
Last synced: 10 Feb 2025
https://github.com/mickekring/top-of-mind-clara
Clara is a prototype that lets employees make their voices heard anonymously. An employee can speak or type what they want to say, and AI anonymizes it. The employee also has access to a chatbot to consult. All employees' thoughts are then analyzed and compiled in a dashboard.
ai chatbot feedback openai python streamlit transcription whisper
Last synced: 22 Dec 2024
https://github.com/rhysdg/whisper-onnx-python
A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph
ai chatbot cuda machine-learning onnxruntime speech-to-text whisper
Last synced: 08 Feb 2025
https://github.com/natanielf/lecsum
Automatically transcribe and summarize lecture recordings completely on-device using AI
ollama ollama-python whisper whisper-ai
Last synced: 11 Feb 2025
https://github.com/seanvelasco/ai
Cloudflare AI challenge submission: Slater - your virtual foreign language friend
ai artificial-intelligence language-learning llama2 llm m2m100 machine-learning whisper
Last synced: 03 Feb 2025
https://github.com/xawos/owt
🦙🗣️ Ollama and Whisper Telegram bot, with advanced configuration
ai-bots local-ai ollama telegram-aichatbot telegram-bots whisper
Last synced: 28 Jan 2025
https://github.com/jpzinn654/speaker-diarization-portuguese
This project implements speaker diarization for Portuguese audio using WhisperX for transcription and pyannote.audio's Speaker-Diarization 3.1 for speaker separation. It includes a Flask UI for easy file upload, transcription, and speaker identification.
flask gender-detection portuguese-language speaker-diarization speaker-recognition speech-recognition transcription whisper
Last synced: 28 Jan 2025
https://github.com/carlosulisesochoa/whisper-ai-transcription-audio-to-text-file
A Python tool that uses OpenAI's Whisper model to batch transcribe audio files with GPU acceleration. Features include multi-language support, timestamp-based output, automatic file status checking, and CUDA support for faster processing. Perfect for transcribing lectures, interviews, or any audio content with high accuracy.
ai audio-to-text transcription whisper
Last synced: 28 Jan 2025
https://github.com/LarissaGuder/whisper-datastream
Transcription and NER in streaming environment
bert-ner python spark-streaming whisper
Last synced: 24 Oct 2024
https://github.com/mottla/speech-to-text
Local and fast speech-to-text (STT) with speaker recognition. Transcribe your meetings confidentially.
huggingface speech-recognition stt teams transcription translation whisper zoom
Last synced: 21 Jan 2025
https://github.com/same-ou/whisper-speech-recognition
This repository contains a deployment of the Whisper speech recognition model using Flask and Python. Whisper is a cutting-edge speech recognition model designed to accurately transcribe speech input into text.
deep-learning flask machine-learning openai python pytorch whisper
Last synced: 01 Jan 2025
https://github.com/baomeomeo/speech
A speech-to-text (with translation) library for Go; currently uses Whisper (runs locally if needed; no API keys required)
ai converter go golang library module package speech speech-recognition speech-to-text text whisper
Last synced: 13 Jan 2025
https://github.com/yuxiang32/Audio-Transcription
Audio transcriber using OpenAI Whisper
Last synced: 24 Oct 2024
https://github.com/mdbecker/whisper_cpp_macos_utils
Automated transcription workflow for macOS: Shell scripts to streamline audio recording, conversion, and transcription using whisper.cpp with macOS utilities like QuickTime Player and BlackHole-2ch.
audio-processing openai shell-scripts speech-to-text transcription whisper whisper-cpp
Last synced: 29 Jan 2025
https://github.com/xi-rick/captains-log
Captain's Log is your personal AI-powered voice transcription logbook. This innovative web application allows you to transcribe spoken words into text, organize your thoughts, and manage important notes. Built with cutting-edge technology and creative design, Captain's Log sets sail to revolutionize how you capture and manage ideas.
audio-recorder audio-visualizer javascript mongodb mongodb-atlas nextjs once-ui openai react reactjs shadcn-ui tailwindcss typescript voice whisper
Last synced: 21 Jan 2025
https://github.com/rishabhmathur06/fine-tuning-whisper-small-for-asr-
This repository contains a notebook that shows how to fine-tune OpenAI's Whisper model on a custom Hindi dataset.
artificial-intelligence asr automatic-speech-recognition fine-tuning openai python whisper whisper-model
Last synced: 19 Dec 2024
https://github.com/akhkim/babel
Real-time internal audio translator and transcriber that uses the Whisper model
ai internal-audio real-time transcription translation whisper
Last synced: 19 Dec 2024
https://github.com/bilalhameed248/whisper-fine-tuning-for-pronunciation-learning
Fine Tuning of Whisper Speech To Text Base Model For Pronunciation Learning
deep-learning deep-neural-networks dnn fine-tuning openai pronunciation python seq2seq speech speech-recognition speech-synthesis speech-to-text whisper whisper-ai
Last synced: 16 Jan 2025
https://github.com/youknow2509/real-time-speech-to-text
Speech To Text in Real-Time
blackhole speech-recognition speech-to-text whisper whisper-api
Last synced: 19 Dec 2024
https://github.com/mariatepei/vt_thesis_mtepei
This repository accompanies my MSc Thesis for the degree Voice Technology, storing all referenced data and other relevant resources.
data-augmentation fastspeech2 speech-recognition whisper
Last synced: 10 Feb 2025
https://github.com/heyfoz/python-openai-whisper
This Python script provides a simple interface to transcribe audio files using the OpenAI API's speech-to-text functionality, powered by the Whisper model. The result is returned to the console as text or VTT (WebVTT) format.
ai api audio-transcription openai python speech-to-text whisper
Last synced: 19 Dec 2024
https://github.com/geo-y20/enhanced-learning-experience
IntelliLearn is a FastAPI-based application designed to process and transcribe audio and video files into text using the Whisper model. The application also supports processing PDF files to extract and summarize their content.
chat-application chatgpt educational-project fastapi groq-api huggingface lama llm pdf-files platform python speech-to-text text-summarization transformer whisper word2vec wordembedding
Last synced: 19 Dec 2024
https://github.com/tristan-mcinnis/simultaneous-interpretation
Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.
agents asr faster-whisper openai pyaudio simultaneous-intepreting simultaneous-translation speech-recognition speech-to-text transcription translation whisper
Last synced: 17 Jan 2025
https://github.com/doctorpok42/pheere
Pheere is a simple virtual assistant
ai chatgpt elevenlabs ts virtual-assistant whisper
Last synced: 10 Jan 2025
https://github.com/status-im/infra-role-status-go
Ansible role for status-go
ansible-role infra waku whisper
Last synced: 05 Jan 2025
https://github.com/deepbiolab/customer-complaint-classification
A GenAI-powered pipeline leveraging Whisper, DALL-E, and GPT to transform customer complaints into actionable insights with automated transcription, visualization, and classification.
Last synced: 23 Jan 2025
https://github.com/fatma-moanes/voice-assistant
Voice Assistant for FM-Clinic: A multilingual AI-powered voice assistant for booking doctor appointments, leveraging advanced speech-to-text, text-to-speech, and large language models for seamless, natural user interactions.
ai-assistant arabic arabic-nlp aws-polly chatbot gpt groq langchain langsmith llm mongodb multilingual openai speech-recognition speech-to-text streamlit text-to-speech transcription voice-assistant whisper
Last synced: 26 Dec 2024
https://github.com/levysantiago/upload-ai
This is a system that uses OpenAI's Whisper and ChatGPT to generate titles and descriptions from the analysis of submitted videos.
ai artificial-intelligence axios chatgpt fastify ffmpeg nlw-13 node openai prisma react rocketseat tailwindcss typescript vite whisper zod
Last synced: 12 Jan 2025
https://github.com/yankeexe/tiktok-summarizer
Ask questions about a TikTok video
ai function-calling llm llm-tool-call mini-app ollama pytorch seq2seq streamlit tiktok tool-calling transformers whisper
Last synced: 02 Jan 2025
https://github.com/fkiller/whispertranscript
Transcribe voice from mic input using OpenAI Whisper API.
llm openai transcribe transcript transcription webaudio whisper
Last synced: 06 Jan 2025
https://github.com/nazago/meeting-minutes-generator
A script that takes a .wav audio file, performs speech-to-text using OpenAI's Whisper, and then uses Llama3 to generate a summary and action points from the transcript
langchain-python llm-inference local-inference meeting-minutes ollama speech-to-text summarization whisper
Last synced: 02 Jan 2025
https://github.com/aitor-alvarez/whisper-lightning-finetuning
Whisper fine-tuning using Lightning
acoustic-features acoustic-model speech-recognition torch-lightning whisper
Last synced: 02 Jan 2025
https://github.com/asai95/speech-recognition-api
Simple but extensible API for Speech Recognition.
Last synced: 02 Jan 2025
https://github.com/luluw8071/whisper-tune
Finetuning Whisper on your own voice
Last synced: 07 Feb 2025
https://github.com/sugarcane-mk/whisper
This repository provides a Python script for extracting speech embeddings using OpenAI's Whisper model. The embeddings are high-dimensional feature vectors that capture the acoustic properties of the input audio. These embeddings can be used for downstream tasks such as speech classification, clustering, and speaker recognition.
asr classification feature-extraction openai speech-processing speech-recognition speech-to-text svm-classifier whisper
Last synced: 02 Jan 2025
https://github.com/yjg30737/pyqt-simple-whisper-gui
Whisper text-to-speech, speech-to-text example in PyQt5 GUI
openai pyqt pyqt-ai pyqt5 pyqt5-desktop-application pyqt5-examples pyqt5-gui whisper
Last synced: 03 Jan 2025
https://github.com/mrbuslov/reminder_4u_bot
AI Telegram Bot Reminder. You send a free-form text OR voice reminder, the AI bot records it and reminds you at the right time!
ai ai-bot aiogram chatgpt django gpt-3 gpt-4 gpt-models python reminder telegram-bot voice-recognition whisper
Last synced: 10 Jan 2025
https://github.com/zdwolfe/transcription-tools
Docker video transcriber, wrapper around OpenAI
openai transcription whisper whisper-ai
Last synced: 02 Jan 2025
https://github.com/studiowebux/tommygotchi
whisper, piper, llama-gpt, python, fun .. so much fun !
llama-gpt piper python3 whisper whisper-ai
Last synced: 05 Jan 2025
https://github.com/andreykolomiets/local_speech_translator
Near-real-time speech translation on Apple Silicon laptops with CoreML enabled. It needs no APIs; all work is done locally using OpenAI's Whisper model. The https://github.com/ggerganov/whisper.cpp repo is used to build Whisper with CoreML support, which improves speed significantly.
coreml dash fastapi hebrew hebrew-english whisper whisper-cpp
Last synced: 07 Feb 2025
https://github.com/zakariaf/whisperwave
AI-powered audio transcription app using OpenAI Whisper, Flask, and Vue. Upload .wav files, select a language, and get accurate transcriptions. Fully Dockerized!
audio-processing docker flask python speech-to-text transcription vite vue whisper
Last synced: 07 Feb 2025
https://github.com/theaussiepom/wyoming-openai
OpenAI STT and TTS support for the Wyoming protocol
home-assistant home-assistant-assist openai sst tts whisper wyoming
Last synced: 21 Dec 2024
https://github.com/philogicae/docker-faster-whisper-fr-api
Docker - Faster Whisper FR - RunPod Serverless API
ctranslate2 docker faster-whisper french runpod serverless whisper
Last synced: 08 Jan 2025
https://github.com/deshwalmahesh/whisper-fastapi-realtime
A frontend + backend app that runs openai/whisper-large-v3-turbo on consumer-grade hardware to provide live audio transcription
audio-transcription fastapi huggingface live pyaudio realtime transcription transformers whisper whisper-large
Last synced: 25 Oct 2024
https://github.com/mickekring/top-of-mind-beromfabriken
Giving praise to a colleague can feel a bit embarrassing, but research has shown that it can make us feel better at work and even make us more productive. Simply put, hearing that colleagues value and notice you improves your well-being.
api gpt openai python transcription whisper
Last synced: 16 Jan 2025
https://github.com/cp3249/athena_project
Athena is an AI assistant project that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversational and interactive experience. It uses LLMs available through Ollama and provides a basic framework for extending functionalities through a modular tool system.
Last synced: 15 Jan 2025