Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Whisper
Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder transformer trained on a large corpus of multilingual audio paired with transcripts. It can transcribe speech in many languages, translate speech into English, and identify the spoken language. OpenAI released the code and model weights as open source to encourage research in speech processing, so the model is publicly available and is widely used as the speech-to-text component of voice assistants, subtitle generators, and transcription tools.
- GitHub: https://github.com/topics/whisper
- Repo: https://github.com/openai/whisper
- Created by: OpenAI
- Released: September 2022
- Related Topics: machine-learning, artificial-intelligence, language-modeling
- Last updated: 2025-02-11 00:33:31 UTC
https://github.com/heyfoz/python-openai-whisper
This Python script provides a simple interface to transcribe audio files using the OpenAI API's speech-to-text functionality, powered by the Whisper model. The result is returned to the console as text or VTT (WebVTT) format.
ai api audio-transcription openai python speech-to-text whisper
Last synced: 19 Dec 2024
https://github.com/geo-y20/enhanced-learning-experience
IntelliLearn is a FastAPI-based application designed to process and transcribe audio and video files into text using the Whisper model. The application also supports processing PDF files to extract and summarize their content.
chat-application chatgpt educational-project fastapi groq-api huggingface lama llm pdf-files platform python speech-to-text text-summarization transformer whisper word2vec wordembedding
Last synced: 19 Dec 2024
https://github.com/doctorpok42/pheere
Pheere is a simple virtual assistant
ai chatgpt elevenlabs ts virtual-assistant whisper
Last synced: 10 Jan 2025
https://github.com/deepbiolab/customer-complaint-classification
A GenAI-powered pipeline leveraging Whisper, DALL-E, and GPT to transform customer complaints into actionable insights with automated transcription, visualization, and classification.
Last synced: 23 Jan 2025
https://github.com/fatma-moanes/voice-assistant
Voice Assistant for FM-Clinic: A multilingual AI-powered voice assistant for booking doctor appointments, leveraging advanced speech-to-text, text-to-speech, and large language models for seamless, natural user interactions.
ai-assistant arabic arabic-nlp aws-polly chatbot gpt groq langchain langsmith llm mongodb multilingual openai speech-recognition speech-to-text streamlit text-to-speech transcription voice-assistant whisper
Last synced: 26 Dec 2024
https://github.com/yankeexe/tiktok-summarizer
Ask questions about a TikTok video
ai function-calling llm llm-tool-call mini-app ollama pytorch seq2seq streamlit tiktok tool-calling transformers whisper
Last synced: 02 Jan 2025
https://github.com/nazago/meeting-minutes-generator
Script that takes a .wav audio file, performs speech-to-text using OpenAI's Whisper, and then uses Llama3 to generate a summary and action points from the transcript.
langchain-python llm-inference local-inference meeting-minutes ollama speech-to-text summarization whisper
Last synced: 02 Jan 2025
https://github.com/aitor-alvarez/whisper-lightning-finetuning
Whisper fine-tuning using Lightning
acoustic-features acoustic-model speech-recognition torch-lightning whisper
Last synced: 02 Jan 2025
https://github.com/asai95/speech-recognition-api
Simple but extensible API for Speech Recognition.
Last synced: 02 Jan 2025
https://github.com/sugarcane-mk/whisper
This repository provides a Python script for extracting speech embeddings using OpenAI's Whisper model. The embeddings are high-dimensional feature vectors that capture the acoustic properties of the input audio. These embeddings can be used for downstream tasks such as speech classification, clustering, and speaker recognition.
asr classification feature-extraction openai speech-processing speech-recognition speech-to-text svm-classifier whisper
Last synced: 02 Jan 2025
https://github.com/yjg30737/pyqt-simple-whisper-gui
Whisper text-to-speech, speech-to-text example in PyQt5 GUI
openai pyqt pyqt-ai pyqt5 pyqt5-desktop-application pyqt5-examples pyqt5-gui whisper
Last synced: 03 Jan 2025
https://github.com/mrbuslov/reminder_4u_bot
AI Telegram Bot Reminder. Send a free-form text or voice reminder; the AI bot records it and reminds you at the right time!
ai ai-bot aiogram chatgpt django gpt-3 gpt-4 gpt-models python reminder telegram-bot voice-recognition whisper
Last synced: 10 Jan 2025
https://github.com/andreykolomiets/local_speech_translator
Near-real-time speech translation on Apple Silicon laptops with CoreML enabled. Doesn't need any APIs; all work is done locally using OpenAI's excellent Whisper model. The https://github.com/ggerganov/whisper.cpp repo is also used to build Whisper with CoreML support, which improves speed significantly.
coreml dash fastapi hebrew hebrew-english whisper whisper-cpp
Last synced: 07 Feb 2025
https://github.com/zakariaf/whisperwave
AI-powered audio transcription app using OpenAI Whisper, Flask, and Vue. Upload .wav files, select a language, and get accurate transcriptions. Fully Dockerized!
audio-processing docker flask python speech-to-text transcription vite vue whisper
Last synced: 07 Feb 2025
https://github.com/werserk/techstormhack-1st-place
Solution for the TechStorm competition held by the Tatneft corporation, analyzing team members' activity in video conference calls.
pyannote speaker-diarization speech-recognition streamlit whisper
Last synced: 11 Jan 2025
https://github.com/charlot-dedjinou/hackathon-ia-multimodal-multilingue
During this hackathon, we developed Smart VT, an AI-based web application designed to subtitle and dub any video from one language into another of your choice. The project relies on a React frontend, Python APIs for video processing, and Node.js for managing video subtitles.
api dubble fastapi ffmpeg googletranslator mongodb moviepy nodejs openia reactjs subtitles whisper
Last synced: 12 Jan 2025
https://github.com/waikato-llm/whisper
Docker images for the whisper audio transcription library and variants.
Last synced: 12 Jan 2025
https://github.com/lohiyah/vidcraft
VidCraft is an AI-driven backend application that generates videos from user-defined topics and backgrounds. It combines text, audio, and visuals using advanced AI services, making video creation accessible and efficient for developers and content creators alike.
ai elevenlabs fastapi ffmpgeg full-stack-web-development gemini-ai huggingface image-generation machine-learning reactjs subtitles text-to-speech typescript video-generation whisper
Last synced: 18 Jan 2025
https://github.com/velocitatem/dontlectureme
A program that pays attention to your lectures for you.
ai lectures university whisper
Last synced: 30 Jan 2025
https://github.com/electroneum/electroneum-web3.js
Electroneum SmartChain JavaScript API
api electroneum ethereum etn-sc javascript swarm typescript whisper
Last synced: 19 Jan 2025
https://github.com/isladot/speech-to-text-whisper
A speech-to-text converter powered by OpenAI's Whisper model. Easy-to-use tool for transcribing audio into text with high accuracy.
ai python s2t speech-to-text whisper
Last synced: 19 Jan 2025
https://github.com/nanext21/vidcraft
VidCraft is an AI-driven backend application that generates videos from user-defined topics and backgrounds. It combines text, audio, and visuals using advanced AI services, making video creation accessible and efficient for developers and content creators alike.
elevenlabs fastapi ffmpgeg full-stack-web-development gemini-ai github-config image-generation machine-learning mern-project subtitles typescript video-generation whisper whisper-ai
Last synced: 19 Jan 2025
https://github.com/jt-427/whisper-ui
A minimalist and elegant UI for OpenAI's Whisper speech-to-text model, built with React + Vite and Flask
flask openai react speech-to-text transcription vite whisper
Last synced: 19 Jan 2025
https://github.com/armaggheddon/whisper2me
whisper2me is a Telegram bot written with pyTelegramBotAPI that uses OpenAI's Whisper to perform speech-to-text so you no longer have to listen to voice messages 🤫🔇
docker openia pytelegrambotapi python whisper
Last synced: 25 Jan 2025
https://github.com/luizcalaca/transcricao-medica
Full Stack + Whisper Transcription + Node.js REST API + VITE + React.js + Railway deploy
full-stack nodejs openai openai-api railway reactjs sequelize sequelize-orm vite whisper whisper-ai
Last synced: 25 Jan 2025
https://github.com/antosser/whisper-ui-web
Web App for interacting with the OpenAI Whisper API visually, written in Svelte
app english svelte text voice voice-recognition voice-to-text web whisper
Last synced: 07 Feb 2025
https://github.com/umlx5h/llplayer
The media player for language learning, with dual subtitles, AI-generated subtitles, realtime-OCR, translation, word lookup, and more!
asr csharp flyleaf language-learning media-player ocr player tesseract video video-player whisper wpf yt-dlp
Last synced: 01 Feb 2025
https://github.com/huuquyet/phowhisper-tiny
Converted clone of PhoWhisper: Automatic Speech Recognition for Vietnamese (2024)
onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper
Last synced: 01 Feb 2025
https://github.com/diegoseg15/ia-tesis-backend
Thesis project - DORIS Robot Assistant - Frontend
artificial-intelligence express gpt nodejs openai tts whisper
Last synced: 08 Feb 2025
https://github.com/dheison0/subcreator
A subtitle creator, translator, and embedder tool made using AI
ai machine-learning ml python subtitles video-processing whisper
Last synced: 08 Feb 2025
https://github.com/tristan-mcinnis/Simultaneous-Interpretation
Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.
agents asr faster-whisper openai pyaudio simultaneous-intepreting simultaneous-translation speech-recognition speech-to-text transcription translation whisper
Last synced: 08 Feb 2025
https://github.com/yousofss/speechtotext
Speech-to-Text using OpenAI's Whisper model
audio-to-text openai openai-whisper speech-to-text transcription whisper whisper-ai
Last synced: 08 Feb 2025
https://github.com/man2dev/whisper-cpp
dev fork of https://src.fedoraproject.org/rpms/whisper-cpp
fedora fedora-repository linux whisper whisper-cpp whispercpp
Last synced: 10 Feb 2025