Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Whisper
Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder transformer trained on a large corpus of multilingual audio paired with transcripts, and it can transcribe speech, translate speech into English, and identify the spoken language. OpenAI has released the model weights and inference code as open source, and many speech-to-text tools, subtitle generators, and voice assistants build on it.
- GitHub: https://github.com/topics/whisper
- Repo: https://github.com/openai/whisper
- Created by: OpenAI
- Released: September 2022
- Related Topics: machine-learning, artificial-intelligence, language-modeling
- Last updated: 2025-02-04 00:30:59 UTC
https://github.com/chaoticbyte/audio-summarize
An audio summarizer (faster-whisper and BART glued together)
ai ai-summarizer audio bart ctranslate2 faster-whisper nlp speech-to-text summarization whisper
Last synced: 09 Oct 2024
https://github.com/tranbavinhson/eth-decentralized-chat
Decentralized chat app by Ethereum Whisper protocol + Vuejs
ethereum vue vuejs whisper whisper-protocol
Last synced: 26 Dec 2024
https://github.com/topdev0215/AudioMultifunctionChatbot
This app lets users record or upload audio files, then uses the OpenAI API (Whisper, GPT-4) to generate transcriptions, summaries, fact checks, sentiment analysis, and text metrics. Users can also chat about their transcriptions with a GPT-4 chatbot. Data is stored relationally in SQLite and vectorized in Pinecone.
gpt4 langcha nltk openai python3 sqlite3 streamlit strean whisper
Last synced: 24 Oct 2024
https://github.com/ayeshaaaaaaaaa/ai-powered-video-analysis-with-object-detection-and-detailed-scene-narratives
AI-driven video analysis system that extracts and transcribes audio with Whisper, detects objects using YOLO, and generates comprehensive scene descriptions with GPT-2. The project combines transcriptions and object detections to produce detailed, context-aware video narratives.
bart gpt2 video-analysis whisper yolov8
Last synced: 02 Jan 2025
https://github.com/jesse-c/local-audio-toolkit
Some handy tools for working with audio locally.
large-language-models lm-studio macos side-project whisper
Last synced: 29 Jan 2025
https://github.com/fukuro-kun/wortweber
Wortweber is an open-source project under development that explores real-time speech transcription with AI technology. It serves as a learning and experimentation platform for speech recognition in German and English.
Last synced: 17 Jan 2025
https://github.com/aws-samples/amazon-ivs-webgpu-captions-demo
This repository contains an experimental demo application that shows how you can add client-side auto-generated captions to Amazon IVS Real-time and Low-latency streams using transformers.js and WebGPU.
ai amazon-ivs aws captions experimental ivs-lowlatency ivs-realtime lambda lowlatency lvl-300 realtime serverless transformersjs web webgpu webrtc whisper
Last synced: 09 Oct 2024
https://github.com/xaionaro-go/speech
A Speech-To-Text (with translation) library for Go; currently uses Whisper (runs locally if needed; no API keys required)
ai converter go golang library module package speech speech-recognition speech-to-text text whisper
Last synced: 13 Jan 2025
https://github.com/marquesafonso/multilang-asr-captioner
A multilingual automatic speech recognition and video captioning tool using faster-whisper. Supports real-time translation to English. Runs on consumer-grade CPUs.
automatic-speech-recognition captioning-videos faster-whisper whisper
Last synced: 24 Oct 2024
https://github.com/valiantlynx/custom-whisper-api
This project provides a custom API wrapper for the open-source Whisper model using FastAPI. It allows you to integrate Whisper into your applications for automatic speech recognition (ASR) tasks.
ai docker-compose fastapi python whisper
Last synced: 10 Jan 2025
https://github.com/antoniosbarotsis/telegram-transcriber
A Telegram bot for transcribing voice messages
telegram transcribe voice whisper
Last synced: 26 Dec 2024
https://github.com/Op27/meeting_minutes_generator
This Python application automates the process of generating meeting minutes from an audio recording. It uses the Whisper library for transcription and the OpenAI GPT models for summarizing content, then outputs the result in a Word document.
ai audio-processing document-automation meeting-minutes openai python speech-recognition text-summarization transcription whisper
Last synced: 24 Oct 2024
https://github.com/bbc-esq/whisper-solo-with-gui
OpenAI's Whisper program with a simple lightweight GUI.
pyqt pyqt6 pyqt6-gui transcribe transcribe-audio-files translate whisper
Last synced: 11 Jan 2025
https://github.com/maylad31/colab-codes
some useful colab files
clip colab-notebook speech-recognition whisper zero-shot-classification
Last synced: 11 Jan 2025
https://github.com/etienneab3d/srt-sync
Synchronize SRT timestamps over an existing accurate transcription
aligner asr nlp subtitles text-to-speech whisper
Last synced: 19 Dec 2024
https://github.com/TranBaVinhSon/eth-decentralized-chat
Decentralized chat app by Ethereum Whisper protocol + Vuejs
ethereum vue vuejs whisper whisper-protocol
Last synced: 24 Oct 2024
https://github.com/becomingbabyman/eunoia-desktop
local desktop transcription and search for apple voice memos and videos
search second-brain transcription videos voice-memos whisper
Last synced: 25 Dec 2024
https://github.com/lazauk/aoai-entraidauth-sdkv1
Authenticating with Entra ID (former Azure AD) to access Azure OpenAI models in Python SDK v1.x
ai authentication azure azure-active-directory dall-e embeddings entra-id gpt openai whisper
Last synced: 12 Jan 2025
https://github.com/nerdimite/meetsy-app
Frontend for the Workshop on Building an End-to-End AI Meeting Assistant
gpt-3 nextjs sentence-transformers tailwindcss whisper
Last synced: 24 Oct 2024
https://github.com/jlcarveth/skreech
An HTTP API wrapper around Whisper for transcribing audio files.
speech-recognition speech-to-text whisper whisper-ai
Last synced: 19 Jan 2025
https://github.com/ahmetoner/master-whisper
Master Whisper transcription with CTranslate2
deep-learning inference openai quantization speech-recognition speech-to-text transformer whisper
Last synced: 08 Jan 2025
https://github.com/nicknaskida/insanely-fast-whisper
Incredibly fast Whisper-large-v3 with speaker diarization
diarization speaker-diarization transfromers whisper whisper-ai whisper-faster whisper-large
Last synced: 19 Jan 2025
https://github.com/bhattbhavesh91/openai-whisper-benchmarking
Comparing the performance of OpenAI's Whisper model on a GPU vs OpenAI's API
gpu openai speech-to-text whisper
Last synced: 16 Nov 2024
https://github.com/madh93/whisper
🎙️ My Whisper stuff
docker openai speech-recognition speech-to-text whisper whisper-cpp
Last synced: 29 Jan 2025
https://github.com/mickekring/top-of-mind-clara
Clara is a prototype that lets employees make their voices heard anonymously. An employee can speak or type what they want to say, and AI anonymizes it. The employee also has access to a chatbot to consult. All employees' thoughts are then analyzed and compiled in a dashboard.
ai chatbot feedback openai python streamlit transcription whisper
Last synced: 22 Dec 2024
https://github.com/tracywong117/ai-learning-material-from-video
Support subtitling, translating, RAG to generate language learning material from video.
ai auto-subtitle gpt-translate groq groq-api rag subtitles-generator translate whisper
Last synced: 19 Jan 2025
https://github.com/thewh1teagle/whisper.zig
Transcribe audio with whisper in zig
Last synced: 24 Jan 2025
https://github.com/maawad/luna
Personal assistant
bot openai personal-assistant whisper
Last synced: 17 Dec 2024
https://github.com/huuquyet/phowhisper-next
Demo using PhoWhisper models of VinAI built with Transformers.js + Next.js
nextjs onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper
Last synced: 19 Dec 2024
https://github.com/h3yn3s/tl-dl
A self-hostable web app which helps you read those uselessly long (by nature) voice messages with the power of AI.
Last synced: 24 Oct 2024
https://github.com/Shtirmann/V2T
Telegram bot which automatically transcribes all voice and video messages to text.
ai aiogram faster-whisper python telegram-bot telegram-bot-python voice-to-text whisper
Last synced: 24 Oct 2024
https://github.com/sbadulin/obsidian-dictation-plugin
Obsidian dictation plugin
dictation gpt-35-turbo obsidian obsidian-plugin openai speech-to-text whisper
Last synced: 02 Feb 2025
https://github.com/jojasadventure/whisper-client
Very simple Python based client for Whisper compatible endpoint
desktop-app dictation faster-whisper macos productivity python speech-to-text stt whisper
Last synced: 09 Oct 2024
https://github.com/kunesj/holo-subs-search
Tool for searching transcriptions of vtuber videos.
holodex pyannote transcription vtuber whisper youtube
Last synced: 19 Jan 2025
https://github.com/toomore/whisper
🔐📦📜🔑🍞 Write some notes by using the GPG encrypts.
gpg notes pgp quickstart whisper
Last synced: 23 Jan 2025
https://github.com/team-mansumugang/mansumugang-backend
The Spring Boot application for the Mansumugang service.
aws github-actions jpa jpa-hibernate spring-boot whisper
Last synced: 09 Oct 2024
https://github.com/rokbenko/arctic-meet
ArcticMeet is an AI meeting assistant using Streamlit for the GUI and the Snowflake Arctic LLM via the Snowflake Cortex for the AI features
ffmpeg pandas plotly python pytorch snowflake snowflake-arctic snowflake-cortex snowpark streamlit transformers whisper
Last synced: 11 Jan 2025
https://github.com/drankush/voxrad
VOXRAD is a voice transcription application for radiologists leveraging locally deployed ASR and LLM models.
desktop-app ffmpeg gemini gpt llm macos medical-informatics multimodal natural-language-processing nlp openai openai-api productivity python radiology reporting transcription voice-recognition whisper windows
Last synced: 31 Jan 2025
https://github.com/stnderror/robotron
🤖 A personal robot assistant for Telegram
assistant bot dall-e gpt-35-turbo openai telegram-bot whisper
Last synced: 25 Jan 2025
https://github.com/szilvia-csernus/openai-audio-api-calls
Speech-to-text and text-to-speech API call examples, using OpenAI's whisper-1 and tts-1 models.
jupyter-notebook openai openai-api tts-1 whisper
Last synced: 09 Oct 2024
https://github.com/abdnh/anki-asr
Anki add-on for speech recognition
anki anki-addon deepgram speech-recognition whisper
Last synced: 24 Nov 2024
https://github.com/aitor-alvarez/large-speech-models
Fine-tuning Multilingual Large Speech Recognition Models: Wav2vec and Whisper
arabic-speech-recognition asr asr-model finetuning-wav2vec finetuning-whisper large-speech-models speech-recognition-model wav2vec2 whisper
Last synced: 25 Jan 2025
https://github.com/xawos/owt
🦙🗣️ Ollama and Whisper Telegram bot, with advanced configuration
ai-bots local-ai ollama telegram-aichatbot telegram-bots whisper
Last synced: 28 Jan 2025
https://github.com/xeloxa/wtosrt
Effortlessly convert your whisper timestamped subtitles in an unknown/rarely used format to the more familiar SRT format.
conversion python srt-subtitles subtitle subtitle-edit subtitle-format timestamp timestamp-convert whisper
Last synced: 04 Feb 2025
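As a rough illustration of the kind of conversion the wtosrt entry above describes, the sketch below formats timestamped segments as SRT cues. It is not taken from that project: the input shape (a list of dicts with `start`, `end`, and `text`, times in seconds) and the function names `srt_timestamp` and `segments_to_srt` are assumptions made for the example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments as SRT text.

    Each segment is assumed (hypothetically) to be a dict with
    'start' and 'end' in seconds and a 'text' string.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # An SRT cue is: sequence number, time range, text, blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.5, "text": " Hello "},
        {"start": 1.5, "end": 3.0, "text": "world"},
    ]
    print(segments_to_srt(demo))
```

SRT uses a comma before the milliseconds, whereas WebVTT uses a period, so the same helper would need a small change to target VTT.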
https://github.com/sskorol/home-assistant-voice
Home Assistant Voice PE Setup Guide
docker home-assistant home-automation piper smart-home speech-recognition speech-synthesis voice-assistant whisper
Last synced: 04 Feb 2025
https://github.com/cp3249/athena_project
Athena is an AI assistant project that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversational and interactive experience. It uses LLMs available through Ollama and provides a basic framework for extending functionalities through a modular tool system.
Last synced: 15 Jan 2025
https://github.com/ubos-tech/node-red-contrib-speech-to-text-ubos
Learn how to turn audio into text.
ai low-code lowcode node-red node-red-contrib node-red-flow openai openai-api openai-whisper speech-to-text whisper whisper-ai whisper-api
Last synced: 20 Jan 2025
https://github.com/khushijtrivedi/speech
The Assistive Speech Technology System is designed to enhance communication by analyzing and processing various speech and audio inputs.
ajax bigru-crf bootstrap flask flask-server html-css-javascript librosa python restapi-framework voice-recognition whisper
Last synced: 09 Oct 2024
https://github.com/mottla/speech-to-text
Local and fast speech to text (STT) with speaker recognition. Transcribe your meetings confidentially.
huggingface speech-recognition stt teams transcription translation whisper zoom
Last synced: 21 Jan 2025
https://github.com/xi-rick/captains-log
Captain's Log is your personal AI-powered voice transcription logbook. This innovative web application allows you to transcribe spoken words into text, organize your thoughts, and manage important notes. Built with cutting-edge technology and creative design, Captain's Log sets sail to revolutionize how you capture and manage ideas.
audio-recorder audio-visualizer javascript mongodb mongodb-atlas nextjs once-ui openai react reactjs shadcn-ui tailwindcss typescript voice whisper
Last synced: 21 Jan 2025
https://github.com/brunogaliati/speech2text-investments
This project automates the download, transcription, and summarization of audio from YouTube videos. Using OpenAI's Whisper model, it converts video content into concise text summaries with an investment analyst's perspective, ideal for professionals needing quick insights.
chatgpt investment openai politics python speech-recognition speech-to-text whisper
Last synced: 19 Dec 2024
https://github.com/danibcorr/university-helper
🧑🎓 University Helper streamlines academic and administrative tasks for students, educators, and researchers. It provides tools for managing document metadata, converting PDFs to Markdown, transcribing audio, analyzing grade statistics, and more.
deep-learning documentation-tool metadata ocr open-source pdf python statistics university whisper
Last synced: 19 Dec 2024
https://github.com/flaviodelgrosso/whisper-transcriber
Use OpenAI's Whisper to transcribe audio files and diarize the speakers in the transcribed text
ai audio-to-text diarization openai torch whisper
Last synced: 19 Dec 2024
https://github.com/egorsmkv/optimized-whisper-intel
Run quantized Whisper models only on CPU with Intel hardware
intel onnx onnxruntime quantized-neural-networks whisper
Last synced: 19 Dec 2024
https://github.com/devgeekm/chat-it-up
Chat It Up! elevates conversations by transforming YouTube URLs, documents, and audio into text, enabling interactive Q&A and summaries. With one click, turn media into time-saving, knowledge-rich dialogues.
ai azure azure-functions azureservices blob-storage fastapi python rag whisper youtube-dl
Last synced: 20 Dec 2024
https://github.com/valkryst/whisper_automations
Various scripts for automating tasks using OpenAI's Whisper.
automation openai subtitle subtitle-generator transcription translation whisper
Last synced: 26 Dec 2024
https://github.com/joaobraganca555/extractionanalysistool
Cloud-based tool for multimedia data extraction and analysis, focusing on influencer content. Utilizes YOLOv8 for object/logo detection, Whisper.AI for speech recognition, and EasyOCR for OCR. Includes sentiment analysis with a scalable microservice architecture for content monitoring.
aws-s3 content-monitor docker easyocr fastapi image-classification logo-detection microservices multimedia-data-analysis object-detection ocr python rabbitmq sentiment-analysis speech-recognition streamlit whisper yolov8
Last synced: 28 Jan 2025
https://github.com/arslanex/whisperdemo
A scalable Python module for robust audio transcription using OpenAI's Whisper model. Supports multiple languages, batch processing, and output formats like JSON and SRT.
audio-processing openai openai-whisper python whisper
Last synced: 23 Nov 2024
https://github.com/flo-bit/youtube-speaker-separation
simple python script that outputs separate audio files for each speaker in a youtube video, using whisper on replicate
speaker-diarization speech-to-text text-to-speech voice-cloning whisper youtube
Last synced: 19 Dec 2024
https://github.com/chloelavrat/speech-to-text-app
Speech-to-text web app based on Streamlit and Whisper that extracts the script from audio or YouTube videos.
audio-processing machine-learning machinelearning speech-to-text streamlit streamlit-webapp stt whisper whisper-ai
Last synced: 02 Jan 2025
https://github.com/umlx5h/llplayer
The media player for language learning, with dual subtitles, AI-generated subtitles, realtime-OCR, translation, word lookup, and more!
asr csharp flyleaf language-learning media-player ocr player tesseract video video-player whisper wpf yt-dlp
Last synced: 01 Feb 2025
https://github.com/philogicae/docker-faster-whisper-fr-api
Docker - Faster Whisper FR - RunPod Serverless API
ctranslate2 docker faster-whisper french runpod serverless whisper
Last synced: 08 Jan 2025
https://github.com/bilalhameed248/whisper-fine-tuning-for-pronunciation-learning
Fine Tuning of Whisper Speech To Text Base Model For Pronunciation Learning
deep-learning deep-neural-networks dnn fine-tuning openai pronunciation python seq2seq speech speech-recognition speech-synthesis speech-to-text whisper whisper-ai
Last synced: 16 Jan 2025
https://github.com/ashot72/answering-questions-about-images
You can upload images, ask questions about images using voice prompts, then listen to the responses in voice
answering-questions blip-2-ai-model gtts large-language-models llm replicate speech-to-text text-to-speech whisper
Last synced: 30 Dec 2024
https://github.com/njorogemaurice/speech-recognition-openai-whisper
This project is a web-based application that utilizes OpenAI's Whisper for speech-to-text conversion. The application allows users to upload audio files or record audio directly from their browser, and then converts the speech in these audio files to text using the Whisper model.
openai speech-recognition speech-to-text whisper
Last synced: 14 Jan 2025
https://github.com/vlazic/json-verbose-to-vtt-converter
Transform `json_verbose` transcriptions from OpenAI, Groq, or command-line tools into VTT files with this Deno converter.
converter groq json json-verbose openai vtt webvtt whisper
Last synced: 26 Jan 2025
https://github.com/a-iceberg/whisper-timestamped
Timestamped ASR microservice
asr audio-to-text automatic-speech-recognition data-analysis data-science deep-learning docker fastapi mlops monitoring mssqlserver openai prompt-engineering python resource-management timestamps uvicorn-gunicorn whisper
Last synced: 18 Jan 2025
https://github.com/neiltron/autocap
ALL CAPS
closedcaptions ml subtitles transcription whisper
Last synced: 19 Dec 2024
https://github.com/zdwolfe/transcription-tools
Docker video transcriber, wrapper around OpenAI
openai transcription whisper whisper-ai
Last synced: 02 Jan 2025
https://github.com/aixerum/faster-whisper
faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models. This implementation is up to 4 times faster than openai/whisper for the same accuracy while using less memory. The efficiency can be further improved with 8-bit quantization on both CPU and GPU.
ctranslate2 gpu transcription whisper
Last synced: 07 Jan 2025
https://github.com/orhancavus/transcribe_video
Extract Subtitles from YouTube Videos with OpenAI Whisper and Insanely Fast Whisper
insanely-fast speach-to-text whisper
Last synced: 09 Jan 2025
https://github.com/homelab-00/longformstt
A python script that utilizes faster-whisper and pytorch for long form transcription. Uses silence detection with RMS/peak value. Has global hotkeys for easy use.
faster-whisper python speech-to-text whisper
Last synced: 09 Jan 2025
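The longformstt entry above mentions silence detection based on RMS and peak values. A minimal sketch of that general idea follows; it is not that project's code. The threshold values, the assumption that samples are floats normalized to the range -1.0 to 1.0, and the function names `rms` and `is_silent` are all chosen for illustration.

```python
import math


def rms(samples) -> float:
    """Root-mean-square level of a sequence of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(samples, rms_threshold=0.01, peak_threshold=0.05) -> bool:
    """Treat a chunk as silent when both its RMS and its peak are low.

    Thresholds are illustrative assumptions for samples in [-1.0, 1.0];
    real recordings usually need tuning against the noise floor.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    return rms(samples) < rms_threshold and peak < peak_threshold


if __name__ == "__main__":
    quiet_chunk = [0.001, -0.002, 0.0015] * 100
    loud_chunk = [0.5, -0.5] * 150
    print(is_silent(quiet_chunk), is_silent(loud_chunk))
```

Checking the peak alongside the RMS keeps a short click or plosive inside an otherwise quiet chunk from being discarded as silence.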
https://github.com/aidayang/faster-whisper-oneclick
One-click launch bundle for faster-whisper with a GUI
deep-learning faster-whisper inference openai quantization speech-recognition speech-to-text transformer whisper
Last synced: 09 Jan 2025
https://github.com/mickekring/top-of-mind-beromfabriken
Giving praise to a colleague can feel a little embarrassing, but research has shown that it can make us feel better at work and even make us more productive. Hearing that colleagues value and notice you simply improves your well-being.
api gpt openai python transcription whisper
Last synced: 16 Jan 2025
https://github.com/a-iceberg/whisper_model_evaluator
Compares the WER, MER, and WIL of the Whisper, Vosk, and Google transcribers
asr audio-to-text automatic-speech-recognition data-analysis evaluation google-speech-recognition python tuning-parameters visualization vosk whisper
Last synced: 24 Oct 2024
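For readers unfamiliar with the metrics named in the entry above, the sketch below computes word error rate (WER), match error rate (MER), and word information lost (WIL) from a word-level Levenshtein alignment. It is an independent illustration, not the evaluator's own code: it tokenizes on whitespace only, with no text normalization, and the function names `align_counts` and `error_rates` are hypothetical.

```python
def align_counts(reference: str, hypothesis: str):
    """Return (hits, substitutions, deletions, insertions) for a
    minimum-edit-distance word alignment of hypothesis to reference."""
    ref, hyp = reference.split(), hypothesis.split()
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # hit or substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Walk back from the bottom-right corner to count each edit type.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            hits, i, j = hits + 1, i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            subs, i, j = subs + 1, i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels, i = dels + 1, i - 1
        else:
            ins, j = ins + 1, j - 1
    return hits, subs, dels, ins


def error_rates(reference: str, hypothesis: str):
    """Return (WER, MER, WIL) for a non-empty reference and hypothesis."""
    h, s, d, i = align_counts(reference, hypothesis)
    ref_len, hyp_len = h + s + d, h + s + i
    if ref_len == 0 or hyp_len == 0:
        raise ValueError("reference and hypothesis must each contain a word")
    wer = (s + d + i) / ref_len
    mer = (s + d + i) / (h + s + d + i)
    wil = 1 - (h * h) / (ref_len * hyp_len)
    return wer, mer, wil


if __name__ == "__main__":
    print(error_rates("the cat sat on the mat", "the cat sit on mat"))
```

WER can exceed 1.0 when the hypothesis contains many insertions, while MER and WIL stay within 0 to 1, which is one reason evaluations often report all three.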
https://github.com/felipecastrosales/scripts
List of useful scripts.
audio helper-functions helpers ia pip python python3 script scripts video whisper whisper-ai
Last synced: 22 Dec 2024
https://github.com/josemarcosrf/Lexicap-QA
QA retrieval for Lex Fridman's podcast transcriptions
Last synced: 24 Oct 2024
https://github.com/levysantiago/upload-ai
This is a system that uses OpenAI's Whisper and ChatGPT to generate titles and descriptions from the analysis of submitted videos.
ai artificial-intelligence axios chatgpt fastify ffmpeg nlw-13 node openai prisma react rocketseat tailwindcss typescript vite whisper zod
Last synced: 12 Jan 2025
https://github.com/tristan-mcinnis/simultaneous-interpretation
Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.
agents asr faster-whisper openai pyaudio simultaneous-intepreting simultaneous-translation speech-recognition speech-to-text transcription translation whisper
Last synced: 17 Jan 2025
https://github.com/tylim88/Voicefu-back-end
Translate Speech Into Japanese
chatgpt speech-synthesis voicevox whisper
Last synced: 24 Oct 2024
https://gitlab.com/ifrz/asr-multi-lite
Testing of the main ASR frameworks with reduced models for low-resource languages speech recognition
Last synced: 24 Oct 2024
https://github.com/userpjm/whisper-youtube
Generate a SubRip subtitle file (srt) using Whisper for the audio of a YouTube video.
faster-whisper openai speech-to-text whisper
Last synced: 24 Oct 2024
https://github.com/willdphan/little-jarvis-whisper
Jarvis, a GPT Voice Assistant made with speech recognition, OpenAI's Whisper, and Gradio
gradio openai voice-assistant voice-recognition whisper
Last synced: 24 Oct 2024
https://github.com/bluebirdback/groq-subtitles
Batch video subtitle generation using Groq Whisper API
groq speech-to-text subtitles video whisper
Last synced: 21 Dec 2024
https://github.com/meain/raus
Record audio until silence (RAUS)
audio hammerspoon transcription whisper whisper-cpp
Last synced: 17 Jan 2025
https://github.com/julrog/jokes-on-you
Storyteller
ggj2024 global-game-jam openai unity whisper
Last synced: 17 Dec 2024
https://github.com/escarrie/transcriptaudio
This is a script that can be used to transcribe an audio file into a text file using Whisper AI
Last synced: 17 Jan 2025
https://github.com/MattCode64/Scriba_Front
SCRIBA is a web application that transcribes audio files. It supports .mp3 files and provides the transcription results in a user-friendly interface.
speech-to-text vite vue vuejs whisper
Last synced: 24 Oct 2024
https://github.com/ts-azure-services/batch-transcription-examples
A repo to archive some code related to batch transcription for animation movies.
batch-transcription speech-to-text whisper
Last synced: 28 Jan 2025
https://github.com/EvilFreelancer/whisper-tests
Collection of experiments on OpenAI Whisper models
api-server docker-compose testing transcription whisper
Last synced: 24 Oct 2024
https://github.com/yui-mhcp/speech_to_text
Speech-To-Text (STT) project
audio-transcription deepspeech jasper speech-to-text stt stt-api tensorflow2 video-transcription whisper
Last synced: 24 Oct 2024
https://github.com/televisionninja/chat
Chat with an AI Vtuber
ai chatbot llama llm tts vtube-studio vtuber whisper
Last synced: 20 Nov 2024
https://github.com/sixiaolong1117/whisperpythonscript
A simple Whisper Python script that transcribes the audio of media files into text with whisper and saves it as subtitles with pysrt.
pysrt python python3 whisper whisper-ai
Last synced: 16 Jan 2025