Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Whisper
Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder Transformer trained on about 680,000 hours of multilingual, multitask audio data collected from the web. Whisper can be used for multilingual speech transcription, speech translation into English, and spoken language identification. OpenAI has released the model weights and inference code as open source under the MIT license, so projects can build on it for subtitling, voice assistants, and other speech-based applications.
- GitHub: https://github.com/topics/whisper
- Repo: https://github.com/openai/whisper
- Created by: OpenAI
- Released: September 2022
- Related Topics: machine-learning, artificial-intelligence, language-modeling
- Last updated: 2025-02-07 00:32:40 UTC
https://github.com/vimwei/whispertranscriber
Whisper Transcribe and srt Resegment
speech-to-text subtitle whisper
Last synced: 17 Oct 2024
https://github.com/sonhm3029/realtime-vietnamese-asr-react-native-and-whisper
This project implements end-to-end real-time Vietnamese speech recognition, with PhoWhisper on the backend and a React Native frontend.
asr phowhiper react-native realtime realtime-speech-recognition speech-recognition speech-to-text vietnamese whisper
Last synced: 16 Nov 2024
https://github.com/firefly55lm/bisbigliatorev2
Automatic audio transcriber notebook based on Whisper
colab-notebook speech-to-text whisper
Last synced: 25 Jan 2025
https://github.com/water25234/ChatREP
Summarize YouTube videos with ChatGPT and Whisper
chatgpt-api openai python python3 video whisper youtube
Last synced: 24 Oct 2024
https://github.com/amir-mohseni/voicebridge
This repository provides a dockerized Speech-to-Speech application that supports text-to-audio conversion, audio-to-text transcription, and interactive voice-based conversations. It is easy to set up and use, offering a versatile platform for speech and text processing.
docker huggingface python transformer tts whisper
Last synced: 17 Jan 2025
https://github.com/imsanjoykb/speech-nlp-bootcamp
Speech NLP Bootcamp
asr audio-analysis audio-applications bangla-nlp huggingface-transformers seq2seq speech speech-recognition tts wav2vec2 whisper
Last synced: 18 Jan 2025
https://github.com/JoSuru/speeka
Speeka is an open-source project that uses OpenAI's Whisper model to transcribe audio into text. Its intuitive web interface makes it easy to use. Contributions are welcome.
open-source python python3 speech-to-text streamlit whisper
Last synced: 24 Oct 2024
https://github.com/kazkozdev/video-analyser
⚡ The YouTube Video Analyzer Pro brings AI-powered analysis capabilities to your fingertips, offering deep insights for content creators and marketers.
ai content-analytics fastapi llama3 llm ollama-api python3 video-analysis video-analysis-client whisper youtube youtube-analytics youtube-api youtube-subscribers
Last synced: 13 Jan 2025
https://github.com/i4ds/whisper-prep
Data preparation utility for the finetuning of OpenAI's Whisper model.
fine-tuning nlp speech-to-text whisper
Last synced: 09 Nov 2024
https://github.com/alessioborgi/stylealigned_multireference-multimodal
Novel framework for Zero-Shot Style Alignment in Text-to-Image generation, incorporating Multi-Modal Context-Awareness and Multi-Reference Style Alignment, using minimal attention sharing, ensuring consistent style transfer without fine-tuning.
adain blip clap context-awareness multi-modal multi-style-transfer no-fine-tuning shared-attention-heads style-aligned text-to-image-generation whisper zero-shot-learning
Last synced: 18 Oct 2024
https://github.com/alancunningham/chatgpt-assistant
A ChatGPT assistant with voice activation and image generation, connected to a Raspberry Pi display.
chatgpt chatgpt-api dall-e dall-e-api porcupine python raspberry-pi whisper
Last synced: 06 Jan 2025
https://github.com/flyingfathead/youwhisper-cli
A streamlined CLI tool combining `yt-dlp` and `whisperx` (or `openai-whisper`) for quick and efficient audio transcription from various video platforms.
cli cli-app python transcribe transcriber transcription whisper whisper-ai whisperx youtube-downloader yt-dlp yt-dlp-wrapper
Last synced: 11 Jan 2025
https://github.com/gurpreetkaurjethra/multimodal-ai-app-using-llava-7b
Multimodal AI App using Llava 7B and Gradio
ai generative-ai gradio large-language-models llava llavacpp llm multimodal voice-assistant whisper
Last synced: 22 Nov 2024
https://github.com/limdongjin/ignkafasr
Real-Time In-memory Speaker Verification and Speech Recognition Project using apache ignite, apache kafka, speechbrain, whisper, stomp, spring webflux, kubernetes(k8s)
apache-ignite apache-kafka asr audio-recorder google-kubernetes-engine k8s kubernetes speaker-recognition speaker-verification speech-recognition speechbrain springframework stomp stompwebsocket webflux whisper
Last synced: 24 Oct 2024
https://github.com/awaisoem/interview-lingo
(Aug 2024) AI assistant that helps with interviews, hiring, personality development, and communication skills
ai ai71 drizzle-orm falcon neondb nextjs postgresql tailwindcss whisper
Last synced: 08 Feb 2025
https://github.com/sanket-poojary-03/fine-tuning-whisper
Fine-tuning the Whisper-Small model on a Hinglish audio dataset
audio-dataset audio-to-text deep-learning fine-tuning huggingface-transformers python speech-recognition speech-to-text whisper whisper-ai
Last synced: 08 Feb 2025
https://github.com/jacoblincool/wft
Run Whisper fine-tuning with ease—it works on MPS, CUDA, and CPU without code changes.
Last synced: 11 Dec 2024
https://github.com/aspadax/subtitlegenerator
Automatically generate a subtitle for your video.
gpt machine-learning openai rust streamlit subtitles-generator whisper
Last synced: 08 Feb 2025
https://github.com/ndjenkins85/afkode
Personal voice command interface for iPhone on pythonista powered by Whisper and ChatGPT.
chatgpt openai python-packaging quick-start whisper
Last synced: 12 Oct 2024
https://github.com/sovit-123/sam_molmo_whisper
An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language.
molmo segment-anything-model segmentanythingmodel vlm whisper
Last synced: 18 Oct 2024
https://github.com/amgawishx/voiceworker
A Web App UI for OpenAI's Whisper model for audio transcription and translation.
ai audio-processing python streamlit transcription translation webapp whisper
Last synced: 17 Jan 2025
https://github.com/astrologos/py-speakeasy
Speakeasy GPT is a Jupyter notebook that utilizes several natural language processing utilities to provide a seamless and low-latency speech interface to ChatGPT and other large language models.
automatic-speech-recognition chat-gpt coqui-ai coqui-tts elevenlabs-api mimic mycroftai text-to-speech whisper
Last synced: 24 Oct 2024
https://github.com/Lord-Haji/ChatAudio
chatbot gpt-3-5-turbo gpt-4 langchain langchain-python speech-recognition whisper whisper-api
Last synced: 24 Oct 2024
https://github.com/williamwa/mssmith
A Telegram bot that utilizes the ChatGPT API and can communicate through voice.
chatpgt-api telegram-bot tts whisper
Last synced: 31 Dec 2024
https://github.com/szilvia-csernus/openai-audio-api-calls
Speech-to-text and text-to-speech API call examples, using OpenAI's whisper-1 and tts-1 models.
jupyter-notebook openai openai-api tts-1 whisper
Last synced: 08 Feb 2025
https://github.com/fer14/videoseek
Intelligent video search tool powered by AI
bert timestamp video whisper youtube-api
Last synced: 14 Jan 2025
https://github.com/phidlarkson/whisper-stt-api
Easy setup for Whisper speech-to-text
api flask speech-to-text whisper
Last synced: 01 Jan 2025
https://github.com/otonomee/mic2transcript
CLI tool that continuously transcribes audio from the device's built-in microphone to a text file. Runs in the background, providing an ongoing log of ambient audio as text.
audio cli cli-tool openai speech speech-transcription transcription whisper
Last synced: 08 Feb 2025
https://github.com/upes-open/osoc-24-the-content-forge
The Content Hub is an online platform that acts as an all-in-one solution, helping content creators develop and generate short-form video and image content using GenAI models and the cloud, so they can maximize their efficiency and benefit from ongoing developments in AI models.
aws docker fastapi genai microservices nodejs react whisper
Last synced: 08 Feb 2025
https://github.com/saadkh1/docqa-textsummarization-app
A Streamlit app for document question answering and text summarization.
langchain llama-2 llamacpp pytesseract question-answering streamlit summarization whisper
Last synced: 07 Jan 2025
https://github.com/seitzquest/RavenWhisperer
Listens to your voice and queries a language model for answers when a question is detected
Last synced: 22 Nov 2024
https://github.com/ashot72/speech-to-text-to-image
Generating text from your voice, then images from the text
chatgpt large-language-models llm replicate speech-to-text speechtotext stability-ai text-to-image texttoimage whisper whisper-ai
Last synced: 30 Dec 2024
https://github.com/adamelkholyy/whisper-yt
Toolkit for using Whisper to transcribe YouTube videos. Includes Whisper transcription of YouTube videos, conversion of YouTube video into HuggingFace dataset (using audio and subtitles) and evaluation of Whisper transcription against YouTube subtitles
asr diarization huggingface-datasets pyannote transcription whisper word-error-rate youtube
Last synced: 05 Feb 2025
https://github.com/my-north-ai/semantic_audio_filtering
Synthetic data augmentation technique via LLM for Automatic Speech Recognition fine tuning.
automatic-speech-recognition fine-tuning synthetic-dataset-generation text-to-speech whisper
Last synced: 24 Oct 2024
https://github.com/jemtaly/whispering
A real-time transcription and translation tool implemented in Python based on the fast-whisper library.
live-caption python real-time-transcription real-time-translation tkinter transcription translation whisper
Last synced: 09 Jan 2025
https://github.com/tonywu71/distilling-and-forgetting-in-large-pre-trained-models
Code for my dissertation on "Distilling and Forgetting in Large Pre-Trained Models" for the MPhil in Machine Learning and Machine Intelligence (MLMI) at the University of Cambridge.
continual-learning distillation speech-recognition whisper
Last synced: 04 Dec 2024
https://github.com/datarabbit-ai/transcription_service
System/service with REST API for extracting text transcriptions from movies and audio recordings in most popular video formats.
containers datarabbit rest-api speech-to-text stt transcription transcription-services whisper
Last synced: 08 Feb 2025
https://github.com/andreabak/whispersubs
Generate subtitles for your video or audio files using the power of AI
ai cuda deep-learning gpu-acceleration machine-learning srt subtitles transcribe transcription translate whisper
Last synced: 16 Nov 2024
https://github.com/t-h-chung/note-taker
Note-taking app for online/local video/audio using Whisper transcription, ChatGPT, and Notion
chatgpt notes notion transcription whisper youtube
Last synced: 08 Feb 2025
https://github.com/marketcalls/openalgo-voice-based-orders
OpenAlgo Voice Based Orders
flask groq openai python speech-to-text whisper
Last synced: 19 Dec 2024
https://github.com/daisyyedda/whisper-large-v2-atcosim_corpus
A fine-tuned Whisper model (whisper-large-v2) for aviation audio transcription. WER < 5%.
asr-model nlp whisper whisper-ai
Last synced: 08 Feb 2025
https://github.com/knot-inc/john
John is a web app that records video, analyzes audio with AI, and identifies the speaker's native language from their English accent, simplifying language assessment.
audio-analysis machine-learning whisper
Last synced: 17 Nov 2024
https://github.com/becomingbabyman/eunoia-desktop
local desktop transcription and search for apple voice memos and videos
search second-brain transcription videos voice-memos whisper
Last synced: 25 Dec 2024
https://github.com/jowadev/interview
Interview is an interactive application crafted to empower both students and professionals in honing their skills for job interviews.
interview-preparation job-interviews nextjs professional students whisper
Last synced: 07 Feb 2025
https://github.com/maawad/luna
Personal assistant
bot openai personal-assistant whisper
Last synced: 17 Dec 2024
https://github.com/huuquyet/phowhisper-next
Demo using PhoWhisper models of VinAI built with Transformers.js + Next.js
nextjs onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper
Last synced: 19 Dec 2024
https://github.com/adisol07/sharpspeech
SharpSpeech is a free, local, and open-source way to do speech and wake word recognition.
audio speech speech-recognition speech-to-text wake-word-detection wakeword whisper whisper-ai
Last synced: 19 Dec 2024
https://github.com/bigyaa/transcription-system
This versatile tool is designed for anyone in need of a robust solution for transcribing and diarizing large volumes of audio files. Whether you are dealing with terabytes or even larger quantities, our tool ensures efficient and accurate processing. Ideal for researchers, content creators, and businesses.
accessibility diarization speech-to-text storytelling-with-data transcription whisper
Last synced: 19 Dec 2024
https://github.com/brentwong-kiel1997/ai_language_school_based_on_django_and_openai
Django and OpenAI API example use case
django gpt-4 openai openai-api whisper
Last synced: 08 Feb 2025
https://github.com/shani-sinojiya/sandalquest
AI/ML project for recognizing colloquial Kannada speech and building a speech-based Q&A system focused on sandalwood cultivation.
ai audio-processing data-augmentation deep-learning machine-learning mongodb nlp python pytorch question-answering speech-based-question-answering-system speech-recognition whisper
Last synced: 10 Jan 2025
https://github.com/roman01la/sub-deep
Transcribe and translate audio with AI
deepl transcribe translate whisper
Last synced: 30 Dec 2024
https://github.com/mikeesto/whispercpp-android
An Android app using whisper.cpp to do voice-to-text transcriptions
android kotlin speech-to-text whisper whisper-cpp
Last synced: 09 Feb 2025
https://github.com/jgw96/speech-to-text-web-toolkit
Making Speech-To-Text on the web easy, both local and in the cloud
ai lit transformersjs webcomponents whisper
Last synced: 01 Feb 2025
https://github.com/gamut73/quizinator
Generating quizzes, on Android, from YouTube videos.
kotlin-android llm python whisper
Last synced: 19 Dec 2024
https://github.com/voqal/browser
Natural speech browsing for the software developers of tomorrow
cef jcef openai realtime-api voice voice-assistant voice-browser voice-commands voice-control whisper
Last synced: 20 Oct 2024
https://github.com/winstxnhdw/capgen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
asr automatic-speech-recognition caddy ctranslate2 docker fastapi huggingface huggingface-spaces uvicorn-gunicorn whisper
Last synced: 23 Oct 2024
https://github.com/tylim88/voicefu
Translate Speech Into Japanese
chatgpt speech-synthesis voicevox whisper
Last synced: 18 Dec 2024
https://github.com/baristikir/voice-typing
Simple desktop application with voice typing features. Runs locally, transcribes locally, and works fully offline with support for real-time transcription. Powered by OpenAI Whisper ASR models and the whisper.cpp inference engine.
Last synced: 24 Dec 2024
https://github.com/tranbavinhson/eth-decentralized-chat
Decentralized chat app by Ethereum Whisper protocol + Vuejs
ethereum vue vuejs whisper whisper-protocol
Last synced: 26 Dec 2024
https://github.com/Op27/meeting_minutes_generator
This Python application automates the process of generating meeting minutes from an audio recording. It uses the Whisper library for transcription and the OpenAI GPT models for summarizing content, then outputs the result in a Word document.
ai audio-processing document-automation meeting-minutes openai python speech-recognition text-summarization transcription whisper
Last synced: 24 Oct 2024
https://github.com/TranBaVinhSon/eth-decentralized-chat
Decentralized chat app by Ethereum Whisper protocol + Vuejs
ethereum vue vuejs whisper whisper-protocol
Last synced: 24 Oct 2024
https://github.com/jojasadventure/whisper-client
Very simple Python-based client for a Whisper-compatible endpoint
desktop-app dictation faster-whisper macos productivity python speech-to-text stt whisper
Last synced: 08 Feb 2025
https://github.com/chaoticbyte/audio-summarize
An audio summarizer (faster-whisper and BART glued together)
ai ai-summarizer audio bart ctranslate2 faster-whisper nlp speech-to-text summarization whisper
Last synced: 08 Feb 2025
https://github.com/team-mansumugang/mansumugang-backend
The Spring Boot application for the Mansumugang service.
aws github-actions jpa jpa-hibernate spring-boot whisper
Last synced: 08 Feb 2025
https://github.com/seanvelasco/ai
Cloudflare AI challenge submission: Slater - your virtual foreign language friend
ai artificial-intelligence language-learning llama2 llm m2m100 machine-learning whisper
Last synced: 03 Feb 2025
https://github.com/antoniosbarotsis/telegram-transcriber
A Telegram bot for transcribing voice messages
telegram transcribe voice whisper
Last synced: 26 Dec 2024
https://github.com/natanielf/lecsum
Automatically transcribe and summarize lecture recordings completely on-device using AI.
ollama ollama-python whisper whisper-ai
Last synced: 18 Dec 2024
https://github.com/nerdimite/meetsy-app
Frontend for the Workshop on Building an End-to-End AI Meeting Assistant
gpt-3 nextjs sentence-transformers tailwindcss whisper
Last synced: 24 Oct 2024
https://github.com/kunesj/holo-subs-search
Tool for searching transcriptions of vtuber videos.
holodex pyannote transcription vtuber whisper youtube
Last synced: 19 Jan 2025
https://github.com/ivanrj7j/transcription
This project transcribes audio using whisper and provides an api
ai api flask transcription whisper
Last synced: 08 Feb 2025
https://github.com/sugarcane-mk/speaker_classification
This repository provides a Python script for extracting speech embeddings using OpenAI's Whisper model. The embeddings are high-dimensional feature vectors that capture the acoustic properties of the input audio. These embeddings can be used for downstream tasks such as speech classification, clustering, and speaker recognition.
asr classification feature-extraction openai speech-processing speech-recognition speech-to-text svm-classifier whisper
Last synced: 09 Jan 2025
https://github.com/aeronjl/transcribe
Python package for accurate audio transcription with speaker diarisation
audio-transcription gpt speaker-diarization whisper
Last synced: 08 Feb 2025
https://github.com/slinusc/speaker_identification_evaluation
Evaluating the Effectiveness of Transformer Layers in Wav2Vec 2.0, XLS-R, and Whisper for Speaker Identification Tasks
Last synced: 08 Feb 2025
https://github.com/chinese-soup/cbot-telegram-whisper
Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.
bot cpu-inference golang openai speech-recognition speech-to-text whisper whisper-cpp whispercpp
Last synced: 17 Jan 2025
https://github.com/mikeesto/subber
A small CLI tool for converting video & audio to a text transcription
audio cli ffmpeg golang transcribe video whisper
Last synced: 19 Dec 2024
https://github.com/egorsmkv/star-adapt-uk
Fork of https://github.com/YUCHEN005/STAR-Adapt with some modifications for Ukrainian.
asr speech-recognition ukrainian whisper
Last synced: 19 Dec 2024
https://github.com/Shtirmann/V2T
Telegram bot which automatically transcribes all voice and video messages to text.
ai aiogram faster-whisper python telegram-bot telegram-bot-python voice-to-text whisper
Last synced: 24 Oct 2024
https://github.com/niqifan007/openai-tts-stt-streamlit
A GUI for the OpenAI API's TTS (text-to-speech) and STT (speech-to-text) interfaces, built with Streamlit, with a history function.
openai openai-api streamlit stt-gui tts tts-gui whisper whisper-api
Last synced: 08 Feb 2025
https://github.com/shtirmann/v2t
Telegram bot which automatically transcribes all voice and video messages to text.
ai aiogram faster-whisper python telegram-bot telegram-bot-python voice-to-text whisper
Last synced: 08 Feb 2025
https://github.com/rhysdg/whisper-onnx-python
A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph
ai chatbot cuda machine-learning onnxruntime speech-to-text whisper
Last synced: 08 Feb 2025
https://github.com/volkansah/text-to-speech-pygui-for-whisper
This is a simple Python-based GUI application that allows users to generate speech from text using the OpenAI API. The application provides a user-friendly interface for inputting text and selecting from different voices to create personalized audio output.
openai openai-api python-gui-tkinter python3 whisper whisper-ai
Last synced: 27 Jan 2025
https://github.com/nri12/filter_voice
A project that filters videos and mutes selected keywords
Last synced: 19 Dec 2024
https://github.com/oov/aviutl_subtitler
A plugin for transcribing with Whisper in an AviUtl + Extended Editing environment
Last synced: 19 Dec 2024
https://github.com/canaxs/whisper-core
An application where users can make rumor-based news and earn money in return.
mysql panel spring spring-boot whisper
Last synced: 19 Dec 2024
https://github.com/abdnh/anki-asr
Anki add-on for speech recognition
anki anki-addon deepgram speech-recognition whisper
Last synced: 24 Nov 2024
https://github.com/pkarpovich/kira-client
An AI-powered voice automation tool for IoT, integrating voice-triggered commands, OpenAI-driven intent recognition, and HTTP server management for seamless control of smart devices
ai-assistant intent-classification porcupine trigger-word-detection whisper
Last synced: 13 Jan 2025
https://github.com/breadrock1/audio-to-text
A simple backend project that uses whisper-rs.
actix-web audio-to-text rust swagger-ui whisper
Last synced: 10 Jan 2025
https://github.com/bhattbhavesh91/openai-whisper-benchmarking
Comparing the performance of OpenAI's Whisper model on a GPU vs OpenAI's API
gpu openai speech-to-text whisper
Last synced: 16 Nov 2024
https://github.com/doctorpok42/pheere-app
Pheere is a simple virtual assistant
ai chatgpt desktop-app elevenlabs nextjs scss tauri ts virtual-assistant whisper
Last synced: 10 Jan 2025