Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Whisper

Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder Transformer trained on about 680,000 hours of multilingual audio collected from the web. Whisper can transcribe speech in many languages, translate speech into English, and identify the spoken language. OpenAI released the model code and weights as open source, and the projects listed below build on it for transcription, subtitling, voice assistants, and related applications.

https://github.com/jgw96/speech-to-text-web-toolkit

Making Speech-To-Text on the web easy, both local and in the cloud

ai lit transformersjs webcomponents whisper

Last synced: 01 Feb 2025

https://github.com/schnoddelbotz/whisper-ui

Transcribe audio/video to text, locally on macOS, Linux and Windows. A simple whisper.cpp wrapper/UI built with Go/Fyne.

ffmpeg ffmpeg-wrapper fyne gui local privacy speech-to-text transcription whisper whisper-cpp

Last synced: 27 Jan 2025

https://github.com/antoniosbarotsis/telegram-transcriber

A Telegram bot for transcribing voice messages

telegram transcribe voice whisper

Last synced: 26 Dec 2024

https://github.com/gamut73/quizinator

Generating quizzes, on Android, from YouTube videos.

kotlin-android llm python whisper

Last synced: 19 Dec 2024

https://github.com/toLSC/tolsc-speech-to-text

Speech to text service for toLSC app implemented with OpenAI Whisper model

fastapi python speech-recognition speech-to-text tts whisper

Last synced: 24 Oct 2024

https://github.com/tposcic/audio-to-srt-transcriber

Audio to srt transcriber in Python using whisper for transcription and Tcl/Tk for GUI

audio python3 srt transcription whisper

Last synced: 05 Jan 2025
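
For the audio-to-srt-transcriber entry above, the step from timed transcription segments to an SRT file can be sketched as follows. This is a minimal illustration, not that project's code: the function names `srt_timestamp` and `segments_to_srt` are hypothetical, and the segments are assumed to be dicts with `start`, `end` (in seconds) and `text` keys.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments (assumed shape: start, end, text)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each SRT cue: running number, time range, text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.5, "text": " Hello"},
        {"start": 1.5, "end": 3.0, "text": " world"},
    ]
    print(segments_to_srt(demo))
```

Working in integer milliseconds avoids floating-point drift when splitting the offset into hours, minutes, and seconds.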

https://github.com/adamelkholyy/whisper-yt

Toolkit for using Whisper to transcribe YouTube videos. Includes Whisper transcription of YouTube videos, conversion of YouTube video into HuggingFace dataset (using audio and subtitles) and evaluation of Whisper transcription against YouTube subtitles

asr diarization huggingface-datasets pyannote transcription whisper word-error-rate youtube

Last synced: 10 Dec 2024
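
The whisper-yt entry above mentions evaluating Whisper transcriptions against YouTube subtitles, and its tags include word-error-rate. As a rough sketch of that metric (not the project's own implementation), word error rate is the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the lowercase-and-split normalization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("hello world", "hello there world"))
```

Real evaluations usually normalize punctuation and numerals before scoring; this sketch only lowercases and splits on whitespace.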

https://github.com/h3yn3s/tl-dl

A selfhostable webapp which helps you read those uselessly long (by nature) voice messages with the power of AI.

sveltekit tailwind whisper

Last synced: 24 Oct 2024

https://github.com/mickekring/top-of-mind-clara

Clara is a prototype that lets employees make their voice heard anonymously. The employee can speak or type what they want to say, and AI anonymizes it. The employee also has access to a chatbot to consult. All employees' thoughts are then analyzed and compiled in a dashboard.

ai chatbot feedback openai python streamlit transcription whisper

Last synced: 22 Dec 2024

https://github.com/sugarcane-mk/speaker_classification

This repository provides a Python script for extracting speech embeddings using OpenAI's Whisper model. The embeddings are high-dimensional feature vectors that capture the acoustic properties of the input audio. These embeddings can be used for downstream tasks such as speech classification, clustering, and speaker recognition.

asr classification feature-extraction openai speech-processing speech-recognition speech-to-text svm-classifier whisper

Last synced: 09 Jan 2025

https://github.com/platput/pysubs

API for getting audio transcriptions of video files from YouTube, AWS S3, and similar sources, using OpenAI Whisper.

openai whisper

Last synced: 24 Oct 2024

https://github.com/szilvia-csernus/openai-audio-api-calls

Speech-to-text and text-to-speech API call examples, using OpenAI's whisper-1 and tts-1 models.

jupyter-notebook openai openai-api tts-1 whisper

Last synced: 09 Oct 2024

https://github.com/crone-ai/force-align-wordstamps

Takes audio (mp3) and text input (string) and force aligns the text to the audio. Uses stable-ts and whisperx.

captions faster-whisper force-alignment stable-ts whisper

Last synced: 17 Jan 2025

https://github.com/stnderror/robotron

🤖 A personal robot assistant for Telegram

assistant bot dall-e gpt-35-turbo openai telegram-bot whisper

Last synced: 25 Jan 2025

https://github.com/drankush/voxrad

VOXRAD is a voice transcription application for radiologists leveraging locally deployed ASR and LLM models.

desktop-app ffmpeg gemini gpt llm macos medical-informatics multimodal natural-language-processing nlp openai openai-api productivity python radiology reporting transcription voice-recognition whisper windows

Last synced: 31 Jan 2025

https://github.com/jlcarveth/skreech

An HTTP API wrapper around Whisper for transcribing audio files.

speech-recognition speech-to-text whisper whisper-ai

Last synced: 19 Jan 2025

https://github.com/saamerm/whisperkit-ios15

iOS 15 - On-device Inference of Whisper Speech Recognition Models for Apple Silicon

ios ios15 swiftui whisper whisper-ai

Last synced: 19 Jan 2025

https://github.com/aaishikdutta/notebook-lm-podcast-audiogram

A simple project to convert notebook-lm output (or any audio, for that matter) into a podcast audiogram with subtitles, powered by OpenAI Whisper.

audiogram openai podcast remotion whisper

Last synced: 08 Dec 2024

https://github.com/bhattbhavesh91/openai-whisper-benchmarking

Comparing the performance of OpenAI's Whisper model on a GPU vs OpenAI's API

gpu openai speech-to-text whisper

Last synced: 16 Nov 2024

https://github.com/thewh1teagle/whisper.zig

Transcribe audio with whisper in zig

asr openai whisper zig

Last synced: 24 Jan 2025

https://github.com/tracywong117/ai-learning-material-from-video

Support subtitling, translating, RAG to generate language learning material from video.

ai auto-subtitle gpt-translate groq groq-api rag subtitles-generator translate whisper

Last synced: 19 Jan 2025

https://github.com/seanvelasco/ai

Cloudflare AI challenge submission: Slater - your virtual foreign language friend

ai artificial-intelligence language-learning llama2 llm m2m100 machine-learning whisper

Last synced: 03 Feb 2025

https://github.com/abdnh/anki-asr

Anki add-on for speech recognition

anki anki-addon deepgram speech-recognition whisper

Last synced: 24 Nov 2024

https://github.com/bigyaa/transcription-system

This versatile tool is designed for anyone in need of a robust solution for transcribing and diarizing large volumes of audio files. Whether you are dealing with terabytes or even larger quantities, our tool ensures efficient and accurate processing. Ideal for researchers, content creators, and businesses.

accessibility diarization speech-to-text storytelling-with-data transcription whisper

Last synced: 19 Dec 2024

https://github.com/xaionaro-go/speech

A Speech-To-Text (with translation) library for Go; currently uses Whisper (runs locally if needed; no API keys required)

ai converter go golang library module package speech speech-recognition speech-to-text text whisper

Last synced: 13 Jan 2025

https://github.com/Shtirmann/V2T

Telegram bot which automatically transcribes all voice and video messages to text.

ai aiogram faster-whisper python telegram-bot telegram-bot-python voice-to-text whisper

Last synced: 24 Oct 2024

https://github.com/niqifan007/openai-tts-stt-streamlit

A GUI for the OpenAI API's TTS (text-to-speech) and STT (speech-to-text) interfaces, built with Streamlit, with a history function.

openai openai-api streamlit stt-gui tts tts-gui whisper whisper-api

Last synced: 09 Oct 2024

https://github.com/aeronjl/transcribe

Python package for accurate audio transcription with speaker diarisation

audio-transcription gpt speaker-diarization whisper

Last synced: 09 Oct 2024

https://github.com/lazauk/aoai-entraidauth-sdkv1

Authenticating with Entra ID (former Azure AD) to access Azure OpenAI models in Python SDK v1.x

ai authentication azure azure-active-directory dall-e embeddings entra-id gpt openai whisper

Last synced: 12 Jan 2025

https://github.com/mikeesto/subber

A small CLI tool for converting video & audio to a text transcription

audio cli ffmpeg golang transcribe video whisper

Last synced: 19 Dec 2024

https://github.com/becomingbabyman/eunoia-desktop

local desktop transcription and search for apple voice memos and videos

search second-brain transcription videos voice-memos whisper

Last synced: 25 Dec 2024

https://github.com/topdev0215/AudioMultifunctionChatbot

This app lets users record or upload audio files, then uses the OpenAI API (Whisper, GPT4) to generate transcriptions, summaries, fact checks, sentiment analysis, and text metrics. Users can also chat about their transcriptions with a GPT4 chatbot. Data is stored relationally in SQLite and also vectorized in Pinecone.

gpt4 langcha nltk openai python3 sqlite3 streamlit strean whisper

Last synced: 24 Oct 2024

https://github.com/bbc-esq/whisper-solo-with-gui

OpenAI's Whisper program with a simple lightweight GUI.

pyqt pyqt6 pyqt6-gui transcribe transcribe-audio-files translate whisper

Last synced: 11 Jan 2025

https://github.com/maawad/luna

Personal assistant

bot openai personal-assistant whisper

Last synced: 17 Dec 2024

https://github.com/huuquyet/phowhisper-next

Demo using PhoWhisper models of VinAI built with Transformers.js + Next.js

nextjs onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper

Last synced: 19 Dec 2024

https://github.com/valiantlynx/custom-whisper-api

This project provides a custom API wrapper for the open-source Whisper model using FastAPI. It allows you to integrate Whisper into your applications for automatic speech recognition (ASR) tasks.

ai docker-compose fastapi python whisper

Last synced: 10 Jan 2025

https://github.com/wtlow003/auto-subtitles

CLI tool to transcribe (+ translate) videos and embed subtitles automatically.

faster-whisper nllb subtitles subtitles-generator translation whisper whisper-cpp

Last synced: 15 Nov 2024

https://github.com/tobybenjaminclark/intermew

👨‍💻 Realistic, generative simulated interviews for Durhack 2024. Built using Webscraping, OpenCV, Deepface, Whisper, OpenAI and Gamemaker.

computer-vision openai-api whisper

Last synced: 25 Jan 2025

https://github.com/nexuslux/simultaneous-interpretation

Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.

agents asr faster-whisper openai pyaudio simultaneous-intepreting simultaneous-translation speech-recognition speech-to-text transcription translation whisper

Last synced: 09 Oct 2024

https://github.com/aquibali01/voice-to-text-and-voice-chatbot

Voice-to-Voice Chatbot using Whisper, LLaMA, and Groq API

chatbot gtts llama8b llm opeai python voice whisper

Last synced: 19 Dec 2024

https://github.com/man2dev/whisper-cpp

dev fork of https://src.fedoraproject.org/rpms/whisper-cpp

fedora fedora-repository linux whisper whisper-cpp whispercpp

Last synced: 09 Oct 2024

https://github.com/senkita/gabriel

Video summarization tool.

summarizer whisper

Last synced: 09 Oct 2024

https://github.com/sivakumar-mahalingam/subtitle-generator

🎞️ Automatically generating subtitles for video files using Whisper ASR model in Python

ai audio-model audio-processing automatic-speech-recognition openai-whisper python speech-recognition speech-to-text subtitle-generator whisper

Last synced: 09 Oct 2024

https://github.com/kitschpatrol/ambient-novel

An interface for nonlinear interactive exploration of a novel.

ambient book fiction interactive novel svelte whisper

Last synced: 20 Jan 2025

https://github.com/kristofferv98/whisper_turboapi

An optimized FastAPI server for OpenAI's Whisper whisper-large-v3-turbo model using MLX turbo optimization

ai api asynchronous audio audio-processing fastapi huggingface machine-learning macos mlx model-serving nlp openai optimization python speech-to-text synchronous transcription whisper whisper-turbo

Last synced: 14 Dec 2024

https://github.com/luluw8071/whisper-tune

Finetuning Whisper on your own voice

whisper

Last synced: 14 Dec 2024

https://github.com/teemow/mnote

Generates meeting notes and summaries from video recordings

ai chatgpt google-meet kubeai kubernetes meeting-minutes transcription video-transcription whisper

Last synced: 02 Feb 2025

https://github.com/iamarunbrahma/smart-voice-assistant

A simple voice assistant to get your queries in speech format and generate answers using ChatGPT API in both text and audio format.

chatgpt tts whisper

Last synced: 02 Feb 2025

https://github.com/educa-ch/educa24-speech-to-summary

Demonstrator for an open-source speech-to-summary workflow

langchain ollama open-source open-weight speech-to-text summarization whisper

Last synced: 11 Oct 2024

https://github.com/flo-bit/youtube-speaker-separation

Simple Python script that outputs separate audio files for each speaker in a YouTube video, using Whisper on Replicate

speaker-diarization speech-to-text text-to-speech voice-cloning whisper youtube

Last synced: 19 Dec 2024

https://github.com/thealphamerc/audio-to-text

Transcribe multi-lingual audio clips using whisper model

openai whisper

Last synced: 02 Feb 2025

https://github.com/khushijtrivedi/speech

The Assistive Speech Technology System is designed to enhance communication by analyzing and processing various speech and audio inputs.

ajax bigru-crf bootstrap flask flask-server html-css-javascript librosa python restapi-framework voice-recognition whisper

Last synced: 09 Oct 2024

https://github.com/tomdewildt/whisper-experiment

Experiments using the Whisper model from OpenAI

colab jupyter python transcribe transformers translate whisper

Last synced: 27 Dec 2024

https://github.com/suchith-2002/whisperwave

Transcribe any Audio to Text.

openai whisper

Last synced: 03 Feb 2025

https://github.com/fkiller/whispertranscript

Transcribe voice from mic input using OpenAI Whisper API.

llm openai transcribe transcript transcription webaudio whisper

Last synced: 06 Jan 2025

https://github.com/barrylee111/voicechat-llm

A chatbot with both prompt and voicechat capabilities leveraging LangChain, Elasticsearch, and FastAPI. When using voicechat, the user can immerse themselves in the experience by selecting a narrator, like a pirate for instance.

elasticsearch fastapi langchain largelanguagemodel python react speech-to-text tailwind text-to-speech typescript websocket whisper

Last synced: 19 Dec 2024

https://github.com/brunogaliati/speech2text-investments

This project automates the download, transcription, and summarization of audio from YouTube videos. Using OpenAI's Whisper model, it converts video content into concise text summaries with an investment analyst's perspective, ideal for professionals needing quick insights.

chatgpt investment openai politics python speech-recognition speech-to-text whisper

Last synced: 19 Dec 2024

https://github.com/pjarbas/azure-ai

Examples using Azure AI services (DALLE3, Text to Speech, Whisper)

azure-openai dalle-3 image-generation-ai speech-synthesis text-to-speech whisper

Last synced: 21 Jan 2025

https://github.com/huuquyet/phowhisper-small

Converted clone of PhoWhisper: Automatic Speech Recognition for Vietnamese (2024)

onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper

Last synced: 01 Feb 2025

https://github.com/danibcorr/university-helper

🧑‍🎓 University Helper streamlines academic and administrative tasks for students, educators, and researchers. It provides tools for managing document metadata, converting PDFs to Markdown, transcribing audio, analyzing grade statistics, and more.

deep-learning documentation-tool metadata ocr open-source pdf python statistics university whisper

Last synced: 19 Dec 2024

https://github.com/flaviodelgrosso/whisper-transcriber

Use OpenAI's Whisper to transcribe audio files and diarize the speakers in the transcribed text

ai audio-to-text diarization openai torch whisper

Last synced: 19 Dec 2024

https://github.com/deshwalmahesh/whisper-fastapi-realtime

A frontend + backend app that runs openai/whisper-large-v3-turbo on consumer-grade hardware to provide live audio transcription in real time

audio-transcription fastapi huggingface live pyaudio realtime transcription transformers whisper whisper-large

Last synced: 25 Oct 2024

https://github.com/evil0ctal/whisper-speech-to-text-api

An open source Speech-to-Text API. The project is based on OpenAI's Whisper model and uses the asynchronous features of FastAPI to efficiently wrap it and support more custom functions.

ai api fastapi openai-whisper speech-to-text speech-to-text-api whisper whisper-ai whisper-api

Last synced: 25 Oct 2024

https://github.com/egorsmkv/optimized-whisper-intel

Run quantized Whisper models only on CPU with Intel hardware

intel onnx onnxruntime quantized-neural-networks whisper

Last synced: 19 Dec 2024

https://github.com/devgeekm/chat-it-up

Chat It Up! elevates conversations by transforming YouTube URLs, documents, and audio into text, enabling interactive Q&A and summaries. With one click, turn media into time-saving, knowledge-rich dialogues.

ai azure azure-functions azureservices blob-storage fastapi python rag whisper youtube-dl

Last synced: 20 Dec 2024

https://github.com/bilelouahmed/vocal-assistant

Python voice assistant (based on SpeechRecognition, Whisper and XTTS models) designed to transcribe speech to text, translate across languages, engage in chat mode, and ultimately respond vocally.

chatbot llm mistral-7b neo4j python rag speech-recognition text-to-speech transcription whisper xtts

Last synced: 21 Dec 2024

https://github.com/soenneker/soenneker.libraries.whisper.ctranslate

Simply adds the Whisper_CTranslate2 Windows executable, updated daily (if available)

ai csharp ctranslate ctranslate2 dotnet faster libraries library whisper whisperctranslate

Last synced: 29 Dec 2024

https://github.com/vifill/audio-recorder-and-summarizer

This project is a Python script that records system audio on macOS using BlackHole, transcribes the audio using OpenAI's Whisper API, and summarizes the transcription using OpenAI's GPT models

ai audio blackhole gpt openai records summarize system whisper

Last synced: 20 Dec 2024

https://github.com/ekito-station/whisper-api-unity

A sample that performs transcription in Unity using the OpenAI Whisper API

unity whisper

Last synced: 20 Dec 2024

https://github.com/miosipof/asr_train

Fine-tuning OpenAI Whisper for ASR tasks on small datasets

asr machine-learning nlp whisper

Last synced: 07 Jan 2025

https://github.com/miosipof/whisper_inference

OpenAI Whisper ASR inference on CPU with OpenVino, PyTorch or Huggingface

asr inference machine-learning openvino pytorch whisper

Last synced: 07 Jan 2025

https://github.com/arkapravo-ghosh/speech-to-text

Speech to Text Transcription using OpenAI Whisper v3 and FastAPI

ai fastapi huggingface machine-learning openai python3 speech-to-text transformers whisper

Last synced: 21 Dec 2024

https://github.com/theaussiepom/wyoming-openai

OpenAI STT and TTS support for the Wyoming protocol

home-assistant home-assistant-assist openai sst tts whisper wyoming

Last synced: 21 Dec 2024

https://github.com/bluebirdback/groq-subtitles

Batch video subtitle generation using Groq Whisper API

groq speech-to-text subtitles video whisper

Last synced: 21 Dec 2024

https://github.com/josemarcosrf/Lexicap-QA

QA retrieval for Lex Fridman's podcast transcriptions

lexicap qa search whisper

Last synced: 24 Oct 2024

https://github.com/tylim88/Voicefu-back-end

Translate Speech Into Japanese

chatgpt speech-synthesis voicevox whisper

Last synced: 24 Oct 2024

https://github.com/aixerum/faster-whisper

faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models. This implementation is up to 4 times faster than openai/whisper for the same accuracy while using less memory. The efficiency can be further improved with 8-bit quantization on both CPU and GPU.

ctranslate2 gpu transcription whisper

Last synced: 07 Jan 2025

https://gitlab.com/ifrz/asr-multi-lite

Testing of the main ASR frameworks with reduced models for low-resource languages speech recognition

distilhubert wav2vec2 whisper

Last synced: 24 Oct 2024

https://github.com/zdwolfe/transcription-tools

Docker video transcriber, wrapper around OpenAI

openai transcription whisper whisper-ai

Last synced: 02 Jan 2025

https://github.com/ashot72/answering-questions-about-images

You can upload images, ask questions about images using voice prompts, then listen to the responses in voice

answering-questions blip-2-ai-model gtts large-language-models llm replicate speech-to-text text-to-speech whisper

Last synced: 30 Dec 2024

https://github.com/chloelavrat/speech-to-text-app

Speech-to-text web app based on Streamlit and Whisper that extracts a transcript from audio files or YouTube videos.

audio-processing machine-learning machinelearning speech-to-text streamlit streamlit-webapp stt whisper whisper-ai

Last synced: 02 Jan 2025