Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Whisper

Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder Transformer trained on roughly 680,000 hours of multilingual, multitask audio collected from the web. Whisper can be used for multilingual speech transcription, speech translation into English, and spoken-language identification. OpenAI has released the model weights and inference code as open source under the MIT license, so it is publicly available and is widely used as a building block for subtitling, voice assistants, and other speech-based applications.

GitHub: https://github.com/topics/whisper
Repo: https://github.com/openai/whisper
Created by: OpenAI
Released: September 2022
Related Topics: machine-learning, artificial-intelligence, language-modeling
Last updated: 2025-01-22 00:34:31 UTC

https://github.com/vimwei/whispertranscriber

Whisper Transcribe and srt Resegment

speech-to-text subtitle whisper

Last synced: 17 Oct 2024

https://github.com/sovit-123/sam_molmo_whisper

An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language.

molmo segment-anything-model segmentanythingmodel vlm whisper

Last synced: 18 Oct 2024

https://github.com/alessioborgi/stylealigned_multireference-multimodal

Novel framework for Zero-Shot Style Alignment in Text-to-Image generation, incorporating Multi-Modal Context-Awareness and Multi-Reference Style Alignment, using minimal attention sharing, ensuring consistent style transfer without fine-tuning.

adain blip clap context-awareness multi-modal multi-style-transfer no-fine-tuning shared-attention-heads style-aligned text-to-image-generation whisper zero-shot-learning

Last synced: 18 Oct 2024

https://github.com/andreabak/whispersubs

Generate subtitles for your video or audio files using the power of AI

ai cuda deep-learning gpu-acceleration machine-learning srt subtitles transcribe transcription translate whisper

Last synced: 16 Nov 2024

https://github.com/fer14/videoseek

Intelligent video search tool powered by AI

bert timestamp video whisper youtube-api

Last synced: 14 Jan 2025

https://github.com/ashot72/speech-to-text-to-image

Generates text from your voice, then images from the text

chatgpt large-language-models llm replicate speech-to-text speechtotext stability-ai text-to-image texttoimage whisper whisper-ai

Last synced: 30 Dec 2024

https://github.com/alancunningham/chatgpt-assistant

A ChatGPT assistant with voice activation and image generation, connected to a Raspberry Pi display.

chatgpt chatgpt-api dall-e dall-e-api porcupine python raspberry-pi whisper

Last synced: 06 Jan 2025

https://github.com/saadkh1/docqa-textsummarization-app

A Streamlit app for document question answering and text summarization.

langchain llama-2 llamacpp pytesseract question-answering streamlit summarization whisper

Last synced: 07 Jan 2025

https://github.com/i4ds/whisper-prep

Data preparation utility for the finetuning of OpenAI's Whisper model.

fine-tuning nlp speech-to-text whisper

Last synced: 09 Nov 2024

https://github.com/stefanasandei/youtube-to-text

Speech to text for any YouTube video.

ai api flask openai python server speech-to-text web-server whisper youtube youtube-dl

Last synced: 04 Jan 2025

https://github.com/sonhm3029/realtime-vietnamese-asr-react-native-and-whisper

This project implements end-to-end real-time Vietnamese speech recognition, with PhoWhisper in the backend and a React Native frontend

asr phowhiper react-native realtime realtime-speech-recognition speech-recognition speech-to-text vietnamese whisper

Last synced: 16 Nov 2024

https://github.com/ksylvest/omniai-openai

An implementation of the OmniAI interface for OpenAI.

chatgpt omniai openai ruby whisper

Last synced: 10 Jan 2025

https://github.com/knot-inc/john

John is a web app that records video, analyzes audio with AI, and identifies the speaker's native language from their English accent, simplifying language assessment.

audio-analysis machine-learning whisper

Last synced: 17 Nov 2024

https://github.com/abhishtagatya/polly

☎️ Language Learning Chatbot

chatbot chatgpt python telegram whisper

Last synced: 17 Nov 2024

https://github.com/jemtaly/whispering

A real-time transcription and translation tool implemented in Python based on the fast-whisper library.

live-caption python real-time-transcription real-time-translation tkinter transcription translation whisper

Last synced: 09 Jan 2025

https://github.com/gurpreetkaurjethra/multimodal-ai-app-using-llava-7b

Multimodal AI App using Llava 7B and Gradio

ai generative-ai gradio large-language-models llava llavacpp llm multimodal voice-assistant whisper

Last synced: 22 Nov 2024

https://github.com/williamwa/mssmith

A Telegram bot that utilizes the ChatGPT API and can communicate through voice.

chatpgt-api telegram-bot tts whisper

Last synced: 31 Dec 2024

https://github.com/benitomartin/youtube-llm

LLM Q&A and Summarization App

chromadb langchain python streamlit whisper

Last synced: 31 Dec 2024

https://github.com/tonywu71/distilling-and-forgetting-in-large-pre-trained-models

Code for my dissertation on "Distilling and Forgetting in Large Pre-Trained Models" for the MPhil in Machine Learning and Machine Intelligence (MLMI) at the University of Cambridge.

continual-learning distillation speech-recognition whisper

Last synced: 04 Dec 2024

https://github.com/amir-mohseni/voicebridge

This repository provides a dockerized Speech-to-Speech application that supports text-to-audio conversion, audio-to-text transcription, and interactive voice-based conversations. It is easy to set up and use, offering a versatile platform for speech and text processing.

docker huggingface python transformer tts whisper

Last synced: 17 Jan 2025

https://github.com/amgawishx/voiceworker

A Web App UI for OpenAI's Whisper model for audio transcription and translation.

ai audio-processing python streamlit transcription translation webapp whisper

Last synced: 17 Jan 2025

https://github.com/jacoblincool/wft

Run Whisper fine-tuning with ease—it works on MPS, CUDA, and CPU without code changes.

fine-tuning whisper

Last synced: 11 Dec 2024

https://github.com/bhattbhavesh91/neo4j-palm2-makersuite

Explore how to build a Q&A system on Neo4j using Google's Palm2 model with MakerSuite in this repository.

google google-api google-palm maker-suite neo4j-driver neo4j-python-scripts palm2 python table-qa voice-assistant whisper

Last synced: 17 Jan 2025

https://github.com/phidlarkson/whisper-stt-api

Easy setup for Whisper speech-to-text

api flask speech-to-text whisper

Last synced: 01 Jan 2025

https://github.com/imsanjoykb/speech-nlp-bootcamp

Speech NLP Bootcamp

asr audio-analysis audio-applications bangla-nlp huggingface-transformers seq2seq speech speech-recognition tts wav2vec2 whisper

Last synced: 18 Jan 2025

https://github.com/gamut73/quizinator

Generating quizzes, on Android, from YouTube videos.

kotlin-android llm python whisper

Last synced: 19 Dec 2024

https://github.com/nerdimite/meetsy-backend

AI Backend for the Workshop on Building an End-to-End AI Meeting Assistant

gpt-3 nextjs sentence-transformers tailwindcss whisper

Last synced: 24 Oct 2024

https://github.com/chinese-soup/cbot-telegram-whisper

Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.

bot cpu-inference golang openai speech-recognition speech-to-text whisper whisper-cpp whispercpp

Last synced: 17 Jan 2025

https://github.com/brentwong-kiel1997/brents_ai_language_school

Use AI such as ChatGPT and Whisper to learn foreign languages from YouTube videos

ai chatgpt foreign-language openai openai-api whisper whisper-ai youtube

Last synced: 31 Dec 2024

https://github.com/egorsmkv/star-adapt-uk

Fork of https://github.com/YUCHEN005/STAR-Adapt with some modifications for Ukrainian.

asr speech-recognition ukrainian whisper

Last synced: 19 Dec 2024

https://github.com/slinusc/speaker_identification_evaluation

Evaluating the Effectiveness of Transformer Layers in Wav2Vec 2.0, XLS-R, and Whisper for Speaker Identification Tasks

wav2vec2 whisper xls-r

Last synced: 09 Oct 2024

https://github.com/gabriellopesdesouza2002/funcspy

Functions to help you develop any program or script you want

automation chatbot dall-e email email-library ocr openai-api openai-chatgpt openai-whisper pdf pdf-tools python regex selenium selenium-webdriver whisper

Last synced: 30 Oct 2024

https://github.com/nri12/filter_voice

A project for filtering and muting chosen keywords in videos

python tools whisper

Last synced: 19 Dec 2024

https://github.com/oov/aviutl_subtitler

A plugin for transcribing speech with Whisper in an AviUtl + Extended Editing environment

aviutl aviutl-plugin whisper

Last synced: 19 Dec 2024

https://github.com/szilvia-csernus/openai-audio-api-calls

Speech-to-text and text-to-speech API call examples, using OpenAI's whisper-1 and tts-1 models.

jupyter-notebook openai openai-api tts-1 whisper

Last synced: 09 Oct 2024

https://github.com/canaxs/whisper-core

An application where users can make rumor-based news and earn money in return.

mysql panel spring spring-boot whisper

Last synced: 19 Dec 2024

https://github.com/team-mansumugang/mansumugang-backend

The Spring Boot application for the Mansumugang service.

aws github-actions jpa jpa-hibernate spring-boot whisper

Last synced: 09 Oct 2024

https://github.com/niqifan007/openai-tts-stt-streamlit

A GUI for OpenAI's TTS (text-to-speech) and STT (speech-to-text) APIs, built with Streamlit, with a history function

openai openai-api streamlit stt-gui tts tts-gui whisper whisper-api

Last synced: 09 Oct 2024

https://github.com/breadrock1/audio-to-text

A simple backend project that uses whisper-rs.

actix-web audio-to-text rust swagger-ui whisper

Last synced: 10 Jan 2025

https://github.com/doctorpok42/pheere-app

Pheere is a simple virtual assistant

ai chatgpt desktop-app elevenlabs nextjs scss tauri ts virtual-assistant whisper

Last synced: 10 Jan 2025

https://github.com/saamerm/whisperkit-ios15

iOS 15 - On-device Inference of Whisper Speech Recognition Models for Apple Silicon

ios ios15 swiftui whisper whisper-ai

Last synced: 19 Jan 2025

https://github.com/aeronjl/transcribe

Python package for accurate audio transcription with speaker diarisation

audio-transcription gpt speaker-diarization whisper

Last synced: 09 Oct 2024

https://github.com/bhattbhavesh91/openai-whisper-benchmarking

Comparing the performance of OpenAI's Whisper model on a GPU vs OpenAI's API

gpu openai speech-to-text whisper

Last synced: 16 Nov 2024

https://github.com/toLSC/tolsc-speech-to-text

Speech to text service for toLSC app implemented with OpenAI Whisper model

fastapi python speech-recognition speech-to-text tts whisper

Last synced: 24 Oct 2024

https://github.com/tposcic/audio-to-srt-transcriber

Audio-to-SRT transcriber in Python using Whisper for transcription and Tcl/Tk for the GUI

audio python3 srt transcription whisper

Last synced: 05 Jan 2025

https://github.com/platput/pysubs

API for getting audio transcriptions of video files from YouTube, AWS S3, and similar sources, using OpenAI Whisper

openai whisper

Last synced: 24 Oct 2024

https://github.com/tracywong117/ai-learning-material-from-video

Supports subtitling, translation, and RAG to generate language-learning material from video.

ai auto-subtitle gpt-translate groq groq-api rag subtitles-generator translate whisper

Last synced: 19 Jan 2025

https://github.com/tranbavinhson/eth-decentralized-chat

Decentralized chat app by Ethereum Whisper protocol + Vuejs

ethereum vue vuejs whisper whisper-protocol

Last synced: 26 Dec 2024

https://github.com/aitor-alvarez/large-speech-models

Fine-tuning Multilingual Large Speech Recognition Models: Wav2vec and Whisper

arabic-speech-recognition asr asr-model finetuning-wav2vec finetuning-whisper large-speech-models speech-recognition-model wav2vec2 whisper

Last synced: 25 Nov 2024

https://github.com/ayeshaaaaaaaaa/ai-powered-video-analysis-with-object-detection-and-detailed-scene-narratives

AI-driven video analysis system that extracts and transcribes audio with Whisper, detects objects using YOLO, and generates comprehensive scene descriptions with GPT-2. The project combines transcriptions and object detections to produce detailed, context-aware video narratives.

bart gpt2 video-analysis whisper yolov8

Last synced: 02 Jan 2025

https://github.com/antoniosbarotsis/telegram-transcriber

A Telegram bot for transcribing voice messages

telegram transcribe voice whisper

Last synced: 26 Dec 2024

https://github.com/crone-ai/force-align-wordstamps

Takes audio (mp3) and text input (string) and force aligns the text to the audio. Uses stable-ts and whisperx.

captions faster-whisper force-alignment stable-ts whisper

Last synced: 17 Jan 2025

https://github.com/lelserslasers/transcriberplus

Transcribe your files with ease!

flask python socket-io svelte trancribe whisper

Last synced: 25 Nov 2024

https://github.com/mooerslab/bash-whisper-transcription

Bash function to ease the transcription of audio files with OpenAI's whisper.

asr audio audio-file-trancription audio-messages automate-the-boring-stuff automatic-speech-recognition automation bash bash-function beginner-friendly speech-to-text stt whisper

Last synced: 14 Dec 2024

https://github.com/mikeesto/subber

A small CLI tool for converting video & audio to a text transcription

audio cli ffmpeg golang transcribe video whisper

Last synced: 19 Dec 2024

https://github.com/xaionaro-go/speech

A Speech-To-Text (with translation) library for Go; currently uses Whisper (runs locally if needed; no API keys required)

ai converter go golang library module package speech speech-recognition speech-to-text text whisper

Last synced: 13 Jan 2025

https://github.com/adamelkholyy/whisper-yt

Toolkit for using Whisper to transcribe YouTube videos. Includes Whisper transcription of YouTube videos, conversion of YouTube video into HuggingFace dataset (using audio and subtitles) and evaluation of Whisper transcription against YouTube subtitles

asr diarization huggingface-datasets pyannote transcription whisper word-error-rate youtube

Last synced: 10 Dec 2024

https://github.com/becomingbabyman/eunoia-desktop

Local desktop transcription and search for Apple Voice Memos and videos

search second-brain transcription videos voice-memos whisper

Last synced: 25 Dec 2024

https://github.com/pkarpovich/kira-client

An AI-powered voice automation tool for IoT, integrating voice-triggered commands, OpenAI-driven intent recognition, and HTTP server management for seamless control of smart devices

ai-assistant intent-classification porcupine trigger-word-detection whisper

Last synced: 13 Jan 2025

https://github.com/maawad/luna

Personal assistant

bot openai personal-assistant whisper

Last synced: 17 Dec 2024

https://github.com/aaishikdutta/notebook-lm-podcast-audiogram

A simple project to convert NotebookLM audio (or any audio, for that matter) into a podcast audiogram with subtitles, powered by OpenAI Whisper

audiogram openai podcast remotion whisper

Last synced: 08 Dec 2024

https://github.com/fukuro-kun/wortweber

Wortweber is an open-source project under development that explores real-time speech transcription with AI technology. It serves as a learning and experimentation platform for speech recognition in German and English.

speech-to-text whisper

Last synced: 17 Jan 2025

https://github.com/etienneab3d/srt-sync

Synchronize SRT timestamps over an existing accurate transcription

aligner asr nlp subtitles text-to-speech whisper

Last synced: 19 Dec 2024

https://github.com/huuquyet/phowhisper-next

Demo using PhoWhisper models of VinAI built with Transformers.js + Next.js

nextjs onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper

Last synced: 19 Dec 2024

https://github.com/juanestban/whisper-tnode

cli ts typescript whisper whisper-cpp whisper-ia whisper-node whisper-node-ts

Last synced: 21 Dec 2024

https://github.com/aspadax/subtitlegenerator

Automatically generate a subtitle for your video.

gpt machine-learning openai rust streamlit subtitles-generator whisper

Last synced: 09 Oct 2024

https://github.com/shtirmann/v2t

Telegram bot which automatically transcribes all voice and video messages to text.

ai aiogram faster-whisper python telegram-bot telegram-bot-python voice-to-text whisper

Last synced: 09 Oct 2024

https://github.com/i4ds/whisper-finetune

This repository contains code for fine-tuning the Whisper speech-to-text model.

fine-tuning nlp speech-to-text whisper

Last synced: 09 Oct 2024

https://github.com/voqal/browser

Natural speech browsing for the software developers of tomorrow

cef jcef openai realtime-api voice voice-assistant voice-browser voice-commands voice-control whisper

Last synced: 20 Oct 2024

https://github.com/elmiraghorbani/gpt-speaker-diarization

Conversational speaker diarization using OpenAI language models (GPT-4) and OpenAI Whisper.

asr diarization gpt-4 openai speaker-diarization speech-recognition speech-to-text voice-activity-detection whisper youtube-dl

Last synced: 29 Nov 2024

https://github.com/roman01la/sub-deep

Transcribe and translate audio with AI

deepl transcribe translate whisper

Last synced: 30 Dec 2024

https://github.com/sumitesh9/localizedwhisper

An initiative to make OpenAI Whisper more localized by adding support for more languages.

albanian albanian-language huggingface openai speech speech-to-text whisper

Last synced: 02 Jan 2025

https://github.com/brentwong-kiel1997/ai_language_school_based_on_django_and_openai

Django and OpenAI API example use case

django gpt-4 openai openai-api whisper

Last synced: 09 Oct 2024

https://github.com/valiantlynx/custom-whisper-api

This project provides a custom API wrapper for the open-source Whisper model using FastAPI. It allows you to integrate Whisper into your applications for automatic speech recognition (ASR) tasks.

ai docker-compose fastapi python whisper

Last synced: 10 Jan 2025

https://github.com/wtlow003/auto-subtitles

CLI tool to transcribe (+ translate) videos and embed subtitles automatically.

faster-whisper nllb subtitles subtitles-generator translation whisper whisper-cpp

Last synced: 15 Nov 2024

https://github.com/jlcarveth/skreech

An HTTP API wrapper around Whisper for transcribing audio files.

speech-recognition speech-to-text whisper whisper-ai

Last synced: 19 Jan 2025

https://github.com/jowadev/interview

Interview is an interactive application crafted to empower both students and professionals in honing their skills for job interviews.

interview-preparation job-interviews nextjs professional students whisper

Last synced: 14 Dec 2024

https://github.com/extrange/transcription-benchmarks

Speech to text model benchmarks

transcription whisper

Last synced: 08 Dec 2024

https://github.com/bbc-esq/whisper-solo-with-gui

OpenAI's Whisper program with a simple lightweight GUI.

pyqt pyqt6 pyqt6-gui transcribe transcribe-audio-files translate whisper

Last synced: 11 Jan 2025

https://github.com/mikeesto/whispercpp-android

An Android app using whisper.cpp to do voice-to-text transcriptions

android kotlin speech-to-text whisper whisper-cpp

Last synced: 17 Dec 2024

https://github.com/abdnh/anki-asr

Anki add-on for speech recognition

anki anki-addon deepgram speech-recognition whisper

Last synced: 24 Nov 2024

https://github.com/maylad31/colab-codes

some useful colab files

clip colab-notebook speech-recognition whisper zero-shot-classification

Last synced: 11 Jan 2025

https://github.com/lidedongsn/cut.ai

cut.ai is an AI audio and video editing tool, with speech transcription based on Whisper

whisper whisper-ui

Last synced: 17 Jan 2025

https://github.com/winstxnhdw/capgen

A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.

asr automatic-speech-recognition caddy ctranslate2 docker fastapi huggingface huggingface-spaces uvicorn-gunicorn whisper

Last synced: 23 Oct 2024

https://github.com/topdev0215/AudioMultifunctionChatbot

This app lets users record or upload audio files. It then uses the OpenAI API (Whisper, GPT-4) to generate transcriptions, summaries, fact checks, sentiment analysis, and text metrics. Users can also chat about their transcriptions with a GPT-4 chatbot. Data is stored relationally in SQLite and also vectorized in Pinecone.

gpt4 langcha nltk openai python3 sqlite3 streamlit strean whisper

Last synced: 24 Oct 2024

https://github.com/vi-ssh-al/auto-caption-generator

flask genai whisper

Last synced: 12 Oct 2024

https://github.com/jojasadventure/whisper-client

Very simple Python based client for Whisper compatible endpoint

desktop-app dictation faster-whisper macos productivity python speech-to-text stt whisper

Last synced: 09 Oct 2024

https://github.com/marquesafonso/multilang-asr-captioner

A multilingual automatic speech recognition and video captioning tool using faster-whisper. Supports real-time translation to English. Runs on a consumer-grade CPU.

automatic-speech-recognition captioning-videos faster-whisper whisper

Last synced: 24 Oct 2024

https://github.com/Op27/meeting_minutes_generator

This Python application automates the process of generating meeting minutes from an audio recording. It uses the Whisper library for transcription and the OpenAI GPT models for summarizing content, then outputs the result in a Word document.

ai audio-processing document-automation meeting-minutes openai python speech-recognition text-summarization transcription whisper

Last synced: 24 Oct 2024

https://github.com/utrechtuniversity/transcription-d-lucea

python utrecht-university whisper

Last synced: 22 Nov 2024

https://github.com/TranBaVinhSon/eth-decentralized-chat

Decentralized chat app by Ethereum Whisper protocol + Vuejs

ethereum vue vuejs whisper whisper-protocol

Last synced: 24 Oct 2024

https://github.com/lazauk/aoai-entraidauth-sdkv1

Authenticating with Entra ID (formerly Azure AD) to access Azure OpenAI models in Python SDK v1.x

ai authentication azure azure-active-directory dall-e embeddings entra-id gpt openai whisper

Last synced: 12 Jan 2025

https://github.com/nerdimite/meetsy-app

Frontend for the Workshop on Building an End-to-End AI Meeting Assistant

gpt-3 nextjs sentence-transformers tailwindcss whisper

Last synced: 24 Oct 2024

https://github.com/shani-sinojiya/sandalquest

AI/ML project for recognizing colloquial Kannada speech and building a speech-based Q&A system focused on sandalwood cultivation.

ai audio-processing data-augmentation deep-learning machine-learning mongodb nlp python pytorch question-answering speech-based-question-answering-system speech-recognition whisper

Last synced: 10 Jan 2025

https://github.com/cris-m/langgraph_examples

duckduckgo kokoro langgraph llama3-2 whisper

Last synced: 18 Jan 2025

https://github.com/h3yn3s/tl-dl

A self-hostable web app which helps you read those uselessly long (by nature) voice messages with the power of AI.

sveltekit tailwind whisper

Last synced: 24 Oct 2024

https://github.com/aws-samples/amazon-ivs-webgpu-captions-demo

This repository contains an experimental demo application that shows how you can add client-side auto-generated captions to Amazon IVS Real-time and Low-latency streams using transformers.js and WebGPU.

ai amazon-ivs aws captions experimental ivs-lowlatency ivs-realtime lambda lowlatency lvl-300 realtime serverless transformersjs web webgpu webrtc whisper