Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Whisper

Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder transformer trained on a large corpus of multilingual audio paired with transcripts. Whisper can be used for tasks such as multilingual speech transcription, speech translation into English, and spoken language identification. OpenAI has released the model code and weights as open source to encourage research in speech processing, and the model is also offered through OpenAI's API. The projects listed below build on Whisper for transcription, subtitling, voice assistants, and related speech-based applications.

https://github.com/schnoddelbotz/whisper-ui

Transcribe audio/video to text, locally on macOS, Linux and Windows. A simple whisper.cpp wrapper/UI built with Go/Fyne.

ffmpeg ffmpeg-wrapper fyne gui local privacy speech-to-text transcription whisper whisper-cpp

Last synced: 27 Jan 2025

https://github.com/team-mansumugang/mansumugang-backend

The Spring Boot application for the Mansumugang service.

aws github-actions jpa jpa-hibernate spring-boot whisper

Last synced: 09 Oct 2024

https://github.com/markshawn2020/2025-02-03_lex-fridman-deepseek

Transcription and translation scripts for the Lex Fridman podcast about DeepSeek, dated 2025-02-03

assemblyai deepl deepseek lexfridman whisper xunfei

Last synced: 04 Feb 2025

https://github.com/xeloxa/wtosrt

Effortlessly convert your whisper timestamped subtitles in an unknown/rarely used format to the more familiar SRT format.

conversion python srt-subtitles subtitle subtitle-edit subtitle-format timestamp timestamp-convert whisper

Last synced: 04 Feb 2025

https://github.com/roman01la/sub-deep

Transcribe and translate audio with AI

deepl transcribe translate whisper

Last synced: 30 Dec 2024

https://github.com/sumitesh9/localizedwhisper

An initiative to make OpenAI Whisper more localized by adding support for more languages.

albanian albanian-language huggingface openai speech speech-to-text whisper

Last synced: 02 Jan 2025

https://github.com/mickekring/top-of-mind-clara

Clara is a prototype that lets employees make their voices heard anonymously. An employee can speak or type what they want to say, and AI anonymizes it. The employee also has access to a chatbot to consult. All employees' thoughts are then analyzed and compiled in a dashboard.

ai chatbot feedback openai python streamlit transcription whisper

Last synced: 22 Dec 2024

https://github.com/jojasadventure/whisper-client

A very simple Python-based client for a Whisper-compatible endpoint

desktop-app dictation faster-whisper macos productivity python speech-to-text stt whisper

Last synced: 09 Oct 2024

https://github.com/sugarcane-mk/speaker_classification

This repository provides a Python script for extracting speech embeddings using OpenAI's Whisper model. The embeddings are high-dimensional feature vectors that capture the acoustic properties of the input audio. These embeddings can be used for downstream tasks such as speech classification, clustering, and speaker recognition.

asr classification feature-extraction openai speech-processing speech-recognition speech-to-text svm-classifier whisper

Last synced: 09 Jan 2025

https://github.com/huuquyet/phowhisper-next

Demo using PhoWhisper models of VinAI built with Transformers.js + Next.js

nextjs onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper

Last synced: 19 Dec 2024

https://github.com/maawad/luna

Personal assistant

bot openai personal-assistant whisper

Last synced: 17 Dec 2024

https://github.com/aws-samples/amazon-ivs-webgpu-captions-demo

This repository contains an experimental demo application that shows how you can add client-side auto-generated captions to Amazon IVS Real-time and Low-latency streams using transformers.js and WebGPU.

ai amazon-ivs aws captions experimental ivs-lowlatency ivs-realtime lambda lowlatency lvl-300 realtime serverless transformersjs web webgpu webrtc whisper

Last synced: 09 Oct 2024

https://github.com/chaoticbyte/audio-summarize

An audio summarizer (faster-whisper and BART glued together)

ai ai-summarizer audio bart ctranslate2 faster-whisper nlp speech-to-text summarization whisper

Last synced: 09 Oct 2024

https://github.com/adisol07/sharpspeech

SharpSpeech is a free, local, and open source tool for speech and wake word recognition.

audio speech speech-recognition speech-to-text wake-word-detection wakeword whisper whisper-ai

Last synced: 19 Dec 2024

https://github.com/crone-ai/force-align-wordstamps

Takes audio (mp3) and text input (string) and force aligns the text to the audio. Uses stable-ts and whisperx.

captions faster-whisper force-alignment stable-ts whisper

Last synced: 17 Jan 2025

https://github.com/rhysdg/whisper-onnx-python

A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph

ai chatbot cuda machine-learning onnxruntime speech-to-text whisper

Last synced: 09 Oct 2024

https://github.com/bigyaa/transcription-system

This versatile tool is designed for anyone in need of a robust solution for transcribing and diarizing large volumes of audio files. Whether you are dealing with terabytes or even larger quantities, our tool ensures efficient and accurate processing. Ideal for researchers, content creators, and businesses.

accessibility diarization speech-to-text storytelling-with-data transcription whisper

Last synced: 19 Dec 2024

https://github.com/notyusheng/transcribe-translate

Local web app for transcription and translation services for audio and video using Whisper models

docker full-stack nodejs react reactjs self-hosted speech-to-text transcribe translate whisper

Last synced: 11 Oct 2024

https://github.com/h3yn3s/tl-dl

A selfhostable webapp which helps you read those uselessly long (by nature) voice messages with the power of AI.

sveltekit tailwind whisper

Last synced: 24 Oct 2024

https://github.com/gamut73/quizinator

Generating quizzes, on Android, from YouTube videos.

kotlin-android llm python whisper

Last synced: 19 Dec 2024

https://github.com/bhattbhavesh91/openai-whisper-benchmarking

Comparing the performance of OpenAI's Whisper model on a GPU vs OpenAI's API

gpu openai speech-to-text whisper

Last synced: 16 Nov 2024

https://github.com/becomingbabyman/eunoia-desktop

Local desktop transcription and search for Apple voice memos and videos

search second-brain transcription videos voice-memos whisper

Last synced: 25 Dec 2024

https://github.com/slinusc/speaker_identification_evaluation

Evaluating the Effectiveness of Transformer Layers in Wav2Vec 2.0, XLS-R, and Whisper for Speaker Identification Tasks

wav2vec2 whisper xls-r

Last synced: 09 Oct 2024

https://github.com/jlcarveth/skreech

An HTTP API wrapper around Whisper for transcribing audio files.

speech-recognition speech-to-text whisper whisper-ai

Last synced: 19 Jan 2025

https://github.com/nerdimite/meetsy-backend

AI Backend for the Workshop on Building an End-to-End AI Meeting Assistant

gpt-3 nextjs sentence-transformers tailwindcss whisper

Last synced: 24 Oct 2024

https://github.com/abdnh/anki-asr

Anki add-on for speech recognition

anki anki-addon deepgram speech-recognition whisper

Last synced: 24 Nov 2024

https://github.com/xaionaro-go/speech

A Speech-To-Text (with translation) library for Go; currently uses Whisper (runs locally if needed; no API keys required)

ai converter go golang library module package speech speech-recognition speech-to-text text whisper

Last synced: 13 Jan 2025

https://github.com/jgw96/speech-to-text-web-toolkit

Making Speech-To-Text on the web easy, both local and in the cloud

ai lit transformersjs webcomponents whisper

Last synced: 01 Feb 2025

https://github.com/antoniosbarotsis/telegram-transcriber

A Telegram bot for transcribing voice messages

telegram transcribe voice whisper

Last synced: 26 Dec 2024

https://github.com/aaishikdutta/notebook-lm-podcast-audiogram

A simple project to convert notebook-lm audio (or any audio, for that matter) into a podcast audiogram with subtitles powered by OpenAI Whisper

audiogram openai podcast remotion whisper

Last synced: 08 Dec 2024

https://github.com/voqal/browser

Natural speech browsing for the software developers of tomorrow

cef jcef openai realtime-api voice voice-assistant voice-browser voice-commands voice-control whisper

Last synced: 20 Oct 2024

https://github.com/tposcic/audio-to-srt-transcriber

Audio-to-SRT transcriber in Python using Whisper for transcription and Tcl/Tk for the GUI

audio python3 srt transcription whisper

Last synced: 05 Jan 2025

https://github.com/stnderror/robotron

🤖 A personal robot assistant for Telegram

assistant bot dall-e gpt-35-turbo openai telegram-bot whisper

Last synced: 25 Jan 2025

https://github.com/drankush/voxrad

VOXRAD is a voice transcription application for radiologists leveraging locally deployed ASR and LLM models.

desktop-app ffmpeg gemini gpt llm macos medical-informatics multimodal natural-language-processing nlp openai openai-api productivity python radiology reporting transcription voice-recognition whisper windows

Last synced: 31 Jan 2025

https://github.com/seanvelasco/ai

Cloudflare AI challenge submission: Slater - your virtual foreign language friend

ai artificial-intelligence language-learning llama2 llm m2m100 machine-learning whisper

Last synced: 03 Feb 2025

https://github.com/thewh1teagle/whisper.zig

Transcribe audio with whisper in zig

asr openai whisper zig

Last synced: 24 Jan 2025

https://github.com/tracywong117/ai-learning-material-from-video

Support subtitling, translating, RAG to generate language learning material from video.

ai auto-subtitle gpt-translate groq groq-api rag subtitles-generator translate whisper

Last synced: 19 Jan 2025

https://github.com/wtlow003/auto-subtitles

CLI tool to transcribe (+ translate) videos and embed subtitles automatically.

faster-whisper nllb subtitles subtitles-generator translation whisper whisper-cpp

Last synced: 15 Nov 2024

https://github.com/pdcalado/waste

Whisper Audio Service for Transcription and Ergonomics

productivity rofi transcription tts whisper

Last synced: 21 Jan 2025

https://gitlab.com/ifrz/asr-multi-lite

Testing of the main ASR frameworks with reduced models for low-resource languages speech recognition

distilhubert wav2vec2 whisper

Last synced: 24 Oct 2024

https://github.com/josemarcosrf/Lexicap-QA

QA retrieval for Lex Fridman's podcast transcriptions

lexicap qa search whisper

Last synced: 24 Oct 2024

https://github.com/userpjm/whisper-youtube

Generate a SubRip subtitle file (srt) using Whisper for the audio of a YouTube video.

faster-whisper openai speech-to-text whisper

Last synced: 24 Oct 2024

https://github.com/willdphan/little-jarvis-whisper

Jarvis, a GPT Voice Assistant made with speech recognition, OpenAI's Whisper, and Gradio

gradio openai voice-assistant voice-recognition whisper

Last synced: 24 Oct 2024

https://github.com/devgeekm/chat-it-up

Chat It Up! elevates conversations by transforming YouTube URLs, documents, and audio into text, enabling interactive Q&A and summaries. With one click, turn media into time-saving, knowledge-rich dialogues.

ai azure azure-functions azureservices blob-storage fastapi python rag whisper youtube-dl

Last synced: 20 Dec 2024

https://github.com/egorsmkv/optimized-whisper-intel

Run quantized Whisper models only on CPU with Intel hardware

intel onnx onnxruntime quantized-neural-networks whisper

Last synced: 19 Dec 2024

https://github.com/flaviodelgrosso/whisper-transcriber

Use OpenAI's Whisper to transcribe audio files and diarize the speakers in the transcribed text

ai audio-to-text diarization openai torch whisper

Last synced: 19 Dec 2024

https://github.com/danibcorr/university-helper

🧑‍🎓 University Helper streamlines academic and administrative tasks for students, educators, and researchers. It provides tools for managing document metadata, converting PDFs to Markdown, transcribing audio, analyzing grade statistics, and more.

deep-learning documentation-tool metadata ocr open-source pdf python statistics university whisper

Last synced: 19 Dec 2024

https://github.com/brunogaliati/speech2text-investments

This project automates the download, transcription, and summarization of audio from YouTube videos. Using OpenAI's Whisper model, it converts video content into concise text summaries with an investment analyst's perspective, ideal for professionals needing quick insights.

chatgpt investment openai politics python speech-recognition speech-to-text whisper

Last synced: 19 Dec 2024

https://github.com/barrylee111/voicechat-llm

A chatbot with both prompt and voicechat capabilities leveraging LangChain, Elasticsearch, and FastAPI. When using voicechat, the user can immerse themselves in the experience by selecting a narrator, like a pirate for instance.

elasticsearch fastapi langchain largelanguagemodel python react speech-to-text tailwind text-to-speech typescript websocket whisper

Last synced: 19 Dec 2024

https://github.com/khushijtrivedi/speech

The Assistive Speech Technology System is designed to enhance communication by analyzing and processing various speech and audio inputs.

ajax bigru-crf bootstrap flask flask-server html-css-javascript librosa python restapi-framework voice-recognition whisper

Last synced: 09 Oct 2024

https://github.com/flo-bit/youtube-speaker-separation

A simple Python script that outputs separate audio files for each speaker in a YouTube video, using Whisper on Replicate

speaker-diarization speech-to-text text-to-speech voice-cloning whisper youtube

Last synced: 19 Dec 2024

https://github.com/nexuslux/simultaneous-interpretation

Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.

agents asr faster-whisper openai pyaudio simultaneous-intepreting simultaneous-translation speech-recognition speech-to-text transcription translation whisper

Last synced: 09 Oct 2024

https://github.com/aquibali01/voice-to-text-and-voice-chatbot

Voice-to-Voice Chatbot using Whisper, LLaMA, and Groq API

chatbot gtts llama8b llm opeai python voice whisper

Last synced: 19 Dec 2024

https://github.com/man2dev/whisper-cpp

dev fork of https://src.fedoraproject.org/rpms/whisper-cpp

fedora fedora-repository linux whisper whisper-cpp whispercpp

Last synced: 09 Oct 2024

https://github.com/senkita/gabriel

A video summarization tool.

summarizer whisper

Last synced: 09 Oct 2024

https://github.com/sivakumar-mahalingam/subtitle-generator

🎞️ Automatically generating subtitles for video files using Whisper ASR model in Python

ai audio-model audio-processing automatic-speech-recognition openai-whisper python speech-recognition speech-to-text subtitle-generator whisper

Last synced: 09 Oct 2024

https://github.com/kitschpatrol/ambient-novel

An interface for nonlinear interactive exploration of a novel.

ambient book fiction interactive novel svelte whisper

Last synced: 20 Jan 2025

https://github.com/kristofferv98/whisper_turboapi

An optimized FastAPI server for OpenAI's whisper-large-v3-turbo Whisper model, using MLX turbo optimization

ai api asynchronous audio audio-processing fastapi huggingface machine-learning macos mlx model-serving nlp openai optimization python speech-to-text synchronous transcription whisper whisper-turbo

Last synced: 14 Dec 2024

https://github.com/luluw8071/whisper-tune

Finetuning Whisper on your own voice

whisper

Last synced: 14 Dec 2024

https://github.com/MattCode64/Scriba_Front

SCRIBA is a web application that transcribes audio files. It supports .mp3 files and provides the transcription results in a user-friendly interface.

speech-to-text vite vue vuejs whisper

Last synced: 24 Oct 2024

https://github.com/teemow/mnote

Generates meeting notes and summaries from video recordings

ai chatgpt google-meet kubeai kubernetes meeting-minutes transcription video-transcription whisper

Last synced: 02 Feb 2025

https://github.com/vinayaktalukder17/Youtube-Transcribe-tool-

YouTube transcription tool that uses OpenAI's Whisper

chatgpt chatgpt3 gradio openai python whisper youtube

Last synced: 24 Oct 2024

https://github.com/iamarunbrahma/smart-voice-assistant

A simple voice assistant that takes your queries as speech and generates answers with the ChatGPT API in both text and audio formats.

chatgpt tts whisper

Last synced: 02 Feb 2025

https://github.com/educa-ch/educa24-speech-to-summary

Demonstrator for an open-source speech-to-summary workflow

langchain ollama open-source open-weight speech-to-text summarization whisper

Last synced: 11 Oct 2024

https://github.com/thealphamerc/audio-to-text

Transcribe multi-lingual audio clips using whisper model

openai whisper

Last synced: 02 Feb 2025

https://github.com/EvilFreelancer/whisper-tests

Collection of experiments on OpenAI Whisper models

api-server docker-compose testing transcription whisper

Last synced: 24 Oct 2024

https://github.com/tomdewildt/whisper-experiment

Experiments using the Whisper model from OpenAI

colab jupyter python transcribe transformers translate whisper

Last synced: 27 Dec 2024

https://github.com/suchith-2002/whisperwave

Transcribe any Audio to Text.

openai whisper

Last synced: 03 Feb 2025

https://github.com/fkiller/whispertranscript

Transcribe voice from mic input using OpenAI Whisper API.

llm openai transcribe transcript transcription webaudio whisper

Last synced: 06 Jan 2025

https://github.com/jplhughes/whisper_logit_lens

This Alignment Jam Hackathon project explores whether the concept of "logit lens" applies to the encoder and decoder layers in Whisper, an end-to-end speech recognition model.

alignment-jam asr interpretability interpretability-jam logitlens whisper

Last synced: 24 Oct 2024

https://github.com/pjarbas/azure-ai

Examples using Azure AI services (DALLE3, Text to Speech, Whisper)

azure-openai dalle-3 image-generation-ai speech-synthesis text-to-speech whisper

Last synced: 21 Jan 2025

https://github.com/huuquyet/phowhisper-small

Converted clone of PhoWhisper: Automatic Speech Recognition for Vietnamese (2024)

onnx-models phowhisper speech-recognition transformersjs vietnamese vinai whisper

Last synced: 01 Feb 2025

https://github.com/deshwalmahesh/whisper-fastapi-realtime

A frontend and backend app that runs openai/whisper-large-v3-turbo on a consumer-grade system to provide real-time live audio transcription

audio-transcription fastapi huggingface live pyaudio realtime transcription transformers whisper whisper-large

Last synced: 25 Oct 2024

https://github.com/evil0ctal/whisper-speech-to-text-api

An open source Speech-to-Text API. The project is based on OpenAI's Whisper model and uses the asynchronous features of FastAPI to efficiently wrap it and support more custom functions.

ai api fastapi openai-whisper speech-to-text speech-to-text-api whisper whisper-ai whisper-api

Last synced: 25 Oct 2024

https://github.com/bilelouahmed/vocal-assistant

Python voice assistant (based on SpeechRecognition, Whisper and XTTS models) designed to transcribe speech to text, translate across languages, engage in chat mode, and ultimately respond vocally.

chatbot llm mistral-7b neo4j python rag speech-recognition text-to-speech transcription whisper xtts

Last synced: 21 Dec 2024

https://github.com/soenneker/soenneker.libraries.whisper.ctranslate

Simply adds the Whisper_CTranslate2 Windows executable, updated daily (if available)

ai csharp ctranslate ctranslate2 dotnet faster libraries library whisper whisperctranslate

Last synced: 29 Dec 2024

https://github.com/baomeomeo/speech

A Speech-To-Text (with translation) library for Go; currently uses Whisper (runs locally if needed; no API keys required)

ai converter go golang library module package speech speech-recognition speech-to-text text whisper

Last synced: 13 Jan 2025

https://github.com/status-im/infra-role-status-go

Ansible role for status-go

ansible-role infra waku whisper

Last synced: 05 Jan 2025

https://github.com/vifill/audio-recorder-and-summarizer

This project is a Python script that records system audio on macOS using BlackHole, transcribes the audio using OpenAI's Whisper API, and summarizes the transcription using OpenAI's GPT models

ai audio blackhole gpt openai records summarize system whisper

Last synced: 20 Dec 2024

https://github.com/ekito-station/whisper-api-unity

A sample that performs transcription in Unity using the OpenAI Whisper API

unity whisper

Last synced: 20 Dec 2024

https://github.com/miosipof/asr_train

Fine-tuning OpenAI Whisper for ASR tasks on small datasets

asr machine-learning nlp whisper

Last synced: 07 Jan 2025

https://github.com/miosipof/whisper_inference

OpenAI Whisper ASR inference on CPU with OpenVino, PyTorch or Huggingface

asr inference machine-learning openvino pytorch whisper

Last synced: 07 Jan 2025

https://github.com/arkapravo-ghosh/speech-to-text

Speech to Text Transcription using OpenAI Whisper v3 and FastAPI

ai fastapi huggingface machine-learning openai python3 speech-to-text transformers whisper

Last synced: 21 Dec 2024

https://github.com/theaussiepom/wyoming-openai

OpenAI STT and TTS support for the Wyoming protocol

home-assistant home-assistant-assist openai sst tts whisper wyoming

Last synced: 21 Dec 2024

https://github.com/studiowebux/tommygotchi

whisper, piper, llama-gpt, python, fun .. so much fun !

llama-gpt piper python3 whisper whisper-ai

Last synced: 05 Jan 2025