Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Whisper

Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It uses an encoder-decoder transformer architecture trained on a large corpus of multilingual audio paired with transcripts, and it can transcribe speech, translate speech into English, and identify the spoken language. OpenAI has released the model code and weights as open source, and most of the projects below build on it as ports, faster runtimes, fine-tuning tools, API clients, or end-user applications. The topic also includes a few unrelated projects that share the name, such as the Whisper time-series database format for Graphite.

https://github.com/ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++

inference openai speech-recognition speech-to-text transformer whisper

Last synced: 23 Dec 2024

https://github.com/m-bain/whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text whisper

Last synced: 23 Dec 2024

https://github.com/chidiwilliams/Buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

whisper

Last synced: 02 Nov 2024

https://github.com/chidiwilliams/buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

whisper

Last synced: 23 Dec 2024

https://github.com/paddlepaddle/paddlespeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 23 Dec 2024

https://github.com/PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 29 Oct 2024

https://github.com/m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text whisper

Last synced: 25 Oct 2024

https://github.com/modelscope/funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper

Last synced: 24 Dec 2024

https://github.com/xorbitsai/inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

artificial-intelligence chatglm deployment flan-t5 gemma ggml glm4 inference llama llama3 llamacpp llm machine-learning mistral openai-api pytorch qwen vllm whisper wizardlm

Last synced: 23 Dec 2024

https://github.com/modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper

Last synced: 29 Oct 2024

https://github.com/sanchit-gandhi/whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

deep-learning jax speech-recognition speech-to-text whisper

Last synced: 26 Dec 2024

https://github.com/wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

asr automatic-speech-recognition conformer e2e-models production-ready pytorch speech-recognition transformer whisper

Last synced: 23 Dec 2024

https://github.com/leetcode-mafia/cheetah

Mac app for crushing remote tech interviews with AI

ai chatgpt gpt gpt-4 openai swift swiftui whisper whisper-cpp

Last synced: 27 Dec 2024

https://github.com/argmaxinc/whisperkit

On-device Speech Recognition for Apple Silicon

inference ios macos speech-recognition swift transformers visionos watchos whisper

Last synced: 24 Dec 2024

https://github.com/nexaai/nexa-sdk

Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.

asr edge-computing language-model llm on-device-ai on-device-ml sdk sdk-python stable-diffusion transformers tts vlm whisper

Last synced: 24 Dec 2024

https://github.com/argmaxinc/WhisperKit

On-device Speech Recognition for Apple Silicon

inference ios macos speech-recognition swift transformers visionos watchos whisper

Last synced: 31 Oct 2024

https://github.com/embarklabs/embark

Framework for serverless Decentralized Applications using Ethereum, IPFS and other platforms

blockchain dapp decentralized ethereum framework ipfs serverless smart-contracts swarm whisper

Last synced: 23 Dec 2024

https://iurimatias.github.io/embark-framework

Framework for serverless Decentralized Applications using Ethereum, IPFS and other platforms

blockchain dapp decentralized ethereum framework ipfs serverless smart-contracts swarm whisper

Last synced: 13 Oct 2024

https://github.com/huggingface/distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

audio speech-recognition whisper

Last synced: 24 Dec 2024

https://github.com/MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 31 Oct 2024

https://github.com/mahmoudashraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 24 Dec 2024

https://github.com/grt1228/chatgpt-java

ChatGPT Java SDK with support for streaming output, GPT plugins, and internet access. Supports all official OpenAI APIs. A Java client for ChatGPT. OpenAI GPT-3.5-Turbo and GPT-4 API client for Java

chatgpt chatgpt-java gpt-35-turbo gpt-4 gpt-plugins java openai-api openai-chatgpt openai-images openai-whisper tiktoken-java whisper

Last synced: 24 Dec 2024

https://github.com/Grt1228/chatgpt-java

ChatGPT Java SDK with support for streaming output, GPT plugins, and internet access. Supports all official OpenAI APIs. A Java client for ChatGPT. OpenAI GPT-3.5-Turbo and GPT-4 API client for Java

chatgpt chatgpt-java gpt-35-turbo gpt-4 gpt-plugins java openai-api openai-chatgpt openai-images openai-whisper tiktoken-java whisper

Last synced: 03 Nov 2024

https://github.com/n3d1117/chatgpt-telegram-bot

🤖 A Telegram bot that integrates with OpenAI's official ChatGPT APIs to provide answers, written in Python

chatgpt dall-e openai python telegram-bot whisper

Last synced: 24 Dec 2024

https://github.com/betalgo/openai

OpenAI .NET sdk - Azure OpenAI, ChatGPT, Whisper, and DALL-E

azure-openai chatgpt csharp dall-e dotnet gpt-3 gpt-4 openai openai-api sdk whisper whisper-ai

Last synced: 24 Dec 2024

https://github.com/alexrudall/ruby-openai

OpenAI API + Ruby! 🤖❤️

ai api-client chatgpt dall-e gpt-3 gpt-4 openai rails ruby whisper

Last synced: 23 Dec 2024

https://github.com/samuraigpt/embedai

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 27 Dec 2024

https://github.com/SamurAIGPT/EmbedAI

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 25 Oct 2024

https://github.com/xenova/whisper-web

ML-powered speech recognition directly in your browser

javascript transformers whisper

Last synced: 25 Dec 2024

https://github.com/toverainc/willow

Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

alexa deep-learning echo esp-adf esp-idf esp32 google-home home-assistant home-automation privacy speech-recognition speech-to-text whisper

Last synced: 25 Dec 2024

https://github.com/abus-aikorea/voice-pro

Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer, zero-shot Voice Cloning (E2, F5-TTS), YouTube downloading, vocal isolation(UVR5), Text-to-Speech (Edge-TTS), and multi-language translation. Perfect for content creators and developers.

faster-whisper gradio podcasts speech-recognition speech-synthesis speech-to-text stt subtitles text-to-speech transcription translation translator tts voice-cloning voice-conversion webui whisper yt-dlp

Last synced: 28 Dec 2024

https://github.com/chenyme/chenyme-aavt

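A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and then merges the subtitles with the video to produce the translated video.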

faster-whisper gpt-4 gpt-4o speech-recognition video-translation whisper

Last synced: 25 Dec 2024

https://github.com/fl33tw00d/whisper-turbo

Cross-Platform, GPU Accelerated Whisper 🏎️

audio machine-learning rust speech-recognition webgpu whisper windows

Last synced: 27 Dec 2024

https://github.com/FL33TW00D/whisper-turbo

Cross-Platform, GPU Accelerated Whisper 🏎️

audio machine-learning rust speech-recognition webgpu whisper windows

Last synced: 05 Nov 2024

https://github.com/pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

ai audio-to-text golang speech-recognition speech-to-text stt subtitles sveltekit transcription ui web web-whisper webapp whisper

Last synced: 26 Dec 2024

https://github.com/m1guelpf/auto-subtitle

Automatically generate and overlay subtitles for any video.

ffmpeg openai-whisper subtitle-generator subtitles subtitles-generator whisper

Last synced: 25 Dec 2024

https://github.com/Chenyme/Chenyme-AAVT

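A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and then merges the subtitles with the video to produce the translated video.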

faster-whisper gpt-4 gpt-4o speech-recognition video-translation whisper

Last synced: 07 Nov 2024

https://github.com/jhj0517/whisper-webui

A Web UI for easy subtitle generation using the Whisper model.

ai gradio open-source python pytorch web-ui whisper

Last synced: 26 Dec 2024

https://github.com/floneum/floneum

Instant, controllable, local pre-trained AI models in Rust

ai candle constrained-generation dioxus floneum-v3 kalosm llama llamacpp llm mistral rust transcription whisper

Last synced: 24 Dec 2024

https://github.com/aallam/openai-kotlin

OpenAI API client for Kotlin with multiplatform and coroutines capabilities.

api chatgpt client coroutines dall-e gpt kotlin llm multiplatform openai whisper

Last synced: 25 Dec 2024

https://github.com/Aallam/openai-kotlin

OpenAI API client for Kotlin with multiplatform and coroutines capabilities.

api chatgpt client coroutines dall-e gpt kotlin llm multiplatform openai whisper

Last synced: 10 Nov 2024

https://github.com/m1guelpf/yt-whisper

Using OpenAI's Whisper to automatically generate YouTube subtitles

ffmpeg openai openai-whisper subtitles subtitles-generated transcribe whisper youtube youtube-dl

Last synced: 28 Dec 2024

https://github.com/absadiki/subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️

cli subtitles subtitles-generator webui whisper whisper-ai

Last synced: 26 Dec 2024

https://github.com/abdeladim-s/subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️

cli subtitles subtitles-generator webui whisper whisper-ai

Last synced: 12 Dec 2024

https://github.com/graphite-project/whisper

Whisper is a file-based time-series database format for Graphite.

graphite graphite-components library metrics python time-series whisper

Last synced: 26 Dec 2024

https://github.com/jhj0517/Whisper-WebUI

A Web UI for easy subtitle generation using the Whisper model.

ai gradio open-source python pytorch web-ui whisper

Last synced: 20 Oct 2024

https://github.com/ntegrals/aura-voice

Aura is like Siri, but in your browser. An AI voice assistant optimized for low latency responses.

artificial-intelligence elevenlabs gpt-3 gpt-4 langchain nextjs openai vercel whisper whisper-cpp

Last synced: 28 Dec 2024

https://github.com/harry0703/audionotes

Quickly extract the content of audio and video files and organize it into structured Markdown notes

ai asr funasr ollama python qwen2 whisper

Last synced: 27 Dec 2024

https://github.com/harry0703/AudioNotes

Quickly extract the content of audio and video files and organize it into structured Markdown notes

ai asr funasr ollama python qwen2 whisper

Last synced: 29 Oct 2024

https://github.com/robitx/gp.nvim

Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic, ..]

claude codeium copilot gemini gpt-4o gpt4o llm lua mistral neovim nvim ollama parrot perplexity sonnet speech-to-text stt vim voice whisper

Last synced: 24 Dec 2024

https://github.com/yeyupiaoling/whisper-finetune

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

android asr chinese ctranslate2 huggingface lora pytorch speech-recognition transformers web whisper

Last synced: 27 Dec 2024

https://github.com/Softcatala/whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on CTranslate2.

openai- openai-whisper speech-recognition speech-to-text whisper

Last synced: 02 Nov 2024

https://github.com/softcatala/whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on CTranslate2.

openai- openai-whisper speech-recognition speech-to-text whisper

Last synced: 26 Dec 2024

https://github.com/yaofanguk/video-subtitle-generator

Generates subtitles from video and audio and outputs SRT files. No third-party API is required; audio-to-text conversion runs locally. A Transformer-based video subtitle generation framework with a GUI.

audio2text generation srt subtitle transcription whisper

Last synced: 23 Dec 2024

https://github.com/innovatorved/whisper.api

This project provides an API with user level access support to transcribe speech to text using a finetuned and processed Whisper ASR model.

asr hacktoberfest innovatorved transcribe whisper

Last synced: 25 Dec 2024

https://github.com/YaoFANGUK/video-subtitle-generator

Generates subtitles from video and audio and outputs SRT files. No third-party API is required; audio-to-text conversion runs locally. A Transformer-based video subtitle generation framework with a GUI.

audio2text generation srt subtitle transcription whisper

Last synced: 20 Nov 2024

https://github.com/twitchlib/twitchlib

C# Twitch Chat, Whisper, API and PubSub Library. Allows for chatting, whispering, stream event subscription and channel/account modification. Supports everything that supports .NETStandard 2.0

api bot chat client csharp events pubsub twitch whisper

Last synced: 22 Dec 2024

https://github.com/TwitchLib/TwitchLib

C# Twitch Chat, Whisper, API and PubSub Library. Allows for chatting, whispering, stream event subscription and channel/account modification. Supports everything that supports .NETStandard 2.0

api bot chat client csharp events pubsub twitch whisper

Last synced: 16 Nov 2024

https://github.com/aschmelyun/subvert

Generate subtitles, summaries, and chapters from videos in seconds

chatgpt openai transcription translation video-editing whisper

Last synced: 25 Dec 2024

https://github.com/go-graphite/go-carbon

Golang implementation of Graphite/Carbon server with classic architecture: Agent -> Cache -> Persister

carbon devops graphite hacktoberfest timeseries whisper

Last synced: 24 Dec 2024

https://github.com/saharmor/whisper-playground

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

machine-learning openai speech-recognition speech-to-text whisper

Last synced: 23 Dec 2024

https://github.com/lenml/speech-ai-forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

agent asr chattts chattts-forge chinese colab cosy-voice cosyvoice english firered fireredtts fish-speech gpt llama llm ssml stt text-to-speech tts whisper

Last synced: 27 Dec 2024

https://github.com/mayeaux/generate-subtitles

Generate transcripts for audio and video content with a user-friendly UI, powered by OpenAI's Whisper, with automatic translations and automatic video downloads through yt-dlp integration

expressjs gpu libretranslate machine-learning nodejs transcription translation whisper yt-dlp

Last synced: 25 Dec 2024

https://github.com/transcriptionstream/transcriptionstream

turnkey self-hosted offline transcription and diarization service with llm summary

automation diarization llm mistral-7b ollama speaker-diarization speech-recognition transcription whisper whisperx

Last synced: 27 Dec 2024

https://github.com/srcnalt/openai-unity

An unofficial OpenAI Unity Package that aims to help you use OpenAI API directly in Unity Game engine.

chatgpt dalle openai openai-api unity unity3d whisper

Last synced: 27 Dec 2024

https://github.com/chengsokdara/use-whisper

React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in

api hook openai react real-time whisper

Last synced: 28 Dec 2024

https://github.com/mezbaul-h/june

Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit

ai assistant-chat-bots chatbot chatbots cli-app command-line-tool coqui-tts huggingface large-language-models llm python speech-recognition speech-to-text text-to-speech whisper

Last synced: 27 Dec 2024

https://github.com/mallorbc/whisper_mic

Project that allows one to use a microphone with OpenAI whisper.

microphone speech-recognition speech-to-text whisper whisper-ai whisper-api

Last synced: 27 Dec 2024

https://github.com/Saik0s/Whisperboard

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.

audio-to-text composable-architecture ios openai speech-recognition speech-to-text swiftui tca transcription tuist whisper whisper-cpp

Last synced: 09 Nov 2024

https://github.com/saik0s/whisperboard

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.

audio-to-text composable-architecture ios openai speech-recognition speech-to-text swiftui tca transcription tuist whisper whisper-cpp

Last synced: 28 Dec 2024

https://github.com/lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

agent asr chattts chattts-forge chinese colab cosy-voice cosyvoice english firered fireredtts fish-speech gpt llama llm ssml stt text-to-speech tts whisper

Last synced: 13 Oct 2024

https://github.com/srcnalt/OpenAI-Unity

An unofficial OpenAI Unity Package that aims to help you use OpenAI API directly in Unity Game engine.

chatgpt dalle openai openai-api unity unity3d whisper

Last synced: 24 Oct 2024

https://github.com/shirayu/whispering

Streaming transcriber with whisper

automatic-speech-recognition whisper

Last synced: 26 Sep 2024

https://github.com/playvoice/lora-svc

singing voice change based on whisper, and lora for singing voice clone

lora singing-voice-conversion speech-to-sing uni-svc vits vits-svc voice-change voice-cloning voice-conversion whisper

Last synced: 27 Dec 2024

https://github.com/PlayVoice/lora-svc

singing voice change based on whisper, and lora for singing voice clone

lora singing-voice-conversion speech-to-sing uni-svc vits vits-svc voice-change voice-cloning voice-conversion whisper

Last synced: 07 Nov 2024

https://github.com/exphat/swiftwhisper

🎤 The easiest way to transcribe audio in Swift

ios macos openai speech-recognition speech-to-text swift transcription whisper whisper-cpp

Last synced: 27 Dec 2024

https://github.com/buxuku/VideoSubtitleGenerator

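Batch-generate subtitle files for local videos and optionally translate them into other languages. Cross-platform, with support for Windows and macOS.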

subtitle translate whisper whisper-cpp

Last synced: 23 Dec 2024

https://github.com/buxuku/videosubtitlegenerator

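Batch-generate subtitle files for local videos and optionally translate them into other languages. Cross-platform, with support for Windows and macOS.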

subtitle translate whisper whisper-cpp

Last synced: 28 Dec 2024

https://github.com/jina-ai/agentchain

Chain together LLMs for reasoning & orchestrate multiple large models for accomplishing complex tasks

artificial-intelligence blip langchain llm machine-learning multimodal nlproc stable-diffusion whisper

Last synced: 27 Dec 2024

https://github.com/exPHAT/SwiftWhisper

🎤 The easiest way to transcribe audio in Swift

ios macos openai speech-recognition speech-to-text swift transcription whisper whisper-cpp

Last synced: 05 Nov 2024

https://github.com/azkadev/whisper

Whisper Dart is a cross platform library for dart and flutter that allows converting audio to text / speech to text / inference from Open AI models

ai android dart flutter ggml indonesia ios linux macos openai speech speech-recognition speech-synthesis speech-to-text transcribe transformer whisper whisper-dart whisper-flutter windows

Last synced: 28 Dec 2024

https://github.com/showlab/vlog

Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.

chatgpt langchain large-language-model video-language whisper

Last synced: 28 Dec 2024