Projects in Awesome Lists tagged with automatic-speech-recognition
A curated list of projects in awesome lists tagged with automatic-speech-recognition.
https://github.com/wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
asr automatic-speech-recognition conformer e2e-models production-ready pytorch speech-recognition transformer whisper
Last synced: 13 May 2025
https://github.com/zzw922cn/Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
audio automatic-speech-recognition chinese-speech-recognition cnn data-preprocessing deep-learning end-to-end evaluation feature-vector layer-normalization lstm paper phonemes rnn rnn-encoder-decoder speech-recognition tensorflow timit-dataset
Last synced: 02 Apr 2025
https://github.com/zzw922cn/automatic_speech_recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
audio automatic-speech-recognition chinese-speech-recognition cnn data-preprocessing deep-learning end-to-end evaluation feature-vector layer-normalization lstm paper phonemes rnn rnn-encoder-decoder speech-recognition tensorflow timit-dataset
Last synced: 15 May 2025
https://github.com/ahmetoner/whisper-asr-webservice
OpenAI Whisper ASR Webservice API
asr automatic-speech-recognition docker openai-whisper speech speech-recognition speech-to-text
Last synced: 14 May 2025
https://github.com/coqui-ai/stt
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition
Last synced: 14 May 2025
https://github.com/coqui-ai/STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition
Last synced: 15 Mar 2025
https://github.com/kakaobrain/pororo
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
automatic-speech-recognition deep-learning natural-language-processing neural-models speech-synthesis
Last synced: 30 Dec 2025
https://github.com/tensorspeech/tensorflowasr
⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2. Supported languages that can use characters or subwords
automatic-speech-recognition conformer contextnet ctc deepspeech2 end2end jasper rnn-transducer speech-recognition speech-to-text streaming-transducer subword-speech-recognition tensorflow tensorflow2 tflite tflite-convertion tflite-model
Last synced: 14 May 2025
https://github.com/FireRedTeam/FireRedASR
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recognition capability.
asr automatic-speech-recognition conformer industrial-grade llm multimodal-llm open-source speech-recognition speechllm transformer
Last synced: 12 Apr 2025
https://github.com/snakers4/open_stt
Open STT
asr automatic-speech-recognition dataset russian speech-to-text stt
Last synced: 19 Jul 2025
https://github.com/shirayu/whispering
Streaming transcriber with whisper
automatic-speech-recognition whisper
Last synced: 29 Sep 2025
https://github.com/jitsi/jiwer
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
automatic-speech-recognition evaluation-metrics python3 speech-to-text wer word-error-rate
Last synced: 16 Apr 2025
https://github.com/Picovoice/cheetah
On-device streaming speech-to-text engine powered by deep learning
asr automatic-speech-recognition online-speech-recognition speech-recognition speech-to-text streaming-speech-to-text stt transcription voice-recognition
Last synced: 04 May 2025
https://github.com/picovoice/cheetah
On-device streaming speech-to-text engine powered by deep learning
asr automatic-speech-recognition online-speech-recognition speech-recognition speech-to-text streaming-speech-to-text stt transcription voice-recognition
Last synced: 13 Apr 2025
https://github.com/hirofumi0810/neural_sp
End-to-end ASR/LM implementation with PyTorch
asr attention attention-mechanism automatic-speech-recognition ctc language-model language-modeling pytorch rnn-transducer seq2seq sequence-to-sequence speech speech-recognition streaming transformer transformer-xl
Last synced: 02 May 2025
https://github.com/FluidInference/FluidAudio
Native Swift and CoreML SDK for local speaker diarization, VAD, and speech-to-text for real-time workloads. Works on iOS and macOS.
ane asr audio automatic-speech-recognition avfoundation coreml ios macos nvidia parakeet real-time speaker-diarization speaker-embedding speaker-identification speaker-recognition speech-to-text swift vad voice-activity-detection
Last synced: 31 Aug 2025
https://github.com/FireRedTeam/FireRedASR2S
A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singing ASR. FireRedVAD supports speech/singing/music in 100+ langs. FireRedLID supports 100+ langs and 20+ zh dialects. FireRedPunc supports zh and en.
asr asr-pipeline audio-event-classification audio-event-detection automatic-speech-recognition industrial-grade language-identification lid llm multimodal-llm open-source punctuation-prediction punctuation-restoration sota speech-recognition speechllm vad voice-activity-detection
Last synced: 06 May 2026
https://github.com/picovoice/leopard
On-device speech-to-text engine powered by deep learning
asr automatic-speech-recognition on-device speech-recognition speech-to-text stt transcription voice-recognition voice-to-text
Last synced: 14 May 2025
https://github.com/arthurfdlr/whisper-youtube
🔉 Youtube Videos Transcription with OpenAI's Whisper
automatic-speech-recognition colab-notebook speech-recognition speech-to-text transformer whisper youtube
Last synced: 05 Apr 2025
https://github.com/double22a/speech_dataset
The dataset of Speech Recognition
asr audio automatic-speech-recognition dataset deep-learning deep-neural-networks speech speech-diarization speech-enhancement speech-recognition speech-segmentation speech-separation speech-synthesis speech-to-text speech-translation text-to-speech tts voice-conversion wav
Last synced: 05 May 2025
https://github.com/ArthurFDLR/whisper-youtube
🔉 Youtube Videos Transcription with OpenAI's Whisper
automatic-speech-recognition colab-notebook speech-recognition speech-to-text transformer whisper youtube
Last synced: 01 Apr 2025
https://github.com/hirofumi0810/tensorflow_end2end_speech_recognition
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
asr attention-mechanism automatic-speech-recognition beam-search csj ctc end-to-end end-to-end-learning joint-ctc-attention librispeech speech-recognition speech-to-text tensorflow timit timit-dataset
Last synced: 19 Jul 2025
https://github.com/rolczynski/automatic-speech-recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
automatic-speech-recognition deep-learning deepspeech distill keras language-model machine-learning neural-networks speech-recognition speech-to-text tensorflow tensorflow-models
Last synced: 30 Sep 2025
https://github.com/sovaai/sova-asr
SOVA ASR (Automatic Speech Recognition)
asr asr-model automatic-speech-recognition speech speech-recognition speech-to-text stt wav2letter
Last synced: 19 Jul 2025
https://github.com/vilassn/whisper_android
Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android
android asr automatic-speech-recognition embedded mobile offline openai speech-recognition tensorflow tensorflowlite text-to-speech texttospeech tflite transcribe transcription tts whisper
Last synced: 22 Oct 2025
https://github.com/noco-ai/spellbook-docker
AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models
automatic-speech-recognition bark llama2 llm-inference mixtral musicgeneration stable-diffusion text-to-speech whisper xttsv2
Last synced: 13 May 2025
https://github.com/CoEDL/elpis
🙊 software for creating speech recognition models.
automatic-speech-recognition computational-linguistics docker kaldi linguistics python transcription
Last synced: 08 May 2025
https://github.com/ieasybooks/tafrigh
Transcription of text and creation of SRT and VTT files using Whisper models and wit.ai technology.
asr automatic-speech-recognition ctranslate2 facebook faster-whisper javascript python soundcloud srt stable-whisper subtitles twitter vtt whisper youtube
Last synced: 06 Feb 2026
https://github.com/altunenes/parakeet-rs
very fast speech-to-text, diarization, streaming (even in CPU) with NVIDIA Parakeet in Rust
asr automatic-speech-recognition onnx parakeet speaker-diarization speaker-identification speech speech-recognition speech-to-text
Last synced: 06 Feb 2026
https://github.com/tugstugi/mongolian-speech-recognition
Mongolian speech recognition with PyTorch
asr automatic-speech-recognition convolutional-neural-networks deep-learning mongolian python pytorch speech-recognition speech-to-text
Last synced: 14 Apr 2025
https://github.com/at16k/at16k
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
asr asr-model automatic-speech-recognition pretrained-models speech-analysis speech-api speech-recognition speech-recognizer speech-to-text voice-commands voice-recognition
Last synced: 13 Jul 2025
https://github.com/kmario23/kenlm-training
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
automatic-speech-recognition deep-neural-networks deep-speech kenlm kenlm-toolkit language-model language-modeling natural-language-processing probabilistic-models python speech-recognition
Last synced: 07 Apr 2025
https://github.com/kmario23/KenLM-training
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
automatic-speech-recognition deep-neural-networks deep-speech kenlm kenlm-toolkit language-model language-modeling natural-language-processing probabilistic-models python speech-recognition
Last synced: 19 Jul 2025
https://github.com/andi611/zerospeech-tts-without-t
A Pytorch implementation for the ZeroSpeech 2019 challenge.
adversarial-learning asr autoencoder automatic-speech-recognition gan text-to-speech tts tts-without-t zerospeech
Last synced: 01 Mar 2026
https://github.com/lucasnewman/best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
automatic-speech-recognition speech-synthesis text-to-speech
Last synced: 20 Aug 2025
https://github.com/primaprashant/awesome-voice-typing
Curated list of open-source speech-to-text and voice typing tools for Linux, macOS, Windows, Android, and iOS. Offline, local, and cloud.
ai automatic-speech-recognition awesome-list dictation dictation-tool faster-whisper linux local-transcription macos offline-speech-recognition open-source parakeet privacy-focused push-to-talk speech-to-text transcription voice-typing whisper whisper-cpp wisprflow-alternative
Last synced: 15 Apr 2026
https://github.com/mgonzs13/whisper_ros
Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2
asr automatic-speech-recognition ggml ros2 speech-recognition speech-to-text vad voice-activity-detection whisper whisper-cpp
Last synced: 30 Aug 2025
https://github.com/j3soon/whisper-to-input
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; Supports English, Chinese, Japanese, etc. and even mixed languages.
android android-ime automatic-speech-recognition chinese-speech-recognition ime keyboard kotlin openai openai-api speech speech-recognition speech-to-text virtual-keyboard voice voice-recognition whisper
Last synced: 09 Apr 2025
https://github.com/undertheseanlp/automatic_speech_recognition
Vietnamese Automatic Speech Recognition
automatic-speech-recognition nlp vietnamese vietnamese-nlp
Last synced: 17 Feb 2026
https://github.com/pythainlp/pythaiasr
Python Thai Automatic Speech Recognition
asr automatic-speech-recognition hacktoberfest hacktoberfest2022 thai-language thai-nlp
Last synced: 13 Apr 2025
https://github.com/googlecreativelab/obvi
A Polymer 3+ webcomponent / button for doing speech recognition
automatic-speech-recognition button polymer polymer2 speech-recognition webcomponent
Last synced: 29 Jul 2025
https://github.com/sungnyun/armhubert
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
automatic-speech-recognition distillation ssl-compression
Last synced: 12 Apr 2025
https://github.com/tsmdt/whisply
💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... fast!
asr automatic-speech-recognition speech-recognition speech-to-text subtitles transcription-tool whisper-ai
Last synced: 09 Apr 2025
https://github.com/ttop32/wav2vec2-live-japanese-translator
real time japanese speech recognition translator using wav2vec2
asr audio automatic-speech-recognition fine-tuning huggingface japanese live pyaudio pyqt5 pytorch real-time speaker-recognition speech-to-text spoken-language-understanding stt translation translator voice voice-recognition wav2vec2
Last synced: 03 Sep 2025
https://github.com/soheil-mp/speech-recognition
End-to-End Speech Recognition using Neural Networks.
asr audio automatic-speech-recognition librispeech
Last synced: 20 Aug 2025
https://github.com/sooftware/jasper
PyTorch implementation of "Jasper: An End-to-End Convolutional Neural Acoustic Model" (INTERSPEECH 2019)
asr automatic-speech-recognition cnn jasper nvidia pytorch speech-recognition
Last synced: 09 Apr 2025
https://github.com/lucasgris/wav2vec4bp
Wav2vec resources and models for Brazilian Portuguese
automatic-speech-recognition brazilian-portuguese dataset portuguese speech-to-text wav2vec wav2vec2
Last synced: 07 May 2025
https://github.com/j3soon/speech-to-windows-input
Perform speech-to-text (STT/ASR) with Azure speech service and simulate keyboard to input the recognized text; Supports English, Chinese, Japanese, and more.
automatic-speech-recognition azure azure-speech-service chinese-speech-recognition simulate-keyboard speech speech-recognition speech-to-text voice voice-recognition
Last synced: 10 Apr 2025
https://github.com/the-data-dilemma/medibeng-whisper-tiny
MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, making it easier for analysis, record-keeping, and using AI in healthcare.
audio audio-processing automatic-speech-recognition bengali code-switch english fastapi faster-whisper fine-tuning gradio healthcare openai python speech-recognition speech-to-text synthetic-data transcription transformers translation whisper
Last synced: 17 Mar 2026
https://github.com/kssteven418/q-asr
[ICASSP'22] Integer-only Zero-shot Quantization for Efficient Speech Recognition
automatic-speech-recognition deep-learning efficient-model efficient-neural-networks jasper model-compression quantization quartznet speech speech-recognition
Last synced: 31 Jul 2025
https://github.com/popcornell/micrank
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.
ad-hoc-microphone-network array-processing asr automatic-speech-recognition channel-selection
Last synced: 12 Apr 2025
https://github.com/saurabhchalke/whisper-meta-quest
Running speech-to-text in a Meta Quest headset using OpenAI's Whisper tiny model
artificial-intelligence automatic-speech-recognition mixed-reality speech-to-text virtual-reality vr whisper
Last synced: 04 Sep 2025
https://github.com/egorsmkv/whisper-ukrainian
Trainer and Evaluation scripts for fine-tuning Whisper models for the Ukrainian language
asr automatic-speech-recognition openai speech-recognition ukrainian whisper
Last synced: 03 Mar 2025
https://github.com/linto-ai/linto-agent
LinTO platform services stack deployment tool for Docker Swarm cluster
asr automatic-speech-processing automatic-speech-recognition cluster docker-compose docker-swarm microservices smart-assistant virtual-agent vocal-assistant
Last synced: 26 Oct 2025
https://github.com/ivankunyankin/quartznet-asr
asr automatic-speech-recognition jasper pytorch quartznet
Last synced: 08 Apr 2026
https://github.com/FernandoLpz/SpeechRecognition
This repository contains the implementation of an Automatic Speech Recognition system in python, using a client-server architecture with Web Sockets.
automatic-speech-recognition python speech-recognition speech-to-text transformers wav2vec2 websockets
Last synced: 03 Apr 2025
https://github.com/openvoiceos/ovos-stt-plugin-vosk
vosk STT plugin for mycroft
asr automatic-speech-recognition hacktoberfest kaldi speech-recognition speech-to-text stt vosk
Last synced: 16 May 2025
https://github.com/OpenVoiceOS/ovos-stt-plugin-vosk
vosk STT plugin for mycroft
asr automatic-speech-recognition hacktoberfest kaldi speech-recognition speech-to-text stt vosk
Last synced: 10 May 2025
https://github.com/megengine/end-to-end-asr-transformer
An end to end ASR Transformer model training repo
asr-model attention-mechanism automatic-speech-recognition megengine transfomer
Last synced: 12 Apr 2025
https://github.com/analyticsinmotion/werpy
🐍📦 Rapidly calculate and analyze the Word Error Rate (WER) with this powerful yet lightweight Python package.
asr asr-evaluation automatic-speech-recognition levenshtein-distance metrics nlp python python-package speech-to-text stt stt-benchmark wer werpy word-error-rate
Last synced: 07 Apr 2025
https://github.com/jmaczan/asr-dysarthria
Research on Automatic Speech Recognition for dysarthric speech
asr automatic-speech-recognition deep-learning dysarthria dysarthric-speech self-supervised-learning wav2vec2
Last synced: 12 Apr 2025
https://github.com/estuary-ai/mangrove
Mangrove is the backend module of Estuary, a framework for building multimodal real-time Socially Intelligent Agents (SIAs).
affective-computing agents artificial-intelligence automatic-speech-recognition digital-assistant framework human-computer-interaction large-language-models socially-aware-agents socially-intelligent-agents speech-recognition speech-synthesis
Last synced: 11 Feb 2026
https://github.com/zevaverbach/tatt
Transcribe All The Things™ is a CLI for creating and managing speech-to-text transcripts.
amazon-transcribe-api asr automatic-speech-recognition cli speech-to-text stt
Last synced: 13 Apr 2025
https://github.com/winstxnhdw/capgen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
asr automatic-speech-recognition ctranslate2 docker granian huggingface huggingface-spaces litestar whisper
Last synced: 03 Jul 2025
https://github.com/bhattbhavesh91/whisper-youtube
This repository guides you through automatically generating YouTube transcriptions using OpenAI's Whisper
automatic-speech-recognition ffmpeg openai openai-gym python pytube subtitles whisper youtube youtube-dl
Last synced: 17 Apr 2025
https://github.com/scalable-ml-deep-learning/fine_tune_whisper
Fine-Tune Whisper for Italian ASR with transformers
automatic-speech-recognition common-voice-dataset huggingface openai transformers whisper
Last synced: 11 Mar 2025
https://github.com/abus-aikorea/studio-free
youtube download, vocal remover, vocal extraction, karaoke video production, STT, automatic speech recognition, transcription, automatic subtitle, AI, yt-dlp, demucs, whisper, webui, gradio, windows
ai automatic-speech-recognition automatic-subtitle demucs gradio karaoke openai stt transcription video-download vocal-remover webui whisper windows yt-dlp
Last synced: 25 Apr 2025
https://github.com/bagustris/detect-segment-cough
A python model to detect and segment coughs, forked from coughvid's repo
automatic-speech-recognition cough-detection cough-sound covid-19 speech-segmentation
Last synced: 28 Jun 2025
https://github.com/sinaahmadi/CORDI
Language and Speech Technology for Central Kurdish Varieties (LREC-COLING 2024)
automatic-speech-recognition dialect-identification erbil kurdish kurdish-language-processing language-identification machine-translation mahabad sanandaj sorani sulaymaniyah
Last synced: 07 May 2025
https://github.com/jarbasal/pocketsphinx-models-mirror
pocketsphinx models for languages originating from the iberian peninsula
asr automatic-speech-recognition pocketsphinx speech-recognition speech-to-text stt stt-models
Last synced: 12 Feb 2026
https://github.com/the-data-dilemma/parquettohuggingface
ParquetToHuggingFace processes raw audio data, converts it into Parquet files, and uploads them to Hugging Face. The README explains how to set up the environment, configure paths, and run the scripts to generate and upload the data.
audio-dataset audio-processing automatic-speech-recognition data-analysis data-science dataset healthcare-application huggingface huggingface-datasets pandas parquet parquet-generator python3 speech-data speech-recognition speech-to-text speech-translation
Last synced: 21 Aug 2025
https://github.com/thc1006/breeze-asr-taigi
Taiwanese Hokkien (Taigi) speech-to-text transcriber - MediaTek Breeze-ASR-26 with faster-whisper, tuned for RTX 3050 4GB low-VRAM GPUs. Gradio UI, CLI, Docker, SRT/VTT/TXT/JSON.
asr automatic-speech-recognition breeze-asr chinese-speech-recognition ctranslate2 faster-whisper gradio hokkien low-vram mediatek pytorch rtx-3050 speech-recognition speech-to-text subtitle-generator taigi taiwanese whisper
Last synced: 14 May 2026
https://github.com/roboticslab-uc3m/speech
Text To Speech (TTS) and Automatic Speech Recognition (ASR).
automatic-speech-recognition text-to-speech
Last synced: 19 Jan 2026
https://github.com/bhattbhavesh91/table-question-answering-with-automatic-speech-recognition
Question Answering Gradio Interface on Tabular Data with HuggingFace Transformers Pipeline & TAPAS. Wav2Vec2 is a pretrained model for Automatic Speech Recognition (ASR).
automatic-speech-recognition google-assistant gradio-interface huggingface huggingface-transformers huggingface-transformers-pipeline question-answering voice-recognition
Last synced: 27 Feb 2026
https://github.com/BatuhanYilmaz26/Youtube-Transcriber
Input a YouTube video link and get a transcription as a .txt, .vtt or .srt file.
automatic-speech-recognition huggingface openai python speech-recognition streamlit whisper
Last synced: 11 Mar 2025
https://github.com/idiap/tidigitsrecipe.jl
A Julia recipe for training an ASR system using the TIDIGITS database
asr automatic-speech-recognition decoding hidden-markov-models wfst
Last synced: 12 Sep 2025
https://github.com/khaykingleb/automatic-speech-recognition
QuartzNet and Deepspeech Implementation for ASR
automatic-speech-recognition deep-learning deepspeech pytorch quartz-net speech-recognition
Last synced: 27 Apr 2026
https://github.com/my-north-ai/semantic_audio_filtering
Synthetic data augmentation technique via LLM for Automatic Speech Recognition fine tuning.
automatic-speech-recognition fine-tuning synthetic-dataset-generation text-to-speech whisper
Last synced: 11 Mar 2025
https://github.com/pleasurecruise/3d-ai-agent
This project aims to create an AI agent capable of expressing a range of emotions through facial expressions and tone of voice, using Large Language Models (LLMs) and Large Vision Models (LVMs).
asr automatic-speech-recognition large-language-models llm ocr optical-character-recognition python text-to-speech tts
Last synced: 15 Apr 2025
https://github.com/astrologos/py-speakeasy
Speakeasy GPT is a Jupyter notebook that utilizes several natural language processing utilities to provide a seamless and low-latency speech interface to ChatGPT and other large language models.
automatic-speech-recognition chat-gpt coqui-ai coqui-tts elevenlabs-api mimic mycroftai text-to-speech whisper
Last synced: 11 Mar 2025
https://github.com/openvoiceos/ovos-stt-plugin-chromium
An STT plugin for Mycroft using the Google Chrome browser API
asr automatic-speech-recognition speech-recognition speech-to-text stt
Last synced: 16 May 2025
https://github.com/egorsmkv/cv10-uk-testset-clean
The cleaned Common Voice 10 (test set) that has been checked by a human for Ukrainian 🇺🇦
asr automatic-speech-recognition speech speech-recognition speech-to-text ukrainian
Last synced: 19 Mar 2026
https://github.com/OpenVoiceOS/ovos-stt-plugin-chromium
An STT plugin for Mycroft using the Google Chrome browser API
asr automatic-speech-recognition speech-recognition speech-to-text stt
Last synced: 10 May 2025
https://github.com/rafat-decodis/robust-asr-for-low-resource-languages
Exploring Benchmark Gaps and Real-World Speech Generalization for Low-Resource Languages
artificial-intelligence automatic-speech-recognition data-analysis dataprocessing whisper
Last synced: 23 Jun 2025
https://github.com/pprattis/automatic-speech-recognision-system-asr
A Python script that implements an automatic speech recognition system.
asr automatic-speech-recognition computer-science dtw dynamic-time-warping fir-filter librosa mel-frequency-cepstral-coefficients mfcc nyquist program python short-time-fourier-transform short-time-signal-analysis signal signal-processing student
Last synced: 07 Sep 2025
https://github.com/pprattis/automatic-speech-recognision-system-ASR
A Python script that implements an automatic speech recognition system.
asr automatic-speech-recognition computer-science dtw dynamic-time-warping fir-filter librosa mel-frequency-cepstral-coefficients mfcc nyquist program python short-time-fourier-transform short-time-signal-analysis signal signal-processing student
Last synced: 28 Sep 2025
https://github.com/marquesafonso/multilang-asr-captioner
A multilingual automatic speech recognition and video captioning tool using faster whisper. Supports real-time translation to english. Runs on consumer grade cpu.
automatic-speech-recognition captioning-videos faster-whisper whisper
Last synced: 11 Mar 2025
https://github.com/analyticsinmotion/werx
🐍📦 Easy-to-use Python package for lightning-fast Word Error Rate analysis
asr automatic-speech-recognition levenshtein-distance metrics speech-to-text stt wer werx word-error-rate word-error-rate-calculator
Last synced: 16 Jun 2025
https://github.com/jeronymous/deep_learning_notebooks
Self-contained notebooks for experimenting with particular concepts in Deep Learning
artificial-intelligence artificial-neural-networks automatic-speech-recognition deep-learning deep-neural-networks machine-learning natural-language-processing speech-recognition speech-to-text tokenization tokenizer-nlp tokenizers
Last synced: 16 Feb 2026
https://github.com/nico-byte/whisper-web
The Whisper Web Transcription Server is a Python-based real-time speech-to-text transcription system powered by OpenAI's Whisper models. It leverages state-of-the-art models like Distil-Whisper to transcribe audio input in real-time.
ai asr automatic-speech-recognition distil-whisper distil-whisper-large-v3 huggingface huggingface-transformers server vad voice web websockets whisper
Last synced: 26 Apr 2026
https://github.com/0xpd33/sonori
Sonori is a STT app for linux (wayland).
asr automatic-speech-recognition ctranslate2 linux onnxruntime speech-recognition speech-to-text stt wayland wgpu whisper
Last synced: 08 Feb 2026
https://github.com/swaylenhayes/three-amigos-offline
Triple model automatic speech recognition for Mac: offline, push-to-talk with auto-paste, and MLX-optimized Whisper model choices.
automatic-speech-recognition mlx offline speech-to-text whisper
Last synced: 13 Jan 2026
https://github.com/ovoshatchery/ovos-stt-plugin-pocketsphinx
pocketsphinx STT plugin for mycroft
asr automatic-speech-recognition maintainer-wanted pocketsphinx speech-recognition speech-to-text stt
Last synced: 25 Mar 2025
https://github.com/mooerslab/bash-whisper-transcription
Bash function to ease the transcription of audio files with OpenAI's whisper.
asr audio audio-file-trancription audio-messages automate-the-boring-stuff automatic-speech-recognition automation bash bash-function beginner-friendly speech-to-text stt whisper
Last synced: 15 Feb 2026
https://github.com/OVOSHatchery/ovos-stt-plugin-pocketsphinx
pocketsphinx STT plugin for mycroft
asr automatic-speech-recognition maintainer-wanted pocketsphinx speech-recognition speech-to-text stt
Last synced: 10 May 2025
https://github.com/egorsmkv/asr-corpus-by-microphone
This is a simple solution for people who want to create their own corpus for Automatic Speech Recognition with just a microphone
asr automatic-speech-recognition corpus corpus-tools
Last synced: 28 Mar 2025
https://github.com/nagababumo/open-source-models-with-hugging-face
asr audio-detection audio-processing automatic-speech-recognition blip clip huggingface huggingface-spaces huggingface-transformers image-captioning image-classification image-retrieval multi-modality object-detection open-source segementation sentence-embeddings transformers visual-question-answering zero-shot-learning
Last synced: 10 Sep 2025
https://github.com/ksm26/serverless-llm-apps-with-amazon-bedrock
The course equips you with the skills to deploy Large Language Model (LLM)-based applications into production using serverless technology with Amazon Bedrock.
audio-analysis audio-analysis-tasks audio-processing automatic-speech-recognition aws-generative-ai aws-lambda-serverless-framework cloud-computing deep-learning-techniques event-driven-architecture event-driven-architectures natural-language-understanding serverless-technology transcription-services
Last synced: 28 Mar 2025
https://github.com/rishabhmathur06/fine-tuning-whisper-small-for-asr-
This repository contains a notebook that shows how to fine-tune OpenAI's Whisper model on a custom Hindi dataset.
artificial-intelligence asr automatic-speech-recognition fine-tuning openai python whisper whisper-model
Last synced: 19 Jan 2026