Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with speech-recognition

A curated list of projects in awesome lists tagged with speech-recognition.

https://github.com/ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++

inference openai speech-recognition speech-to-text transformer whisper

Last synced: 02 Oct 2024

https://github.com/mozilla/DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

deep-learning deepspeech embedded machine-learning neural-networks offline on-device speech-recognition speech-to-text tensorflow

Last synced: 30 Jul 2024

https://github.com/mozilla/deepspeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

deep-learning deepspeech embedded machine-learning neural-networks offline on-device speech-recognition speech-to-text tensorflow

Last synced: 29 Sep 2024

https://github.com/kaldi-asr/kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

c-plus-plus cuda kaldi shell speaker-id speaker-verification speech speech-recognition speech-to-text

Last synced: 29 Sep 2024

https://github.com/nvidia/deeplearningexamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

computer-vision deep-learning drug-discovery forecasting large-language-models mxnet nlp paddlepaddle pytorch recommender-systems speech-recognition speech-synthesis tensorflow tensorflow2 translation

Last synced: 29 Sep 2024

https://github.com/NVIDIA/DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

computer-vision deep-learning drug-discovery forecasting large-language-models mxnet nlp paddlepaddle pytorch recommender-systems speech-recognition speech-synthesis tensorflow tensorflow2 translation

Last synced: 31 Jul 2024

https://github.com/paddlepaddle/paddlespeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 29 Sep 2024

https://github.com/PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 31 Jul 2024

https://github.com/m-bain/whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text whisper

Last synced: 26 Sep 2024

https://github.com/m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text whisper

Last synced: 30 Jul 2024

https://github.com/uberi/speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

audio python speech-recognition speech-to-text

Last synced: 29 Sep 2024

https://github.com/Uberi/speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

audio python speech-recognition speech-to-text

Last synced: 31 Jul 2024

https://github.com/nl8590687/asrt_speechrecognition

A Deep-Learning-Based Chinese Speech Recognition System

asrt chinese-speech-recognition cnn ctc keras python python3 speech-recognition speech-to-text tensorflow

Last synced: 02 Oct 2024

https://github.com/nl8590687/ASRT_SpeechRecognition

A Deep-Learning-Based Chinese Speech Recognition System

asrt chinese-speech-recognition cnn ctc keras python python3 speech-recognition speech-to-text tensorflow

Last synced: 31 Jul 2024

https://github.com/TalAter/annyang

💬 Speech recognition for your site

hacktoberfest speech speech-recognition speech-to-text voice

Last synced: 30 Jul 2024

https://github.com/talater/annyang

💬 Speech recognition for your site

hacktoberfest speech speech-recognition speech-to-text voice

Last synced: 29 Sep 2024

https://github.com/flashlight/wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

cpp deep-learning end-to-end speech-recognition wav2letter

Last synced: 30 Sep 2024

https://github.com/modelscope/funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper

Last synced: 26 Sep 2024

https://github.com/snakers4/silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

asr capitalization colab english german onnx pretrained-models pytorch repunctuation spanish speech speech-recognition speech-synthesis speech-to-text stt stt-benchmark text-to-speech torch-hub tts tts-models

Last synced: 31 Jul 2024

https://github.com/sanchit-gandhi/whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

deep-learning jax speech-recognition speech-to-text whisper

Last synced: 26 Sep 2024

https://github.com/wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

asr automatic-speech-recognition conformer e2e-models production-ready pytorch speech-recognition transformer whisper

Last synced: 26 Sep 2024

https://github.com/cmusphinx/pocketsphinx

A small speech recognizer

c python speech-recognition

Last synced: 30 Sep 2024

https://github.com/huggingface/distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

audio speech-recognition whisper

Last synced: 26 Sep 2024

https://github.com/mahmoudashraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 26 Sep 2024

https://github.com/modelscope/funclip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with integrated LLM-based AI clipping.

gradio gradio-python-llm llm speech-recognition speech-to-text subtitles-generator video-clip video-subtitles

Last synced: 26 Sep 2024

https://github.com/yanshengjia/ml-road

Machine Learning Resources, Practice and Research

computer-vision deep-learning machine-learning nlp pytorch speech-recognition tensorflow

Last synced: 07 Aug 2024

https://github.com/MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 31 Jul 2024

https://github.com/toverainc/willow

Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

alexa deep-learning echo esp-adf esp-idf esp32 google-home home-assistant home-automation privacy speech-recognition speech-to-text whisper

Last synced: 27 Sep 2024

https://github.com/mravanelli/pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

asr deep-learning deep-neural-networks dnn dnn-hmm gru kaldi lstm lstm-neural-networks multilayer-perceptron-network pytorch recurrent-neural-networks rnn rnn-model speech speech-recognition timit

Last synced: 30 Sep 2024

https://github.com/rhasspy/rhasspy

Offline private voice assistant for many human languages

home-assistant node-red privacy speech-recognition voice-assistants voice-commands

Last synced: 29 Sep 2024

https://github.com/coqui-ai/stt

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition

Last synced: 26 Sep 2024

https://github.com/argmaxinc/WhisperKit

Swift native on-device speech recognition with Whisper for Apple Silicon

inference ios macos pretrained-models speech-recognition swift transformers visionos watchos whisper

Last synced: 31 Jul 2024

https://github.com/argmaxinc/whisperkit

Swift native on-device speech recognition with Whisper for Apple Silicon

inference ios macos pretrained-models speech-recognition swift transformers visionos watchos whisper

Last synced: 26 Sep 2024

https://github.com/coqui-ai/STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition

Last synced: 31 Jul 2024

https://github.com/pannous/tensorflow-speech-recognition

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

deep-learning neural-network speech-recognition speech-to-text stt tensorflow

Last synced: 30 Sep 2024

https://github.com/alan-ai/alan-sdk-ios

Conversational AI SDK for iOS to enable text and voice conversations with actions (Swift, Objective-C)

alan-ios-sdk alan-studio alan-voice chatbot conversational-ai ios machine-learning sdk speech-recognition voice voice-ai voice-assistant voice-commands

Last synced: 30 Sep 2024

https://github.com/syhw/wer_are_we

An attempt at tracking the state of the art and recent results (bibliography) on speech recognition.

deep-neural-network speech-recognition wer

Last synced: 30 Sep 2024

https://github.com/nobody132/masr

Chinese speech recognition; Mandarin Automatic Speech Recognition

chinese-speech-recognition mandarin-chinese pytorch speech-recognition

Last synced: 30 Sep 2024

https://github.com/julius-speech/julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine

audio-processing recognition speech speech-recognition

Last synced: 30 Sep 2024

https://github.com/jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run speech-to-text service that outputs JSON, SRT subtitles with timestamps, and plain text

speech speech-recognition speech-to-text stt

Last synced: 31 Jul 2024

https://github.com/astorfi/lip-reading-deeplearning

🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

3d-convolutional-network computer-vision deep-learning speech-recognition tensorflow

Last synced: 03 Oct 2024

https://github.com/kalliope-project/kalliope

Kalliope is a framework that will help you to create your own personal assistant.

bot bot-creation home-automation jarvis linux personal-assistant raspberry speech-recognition speech-synthesis speech-to-text

Last synced: 30 Sep 2024

https://github.com/react-native-voice/voice

🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)

android ios react-native speech-recognition voice-recognition

Last synced: 30 Sep 2024

https://github.com/fl33tw00d/whisper-turbo

Cross-Platform, GPU Accelerated Whisper 🏎️

audio machine-learning rust speech-recognition webgpu whisper windows

Last synced: 26 Sep 2024

https://github.com/bjoernkarmann/project_alias

Alias is a teachable “parasite” that is designed to give users more control over their smart assistants, both when it comes to customisation and privacy. Through a simple app the user can train Alias to react to a custom wake-word/sound, and once trained, Alias can take control of your home assistant by activating it for you.

alias classification hack machine-learning microphone raspberry-pi smarthome sound-synthesis speech-recognition wakeword

Last synced: 28 Sep 2024

https://github.com/FL33TW00D/whisper-turbo

Cross-Platform, GPU Accelerated Whisper 🏎️

audio machine-learning rust speech-recognition webgpu whisper windows

Last synced: 01 Aug 2024

https://github.com/pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

ai audio-to-text golang speech-recognition speech-to-text stt subtitles sveltekit transcription ui web web-whisper webapp whisper

Last synced: 26 Sep 2024

https://github.com/sdkcarlos/artyom.js

A voice control, voice commands, speech recognition, and speech synthesis JavaScript library. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.

recognition speech-recognition speech-synthesis speech-to-text voice-commands

Last synced: 30 Sep 2024

https://github.com/alumae/kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.

speech-recognition

Last synced: 26 Sep 2024

https://github.com/k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

asr c cpp csharp go kotlin python speech-recognition vad voice-activity-detection

Last synced: 30 Sep 2024

https://github.com/modal-labs/quillman

A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.

ai language-model python serverless speech-recognition speech-to-text

Last synced: 31 Jul 2024

https://github.com/athena-team/athena

An open-source implementation of a sequence-to-sequence-based speech processing engine

asr ctc deployment sequence-to-sequence speaker-recognition speech-recognition speech-synthesis tensorflow transformer tts unsupervised-learning wfst

Last synced: 08 Aug 2024

https://github.com/freewym/espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

asr end-to-end fairseq kaldi python pytorch speech-recognition

Last synced: 01 Aug 2024

https://github.com/lhotse-speech/lhotse

Tools for handling speech data in machine learning projects.

ai audio data deep-learning kaldi machine-learning python pytorch speech speech-recognition

Last synced: 08 Aug 2024

https://github.com/sooftware/conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

asr augmented cnn conformer conv convolution pytorch recognition speech speech-recognition transformer transformer-xl

Last synced: 08 Aug 2024

https://github.com/astorfi/speechpy

💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

feature-extraction python speech-recognition speechpy

Last synced: 03 Oct 2024

https://github.com/softcatala/whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on CTranslate2.

openai- openai-whisper speech-recognition speech-to-text whisper

Last synced: 26 Sep 2024

https://github.com/Chenyme/Chenyme-AAVT

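A fully automatic (audio) video translation project. It uses Whisper to recognize speech, translates the subtitles with a large AI model, and finally merges the subtitles with the video to produce the translated video.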

faster-whisper gpt-4 gpt-4o speech-recognition video-translation whisper

Last synced: 01 Aug 2024

https://github.com/alphacep/vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

asr grpc kaldi python saas speech-recognition vosk webrtc websocket

Last synced: 07 Aug 2024

https://github.com/gauravsingh9356/j.a.r.v.i.s

Personal assistant built using Python libraries. It handles a wide range of tasks, including sending emails, optical text recognition, dynamic news reporting at any time with API integration, to-do list generation, opening any website with just a voice command, playing music, Wikipedia searching, a dictionary with intelligent sensing (i.e. automatic spell checking), weather reporting (temperature, wind speed, humidity), YouTube searching, Google Maps searching, YouTube downloading, etc.

chatgpt dictionary-application difflib hacktoberfest hactober-accepted hactoberfest2021 jarvis jarvis-ai newsapi opencv optical-character-recognition optical-text-recognition python python3 pyttsx3 speech-recognition tesseract tesseract-ocr weather-api webbrowser

Last synced: 01 Oct 2024

https://github.com/Softcatala/whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on CTranslate2.

openai- openai-whisper speech-recognition speech-to-text whisper

Last synced: 01 Aug 2024

https://github.com/jtkim-kaist/VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

acam attention bdnn data dnn lstm speech speech-activity-detection speech-recognition vad voice-activity-detection voice-detection

Last synced: 03 Aug 2024