Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with speech-recognition
A curated list of projects in awesome lists tagged with speech-recognition.
https://github.com/huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
bert deep-learning flax hacktoberfest jax language-model language-models machine-learning model-hub natural-language-processing nlp nlp-library pretrained-models python pytorch pytorch-transformers seq2seq speech-recognition tensorflow transformer
Last synced: 29 Sep 2024
https://github.com/huggingface/pytorch-pretrained-BERT
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
bert deep-learning flax hacktoberfest jax language-model language-models machine-learning model-hub natural-language-processing nlp nlp-library pretrained-models python pytorch pytorch-transformers seq2seq speech-recognition tensorflow transformer
Last synced: 11 Aug 2024
https://github.com/huggingface/pytorch-transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
bert deep-learning flax hacktoberfest jax language-model language-models machine-learning model-hub natural-language-processing nlp nlp-library pretrained-models python pytorch pytorch-transformers seq2seq speech-recognition tensorflow transformer
Last synced: 05 Aug 2024
https://github.com/ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
inference openai speech-recognition speech-to-text transformer whisper
Last synced: 02 Oct 2024
https://github.com/mozilla/DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
deep-learning deepspeech embedded machine-learning neural-networks offline on-device speech-recognition speech-to-text tensorflow
Last synced: 30 Jul 2024
https://github.com/mozilla/deepspeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
deep-learning deepspeech embedded machine-learning neural-networks offline on-device speech-recognition speech-to-text tensorflow
Last synced: 29 Sep 2024
https://github.com/leon-ai/leon
🧠 Leon is your open-source personal assistant.
ai ai-assistant artificial-intelligence assistant automation bot chatbot flite leon nodejs offline personal-assistant privacy python speech-recognition speech-synthesis speech-to-text text-to-speech virtual-assistant voice-assistant
Last synced: 25 Sep 2024
https://github.com/kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
c-plus-plus cuda kaldi shell speaker-id speaker-verification speech speech-recognition speech-to-text
Last synced: 29 Sep 2024
https://github.com/nvidia/deeplearningexamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
computer-vision deep-learning drug-discovery forecasting large-language-models mxnet nlp paddlepaddle pytorch recommender-systems speech-recognition speech-synthesis tensorflow tensorflow2 translation
Last synced: 29 Sep 2024
https://github.com/NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
computer-vision deep-learning drug-discovery forecasting large-language-models mxnet nlp paddlepaddle pytorch recommender-systems speech-recognition speech-synthesis tensorflow tensorflow2 translation
Last synced: 31 Jul 2024
https://github.com/kmario23/deep-learning-drizzle
Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
artificial-intelligence-algorithms artificial-neural-networks bayesian-statistics computer-vision deep-learning deep-neural-networks deep-reinforcement-learning explainable-ai geometric-deep-learning graph-neural-networks machine-learning medical-imaging natural-language-processing optimization pattern-recognition probabilistic-graphical-models probability reinforcement-learning speech-recognition visual-recognition
Last synced: 30 Sep 2024
https://github.com/paddlepaddle/paddlespeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper
Last synced: 29 Sep 2024
https://github.com/PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper
Last synced: 31 Jul 2024
https://github.com/guillaumekln/faster-whisper
Faster Whisper transcription with CTranslate2
deep-learning inference openai quantization speech-recognition speech-to-text transformer whisper
Last synced: 12 Aug 2024
https://github.com/SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
deep-learning inference openai quantization speech-recognition speech-to-text transformer whisper
Last synced: 31 Jul 2024
https://github.com/systran/faster-whisper
Faster Whisper transcription with CTranslate2
deep-learning inference openai quantization speech-recognition speech-to-text transformer whisper
Last synced: 26 Sep 2024
https://github.com/m-bain/whisperx
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
asr speech speech-recognition speech-to-text whisper
Last synced: 26 Sep 2024
https://github.com/m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
asr speech speech-recognition speech-to-text whisper
Last synced: 30 Jul 2024
https://github.com/uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
audio python speech-recognition speech-to-text
Last synced: 29 Sep 2024
https://github.com/Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
audio python speech-recognition speech-to-text
Last synced: 31 Jul 2024
https://github.com/espnet/espnet
End-to-End Speech Processing Toolkit
chainer deep-learning end-to-end kaldi machine-translation pytorch singing-voice-synthesis speaker-diarization speech-enhancement speech-recognition speech-separation speech-synthesis speech-translation spoken-language-understanding voice-conversion
Last synced: 29 Sep 2024
https://github.com/speechbrain/speechbrain
A PyTorch-based Speech Toolkit
asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition
Last synced: 29 Sep 2024
https://github.com/nl8590687/asrt_speechrecognition
A Deep-Learning-Based Chinese Speech Recognition System
asrt chinese-speech-recognition cnn ctc keras python python3 speech-recognition speech-to-text tensorflow
Last synced: 02 Oct 2024
https://github.com/nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
asrt chinese-speech-recognition cnn ctc keras python python3 speech-recognition speech-to-text tensorflow
Last synced: 31 Jul 2024
https://github.com/alphacep/vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
android asr deep-learning deep-neural-networks deepspeech google-speech-to-text ios kaldi offline privacy python raspberry-pi speaker-identification speaker-verification speech-recognition speech-to-text speech-to-text-android stt voice-recognition vosk
Last synced: 29 Sep 2024
https://github.com/openvinotoolkit/openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
ai computer-vision deep-learning deploy-ai diffusion-models generative-ai good-first-issue inference llm-inference natural-language-processing nlp openvino optimize-ai performance-boost recommendation-system speech-recognition stable-diffusion transformers yolo
Last synced: 29 Sep 2024
https://github.com/TalAter/annyang
💬 Speech recognition for your site
hacktoberfest speech speech-recognition speech-to-text voice
Last synced: 30 Jul 2024
https://github.com/talater/annyang
💬 Speech recognition for your site
hacktoberfest speech speech-recognition speech-to-text voice
Last synced: 29 Sep 2024
https://github.com/flashlight/wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
cpp deep-learning end-to-end speech-recognition wav2letter
Last synced: 30 Sep 2024
https://github.com/modelscope/funasr
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper
Last synced: 26 Sep 2024
https://github.com/snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
asr capitalization colab english german onnx pretrained-models pytorch repunctuation spanish speech speech-recognition speech-synthesis speech-to-text stt stt-benchmark text-to-speech torch-hub tts tts-models
Last synced: 31 Jul 2024
https://github.com/sanchit-gandhi/whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
deep-learning jax speech-recognition speech-to-text whisper
Last synced: 26 Sep 2024
https://github.com/wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
asr automatic-speech-recognition conformer e2e-models production-ready pytorch speech-recognition transformer whisper
Last synced: 26 Sep 2024
https://github.com/Picovoice/Porcupine
On-device wake word detection powered by deep learning
handsfree hotword hotword-detection hotword-detector keyword-spotter keyword-spotting on-device speech-recognition trigger-word-detection voice-activation wake-word wake-word-detection wake-word-engine
Last synced: 14 Aug 2024
https://github.com/Picovoice/porcupine
On-device wake word detection powered by deep learning
handsfree hotword hotword-detection hotword-detector keyword-spotter keyword-spotting on-device speech-recognition trigger-word-detection voice-activation wake-word wake-word-detection wake-word-engine
Last synced: 31 Jul 2024
https://github.com/huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
audio speech-recognition whisper
Last synced: 26 Sep 2024
https://github.com/mahmoudashraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
asr speaker-diarization speech speech-recognition speech-to-text whisper
Last synced: 26 Sep 2024
https://github.com/modelscope/funclip
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
gradio gradio-python-llm llm speech-recognition speech-to-text subtitles-generator video-clip video-subtitles
Last synced: 26 Sep 2024
https://github.com/yanshengjia/ml-road
Machine Learning Resources, Practice and Research
computer-vision deep-learning machine-learning nlp pytorch speech-recognition tensorflow
Last synced: 07 Aug 2024
https://github.com/zzw922cn/Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
audio automatic-speech-recognition chinese-speech-recognition cnn data-preprocessing deep-learning end-to-end evaluation feature-vector layer-normalization lstm paper phonemes rnn rnn-encoder-decoder speech-recognition tensorflow timit-dataset
Last synced: 01 Aug 2024
https://github.com/zzw922cn/automatic_speech_recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
audio automatic-speech-recognition chinese-speech-recognition cnn data-preprocessing deep-learning end-to-end evaluation feature-vector layer-normalization lstm paper phonemes rnn rnn-encoder-decoder speech-recognition tensorflow timit-dataset
Last synced: 30 Sep 2024
https://github.com/funaudiollm/sensevoice
Multilingual Voice Understanding Model
ai aigc asr audio-event-classification cross-lingual gpt-4o llm multilingual python pytorch speech-emotion-recognition speech-recognition speech-to-text
Last synced: 30 Sep 2024
https://github.com/MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
asr speaker-diarization speech speech-recognition speech-to-text whisper
Last synced: 31 Jul 2024
https://github.com/toverainc/willow
Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
alexa deep-learning echo esp-adf esp-idf esp32 google-home home-assistant home-automation privacy speech-recognition speech-to-text whisper
Last synced: 27 Sep 2024
https://github.com/mravanelli/pytorch-kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
asr deep-learning deep-neural-networks dnn dnn-hmm gru kaldi lstm lstm-neural-networks multilayer-perceptron-network pytorch recurrent-neural-networks rnn rnn-model speech speech-recognition timit
Last synced: 30 Sep 2024
https://github.com/rhasspy/rhasspy
Offline private voice assistant for many human languages
home-assistant node-red privacy speech-recognition voice-assistants voice-commands
Last synced: 29 Sep 2024
https://github.com/coqui-ai/stt
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition
Last synced: 26 Sep 2024
https://github.com/argmaxinc/WhisperKit
Swift native on-device speech recognition with Whisper for Apple Silicon
inference ios macos pretrained-models speech-recognition swift transformers visionos watchos whisper
Last synced: 31 Jul 2024
https://github.com/argmaxinc/whisperkit
Swift native on-device speech recognition with Whisper for Apple Silicon
inference ios macos pretrained-models speech-recognition swift transformers visionos watchos whisper
Last synced: 26 Sep 2024
https://github.com/coqui-ai/STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
asr automatic-speech-recognition deep-learning speech-recognition speech-recognition-api speech-recognizer speech-to-text stt tensorflow voice-recognition
Last synced: 31 Jul 2024
https://github.com/pannous/tensorflow-speech-recognition
🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks
deep-learning neural-network speech-recognition speech-to-text stt tensorflow
Last synced: 30 Sep 2024
https://github.com/ahmetoner/whisper-asr-webservice
OpenAI Whisper ASR Webservice API
asr automatic-speech-recognition docker openai-whisper speech speech-recognition speech-to-text
Last synced: 25 Sep 2024
https://github.com/alan-ai/alan-sdk-ios
Conversational AI SDK for iOS to enable text and voice conversations with actions (Swift, Objective-C)
alan-ios-sdk alan-studio alan-voice chatbot conversational-ai ios machine-learning sdk speech-recognition voice voice-ai voice-assistant voice-commands
Last synced: 30 Sep 2024
https://github.com/linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
asr attention-is-all-you-need attention-mechanism attention-model attention-network attention-seq2seq attention-visualization deep-learning machine-learning multilingual-models python python3 pytorch speaker-diarization speech speech-processing speech-recognition speech-to-text transformers whisper
Last synced: 26 Sep 2024
https://github.com/syhw/wer_are_we
Attempt at tracking the state of the art and recent results (bibliography) on speech recognition.
deep-neural-network speech-recognition wer
Last synced: 30 Sep 2024
https://github.com/nobody132/masr
Chinese speech recognition; Mandarin Automatic Speech Recognition
chinese-speech-recognition mandarin-chinese pytorch speech-recognition
Last synced: 30 Sep 2024
https://github.com/alan-ai/alan-sdk-android
Conversational AI SDK for Android to enable text and voice conversations with actions (Java, Kotlin)
alan-ai alan-sdk alan-studio alan-voice android conversational-ai machine-learning multimodal sdk speech-recognition text-to-speech voice voice-assistant voice-commands voice-control voice-interface vui
Last synced: 30 Sep 2024
https://github.com/julius-speech/julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
audio-processing recognition speech speech-recognition
Last synced: 30 Sep 2024
https://github.com/jianchang512/stt
Voice Recognition to Text Tool / An offline, locally running speech-to-text service that outputs JSON, timestamped SRT subtitles, and plain text
speech speech-recognition speech-to-text stt
Last synced: 31 Jul 2024
https://github.com/FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
ai aigc asr audio-event-classification cross-lingual gpt-4o llm multilingual python pytorch speech-emotion-recognition speech-recognition speech-to-text
Last synced: 03 Aug 2024
https://github.com/astorfi/lip-reading-deeplearning
🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
3d-convolutional-network computer-vision deep-learning speech-recognition tensorflow
Last synced: 03 Oct 2024
https://github.com/alan-ai/alan-sdk-flutter
Conversational AI SDK for Flutter to enable text and voice conversations with actions (iOS and Android)
alan-sdk alan-studio alan-voice chatbot conversational-ai flutter machine-learning multimodal sdk speech-recognition text-to-speech voice voice-ai voice-assistant voice-commands voice-control voice-interface vui
Last synced: 30 Sep 2024
https://github.com/kalliope-project/kalliope
Kalliope is a framework that will help you to create your own personal assistant.
bot bot-creation home-automation jarvis linux personal-assistant raspberry speech-recognition speech-synthesis speech-to-text
Last synced: 30 Sep 2024
https://github.com/react-native-voice/voice
🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)
android ios react-native speech-recognition voice-recognition
Last synced: 30 Sep 2024
https://github.com/alan-ai/alan-sdk-ionic
Conversational AI SDK for Ionic to enable text and voice conversations with actions (React, Angular, Vue)
alan-ionic-sdk alan-studio chatbot conversational-ai ionic machine-learning multimodal sdk speech-recognition text-to-speech voice voice-ai voice-assistant voice-commands voice-control voice-interface vui
Last synced: 29 Sep 2024
https://github.com/fl33tw00d/whisper-turbo
Cross-Platform, GPU Accelerated Whisper 🏎️
audio machine-learning rust speech-recognition webgpu whisper windows
Last synced: 26 Sep 2024
https://github.com/bjoernkarmann/project_alias
Alias is a teachable “parasite” that is designed to give users more control over their smart assistants, both when it comes to customisation and privacy. Through a simple app the user can train Alias to react on a custom wake-word/sound, and once trained, Alias can take control over your home assistant by activating it for you.
alias classification hack machine-learning microphone raspberry-pi smarthome sound-synthesis speech-recognition wakeword
Last synced: 28 Sep 2024
https://github.com/FL33TW00D/whisper-turbo
Cross-Platform, GPU Accelerated Whisper 🏎️
audio machine-learning rust speech-recognition webgpu whisper windows
Last synced: 01 Aug 2024
https://github.com/Delta-ML/delta
DELTA is a deep learning based natural language and speech processing platform.
asr custom-ops deep-learning emotion-recognition front-end inference nlp nlu ops seq2seq sequence-to-sequence serving speaker-verification speech speech-recognition tensorflow tensorflow-lite tensorflow-serving text-classification text-generation
Last synced: 01 Aug 2024
https://github.com/NVIDIA/OpenSeq2Seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
deep-learning float16 language-model mixed-precision multi-gpu multi-node neural-machine-translation seq2seq sequence-to-sequence speech-recognition speech-synthesis speech-to-text tensorflow text-to-speech
Last synced: 07 Aug 2024
https://github.com/pluja/whishper
Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!
ai audio-to-text golang speech-recognition speech-to-text stt subtitles sveltekit transcription ui web web-whisper webapp whisper
Last synced: 26 Sep 2024
https://github.com/DragonComputer/Dragonfire
The open-source virtual assistant for Ubuntu-based Linux distributions
artificial-intelligence chatbot kaldi linux machine-learning nlp personal-assistant spacy speech-recognition speech-to-text text-to-speech ubuntu virtual-assistant
Last synced: 01 Aug 2024
https://github.com/dragoncomputer/dragonfire
The open-source virtual assistant for Ubuntu-based Linux distributions
artificial-intelligence chatbot kaldi linux machine-learning nlp personal-assistant spacy speech-recognition speech-to-text text-to-speech ubuntu virtual-assistant
Last synced: 25 Sep 2024
https://github.com/sc0ty/subsync
Subtitle Speech Synchronizer
speech-recognition subtitle-speech-synchronizer subtitles synchronization
Last synced: 30 Sep 2024
https://github.com/coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
speech-emotion-recognition speech-processing speech-recognition speech-separation speech-synthesis speech-to-text stt text-to-speech tts voice-activity-detection voice-cloning voice-recognition
Last synced: 30 Sep 2024
https://github.com/miteshputhran/speech-emotion-analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 30 Sep 2024
https://github.com/MiteshPuthran/Speech-Emotion-Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 31 Jul 2024
https://github.com/sdkcarlos/artyom.js
A voice control, voice commands, speech recognition and speech synthesis JavaScript library. Create your own Siri, Google Now or Cortana with Google Chrome within your website.
recognition speech-recognition speech-synthesis speech-to-text voice-commands
Last synced: 30 Sep 2024
https://github.com/microsoft/speecht5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
speech-pretraining speech-recognition speech-synthesis speech-text-pretraining speech-translation speech2c speechlm speecht5 speechut vallex vatlm
Last synced: 30 Sep 2024
https://github.com/mravanelli/sincnet
SincNet is a neural architecture for efficiently processing raw audio samples.
artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering neural-networks python pytorch signal-processing speaker-identification speaker-recognition speaker-verification speech-processing speech-recognition timit waveform
Last synced: 30 Sep 2024
https://github.com/mravanelli/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering neural-networks python pytorch signal-processing speaker-identification speaker-recognition speaker-verification speech-processing speech-recognition timit waveform
Last synced: 02 Aug 2024
https://github.com/alumae/kaldi-gstreamer-server
Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
Last synced: 26 Sep 2024
https://github.com/pykaldi/pykaldi
A Python wrapper for Kaldi
asr clif feature-extraction kaldi language-model numpy openfst python speech speech-recognition wrapper
Last synced: 01 Oct 2024
https://github.com/k2-fsa/sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A etc.
asr c cpp csharp go kotlin python speech-recognition vad voice-activity-detection
Last synced: 30 Sep 2024
https://github.com/modal-labs/quillman
A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.
ai language-model python serverless speech-recognition speech-to-text
Last synced: 31 Jul 2024
https://github.com/athena-team/athena
An open-source implementation of a sequence-to-sequence based speech processing engine
asr ctc deployment sequence-to-sequence speaker-recognition speech-recognition speech-synthesis tensorflow transformer tts unsupervised-learning wfst
Last synced: 08 Aug 2024
https://github.com/synesthesiam/rhasspy
Rhasspy voice assistant for offline home automation
catalan dutch french german greek hassio hindi home-assistant intent-recognition italian mandarin portuguese russian spanish speech-recognition swedish vietnamese voice-assistant
Last synced: 29 Sep 2024
https://github.com/freewym/espresso
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
asr end-to-end fairseq kaldi python pytorch speech-recognition
Last synced: 01 Aug 2024
https://github.com/lhotse-speech/lhotse
Tools for handling speech data in machine learning projects.
ai audio data deep-learning kaldi machine-learning python pytorch speech speech-recognition
Last synced: 08 Aug 2024
https://github.com/sooftware/conformer
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
asr augmented cnn conformer conv convolution pytorch recognition speech speech-recognition transformer transformer-xl
Last synced: 08 Aug 2024
https://github.com/astorfi/speechpy
💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
feature-extraction python speech-recognition speechpy
Last synced: 03 Oct 2024
https://github.com/softcatala/whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
openai- openai-whisper speech-recognition speech-to-text whisper
Last synced: 26 Sep 2024
https://github.com/Chenyme/Chenyme-AAVT
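A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and then merges the subtitles with the video to produce the translated video.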
faster-whisper gpt-4 gpt-4o speech-recognition video-translation whisper
Last synced: 01 Aug 2024
https://github.com/bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
audio audio-processing bytedance iclr2024 large-language-models multi-modal music speech speech-recognition tsinghua-university
Last synced: 01 Aug 2024
https://github.com/alphacep/vosk-server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
asr grpc kaldi python saas speech-recognition vosk webrtc websocket
Last synced: 07 Aug 2024
https://github.com/gauravsingh9356/j.a.r.v.i.s
Personal Assistant built using python libraries. It does almost anything which includes sending emails, Optical Text Recognition, Dynamic News Reporting at any time with API integration, Todo list generator, Opens any website with just a voice command, Plays Music, Wikipedia searching, Dictionary with Intelligent Sensing i.e. auto spell checking, Weather Reporting i.e. temp, wind speed, humidity, YouTube searching, Google Map searching, Youtube Downloading, etc.
chatgpt dictionary-application difflib hacktoberfest hactober-accepted hactoberfest2021 jarvis jarvis-ai newsapi opencv optical-character-recognition optical-text-recognition python python3 pyttsx3 speech-recognition tesseract tesseract-ocr weather-api webbrowser
Last synced: 01 Oct 2024
https://github.com/Softcatala/whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
openai- openai-whisper speech-recognition speech-to-text whisper
Last synced: 01 Aug 2024
https://github.com/jtkim-kaist/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
acam attention bdnn data dnn lstm speech speech-activity-detection speech-recognition vad voice-activity-detection voice-detection
Last synced: 03 Aug 2024