Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with speech-translation
A curated list of projects in awesome lists tagged with speech-translation.
https://github.com/nvidia/nemo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
asr deeplearning generative-ai large-language-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts
Last synced: 16 Dec 2024
https://github.com/paddlepaddle/paddlespeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper
Last synced: 16 Dec 2024
https://github.com/espnet/espnet
End-to-End Speech Processing Toolkit
chainer deep-learning end-to-end kaldi machine-translation pytorch singing-voice-synthesis speaker-diarization speech-enhancement speech-recognition speech-separation speech-synthesis speech-translation spoken-language-understanding text-to-speech voice-conversion
Last synced: 16 Dec 2024
https://github.com/huggingface/speech-to-speech
Speech To Speech: an effort toward an open-source, modular GPT-4o
ai assistant language-model machine-learning python speech speech-synthesis speech-to-text speech-translation
Last synced: 07 Sep 2024
https://github.com/microsoft/speecht5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
speech-pretraining speech-recognition speech-synthesis speech-text-pretraining speech-translation speech2c speechlm speecht5 speechut vallex vatlm
Last synced: 15 Dec 2024
https://github.com/ictnlp/streamspeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
all-in-one asr audio-processing machine-translation non-autoregressive seamless simultaneous-translation speech speech-enhancement speech-processing speech-recognition speech-synthesis speech-to-text speech-translation streaming-audio text-to-audio text-to-speech translation tts voice
Last synced: 20 Dec 2024
https://github.com/dadangdut33/speech-translate
A real-time speech transcription and translation application using OpenAI Whisper and free translation APIs. The interface is built with Tkinter, and the code is written entirely in Python.
python speech-transcription speech-translation tkinter-python translate whisper
Last synced: 21 Dec 2024
https://github.com/double22a/speech_dataset
Datasets for speech recognition
asr audio automatic-speech-recognition dataset deep-learning deep-neural-networks speech speech-diarization speech-enhancement speech-recognition speech-segmentation speech-separation speech-synthesis speech-to-text speech-translation text-to-speech tts voice-conversion wav
Last synced: 13 Nov 2024
https://github.com/echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes easy-to-use tools for synthesis, recognition, forced alignment, speech translation, source separation, language detection and more.
command-line forced-alignment language-detection language-identification node-js source-separation speech speech-alignment speech-recognition speech-synthesis speech-to-text speech-translation text-to-speech voice-isolation
Last synced: 16 Dec 2024
https://github.com/ictnlp/daspeech
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
machine-translation speech-to-speech speech-to-speech-translation speech-translation
Last synced: 19 Dec 2024
https://github.com/ictnlp/STEMM
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".
machine-translation speech-to-text speech-translation
Last synced: 13 Nov 2024
https://github.com/ictnlp/diseg
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
machine-translation segment segmentation sequence-segmentation simultaneous-machine-translation simultaneous-translation speech speech-translation streaming streaming-speech-to-text
Last synced: 13 Nov 2024
https://github.com/liamdugan/speech-to-speech
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
simultaneous-translation speech speech-processing speech-to-speech speech-translation
Last synced: 27 Oct 2024
https://github.com/ictnlp/itst
Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"
end-to-end-speech-translation machine-translation simultaneous-machine-translation simultaneous-translation speech-translation
Last synced: 13 Nov 2024
https://github.com/ictnlp/comspeech
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
machine-translation non-autoregressive-translation speech-to-speech-translation speech-translation text-to-speech zero-shot-speech-translation
Last synced: 13 Nov 2024
https://github.com/ictnlp/bt4st
Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".
machine-translation speech-to-text speech-translation
Last synced: 13 Nov 2024
https://github.com/ictnlp/cress
Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".
machine-translation speech-to-text speech-translation
Last synced: 13 Nov 2024
https://github.com/yousef0sa/speech-to-text
Speech-To-Text is a C# desktop app that uses Azure Cognitive Services to convert and translate speech. You can copy or show the text on the screen, and choose the language of the speech or the translation.
azure-cognitive-services c-sharp desktop-application dotnet speech-to-text speech-translation
Last synced: 14 Dec 2024
https://github.com/tran-khoa/joint-training-cascaded-st
Code for the paper "Does Joint Training Really Help Cascaded Speech Translation?" (EMNLP 2022)
emnlp2022 fairseq nlp speech-translation
Last synced: 19 Nov 2024
https://github.com/prashver/nlp-driven-video-summarizer-and-insight-tool
An NLP-powered tool for transcribing, summarizing, and indexing podcast content, with video-to-audio conversion and multilingual support.
flask-application huggingface-transformers keyword-extraction named-entity-recognition natural-language-processing ntlk spacy speech-to-text speech-translation text-summarization topic-modeling
Last synced: 18 Dec 2024