Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Projects in Awesome Lists tagged with speech

A curated list of projects in awesome lists tagged with speech.

https://github.com/babysor/mockingbird

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

ai deep-learning pytorch speech text-to-speech tts

Last synced: 16 Dec 2024

https://github.com/babysor/MockingBird

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

ai deep-learning pytorch speech text-to-speech tts

Last synced: 27 Oct 2024

https://github.com/huggingface/datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

computer-vision datasets deep-learning hacktoberfest machine-learning natural-language-processing nlp numpy pandas pytorch speech tensorflow

Last synced: 16 Dec 2024

https://github.com/idea-research/grounded-segment-anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

3d-whole-body-pose-estimation automatic-labeling-system caption data-generation image-editing open-vocabulary-detection open-vocabulary-segmentation speech

Last synced: 16 Dec 2024

https://github.com/IDEA-Research/Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

3d-whole-body-pose-estimation automatic-labeling-system caption data-generation image-editing open-vocabulary-detection open-vocabulary-segmentation speech

Last synced: 27 Oct 2024

https://github.com/kaldi-asr/kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

c-plus-plus cuda kaldi shell speaker-id speaker-verification speech speech-recognition speech-to-text

Last synced: 16 Dec 2024

https://github.com/m-bain/whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text whisper

Last synced: 16 Dec 2024

https://github.com/aigc-audio/audiogpt

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audio gpt music sound speech talking-head

Last synced: 17 Dec 2024

https://github.com/AIGC-Audio/AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audio gpt music sound speech talking-head

Last synced: 29 Oct 2024

https://github.com/mozilla/tts

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

dataset-analysis deep-learning gantts glow-tts melgan multiband-melgan python pytorch speaker-encoder speech tacotron tacotron2 tensorflow2 text-to-speech tts vocoder

Last synced: 17 Dec 2024

https://github.com/mozilla/TTS

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

dataset-analysis deep-learning gantts glow-tts melgan multiband-melgan python pytorch speaker-encoder speech tacotron tacotron2 tensorflow2 text-to-speech tts vocoder

Last synced: 25 Oct 2024

https://github.com/m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text whisper

Last synced: 25 Oct 2024

https://github.com/netease-youdao/emotivoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

ai deep-learning emotion emotivoice multi-speaker prompt python pytorch speech speech-synthesis style text-to-speech tts

Last synced: 16 Dec 2024

https://github.com/netease-youdao/EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

ai deep-learning emotion emotivoice multi-speaker prompt python pytorch speech speech-synthesis style text-to-speech tts

Last synced: 29 Oct 2024

https://github.com/modelscope/modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

cv deep-learning machine-learning multi-modal nlp python science speech

Last synced: 16 Dec 2024

https://github.com/paddlepaddle/models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

computer-vision cv deep-learning models natural-language-processing neural-network nlp paddlepaddle recommendation speech

Last synced: 17 Dec 2024

https://github.com/PaddlePaddle/models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

computer-vision cv deep-learning models natural-language-processing neural-network nlp paddlepaddle recommendation speech

Last synced: 28 Oct 2024

https://github.com/talater/annyang

💬 Speech recognition for your site

speech speech-recognition speech-to-text voice

Last synced: 16 Dec 2024

https://github.com/TalAter/annyang

💬 Speech recognition for your site

hacktoberfest speech speech-recognition speech-to-text voice

Last synced: 25 Oct 2024

https://github.com/snakers4/silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

asr capitalization colab english german onnx pretrained-models pytorch repunctuation spanish speech speech-recognition speech-synthesis speech-to-text stt stt-benchmark text-to-speech torch-hub tts tts-models

Last synced: 18 Dec 2024

https://github.com/MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 31 Oct 2024

https://github.com/mahmoudashraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text whisper

Last synced: 17 Dec 2024

https://github.com/huggingface/speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

ai assistant language-model machine-learning python speech speech-synthesis speech-to-text speech-translation

Last synced: 07 Sep 2024

https://github.com/avinashkranjan/amazing-python-scripts

🚀 Curated collection of amazing Python scripts, from basic to advanced, with automation task scripts.

artificial-intelligence hacktoberfest machine-learning projects python python-projects python-scripts speech webcam

Last synced: 18 Dec 2024

https://github.com/hahahumble/speechgpt

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

chat chatbot chatgpt conversation language-learning speech

Last synced: 20 Dec 2024

https://github.com/avinashkranjan/Amazing-Python-Scripts

🚀 Curated collection of amazing Python scripts, from basic to advanced, with automation task scripts.

artificial-intelligence hacktoberfest machine-learning projects python python-projects python-scripts speech webcam

Last synced: 27 Oct 2024

https://github.com/jianchang512/stt

Voice Recognition to Text Tool: an offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

speech speech-recognition speech-to-text stt

Last synced: 19 Dec 2024

https://github.com/pytorch/audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

audio audio-processing io machine-learning python pytorch speech

Last synced: 21 Dec 2024

https://github.com/readbeyond/aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

alignment audio cli dtw espeak espeak-ng festival ffmpeg forced-alignment linux macos nlp python smil speech srt text text-to-speech tts windows

Last synced: 17 Dec 2024

https://github.com/mravanelli/pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

asr deep-learning deep-neural-networks dnn dnn-hmm gru kaldi lstm lstm-neural-networks multilayer-perceptron-network pytorch recurrent-neural-networks rnn rnn-model speech speech-recognition timit

Last synced: 21 Dec 2024

https://github.com/pndurette/gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

cli gtts pypi python python-library speech speech-api text-to-speech tts

Last synced: 16 Dec 2024

https://github.com/pndurette/gTTS

Python library and CLI tool to interface with Google Translate's text-to-speech API

cli gtts pypi python python-library speech speech-api text-to-speech tts

Last synced: 25 Oct 2024

https://github.com/julius-speech/julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine

audio-processing recognition speech speech-recognition

Last synced: 18 Dec 2024

https://github.com/Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

speech speech-synthesis-model tensorflow tts

Last synced: 27 Nov 2024

https://github.com/kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

speech speech-synthesis-model tensorflow tts

Last synced: 21 Dec 2024

https://github.com/iahispano/applio

A simple, high-quality voice conversion tool focused on ease of use and performance

ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion

Last synced: 19 Dec 2024

https://github.com/IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance

ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion

Last synced: 14 Nov 2024

https://github.com/praat/praat

Praat: Doing Phonetics By Computer

acoustics phonetics speech speech-analysis

Last synced: 19 Dec 2024

https://github.com/csteinmetz1/ai-audio-startups

Community list of startups working with AI in audio and music technology

audio list music speech startups

Last synced: 03 Dec 2024

https://github.com/dengbocong/nlp-paper

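Papers in the field of natural language processing (with reading notes), model reproductions, data processing, and more (code provided in both TensorFlow and PyTorch versions)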

bert dialogue nlp nlp-machine-learning paper pytorch speech tensorflow2

Last synced: 21 Dec 2024

https://github.com/DengBoCong/nlp-paper

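Papers in the field of natural language processing (with reading notes), model reproductions, data processing, and more (code provided in both TensorFlow and PyTorch versions)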

bert dialogue nlp nlp-machine-learning paper pytorch speech tensorflow2

Last synced: 14 Nov 2024

https://github.com/kyubyong/dc_tts

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

speech speech-to-text tts

Last synced: 15 Dec 2024

https://github.com/Kyubyong/dc_tts

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

speech speech-to-text tts

Last synced: 07 Nov 2024

https://github.com/roatienza/deep-learning-experiments

Videos, notes and experiments to understand deep learning

artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision

Last synced: 19 Dec 2024

https://github.com/roatienza/Deep-Learning-Experiments

Videos, notes and experiments to understand deep learning

artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision

Last synced: 30 Oct 2024

https://github.com/sooftware/conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

asr augmented cnn conformer conv convolution pytorch recognition speech speech-recognition transformer transformer-xl

Last synced: 16 Dec 2024

https://github.com/NATSpeech/NATSpeech

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

diffsinger diffspeech huggingface portaspeech pytorch speech speech-synthesis tts

Last synced: 27 Nov 2024

https://github.com/lhotse-speech/lhotse

Tools for handling speech data in machine learning projects.

ai audio data deep-learning kaldi machine-learning python pytorch speech speech-recognition

Last synced: 28 Nov 2024

https://github.com/jtkim-kaist/VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

acam attention bdnn data dnn lstm speech speech-activity-detection speech-recognition vad voice-activity-detection voice-detection

Last synced: 14 Nov 2024

https://github.com/yeyupiaoling/ppasr

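End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to real-world use, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer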

asr chinese conformer deep-learning deepspeech2 paddlepaddle speech speech-recognition speech-to-text squeezeformer streaming-asr

Last synced: 19 Dec 2024

https://github.com/yeyupiaoling/PPASR

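End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to real-world use, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer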

asr chinese conformer deep-learning deepspeech2 paddlepaddle speech speech-recognition speech-to-text squeezeformer streaming-asr

Last synced: 14 Nov 2024

https://github.com/santi-pdp/segan

Speech Enhancement Generative Adversarial Network in TensorFlow

deep-learning deep-neural-networks gan generative-adversarial-networks generative-model speech tensorflow

Last synced: 22 Nov 2024

https://github.com/googleapis/nodejs-speech

This repository is deprecated. All of its content and history have been moved to googleapis/google-cloud-node.

machine-learning nodejs speech speech-to-text

Last synced: 25 Oct 2024

https://github.com/demiseom/specaugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

data-augmentation python pytorch specaugment speech speech-recognition tensorflow

Last synced: 20 Dec 2024

https://github.com/DemisEom/SpecAugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

data-augmentation python pytorch specaugment speech speech-recognition tensorflow

Last synced: 27 Nov 2024

https://github.com/coqui-ai/TTS-papers

🐸 collection of TTS papers

coqui-ai deep-learning papers research-paper speech tts

Last synced: 16 Nov 2024

https://github.com/evancohen/sonus

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

alexa hotword-detection keyword-spotting node speech speech-recognition speech-to-text stt voice-control voice-recognition

Last synced: 19 Dec 2024

https://github.com/yeyupiaoling/masr

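A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, as well as multiple data augmentation methods.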

asr conformer deep-learning deepspeech pytorch speech speech-recognition speech-to-text squeezeformer

Last synced: 19 Dec 2024

https://github.com/vbelz/Speech-enhancement

Deep learning for audio denoising

cnn deep-learning speech unet

Last synced: 06 Nov 2024

https://github.com/OlaWod/FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

pytorch speech voice-conversion

Last synced: 18 Nov 2024

https://github.com/azkadev/whisper

Whisper Dart is a cross-platform library for Dart and Flutter that converts audio to text (speech to text) using inference from OpenAI models

ai android dart flutter ggml indonesia ios linux macos openai speech speech-recognition speech-synthesis speech-to-text transcribe transformer whisper whisper-dart whisper-flutter windows

Last synced: 21 Dec 2024

https://github.com/coqui-ai/tts-papers

🐸 collection of TTS papers

coqui-ai deep-learning papers research-paper speech tts

Last synced: 10 Nov 2024

https://github.com/xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

phonetics pytorch speech speech-recognition

Last synced: 12 Oct 2024

https://github.com/google/tacotron

Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.

audio machine-learning prosody speech tacotron tts

Last synced: 08 Nov 2024

https://github.com/Audio-WestlakeU/FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

audio band denoising full-band narrow-band noise-reduction paper pretrained-model pytorch reproducible-research single-channel speech speech-enhancement speech-processing speech-separation sub-band

Last synced: 22 Nov 2024

https://github.com/ddlbojack/speech-resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

speech speech-processing

Last synced: 21 Nov 2024

https://github.com/ddlBoJack/Speech-Resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

speech speech-processing

Last synced: 02 Nov 2024

https://github.com/gotev/android-speech

Android speech recognition and text to speech made easy

android recognition speech tts

Last synced: 20 Dec 2024

https://github.com/modelscope/kan-tts

KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

modelscope speech speech-synthesis tts

Last synced: 21 Dec 2024