Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with speech
A curated list of projects in awesome lists tagged with speech.
https://github.com/babysor/mockingbird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
ai deep-learning pytorch speech text-to-speech tts
Last synced: 16 Dec 2024
https://github.com/babysor/MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
ai deep-learning pytorch speech text-to-speech tts
Last synced: 27 Oct 2024
https://github.com/coqui-ai/tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis
Last synced: 16 Dec 2024
https://github.com/coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis
Last synced: 25 Oct 2024
https://github.com/svc-develop-team/so-vits-svc
SoftVC VITS Singing Voice Conversion
ai audio-analysis deep-learning flow generative-adversarial-network pytorch singing-voice-conversion so-vits-svc sovits speech variational-inference vc vits voice voice-changer voice-conversion voiceconversion
Last synced: 29 Sep 2024
https://github.com/huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
computer-vision datasets deep-learning hacktoberfest machine-learning natural-language-processing nlp numpy pandas pytorch speech tensorflow
Last synced: 16 Dec 2024
https://github.com/idea-research/grounded-segment-anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
3d-whole-body-pose-estimation automatic-labeling-system caption data-generation image-editing open-vocabulary-detection open-vocabulary-segmentation speech
Last synced: 16 Dec 2024
https://github.com/IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
3d-whole-body-pose-estimation automatic-labeling-system caption data-generation image-editing open-vocabulary-detection open-vocabulary-segmentation speech
Last synced: 27 Oct 2024
https://github.com/kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
c-plus-plus cuda kaldi shell speaker-id speaker-verification speech speech-recognition speech-to-text
Last synced: 16 Dec 2024
https://github.com/m-bain/whisperx
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
asr speech speech-recognition speech-to-text whisper
Last synced: 16 Dec 2024
https://github.com/aigc-audio/audiogpt
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
audio gpt music sound speech talking-head
Last synced: 17 Dec 2024
https://github.com/AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
audio gpt music sound speech talking-head
Last synced: 29 Oct 2024
https://github.com/mozilla/tts
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
dataset-analysis deep-learning gantts glow-tts melgan multiband-melgan python pytorch speaker-encoder speech tacotron tacotron2 tensorflow2 text-to-speech tts vocoder
Last synced: 17 Dec 2024
https://github.com/mozilla/TTS
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
dataset-analysis deep-learning gantts glow-tts melgan multiband-melgan python pytorch speaker-encoder speech tacotron tacotron2 tensorflow2 text-to-speech tts vocoder
Last synced: 25 Oct 2024
https://github.com/m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
asr speech speech-recognition speech-to-text whisper
Last synced: 25 Oct 2024
https://github.com/netease-youdao/emotivoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
ai deep-learning emotion emotivoice multi-speaker prompt python pytorch speech speech-synthesis style text-to-speech tts
Last synced: 16 Dec 2024
https://github.com/netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
ai deep-learning emotion emotivoice multi-speaker prompt python pytorch speech speech-synthesis style text-to-speech tts
Last synced: 29 Oct 2024
https://github.com/modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
cv deep-learning machine-learning multi-modal nlp python science speech
Last synced: 16 Dec 2024
https://github.com/paddlepaddle/models
Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.
computer-vision cv deep-learning models natural-language-processing neural-network nlp paddlepaddle recommendation speech
Last synced: 17 Dec 2024
https://github.com/PaddlePaddle/models
Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.
computer-vision cv deep-learning models natural-language-processing neural-network nlp paddlepaddle recommendation speech
Last synced: 28 Oct 2024
https://github.com/talater/annyang
💬 Speech recognition for your site
speech speech-recognition speech-to-text voice
Last synced: 16 Dec 2024
https://github.com/TalAter/annyang
💬 Speech recognition for your site
hacktoberfest speech speech-recognition speech-to-text voice
Last synced: 25 Oct 2024
https://github.com/snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
asr capitalization colab english german onnx pretrained-models pytorch repunctuation spanish speech speech-recognition speech-synthesis speech-to-text stt stt-benchmark text-to-speech torch-hub tts tts-models
Last synced: 18 Dec 2024
https://github.com/snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
onnx onnx-runtime onnxruntime pytorch speech speech-processing vad voice-activity-detection voice-commands voice-control voice-detection voice-recognition
Last synced: 18 Dec 2024
https://github.com/metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
ai deep-learning pytorch speech speech-synthesis text-to-speech tts voice-clone zero-shot-tts
Last synced: 17 Dec 2024
https://github.com/MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
asr speaker-diarization speech speech-recognition speech-to-text whisper
Last synced: 31 Oct 2024
https://github.com/mahmoudashraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
asr speaker-diarization speech speech-recognition speech-to-text whisper
Last synced: 17 Dec 2024
https://github.com/shu223/iOS-10-Sampler
Code examples for new APIs of iOS 10.
cnn convolutional-neural-networks demo image-recognition ios ios10 metal metal-cnn metal-performance-shaders speech swift-3 swift-4 uiviewpropertyanimator
Last synced: 24 Nov 2024
https://github.com/shu223/ios-10-sampler
Code examples for new APIs of iOS 10.
cnn convolutional-neural-networks demo image-recognition ios ios10 metal metal-cnn metal-performance-shaders speech swift-3 swift-4 uiviewpropertyanimator
Last synced: 20 Dec 2024
https://github.com/huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT-4o
ai assistant language-model machine-learning python speech speech-synthesis speech-to-text speech-translation
Last synced: 07 Sep 2024
https://github.com/avinashkranjan/amazing-python-scripts
🚀 Curated collection of Amazing Python scripts, from basic to advanced, with automation task scripts.
artificial-intelligence hacktoberfest machine-learning projects python python-projects python-scripts speech webcam
Last synced: 18 Dec 2024
https://github.com/hahahumble/speechgpt
💬 SpeechGPT is a web application that enables you to converse with ChatGPT.
chat chatbot chatgpt conversation language-learning speech
Last synced: 20 Dec 2024
https://github.com/avinashkranjan/Amazing-Python-Scripts
🚀 Curated collection of Amazing Python scripts, from basic to advanced, with automation task scripts.
artificial-intelligence hacktoberfest machine-learning projects python python-projects python-scripts speech webcam
Last synced: 27 Oct 2024
https://github.com/jianchang512/stt
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
speech speech-recognition speech-to-text stt
Last synced: 19 Dec 2024
https://github.com/rikorose/deepfilternet
Noise suppression using deep filtering
audio deep-learning noise-suppression pytorch rust speech speech-enhancement
Last synced: 17 Dec 2024
https://github.com/pytorch/audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
audio audio-processing io machine-learning python pytorch speech
Last synced: 21 Dec 2024
https://github.com/camb-ai/mars5-tts
MARS5 speech model (TTS) from CAMB.AI
prosody speech speech-synthesis text-to-speech voice-cloneai voice-cloning
Last synced: 19 Dec 2024
https://github.com/readbeyond/aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
alignment audio cli dtw espeak espeak-ng festival ffmpeg forced-alignment linux macos nlp python smil speech srt text text-to-speech tts windows
Last synced: 17 Dec 2024
https://github.com/Rikorose/DeepFilterNet
Noise suppression using deep filtering
audio deep-learning noise-suppression pytorch rust speech speech-enhancement
Last synced: 06 Nov 2024
https://github.com/mravanelli/pytorch-kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
asr deep-learning deep-neural-networks dnn dnn-hmm gru kaldi lstm lstm-neural-networks multilayer-perceptron-network pytorch recurrent-neural-networks rnn rnn-model speech speech-recognition timit
Last synced: 21 Dec 2024
https://github.com/r9y9/wavenet_vocoder
WaveNet vocoder
neural-vocoder python pytorch speech speech-processing speech-synthesis wavenet wavenet-vocoder
Last synced: 20 Dec 2024
https://github.com/pndurette/gtts
Python library and CLI tool to interface with Google Translate's text-to-speech API
cli gtts pypi python python-library speech speech-api text-to-speech tts
Last synced: 16 Dec 2024
https://github.com/ahmetoner/whisper-asr-webservice
OpenAI Whisper ASR Webservice API
asr automatic-speech-recognition docker openai-whisper speech speech-recognition speech-to-text
Last synced: 18 Dec 2024
https://github.com/pndurette/gTTS
Python library and CLI tool to interface with Google Translate's text-to-speech API
cli gtts pypi python python-library speech speech-api text-to-speech tts
Last synced: 25 Oct 2024
https://github.com/linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
asr attention-is-all-you-need attention-mechanism attention-model attention-network attention-seq2seq attention-visualization deep-learning machine-learning multilingual-models python python3 pytorch speaker-diarization speech speech-processing speech-recognition speech-to-text transformers whisper
Last synced: 17 Dec 2024
https://github.com/julius-speech/julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
audio-processing recognition speech speech-recognition
Last synced: 18 Dec 2024
https://github.com/Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
speech speech-synthesis-model tensorflow tts
Last synced: 27 Nov 2024
https://github.com/kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
speech speech-synthesis-model tensorflow tts
Last synced: 21 Dec 2024
https://github.com/jarikomppa/soloud
Free, easy, portable audio engine for games
audio blitzmax c cpp engine flac game game-development gamemaker mp3 ogg opensl-es portable python ruby sound sound-effects speech speech-to-text synthesizer
Last synced: 19 Dec 2024
https://github.com/iahispano/applio
A simple, high-quality voice conversion tool focused on ease of use and performance
ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion
Last synced: 19 Dec 2024
https://github.com/IAHispano/Applio
A simple, high-quality voice conversion tool focused on ease of use and performance
ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion
Last synced: 14 Nov 2024
https://github.com/ovidijusparsiunas/deep-chat
Fully customizable AI chatbot component for your website
ai ai-chatbot angular chat chatbot chatgpt cohere component files huggingface image nextjs openai react react-chatbot solid speech svelte vue
Last synced: 17 Dec 2024
https://github.com/Delta-ML/delta
DELTA is a deep learning based natural language and speech processing platform.
asr custom-ops deep-learning emotion-recognition front-end inference nlp nlu ops seq2seq sequence-to-sequence serving speaker-verification speech speech-recognition tensorflow tensorflow-lite tensorflow-serving text-classification text-generation
Last synced: 06 Nov 2024
https://github.com/praat/praat
Praat: Doing Phonetics By Computer
acoustics phonetics speech speech-analysis
Last synced: 19 Dec 2024
https://github.com/OvidijusParsiunas/deep-chat
Fully customizable AI chatbot component for your website
ai ai-chatbot angular chat chatbot chatgpt cohere component files huggingface image nextjs openai react react-chatbot solid speech svelte vue
Last synced: 06 Nov 2024
https://github.com/miteshputhran/speech-emotion-analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 15 Dec 2024
https://github.com/MITESHPUTHRANNEU/Speech-Emotion-Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 14 Dec 2024
https://github.com/MiteshPuthran/Speech-Emotion-Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 30 Oct 2024
https://github.com/dengbocong/nlp-paper
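Papers in the field of natural language processing (with reading notes), along with reproduced models, data processing, and more (code provided in both TensorFlow and PyTorch versions)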
bert dialogue nlp nlp-machine-learning paper pytorch speech tensorflow2
Last synced: 21 Dec 2024
https://github.com/DengBoCong/nlp-paper
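Papers in the field of natural language processing (with reading notes), along with reproduced models, data processing, and more (code provided in both TensorFlow and PyTorch versions)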
bert dialogue nlp nlp-machine-learning paper pytorch speech tensorflow2
Last synced: 14 Nov 2024
https://github.com/kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Last synced: 15 Dec 2024
https://github.com/Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Last synced: 07 Nov 2024
https://github.com/roatienza/deep-learning-experiments
Videos, notes and experiments to understand deep learning
artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision
Last synced: 19 Dec 2024
https://github.com/roatienza/Deep-Learning-Experiments
Videos, notes and experiments to understand deep learning
artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision
Last synced: 30 Oct 2024
https://github.com/bytedance/salmonn
SALMONN: Speech Audio Language Music Open Neural Network
audio audio-processing bytedance iclr2024 icml-2024 large-language-models multi-modal music research speech speech-recognition tsinghua-university
Last synced: 20 Dec 2024
https://github.com/haoheliu/voicefixer
General Speech Restoration
declipping denoise dereverberation mel speech speech-analysis speech-enhancement speech-processing speech-synthesis super-resolution tts vocoder
Last synced: 17 Dec 2024
https://github.com/bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
audio audio-processing bytedance iclr2024 icml-2024 large-language-models multi-modal music research speech speech-recognition tsinghua-university
Last synced: 08 Nov 2024
https://github.com/pykaldi/pykaldi
A Python wrapper for Kaldi
asr clif feature-extraction kaldi language-model numpy openfst python speech speech-recognition wrapper
Last synced: 20 Dec 2024
https://github.com/ictnlp/streamspeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
all-in-one asr audio-processing machine-translation non-autoregressive seamless simultaneous-translation speech speech-enhancement speech-processing speech-recognition speech-synthesis speech-to-text speech-translation streaming-audio text-to-audio text-to-speech translation tts voice
Last synced: 20 Dec 2024
https://github.com/sooftware/conformer
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
asr augmented cnn conformer conv convolution pytorch recognition speech speech-recognition transformer transformer-xl
Last synced: 16 Dec 2024
https://github.com/NATSpeech/NATSpeech
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
diffsinger diffspeech huggingface portaspeech pytorch speech speech-synthesis tts
Last synced: 27 Nov 2024
https://github.com/lhotse-speech/lhotse
Tools for handling speech data in machine learning projects.
ai audio data deep-learning kaldi machine-learning python pytorch speech speech-recognition
Last synced: 28 Nov 2024
https://github.com/jtkim-kaist/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
acam attention bdnn data dnn lstm speech speech-activity-detection speech-recognition vad voice-activity-detection voice-detection
Last synced: 14 Nov 2024
https://github.com/yeyupiaoling/ppasr
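End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to practical use, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular DeepSpeech2, Conformer, and Squeezeformer models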
asr chinese conformer deep-learning deepspeech2 paddlepaddle speech speech-recognition speech-to-text squeezeformer streaming-asr
Last synced: 19 Dec 2024
https://github.com/yeyupiaoling/PPASR
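End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to practical use, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular DeepSpeech2, Conformer, and Squeezeformer models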
asr chinese conformer deep-learning deepspeech2 paddlepaddle speech speech-recognition speech-to-text squeezeformer streaming-asr
Last synced: 14 Nov 2024
https://github.com/santi-pdp/segan
Speech Enhancement Generative Adversarial Network in TensorFlow
deep-learning deep-neural-networks gan generative-adversarial-networks generative-model speech tensorflow
Last synced: 22 Nov 2024
https://github.com/EvelynFan/FaceFormer
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
3d-face 3d-models computer-graphics computer-vision deep-learning facial-animation facial-expressions lip-animation pytorch-implementation speech
Last synced: 07 Nov 2024
https://github.com/goxr3plus/xr3player
🎧 🎼 The MOST ADVANCED JavaFX Media Player
audio-formats audio-player audio-processing audio-recorder audio-visualizer dropbox-client java-speech java-stream-player javafx mp3 spectrum-analyzer speech stream-player web-browser
Last synced: 20 Dec 2024
https://github.com/googleapis/nodejs-speech
This repository is deprecated. All of its content and history has been moved to googleapis/google-cloud-node.
machine-learning nodejs speech speech-to-text
Last synced: 25 Oct 2024
https://github.com/drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
deep-learning end-to-end machine-learning neural-networks speech speech-denoising speech-processing wavenet
Last synced: 22 Nov 2024
https://github.com/demiseom/specaugment
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
data-augmentation python pytorch specaugment speech speech-recognition tensorflow
Last synced: 20 Dec 2024
https://github.com/DemisEom/SpecAugment
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
data-augmentation python pytorch specaugment speech speech-recognition tensorflow
Last synced: 27 Nov 2024
https://github.com/cboard-org/cboard
Augmentative and Alternative Communication (AAC) system with text-to-speech for the browser
aac accessibility assistive-technology autism cerebral-palsy communication communication-board disabilities javascript progressive-web-app react speech symbols text-to-speech tts
Last synced: 25 Oct 2024
https://github.com/coqui-ai/TTS-papers
🐸 collection of TTS papers
coqui-ai deep-learning papers research-paper speech tts
Last synced: 16 Nov 2024
https://github.com/evancohen/sonus
💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
alexa hotword-detection keyword-spotting node speech speech-recognition speech-to-text stt voice-control voice-recognition
Last synced: 19 Dec 2024
https://github.com/yeyupiaoling/masr
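A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, as well as multiple data augmentation methods.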
asr conformer deep-learning deepspeech pytorch speech speech-recognition speech-to-text squeezeformer
Last synced: 19 Dec 2024
https://github.com/vbelz/Speech-enhancement
Deep learning for audio denoising
Last synced: 06 Nov 2024
https://github.com/hirofumi0810/neural_sp
End-to-end ASR/LM implementation with PyTorch
asr attention attention-mechanism automatic-speech-recognition ctc language-model language-modeling pytorch rnn-transducer seq2seq sequence-to-sequence speech speech-recognition streaming transformer transformer-xl
Last synced: 12 Nov 2024
https://github.com/OlaWod/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
pytorch speech voice-conversion
Last synced: 18 Nov 2024
https://github.com/azkadev/whisper
Whisper Dart is a cross-platform library for Dart and Flutter that allows converting audio to text / speech to text / inference from OpenAI models
ai android dart flutter ggml indonesia ios linux macos openai speech speech-recognition speech-synthesis speech-to-text transcribe transformer whisper whisper-dart whisper-flutter windows
Last synced: 21 Dec 2024
https://github.com/coqui-ai/tts-papers
🐸 collection of TTS papers
coqui-ai deep-learning papers research-paper speech tts
Last synced: 10 Nov 2024
https://github.com/xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
phonetics pytorch speech speech-recognition
Last synced: 12 Oct 2024
https://github.com/google/tacotron
Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
audio machine-learning prosody speech tacotron tts
Last synced: 08 Nov 2024
https://github.com/Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
audio band denoising full-band narrow-band noise-reduction paper pretrained-model pytorch reproducible-research single-channel speech speech-enhancement speech-processing speech-separation sub-band
Last synced: 22 Nov 2024
https://github.com/gotev/android-speech
Android speech recognition and text to speech made easy
android recognition speech tts
Last synced: 20 Dec 2024
https://github.com/modelscope/kan-tts
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
modelscope speech speech-synthesis tts
Last synced: 21 Dec 2024