Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/NATSpeech/NATSpeech
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
https://github.com/NATSpeech/NATSpeech
diffsinger diffspeech huggingface portaspeech pytorch speech speech-synthesis tts
Last synced: 25 days ago
JSON representation
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
- Host: GitHub
- URL: https://github.com/NATSpeech/NATSpeech
- Owner: NATSpeech
- License: mit
- Created: 2022-02-13T03:44:45.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-04-02T00:55:24.000Z (over 1 year ago)
- Last Synced: 2024-08-08T23:21:12.181Z (4 months ago)
- Topics: diffsinger, diffspeech, huggingface, portaspeech, pytorch, speech, speech-synthesis, tts
- Language: Python
- Homepage:
- Size: 173 KB
- Stars: 962
- Watchers: 20
- Forks: 99
- Open Issues: 20
-
Metadata Files:
- Readme: README-zh.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - NATSpeech/NATSpeech - TTS) 框架,包括 PortaSpeech (NeurIPS 2021) 和 DiffSpeech (AAAI 2022) 的官方 PyTorch 实现。PortaSpeech:便携式和高质量的生成文本到语音转换(NeurIPS 2021)。DiffSinger:通过浅扩散机制合成歌唱声音(DiffSpeech)(AAAI 2022)。 (语音合成 / 网络服务_其他)
README
NATSpeech: A Non-Autoregressive Text-to-Speech Framework
[![](https://img.shields.io/github/stars/NATSpeech/NATSpeech)](https://github.com/NATSpeech/NATSpeech)
[![](https://img.shields.io/github/forks/NATSpeech/NATSpeech)](https://github.com/NATSpeech/NATSpeech)
[![](https://img.shields.io/github/license/NATSpeech/NATSpeech)](https://github.com/NATSpeech/NATSpeech/blob/main/LICENSE)
[![](https://img.shields.io/github/downloads/NATSpeech/NATSpeech/total?label=pretrained+model+downloads)](https://github.com/NATSpeech/NATSpeech/releases/tag/pretrained_models) | [English README](./README.md)本仓库包含了以下工作的官方PyTorch实现:
- [PortaSpeech: Portable and High-Quality Generative Text-to-Speech](https://proceedings.neurips.cc/paper/2021/file/748d6b6ed8e13f857ceaa6cfbdca14b8-Paper.pdf) (NeurIPS 2021)
[Demo页面](https://portaspeech.github.io/) | [HuggingFace🤗 Demo](https://huggingface.co/spaces/NATSpeech/PortaSpeech)
- [DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism](https://arxiv.org/abs/2105.02446) (DiffSpeech) (AAAI 2022)
[Demo页面](https://diffsinger.github.io/) | [项目主页](https://github.com/MoonInTheRiver/DiffSinger) | [HuggingFace🤗 Demo](https://huggingface.co/spaces/NATSpeech/DiffSpeech)## 主要特点
我们在本框架中实现了以下特点:- 基于[Montreal Forced Aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner)的非自回归语音合成数据处理流程;
- 便于使用和可扩展的训练和测试框架;
- 简单但有效的随机访问数据集类的实现。## 安装依赖
```bash
## 在 Linux/Ubuntu 18.04 上通过测试
## 首先需要安装 Python 3.6+ (推荐使用Anaconda)export PYTHONPATH=.
# 创建虚拟环境 (推荐).
python -m venv venv
source venv/bin/activate
# 安装依赖
pip install -U pip
pip install Cython numpy==1.19.1
pip install torch==1.9.0 # 推荐 torch >= 1.9.0
pip install -r requirements.txt
sudo apt install -y sox libsox-fmt-mp3
bash mfa_usr/install_mfa.sh # 安装强制对齐工具
```## 文档
- [关于本框架](./docs/zh/framework.md)
- [运行PortaSpeech](./docs/portaspeech.md)
- [运行DiffSpeech](./docs/diffspeech.md)## 引用
如果本REPO对你的研究和工作有用,请引用以下论文:
- PortaSpeech
```bib
@article{ren2021portaspeech,
title={PortaSpeech: Portable and High-Quality Generative Text-to-Speech},
author={Ren, Yi and Liu, Jinglin and Zhao, Zhou},
journal={Advances in Neural Information Processing Systems},
volume={34},
year={2021}
}
```- DiffSpeech
```bib
@article{liu2021diffsinger,
title={Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
author={Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Liu, Peng and Zhao, Zhou},
journal={arXiv preprint arXiv:2105.02446},
volume={2},
year={2021}
}
```## 致谢
我们的代码受以下代码和仓库启发:
- [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning)
- [ParallelWaveGAN](https://github.com/kan-bayashi/ParallelWaveGAN)
- [Hifi-GAN](https://github.com/jik876/hifi-gan)
- [espnet](https://github.com/espnet/espnet)
- [Glow-TTS](https://github.com/jaywalnut310/glow-tts)
- [DiffSpeech](https://github.com/MoonInTheRiver/DiffSinger)