https://github.com/laion-ai/text-to-speech


Host: GitHub
URL: https://github.com/laion-ai/text-to-speech
Owner: LAION-AI
License: apache-2.0
Created: 2023-10-18T06:09:40.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2023-11-04T10:10:21.000Z (over 1 year ago)
Last Synced: 2025-05-07T18:13:31.147Z (about 1 month ago)
Language: Python
Size: 60.5 KB
Stars: 59
Watchers: 10
Forks: 6
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

README

## Audio pipeline for TTS Datasets

This project provides high-level APIs for feature engineering on audio files. The implementation is modular and config-driven, so any dataset processing pipeline can be built and managed with the same building blocks.

### Creating a new pipeline

Create a YAML file under the `config/pipelines/` directory with the following structure:

```
pipeline:
  loader:
    target: manager.Downloader
    args:
      configs:
        - config/datasets/data-config.yaml
      save_dir: raw_data/yt_data
  manager:
    target: manager.YoutubeRunner
    processors:
      - name: chunking
        target: modules.AudioChunking
        args:
          model_choice: pydub_chunking
      - name: denoise_audio
        target: modules.DenoiseAudio
        args:
          model_choice: meta_denoiser_dns48
      - name: audio_superres
        target: modules.SuperResAudio
        args:
          model_choice: voicefixer
      ...
```

**Pipeline Schema**

- **Loader**: The entry point for fetching data from various sources like S3, local systems, or blob storage.
- **Manager**: Specifies the manager class responsible for running the pipeline.
- **Processors**: An ordered list of processors to apply for feature extraction or other manipulations.
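As a rough illustration of how a config-driven pipeline like this can work, the sketch below resolves each `target` string to a class, instantiates it with its `args`, and applies the processors in order. It is not the project's actual implementation: `resolve_target`, `build`, `run_processors`, the stand-in `demo_modules` module and its toy processors are all hypothetical names, and the callable-processor interface is an assumption.

```python
import importlib
import sys
import types


def resolve_target(dotted_path):
    """Import `package.module.Name` and return the attribute `Name`."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def build(spec):
    """Instantiate the class named by spec["target"], passing spec["args"] as keywords."""
    cls = resolve_target(spec["target"])
    return cls(**spec.get("args", {}))


def run_processors(item, processor_specs):
    """Apply each configured processor to the item, in list order."""
    for spec in processor_specs:
        processor = build(spec)
        item = processor(item)
    return item


# Toy stand-ins so the sketch runs without the real `modules` package.
class Upper:
    def __init__(self, model_choice="default"):
        self.model_choice = model_choice

    def __call__(self, item):
        return item.upper()


class AddSuffix:
    def __init__(self, suffix=""):
        self.suffix = suffix

    def __call__(self, item):
        return item + self.suffix


# Register the stand-ins under a hypothetical module name so that
# importlib can resolve "demo_modules.<ClassName>".
demo = types.ModuleType("demo_modules")
demo.Upper = Upper
demo.AddSuffix = AddSuffix
sys.modules["demo_modules"] = demo

# Mirrors the shape of the `processors` list in the pipeline config.
processors_config = [
    {"name": "upper", "target": "demo_modules.Upper", "args": {"model_choice": "toy"}},
    {"name": "suffix", "target": "demo_modules.AddSuffix", "args": {"suffix": ".wav"}},
]

result = run_processors("clip_001", processors_config)
print(result)  # CLIP_001.wav
```

Because the list is ordered, swapping two entries in the config changes the order in which the steps run, with no code changes.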

If you need a new feature extractor or manager, check the `modules/` directory to understand the structure, then create or update the objects as needed.
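For orientation, a new processor might look roughly like the sketch below. The class name `PeakNormalize`, the `peak` parameter and the callable interface are assumptions made for illustration. Only the `model_choice` argument comes from the config above, so check the existing classes in `modules/` for the real base class and method names.

```python
class PeakNormalize:
    """Hypothetical processor: scales samples so the loudest one reaches `peak`."""

    def __init__(self, model_choice="peak", peak=1.0):
        # `model_choice` mirrors the config key; only one choice exists in this sketch.
        if model_choice != "peak":
            raise ValueError(f"unsupported model_choice: {model_choice!r}")
        self.peak = peak

    def __call__(self, samples):
        loudest = max((abs(s) for s in samples), default=0.0)
        if loudest == 0.0:
            return list(samples)  # silent or empty input: nothing to scale
        scale = self.peak / loudest
        return [s * scale for s in samples]


normalizer = PeakNormalize(model_choice="peak", peak=0.5)
normalized = normalizer([0.1, -0.25, 0.2])
print(normalized)  # [0.2, -0.5, 0.4]
```

A class like this would then be referenced from the pipeline config through its `target` path and `args`, in the same way as the built-in processors.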

### Run pipelines

```
python workers/pipeline.py --configs
```
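The README does not document how the `--configs` flag is parsed. As a hedged sketch only, a command-line entry point that accepts one or more pipeline config paths could look like this. The function `parse_args`, the `nargs="+"` choice and the path `config/pipelines/example.yaml` are assumptions, not taken from `workers/pipeline.py`.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical --configs flag that takes one or more YAML paths."""
    parser = argparse.ArgumentParser(
        description="Run dataset pipelines from YAML configs (illustrative sketch)."
    )
    parser.add_argument(
        "--configs",
        nargs="+",       # assumption: several config files may be passed at once
        required=True,
        help="path(s) to pipeline config files under config/pipelines/",
    )
    return parser.parse_args(argv)


# Passing argv explicitly keeps the sketch runnable without a real command line.
args = parse_args(["--configs", "config/pipelines/example.yaml"])
for path in args.configs:
    print(f"would run pipeline from {path}")
```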

## Acknowledgements

We would like to credit a few of the amazing folks in the community who have helped make this happen:
- [bud-studio](https://bud.studio/) - For providing an initial framework
