https://github.com/olaviinha/neuraltexttoaudio

Text prompt steered synthetic audio generators

audio audio-generation audio-processing audio-synthesis audioldm colab colab-notebook mubert mubertai music-generation text2audio text2music voice-cloning voice-synthesis


Host: GitHub
URL: https://github.com/olaviinha/neuraltexttoaudio
Owner: olaviinha
Created: 2022-12-21T22:19:49.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-12-02T15:19:34.000Z (over 1 year ago)
Last Synced: 2023-12-02T16:27:29.435Z (over 1 year ago)
Topics: audio, audio-generation, audio-processing, audio-synthesis, audioldm, colab, colab-notebook, mubert, mubertai, music-generation, text2audio, text2music, voice-cloning, voice-synthesis
Language: Jupyter Notebook
Homepage:
Size: 337 KB
Stars: 34
Watchers: 2
Forks: 4
Open Issues: 2
Metadata Files:
- Readme: README.md

README

# Colab notebooks for text-to-audio generators

User-friendly Colab notebooks for various text prompt steered synthetic audio generators.

**Available notebooks:**

- [AudioLDM](https://colab.research.google.com/github/olaviinha/NeuralTextToAudio/blob/main/AudioLDM_pub.ipynb) – _text-to-audio_

- [TorToiSe TTS](https://colab.research.google.com/github/olaviinha/NeuralTextToAudio/blob/main/tortoise_tts_pub.ipynb) – _text-to-speech w/ voice-cloning_

- [MubertAI Text-to-Music](https://colab.research.google.com/github/olaviinha/NeuralTextToMusic/blob/main/mubert_txt2music.ipynb) – _text-to-music_

- [TTS Voice Cloning](https://colab.research.google.com/github/olaviinha/NeuralTextToAudio/blob/main/TTS_voice_cloning_pub.ipynb) – _text-to-speech w/ voice-cloning_
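
The notebook links above all follow the same pattern: the Colab host, then `/github/`, then the repository owner, repository name, `blob`, branch, and notebook path. As a minimal sketch of that pattern (the helper names `colab_url` and `colab_badge` are illustrative and not part of this repository), the links and badges could be generated like this:

```python
from urllib.parse import quote

# Base of the "open a GitHub notebook in Colab" URL pattern used in this README.
COLAB_BASE = "https://colab.research.google.com/github"
BADGE_SVG = "https://colab.research.google.com/assets/colab-badge.svg"


def colab_url(owner: str, repo: str, notebook_path: str, branch: str = "main") -> str:
    """Build a Colab URL for a notebook hosted on GitHub (hypothetical helper)."""
    # Percent-encode the path but keep "/" so nested folders still work.
    path = quote(notebook_path.lstrip("/"))
    return f"{COLAB_BASE}/{owner}/{repo}/blob/{branch}/{path}"


def colab_badge(url: str) -> str:
    """Wrap a Colab URL in the Markdown badge used throughout this README."""
    return f"[![Open In Colab]({BADGE_SVG})]({url})"


if __name__ == "__main__":
    url = colab_url("olaviinha", "NeuralTextToAudio", "AudioLDM_pub.ipynb")
    print(url)
    print(colab_badge(url))
```

Swapping the notebook filename (for example `tortoise_tts_pub.ipynb`) yields the corresponding link for the other notebooks in the same repository.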

---

## AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/olaviinha/NeuralTextToAudio/blob/main/AudioLDM_pub.ipynb)

Paper: [Text-to-Audio Generation with Latent Diffusion Models](https://arxiv.org/abs/2301.12503)

Colab for [AudioLDM](https://github.com/haoheliu/AudioLDM). Generates audio from a text description. This is probably the beginning of _"Stable Diffusion of audio"_. Currently produces 16 kHz audio only.
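
To put the 16 kHz figure in context: at that sample rate each second of audio holds 16,000 samples, and the highest frequency that can be represented is half the sample rate, 8 kHz. The sketch below does not use AudioLDM at all; it only uses the Python standard library to write a 16 kHz mono WAV file, so you can hear what that bandwidth sounds like. The function names `nyquist_hz` and `write_tone` and the output filename are made up for illustration.

```python
import math
import struct
import wave

SAMPLE_RATE = 16_000  # Hz, the output rate stated above for AudioLDM


def nyquist_hz(sample_rate: int) -> float:
    """Highest frequency representable at the given sample rate."""
    return sample_rate / 2


def write_tone(path: str, freq_hz: float = 440.0, seconds: float = 1.0,
               sample_rate: int = SAMPLE_RATE) -> int:
    """Write a mono 16-bit sine tone to `path`; return the number of samples."""
    n_samples = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_samples):
        value = math.sin(2 * math.pi * freq_hz * i / sample_rate)
        # Scale to half of the 16-bit range to leave some headroom.
        frames += struct.pack("<h", int(value * 32767 * 0.5))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_samples


if __name__ == "__main__":
    count = write_tone("tone_16khz.wav")
    print(f"wrote {count} samples; Nyquist limit is {nyquist_hz(SAMPLE_RATE)} Hz")
```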

---

## TorToiSe: Text-to-speech

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/olaviinha/NeuralTextToAudio/blob/main/tortoise_tts_pub.ipynb)

Paper: [TorToiSe - Spending Compute for High Quality TTS](https://docs.google.com/document/d/13O_eyY65i6AkNrN_LdPhpUjGhyTNKYHvDrIvHnHe1GA)

Colab for [TorToiSe](https://github.com/neonbjb/tortoise-tts) text-to-speech voice-cloning. This notebook takes a text string and an audio file (or files) of a speaker's voice, and attempts to synthesize the text using the given voice. Currently works with English text only.

---

## MubertAI Text-to-Music

UPDATE: The Mubert API now appears to require a (paid) API key.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/olaviinha/NeuralTextToMusic/blob/main/mubert_txt2music.ipynb)

Colab for [MubertAI Text-to-Music](https://github.com/MubertAI/Mubert-Text-to-Music). Generates music from a text description using predefined blocks that are, as far as I know, created by the community. See the [source repository](https://github.com/MubertAI/Mubert-Text-to-Music) for further information, such as licensing.

---

## TTS Voice Cloning

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/olaviinha/NeuralTextToAudio/blob/main/TTS_voice_cloning_pub.ipynb)

Paper: [Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis](https://arxiv.org/pdf/1806.04558.pdf)

Colab for [Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning) text-to-speech voice-cloning. This notebook takes a text string and an audio file of a speaker's voice, and attempts to synthesize the text using the given voice. Fair warning: results are not great.
