
https://github.com/kennethleungty/Text-to-Audio-with-Bark

Exploring Bark, the Open-Source Text-to-Audio Generative Model


# Exploring Text-to-Audio with Bark

Link to article: https://betterprogramming.pub/text-to-audio-generation-with-bark-clearly-explained-4ee300a3713a

## Context
- Amidst the transformative surge of generative AI, text-to-audio models are emerging as one of the most promising frontiers.
- These advances go beyond converting text to speech: they aim to produce audio that can be difficult to distinguish from human-produced content.
- From audiobooks narrated in any voice imaginable to dynamic music compositions prompted by mere sentences, the potential applications are vast and captivating.
- The linked article covers the capabilities and technical details of Bark, an open-source, text-prompted audio generation model with a Python implementation.

___

## Introducing Bark
Bark is a transformer-based text-to-audio model capable of generating realistic multilingual speech, music, and sound effects. It was created by Suno, a research-driven company that develops cutting-edge audio AI.
As Bark was developed for research purposes, its pre-trained model checkpoints have been made open-source and available for commercial use, which is a valuable contribution to the generative AI community.
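
As a rough illustration of what text-prompted generation looks like in practice, here is a minimal sketch that turns a text prompt into a WAV file. It assumes the `bark` package from the suno-ai/bark repository and `scipy` are installed, and it follows the basic usage pattern shown in that repository. The function name `synthesize_to_wav`, the output file name, and the example preset are illustrative choices, not part of this project.

```python
from pathlib import Path
from typing import Optional


def synthesize_to_wav(text: str, out_path: str, voice_preset: Optional[str] = None) -> Path:
    """Generate audio for `text` with Bark and save it as a WAV file.

    `synthesize_to_wav` is a hypothetical helper name used for illustration.
    """
    # Imported lazily so this module can be loaded without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    # Downloads and caches the pre-trained checkpoints on first use.
    preload_models()

    # `history_prompt` selects an optional speaker preset; None lets the
    # model choose a voice on its own.
    audio_array = generate_audio(text, history_prompt=voice_preset)

    # Bark returns a NumPy array sampled at SAMPLE_RATE.
    write_wav(out_path, SAMPLE_RATE, audio_array)
    return Path(out_path)
```

A call such as `synthesize_to_wav("Hello, my name is Suno. [laughs]", "bark_demo.wav", voice_preset="v2/en_speaker_6")` would then write the generated speech to `bark_demo.wav`. The first run can take a while, since the model checkpoints have to be downloaded.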

___

### References
- https://github.com/suno-ai/bark
- https://audiocraft.metademolab.com/encodec.html
- https://www.streamingmedia.com/Articles/ReadArticle.aspx?ArticleID=74487
- https://towardsdatascience.com/optimizing-vector-quantization-methods-by-machine-learning-algorithms-77c436d0749d
- https://www.assemblyai.com/blog/what-is-residual-vector-quantization/
- https://github.com/facebookresearch/encodec
- https://ai.meta.com/blog/ai-powered-audio-compression-technique/
- https://arxiv.org/abs/2210.13438
- https://github.com/facebookresearch/encodec#extracting-discrete-representations
- https://paperswithcode.com/paper/speaker-anonymization-using-neural-audio
- https://huggingface.co/suno/bark/tree/main/speaker_embeddings/v2