https://github.com/sitamgithub-msit/tangoflux-litserve
Leverage TangoFlux's text-to-audio capabilities using LitServe.
artificial-intelligence deep-learning fastapi flow-matching generative-ai lightning-ai litserve python pytorch text-to-audio
- Host: GitHub
- URL: https://github.com/sitamgithub-msit/tangoflux-litserve
- Owner: sitamgithub-MSIT
- License: apache-2.0
- Created: 2025-02-03T20:54:38.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-02-14T21:50:46.000Z (12 months ago)
- Last Synced: 2025-02-14T22:30:09.678Z (12 months ago)
- Topics: artificial-intelligence, deep-learning, fastapi, flow-matching, generative-ai, lightning-ai, litserve, python, pytorch, text-to-audio
- Language: Python
- Homepage: https://lightning.ai/sitammeur/studios/deploy-tangoflux-audio-generation-model
- Size: 255 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# TangoFlux LitServe
[Open in Lightning AI Studio](https://lightning.ai/sitammeur/studios/deploy-tangoflux-audio-generation-model)
TangoFlux is a text-to-audio generation model that uses Diffusion Transformers conditioned on a text prompt and a target duration to produce high-quality audio. It is trained with a three-stage process that includes preference optimization on synthetic data. This project shows how to create a self-hosted, private API that deploys the [TangoFlux text-to-audio model](https://huggingface.co/declare-lab/TangoFlux) with LitServe, an easy-to-use, flexible serving engine for AI models built on FastAPI.
## Project Structure
The project is structured as follows:
- `server.py`: The file containing the main code for the web server.
- `client.py`: The file containing the code for client-side requests.
- `LICENSE`: The license file for the project.
- `README.md`: The README file that contains information about the project.
- `assets`: The folder containing screenshots of the working application.
- `.gitignore`: The file containing the list of files and directories to be ignored by Git.
## Tech Stack
- Python (for the programming language)
- PyTorch (for the deep learning framework)
- Hugging Face Transformers Library (for the model)
- LitServe (for the serving engine)
## Getting Started
To get started with this project, follow the steps below:
1. Run the server: `python server.py`
2. Once the server starts successfully, you will see uvicorn running on port 8000.
3. Open a new terminal window.
4. Run the client: `python client.py`
You can now see the model's output for the input request. The model generates an audio file based on the input prompt and duration.
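As a rough illustration of step 4, the sketch below shows what a client request to the running server could look like using only the Python standard library. It is not the contents of `client.py`. The JSON field names `prompt` and `duration`, the output file name, and the assumption that the server replies with raw audio bytes are all hypothetical; `/predict` is LitServe's default route and is assumed to be unchanged here.

```python
import json
import urllib.request

# LitServe serves on /predict by default; port 8000 matches the steps above.
SERVER_URL = "http://127.0.0.1:8000/predict"


def build_payload(prompt: str, duration: int) -> bytes:
    """Encode the prompt and duration as a JSON request body."""
    # Assumption: the server expects the keys "prompt" and "duration".
    return json.dumps({"prompt": prompt, "duration": duration}).encode("utf-8")


def request_audio(prompt: str, duration: int, out_path: str = "output.wav") -> str:
    """POST the request and write the response body to out_path."""
    req = urllib.request.Request(
        SERVER_URL,
        data=build_payload(prompt, duration),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Audio generation can be slow, so allow a generous timeout.
    # Assumption: the response body is the raw audio file.
    with urllib.request.urlopen(req, timeout=600) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path
```

With the server running, calling `request_audio("Rain falling on a tin roof", 10)` would send one request and save whatever the server returns to `output.wav`.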
## Usage
The project serves the TangoFlux text-to-audio model using LitServe. You provide a text prompt and a duration, and the model generates an audio file. Potential use cases include generating sound for videos, audiobooks, entertainment, and more.
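To make the prompt-and-duration flow concrete, here is a minimal sketch of how a LitServe API for TangoFlux might be structured. It is an illustration under stated assumptions, not the code in `server.py`. The request keys, the default of 10 seconds, the 1 to 30 second limit, the 50 inference steps, and the WAV response format are all assumptions. The sketch also assumes the `tangoflux`, `torchaudio`, and `litserve` packages are installed; these are imported lazily so the validation helper can be used on its own.

```python
import io


def parse_request(request: dict) -> tuple[str, int]:
    """Extract and validate the prompt and duration from a JSON request body."""
    # Assumption: the request uses the keys "prompt" and "duration".
    prompt = str(request["prompt"]).strip()
    duration = int(request.get("duration", 10))
    if not prompt:
        raise ValueError("prompt must not be empty")
    # Assumption: durations are limited to 1-30 seconds.
    if not 1 <= duration <= 30:
        raise ValueError("duration must be between 1 and 30 seconds")
    return prompt, duration


def build_server():
    """Create a LitServer wrapping TangoFlux (heavy imports happen here)."""
    import litserve as ls
    import torchaudio
    from fastapi import Response
    from tangoflux import TangoFluxInference

    class TangoFluxAPI(ls.LitAPI):
        def setup(self, device):
            # Load the model once per worker.
            self.model = TangoFluxInference(name="declare-lab/TangoFlux")

        def decode_request(self, request):
            return parse_request(request)

        def predict(self, inputs):
            prompt, duration = inputs
            return self.model.generate(prompt, steps=50, duration=duration)

        def encode_response(self, audio):
            # Serialize the waveform tensor to WAV bytes at 44.1 kHz.
            buffer = io.BytesIO()
            torchaudio.save(buffer, audio, sample_rate=44100, format="wav")
            return Response(content=buffer.getvalue(), media_type="audio/wav")

    return ls.LitServer(TangoFluxAPI(), accelerator="auto")
```

Calling `build_server().run(port=8000)` would start the server on the port mentioned in Getting Started. Keeping validation in a plain function like `parse_request` makes it easy to check bad inputs without loading the model.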
## Contributing
Contributions are welcome! If you would like to contribute to this project, please raise an issue to discuss the changes you want to make. Once the changes are approved, you can create a pull request.
## License
This project is licensed under the [Apache-2.0 License](LICENSE).
## Contact
If you have any questions or suggestions about the project, please contact me on my GitHub profile.
Happy coding! 🚀