Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with audio-generation
A curated list of projects in awesome lists tagged with audio-generation.
https://github.com/mudler/LocalAI
🤖 The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also offers voice cloning.
ai api audio-generation distributed gemma gpt4all image-generation kubernetes llama llama3 llm mamba mistral musicgen p2p rerank rwkv stable-diffusion text-generation tts
Last synced: 30 Jul 2024
https://github.com/go-skynet/LocalAI
🤖 The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also offers voice cloning.
ai api audio-generation distributed gemma gpt4all image-generation kubernetes llama llama3 llm mamba mistral musicgen p2p rerank rwkv stable-diffusion text-generation tts
Last synced: 01 Aug 2024
https://github.com/open-mmlab/amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
audio-generation audio-synthesis audioldm audit fastspeech2 hifi-gan music-generation naturalspeech2 singing-voice-conversion speech-synthesis text-to-audio text-to-speech vall-e vits voice-conversion
Last synced: 26 Sep 2024
https://github.com/open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
audio-generation audio-synthesis audioldm audit fastspeech2 hifi-gan music-generation naturalspeech2 singing-voice-conversion speech-synthesis text-to-audio text-to-speech vall-e vits voice-conversion
Last synced: 31 Jul 2024
https://github.com/FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
audio-generation cantonese chatbot chatgpt chinese cosyvoice cross-lingual english fine-grained fine-tuning gpt-4o japanese korean multi-lingual natural-language-generation python text-to-speech tts voice-cloning
Last synced: 31 Jul 2024
https://github.com/haoheliu/audioldm
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Last synced: 01 Oct 2024
https://github.com/haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Last synced: 31 Jul 2024
https://github.com/archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
artificial-intelligence audio-generation deep-learning denoising-diffusion
Last synced: 30 Sep 2024
https://github.com/archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
artificial-intelligence audio-generation machine-learning
Last synced: 30 Sep 2024
https://github.com/rsxdalv/tts-generation-webui
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
ai audio-generation audiogen bark deep-learning generator gradio machine-learning magnet music musicgen rvc seamlessm4t styletts2 text-to-speech torch tortoise-tts tts vocos web
Last synced: 01 Aug 2024
https://github.com/lucidrains/soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
artificial-intelligence attention-mechanism audio-generation deep-learning non-autoregressive transformers
Last synced: 03 Oct 2024
https://github.com/declare-lab/tango
Hosts a family of diffusion models for text-to-audio generation.
audio-generation diffusion diffusion-models language-models large-language-models text-to-audio
Last synced: 01 Aug 2024
https://github.com/Yuan-ManX/ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
aigc artificial-intelligence audio audio-effect audio-generation datasets deep-learning machine-learning music-generation
Last synced: 31 Jul 2024
https://github.com/v-iashin/SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
audio audio-generation bmvc evaluation-metrics gan melgan multi-modal pytorch transformer vas vggsound video video-features video-understanding vqvae
Last synced: 01 Aug 2024
https://github.com/Yuan-ManX/audio-development-tools
A list of sound, audio, and music development tools covering machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis, and more.
artificial-intelligence audio audio-generation audio-processing deep-learning dsp machine-learning music music-generation signal-processing speech speech-processing speech-synthesis
Last synced: 31 Jul 2024
https://github.com/happylittlecat2333/Auffusion
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
audio-generation diffusion diffusion-models large-language-models text-to-audio
Last synced: 03 Aug 2024
https://github.com/archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.
artifical-intelligense audio-generation datasets deep-learning pytorch
Last synced: 05 Aug 2024
https://github.com/archinetai/audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
artificial-intelligence audio-generation deep-learning denoising-diffusion
Last synced: 05 Aug 2024
https://github.com/ilaria-manco/word2wave
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
ai-music audio-generation music-generation text-to-audio
Last synced: 05 Aug 2024
https://github.com/rsxdalv/musicgen-prompts
Site for sharing MusicGen + AudioGen Prompts and Creations
ai audio-generation audiogen generator machine-learning musicgen
Last synced: 10 Aug 2024
https://github.com/Bai-YT/ConsistencyTTA
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
audio-generation audio-processing consistency-models diffusion-models ldm
Last synced: 02 Sep 2024
https://github.com/merekat/children-stories
OhanashiGPT is an application that generates personalized children's stories based on parameters like age and preferences. It narrates these stories using an AI-generated voice that mimics a parent, trained on their audio samples. The app also creates illustrations to accompany each story, providing a unique and engaging experience for children.
ai audio-generation data-science image-generation large-language-models llama lora lux neural-networks stable-diffusion story text-generation tts xtts
Last synced: 27 Sep 2024
https://github.com/lucadellalib/bigvgan
A single-file implementation of the BigVGAN generator
audio-generation audio-synthesis bigvgan music-synthesis neural-vocoder pytorch singing-voice-synthesis speech-synthesis
Last synced: 01 Oct 2024
https://github.com/koppalexander/ohanashi-childgpt
OhanashiGPT is an application that generates personalized children's stories based on parameters like age and preferences. It narrates these stories using an AI-generated voice that mimics a parent, trained on their audio samples. The app also creates illustrations to accompany each story, providing a unique and engaging experience for children.
ai audio-generation data-science flux generative-ai image-generation large-language-models llama lora neural-networks stable-diffusion story text-generation tts xtts
Last synced: 27 Sep 2024
https://github.com/work-nobu/ohanashigpt
OhanashiGPT is an application that generates personalized children's stories based on parameters like age and preferences. It narrates these stories using an AI-generated voice that mimics a parent, trained on their audio samples. The app also creates illustrations to accompany each story, providing a unique and engaging experience for children.
ai audio-generation data-science image-generation large-language-models llama3 llamacpp lora low-rank-adaptation stable-diffusion text-generation xtts
Last synced: 01 Oct 2024