Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
- Host: GitHub
- URL: https://github.com/AIGC-Audio/AudioGPT
- Owner: AIGC-Audio
- License: other
- Created: 2023-03-16T07:12:18.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-05T19:11:58.000Z (9 months ago)
- Last Synced: 2024-02-26T19:00:50.255Z (4 months ago)
- Topics: audio, gpt, music, sound, speech, talking-head
- Language: Python
- Homepage: https://huggingface.co/spaces/AIGC-Audio/AudioGPT
- Size: 23 MB
- Stars: 9,667
- Watchers: 130
- Forks: 816
- Open Issues: 44
Metadata Files:
- Readme: README.md
- License: LICENSE
Lists
- awesome-stars - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
- awesome-ai-talking-heads - AudioGPT
- awesome-langchain - AudioGPT (Open Source Projects / Other / Chatbots)
- awesome - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
- awesome-llm-and-aigc - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. (Summary)
- my-awesome-stars - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
- awesome-open-gpt - AudioGPT🔥
- awesome-from-stars - AIGC-Audio/AudioGPT
- AiTreasureBox - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Repos)
- awesome-starts - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
- awesome-langchain-zh - AudioGPT: Understanding and generating speech, music, sound, and talking heads (Open Source Projects / Other Chatbots)
- awesome-large-multimodal-agents - Github
README
# AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
[![arXiv](https://img.shields.io/badge/arXiv-Paper-.svg)](https://arxiv.org/abs/2304.12995)
[![GitHub Stars](https://img.shields.io/github/stars/AIGC-Audio/AudioGPT?style=social)](https://github.com/AIGC-Audio/AudioGPT)
![visitors](https://visitor-badge.glitch.me/badge?page_id=AIGC-Audio.AudioGPT)
[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/AIGC-Audio/AudioGPT)

We provide our implementation and pretrained models as open source in this repository.
## Get Started
Please refer to [run.md](run.md).
## Capabilities
The tables below list the current capabilities of AudioGPT. More supported models and tasks are coming soon. For prompt examples, refer to [asset](assets/README.md).
Not every model currently has a repository.
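
One way to read the capability tables in this section is as a lookup from task to supported foundation models. The sketch below is illustrative only: `Capability`, `CAPABILITIES`, `models_for`, and `usable_tasks` are hypothetical names that are not part of the AudioGPT codebase, and only a few rows from the tables are copied in.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """One row of a capability table (hypothetical structure, not AudioGPT code)."""
    task: str
    models: tuple[str, ...]
    status: str  # "Yes", "Yes (WIP)", or "WIP", as in the Status column


# A few rows copied from the tables in this README; deliberately not the full list.
CAPABILITIES = {
    "Speech": [
        Capability("Text-to-Speech", ("FastSpeech", "SyntaSpeech", "VITS"), "Yes (WIP)"),
        Capability("Speech Recognition", ("whisper", "Conformer"), "Yes"),
        Capability("Speech Translation", ("Multi-decoder",), "WIP"),
    ],
    "Audio": [
        Capability("Text-to-Audio", ("Make-An-Audio",), "Yes"),
        Capability("Sound Extraction", ("LASSNet",), "Yes"),
    ],
}


def models_for(task: str) -> tuple[str, ...]:
    """Return the models listed for a task, or an empty tuple if it is not listed."""
    for rows in CAPABILITIES.values():
        for cap in rows:
            if cap.task.lower() == task.lower():
                return cap.models
    return ()


def usable_tasks() -> list[str]:
    """Return tasks whose status starts with 'Yes' (this includes 'Yes (WIP)')."""
    return [
        cap.task
        for rows in CAPABILITIES.values()
        for cap in rows
        if cap.status.startswith("Yes")
    ]


if __name__ == "__main__":
    print(models_for("Speech Recognition"))
    print(usable_tasks())
```

Keeping the status as the literal string from the table avoids inventing a finer-grained state than the README actually reports.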
### Speech
| Task | Supported Foundation Models | Status |
|:--------------------------:|:-------------------------------:|:------:|
| Text-to-Speech | [FastSpeech](https://github.com/ming024/FastSpeech2), [SyntaSpeech](https://github.com/yerfor/SyntaSpeech), [VITS](https://github.com/jaywalnut310/vits) | Yes (WIP) |
| Style Transfer | [GenerSpeech](https://github.com/Rongjiehuang/GenerSpeech) | Yes |
| Speech Recognition | [whisper](https://github.com/openai/whisper), [Conformer](https://github.com/sooftware/conformer) | Yes |
| Speech Enhancement | [ConvTasNet]() | Yes (WIP) |
| Speech Separation | [TF-GridNet](https://arxiv.org/pdf/2211.12433.pdf) | Yes (WIP) |
| Speech Translation | [Multi-decoder](https://arxiv.org/pdf/2109.12804.pdf) | WIP |
| Mono-to-Binaural | [NeuralWarp](https://github.com/fdarmon/NeuralWarp) | Yes |

### Sing
| Task | Supported Foundation Models | Status |
|:-------------------------:|:-------------------------------:|:------:|
| Text-to-Sing | [DiffSinger](https://github.com/MoonInTheRiver/DiffSinger), [VISinger](https://github.com/jerryuhoo/VISinger) | Yes (WIP) |

### Audio
| Task | Supported Foundation Models | Status |
|:----------------------:|:---------------------------:|:------:|
| Text-to-Audio | [Make-An-Audio]() | Yes |
| Audio Inpainting | [Make-An-Audio]() | Yes |
| Image-to-Audio | [Make-An-Audio]() | Yes |
| Sound Detection | [Audio-transformer](https://github.com/RetroCirce/HTS-Audio-Transformer) | Yes |
| Target Sound Detection | [TSDNet](https://github.com/gy65896/TSDNet) | Yes |
| Sound Extraction | [LASSNet](https://github.com/liuxubo717/LASS) | Yes |

### Talking Head
| Task | Supported Foundation Models | Status |
|:-------------------------:|:-------------------------------:|:----------:|
| Talking Head Synthesis | [GeneFace](https://github.com/yerfor/GeneFace) | Yes (WIP) |

## Acknowledgement
We appreciate the following open-source projects:
[ESPNet](https://github.com/espnet/espnet)
[NATSpeech](https://github.com/NATSpeech/NATSpeech)
[Visual ChatGPT](https://github.com/microsoft/visual-chatgpt)
[Hugging Face](https://github.com/huggingface)
[LangChain](https://github.com/hwchase17/langchain)
[Stable Diffusion](https://github.com/CompVis/stable-diffusion)