Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Host: GitHub
URL: https://github.com/AIGC-Audio/AudioGPT
Owner: AIGC-Audio
License: other
Created: 2023-03-16T07:12:18.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2023-10-05T19:11:58.000Z (9 months ago)
Last Synced: 2024-02-26T19:00:50.255Z (4 months ago)
Topics: audio, gpt, music, sound, speech, talking-head
Language: Python
Homepage: https://huggingface.co/spaces/AIGC-Audio/AudioGPT
Size: 23 MB
Stars: 9,667
Watchers: 130
Forks: 816
Open Issues: 44
Metadata Files:
- Readme: README.md
- License: LICENSE

Lists

awesome-stars - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
awesome-ai-talking-heads - AudioGPT
awesome-langchain - AudioGPT (Open Source Projects / Other / Chatbots)
awesome - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
awesome-llm-and-aigc - AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. (Summary)
my-awesome-stars - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
awesome-open-gpt - AudioGPT🔥
awesome-from-stars - AIGC-Audio/AudioGPT
AiTreasureBox - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Repos)
awesome-starts - AIGC-Audio/AudioGPT - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (Python)
awesome-langchain-zh - AudioGPT - Understanding and generating speech, music, sound, and talking heads (Open Source Projects / Other Chatbots)
awesome-large-multimodal-agents - Github

README

# AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

[![arXiv](https://img.shields.io/badge/arXiv-Paper-.svg)](https://arxiv.org/abs/2304.12995)

[![GitHub Stars](https://img.shields.io/github/stars/AIGC-Audio/AudioGPT?style=social)](https://github.com/AIGC-Audio/AudioGPT)

![visitors](https://visitor-badge.glitch.me/badge?page_id=AIGC-Audio.AudioGPT)

[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/AIGC-Audio/AudioGPT)

We provide our implementation and pretrained models as open source in this repository.

## Get Started

Please refer to [run.md](run.md)

## Capabilities

The tables below list the capabilities AudioGPT currently supports. More models and tasks are coming soon. For prompt examples, see [assets](assets/README.md).

Not every model has a repository yet, so some of the links below are empty.

### Speech

| Task | Supported Foundation Models | Status |
|:----:|:---------------------------:|:------:|
| Text-to-Speech | [FastSpeech](https://github.com/ming024/FastSpeech2), [SyntaSpeech](https://github.com/yerfor/SyntaSpeech), [VITS](https://github.com/jaywalnut310/vits) | Yes (WIP) |
| Style Transfer | [GenerSpeech](https://github.com/Rongjiehuang/GenerSpeech) | Yes |
| Speech Recognition | [whisper](https://github.com/openai/whisper), [Conformer](https://github.com/sooftware/conformer) | Yes |
| Speech Enhancement | [ConvTasNet]() | Yes (WIP) |
| Speech Separation | [TF-GridNet](https://arxiv.org/pdf/2211.12433.pdf) | Yes (WIP) |
| Speech Translation | [Multi-decoder](https://arxiv.org/pdf/2109.12804.pdf) | WIP |
| Mono-to-Binaural | [NeuralWarp](https://github.com/fdarmon/NeuralWarp) | Yes |

### Sing

| Task | Supported Foundation Models | Status |
|:----:|:---------------------------:|:------:|
| Text-to-Sing | [DiffSinger](https://github.com/MoonInTheRiver/DiffSinger), [VISinger](https://github.com/jerryuhoo/VISinger) | Yes (WIP) |

### Audio

| Task | Supported Foundation Models | Status |
|:----:|:---------------------------:|:------:|
| Text-to-Audio | [Make-An-Audio]() | Yes |
| Audio Inpainting | [Make-An-Audio]() | Yes |
| Image-to-Audio | [Make-An-Audio]() | Yes |
| Sound Detection | [Audio-transformer](https://github.com/RetroCirce/HTS-Audio-Transformer) | Yes |
| Target Sound Detection | [TSDNet](https://github.com/gy65896/TSDNet) | Yes |
| Sound Extraction | [LASSNet](https://github.com/liuxubo717/LASS) | Yes |

### Talking Head

| Task | Supported Foundation Models | Status |
|:----:|:---------------------------:|:------:|
| Talking Head Synthesis | [GeneFace](https://github.com/yerfor/GeneFace) | Yes (WIP) |
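
The capability tables above can also be kept in machine-readable form, for example to filter tasks by status. The following is a minimal Python sketch, not code from the AudioGPT repository: the `Capability` class, the `CAPABILITIES` list, and the `ready_tasks` helper are hypothetical names introduced here for illustration, and the rows simply mirror the tables as listed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Capability:
    """One row of the capability tables (hypothetical structure)."""
    domain: str
    task: str
    models: Tuple[str, ...]
    status: str


# Rows copied from the tables above; nothing here is loaded from AudioGPT itself.
CAPABILITIES: List[Capability] = [
    Capability("Speech", "Text-to-Speech", ("FastSpeech", "SyntaSpeech", "VITS"), "Yes (WIP)"),
    Capability("Speech", "Style Transfer", ("GenerSpeech",), "Yes"),
    Capability("Speech", "Speech Recognition", ("whisper", "Conformer"), "Yes"),
    Capability("Speech", "Speech Enhancement", ("ConvTasNet",), "Yes (WIP)"),
    Capability("Speech", "Speech Separation", ("TF-GridNet",), "Yes (WIP)"),
    Capability("Speech", "Speech Translation", ("Multi-decoder",), "WIP"),
    Capability("Speech", "Mono-to-Binaural", ("NeuralWarp",), "Yes"),
    Capability("Sing", "Text-to-Sing", ("DiffSinger", "VISinger"), "Yes (WIP)"),
    Capability("Audio", "Text-to-Audio", ("Make-An-Audio",), "Yes"),
    Capability("Audio", "Audio Inpainting", ("Make-An-Audio",), "Yes"),
    Capability("Audio", "Image-to-Audio", ("Make-An-Audio",), "Yes"),
    Capability("Audio", "Sound Detection", ("Audio-transformer",), "Yes"),
    Capability("Audio", "Target Sound Detection", ("TSDNet",), "Yes"),
    Capability("Audio", "Sound Extraction", ("LASSNet",), "Yes"),
    Capability("Talking Head", "Talking Head Synthesis", ("GeneFace",), "Yes (WIP)"),
]


def ready_tasks(domain: Optional[str] = None) -> List[str]:
    """Return tasks whose status is exactly "Yes", optionally for one domain."""
    return [
        c.task
        for c in CAPABILITIES
        if c.status == "Yes" and (domain is None or c.domain == domain)
    ]


if __name__ == "__main__":
    print(ready_tasks("Speech"))
```

Calling `ready_tasks("Speech")` returns the Speech tasks whose status is exactly `Yes`, leaving out entries still marked WIP.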

## Acknowledgement

We thank the following open-source projects:

[ESPNet](https://github.com/espnet/espnet)  

[NATSpeech](https://github.com/NATSpeech/NATSpeech)  

[Visual ChatGPT](https://github.com/microsoft/visual-chatgpt)  

[Hugging Face](https://github.com/huggingface)  

[LangChain](https://github.com/hwchase17/langchain)  

[Stable Diffusion](https://github.com/CompVis/stable-diffusion)