https://github.com/0nutation/SpeechGPT

SpeechGPT Series: Speech Large Language Models

# SpeechGPT: Speech Large Language Models




- [**SpeechGPT**](speechgpt) (2023/05) - Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities

- [**SpeechGPT-Gen**](speechgpt-gen) (2024/01) - Scaling Chain-of-Information Speech Generation

## News
- **[2024/2/20]** We proposed **AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling**. Check out the [paper](https://arxiv.org/abs/2402.12226) and [GitHub](https://github.com/OpenMOSS/AnyGPT).
- **[2024/1/25]** We released **SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation**. Check out the [paper](https://arxiv.org/abs/2401.13527) and [GitHub](https://github.com/0nutation/SpeechGPT/tree/main/speechgpt-gen).
- **[2024/1/9]** We proposed **SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems**. Check out the [paper](https://arxiv.org/abs/2401.03945) and [GitHub](https://github.com/0nutation/SpeechAgents).
- **[2023/9/15]** We released the SpeechGPT code and checkpoints, along with the SpeechInstruct dataset.
- **[2023/9/1]** We proposed **SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models** and released its code and checkpoints. Check out the [paper](https://arxiv.org/abs/2308.16692), [demo](https://0nutation.github.io/SpeechTokenizer.github.io/) and [GitHub](https://github.com/ZhangXInFD/SpeechTokenizer).
- **[2023/5/18]** We released **SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities**. We propose SpeechGPT, the first multi-modal LLM capable of perceiving and generating multi-modal content following multi-modal human instructions. Check out the [paper](https://arxiv.org/abs/2305.11000) and [demo](https://0nutation.github.io/SpeechGPT.github.io/).

## Acknowledgements
- We express our appreciation to Fuliang Weng and Rong Ye for their valuable suggestions and guidance.

## Citation
If you find our work useful for your research and applications, please cite it using the following BibTeX entry:

```
@misc{zhang2023speechgpt,
  title={SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities},
  author={Dong Zhang and Shimin Li and Xin Zhang and Jun Zhan and Pengyu Wang and Yaqian Zhou and Xipeng Qiu},
  year={2023},
  eprint={2305.11000},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```