https://github.com/LinkSoul-AI/LLaSM

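The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce.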

Host: GitHub
URL: https://github.com/LinkSoul-AI/LLaSM
Owner: LinkSoul-AI
License: apache-2.0
Created: 2023-07-31T06:55:36.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2023-09-11T05:29:06.000Z (over 2 years ago)
Last Synced: 2024-10-18T21:17:20.972Z (over 1 year ago)
Language: Python
Homepage: https://github.com/LinkSoul-AI/LLaSM
Size: 3.26 MB
Stars: 529
Watchers: 14
Forks: 54
Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-llama-resources - LLaSM: Large Language and Speech Model
ai-game-devtools - LLaSM
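StarryDivineSky - LinkSoul-AI/LLaSM - An open-source, commercially usable dialogue model for text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce. (Multimodal Large Models / Web Services_Other)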

README

# LLaSM: Large Language and Speech Model

[![](https://img.shields.io/badge/LLaSM-Chinese-blue)](https://huggingface.co/spaces/LinkSoul/LLaSM) [![](https://img.shields.io/badge/Commercial-Support-blue)](https://huggingface.co/spaces/LinkSoul/LLaSM) [![](https://img.shields.io/badge/License-Apache_v2-blue)](https://github.com/LinkSoul-AI/LLaSM/blob/main/LICENSE) [![](https://img.shields.io/badge/Paper-arXiv-red)](https://arxiv.org/abs/2308.15930) [![](https://img.shields.io/badge/HuggingFace-Live_Demo-green)](https://huggingface.co/spaces/LinkSoul/LLaSM) [![](https://img.shields.io/badge/Datasets-LLaSM_Audio_Instructions-yellow)](https://huggingface.co/datasets/LinkSoul/LLaSM-Audio-Instructions)

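**LLaSM, an open-source, commercially usable Chinese-English bilingual speech-language assistant, and LLaSM-Audio-Instructions, a Chinese-English speech SFT dataset.** LLaSM is the first open-source, commercially usable dialogue model to support Chinese-English speech-text multimodal conversation.
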
## Model Architecture

![Framework](.github/framework.png)

## Basic Demo

![Base Demo](.github/demo.gif)

## Online Demo

> Talk is cheap. Show you the demo.

- [Demo / Hugging Face Spaces](https://huggingface.co/spaces/LinkSoul/LLaSM)

## Paper

- arXiv link: https://arxiv.org/abs/2308.15930

## Downloads

- Models on Hugging Face:

  - [LLaSM-Chinese-Llama-2-7B](https://huggingface.co/LinkSoul/LLaSM-Cllama2)

  - [LLaSM-Baichuan-7B](https://huggingface.co/LinkSoul/LLaSM-Baichuan)

- Baidu Netdisk downloads:

  - [LLaSM-Chinese-Llama-2-7B](https://pan.baidu.com/s/1PaipNDfqV7f3W1-tl5rwzA?pwd=2549)

  - [LLaSM-Baichuan-7B](https://pan.baidu.com/s/1QZrXA8IJXclN77T4jM7tEw?pwd=y2p7)

- Base language models:

  - [Chinese-Llama-2-7b](https://github.com/LinkSoul-AI/Chinese-Llama-2-7b)

  - [Baichuan-7B](https://huggingface.co/baichuan-inc/Baichuan-7B)

- Dataset: [LLaSM-Audio-Instructions](https://huggingface.co/datasets/LinkSoul/LLaSM-Audio-Instructions)
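
The Hugging Face resources listed above can also be fetched from a script. The following is a minimal sketch, not part of this repository: it assumes the `huggingface_hub` package is installed, and the short names in `HF_REPOS`, the `download` helper, and the `checkpoints` directory are hypothetical choices made for illustration. Only the repository IDs come from the links above.

```python
from pathlib import Path

# Repository IDs copied from the links in this README.
# The short keys on the left are hypothetical labels chosen for this sketch.
HF_REPOS = {
    "llasm-cllama2": ("LinkSoul/LLaSM-Cllama2", "model"),
    "llasm-baichuan": ("LinkSoul/LLaSM-Baichuan", "model"),
    "whisper-large-v2": ("openai/whisper-large-v2", "model"),
    "audio-instructions": ("LinkSoul/LLaSM-Audio-Instructions", "dataset"),
}


def download(name: str, root: str = "checkpoints") -> Path:
    """Download one of the repositories above into root/<name> and return that path."""
    if name not in HF_REPOS:
        raise KeyError(f"unknown resource {name!r}; choose from {sorted(HF_REPOS)}")
    repo_id, repo_type = HF_REPOS[name]

    # Imported lazily so the table above can be inspected without the dependency.
    from huggingface_hub import snapshot_download

    target = Path(root) / name
    snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=str(target))
    return target
```

For example, `download("whisper-large-v2")` would place the Whisper weights under `checkpoints/whisper-large-v2`. The model repositories are several gigabytes each, so expect long download times.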

## Installation

```shell
# clone the repository
git clone https://github.com/LinkSoul-AI/LLaSM
cd LLaSM

# install package
conda create -n llasm python=3.10 -y
conda activate llasm
pip install --upgrade pip
pip install -e .
```

## Quick Test

- Download the Whisper large v2 model: https://huggingface.co/openai/whisper-large-v2

```shell
export LLASM_DEVICE="cuda:0"
python infer.py \
    --input_audio_file PATH/TO/YOUR/AUDIO \
    --llasm_model PATH/TO/LLaSM/MODEL \
    --llasm_audio_tower PATH/TO/WHISPER/MODEL \
    --llm_type "Chinese_llama2"  # or "baichuan"
```
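
The same invocation can be driven from Python, which may be convenient when testing several audio files. This is a rough sketch under stated assumptions: `build_infer_command` and `run_infer` are hypothetical helper names, the script is assumed to be run from the repository root, and only the flags and the `LLASM_DEVICE` variable shown in the command above are used.

```python
import os
import subprocess
import sys

# The two values the README lists for --llm_type.
VALID_LLM_TYPES = ("Chinese_llama2", "baichuan")


def build_infer_command(audio_file, llasm_model, audio_tower, llm_type="Chinese_llama2"):
    """Return the argv list for infer.py, using only the flags shown in the README."""
    if llm_type not in VALID_LLM_TYPES:
        raise ValueError(f"llm_type must be one of {VALID_LLM_TYPES}, got {llm_type!r}")
    return [
        sys.executable, "infer.py",
        "--input_audio_file", str(audio_file),
        "--llasm_model", str(llasm_model),
        "--llasm_audio_tower", str(audio_tower),
        "--llm_type", llm_type,
    ]


def run_infer(audio_file, llasm_model, audio_tower,
              llm_type="Chinese_llama2", device="cuda:0"):
    """Run infer.py with LLASM_DEVICE set in the child environment; return its exit code."""
    env = dict(os.environ, LLASM_DEVICE=device)
    cmd = build_infer_command(audio_file, llasm_model, audio_tower, llm_type)
    return subprocess.run(cmd, env=env, check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces, and validating `llm_type` up front surfaces typos before the model weights are loaded.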

## TODO

- Training instructions

- int4 quantization

- Docker deployment

## Related Projects

- [Chinese-Llama-2-7B](https://huggingface.co/LinkSoul/Chinese-Llama-2-7b)

- [Whisper](https://github.com/openai/whisper)

- [baichuan-inc/Baichuan-7B](https://huggingface.co/baichuan-inc/Baichuan-7B)

## License

[Apache-2.0 license](https://github.com/LinkSoul-AI/LLaSM/blob/main/LICENSE)

## Citation

If you find our work and this repository useful, please consider giving us a star :star: :beer: and citing the paper:

```bibtex
@misc{shu2023llasm,
      title={LLaSM: Large Language and Speech Model},
      author={Yu Shu and Siwei Dong and Guangyao Chen and Wenhao Huang and Ruihua Zhang and Daochen Shi and Qiqi Xiang and Yemin Shi},
      year={2023},
      eprint={2308.15930},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## WeChat Group
