# FastChat on Mac (Apple Silicon)
Running https://github.com/lm-sys/FastChat on Mac.
## How to run an OpenAI-compatible FastChat server on Mac (Apple Silicon) with the Metal GPU
Download the `docker-compose.yaml` and run:
```
wget https://github.com/making/fastchat-on-mac/raw/main/docker-compose.yaml
docker-compose up
```
This launches the controller, the Gradio web UI (optional), and the OpenAI-compatible API server. These components do not use the GPU, so they are easy to run with Docker.
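For orientation, the compose file defines three services along these lines. This is a minimal sketch only, assuming the image name from the build step at the end of this README and FastChat's default ports; the actual `docker-compose.yaml` in the repo is the source of truth.
```
# Sketch only: service names, image, and command wiring are assumptions.
services:
  controller:
    image: ghcr.io/making/fastchat
    command: python -m fastchat.serve.controller --host 0.0.0.0
    ports:
      - "21001:21001"
  gradio-web-server:
    image: ghcr.io/making/fastchat
    command: python -m fastchat.serve.gradio_web_server --host 0.0.0.0 --controller-url http://controller:21001
    ports:
      - "7860:7860"
  openai-api-server:
    image: ghcr.io/making/fastchat
    command: python -m fastchat.serve.openai_api_server --host 0.0.0.0 --controller-address http://controller:21001
    ports:
      - "8000:8000"
```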
Next, run the model worker outside of Docker, with the `--device mps` option so that it uses the Metal GPU. First install FastChat with the model worker extras:
```
pip3 install "fschat[model_worker,webui]"
```
Then start the worker. Note that `--worker-address` points at `http://host.docker.internal:21002` so that the controller, which runs inside Docker, can reach back to the worker on the host:
```
python3 -m fastchat.serve.model_worker \
    --port 21002 \
    --model-path lmsys/vicuna-7b-v1.3 \
    --device mps \
    --load-8bit \
    --controller-address http://localhost:21001 \
    --worker-address http://host.docker.internal:21002
```
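Once the worker has registered with the controller, you can check that the model is visible through the API server (`/v1/models` is part of the OpenAI-compatible API that FastChat serves):
```
curl -s http://localhost:8000/v1/models
```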
* Web UI: http://localhost:7860
* OpenAI API: http://localhost:8000
## Access the OpenAI API
https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md
```
$ curl -s http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "vicuna-7b-v1.3",
      "messages": [{"role": "user", "content": "Tell me a joke"}]
    }'
{
  "id": "chatcmpl-fiy5p8FeTLjUNdfL2yS28q",
  "object": "chat.completion",
  "created": 1695969537,
  "model": "vicuna-7b-v1.3",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Sure, here's a joke for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 44,
    "total_tokens": 75,
    "completion_tokens": 31
  }
}
```
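The same call can be made with the `openai` Python package (the 0.x client API that was current when this README was written), per the FastChat docs linked above:
```
import openai

# Point the 0.x openai client at the local FastChat server.
openai.api_key = "EMPTY"  # the server does not require a real key by default
openai.api_base = "http://localhost:8000/v1"

completion = openai.ChatCompletion.create(
    model="vicuna-7b-v1.3",
    messages=[{"role": "user", "content": "Tell me a joke"}],
)
print(completion.choices[0].message.content)
```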
## Run the Spring AI OpenAI demo
Point the Spring AI OpenAI starter at the local FastChat server by overriding the base URL, API key, and model:
```
git clone https://github.com/rd-1-2022/ai-openai-helloworld.git
cd ai-openai-helloworld
./mvnw clean package -DskipTests
java -jar target/ai-openai-helloworld-0.0.1-SNAPSHOT.jar \
    --spring.ai.openai.base-url=http://localhost:8000 \
    --spring.ai.openai.api-key=dummy \
    --spring.ai.openai.model=vicuna-7b-v1.3
```
```
$ curl -s --get --data-urlencode 'message=Tell me a joke about a cow.' http://localhost:8080/ai/simple | jq .
{
  "completion": "Why couldn't the cow play in the band?\n\nBecause it was too moo-sical!"
}
```
## How to build the Docker image
Build and publish the image with Cloud Native Buildpacks' `pack` CLI:
```
pack build ghcr.io/making/fastchat --builder paketobuildpacks/builder:base --path image --publish
```