https://github.com/togethercomputer/quick_deployment_helm

Last synced: about 1 year ago
JSON representation

Host: GitHub
URL: https://github.com/togethercomputer/quick_deployment_helm
Owner: togethercomputer
Created: 2023-01-06T13:12:19.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2023-10-02T23:52:07.000Z (over 2 years ago)
Last Synced: 2023-10-03T08:03:59.767Z (over 2 years ago)
Language: Python
Size: 337 KB
Stars: 7
Watchers: 2
Forks: 8
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: licenses.csv

Awesome Lists containing this project

README

# Quick_Deployment_HELM

To deploy a new docker image, merge a PR to the main branch.

### To bring up a local REST server:

```console
mkdir -p .together/models
chmod 777 .together .together/models
docker run --rm --gpus device=0 \
-v $PWD/.together:/home/user/.together \
-e HF_HOME=/home/user/.together/models \
-e HTTP_HOST=0.0.0.0 \
-e SERVICE_DOMAIN=http \
-p 5001:5001 \
-it togethercomputer/native_hf_models /usr/bin/python3 serving_local_nlp_model.py --hf_model_name facebook/opt-350m
```

```console
curl -X POST -H 'Content-Type: application/json' http://localhost:5001/ -d '{"prompt": "Space robots"}'
```

```console
{"result_type": "language-model-inference", "choices": [{"text": " are a great way to get a lot of work done.", "index": 0, "finish_reason": "length"}], "raw_compute_time": 0.20327712898142636}
```

### Bridge local REST server to Together Inference API:

```console
together-node start -f none --worker.mode existing-service --worker.service my-foobar --worker.port 5001
```

```console
curl -X POST https://api.together.xyz/inference \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer YOUR-API-KEY-HERE' \
-d '{
"model": "my-foobar",
"prompt": "Space robots"
}'
```

```console
{"status":"finished","prompt":["Space robots"],"model":"my-foobar","model_owner":"","tags":{},"num_returns":1,"args":{"model":"my-foobar","prompt":"Space robots"},"subjobs":[],"output":{"choices":[{"finish_reason":"length","index":0,"text":" are a great way to get a lot of work done."}],"raw_compute_time":0.20714728604070842,"result_type":"language-model-inference"}}
```

### To bring up a standalone node:

```console
docker run --pull=always --rm --gpus device=2 \
-v $PWD/.together:/home/user/.together \
-it togethercomputer/native_hf_models /usr/local/bin/together-node start \
--config /home/user/cfg-neoxt.yaml --color \
--worker.service OpenChatTest --worker.model gpt-neoxt-v0.15
```

### To bring up a standalone node with retrieval:

```console
docker run --pull=always --rm --gpus device=2 \
--add-host=host.docker.internal:host-gateway \
-v $PWD/.together:/home/user/.together \
-it togethercomputer/native_hf_models /usr/local/bin/together-node start \
--config /home/user/cfg-neoxt-retrieval.yaml --color \
--worker.service ock-faiss --worker.model gpt-neoxt-v0.15
```

### To bring up a standalone safety model:

### Start opt-350m in CPU on Mac laptop:

```console
~/together-node/build/together-node start --config ./cfg-opt-350m-docker-macos.yaml
```

### Start opt-350m in CPU on Linux:

```console
curl -O https://together-distro-packages.s3.us-west-2.amazonaws.com/linux/x86_64/bin/together-node-latest
chmod a+x ./together-node-latest
./together-node-latest start --config ./cfg-opt-350m-docker.yaml
```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/togethercomputer/quick_deployment_helm

Awesome Lists containing this project

README