https://github.com/marwan116/raycraft

A drop-in replacement of fastapi to enable scalable and fault tolerant deployments with ray serve
fastapi fault-tolerance ray ray-serve scalability

# RayCraft

## Motivation
FastAPI + Ray = <3

Let's take a FastAPI app and supercharge it with raycraft:

```python
from fastapi import FastAPI

simple_service = FastAPI()

@simple_service.post("/")
async def read_root() -> dict[str, str]:
    return {"Hello": "World"}
```

You can now run it with raycraft by using `RayCraftAPI` instead of `FastAPI`, a change of only two lines:

```diff
- from fastapi import FastAPI
+ from raycraft import RayCraftAPI

- simple_service = FastAPI()
+ simple_service = RayCraftAPI()

  @simple_service.post("/")
  async def read_root() -> dict[str, str]:
      return {"Hello": "World"}
```

## How to use

### Basic example
An endpoint returning `{"Hello": "World"}` isn't much of a basic example, so let's try something more interesting and closer to why you might want to use raycraft!

Let's say you build a translation service using the following FastAPI code:

```python
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()

def load_model():
    return pipeline("translation_en_to_fr", model="t5-small")

@app.post("/")
async def translate(text: str):
    model = load_model()
    translated = model(text)[0]["translation_text"]
    return {"translation": translated}
```

We can now build this app with raycraft by swapping `FastAPI` for `RayCraftAPI`. Here the translation logic is also factored out into a plain helper function:

```python
from raycraft import RayCraftAPI
from transformers import pipeline

app = RayCraftAPI()

def load_model():
    return pipeline("translation_en_to_fr", model="t5-small")

# The helper has its own name so the endpoint below does not shadow it.
def translate_text(text: str):
    model = load_model()
    translated = model(text)[0]["translation_text"]
    return translated

@app.post("/")
async def translate(text: str):
    return translate_text(text)
```

We then run the app with the following command (here `demo:app` assumes the code above is saved as `demo.py`):

```bash
raycraft run demo:app
```
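
Once the app is running, a client can call the endpoint. The sketch below builds the request with the standard library only. It assumes the service listens on `http://localhost:8000/` (an assumed address, not something this README states) and that raycraft keeps FastAPI's convention of reading a plain `text: str` parameter from the query string. The helper names `build_translate_request` and `call_translate` are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed address; adjust to wherever the app is actually served.
BASE_URL = "http://localhost:8000/"

def build_translate_request(text: str, base_url: str = BASE_URL) -> Request:
    """Build a POST request that passes `text` as a query parameter."""
    query = urlencode({"text": text})
    return Request(f"{base_url}?{query}", method="POST")

def call_translate(text: str, base_url: str = BASE_URL):
    """Send the request and decode the JSON body (needs a running service)."""
    with urlopen(build_translate_request(text, base_url)) as response:
        return json.load(response)

# Building the request does not touch the network.
request = build_translate_request("Hello world")
```

Calling `call_translate("Hello world")` would then return whatever JSON the endpoint produces, provided the service is reachable at the assumed address.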

Now for the distributed part. Let's say we want to run this app on 2 "replicas", each taking half a GPU, with requests load balanced between them. We can do this by passing two extra arguments to `RayCraftAPI`:

```python
from raycraft import RayCraftAPI
from transformers import pipeline

app = RayCraftAPI(ray_actor_options={"num_gpus": 0.5}, num_replicas=2)

def load_model():
    return pipeline("translation_en_to_fr", model="t5-small")

# The helper has its own name so the endpoint below does not shadow it.
def translate_text(text: str):
    model = load_model()
    translated = model(text)[0]["translation_text"]
    return translated

@app.post("/")
async def translate(text: str):
    return translate_text(text)
```
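
As a quick sanity check on the resource request above, the arithmetic can be sketched as follows. This assumes the usual Ray behaviour that a fractional GPU reservation must fit on a single device, so two replicas at `0.5` GPU each can share one GPU. The function `gpus_needed` is a hypothetical helper, not part of raycraft or Ray.

```python
import math

def gpus_needed(num_replicas: int, gpus_per_replica: float) -> int:
    """Whole GPUs needed when each replica reserves a fraction of one GPU."""
    if not 0 < gpus_per_replica <= 1:
        raise ValueError("this sketch only covers fractions in (0, 1]")
    # Assumption: a fractional reservation cannot be split across two GPUs,
    # so only whole replicas fit on each device.
    replicas_per_gpu = math.floor(1 / gpus_per_replica)
    return math.ceil(num_replicas / replicas_per_gpu)

# The configuration used above: 2 replicas, half a GPU each.
print(gpus_needed(num_replicas=2, gpus_per_replica=0.5))
```

Adding a third replica at the same fraction would need a second GPU under these assumptions.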

To avoid loading the model on every request, we can load it once when the app is initialized, using the `@app.init` decorator:

```python
from raycraft import RayCraftAPI, App
from transformers import pipeline

app = RayCraftAPI(ray_actor_options={"num_gpus": 0.5}, num_replicas=2)

@app.init
def model():
    return pipeline("translation_en_to_fr", model="t5-small")

# The helper has its own name so the endpoint below does not shadow it.
def translate_text(app: App, text: str):
    translated = app.model(text)[0]["translation_text"]
    return translated

@app.post("/")
async def translate(app: App, text: str):
    return translate_text(app, text)
```
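
To see why loading once matters, here is a plain-Python sketch of the same idea with no raycraft, Ray, or transformers involved. The loader is a cheap stand-in that upper-cases text instead of translating it, and `functools.lru_cache` plays the role of the one-time initialization. It illustrates the pattern only and is not how `@app.init` is implemented.

```python
from functools import lru_cache

load_count = 0  # counts how many times the "model" is actually built

@lru_cache(maxsize=1)
def load_model():
    """Stand-in for an expensive model load; the result is cached."""
    global load_count
    load_count += 1
    # Mimics the shape of a translation pipeline's output.
    return lambda text: [{"translation_text": text.upper()}]

def translate(text: str) -> str:
    # Every call after the first reuses the cached model.
    return load_model()(text)[0]["translation_text"]

outputs = [translate(t) for t in ["hello", "world"]]
```

After both calls, `load_count` is still 1: the second request reuses the model built for the first.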

RayCraft is a thin layer built on top of [Ray Serve](https://docs.ray.io/en/latest/serve/index.html) that adopts a functional interface to ease migration from FastAPI apps.

With Ray Serve, you can now:
- Scale your app deployment to multiple replicas running on different machines
- Define the resources allocated to each replica including fractional GPUs
- Batch requests together to improve throughput
- Get fault tolerance and automatic retries
- Stream responses using websockets
- Compose different services together using RPC calls that are strictly typed and faster than HTTP requests
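
Of the features above, request batching is the easiest to picture in isolation. The sketch below is a plain `asyncio` illustration of the idea: individual requests are queued and then handled by a single call once the batch is full or a short wait elapses. The `MicroBatcher` class and its parameters are hypothetical names for this example. Ray Serve provides batching through its own `serve.batch` decorator, and this is not its implementation.

```python
import asyncio

class MicroBatcher:
    """Queue single requests and process them together in one call."""

    def __init__(self, batch_fn, max_batch_size=4, wait_s=0.01):
        self.batch_fn = batch_fn          # takes a list, returns a list
        self.max_batch_size = max_batch_size
        self.wait_s = wait_s              # max time to wait for a full batch
        self._pending = []                # (item, future) pairs
        self._timer = None

    async def submit(self, item):
        future = asyncio.get_running_loop().create_future()
        self._pending.append((item, future))
        if len(self._pending) >= self.max_batch_size:
            self._flush()                 # batch is full: process now
        elif self._timer is None:
            self._timer = asyncio.create_task(self._flush_later())
        return await future

    async def _flush_later(self):
        await asyncio.sleep(self.wait_s)
        self._timer = None
        self._flush()                     # timeout: process a partial batch

    def _flush(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        batch, self._pending = self._pending, []
        if not batch:
            return
        results = self.batch_fn([item for item, _ in batch])
        for (_, future), result in zip(batch, results):
            future.set_result(result)

calls = []  # records each batch the stand-in model receives

def upper_batch(texts):
    """Stand-in for a model that is more efficient on batched input."""
    calls.append(list(texts))
    return [t.upper() for t in texts]

async def main():
    batcher = MicroBatcher(upper_batch, max_batch_size=3)
    return await asyncio.gather(*(batcher.submit(t) for t in ["a", "b", "c"]))

results = asyncio.run(main())
```

Three concurrent requests reach `upper_batch` as one list, while each caller still gets back only its own result. A production version would also need to propagate errors from `batch_fn` to the waiting callers.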

### Composing models

## How to setup

Using poetry:

```bash
poetry add raycraft
```

Using pip:

```bash
pip install raycraft
```

## Roadmap
- Streaming support using websockets
- Deployment guide