Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/sozercan/aikit

🏗ïļ Fine-tune, build, and deploy open-source LLMs easily!
https://github.com/sozercan/aikit

ai buildkit chatgpt docker fine-tuning finetuning gemma gpt inference kubernetes large-language-models llama llama2 llm localllama mistral mixtral nvidia open-source-llm openai

Last synced: about 1 month ago
JSON representation

🏗ïļ Fine-tune, build, and deploy open-source LLMs easily!

Awesome Lists containing this project

README

        

# AIKit âœĻ




AIKit is a comprehensive platform to quickly get started to host, deploy, build and fine-tune large language models (LLMs).

AIKit offers two main capabilities:

- **Inference**: AIKit uses [LocalAI](https://localai.io/), which supports a wide range of inference capabilities and formats. LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as [Kubectl AI](https://github.com/sozercan/kubectl-ai), [Chatbot-UI](https://github.com/sozercan/chatbot-ui) and many more, to send requests to open LLMs!

- **[Fine-Tuning](https://sozercan.github.io/aikit/docs/fine-tune)**: AIKit offers an extensible fine-tuning interface. It supports [Unsloth](https://github.com/unslothai/unsloth) for fast, memory efficient, and easy fine-tuning experience.

👉 For full documentation, please see [AIKit website](https://sozercan.github.io/aikit/)!

## Features

- ðŸģ No GPU, Internet access or additional tools needed except for [Docker](https://docs.docker.com/desktop/install/linux-install/)!
- ðŸĪ Minimal image size, resulting in less vulnerabilities and smaller attack surface with a custom [distroless](https://github.com/GoogleContainerTools/distroless)-based image
- ðŸŽĩ [Fine-tune support](https://sozercan.github.io/aikit/docs/fine-tune)
- 🚀 Easy to use declarative configuration for [inference](https://sozercan.github.io/aikit/docs/specs-inference) and [fine-tuning](https://sozercan.github.io/aikit/docs/specs-finetune)
- âœĻ OpenAI API compatible to use with any OpenAI API compatible client
- ðŸ“ļ [Multi-modal model support](https://sozercan.github.io/aikit/docs/vision)
- 🖞ïļ Image generation support with [Stable Diffusion](https://sozercan.github.io/aikit/docs/stablediffusion)
- ðŸĶ™ Support for GGUF ([`llama`](https://github.com/ggerganov/llama.cpp)), GPTQ ([`exllama`](https://github.com/turboderp/exllama) or [`exllama2`](https://github.com/turboderp/exllamav2)), EXL2 ([`exllama2`](https://github.com/turboderp/exllamav2)), and GGML ([`llama-ggml`](https://github.com/ggerganov/llama.cpp)) and [Mamba](https://github.com/state-spaces/mamba) models
- ðŸšĒ [Kubernetes deployment ready](https://sozercan.github.io/aikit/docs/kubernetes)
- ðŸ“Ķ Supports multiple models with a single image
- ðŸ–Ĩïļ Supports [AMD64 and ARM64](https://sozercan.github.io/aikit/docs/create-images#multi-platform-support) CPUs and [GPU-accelerated inferencing with NVIDIA GPUs](https://sozercan.github.io/aikit/docs/gpu)
- 🔐 Ensure [supply chain security](https://sozercan.github.io/aikit/docs/security) with SBOMs, Provenance attestations, and signed images
- 🌈 Supports air-gapped environments with self-hosted, local, or any remote container registries to store model images for inference on the edge.

## Quick Start

You can get started with AIKit quickly on your local machine without a GPU!

```bash
docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b
```

After running this, navigate to [http://localhost:8080/chat](http://localhost:8080/chat) to access the WebUI!

### API

AIKit provides an OpenAI API compatible endpoint, so you can use any OpenAI API compatible client to send requests to open LLMs!

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "llama-3.1-8b-instruct",
"messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
}'
```

Output should be similar to:

```jsonc
{
// ...
"model": "llama-3.1-8b-instruct",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."
}
}
],
// ...
}
```

That's it! 🎉 API is OpenAI compatible so this is a drop-in replacement for any OpenAI API compatible client.

## Pre-made Models

AIKit comes with pre-made models that you can use out-of-the-box!

If it doesn't include a specific model, you can always [create your own images](https://sozercan.github.io/aikit/docs/create-images), and host in a container registry of your choice!

## CPU

| Model | Optimization | Parameters | Command | Model Name | License |
| --------------- | ------------ | ---------- | ---------------------------------------------------------------- | ------------------------ | ----------------------------------------------------------------------------------- |
| ðŸĶ™ Llama 3.1 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | [Llama](https://ai.meta.com/llama/license/) |
| ðŸĶ™ Llama 3.1 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | [Llama](https://ai.meta.com/llama/license/) | |
| Ⓜïļ Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | [Apache](https://choosealicense.com/licenses/apache-2.0/) |
| ðŸ…ŋïļ Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | [MIT](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE) |
| ðŸ”Ą Gemma 1.1 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | [Gemma](https://ai.google.dev/gemma/terms) |
| âŒĻïļ Codestral 0.1 | Code | 22B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | [MNLP](https://mistral.ai/licenses/MNPL-0.1.md) |

### NVIDIA CUDA

> [!NOTE]
> To enable GPU acceleration, please see [GPU Acceleration](https://sozercan.github.io/aikit/docs/gpu).
> Please note that only difference between CPU and GPU section is the `--gpus all` flag in the command to enable GPU acceleration.

| Model | Optimization | Parameters | Command | Model Name | License |
| --------------- | ------------ | ---------- | --------------------------------------------------------------------------- | ------------------------ | ----------------------------------------------------------------------------------- |
| ðŸĶ™ Llama 3.1 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | [Llama](https://ai.meta.com/llama/license/) |
| ðŸĶ™ Llama 3.1 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | [Llama](https://ai.meta.com/llama/license/) | |
| Ⓜïļ Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | [Apache](https://choosealicense.com/licenses/apache-2.0/) |
| ðŸ…ŋïļ Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | [MIT](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE) |
| ðŸ”Ą Gemma 1.1 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | [Gemma](https://ai.google.dev/gemma/terms) |
| âŒĻïļ Codestral 0.1 | Code | 22B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | [MNLP](https://mistral.ai/licenses/MNPL-0.1.md) |

## What's next?

👉 For more information and how to fine tune models or create your own images, please see [AIKit website](https://sozercan.github.io/aikit/)!