Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sozercan/aikit
Fine-tune, build, and deploy open-source LLMs easily!
JSON representation
- Host: GitHub
- URL: https://github.com/sozercan/aikit
- Owner: sozercan
- License: mit
- Created: 2023-09-20T02:31:11.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-04-15T17:27:14.000Z (5 months ago)
- Last Synced: 2024-04-18T01:59:37.644Z (5 months ago)
- Topics: ai, buildkit, chatgpt, docker, fine-tuning, finetuning, gemma, gpt, inference, kubernetes, large-language-models, llama, llama2, llm, localllama, mistral, mixtral, nvidia, open-source-llm, openai
- Language: Go
- Homepage: https://sozercan.github.io/aikit/
- Size: 1.47 MB
- Stars: 164
- Watchers: 3
- Forks: 16
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
Awesome Lists containing this project
- project-awesome - sozercan/aikit - Fine-tune, build, and deploy open-source LLMs easily! (Go)
- awesome-ChatGPT-repositories - aikit - Fine-tune, build, and deploy open-source LLMs easily! (Langchain)
- awesome-LLM-resourses - aikit - Fine-tune, build, and deploy open-source LLMs easily! (Fine-Tuning)
README
# AIKit
AIKit is a comprehensive platform for quickly getting started with hosting, deploying, building, and fine-tuning large language models (LLMs).
AIKit offers two main capabilities:
- **Inference**: AIKit uses [LocalAI](https://localai.io/), which supports a wide range of inference capabilities and formats. LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as [Kubectl AI](https://github.com/sozercan/kubectl-ai), [Chatbot-UI](https://github.com/sozercan/chatbot-ui) and many more, to send requests to open LLMs!
- **[Fine-Tuning](https://sozercan.github.io/aikit/docs/fine-tune)**: AIKit offers an extensible fine-tuning interface. It supports [Unsloth](https://github.com/unslothai/unsloth) for a fast, memory-efficient, and easy fine-tuning experience.

For full documentation, please see the [AIKit website](https://sozercan.github.io/aikit/)!
## Features
- No GPU, Internet access, or additional tools needed except for [Docker](https://docs.docker.com/desktop/install/linux-install/)!
- Minimal image size, resulting in fewer vulnerabilities and a smaller attack surface, with a custom [distroless](https://github.com/GoogleContainerTools/distroless)-based image
- [Fine-tune support](https://sozercan.github.io/aikit/docs/fine-tune)
- Easy-to-use declarative configuration for [inference](https://sozercan.github.io/aikit/docs/specs-inference) and [fine-tuning](https://sozercan.github.io/aikit/docs/specs-finetune)
- OpenAI API compatible to use with any OpenAI API compatible client
- [Multi-modal model support](https://sozercan.github.io/aikit/docs/vision)
- Image generation support with [Stable Diffusion](https://sozercan.github.io/aikit/docs/stablediffusion)
- Support for GGUF ([`llama`](https://github.com/ggerganov/llama.cpp)), GPTQ ([`exllama`](https://github.com/turboderp/exllama) or [`exllama2`](https://github.com/turboderp/exllamav2)), EXL2 ([`exllama2`](https://github.com/turboderp/exllamav2)), and GGML ([`llama-ggml`](https://github.com/ggerganov/llama.cpp)) models, as well as [Mamba](https://github.com/state-spaces/mamba) models
- [Kubernetes deployment ready](https://sozercan.github.io/aikit/docs/kubernetes)
- Supports multiple models with a single image
- Supports [AMD64 and ARM64](https://sozercan.github.io/aikit/docs/create-images#multi-platform-support) CPUs and [GPU-accelerated inferencing with NVIDIA GPUs](https://sozercan.github.io/aikit/docs/gpu)
- Ensures [supply chain security](https://sozercan.github.io/aikit/docs/security) with SBOMs, provenance attestations, and signed images
- Supports air-gapped environments with self-hosted, local, or any remote container registries to store model images for inference on the edge.

## Quick Start
You can get started with AIKit quickly on your local machine without a GPU!
```bash
docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b
```

After running this, navigate to [http://localhost:8080/chat](http://localhost:8080/chat) to access the WebUI!
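Because the server speaks the OpenAI API, you can also check which models the container serves via the standard `/v1/models` endpoint. A minimal sketch using only the Python standard library (it assumes the Quick Start container above is running on port 8080):

```python
import json
from urllib import request


def model_ids(models_response: dict) -> list:
    """Extract model names from an OpenAI-style /v1/models response."""
    return [m["id"] for m in models_response.get("data", [])]


def list_models(base_url="http://localhost:8080"):
    """Query the OpenAI-compatible models endpoint of a running AIKit container."""
    with request.urlopen(f"{base_url}/v1/models") as resp:
        return model_ids(json.load(resp))


if __name__ == "__main__":
    # Requires the container from the Quick Start to be running.
    print(list_models())
```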
### API
AIKit provides an OpenAI API compatible endpoint, so you can use any OpenAI API compatible client to send requests to open LLMs!
```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "llama-3.1-8b-instruct",
"messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
}'
```

Output should be similar to:
```jsonc
{
// ...
"model": "llama-3.1-8b-instruct",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."
}
}
],
// ...
}
```

That's it! The API is OpenAI compatible, so this is a drop-in replacement for any OpenAI API compatible client.
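The same request can be sent from Python with only the standard library; the endpoint and model name match the curl example above, and the network call is guarded so the script only runs when the Quick Start container is up:

```python
import json
from urllib import request

# The same chat-completion request as the curl example above.
payload = {
    "model": "llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}],
}


def chat(base_url="http://localhost:8080") -> dict:
    """POST the payload to the OpenAI-compatible endpoint and return parsed JSON."""
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires the container from the Quick Start to be running.
    reply = chat()
    print(reply["choices"][0]["message"]["content"])
```

Any OpenAI SDK can be pointed at the same endpoint by overriding its base URL.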
## Pre-made Models
AIKit comes with pre-made models that you can use out-of-the-box!
If it doesn't include a specific model, you can always [create your own images](https://sozercan.github.io/aikit/docs/create-images), and host in a container registry of your choice!
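Since the pre-made models ship as ordinary OCI images, a standard Kubernetes Deployment can serve one; this is a minimal sketch (the resource names and replica count are illustrative, and the AIKit docs cover Kubernetes deployment in more depth):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aikit-llama        # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aikit-llama
  template:
    metadata:
      labels:
        app: aikit-llama
    spec:
      containers:
        - name: aikit
          image: ghcr.io/sozercan/llama3.1:8b
          ports:
            - containerPort: 8080   # OpenAI-compatible API and WebUI
```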
### CPU
| Model | Optimization | Parameters | Command | Model Name | License |
| --------------- | ------------ | ---------- | ---------------------------------------------------------------- | ------------------------ | ----------------------------------------------------------------------------------- |
| Llama 3.1 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | [Llama](https://ai.meta.com/llama/license/) |
| Llama 3.1 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | [Llama](https://ai.meta.com/llama/license/) |
| Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | [Apache](https://choosealicense.com/licenses/apache-2.0/) |
| Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | [MIT](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE) |
| Gemma 1.1 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | [Gemma](https://ai.google.dev/gemma/terms) |
| Codestral 0.1 | Code | 22B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | [MNPL](https://mistral.ai/licenses/MNPL-0.1.md) |

### NVIDIA CUDA
> [!NOTE]
> To enable GPU acceleration, please see [GPU Acceleration](https://sozercan.github.io/aikit/docs/gpu).
> Please note that the only difference between the CPU and GPU sections is the `--gpus all` flag in the command, which enables GPU acceleration.

| Model | Optimization | Parameters | Command | Model Name | License |
| --------------- | ------------ | ---------- | --------------------------------------------------------------------------- | ------------------------ | ----------------------------------------------------------------------------------- |
| Llama 3.1 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | [Llama](https://ai.meta.com/llama/license/) |
| Llama 3.1 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | [Llama](https://ai.meta.com/llama/license/) |
| Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | [Apache](https://choosealicense.com/licenses/apache-2.0/) |
| Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | [MIT](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE) |
| Gemma 1.1 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | [Gemma](https://ai.google.dev/gemma/terms) |
| Codestral 0.1 | Code | 22B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | [MNPL](https://mistral.ai/licenses/MNPL-0.1.md) |

## What's next?
For more information, including how to fine-tune models or create your own images, please see the [AIKit website](https://sozercan.github.io/aikit/)!