{"id":13409558,"url":"https://github.com/sozercan/aikit","last_synced_at":"2025-05-16T04:04:28.545Z","repository":{"id":210601781,"uuid":"693936906","full_name":"sozercan/aikit","owner":"sozercan","description":"🏗️ Fine-tune, build, and deploy open-source LLMs easily!","archived":false,"fork":false,"pushed_at":"2025-05-05T02:20:02.000Z","size":4470,"stargazers_count":448,"open_issues_count":30,"forks_count":38,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-05-11T01:35:34.464Z","etag":null,"topics":["ai","buildkit","chatgpt","docker","fine-tuning","finetuning","gemma","gpt","inference","kubernetes","large-language-models","llama","llm","localllama","mistral","mixtral","nvidia","open-llm","open-source-llm","openai"],"latest_commit_sha":null,"homepage":"https://sozercan.github.io/aikit/","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sozercan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-09-20T02:31:11.000Z","updated_at":"2025-05-04T15:50:51.000Z","dependencies_parsed_at":"2024-11-08T21:03:40.893Z","dependency_job_id":"4b868d4e-b7ef-4f89-8323-d2ed636e39a8","html_url":"https://github.com/sozercan/aikit","commit_stats":{"total_commits":331,"total_committers":7,"mean_commits":"47.285714285714285","dds":"0.42296072507552873","last_synced_commit":"8fddc75b896f17bb0a625038808850082998d65d"},"previous_names":["sozercan/aikit"],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sozercan%2Fai
kit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sozercan%2Faikit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sozercan%2Faikit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sozercan%2Faikit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sozercan","download_url":"https://codeload.github.com/sozercan/aikit/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253672698,"owners_count":21945480,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","buildkit","chatgpt","docker","fine-tuning","finetuning","gemma","gpt","inference","kubernetes","large-language-models","llama","llm","localllama","mistral","mixtral","nvidia","open-llm","open-source-llm","openai"],"created_at":"2024-07-30T20:01:01.938Z","updated_at":"2025-05-16T04:04:28.515Z","avatar_url":"https://github.com/sozercan.png","language":"Go","funding_links":[],"categories":["HarmonyOS","Go","微调 Fine-Tuning","A01_文本生成_文本对话","Langchain"],"sub_categories":["Windows Manager","大语言对话模型及数据"],"readme":"# AIKit ✨\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./website/static/img/logo.png\" width=\"200\"\u003e\u003cbr\u003e\n\u003c/p\u003e\n\nAIKit is a comprehensive platform to quickly get started to host, deploy, build and fine-tune large language models (LLMs).\n\nAIKit offers two main capabilities:\n\n- **Inference**: AIKit uses [LocalAI](https://localai.io/), which supports a wide range of inference capabilities and formats. 
LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as [Kubectl AI](https://github.com/sozercan/kubectl-ai), [Chatbot-UI](https://github.com/sozercan/chatbot-ui) and many more, to send requests to open LLMs!\n\n- **[Fine-Tuning](https://sozercan.github.io/aikit/docs/fine-tune)**: AIKit offers an extensible fine-tuning interface. It supports [Unsloth](https://github.com/unslothai/unsloth) for a fast, memory-efficient, and easy fine-tuning experience.\n\n👉 For full documentation, please see the [AIKit website](https://sozercan.github.io/aikit/)!\n\n## Features\n\n- 🐳 No GPU, Internet access or additional tools needed except for [Docker](https://docs.docker.com/desktop/install/linux-install/)!\n- 🤏 Minimal image size, resulting in fewer vulnerabilities and a smaller attack surface with a custom [distroless](https://github.com/GoogleContainerTools/distroless)-based image\n- 🎵 [Fine-tune support](https://sozercan.github.io/aikit/docs/fine-tune)\n- 🚀 Easy-to-use declarative configuration for [inference](https://sozercan.github.io/aikit/docs/specs-inference) and [fine-tuning](https://sozercan.github.io/aikit/docs/specs-finetune)\n- ✨ OpenAI API compatible to use with any OpenAI API compatible client\n- 📸 [Multi-modal model support](https://sozercan.github.io/aikit/docs/vision)\n- 🖼️ [Image generation support](https://sozercan.github.io/aikit/docs/diffusion)\n- 🦙 Support for GGUF ([`llama`](https://github.com/ggerganov/llama.cpp)), GPTQ or EXL2 ([`exllama2`](https://github.com/turboderp/exllamav2)), GGML ([`llama-ggml`](https://github.com/ggerganov/llama.cpp)), and [Mamba](https://github.com/state-spaces/mamba) models\n- 🚢 [Kubernetes deployment ready](https://sozercan.github.io/aikit/docs/kubernetes)\n- 📦 Supports multiple models with a single image\n- 🖥️ Supports [AMD64 and ARM64](https://sozercan.github.io/aikit/docs/create-images#multi-platform-support) CPUs and [GPU-accelerated inferencing with 
NVIDIA GPUs](https://sozercan.github.io/aikit/docs/gpu)\n- 🔐 Ensure [supply chain security](https://sozercan.github.io/aikit/docs/security) with SBOMs, Provenance attestations, and signed images\n- 🌈 Supports air-gapped environments with self-hosted, local, or any remote container registries to store model images for inference on the edge.\n\n## Quick Start\n\nYou can get started with AIKit quickly on your local machine without a GPU!\n\n```bash\ndocker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b\n```\n\nAfter running this, navigate to [http://localhost:8080/chat](http://localhost:8080/chat) to access the WebUI!\n\n### API\n\nAIKit provides an OpenAI API compatible endpoint, so you can use any OpenAI API compatible client to send requests to open LLMs!\n\n```bash\ncurl http://localhost:8080/v1/chat/completions -H \"Content-Type: application/json\" -d '{\n    \"model\": \"llama-3.1-8b-instruct\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"explain kubernetes in a sentence\"}]\n  }'\n```\n\nOutput should be similar to:\n\n```jsonc\n{\n  // ...\n    \"model\": \"llama-3.1-8b-instruct\",\n    \"choices\": [\n        {\n            \"index\": 0,\n            \"finish_reason\": \"stop\",\n            \"message\": {\n                \"role\": \"assistant\",\n                \"content\": \"Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure.\"\n            }\n        }\n    ],\n  // ...\n}\n```\n\nThat's it! 
🎉 The API is OpenAI compatible, so it works as a drop-in replacement with any OpenAI API compatible client.\n\n## Pre-made Models\n\nAIKit comes with pre-made models that you can use out of the box!\n\nIf the model you need isn't included, you can always [create your own images](https://sozercan.github.io/aikit/docs/create-images) and host them in a container registry of your choice!\n\n### CPU\n\n\u003e [!NOTE]\n\u003e AIKit supports both AMD64 and ARM64 CPUs. You can run the same command on either architecture, and Docker will automatically pull the correct image for your CPU.\n\u003e\n\u003e Depending on your CPU capabilities, AIKit will automatically select the most optimized instruction set.\n\n| Model           | Optimization | Parameters | Command                                                          | Model Name               | License                                                                            |\n| --------------- | ------------ | ---------- | ---------------------------------------------------------------- | ------------------------ | ---------------------------------------------------------------------------------- |\n| 🦙 Llama 3.2     | Instruct     | 1B         | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:1b`   | `llama-3.2-1b-instruct`  | [Llama](https://ai.meta.com/llama/license/)                                        |\n| 🦙 Llama 3.2     | Instruct     | 3B         | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:3b`   | `llama-3.2-3b-instruct`  | [Llama](https://ai.meta.com/llama/license/)                                        |\n| 🦙 Llama 3.1     | Instruct     | 8B         | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b`   | `llama-3.1-8b-instruct`  | [Llama](https://ai.meta.com/llama/license/)                                        |\n| 🦙 Llama 3.3     | Instruct     | 70B        | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.3:70b`  | `llama-3.3-70b-instruct` | 
[Llama](https://ai.meta.com/llama/license/)                                        |\n| Ⓜ️ Mixtral       | Instruct     | 8x7B       | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b`  | `mixtral-8x7b-instruct`  | [Apache](https://choosealicense.com/licenses/apache-2.0/)                          |\n| 🅿️ Phi 3.5       | Instruct     | 3.8B       | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b`   | `phi-3.5-3.8b-instruct`  | [MIT](https://huggingface.co/microsoft/Phi-3.5-mini-instruct/resolve/main/LICENSE) |\n| 🔡 Gemma 2       | Instruct     | 2B         | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma2:2b`     | `gemma-2-2b-instruct`    | [Gemma](https://ai.google.dev/gemma/terms)                                         |\n| ⌨️ Codestral 0.1 | Code         | 22B        | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b`          | [MNPL](https://mistral.ai/licenses/MNPL-0.1.md)                                    |\n| QwQ             |              | 32B        | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/qwq:32b`       | `qwq-32b-preview`        | [Apache 2.0](https://huggingface.co/Qwen/QwQ-32B-Preview/blob/main/LICENSE)        |\n\n\n### NVIDIA CUDA\n\n\u003e [!NOTE]\n\u003e To enable GPU acceleration, please see [GPU Acceleration](https://sozercan.github.io/aikit/docs/gpu).\n\u003e\n\u003e Please note that the only difference between the CPU and GPU sections is the `--gpus all` flag, which enables GPU acceleration.\n\n| Model           | Optimization  | Parameters | Command                                                                     | Model Name               | License                                                                                                                     |\n| --------------- | ------------- | ---------- | --------------------------------------------------------------------------- | ------------------------ | 
--------------------------------------------------------------------------------------------------------------------------- |\n| 🦙 Llama 3.2     | Instruct      | 1B         | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:1b`   | `llama-3.2-1b-instruct`  | [Llama](https://ai.meta.com/llama/license/)                                                                                 |\n| 🦙 Llama 3.2     | Instruct      | 3B         | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:3b`   | `llama-3.2-3b-instruct`  | [Llama](https://ai.meta.com/llama/license/)                                                                                 |\n| 🦙 Llama 3.1     | Instruct      | 8B         | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b`   | `llama-3.1-8b-instruct`  | [Llama](https://ai.meta.com/llama/license/)                                                                                 |\n| 🦙 Llama 3.3     | Instruct      | 70B        | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.3:70b`  | `llama-3.3-70b-instruct` | [Llama](https://ai.meta.com/llama/license/)                                                                                 |\n| Ⓜ️ Mixtral       | Instruct      | 8x7B       | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b`  | `mixtral-8x7b-instruct`  | [Apache](https://choosealicense.com/licenses/apache-2.0/)                                                                   |\n| 🅿️ Phi 3.5       | Instruct      | 3.8B       | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b`   | `phi-3.5-3.8b-instruct`  | [MIT](https://huggingface.co/microsoft/Phi-3.5-mini-instruct/resolve/main/LICENSE)                                          |\n| 🔡 Gemma 2       | Instruct      | 2B         | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma2:2b`     | `gemma-2-2b-instruct`    | [Gemma](https://ai.google.dev/gemma/terms)                            
                                                      |\n| ⌨️ Codestral 0.1 | Code          | 22B        | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b`          | [MNPL](https://mistral.ai/licenses/MNPL-0.1.md)                                                                             |\n| QwQ             |               | 32B        | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/qwq:32b`       | `qwq-32b-preview`        | [Apache 2.0](https://huggingface.co/Qwen/QwQ-32B-Preview/blob/main/LICENSE)                                                 |\n| 📸 Flux 1 Dev    | Text to image | 12B        | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/flux1:dev`     | `flux-1-dev`             | [FLUX.1 [dev] Non-Commercial License](https://github.com/black-forest-labs/flux/blob/main/model_licenses/LICENSE-FLUX1-dev) |\n\n\n### Apple Silicon (experimental)\n\n\u003e [!NOTE]\n\u003e To enable GPU acceleration on Apple Silicon, please see [Podman Desktop documentation](https://podman-desktop.io/docs/podman/gpu). For more information, please see [GPU Acceleration](https://sozercan.github.io/aikit/docs/gpu).\n\u003e\n\u003e Apple Silicon is an _experimental_ runtime and it may change in the future. 
This runtime is specific to Apple Silicon only, and it will not work as expected on other architectures, including Intel Macs.\n\u003e\n\u003e Only `gguf` models are supported on Apple Silicon.\n\n| Model       | Optimization | Parameters | Command                                                                                       | Model Name              | License                                                                            |\n| ----------- | ------------ | ---------- | --------------------------------------------------------------------------------------------- | ----------------------- | ---------------------------------------------------------------------------------- |\n| 🦙 Llama 3.2 | Instruct     | 1B         | `podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/llama3.2:1b` | `llama-3.2-1b-instruct` | [Llama](https://ai.meta.com/llama/license/)                                        |\n| 🦙 Llama 3.2 | Instruct     | 3B         | `podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/llama3.2:3b` | `llama-3.2-3b-instruct` | [Llama](https://ai.meta.com/llama/license/)                                        |\n| 🦙 Llama 3.1 | Instruct     | 8B         | `podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/llama3.1:8b` | `llama-3.1-8b-instruct` | [Llama](https://ai.meta.com/llama/license/)                                        |\n| 🅿️ Phi 3.5   | Instruct     | 3.8B       | `podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/phi3.5:3.8b` | `phi-3.5-3.8b-instruct` | [MIT](https://huggingface.co/microsoft/Phi-3.5-mini-instruct/resolve/main/LICENSE) |\n| 🔡 Gemma 2   | Instruct     | 2B         | `podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/gemma2:2b`   | `gemma-2-2b-instruct`   | [Gemma](https://ai.google.dev/gemma/terms)                                         |\n\n## What's next?\n\n👉 For more 
information on how to fine-tune models or create your own images, please see the [AIKit website](https://sozercan.github.io/aikit/)!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsozercan%2Faikit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsozercan%2Faikit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsozercan%2Faikit/lists"}