{"id":49222744,"url":"https://github.com/ericflo/kiln","last_synced_at":"2026-05-01T12:01:44.857Z","repository":{"id":352505973,"uuid":"1208965589","full_name":"ericflo/kiln","owner":"ericflo","description":"A single-model LLM inference server with live online learning via LoRA hot-swap. Train while you serve.","archived":false,"fork":false,"pushed_at":"2026-04-26T12:26:03.000Z","size":18133,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-26T13:14:26.150Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ericflo.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-13T00:53:59.000Z","updated_at":"2026-04-26T12:25:57.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ericflo/kiln","commit_stats":null,"previous_names":["ericflo/kiln"],"tags_count":22,"template":false,"template_full_name":null,"purl":"pkg:github/ericflo/kiln","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ericflo%2Fkiln","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ericflo%2Fkiln/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ericflo%2Fkiln/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ericflo%2Fkiln/manifests","owner_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ericflo","download_url":"https://codeload.github.com/ericflo/kiln/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ericflo%2Fkiln/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32495949,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-24T05:01:15.012Z","updated_at":"2026-05-01T12:01:44.833Z","avatar_url":"https://github.com/ericflo.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/logo.png\" alt=\"Kiln\" width=\"200\"\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eKiln\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eYour model gets better every time you use it.\u003c/strong\u003e\u003cbr\u003e\n  A single-GPU inference server with live LoRA training. Pure Rust. 
Single binary.\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"QUICKSTART.md\"\u003eQuickstart\u003c/a\u003e \u0026middot;\n  \u003ca href=\"ARCHITECTURE.md\"\u003eArchitecture\u003c/a\u003e \u0026middot;\n  \u003ca href=\"kiln.example.toml\"\u003eConfiguration\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\nKiln serves a language model and trains it — in the same process, on the same GPU, at the same time. You submit corrections or scored completions over HTTP, and the model improves in seconds. No restarts, no separate training pipeline, no second copy of the weights.\n\nIt targets one model ([Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B)) and optimizes everything for that model — the scheduler, the memory manager, the kernels. This isn't a general-purpose framework. It's a scalpel.\n\n## Why\n\nToday, improving a deployed model looks like: collect failure examples, format them, upload to a training service, wait hours, download new weights, redeploy, hope. Kiln collapses that into one API call:\n\n```bash\n# Submit a correction — the model learns it in seconds\ncurl http://localhost:8420/v1/train/sft \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"examples\": [\n      {\"messages\": [\n        {\"role\": \"user\", \"content\": \"Summarize this contract clause...\"},\n        {\"role\": \"assistant\", \"content\": \"The clause establishes...\"}\n      ]}\n    ]\n  }'\n\n# The next request already uses the updated weights\ncurl http://localhost:8420/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"messages\": [{\"role\": \"user\", \"content\": \"Summarize this contract clause...\"}]}'\n```\n\nA 4B model continuously tuned to your specific workload will outperform a generic 70B model on the tasks you actually care about. And it runs on hardware you already own.\n\n## Features\n\n- **OpenAI-compatible API** — drop in as a local replacement. 
SSE streaming, chat completions, tool use formatting.\n- **SFT training** over HTTP — submit examples, model updates in seconds via LoRA hot-swap.\n- **GRPO training** over HTTP — submit scored completions for reinforcement learning. You control the reward function.\n- **LoRA hot-swap** — new adapter weights activate atomically at iteration boundaries. Zero downtime.\n- **Continuous batching** with chunked prefill — decode requests are never stalled by long prompts.\n- **128K+ context** on 24GB — Qwen3.5-4B's hybrid architecture (24 linear attention + 8 full attention layers) means KV cache is 4x smaller than a pure transformer.\n- **Paged KV cache** — virtual memory-style block allocation eliminates fragmentation.\n- **FP8 KV cache** — optional quantization doubles effective context length.\n- **Prefix caching** — shared prompt prefixes reuse cached KV blocks.\n- **Gradient checkpointing** — training fits on consumer 24GB GPUs (RTX 3090/4090).\n- **Adapter management** — load, unload, upload (import), download (export), and version LoRA adapters.\n- **Adapter composition** — stack multiple LoRAs per request with per-adapter scaling, or merge them server-side via weighted_average / TIES / concatenation.\n- **Embedded web dashboard** at `/ui` — live server status, VRAM breakdown, adapter management, training monitoring, and a chat playground. No extra service to run.\n- **Prometheus metrics** at `/metrics` — request latency, throughput, training progress, memory usage.\n- **Training webhooks** — POST a JSON event to a configured URL on training job completion or failure.\n- **Pure Rust** — single binary, single process. No Python. No sidecar. No second model in memory.\n\n## The GRPO Loop\n\nThis is the killer feature. Generate completions, score them with your own reward function, and feed the results back. 
The model learns what \"good\" means for your use case.\n\n```python\nimport openai\nimport requests\n\nclient = openai.OpenAI(base_url=\"http://localhost:8420/v1\", api_key=\"unused\")\n\n# Placeholder prompt; my_score below is your own reward function (not defined here)\nprompt = \"Summarize this contract clause...\"\n\n# 1. Generate candidates\nresponses = [\n    client.chat.completions.create(\n        model=\"qwen3.5-4b-kiln\",\n        messages=[{\"role\": \"user\", \"content\": prompt}],\n        temperature=0.7\n    )\n    for _ in range(8)\n]\n\n# 2. Score them however you want — regex, unit tests, another model, human eval\nscored = [{\"text\": r.choices[0].message.content, \"reward\": my_score(r)} for r in responses]\n\n# 3. Submit — the server trains and hot-swaps immediately\nrequests.post(\"http://localhost:8420/v1/train/grpo\", json={\n    \"groups\": [{\"prompt\": prompt, \"completions\": scored}]\n})\n\n# 4. Next inference already uses the improved weights\n```\n\nSee [docs/GRPO_GUIDE.md](docs/GRPO_GUIDE.md) for worked verifiable-rewards examples (math, JSON, code).\n\n## Quick Start\n\n**Prerequisites:** NVIDIA GPU with 24GB+ VRAM and CUDA 12+ **OR** Apple Silicon Mac with 16GB+ unified memory. 
Rust stable toolchain on both.\n\n```bash\ngit clone https://github.com/ericflo/kiln.git\ncd kiln\n\n# Linux / Windows + NVIDIA\ncargo build --release --features cuda     # ~15-30 min first build (CUDA kernels)\n\n# macOS + Apple Silicon\ncargo build --release --features metal    # Metal backend via candle\n\n# Download model weights\npip install huggingface-hub\nhuggingface-cli download Qwen/Qwen3.5-4B --local-dir ./Qwen3.5-4B\n\n# Start serving\nKILN_MODEL_PATH=./Qwen3.5-4B ./target/release/kiln serve\n```\n\n```\n  ┌─────────────────────────────────────┐\n  │           🔥 K I L N 🔥             │\n  │   inference · training · adapters   │\n  └─────────────────────────────────────┘\n\n  Version: 0.2.1\n  Model:   ./Qwen3.5-4B\n  CUDA:    available ✓\n  GPU:     NVIDIA RTX A6000\n  VRAM:    49140 MiB total, 48891 MiB free\n  Listen:  http://0.0.0.0:8420\n```\n\nThe `GPU` and `VRAM` lines come from `nvidia-smi` and are skipped silently if it isn't installed.\n\n```bash\n# Chat\ncurl http://localhost:8420/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}], \"stream\": true}'\n\n# Train\ncurl http://localhost:8420/v1/train/sft \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"examples\": [{\"messages\": [{\"role\": \"user\", \"content\": \"Hi\"}, {\"role\": \"assistant\", \"content\": \"Hey there!\"}]}]}'\n\n# Check training\ncurl http://localhost:8420/v1/train/status\n```\n\nSee [QUICKSTART.md](QUICKSTART.md) for the full walkthrough including GRPO, adapter management, Docker, and systemd setup.\n\n## Memory Budget (24GB GPU)\n\nQwen3.5-4B's hybrid architecture is the key. Only 8 of 32 layers need KV cache, so long-context inference costs a fraction of what a pure transformer would.\n\n| Scenario | Total VRAM | Fits 24GB? 
|\n|---|---|---|\n| 128K context, 1 sequence, inference only | ~13 GB | Yes |\n| 128K context, 1 sequence, inference + training | ~18 GB | Yes |\n| 64K context, 4 sequences, inference + training | ~22 GB | Yes |\n| 32K context, 8 sequences, inference + training | ~22 GB | Yes |\n| 128K context, 4 sequences, FP8 KV cache | ~19 GB | Yes |\n\n### Apple Silicon (M3 Max / M4 Max 64GB unified memory)\n\nOn Apple Silicon, model weights, KV cache, and training state all live in unified memory shared with the OS. A 64 GB chip leaves generous headroom for long contexts and concurrent training.\n\n| Scenario | Unified Memory | Fits 64GB? |\n|---|---|---|\n| 128K context, 1 sequence, inference only | ~13 GB | Yes |\n| 128K context, 1 sequence, inference + training | ~18 GB | Yes |\n| 64K context, 4 sequences, inference + training | ~22 GB | Yes |\n| 128K context, 8 sequences, inference + training | ~32 GB | Yes |\n| 128K context, 4 sequences, FP8 KV cache | ~19 GB | Yes |\n\n16 GB M-series chips fit short-context inference; 32 GB fits 64K context comfortably; 64 GB+ matches or exceeds the 24 GB CUDA envelope.\n\n## API\n\n| Method | Path | Description |\n|---|---|---|\n| POST | `/v1/chat/completions` | Chat completions (OpenAI-compatible) |\n| POST | `/v1/completions/batch` | Batch generation API for GRPO (up to 64 prompts per request) |\n| POST | `/v1/train/sft` | Submit SFT training examples |\n| POST | `/v1/train/grpo` | Submit GRPO scored completions |\n| GET | `/v1/train/status` | Training queue and job status |\n| GET | `/v1/adapters` | List loaded LoRA adapters |\n| POST | `/v1/adapters/load` | Load adapter from disk |\n| POST | `/v1/adapters/unload` | Unload active adapter |\n| POST | `/v1/adapters/upload` | Multipart tar.gz import of an adapter |\n| GET  | `/v1/adapters/{name}/download` | Stream adapter as tar.gz (export) |\n| POST | `/v1/adapters/merge` | Merge adapters (weighted_average, TIES, or concatenation modes) |\n| GET | `/v1/models` | List available models 
|\n| GET | `/ui` | Embedded web dashboard (status, adapters, training, chat) |\n| GET | `/health` | Server health and diagnostics |\n| GET | `/metrics` | Prometheus metrics |\n\n## Architecture\n\n```\nSingle Rust binary:\n  HTTP (axum) ─── Scheduler (continuous batching, chunked prefill)\n                      │\n                  Block Manager (paged KV cache)\n                      │\n                  Engine (model forward + LoRA)\n                  ├── 24× Gated DeltaNet layers (linear attention, O(1) state)\n                  └──  8× GQA layers (full attention + KV cache)\n                      │\n                  Training (background thread, shares GPU memory)\n                  ├── SFT (cross-entropy on LoRA parameters)\n                  └── GRPO (advantage-weighted policy gradient)\n```\n\nEverything runs in one process. Training happens on a background thread sharing the already-loaded model — no second copy in VRAM, no Python sidecar.\n\nSee [ARCHITECTURE.md](ARCHITECTURE.md) for the full deep-dive.\n\n## Project Structure\n\n```\ncrates/\n  kiln-core/             Core types: block manager, prefix cache, config, request lifecycle\n  kiln-model/            Model loading, forward pass, LoRA, sampling\n  kiln-scheduler/        Continuous batching scheduler with chunked prefill\n  kiln-server/           HTTP server, CLI, training queue, metrics, config\n  kiln-train/            SFT and GRPO training loops with gradient checkpointing\n  kiln-nvtx/             Thin NVTX range wrapper for nsys attribution (no-op when off)\n  kiln-flce-kernel/      Fused Linear Cross-Entropy (chunked CE without [T, V] logits)\n  kiln-flash-attn/       Vendored Flash-Attention-2 CUDA kernels (C-ABI + Rust FFI) [CUDA only]\n  kiln-gdn-kernel/       Vendored Gated DeltaNet chunkwise + recurrent kernels [CUDA only]\n  kiln-conv1d-kernel/    Vendored mamba-ssm causal_conv1d_update decode kernel [CUDA only]\n  kiln-rmsnorm-kernel/   Fused RMSNorm CUDA kernel (Liger-style) [CUDA 
only]\n  kiln-marlin-gemm/      Vendored IST-DASLab Marlin W4A16 GEMM kernel [CUDA only]\n```\n\n## Configuration\n\nKiln uses a TOML config file. Environment variables override config values. See [`kiln.example.toml`](kiln.example.toml) for all options.\n\n| Setting | Env Var | Default | Description |\n|---|---|---|---|\n| `model.path` | `KILN_MODEL_PATH` | — | Path to model weights (required) |\n| `server.port` | `KILN_PORT` | 8420 | Server listen port |\n| `memory.inference_memory_fraction` | — | 0.7 | VRAM fraction for inference vs training |\n| `memory.kv_cache_fp8` | `KILN_KV_CACHE_FP8` | false | FP8 KV cache (2x context length) |\n| `logging.format` | `KILN_LOG_FORMAT` | auto | `auto` (default; pretty on TTY, JSON otherwise), `json`, `pretty`, `text`, or `human` |\n| `prefix_cache.enabled` | — | true | Reuse KV cache for shared prefixes |\n\n## Desktop App\n\nKiln Desktop is a system-tray app that wraps the `kiln` server for people who don't want to use a CLI. It spawns and supervises the `kiln` binary as a child process, shows server status in the tray, and opens a dashboard, settings, and log viewer in native windows.\n\n**Windows, Linux, and macOS (Apple Silicon).** The Windows and Linux installers drive the CUDA build of `kiln`; the macOS installer drives the candle-metal build on M-series hardware. 
Intel Macs are not supported.\n\n**Download — [Kiln Desktop v0.2.0](https://github.com/ericflo/kiln/releases/tag/desktop-v0.2.0):**\n\nSee **[desktop/CHANGELOG.md](desktop/CHANGELOG.md)** for the full version history.\n\n| Platform | Installer | Size |\n|---|---|---|\n| macOS (Apple Silicon) | [Kiln.Desktop_0.2.0_aarch64.dmg](https://github.com/ericflo/kiln/releases/download/desktop-v0.2.0/Kiln.Desktop_0.2.0_aarch64.dmg) | 8.1 MB |\n| Windows | [Kiln.Desktop_0.2.0_x64-setup.exe](https://github.com/ericflo/kiln/releases/download/desktop-v0.2.0/Kiln.Desktop_0.2.0_x64-setup.exe) (NSIS) | 4.3 MB |\n| Windows | [Kiln.Desktop_0.2.0_x64_en-US.msi](https://github.com/ericflo/kiln/releases/download/desktop-v0.2.0/Kiln.Desktop_0.2.0_x64_en-US.msi) (MSI) | 6.5 MB |\n| Linux | [Kiln.Desktop_0.2.0_amd64.deb](https://github.com/ericflo/kiln/releases/download/desktop-v0.2.0/Kiln.Desktop_0.2.0_amd64.deb) | 8.4 MB |\n| Linux | [Kiln.Desktop_0.2.0_amd64.AppImage](https://github.com/ericflo/kiln/releases/download/desktop-v0.2.0/Kiln.Desktop_0.2.0_amd64.AppImage) | 81.7 MB |\n\nThe installer bundles the desktop wrapper only. On first launch the app offers to auto-download the matching prebuilt `kiln` server binary for your platform (macOS aarch64 / Metal, Linux x86_64 / CUDA 12.4, Windows x86_64 / CUDA 12.4) from the latest `kiln-v*` GitHub release and verify it against the published SHA-256. You can also point it at an existing `kiln` binary from Settings. Model weights still need to be downloaded separately — the Settings window has a HuggingFace downloader, or you can use the CLI path in [QUICKSTART.md](QUICKSTART.md).\n\n**Dashboard** — a toolbar across the top surfaces server state, model path, VRAM usage, active LoRA adapter, training status, and the OpenAI base URL as click-to-copy pills, alongside Start / Stop / Restart Server, View Logs, and Settings buttons. 
A first-run empty state walks you through setting a model path, and if the kiln server crashes while the dashboard is open an error screen surfaces it with a one-click recovery path. Keyboard shortcuts cover the common actions — \u003ckbd\u003eCtrl/Cmd\u003c/kbd\u003e+\u003ckbd\u003eShift\u003c/kbd\u003e+\u003ckbd\u003eS\u003c/kbd\u003e to start, \u003ckbd\u003eCtrl/Cmd\u003c/kbd\u003e+\u003ckbd\u003eShift\u003c/kbd\u003e+\u003ckbd\u003e.\u003c/kbd\u003e to stop, \u003ckbd\u003eCtrl/Cmd\u003c/kbd\u003e+\u003ckbd\u003eShift\u003c/kbd\u003e+\u003ckbd\u003eR\u003c/kbd\u003e to restart, \u003ckbd\u003eCtrl/Cmd\u003c/kbd\u003e+\u003ckbd\u003eShift\u003c/kbd\u003e+\u003ckbd\u003eC\u003c/kbd\u003e to copy the base URL, \u003ckbd\u003eCtrl/Cmd\u003c/kbd\u003e+\u003ckbd\u003eL\u003c/kbd\u003e for logs, \u003ckbd\u003eCtrl/Cmd\u003c/kbd\u003e+\u003ckbd\u003e,\u003c/kbd\u003e for settings, and \u003ckbd\u003e?\u003c/kbd\u003e for the full cheatsheet modal. The toolbar wraps gracefully at narrow window widths, and the kiln server's `/ui` is embedded below.\n\n![Dashboard](docs/desktop/dashboard.png)\n\n**Settings** — pick a model path (or pull weights with the built-in HuggingFace downloader), configure the listening host and port, and tune the runtime: inference VRAM fraction, FP8 KV cache, CUDA graphs, prefix cache, and adapter directory. Startup options cover auto-start kiln on app launch, auto-restart on crash, and launch-at-login. 
A Check for Updates button (with the same check running automatically on launch) surfaces new `kiln-v*` releases and explains when an update is held back for CUDA driver or GPU SM-arch compatibility reasons.\n\n![Settings](docs/desktop/settings.png)\n\n**Logs** — tails the kiln server's stdout/stderr from an in-process ring buffer.\n\n![Logs](docs/desktop/logs.png)\n\nBuild and dev docs for the desktop app live in [desktop/README.md](desktop/README.md).\n\n## Deployment\n\n### Docker — pull a prebuilt image\n\nEach `kiln-v*` tag publishes a `linux/amd64` CUDA 12.4 image to GHCR:\n\n```bash\ndocker pull ghcr.io/ericflo/kiln-server:latest\n# or pin a version:\ndocker pull ghcr.io/ericflo/kiln-server:0.2.3\n```\n\nRun with the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html):\n\n```bash\ndocker run --gpus all -p 8420:8420 \\\n  -v /path/to/Qwen3.5-4B:/models \\\n  ghcr.io/ericflo/kiln-server:latest serve\n```\n\n### Docker — build from source\n\n```bash\ndocker build -f deploy/Dockerfile -t kiln .\ndocker run --gpus all -v /path/to/Qwen3.5-4B:/models -p 8420:8420 kiln serve\n```\n\n### systemd\n\n```bash\nsudo cp target/release/kiln /usr/local/bin/\nsudo cp deploy/kiln.service /etc/systemd/system/\nsudo systemctl enable --now kiln\n```\n\n## Status\n\nKiln is in active development. Core inference, LoRA serving, SFT training, GRPO training, production hardening, and the Phase 6 performance optimization sprint (FP8 KV cache, CUDA graphs, GPTQ + Marlin W4A16 quantization, fused decode kernels, SGLang-style radix prefix cache) are all shipped. Inference on macOS / Apple Silicon runs via the candle-metal backend, with a fused Metal kernel family landed in v0.2.0. Phase 7 (developer experience) and Phase 8 (advanced features: adapter upload/download, TIES + concatenation merge modes, per-request adapter composition, batch completions for GRPO, training webhooks) have shipped. 
Current work is Phase 9 — public release preparation. See [`CHANGELOG.md`](CHANGELOG.md) for what landed in the most recent release and [`BENCHMARKS.md`](BENCHMARKS.md) for current decode numbers.\n\nNot yet production-hardened for multi-tenant use. Designed for single-user, single-GPU deployments — your home server, your dev box, your dedicated cloud instance.\n\n## Prior Art\n\nKiln builds on ideas from:\n\n- [vLLM](https://github.com/vllm-project/vllm) — paged KV cache, continuous batching\n- [DeepSeekMath](https://arxiv.org/abs/2402.03300) — GRPO algorithm\n- [S-LoRA](https://arxiv.org/abs/2311.03285) — multi-LoRA serving techniques\n- [Tinker](https://thinkingmachines.ai/blog/announcing-tinker/) — the cloud-hosted version of this idea. Kiln is the self-hosted, open-source take.\n- [nano-vllm](https://github.com/GeeeekExplorer/nano-vllm) — proof that the core can be simple\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n\nThird-party dependency licenses: see [THIRD_PARTY_LICENSES.md](THIRD_PARTY_LICENSES.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fericflo%2Fkiln","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fericflo%2Fkiln","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fericflo%2Fkiln/lists"}