{"id":50824721,"url":"https://github.com/chandlertee/noetica","last_synced_at":"2026-06-13T17:05:21.850Z","repository":{"id":362962529,"uuid":"1261447112","full_name":"chandlertee/noetica","owner":"chandlertee","description":"Self-hosted local-AI stack over Ollama: private chat, a structured-output API, and a QLoRA fine-tune → serve → chat loop. No cloud, no secrets.","archived":false,"fork":false,"pushed_at":"2026-06-06T18:12:27.000Z","size":851,"stargazers_count":0,"open_issues_count":3,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-06T19:21:12.465Z","etag":null,"topics":["chatbot","chatgpt-alternative","fastapi","fine-tuning","knowledge-base","llm","local-llm","ollama","open-webui","qlora","rag","self-hosted"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chandlertee.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-06-06T17:40:14.000Z","updated_at":"2026-06-06T18:11:08.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/chandlertee/noetica","commit_stats":null,"previous_names":["chandlertee/noetica"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/chandlertee/noetica","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chandlertee%2Fnoetica","tags_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chandlertee%2Fnoetica/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chandlertee%2Fnoetica/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chandlertee%2Fnoetica/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chandlertee","download_url":"https://codeload.github.com/chandlertee/noetica/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chandlertee%2Fnoetica/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34292326,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-13T02:00:06.617Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","chatgpt-alternative","fastapi","fine-tuning","knowledge-base","llm","local-llm","ollama","open-webui","qlora","rag","self-hosted"],"created_at":"2026-06-13T17:05:21.058Z","updated_at":"2026-06-13T17:05:21.842Z","avatar_url":"https://github.com/chandlertee.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# Noetica\n\n**Your own local-AI stack. 
Chat with it, build on it, fine-tune it — one Ollama registry, no cloud, no secrets.**\n\n[![CI](https://github.com/chandlertee/noetica/actions/workflows/ci.yml/badge.svg)](https://github.com/chandlertee/noetica/actions/workflows/ci.yml)\n[![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE)\n[![Python 3.11 | 3.12](https://img.shields.io/badge/python-3.11%20%7C%203.12-blue.svg)](pyproject.toml)\n[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)\n[![Built on Ollama](https://img.shields.io/badge/built%20on-Ollama-black.svg)](https://ollama.com)\n[![100% local](https://img.shields.io/badge/inference-100%25%20local-success.svg)](#security)\n\nThree front doors over one local model registry — and a closed loop between them.\n\n\u003c/div\u003e\n\n---\n\nNoetica is a self-hosted AI stack you run on your own hardware. It puts three\nequal front doors over a single [Ollama](https://ollama.com) registry:\n\n| Door | What | Surface |\n|------|------|---------|\n| 💬 **Chat** — *use it* | a private ChatGPT-style workbench | [Open WebUI](https://github.com/open-webui/open-webui) over your local models, with RAG over your own files |\n| 🔌 **Serve** — *build on it* | a structured-output API | `POST /v1/llm/structured` (prompt + JSON Schema → validated JSON), `/v1/embed`, `/v1/health` |\n| 🎓 **Train + eval** — *extend it* | fine-tune your own model | QLoRA → export to Ollama → it shows up in **chat and the API** → score it |\n\n…and they form a **loop**: chat with local models → build apps on the structured\nAPI → fine-tune your own model (FOSS) → export it back to Ollama → it appears in\nchat *and* the API → evaluate → repeat. No cloud. No API keys. 
Nothing leaves your box.\n\n## Architecture\n\n```mermaid\nflowchart TB\n    subgraph doors[\" \"]\n        direction LR\n        chat[\"💬 Chat\u003cbr/\u003eOpen WebUI\u003cbr/\u003e\u003ci\u003euse it\u003c/i\u003e\"]\n        serve[\"🔌 Serve API\u003cbr/\u003e/v1/llm/structured · /v1/embed\u003cbr/\u003e\u003ci\u003ebuild on it\u003c/i\u003e\"]\n        train[\"🎓 Train + Eval\u003cbr/\u003eQLoRA · export · score\u003cbr/\u003e\u003ci\u003eextend it\u003c/i\u003e\"]\n    end\n\n    chat --\u003e ollama\n    serve --\u003e ollama\n    train -- \"ollama create\" --\u003e ollama\n    train -. \"scores via\" .-\u003e serve\n\n    ollama[(\"🧠 Ollama\u003cbr/\u003eone local model registry\")]\n    ollama --\u003e models[\"qwen2.5 · llama3.1 · nomic-embed\u003cbr/\u003e+ your fine-tunes\"]\n\n    apps[\"your apps / scripts\"] --\u003e serve\n    files[\"your documents\"] --\u003e chat\n\n    classDef door fill:#0d1117,stroke:#3b82f6,color:#e6edf3,stroke-width:2px;\n    classDef reg fill:#161b22,stroke:#f59e0b,color:#e6edf3,stroke-width:2px;\n    class chat,serve,train door;\n    class ollama,models reg;\n```\n\nEverything in the loop runs locally. Training can borrow a GPU (or Colab), but\n**inference never leaves your machine**.\n\n## Chat workbench\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"docs/chat-ui.png\" alt=\"Noetica chat — Open WebUI over a local model, answering from an attached document collection\" width=\"760\"\u003e\n  \u003cbr\u003e\u003csub\u003eOpen WebUI on a local \u003ccode\u003eqwen2.5\u003c/code\u003e, answering from your own document collection — fully local, with sources.\u003c/sub\u003e\n\u003c/div\u003e\n\n## 60-second quickstart\n\n```sh\ngit clone https://github.com/chandlertee/noetica \u0026\u0026 cd noetica\n\n# 1. Make sure Ollama has a chat model and an embedding model (laptop / macOS path).\n#    `ollama pull` takes one model per call, so pull them separately:\nollama pull qwen2.5:7b-instruct \u0026\u0026 ollama pull nomic-embed-text\n\n# 2. Bring up chat + the API (auto-picks cpu/gpu profile for your host):\n./bin/up\n\n# 3. 
Open the doors:\nopen http://localhost:3000          # 💬 chat (start here)\ncurl localhost:8001/v1/health       # 🔌 API\n```\n\nAsk the API for structured JSON:\n\n```sh\ncurl -X POST localhost:8001/v1/llm/structured -H 'Content-Type: application/json' -d '{\n  \"prompt\": \"Describe the film Arrival (2016).\",\n  \"response_schema\": {\n    \"type\": \"object\",\n    \"properties\": {\"title\": {\"type\":\"string\"}, \"year\": {\"type\":\"integer\"}, \"director\": {\"type\":\"string\"}},\n    \"required\": [\"title\", \"year\"]\n  }\n}'\n```\n\nNo Docker? Run the API natively: `uv sync \u0026\u0026 uv run noetica serve` (then `noetica health`).\n\n## The loop, end to end\n\n```sh\n# build on it — two endpoints compose into local RAG\npython examples/rag_local.py \"How do I expose the API safely?\"\n\n# extend it — fine-tune, export to Ollama, then it's in chat AND the API\npython -m noetica.train.data validate examples/data/sample_chat.jsonl\npython -m noetica.train.finetune --config noetica/train/configs/qwen2.5-7b.yaml   # GPU/Colab\npython -m noetica.train.export_ollama --gguf my-model.Q4_K_M.gguf --name my-model\n\n# evaluate — score models against golden cases\npython -m noetica.eval.run --model my-model --model qwen2.5:7b-instruct\n```\n\nFull walkthroughs: [chat over your docs](examples/chat_projects.md) ·\n[fine-tune → serve → chat → eval](examples/finetune_to_serve.md) ·\n[all examples](examples/README.md).\n\n## Features\n\n| | |\n|---|---|\n| **Structured output** | JSON-Schema-driven generation with a repair-retry loop + server-side validation |\n| **Embeddings + local RAG** | `/v1/embed` and an ~80-line [RAG example](examples/rag_local.py); built-in RAG in chat |\n| **Observability** | structured JSON logs, per-request `X-Request-ID`, Prometheus `/metrics` |\n| **Auth \u0026 CORS** | optional constant-time API key (`NOETICA_API_KEY`), configurable origins |\n| **Fine-tuning** | Unsloth QLoRA recipe, dataset validation, Colab notebook for the no-GPU 
path |\n| **Export to Ollama** | merge adapter → GGUF (llama.cpp) → quantize → Modelfile → `ollama create` |\n| **Evals** | golden cases, `case × model` comparison table, deterministic CI gate |\n| **Lifecycle-separated** | chat + serve install with **no torch/CUDA**; training lives behind `pip install .[train]` + a compose profile |\n| **Tested \u0026 typed** | deterministic tests (Ollama mocked via `respx`), full type coverage, ruff + mypy in CI |\n\n## Install profiles\n\n```sh\nuv sync                 # core: chat + serve + eval. No GPU, no torch.\nuv sync --extra dev     # + test/lint/type tools\nuv sync --extra train   # + QLoRA stack (torch/unsloth) — NVIDIA GPU\n```\n\n`./bin/up` picks a compose profile by host: `cpu` (laptop / macOS, Ollama native)\nor `full` (containerized Ollama + GPU). Training is a separate one-shot profile.\n\n## Security\n\nNoetica defaults to a friendly, single-user **localhost** setup — and nothing\never leaves your machine. Those defaults are convenient, not hardened. **Before\nexposing anything on a network**, set `NOETICA_API_KEY`, set `WEBUI_AUTH=True`,\nfront it with TLS, and keep Ollama's port internal. The full checklist and threat\nmodel are in [SECURITY.md](SECURITY.md).\n\n## Built on (FOSS credits)\n\n[Ollama](https://ollama.com) · [Open WebUI](https://github.com/open-webui/open-webui) ·\n[FastAPI](https://fastapi.tiangolo.com) · [Unsloth](https://github.com/unslothai/unsloth) ·\n[TRL](https://github.com/huggingface/trl) / [PEFT](https://github.com/huggingface/peft) ·\n[llama.cpp](https://github.com/ggerganov/llama.cpp) · [uv](https://github.com/astral-sh/uv) ·\n[Ruff](https://github.com/astral-sh/ruff). All FOSS, all local.\n\n## Roadmap\n\nThe next door is **`agents/`** — local agents \u0026 automation that consume the serve\nAPI + Ollama, slotting in behind their own `[agents]` extra and compose profile.\nSee [ROADMAP.md](ROADMAP.md). 
(Not in v1.0.0 — v1 stays focused on the three doors.)\n\n## Contributing\n\nIssues and PRs welcome — keep it local, light, and FOSS. See\n[CONTRIBUTING.md](CONTRIBUTING.md), [AGENTS.md](AGENTS.md) (for coding agents),\nand the [Code of Conduct](CODE_OF_CONDUCT.md).\n\n## License\n\n[Apache-2.0](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchandlertee%2Fnoetica","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchandlertee%2Fnoetica","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchandlertee%2Fnoetica/lists"}