{"id":34885087,"url":"https://github.com/yafitzdev/fitz-ai","last_synced_at":"2026-02-28T11:23:10.934Z","repository":{"id":328598615,"uuid":"1112616318","full_name":"yafitzdev/fitz-ai","owner":"yafitzdev","description":"Intelligent, honest RAG in 5 minutes. No infrastructure. No boilerplate.","archived":false,"fork":false,"pushed_at":"2026-01-28T01:06:47.000Z","size":4625,"stargazers_count":11,"open_issues_count":3,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-28T16:58:17.187Z","etag":null,"topics":["ai","framework","llm","nlp","python","rag","retrieval-augmented-generation","vector-search"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yafitzdev.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-08T21:46:59.000Z","updated_at":"2026-01-28T03:53:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"cf86b18b-a5c8-449f-ae83-892681780771","html_url":"https://github.com/yafitzdev/fitz-ai","commit_stats":null,"previous_names":["yafitzdev/fitz","yafitzdev/fitz-ai"],"tags_count":19,"template":false,"template_full_name":null,"purl":"pkg:github/yafitzdev/fitz-ai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yafitzdev%2Ffitz-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yafitzdev%2Ffitz-ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/yafitzdev%2Ffitz-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yafitzdev%2Ffitz-ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yafitzdev","download_url":"https://codeload.github.com/yafitzdev/fitz-ai/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yafitzdev%2Ffitz-ai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29059124,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-03T20:13:53.544Z","status":"ssl_error","status_checked_at":"2026-02-03T20:13:40.507Z","response_time":96,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","framework","llm","nlp","python","rag","retrieval-augmented-generation","vector-search"],"created_at":"2025-12-26T03:36:24.305Z","updated_at":"2026-02-28T11:23:10.916Z","avatar_url":"https://github.com/yafitzdev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n\u003cdiv align=\"center\"\u003e\n\n# fitz-ai\n\n### Intelligent, honest knowledge retrieval in 5 minutes. No infrastructure. 
No boilerplate.\n\n[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)\n[![PyPI version](https://badge.fury.io/py/fitz-ai.svg)](https://pypi.org/project/fitz-ai/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n[![Version](https://img.shields.io/badge/version-0.10.1-green.svg)](CHANGELOG.md)\n[![Coverage](https://img.shields.io/badge/coverage-99%25-brightgreen)](https://github.com/yafitzdev/fitz-ai)\n\n\n[Why Fitz?](#why-fitz) • [Retrieval Intelligence](#retrieval-intelligence) • [Governance](#governance--know-what-you-dont-know) • [Documentation](#links) • [GitHub](https://github.com/yafitzdev/fitz-ai)\n\n\u003c/div\u003e\n\n\u003cbr /\u003e\n\n---\n\n```bash\npip install fitz-ai\n\nfitz query \"What is our refund policy?\" --source ./docs\n```\n\nThat's it. Your documents are now searchable with AI.\n\n\n![fitz-ai quickstart demo](https://raw.githubusercontent.com/yafitzdev/fitz-ai/main/docs/assets/quickstart_demo.gif)\n\n\u003cbr\u003e\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003ePython SDK\u003c/strong\u003e → \u003ca href=\"docs/SDK.md\"\u003eFull SDK Reference\u003c/a\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\n```python\nimport fitz_ai\n\nfitz_ai.point(\"./docs\")\nanswer = fitz_ai.query(\"What is our refund policy?\")\n```\n\n\u003c/details\u003e\n\n\u003cbr\u003e\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003eREST API\u003c/strong\u003e → \u003ca href=\"docs/API.md\"\u003eFull API Reference\u003c/a\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\n```bash\npip install fitz-ai[api]\n\nfitz serve  # http://localhost:8000/docs for interactive API\n```\n\n\u003c/details\u003e\n\n---\n\n### About 🧑‍🌾\n\n  Solo project by Yan Fitzner ([LinkedIn](https://www.linkedin.com/in/yan-fitzner/), [GitHub](https://github.com/yafitzdev)).\n\n  - ~50k lines of Python\n  - 1500+ tests, 99% coverage\n  - Zero LangChain/LlamaIndex dependencies — built from 
scratch\n\n![fitz-ai honest_rag](https://raw.githubusercontent.com/yafitzdev/fitz-ai/main/docs/assets/honest_rag.jpg)\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 What is RAG?\u003c/strong\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\nRAG is how ChatGPT's \"file search,\" Notion AI, and enterprise knowledge tools actually work under the hood.\nInstead of sending all your documents to an AI, RAG:\n\n1. [X] **Indexes your documents** — Splits them into chunks, converts to vectors, stores in a database\n2. [X] **Retrieves only what's relevant** — When you ask a question, finds the 5-10 most relevant chunks\n3. [X] **Sends just those chunks to the LLM** — The AI answers based on focused, relevant context\n\nTraditional approach:\n```\n  [All 10,000 documents] → LLM → Answer\n  ❌ Impossible (too large)\n  ❌ Expensive (if possible)\n  ❌ Unfocused\n```\nRAG approach:\n```\n  Question → [Search index] → [5 relevant chunks] → LLM → Answer\n  ✅ Works at any scale\n  ✅ Costs pennies per query\n  ✅ Focused context = better answers\n```\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 Why Can't I Just Send My Documents to ChatGPT Directly?\u003c/strong\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\nYou can—but you'll hit walls fast.\n\n**Context window limits 🚨** \n\u003e GPT-4 accepts ~128k tokens. That's roughly 300 pages. Your company wiki, codebase, or document archive is likely 10x-100x larger. You physically cannot paste it all.\n\n**Cost explosion 💥**\n\u003e Even if you could fit everything, you'd pay for every token on every query. Sending 100k tokens costs ~\\$1-3 per question. Ask 50 questions a day? That's \\$50-150 daily—for one user.\n\n**No selective retrieval ❌**\n\u003e When you paste documents, the model reads everything equally. It can't focus on what's relevant. 
Ask about refund policies and it's also processing your hiring guidelines, engineering specs, and meeting notes—wasting context and degrading answers.\n\n**No persistence 💢**\n\u003e Every conversation starts fresh. You re-upload, re-paste, re-explain. There's no knowledge base that accumulates and improves.\n\n\u003c/details\u003e\n\n---\n\n### Why Fitz?\n\n**Zero-wait querying 🐆** → [Progressive KRAG](docs/features/platform/progressive-krag-agentic-search.md)\n\u003e Point at a folder. Ask a question immediately — no ingestion step required. Fitz serves answers instantly via agentic search while a background worker indexes your files. Queries get faster over time as indexing completes, but they work from the first second.\n\n**Honest answers ✅** → [Governance Benchmark](docs/features/governance/governance-benchmarking.md)\n\u003e Most RAG tools confidently answer even when the answer isn't in your documents. Ask \"What was our Q4 revenue?\" when your docs only cover Q1-Q3, and typical RAG hallucinates a number. Fitz says: *\"I cannot find Q4 revenue figures in the provided documents.\"*\n\u003e \n\u003e → Fitz detects disputes at **79.1% recall** on [fitz-gov 5.0](https://github.com/yafitzdev/fitz-gov), a 2,900+ case benchmark for epistemic honesty (92% hard difficulty).\n\n**Queries that actually work 📊**\n\u003e Standard RAG fails silently on real queries. Fitz has built-in intelligence: hierarchical summaries for \"What are the trends?\", exact keyword matching for \"Find TC-1000\", multi-query decomposition for complex questions, address-based code retrieval with import graph traversal, and SQL execution for tabular data. No configuration—it just works.\n\n**Tabular data that is actually searchable 📈** → [Unified Storage](docs/features/platform/unified-storage.md)\n\u003e CSV and table data is a nightmare in most RAG systems—chunked arbitrarily, structure lost, queries fail. 
Fitz stores tables natively in PostgreSQL alongside your vectors—same database, no sync issues. It auto-detects the schema and runs real SQL. Ask \"What's the average price by region?\" and get an actual computed answer, not fragmented rows.\n\n**Other Features at a Glance 🃏**\n\u003e1. [x] **Fully local execution possible.** Embedded PostgreSQL + Ollama, no API keys required to start.\n\u003e2. [x] **Plugin-based architecture.** Swap LLMs, rerankers, and retrieval pipelines via YAML config.\n\u003e3. [x] **[KRAG (Knowledge Routing Augmented Generation)](docs/features/platform/krag.md).** Asymmetric indexing — documents are parsed into typed retrieval units (symbols, sections, tables) with structural metadata, not flat chunks. Queries are routed to the right strategy per content type.\n\u003e4. [x] **Full provenance.** Every answer traces back to the exact source symbol, section, or document.\n\u003e5. [x] **Data privacy.** No telemetry, no cloud, no external calls except to the LLM provider you configure.\n\u003e6. [x] **[Enterprise gateway support](docs/features/platform/enterprise-gateway.md).** OAuth2 M2M, custom CA certs, mTLS, and corporate proxy/gateway integration.\n\n####\n\n\u003e [!TIP]\n\u003e Any questions left? Try fitz on itself:\n\u003e \n\u003e ```bash\n\u003e fitz query \"How does the retrieval pipeline work?\" --source ./fitz_ai\n\u003e ```\n\u003e\n\u003e The codebase speaks for itself.\n\n---\n\n### What You Can Search\n\nTraditional RAG chops every document into flat text blocks and searches them the same way. 
[FitzKRAG](docs/features/platform/krag.md) parses each document by type — tree-sitter for code, heading hierarchy for docs, schema detection for CSVs — and produces typed retrieval units, each with its own storage format and search strategy.\n\n\u003cbr\u003e\n\n| Retrieval Unit              | Extracted From | How It Works |\n|-----------------------------|----------------|-------------|\n| [**Symbols 🖌️**](docs/features/ingestion/code-symbol-extraction.md) | Code files | Tree-sitter parses functions, classes, and methods into addressable units with qualified names, references, and import graphs. Cross-file dependencies are graph traversals, not text searches. |\n| **Sections 📑** | Documents (PDF, markdown, text) | Headings and paragraphs are extracted with parent/child hierarchy. Deeply nested sections include parent context; top-level headings include child summaries. |\n| [**Tables 📅**](docs/features/ingestion/tabular-data-routing.md) | CSV files or tables within documents | Native PostgreSQL storage with auto-detected schema. Real SQL execution from natural language — not chunked text. |\n| **Images 🖼️** | Figures and diagrams within documents | VLM-powered figure extraction and visual understanding. *(Coming soon)* |\n| **Chunks 🧩** | Any content as fallback | Traditional chunk-based retrieval when structured extraction doesn't apply. Automatic fallback — no configuration needed. |\n\n\u003cbr\u003e\n\n\u003e [!NOTE]\n\u003e All retrieval units share the same retrieval intelligence (temporal handling, comparison queries, multi-hop reasoning, etc.) and the same enrichment pipeline (summaries, keywords, entities, hierarchical summaries).\n\n---\n\n### Retrieval Intelligence\n\nMost RAG implementations are naive vector search—they fail silently on real-world queries. 
Fitz has [built-in intelligence](docs/features/retrieval) that handles edge cases automatically:\n\n\u003cbr\u003e\n\n| Feature | Query | Naive RAG Problem | Fitz Solution |\n|---------|-------|-------------------|------------------|\n| [**epistemic-honesty**](docs/features/governance/epistemic-honesty.md) | \"What was our Q4 revenue?\" | ❌ Hallucinated number — Info doesn't exist, but LLM won't admit it | ✅ \"I don't know\" |\n| [**keyword-vocabulary**](docs/features/retrieval/keyword-vocabulary.md) | \"Find TC_1000\" | ❌ Wrong test case — Embeddings see TC_1000 ≈ TC_2000 (semantically similar) | ✅ Exact keyword matching |\n| [**hybrid-search**](docs/features/retrieval/hybrid-search.md) | \"X100 battery specs\" | ❌ Returns Y200 docs — Semantic search misses exact model numbers | ✅ Hybrid search (dense + sparse) |\n| [**sparse-search**](docs/features/retrieval/sparse-search.md) | \"error code E_AUTH_401\" | ❌ No exact match — Embeddings miss precise error codes | ✅ PostgreSQL full-text search |\n| [**multi-hop**](docs/features/retrieval/multi-hop-reasoning.md) | \"Who wrote the paper cited by the 2023 review?\" | ❌ Returns the review only — Single-step search can't traverse references | ✅ Iterative retrieval |\n| [**hierarchical-rag**](docs/features/ingestion/hierarchical-rag.md) | \"What are the design principles?\" | ❌ Random fragments — Answer is spread across docs; no single chunk contains it | ✅ Hierarchical summaries |\n| [**multi-query**](docs/features/retrieval/multi-query-rag.md) | *[User pastes 500-char test report]* \"What failed and why?\" | ❌ Vaguely related chunks — Long input → averaged embedding → matches nothing specifically | ✅ Multi-query decomposition |\n| [**comparison-queries**](docs/features/retrieval/comparison-queries.md) | \"Compare React vs Vue performance\" | ❌ Incomplete comparison — Only retrieves one entity, missing the other | ✅ Multi-entity retrieval |\n| [**entity-graph**](docs/features/retrieval/entity-graph.md) | \"What else 
mentions AuthService?\" | ❌ Isolated chunks — No awareness of shared entities across docs | ✅ Entity-based linking across sources |\n| [**temporal-queries**](docs/features/retrieval/temporal-queries.md) | \"What changed between Q1 and Q2?\" | ❌ Random chunks — No awareness of time periods in query | ✅ Temporal query handling |\n| [**aggregation-queries**](docs/features/retrieval/aggregation-queries.md) | \"List all the test cases that failed\" | ❌ Partial list — No mechanism for comprehensive retrieval | ✅ Aggregation query handling |\n| [**freshness-authority**](docs/features/retrieval/freshness-authority.md) | \"What does the official spec say?\" | ❌ Returns notes — Can't distinguish authoritative vs informal sources | ✅ Freshness/authority boosting |\n| [**query-expansion**](docs/features/retrieval/query-expansion.md) | \"How do I fetch the db config?\" | ❌ No matches — User says \"fetch\", docs say \"retrieve\"; \"db\" vs \"database\" | ✅ Query expansion |\n| [**query-rewriting**](docs/features/retrieval/query-rewriting.md) | \"Tell me more about it\" *(after discussing TechCorp)* | ❌ Lost context — Pronouns like \"it\" reference nothing, retrieval fails | ✅ Conversational context resolution |\n| [**hyde**](docs/features/retrieval/hyde.md) | \"What's TechCorp's approach to sustainability?\" | ❌ Poor recall — Abstract queries don't embed close to concrete documents | ✅ Hypothetical document generation |\n| [**contextual-embeddings**](docs/features/retrieval/contextual-embeddings.md) | \"When does it expire?\" | ❌ Ambiguous chunk — \"It expires in 24h\" embedded without context; \"it\" = ? | ✅ Summary-prefixed symbol/section embeddings |\n| [**reranking**](docs/features/retrieval/reranking.md) | \"What's the battery warranty?\" | ❌ Imprecise ranking — Vector similarity ≠ true relevance; best answer buried | ✅ Cross-encoder precision |\n\n\u003cbr\u003e\n\n\u003e [!IMPORTANT]\n\u003e These features are **always on**—no configuration needed. 
Fitz automatically detects when to use each capability.\n\n---\n\n### Governance — Know What You Don't Know\n\n[Feature docs](docs/features/governance/governance-benchmarking.md) • [fitz-gov benchmark](https://github.com/yafitzdev/fitz-gov)\n\nMost RAG systems hallucinate confidently. Fitz **measures and enforces** epistemic honesty using a 4-question cascade ML classifier trained on 2,900+ labeled cases from [fitz-gov](https://github.com/yafitzdev/fitz-gov), a benchmark for epistemic honesty.\n\n\u003cbr\u003e\n\n```\n  Query + Retrieved Context\n            │\n            ▼\n  ┌─────────────────────┐\n  │ 5 Constraints       │     Contradiction detection, evidence sufficiency,\n  │ (epistemic sensors) │     causal attribution, answer verification, specific info type\n  └──────────┬──────────┘\n             │ 109 features extracted\n             ▼\n  ┌─────────────────────┐\n  │ Q1: Evidence        │     Is the evidence sufficient?\n  │ sufficient? (ML)    ├───► NO ──► ABSTAIN\n  └──────────┬──────────┘\n             │ YES\n             ▼\n  ┌─────────────────────┐\n  │ Q2: Conflict?       │     Did conflict-aware constraint fire?\n  │ (rule: ca_fired)    ├───► YES ──┐\n  └──────────┬──────────┘           │\n             │ NO                   ▼\n             │            ┌─────────────────────┐\n             │            │ Q3: Conflict        │\n             │            │ resolved? (ML)      ├───► NO ──► DISPUTED\n             │            └──────────┬──────────┘\n             │                       └ YES ────────────────► TRUSTWORTHY\n             ▼\n  ┌─────────────────────┐\n  │ Q4: Evidence truly  │     Is the evidence solid enough?\n  │ solid? 
(ML)         ├───► NO ──► ABSTAIN\n  └──────────┬──────────┘\n             └ YES ────────────────► TRUSTWORTHY\n             \n```\n\n\u003cbr\u003e\n\n| Decision | Meaning                              | Recall    |\n|----------|--------------------------------------|-----------|\n| **ABSTAIN** | Evidence doesn't answer the question | **90.2%** |\n| **DISPUTED** | Sources contradict each other        | **74.9%** |\n| **TRUSTWORTHY** | Consistent, sufficient evidence      | **78.6%** |\n\n**Overall accuracy: 81.3%** on fitz-gov 5.0 (2,910 cases, 5-fold cross-validated, 92% hard difficulty)\n\n\u003cbr\u003e\n\n\u003e [!NOTE]\n\u003e Governance asks \"given three relevant documents that partially contradict each other, should you flag a dispute, hedge the answer, or trust the consensus?\" That's a judgment call even humans disagree on. 92% of our test cases are rated \"hard.\"\n\n\u003cstrong\u003eThe system fails safe 🛡️\u003c/strong\u003e\n\u003e The safety-first threshold is tuned so that when the classifier is wrong, it over-hedges (\"disputed\" instead of \"trustworthy\") — annoying but harmless. Over-confidence (\"trustworthy\" instead of \"disputed\") is the rarest error mode.\n\n\u003cstrong\u003eThese scores are a floor, not a ceiling 👣\u003c/strong\u003e\n\u003e All benchmarks were measured using `qwen2.5:3b` — a 3B parameter local model. The governance constraints run on the fast-tier LLM to keep latency low. Stronger models produce better constraint signals, which feed better features into the classifier. Upgrading your chat provider should improve governance accuracy for free.\n\n\u003cstrong\u003eZero extra latency ⏱️\u003c/strong\u003e\n\u003e The constraints already run as part of the pipeline. 
The ML classifier just replaces hand-coded rules with a local sklearn model — inference takes microseconds, no additional API calls.\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 Quick Start\u003c/strong\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\n#### CLI\n\u003e\n\u003e```bash\n\u003epip install fitz-ai\n\u003e\n\u003efitz query \"Your question here\" --source ./docs\n\u003e```\n\u003e\n\u003eFitz auto-detects your LLM provider:\n\u003e1. **Ollama running?** → Uses it automatically (fully local)\n\u003e2. **`COHERE_API_KEY` or `OPENAI_API_KEY` set?** → Uses it automatically\n\u003e3. **First time?** → Guides you through free Cohere signup (2 minutes)\n\u003e\n\u003eAfter first run, it's completely zero-friction.\n\n\u003cbr\u003e\n\n#### Python SDK\n\u003e\n\u003e```python\n\u003eimport fitz_ai\n\u003e\n\u003efitz_ai.point(\"./docs\")\n\u003eanswer = fitz_ai.query(\"Your question here\")\n\u003e\n\u003eprint(answer.text)\n\u003efor source in answer.provenance:\n\u003e    print(f\"  - {source.source_id}: {source.excerpt[:50]}...\")\n\u003e```\n\u003e\n\u003eThe SDK provides:\n\u003e- Module-level functions matching CLI (`point`, `query`)\n\u003e- Auto-config creation (no setup required)\n\u003e- Full provenance tracking\n\u003e- Same honest retrieval as the CLI\n\u003e\n\u003eFor advanced use (multiple collections), use the `fitz` class directly:\n\u003e```python\n\u003efrom fitz_ai import fitz\n\u003e\n\u003ephysics = fitz(collection=\"physics\")\n\u003ephysics.point(\"./physics_papers\")\n\u003eanswer = physics.query(\"Explain entanglement\")\n\u003e```\n\n\u003cbr\u003e\n\n#### Fully Local (Ollama)\n\u003e\n\u003e```bash\n\u003epip install fitz-ai[local]\n\u003e\n\u003eollama pull llama3.2\n\u003eollama pull nomic-embed-text\n\u003e\n\u003efitz query \"Your question here\" --source ./docs\n\u003e```\n\u003e\n\u003eFitz auto-detects Ollama when running. 
No API keys needed—no data leaves your machine.\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 Real-World Usage\u003c/strong\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\nFitz is a foundation. It handles document indexing and grounded retrieval—you build whatever sits on top: chatbots, dashboards, alerts, or automation.\n\n\u003cbr\u003e\n\n\u003cstrong\u003eChatbot Backend 🤖\u003c/strong\u003e\n\n\u003e Connect fitz to Slack, Discord, Teams, or your own UI. One function call returns an answer with sources—no hallucinations, full provenance. You handle the conversation flow; fitz handles the knowledge.\n\u003e\n\u003e *Example:* A SaaS company plugs fitz into their support bot. Tier-1 questions like \"How do I reset my password?\" get instant answers. Their support team focuses on edge cases while fitz deflects 60% of incoming tickets.\n\n\u003cbr\u003e\n\n\u003cstrong\u003eInternal Knowledge Base 📖\u003c/strong\u003e\n\n\u003e Point fitz at your company's wiki, policies, and runbooks. Employees ask natural language questions instead of hunting through folders or pinging colleagues on Slack.\n\u003e\n\u003e *Example:* A 200-person startup points fitz at their Notion workspace and compliance docs. New hires find answers to \"How do I request PTO?\" on day one—no more waiting for someone in HR to respond.\n\n\u003cbr\u003e\n\n\u003cstrong\u003eContinuous Intelligence \u0026 Alerting (Watchdog) 🐶\u003c/strong\u003e\n\n\u003e Pair fitz with cron, Airflow, or Lambda. Point at data on a schedule, run queries automatically, trigger alerts when conditions match. Fitz provides the retrieval primitive; you wire the automation.\n\u003e\n\u003e *Example:* A security team points fitz at SIEM logs nightly. 
Every morning, a scheduled job asks \"Were there failed logins from unusual locations?\" If fitz finds evidence, an alert fires to the on-call channel before anyone checks email.\n\n\u003cbr\u003e\n\n\u003cstrong\u003eWeb Knowledge Base 🌎\u003c/strong\u003e\n\n\u003e Scrape the web with Scrapy, BeautifulSoup, or Playwright. Save to disk, point fitz at it. The web becomes a queryable knowledge base.\n\u003e\n\u003e *Example:* A football analytics hobbyist scrapes Premier League match reports. They point fitz at the folder and ask \"How did Arsenal perform against top 6 teams?\" or \"What tactics did Liverpool use in away games?\"—insights that would take hours to compile manually.\n\n\u003cbr\u003e\n\n\u003cstrong\u003eCodebase Search 🐍\u003c/strong\u003e → [Code Symbol Extraction](docs/features/ingestion/code-symbol-extraction.md) • [KRAG](docs/features/platform/krag.md)\n\n\u003e FitzKRAG uses address-based retrieval for code: tree-sitter parses your codebase into symbols (functions, classes, methods) with qualified names, references, and import graphs. No chunking—each symbol is a precise, addressable unit. Cross-file dependencies are tracked, so \"what calls this function?\" is a graph traversal, not a text search.\n\u003e\n\u003e *Example:* A team inherits a legacy Django monolith—200k lines, sparse docs. They point fitz at the codebase and ask \"Where is user authentication handled?\" or \"What depends on the billing module?\" FitzKRAG returns specific functions with their callers and dependencies. 
New developers onboard in days instead of weeks.\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 Architecture\u003c/strong\u003e → \u003ca href=\"docs/ARCHITECTURE.md\"\u003eFull Architecture Guide\u003c/a\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\n```\n┌───────────────────────────────────────────────────────────────┐\n│                         fitz-ai                               │\n├───────────────────────────────────────────────────────────────┤\n│  User Interfaces                                              │\n│  CLI: query (--source) | init | collections | config | serve  │\n│  SDK: fitz_ai.point() → fitz_ai.query()                       │\n│  API: /query | /chat | /point | /collections | /health        │\n├───────────────────────────────────────────────────────────────┤\n│  Engines                                                      │\n│  ┌────────────┐  ┌────────────┐                               │\n│  │  FitzKRAG  │  │  Custom... │  (extensible registry)        │\n│  └────────────┘  └────────────┘                               │\n├───────────────────────────────────────────────────────────────┤\n│  LLM Plugins (YAML-defined)                                   │\n│  ┌────────┐ ┌───────────┐ ┌────────┐                          │\n│  │  Chat  │ │ Embedding │ │ Rerank │                          │\n│  └────────┘ └───────────┘ └────────┘                          │\n│  openai, cohere, anthropic, ollama, azure...                  
│\n├───────────────────────────────────────────────────────────────┤\n│  Storage (PostgreSQL + pgvector)                              │\n│  vectors | metadata | tables | keywords | full-text search    │\n├───────────────────────────────────────────────────────────────┤\n│  Retrieval (address-based, baked-in intelligence)             │\n│  symbols | sections | tables | import graphs | reranking      │\n├───────────────────────────────────────────────────────────────┤\n│  Enrichment (baked in)                                        │\n│  summaries | keywords | entities | hierarchical summaries     │\n├───────────────────────────────────────────────────────────────┤\n│  Constraints (epistemic safety)                               │\n│  ConflictAware | InsufficientEvidence | CausalAttribution     │\n└───────────────────────────────────────────────────────────────┘\n```\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 CLI Reference\u003c/strong\u003e → \u003ca href=\"docs/CLI.md\"\u003eFull CLI Guide\u003c/a\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\n```bash\nfitz query \"question\" --source ./docs  # Point at docs and query (start here)\nfitz query \"question\"                  # Query existing collection\nfitz query --chat                      # Multi-turn conversation mode\nfitz init                              # Interactive setup wizard\nfitz collections                       # List and delete knowledge collections\nfitz config                            # View/edit configuration\nfitz serve                             # Start REST API server\nfitz reset                             # Reset pgserver database (when stuck/corrupted)\nfitz eval                              # Evaluation tools\nfitz config --doctor                   # System diagnostics\n```\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 Python SDK Reference\u003c/strong\u003e → \u003ca 
href=\"docs/SDK.md\"\u003eFull SDK Guide\u003c/a\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\n**Simple usage (module-level, matches CLI):**\n```python\nimport fitz_ai\n\nfitz_ai.point(\"./docs\")\nanswer = fitz_ai.query(\"What is the refund policy?\")\nprint(answer.text)\n```\n\n\u003cbr\u003e\n\n**Advanced usage (multiple collections):**\n```python\nfrom fitz_ai import fitz\n\n# Create separate instances for different collections\nphysics = fitz(collection=\"physics\")\nphysics.point(\"./physics_papers\")\n\nlegal = fitz(collection=\"legal\")\nlegal.point(\"./contracts\")\n\n# Query each collection\nphysics_answer = physics.query(\"Explain entanglement\")\nlegal_answer = legal.query(\"What are the payment terms?\")\n```\n\n\u003cbr\u003e\n\n**Working with answers:**\n```python\nanswer = fitz_ai.query(\"What is the refund policy?\")\n\nprint(answer.text)\nprint(answer.mode)  # TRUSTWORTHY, DISPUTED, or ABSTAIN\n\nfor source in answer.provenance:\n    print(f\"Source: {source.source_id}\")\n    print(f\"Excerpt: {source.excerpt}\")\n```\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cstrong\u003e📦 REST API Reference\u003c/strong\u003e → \u003ca href=\"docs/API.md\"\u003eFull API Guide\u003c/a\u003e\u003c/summary\u003e\n\n\u003cbr\u003e\n\n**Start the server:**\n```bash\npip install fitz-ai[api]\n\nfitz serve                    # localhost:8000\nfitz serve -p 3000            # custom port\nfitz serve --host 0.0.0.0     # all interfaces\n```\n\n**Interactive docs:** Visit `http://localhost:8000/docs` for Swagger UI.\n\n\u003cbr\u003e\n\n**Endpoints:**\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| POST | `/query` | Query knowledge base |\n| POST | `/chat` | Multi-turn chat (stateless) |\n| POST | `/point` | Point at folder for indexing |\n| GET | `/collections` | List all collections |\n| GET | `/collections/{name}` | Get collection stats |\n| DELETE | `/collections/{name}` | Delete a collection |\n| 
GET | `/health` | Health check |\n\n\u003cbr\u003e\n\n**Example request:**\n\n```bash\ncurl -X POST http://localhost:8000/query \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"question\": \"What is the refund policy?\", \"collection\": \"default\"}'\n```\n\n\u003c/details\u003e\n\n---\n\n### License\n\nMIT\n\n---\n\n### Links\n\n- [GitHub](https://github.com/yafitzdev/fitz-ai)\n- [PyPI](https://pypi.org/project/fitz-ai/)\n- [Changelog](CHANGELOG.md)\n\n**Documentation:**\n- [CLI Reference](docs/CLI.md)\n- [Python SDK](docs/SDK.md)\n- [REST API](docs/API.md)\n- [Configuration Guide](docs/CONFIG.md)\n- [Architecture](docs/ARCHITECTURE.md)\n- [Unified Storage (PostgreSQL + pgvector)](docs/features/platform/unified-storage.md)\n- [Progressive KRAG \u0026 Agentic Search](docs/features/platform/progressive-krag-agentic-search.md)\n- [Ingestion Pipeline](docs/INGESTION.md)\n- [Enrichment (Hierarchies, Entities)](docs/ENRICHMENT.md)\n- [Epistemic Constraints](docs/CONSTRAINTS.md)\n- [Governance Benchmarking (fitz-gov)](docs/features/governance/governance-benchmarking.md)\n- [Plugin Development](docs/PLUGINS.md)\n- [Feature Control](docs/FEATURE_CONTROL.md)\n- [KRAG — Knowledge Routing Augmented Generation](docs/features/platform/krag.md)\n- [Code Symbol Extraction](docs/features/ingestion/code-symbol-extraction.md)\n- [Tabular Data Routing](docs/features/ingestion/tabular-data-routing.md)\n- [Enterprise Gateway](docs/features/platform/enterprise-gateway.md)\n- [Engines](docs/ENGINES.md)\n- [Configuration Examples](docs/CONFIG_EXAMPLES.md)\n- [Custom Engines](docs/CUSTOM_ENGINES.md)\n- [Troubleshooting](docs/TROUBLESHOOTING.md)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyafitzdev%2Ffitz-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyafitzdev%2Ffitz-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyafitzdev%2Ffitz-ai/lists"}