{"id":50885006,"url":"https://github.com/hanxiao/knowledge-graph-extractor","last_synced_at":"2026-06-15T16:04:26.630Z","repository":{"id":361045187,"uuid":"1251648836","full_name":"hanxiao/knowledge-graph-extractor","owner":"hanxiao","description":"Turn any document or a whole zip into an interactive knowledge graph, using a self-hosted Qwen3.6-35B-A3B-MTP on a single NVIDIA L4","archived":false,"fork":false,"pushed_at":"2026-06-12T05:01:39.000Z","size":5519,"stargazers_count":2,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-06-12T05:06:32.762Z","etag":null,"topics":["fastapi","force-graph","gpu","information-extraction","knowledge-graph","llama-cpp","llm","qwen","self-hosted"],"latest_commit_sha":null,"homepage":"https://hanxiao.io/knowledge-graph","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hanxiao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-27T19:35:04.000Z","updated_at":"2026-06-12T05:01:43.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hanxiao/knowledge-graph-extractor","commit_stats":null,"previous_names":["hanxiao/ki-extractor","hanxiao/knowledge-graph-extractor"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hanxiao/knowledge-graph-extractor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hanxiao%2Fknowledge-graph-extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hanxiao%2Fknowledge-graph-extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hanxiao%2Fknowledge-graph-extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hanxiao%2Fknowledge-graph-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hanxiao","download_url":"https://codeload.github.com/hanxiao/knowledge-graph-extractor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hanxiao%2Fknowledge-graph-extractor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34369850,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","force-graph","gpu","information-extraction","knowledge-graph","llama-cpp","llm","qwen","self-hosted"],"created_at":"2026-06-15T16:04:23.461Z","updated_at":"2026-06-15T16:04:26.622Z","avatar_url":"https://github.com/hanxiao.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Knowledge Graph Extractor\n\nTurn any document, URL, or a zip of files into an interactive knowledge graph,\nusing a self-hosted LLM (Qwen3.6-35B-A3B-MTP) on a single NVIDIA L4.\n\nLive demo: https://hanxiao.io/knowledge-graph\n\n[![Knowledge Graph Extractor](assets/hero.png)](https://hanxiao.io/knowledge-graph)\n\nEach extracted fact is one graph edge: a `(subject) --[predicate]--\u003e (object)`\ntriple plus a title, description, evidence span, confidence, tags, and source\nfile. Facts stream into a force-directed graph; hover an edge for the full card.\n\n## How it works\n\n1. **Input** — paste text, a URL (fetched to markdown via Jina Reader), or a\n   `.zip` (txt, md, html, pdf, docx, json, csv, code...). Oversized docs are\n   chunked (not truncated) so the full text is processed.\n2. **Extract** — the LLM emits atomic `(subject, predicate, object)` triples.\n   The prompt forces canonical entity/value subjects and objects so nodes\n   connect instead of becoming prose dead-ends.\n3. **Dedup** (on by default) — semantic dedup via\n   [jina-embeddings-v5-text-nano](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano)\n   on CPU, across rounds and across files.\n4. **Visualize** — every unique fact is one edge; node names are normalized so\n   variants merge. Download the result as JSONL.\n\n## Job queue\n\nThe L4 has one llama slot, so jobs run one at a time via a single-slot scheduler:\na new submission preempts the running job, which is persisted and auto-resumes\nfrom where it left off when the slot frees. Jobs (meta + facts.jsonl + input)\npersist under `data/jobs/` so the list, JSONL reload, and resume survive\nrestarts.\n\n## Stack\n\n- **llama-server** — [llama.cpp](https://github.com/ggml-org/llama.cpp) with\n  CUDA, serves the model over an OpenAI-compatible API (port 8080).\n- **app** — FastAPI: extraction + scheduler + CPU dedup + UI (port 3000).\n\n## Setup\n\nSingle NVIDIA L4 24GB GPU (e.g. GCP `g2-standard-8`). Needs Docker + the NVIDIA\nContainer Toolkit.\n\n```bash\ngit clone https://github.com/hanxiao/knowledge-graph-extractor.git\ncd knowledge-graph-extractor\n\ncp .env.example .env          # add your JINA_API_KEY (https://jina.ai/api-key)\nbash scripts/setup.sh         # downloads the model (~17GB) and starts both services\n```\n\nThen open `http://\u003cyour-ip\u003e:3000`.\n\nManual model download + run:\n\n```bash\nmkdir -p models\npip install -q huggingface-hub\npython3 -c \"from huggingface_hub import hf_hub_download; \\\nhf_hub_download('unsloth/Qwen3.6-35B-A3B-MTP-GGUF', \\\n'Qwen3.6-35B-A3B-UD-Q3_K_XL.gguf', local_dir='models')\"\ndocker compose up -d --build\n```\n\n## Configuration\n\nllama-server flags live in `docker-compose.yml`. Key ones:\n\n| Flag | Value | Why |\n|------|-------|-----|\n| `--ctx-size` | 16384 | Input capacity vs VRAM |\n| `--spec-type draft-mtp` | — | MTP speculative decoding (large speedup on L4) |\n| `--cache-reuse` | 256 | KV cache reuse across rounds on the same doc |\n| `--flash-attn` | 1 | Flash attention |\n| `--n-predict` | 8192 | Max generation length |\n\nUI parameters: rounds per doc, dedup model (on/off), dedup field, dedup\nthreshold. Benchmark notes on quantization and decoding live in\n[`autoresearch/`](autoresearch/REPORT.md).\n\n## Layout\n\n```\napp.py             FastAPI app: extraction + UI + API\njobs.py            single-slot job scheduler (queue/preempt/backfill/persist)\nDockerfile         app container\ndocker-compose.yml both services + data volume\nscripts/setup.sh   one-shot GCP L4 setup\nautoresearch/      throughput benchmark notes\ndata/              persisted jobs (gitignored)\nmodels/            model files (gitignored)\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhanxiao%2Fknowledge-graph-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhanxiao%2Fknowledge-graph-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhanxiao%2Fknowledge-graph-extractor/lists"}