{"id":31796190,"url":"https://github.com/graph-com/haystackcraft","last_synced_at":"2026-04-15T20:03:01.932Z","repository":{"id":318568570,"uuid":"1069340213","full_name":"Graph-COM/HaystackCraft","owner":"Graph-COM","description":"Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation","archived":false,"fork":false,"pushed_at":"2025-10-07T22:44:56.000Z","size":3243,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-08T00:27:23.825Z","etag":null,"topics":["agent","benchmark","context-engineering","deep-research","llm","long-context","rag","retrieval"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Graph-COM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-03T19:20:47.000Z","updated_at":"2025-10-07T22:44:59.000Z","dependencies_parsed_at":"2025-10-08T00:27:25.921Z","dependency_job_id":"275bd215-0d45-474a-8609-9c77d60dfdfb","html_url":"https://github.com/Graph-COM/HaystackCraft","commit_stats":null,"previous_names":["graph-com/haystackcraft"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Graph-COM/HaystackCraft","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHaystackCraft","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHaystackCraft/ta
gs","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHaystackCraft/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHaystackCraft/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Graph-COM","download_url":"https://codeload.github.com/Graph-COM/HaystackCraft/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHaystackCraft/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279005270,"owners_count":26083863,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","benchmark","context-engineering","deep-research","llm","long-context","rag","retrieval"],"created_at":"2025-10-10T20:30:08.497Z","updated_at":"2025-10-10T20:30:11.976Z","avatar_url":"https://github.com/Graph-COM.png","language":"Python","readme":"# Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation\n\n[![arXiv](https://img.shields.io/badge/📄arXiv-2510.07414-ff6b6b?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white\u0026labelColor=1a1a2e)](https://arxiv.org/abs/2510.07414)\n\n![fig](theme_figure.png)\n\n## Table of Contents\n\n- [Environment Setup](#environment-setup)\n- [Static 
NIAH with Heterogeneous Retrieval Strategies](#static-niah-with-heterogeneous-retrieval-strategies)\n- [Dynamic NIAH](#dynamic-niah)\n    * [Retrieval Environment Setup](#retrieval-environment-setup)\n        + [BM25](#bm25)\n        + [qwen3_0.6](#qwen3_06)\n    * [LLM Inference (Enforced Multi-Round)](#llm-inference-enforced-multi-round)\n    * [LLM Inference (Variable-Round)](#llm-inference-variable-round)\n    * [Evaluation](#evaluation)\n- [Citation](#citation)\n\n## Environment Setup\n\n```bash\nconda create -n HaystackCraft python=3.10 -y\nconda activate HaystackCraft\npip install -r requirements.txt\n```\n\nIf you have trouble running Qwen2.5-1M models, you may create a separate environment with `requirements_0-7-2.txt`.\n\nIf you need to evaluate models from OpenAI, specify your OpenAI API key with\n\n```bash\nexport OPENAI_API_KEY=...\n```\n\nIf you need to evaluate Gemini models, specify your Gemini API key with\n\n```bash\nexport GEMINI_API_KEY=...\n```\n\n## Static NIAH with Heterogeneous Retrieval Strategies\n\nFor access to certain open-source LLMs, you may need to first specify your Hugging Face token with `export HUGGING_FACE_HUB_TOKEN=...`.\n\nWe use vLLM for serving open-source LLMs, e.g.,\n\n```bash\nvllm serve meta-llama/Llama-3.1-8B-Instruct --api-key token-abc123 --gpu-memory-utilization 0.95 --trust-remote-code --port 8000\n```\n\nFor LLM inference, run\n\n```bash\npython infer_static.py --llm MODEL_TO_EVALUATE --retriever RETRIEVER_FOR_HAYSTACK_CONSTRUCTION --context_size TARGET_CONTEXT_SIZE --order HAYSTACK_ORDERING\n```\n\nAdditionally specify `--ppr` for graph-based reranking with Personalized PageRank (PPR) in haystack construction.\n\nFor inference with locally deployed open-source LLMs, specify the port you use in vLLM deployment, e.g., `--port 8000`.\n\nFor evaluation, run, for example,\n\n```bash\npython eval.py --result_dir results/bm25/Llama-3.1-8B-Instruct/8000/descending_order/\n```\n\n## Dynamic NIAH\n\n### Retrieval Environment 
Setup\n\n#### BM25\n\nInstall Java 21 with, for example,\n\n```bash\ncurl -s \"https://get.sdkman.io\" | bash\nsource \"$HOME/.sdkman/bin/sdkman-init.sh\"\nsdk install java 21.0.3-tem\n```\n\n#### qwen3_0.6\n\nDeploy a local embedding server with vLLM.\n\n```bash\nvllm serve Qwen/Qwen3-Embedding-0.6B --port QWEN_RETRIEVER_EMB_PORT --api-key token-abc123 --gpu-memory-utilization 0.95 --trust-remote-code --enforce-eager\n```\n\n### LLM Inference (Enforced Multi-Round)\n\n```bash\npython infer_multi.py --llm MODEL_TO_EVALUATE --retriever RETRIEVER_FOR_HAYSTACK_CONSTRUCTION --context_size TARGET_CONTEXT_SIZE --num_rounds NUM_REASONING_ROUNDS\n```\n\nAdditional args:\n- `--port`: For inference with locally deployed open-source LLMs, specify the port you use in vLLM deployment, e.g., `--port 8000`.\n- `--emb_port`: If you use `Qwen3-Embedding-0.6B` for haystack construction, specify the `QWEN_RETRIEVER_EMB_PORT` used above.\n- `--ppr`: Specify `--ppr` for graph-based reranking with Personalized PageRank (PPR) in haystack construction.\n\n### LLM Inference (Variable-Round)\n\n```bash\npython infer_variable.py --llm MODEL_TO_EVALUATE --retriever RETRIEVER_FOR_HAYSTACK_CONSTRUCTION --context_size TARGET_CONTEXT_SIZE --max_rounds MAX_REASONING_ROUNDS\n```\n\nAdditional args:\n- `--port`: For inference with locally deployed open-source LLMs, specify the port you use in vLLM deployment, e.g., `--port 8000`.\n- `--emb_port`: If you use `Qwen3-Embedding-0.6B` for haystack construction, specify the `QWEN_RETRIEVER_EMB_PORT` used above.\n- `--ppr`: Specify `--ppr` for graph-based reranking with Personalized PageRank (PPR) in haystack construction.\n\n### Evaluation\n\nFor example,\n\n```bash\npython eval_100.py --result_dir 2_round_results/qwen3_0.6/gemini-2.5-flash-lite/8000/descending_order\n```\n\n## Citation\n\n```bibtex\n@article{li2025haystack,\n  title={Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation},\n  author={Mufei Li and Dongqi Fu and 
Limei Wang and Si Zhang and Hanqing Zeng and Kaan Sancak and Ruizhong Qiu and Haoyu Wang and Xiaoxin He and Xavier Bresson and Yinglong Xia and Chonglin Sun and Pan Li},\n  journal={arXiv preprint arXiv:2510.07414},\n  year={2025}\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraph-com%2Fhaystackcraft","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgraph-com%2Fhaystackcraft","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraph-com%2Fhaystackcraft/lists"}