{"id":47928326,"url":"https://github.com/icemap/sprig","last_synced_at":"2026-04-04T07:03:03.102Z","repository":{"id":336845174,"uuid":"1151319355","full_name":"Icemap/SPRIG","owner":"Icemap","description":null,"archived":false,"fork":false,"pushed_at":"2026-02-06T11:40:50.000Z","size":229,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-06T19:25:37.328Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Icemap.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-06T10:15:03.000Z","updated_at":"2026-02-06T11:40:53.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/Icemap/SPRIG","commit_stats":null,"previous_names":["icemap/sprig"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Icemap/SPRIG","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Icemap%2FSPRIG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Icemap%2FSPRIG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Icemap%2FSPRIG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Icemap%2FSPRIG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Icemap","download_url":"https://codeload.github.com/Icemap/SPRIG/tar
.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Icemap%2FSPRIG/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31390695,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T04:26:24.776Z","status":"ssl_error","status_checked_at":"2026-04-04T04:23:34.147Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-04T07:03:01.508Z","updated_at":"2026-04-04T07:03:03.064Z","avatar_url":"https://github.com/Icemap.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SPRIG Reproduction Guide\n\nThis repository contains the experimental code and scripts for the SPRIG paper. The steps\nbelow reproduce all tables/figures in the main text and appendix (efficiency, ablations,\nsignificance tests, QA evaluation, etc.). CPU-only is assumed.\n\n## 1. Environment \u0026 Install\n\n### Hardware / OS\n- Python \u003e= 3.12\n- CPU-only (paper uses 4 GB RAM budget)\n- Linux/macOS\n\n### Install dependencies\n```bash\npython3 -m venv .venv\nsource .venv/bin/activate\npip install -U pip\npip install -e \".[vector,data,plot,qa,dev]\"\npython3 -m spacy download en_core_web_sm\n```\n\n## 2. 
Datasets\n\nWe use HuggingFace `datasets`:\n- HotpotQA: `hotpot_qa` (config `distractor`)\n- 2WikiMultiHopQA: `framolfese/2WikiMultihopQA`\n\nDefault cache is `~/.cache/huggingface`; override if needed:\n```bash\nexport HF_HOME=/path/to/hf_cache\n```\n\nPaper validation sizes:\n- HotpotQA: 7,405 queries / 66,581 docs\n- 2WikiMultiHopQA: 10,000 queries / 45,902 docs\n\n## 3. Main Results (Tables)\n\nSet a common tag and sample sizes first:\n```bash\nexport TAG=xxx\nexport HOTPOT_N=7405\nexport TWOWIKI_N=10000\n```\n\nChoose your output locations (set any paths you like):\n```bash\nexport HOTPOT_LEX_DIR=\"\u003cpath\u003e\"\nexport HOTPOT_DENSE_DIR=\"\u003cpath\u003e\"\nexport HOTPOT_RRF_DIR=\"\u003cpath\u003e\"\nexport HOTPOT_GRAPH_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_LEX_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_DENSE_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_RRF_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_GRAPH_DIR=\"\u003cpath\u003e\"\n```\n\n### 3.1 HotpotQA (main)\n\n**(a) Lexical baselines**\n```bash\nsprig run --dataset hotpotqa --split validation --max-samples $HOTPOT_N \\\n  --methods bm25 rm3 bm25_2step \\\n  --tag $TAG --output $HOTPOT_LEX_DIR\n```\n\n**(b) Dense baseline (bge-small)**\n```bash\nsprig run --dataset hotpotqa --split validation --max-samples $HOTPOT_N \\\n  --methods dense \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --tag $TAG --output $HOTPOT_DENSE_DIR\n```\n\n**(c) RRF / Rerank**\n```bash\nsprig run --dataset hotpotqa --split validation --max-samples $HOTPOT_N \\\n  --methods rrf rerank rrf_rerank \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --rrf-k 60 \\\n  --tag $TAG --output $HOTPOT_RRF_DIR\n```\n\n**(d) Graph family (GraphHybrid/GraphDense/GraphRRF, etc.)**\n```bash\nsprig run --dataset hotpotqa --split validation --max-samples $HOTPOT_N \\\n  --methods tfidf_graph graph graph_hybrid graph_dense graph_rrf rrf_ppr_fusion graph_bm25_fallback \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy \\\n  
--graph-entity-normalize simple \\\n  --graph-seed-weighting rank \\\n  --graph-hub-penalty 0.5 \\\n  --graph-seed-entity-df-power 0.5 \\\n  --graph-seed-docs-k 5 \\\n  --graph-seed-docs-k-bm25 10 \\\n  --graph-seed-docs-k-rrf 10 \\\n  --graph-fallback-k 1 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n  --term-min-df 3 --term-max-df-ratio 0.1 --term-norm row \\\n  --hnsw-m 32 --hnsw-ef-construction 200 --hnsw-ef-search 64 \\\n  --tag $TAG --output $HOTPOT_GRAPH_DIR\n```\n\n### 3.2 2WikiMultiHopQA (main)\n\n**(a) Lexical baselines**\n```bash\nsprig run --dataset 2wikimultihopqa --split validation --max-samples $TWOWIKI_N \\\n  --methods bm25 rm3 bm25_2step \\\n  --tag $TAG --output $TWOWIKI_LEX_DIR\n```\n\n**(b) Dense baseline (bge-small)**\n```bash\nsprig run --dataset 2wikimultihopqa --split validation --max-samples $TWOWIKI_N \\\n  --methods dense \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --tag $TAG --output $TWOWIKI_DENSE_DIR\n```\n\n**(c) RRF / Rerank**\n```bash\nsprig run --dataset 2wikimultihopqa --split validation --max-samples $TWOWIKI_N \\\n  --methods rrf rerank rrf_rerank \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --rrf-k 60 \\\n  --tag $TAG --output $TWOWIKI_RRF_DIR\n```\n\n**(d) Graph family**\n```bash\nsprig run --dataset 2wikimultihopqa --split validation --max-samples $TWOWIKI_N \\\n  --methods tfidf_graph graph graph_hybrid graph_dense graph_rrf rrf_ppr_fusion graph_bm25_fallback \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy \\\n  --graph-entity-normalize lower \\\n  --graph-seed-weighting rank \\\n  --graph-hub-penalty 0.5 \\\n  --graph-seed-entity-df-power 1.0 \\\n  --graph-seed-docs-k 3 \\\n  --graph-seed-docs-k-bm25 5 \\\n  --graph-seed-docs-k-rrf 5 \\\n  --graph-fallback-k 1 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode power \\\n  --term-min-df 3 --term-max-df-ratio 0.1 --term-norm row \\\n  --hnsw-m 32 --hnsw-ef-construction 200 --hnsw-ef-search 64 \\\n  --tag $TAG --output 
$TWOWIKI_GRAPH_DIR\n```\n\n## 4. Auxiliary Analysis (required before exporting tables)\n\n### 4.1 Merge runs (for QA / significance)\n```bash\nexport HOTPOT_MERGED_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_MERGED_DIR=\"\u003cpath\u003e\"\n\npython3 scripts/merge_runs.py --output $HOTPOT_MERGED_DIR \\\n  --runs $HOTPOT_LEX_DIR $HOTPOT_DENSE_DIR \\\n         $HOTPOT_RRF_DIR $HOTPOT_GRAPH_DIR\n\npython3 scripts/merge_runs.py --output $TWOWIKI_MERGED_DIR \\\n  --runs $TWOWIKI_LEX_DIR $TWOWIKI_DENSE_DIR \\\n         $TWOWIKI_RRF_DIR $TWOWIKI_GRAPH_DIR\n```\n\n### 4.2 QA eval (Appendix QA table)\n```bash\npython3 scripts/run_qa_eval.py --run-dir $HOTPOT_MERGED_DIR \\\n  --dataset hotpotqa --split validation \\\n  --method bm25 rrf rerank graph graph_hybrid graph_dense \\\n  --top-k 5 --limit-queries 1000 --seed 42\n\npython3 scripts/run_qa_eval.py --run-dir $TWOWIKI_MERGED_DIR \\\n  --dataset 2wikimultihopqa --split validation \\\n  --method bm25 rrf rerank graph graph_hybrid graph_dense \\\n  --top-k 5 --limit-queries 1000 --seed 42\n```\n\nCopy `qa_metrics.json` into the main run dirs (table export expects them there):\n```bash\ncp $HOTPOT_MERGED_DIR/qa_metrics.json $HOTPOT_GRAPH_DIR/qa_metrics.json\ncp $TWOWIKI_MERGED_DIR/qa_metrics.json $TWOWIKI_GRAPH_DIR/qa_metrics.json\n```\n\n### 4.3 NER proxy (Appendix NER table)\n```bash\npython3 scripts/ner_proxy_eval.py --run-dir $HOTPOT_GRAPH_DIR \\\n  --method graph graph_hybrid --ner-mode spacy regex --entity-normalize simple\n\npython3 scripts/ner_proxy_eval.py --run-dir $TWOWIKI_GRAPH_DIR \\\n  --method graph graph_hybrid --ner-mode spacy regex --entity-normalize lower\n```\n\n### 4.4 Hub pruning coverage (Appendix Hub table)\n```bash\npython3 scripts/hub_pruning_analysis.py --dataset hotpotqa --split validation --max-samples $HOTPOT_N \\\n  --graph-ner spacy --graph-entity-normalize simple \\\n  --graph-hub-top-ratio 0.01 \\\n  --output $HOTPOT_GRAPH_DIR/hub_pruning.json\n\npython3 scripts/hub_pruning_analysis.py 
--dataset 2wikimultihopqa --split validation --max-samples $TWOWIKI_N \\\n  --graph-ner spacy --graph-entity-normalize lower \\\n  --graph-hub-top-ratio 0.01 \\\n  --output $TWOWIKI_GRAPH_DIR/hub_pruning.json\n```\n\n## 5. Export paper tables\n\nGenerate summary CSV + LaTeX tables:\n```bash\npython3 scripts/summarize_results.py\npython3 scripts/export_paper_tables.py --tag $TAG --supp-tag $TAG --ann-tag $TAG --dense-tag $TAG\n```\n\n## 6. Significance tests (Appendix Significance tables)\n\n```bash\nexport SIG_HOTPOT_JSON=\"\u003cpath\u003e\"\nexport SIG_TWOWIKI_JSON=\"\u003cpath\u003e\"\nexport SIG_TABLE_DIR=\"\u003cpath\u003e\"\n\npython3 scripts/bootstrap_significance_all.py --run-dir $HOTPOT_MERGED_DIR \\\n  --baseline bm25 rrf --k 10 --iters 1000 \\\n  --output $SIG_HOTPOT_JSON\n\npython3 scripts/bootstrap_significance_all.py --run-dir $TWOWIKI_MERGED_DIR \\\n  --baseline bm25 rrf --k 10 --iters 1000 \\\n  --output $SIG_TWOWIKI_JSON\n\npython3 scripts/export_significance_tables.py \\\n  --input $SIG_HOTPOT_JSON \\\n  --output $SIG_TABLE_DIR/significance_hotpotqa_bm25.tex \\\n  --baseline bm25 --metric recall --top 10\n\npython3 scripts/export_significance_tables.py \\\n  --input $SIG_HOTPOT_JSON \\\n  --output $SIG_TABLE_DIR/significance_hotpotqa_rrf.tex \\\n  --baseline rrf --metric recall --top 10\n\npython3 scripts/export_significance_tables.py \\\n  --input $SIG_TWOWIKI_JSON \\\n  --output $SIG_TABLE_DIR/significance_2wiki_bm25.tex \\\n  --baseline bm25 --metric recall --top 10\n\npython3 scripts/export_significance_tables.py \\\n  --input $SIG_TWOWIKI_JSON \\\n  --output $SIG_TABLE_DIR/significance_2wiki_rrf.tex \\\n  --baseline rrf --metric recall --top 10\n```\n\n## 7. 
Efficiency \u0026 Scalability (Figure + Appendix table)\n\n```bash\npython3 scripts/run_efficiency.py --dataset hotpotqa --split validation \\\n  --sizes 200 1000 3000 7405 \\\n  --methods bm25 dense graph graph_dense \\\n  --graph-ner spacy --graph-entity-normalize simple \\\n  --graph-seed-docs-k 5 --graph-seed-docs-k-bm25 10 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 \\\n  --graph-seed-entity-df-power 0.5 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --tag eff2\n\npython3 scripts/run_efficiency.py --dataset 2wikimultihopqa --split validation \\\n  --sizes 200 1000 3000 10000 \\\n  --methods bm25 dense graph graph_dense \\\n  --graph-ner spacy --graph-entity-normalize lower \\\n  --graph-seed-docs-k 3 --graph-seed-docs-k-bm25 5 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 \\\n  --graph-seed-entity-df-power 1.0 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode power \\\n  --dense-model BAAI/bge-small-en-v1.5 \\\n  --tag eff2\n\npython3 scripts/plot_efficiency.py\n```\n\nPer-doc table (choose any output path):\n```bash\npython3 scripts/efficiency_per_doc.py --summary-csv \u003cpath\u003e --tag eff2 --output \u003cpath\u003e\n```\n\n## 8. 
Ablations\n\n### 8.1 Graph/PPR ablations (500-query subsets)\n```bash\nexport ABLATION_HOTPOT_DIR=\"\u003cpath\u003e\"\nexport ABLATION_TWOWIKI_DIR=\"\u003cpath\u003e\"\nexport ABLATION_PLOT_DIR=\"\u003cpath\u003e\"\n\npython3 scripts/run_ablation.py --dataset hotpotqa --split validation \\\n  --max-samples 2000 --limit-queries 500 \\\n  --seed-source dense --ner spacy regex \\\n  --seed-k 1 3 5 10 \\\n  --seed-weighting raw softmax rank \\\n  --alpha 0.1 0.15 0.2 --iter 5 10 20 \\\n  --ppr-mode power push \\\n  --output $ABLATION_HOTPOT_DIR\n\npython3 scripts/run_ablation.py --dataset 2wikimultihopqa --split validation \\\n  --max-samples 2000 --limit-queries 500 \\\n  --seed-source dense --ner spacy regex \\\n  --seed-k 1 3 5 10 \\\n  --seed-weighting raw softmax rank \\\n  --alpha 0.1 0.15 0.2 --iter 5 10 20 \\\n  --ppr-mode power push \\\n  --output $ABLATION_TWOWIKI_DIR\n\npython3 scripts/plot_ablation.py --inputs \\\n  $ABLATION_HOTPOT_DIR/ablation_results.csv \\\n  $ABLATION_TWOWIKI_DIR/ablation_results.csv \\\n  --output $ABLATION_PLOT_DIR\n```\n\n### 8.2 TF-IDF Term Graph ablation\n```bash\nexport ABLATION_TERM_HOTPOT_DIR=\"\u003cpath\u003e\"\nexport ABLATION_TERM_TWOWIKI_DIR=\"\u003cpath\u003e\"\n\npython3 scripts/run_term_graph_ablation.py --dataset hotpotqa --split validation \\\n  --max-samples 2000 --limit-queries 500 \\\n  --min-df 3 5 10 --max-df-ratio 0.1 0.2 0.3 \\\n  --output $ABLATION_TERM_HOTPOT_DIR\n\npython3 scripts/run_term_graph_ablation.py --dataset 2wikimultihopqa --split validation \\\n  --max-samples 2000 --limit-queries 500 \\\n  --min-df 3 5 10 --max-df-ratio 0.1 0.2 0.3 \\\n  --output $ABLATION_TERM_TWOWIKI_DIR\n```\n\n### 8.3 Ablation Top-10 tables\n`run_ablation.py` writes `ablation_top10.json`; convert to LaTeX:\n```bash\npython3 scripts/export_ablation_top10.py --input \u003cpath\u003e --output \u003cpath\u003e\n```\n\n## 9. 
Dense Seeding / ANN / Model Sensitivity\n\nChoose output locations for the runs below:\n```bash\nexport HOTPOT_GRAPHDENSE_2K_EXACT_DIR=\"\u003cpath\u003e\"\nexport HOTPOT_GRAPHDENSE_2K_ANN_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_GRAPHDENSE_2K_EXACT_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_GRAPHDENSE_2K_ANN_DIR=\"\u003cpath\u003e\"\nexport HOTPOT_GRAPHDENSE_HNSW_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_GRAPHDENSE_HNSW_DIR=\"\u003cpath\u003e\"\nexport HOTPOT_DENSE_SENS_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_DENSE_SENS_DIR=\"\u003cpath\u003e\"\n```\n\n### 9.1 GraphDense: Exact vs ANN (2k subset)\n```bash\nsprig run --dataset hotpotqa --split validation --max-samples 2000 \\\n  --methods graph_dense --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy --graph-entity-normalize simple \\\n  --graph-seed-docs-k 5 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 0.5 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n  --dense-no-hnsw \\\n  --tag $TAG --output $HOTPOT_GRAPHDENSE_2K_EXACT_DIR\n\nsprig run --dataset hotpotqa --split validation --max-samples 2000 \\\n  --methods graph_dense --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy --graph-entity-normalize simple \\\n  --graph-seed-docs-k 5 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 0.5 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n  --hnsw-m 32 --hnsw-ef-construction 200 --hnsw-ef-search 64 \\\n  --tag $TAG --output $HOTPOT_GRAPHDENSE_2K_ANN_DIR\n\nsprig run --dataset 2wikimultihopqa --split validation --max-samples 2000 \\\n  --methods graph_dense --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy --graph-entity-normalize lower \\\n  --graph-seed-docs-k 3 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 1.0 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode power \\\n  --dense-no-hnsw \\\n  --tag $TAG --output 
$TWOWIKI_GRAPHDENSE_2K_EXACT_DIR\n\nsprig run --dataset 2wikimultihopqa --split validation --max-samples 2000 \\\n  --methods graph_dense --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy --graph-entity-normalize lower \\\n  --graph-seed-docs-k 3 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 1.0 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode power \\\n  --hnsw-m 32 --hnsw-ef-construction 200 --hnsw-ef-search 64 \\\n  --tag $TAG --output $TWOWIKI_GRAPHDENSE_2K_ANN_DIR\n```\n\n### 9.2 HNSW grid (2k subset)\n```bash\nfor m in 16 32 64; do\n  for efs in 32 64 128 256; do\n    sprig run --dataset hotpotqa --split validation --max-samples 2000 \\\n      --methods graph_dense --dense-model BAAI/bge-small-en-v1.5 \\\n      --graph-ner spacy --graph-entity-normalize simple \\\n      --graph-seed-docs-k 5 \\\n      --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 0.5 \\\n      --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n      --hnsw-m $m --hnsw-ef-construction 200 --hnsw-ef-search $efs \\\n      --tag $TAG --output $HOTPOT_GRAPHDENSE_HNSW_DIR/m${m}_efs${efs}\n\n    sprig run --dataset 2wikimultihopqa --split validation --max-samples 2000 \\\n      --methods graph_dense --dense-model BAAI/bge-small-en-v1.5 \\\n      --graph-ner spacy --graph-entity-normalize lower \\\n      --graph-seed-docs-k 3 \\\n      --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 1.0 \\\n      --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode power \\\n      --hnsw-m $m --hnsw-ef-construction 200 --hnsw-ef-search $efs \\\n      --tag $TAG --output $TWOWIKI_GRAPHDENSE_HNSW_DIR/m${m}_efs${efs}\n  done\ndone\n```\n\n### 9.3 Dense model sensitivity (2k subset)\n```bash\nfor model in BAAI/bge-small-en-v1.5 sentence-transformers/all-MiniLM-L6-v2 intfloat/e5-small-v2; do\n  sprig run --dataset hotpotqa --split validation --max-samples 2000 \\\n    --methods dense graph_dense 
--dense-model $model \\\n    --graph-ner spacy --graph-entity-normalize simple \\\n    --graph-seed-docs-k 5 \\\n    --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 0.5 \\\n    --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n    --tag $TAG --output $HOTPOT_DENSE_SENS_DIR/${model##*/}\n\n  sprig run --dataset 2wikimultihopqa --split validation --max-samples 2000 \\\n    --methods dense graph_dense --dense-model $model \\\n    --graph-ner spacy --graph-entity-normalize lower \\\n    --graph-seed-docs-k 3 \\\n    --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 1.0 \\\n    --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode power \\\n    --tag $TAG --output $TWOWIKI_DENSE_SENS_DIR/${model##*/}\ndone\n```\n\n## 10. SPRIG-EL/PRUNE/MIX Enhancements\n\nGraphHybrid comparison on full and q1000:\n- Base: no enhancements\n- +EL: `--graph-use-aliases`\n- +PRUNE: `--graph-hub-top-ratio 0.01`\n- +MIX: `--graph-seed-mix-mode auto`\n- +ALL: all three combined\n\nChoose output locations for enhancement runs:\n```bash\nexport HOTPOT_ENH_BASE_DIR=\"\u003cpath\u003e\"\nexport HOTPOT_ENH_ALL_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_ENH_BASE_DIR=\"\u003cpath\u003e\"\nexport TWOWIKI_ENH_ALL_DIR=\"\u003cpath\u003e\"\n```\n\nExample (HotpotQA, full):\n```bash\nsprig run --dataset hotpotqa --split validation --max-samples $HOTPOT_N \\\n  --methods graph_hybrid --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy --graph-entity-normalize simple \\\n  --graph-seed-docs-k 5 --graph-seed-docs-k-bm25 10 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 0.5 \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n  --tag $TAG --output $HOTPOT_ENH_BASE_DIR\n\nsprig run --dataset hotpotqa --split validation --max-samples $HOTPOT_N \\\n  --methods graph_hybrid --dense-model BAAI/bge-small-en-v1.5 \\\n  --graph-ner spacy --graph-entity-normalize simple \\\n  --graph-seed-docs-k 5 
--graph-seed-docs-k-bm25 10 \\\n  --graph-seed-weighting rank --graph-hub-penalty 0.5 --graph-seed-entity-df-power 0.5 \\\n  --graph-use-aliases --graph-hub-top-ratio 0.01 --graph-seed-mix-mode auto \\\n  --ppr-alpha 0.15 --ppr-max-iter 5 --ppr-mode push \\\n  --tag $TAG --output $HOTPOT_ENH_ALL_DIR\n```\n\nThe other variants (+EL, +PRUNE, +MIX) are produced by adding `--graph-use-aliases`,\n`--graph-hub-top-ratio 0.01`, or `--graph-seed-mix-mode auto`, respectively, to the Base command.\nFor 2Wiki, use the parameters from Section 3.2 (e.g., `--graph-entity-normalize lower`,\n`--graph-seed-docs-k 3`, `--graph-seed-docs-k-bm25 5`).\nThe q1000 runs use the same commands with `--max-samples 1000`.\n\nSummarize GraphHybrid R@10/QTime into a table:\n```bash\npython3 scripts/summarize_graph_enhancements.py \\\n  --hotpot-base \u003cpath\u003e \\\n  --hotpot-all \u003cpath\u003e \\\n  --twowiki-base \u003cpath\u003e \\\n  --twowiki-all \u003cpath\u003e \\\n  --output \u003cpath\u003e\n```\n\n## 11. Robustness (w/o tune)\n\nThe command below removes the 500 tuning queries from the full validation set:\n```bash\npython3 scripts/robustness_no_tune.py \\\n  --hotpot-run \u003cpath\u003e --hotpot-n 7405 \\\n  --twowiki-run \u003cpath\u003e --twowiki-n 10000 \\\n  --output \u003cpath\u003e\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficemap%2Fsprig","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ficemap%2Fsprig","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficemap%2Fsprig/lists"}