{"id":22814802,"url":"https://github.com/seonglae/resrer","last_synced_at":"2026-01-30T14:16:35.727Z","repository":{"id":233747514,"uuid":"703546424","full_name":"seonglae/ReSRer","owner":"seonglae","description":"Retriever, Summarizer, Reader for LLM ODQA(Open-Domain Question Answering) to increase Information Density","archived":false,"fork":false,"pushed_at":"2025-12-06T11:33:11.000Z","size":516,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-12-10T05:48:47.312Z","etag":null,"topics":["context-compression","llm","odqa","qa","question-answering","summarizer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/seonglae.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-11T12:52:46.000Z","updated_at":"2025-12-06T11:33:08.000Z","dependencies_parsed_at":"2024-05-06T14:31:39.042Z","dependency_job_id":"edc32be6-2edf-4b47-bf4e-9f3cc2e1a39a","html_url":"https://github.com/seonglae/ReSRer","commit_stats":null,"previous_names":["seonglae/resrer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/seonglae/ReSRer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seonglae%2FReSRer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seonglae%2FReSRer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seonglae%2FReSRer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seong
lae%2FReSRer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/seonglae","download_url":"https://codeload.github.com/seonglae/ReSRer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seonglae%2FReSRer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28913955,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-30T12:13:43.263Z","status":"ssl_error","status_checked_at":"2026-01-30T12:13:22.389Z","response_time":66,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["context-compression","llm","odqa","qa","question-answering","summarizer"],"created_at":"2024-12-12T13:10:41.116Z","updated_at":"2026-01-30T14:16:35.713Z","avatar_url":"https://github.com/seonglae.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ReSRer (Retriever, Summarizer, Reader)\nReducing the context text size and increasing QA score simultaneously for ODQA(Open-Domain Question Answering) by raising Information Density\n\n\n## Abstract\nLarge Language Models (LLMs) demonstrate strong performance in various tasks like Question Answering and Reasoning. However, due to the nature of the Transformer structure, they are limited to considering only a restricted length of context. 
Despite recent attempts to extend context using techniques like Sparse Attention, there is a lack of research on context shortening. Reducing context while maintaining the same performance is computationally efficient and particularly effective at removing noise unrelated to the query from documents retrieved in Retrieval-Augmented Generation (RAG). [Summarization is a task that creates a shorter version of text while preserving its principal information content](https://aclanthology.org/N18-1158/). We propose the ReSRer architecture, which inserts a Summarizer model between the Retriever and the Reader of the traditional Retriever-Reader architecture in an Open-Domain Question Answering system. This approach can be combined with various Readers and Retrievers and improves overall QA pipeline performance.\n\n\n# Results\n## Demo on Huggingface Space\n- [Demo](https://huggingface.co/spaces/seonglae/resrer-demo) in Huggingface Space\n\n\u003ca href=\"https://huggingface.co/spaces/seonglae/resrer-demo\"\u003e\n  \u003cimg style=\"width: 75%\" src=\"image/image.png\" alt=\"ReSRer Demo\" /\u003e\n\u003c/a\u003e\n\n\n\n\n## Score results\n[Total score results](https://huggingface.co/datasets/seonglae/nq_open-validation)\n\n### Exact Match Increase Along Top-k Increase\n\u003cimg style=\"width: 75%\" src=\"https://github.com/seonglae/ReSRer/assets/27716524/ba5a6751-1091-498f-9807-ca431cb792d5\" alt=\"ReSRer Demo\" /\u003e\n\n### Exact Match Shrinking Along QA Pipeline\n\u003cimg style=\"width: 75%\" src=\"https://github.com/seonglae/ReSRer/assets/27716524/82a15456-0cda-4a67-a2de-3ba3c3505fbb\" alt=\"ReSRer Demo\" /\u003e\n\n### Token Count Changes Along Top-k Changing\n\u003cimg style=\"width: 75%\" src=\"https://github.com/seonglae/ReSRer/assets/27716524/3f518759-8687-4675-becf-c5df1d785651\" alt=\"Token count\" /\u003e\n\n\n# Prompt\nWe mainly focused on the NQ (Natural Questions) dataset this time.\n\n## Reader prompt for NQ\n```\nExtract a concise noun-based answer from the provided 
context for the question. Your answer should be under three words and extracted directly from a context of no more than five words. You can analyze the context step by step to derive the answer. Avoid using prefixes that indicate the type of answer; simply present the shortest relevant answer span from the context.\n```\n## Summarizer prompt for NQ\nWe did several \n```\nCondense the provided passages to focus on key elements directly answering the question. Your summary should be a third of the original passages' length and at least 150 words. Highlight critical information and evidence supporting the answer. Avoid generalizations or unrelated details. Ensure the final answer is present in the summary, keeping the exact span of the answer to under five words. Present the summary in a clear, bullet-point format for each key element related to the question. Aim for a balance between conciseness and completeness.\n```\n\n## Models\nModels for the ReSRer reader on Huggingface, trained on the [55k Training Dataset](https://huggingface.co/datasets/seonglae/resrer-nq) generated from GPT-3 with the below prompt.\nOur main goal was not to train a small summarizer model, but rather to show that a summarizer module between the retriever and reader is an efficient method. So, we did not delve into training with the most recent summarizer prompt dataset. Therefore, this model's performance is not as good as with the original context (though still better than a native summarizer). We disclose this because it might be helpful for people who want to reduce computing costs dramatically.\n- [PegasusX](https://huggingface.co/seonglae/resrer-pegasus-x)\n- [Bart](https://huggingface.co/seonglae/resrer-bart-base)\n\n\n# Contribution\nAs mentioned earlier, our research was aimed at exploring the potential benefits of an effective abstractive summarizer for QA tasks. Initially, we planned to test this approach. 
However, given the significant advancements made by SuRe and LLMLingua in this domain, we decided to halt our research.\n\nAlthough our improvement of 4% (nearly a 20% relative gain over the original score) may not seem impressive, we demonstrate that a single summarizer module can effectively handle simple tasks such as single-hop question answering, in contrast to more complex multi-step approaches. However, we were unable to confirm whether this single-step context pruning is effective for more intricate tasks like reasoning and code generation. Therefore, there may be room for further contributions in this area in the future.\n\n# Get Started\n\n\n### 1. Install dependencies\n\n```bash\ngit clone https://github.com/seonglae/ReSRer\ncd ReSRer\nrye sync\n# or\npip install .\n# for training\npip install git+https://github.com/NVIDIA/TransformerEngine.git@stable\npip install --force-reinstall typing-extensions==4.5.0\npip uninstall deepspeed\npip install deepspeed\npip uninstall -y apex\n```\n\n### 2. Create .env\n\n```bash\nMILVUS_PW=\nMILVUS_HOST=resrer\n```\n\n### 3. QA pipeline\n\n```bash\npython qa_pipeline.py\n```\n\n# Index to Vector DB\n\n`indexing.json`\n\n- check embedding dimension of tei\n- subset target\n- streaming or not\n- collection name\n\n```bash\npython indexing.py\n```\n\n# TEI\n\n[install guide](https://texonom.com/434f6f39b88342ea9e5156bd8501d8c4)\n\n```\nnpm i -g pm2\nmodel=\npm2 start data/tei.json\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseonglae%2Fresrer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseonglae%2Fresrer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseonglae%2Fresrer/lists"}