{"id":32720610,"url":"https://github.com/alexmartin1722/mirage","last_synced_at":"2026-05-14T13:32:05.149Z","repository":{"id":320408208,"uuid":"1081992288","full_name":"alexmartin1722/mirage","owner":"alexmartin1722","description":"An evaluation framework for evaluating any modality to text generation and multimodal RAG. ","archived":false,"fork":false,"pushed_at":"2025-10-28T16:01:09.000Z","size":252,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-02T20:03:50.105Z","etag":null,"topics":["multimodal","multimodal-rag","multimodal-summarization","rag","rag-evaluation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alexmartin1722.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-23T15:21:35.000Z","updated_at":"2025-10-28T16:01:13.000Z","dependencies_parsed_at":"2025-10-23T17:34:23.300Z","dependency_job_id":"092c7624-43c3-404c-8c95-e52a299f7fa8","html_url":"https://github.com/alexmartin1722/mirage","commit_stats":null,"previous_names":["alexmartin1722/mirage"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/alexmartin1722/mirage","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexmartin1722%2Fmirage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexmartin1722%2Fmirage/tags","releases_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/alexmartin1722%2Fmirage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexmartin1722%2Fmirage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alexmartin1722","download_url":"https://codeload.github.com/alexmartin1722/mirage/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexmartin1722%2Fmirage/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33026817,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-14T02:00:06.663Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["multimodal","multimodal-rag","multimodal-summarization","rag","rag-evaluation"],"created_at":"2025-11-02T20:01:09.654Z","updated_at":"2026-05-14T13:32:05.121Z","avatar_url":"https://github.com/alexmartin1722.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MiRAGE: Multimodal Retrieval-Augmented Generation Evaluation\n\n\u003cdiv align=\"center\"\u003e\n\u003ca href=\"\" target=\"_blank\"\u003e\u003cimg src=https://img.shields.io/badge/arXiv-b5212f.svg?logo=arxiv\u003e\u003c/a\u003e\n\u003c!-- \u003ca href=\"\" target=\"_blank\"\u003e\u003cimg 
src=https://img.shields.io/badge/HuggingFace-Evaluate-FF6D00?logo=huggingface\u003e\u003c/a\u003e --\u003e\n\u003c/div\u003e\n\nMiRAGE: Multimodal Retrieval-Augmented Generation Evaluation. \n\n## Contents\n* [Features](#features)\n* [Supported Tasks](#supported-tasks)\n* [Installation](#installation)\n* [MiRAGE Usage](#mirage-usage)\n* [Citation](#citation)\n* [Contact](#contact)\n\n\n## Features\n- Evaluating multimodal retrieval-augmented generation systems.\n- Integration with vLLM, DeepSpeed, FlashAttention, and other efficient inference techniques.\n- Easy-to-use command line interface for running various metrics.\n- Evaluation for generation from videos.\n\n## Supported Tasks\n### Video RAG\n- WikiVideo: [repo](https://github.com/alexmartin1722/wikivideo), [paper](https://arxiv.org/abs/2504.00939)\n\n## Installation\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003eFrom Scratch\u003c/b\u003e\u003c/summary\u003e\n\n```bash\nconda create -n video_rag_eval python=3.12 -y \nconda activate video_rag_eval\npip install --upgrade uv\nuv pip install vllm --torch-backend=cu128\npip install evaluate \npip install qwen-vl-utils[decord]==0.0.8\npip install peft\n```\n\u003c/details\u003e\n\n## MiRAGE Usage\n\n### VideoRAG Evaluation\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003eData Prep\u003c/b\u003e\u003c/summary\u003e\n\nWhen evaluating VideoRAG, you will need the following data:\n\n- predictions,\n- references,\n- a video directory containing every video available for retrieval in RAG (for collection eval only).\n\n#### WikiVideo Data \nWe provide everything needed to evaluate WikiVideo RAG systems in `data/wikivideo`:\n- Human judgments for grounding `data/wikivideo/human_judgments/grounding_judgments`\n- Human preference judgments (EQJs in the paper) `data/wikivideo/human_preference`\n- Metric preference judgments (ICJs in the paper) `data/wikivideo/metric_preference`\n- Model predictions for various systems in `data/wikivideo/model_preds/`\n- Eval subset from 
WikiVideo used in the human eval and paper `data/wikivideo/human_eval_subset.json`\n\nFor any reference evaluation, you'll need to download the videos used in WikiVideo, which can be found on [Hugging Face](https://huggingface.co/datasets/hltcoe/wikivideo).\n\n\n#### Custom data\nTo run our code as is, you'll need to format your data and system predictions as follows:\n\n- **System Predictions**: A JSON file where each entry's key is the topic ID and the values associated with that ID are\n  - `prediction`: The generated text from the RAG system. We recommend stripping the citations from this so it is pure text. \n  - `sentences`: The sentence-tokenized version of the prediction.\n  - `claims`: A list with one entry per sentence; each entry is the list of subclaims for that sentence.\n  - `citations`: A list with one entry per sentence; each entry is the list of citations for that sentence. We use the video path as the citation text.\n    \n    Example:\n    ```json\n    {\n      \"Topic_ID\" : {\n        \"prediction\": \"Generated text here...\",\n        \"sentences\": [\"Generated sentence 1.\", \"Generated sentence 2.\"],\n        \"claims\": [[\"Subclaim 1 for sentence 1.\", \"Subclaim 2 for sentence 1.\"], [\"Subclaim 1 for sentence 2.\"]],\n        \"citations\": [[\"path to citation 1\", \"path to citation 2\"], [\"path to citation 3\"]]\n      }\n    }\n    ```\n\n- **References**: A JSON file where each entry's key is the topic ID and the values associated with that ID are \n  - `article`: The ground truth text for the topic written by a human. \n  - `claims_to_supporting_videos`: A mapping between the claims of the reference and the videos that support those claims. This is a dictionary formatted such that 
each key is a claim and the values are (1) supporting videos and (2) the modalities from the videos that support the claim.\n  \n    Example:\n    ```json\n    {\n      \"Topic_ID\" : {\n        \"article\": \"Ground truth article text here...\",\n        \"claims_to_supporting_videos\": {\n          \"Claim 1\": {\n            \"supporting_videos\": [\"video_id\", \"video_id\"],\n            \"videos_modalities\": {\n              \"video_id\": [\"video\", \"audio\"],\n              \"video_id\": [\"video\"]\n            }\n          },\n          \"Claim 2\": {\n            \"supporting_videos\": [\"video_id\"],\n            \"videos_modalities\": {\n              \"video_id\": [\"video\", \"audio\", \"ocr\"]\n            }\n          }\n        }\n      }\n    }\n    ```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003eEvaluation\u003c/b\u003e\u003c/summary\u003e\n\nWhen evaluating the RAG tasks, the metrics are driven by two scripts, `infof1.py` and `citef1.py`, for InfoF1 and CiteF1 respectively. 
\n\n#### InfoF1:\n```bash\n# --video_dir is only needed for collection eval\npython infof1.py \\\n    --eval_type [reference|collection] \\\n    --prediction [path_to_system_prediction] \\\n    --reference [path_to_human_eval_json] \\\n    --video_dir [path_to_videos] \\\n    --output_dir [path_to_output_directory] \\\n    --model_name [qwen_7b|qwen_72b]\n```\n```bash\npython infof1.py \\\n    --eval_type collection \\\n    --prediction data/wikivideo/model_preds/qwen_72b_cag_relevant_citations.json \\\n    --reference data/wikivideo/human_eval_subset.json \\\n    --video_dir /exp/amartin/wikivideo/all_videos \\\n    --output_dir data/wikivideo/model_preds/metric_outputs \\\n    --model_name qwen_7b\n```\n#### CiteF1:\n```bash\n# --video_dir is only needed for collection eval\npython citef1.py \\\n    --eval_type [reference|collection] \\\n    --prediction [path_to_system_prediction] \\\n    --reference [path_to_human_eval_json] \\\n    --video_dir [path_to_videos] \\\n    --output_dir [path_to_output_directory] \\\n    --model_name [qwen_7b|qwen_72b]\n```\n```bash\npython citef1.py \\\n    --eval_type collection \\\n    --prediction data/wikivideo/model_preds/qwen_72b_cag_relevant_citations.json \\\n    --reference data/wikivideo/human_eval_subset.json \\\n    --video_dir /exp/amartin/wikivideo/all_videos \\\n    --output_dir data/wikivideo/model_preds/metric_outputs \\\n    --model_name qwen_7b\n```\n\n\u003c/details\u003e\n\n\n\n## Citation\nIf you find MiRAGE useful in your research, please consider citing the following paper:\n\n```\n```\n\n## Contact\nIf you have MiRAGE-specific questions or would like a new feature, model support, a supported dataset, etc., feel free to open an issue. \n\nYou can also reach out to me for general comments/suggestions/questions through email. 
\n- Alexander Martin, amart233@jhu.edu\n    - If the email listed here is out of date, you can find my current email on my [personal website](https://alexmartin1722.github.io/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexmartin1722%2Fmirage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falexmartin1722%2Fmirage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexmartin1722%2Fmirage/lists"}