{"id":50938073,"url":"https://github.com/simula/mediaeval-medico-2025","last_synced_at":"2026-06-17T11:03:42.794Z","repository":{"id":299409419,"uuid":"979907937","full_name":"simula/MediaEval-Medico-2025","owner":"simula","description":"Official repository for the MediaEval Medico 2025: VQA (with multimodal explanations) for GastroIntestinal Imaging, featuring the Kvasir-VQA-x1 dataset, participation guidelines, and starter resources.","archived":false,"fork":false,"pushed_at":"2025-11-09T11:11:14.000Z","size":228,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-11-09T11:29:40.787Z","etag":null,"topics":["dataset","explainable-ai","gastroenterology","kvasir","mediaeval","medical-imaging","medico","multimodal","vqa"],"latest_commit_sha":null,"homepage":"https://multimediaeval.github.io/editions/2025/tasks/medico/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simula.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-05-08T09:00:48.000Z","updated_at":"2025-11-09T11:11:17.000Z","dependencies_parsed_at":"2025-08-27T09:25:00.438Z","dependency_job_id":"a265a196-4450-4d3e-8d29-b7c9bff01f7b","html_url":"https://github.com/simula/MediaEval-Medico-2025","commit_stats":null,"previous_names":["simula/mediaeval-medico-2025"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/simula/MediaEval-Medico-2025","repositor
y_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simula%2FMediaEval-Medico-2025","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simula%2FMediaEval-Medico-2025/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simula%2FMediaEval-Medico-2025/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simula%2FMediaEval-Medico-2025/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/simula","download_url":"https://codeload.github.com/simula/MediaEval-Medico-2025/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simula%2FMediaEval-Medico-2025/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34445186,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","explainable-ai","gastroenterology","kvasir","mediaeval","medical-imaging","medico","multimodal","vqa"],"created_at":"2026-06-17T11:03:38.878Z","updated_at":"2026-06-17T11:03:42.777Z","avatar_url":"https://github.com/simula.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003e 🚀 **The journey continues!**  \n\u003e 🆕 The next iteration of the challenge is now 
live:  \n\u003e 🔗 https://github.com/simula/MediaEval-Medico-2026  \n\u003e 🌟 Join us for another exciting round of innovation and collaboration!\n\u003e\n\u003e \u003e ✅ **Update (October 2025):** The MediaEval Medico 2025 Challenge has concluded.  \n\u003e 📊 **Competition Results:** https://github.com/simula/MediaEval-Medico-2025/blob/main/competition_results.md  \n\u003e 🎥 Session recordings: https://www.youtube.com/playlist?list=PLHr-k69ARa0jMZycp19Kefje3dPMG4znR  \n\u003e 🙏 Thank you to all participants and contributors!\n\u003e \n# 🌟 **MediaEval Medico 2025: VQA (with multimodal explanations) for GastroIntestinal Imaging** 🌟\n\n📋 [**GitHub Repository**](https://github.com/simula/MediaEval-Medico-2025) | 🔗 [**MediaEval 2025**](https://multimediaeval.github.io/editions/2025/tasks/medico/) | 📝 [**Registration Form**](https://forms.gle/y7v1VLP7D9vsbuqv5) | 🏆 [**Leaderboard / Registered Submissions**](https://simulamet-medico-2025.hf.space)\n\n---\n\nThe **MediaEval Medico 2025 Challenge** 🔬 focuses on **Visual Question Answering (VQA)** for **Gastrointestinal (GI) imaging**, emphasizing **explainability** 🤔📖 to foster **trustworthy AI** for **clinical adoption** ⚕️.\n\nThis task continues the long-running **Medico series** at MediaEval, now leveraging the newly developed **Kvasir-VQA-x1 dataset**, designed to support **multimodal reasoning** and **interpretable clinical decision support** 📈.\n\n## 🏁 Workshop Completed\nThe **MediaEval Workshop** 🗣️ was held on:\n**🗓️ Saturday–Sunday, 25–26 October 2025** | 📍 Dublin, Ireland 🇮🇪 \u0026 Online 🌐 (between CMBI 2025 and ACM Multimedia 2025).  
\n📊 **Competition Results:** https://github.com/simula/MediaEval-Medico-2025/blob/main/competition_results.md\n🎥 Recordings: https://www.youtube.com/playlist?list=PLHr-k69ARa0jMZycp19Kefje3dPMG4znR\n\n---\n\n## 🌟 **Task Descriptions**\n\n### 🔍 **Subtask 1: AI Performance on Medical Image Question Answering**\n\n📈 **Goal:** Develop AI models that can accurately answer clinical questions using **GI endoscopic images**.\n\n🧠 The task uses **Kvasir-VQA-x1**, an advanced dataset comprising **159,549 QA pairs** from **6,500 original GI images**, featuring:\n- Multi-step reasoning questions  \n- Naturalized medical language  \n- Complexity scores for curriculum training  \n\n🔠 **Question Types** include:\n- Yes/No  \n- Single-Choice  \n- Multiple-Choice  \n- Color-related  \n- Location-related  \n- Numerical Count  \n- Merged reasoning-based questions  \n\n💡 **Example Training Notebook:**  \nNot sure where to start? Check out: [Training with ms-swift](https://github.com/simula/MediaEval-Medico-2025/blob/main/Task_1_Sample_Notebook.ipynb)  \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/simula/MediaEval-Medico-2025/blob/main/Task_1_Sample_Notebook.ipynb\"\u003e\n    \u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/\u003e\n\u003c/a\u003e\n\n⚠️ **Note:** You may participate by submitting work for **Task 1** only.\n\n---\nIt is acceptable to use the full test set for training in your final submission to get a competitive score. However, we strongly recommend using proper splits for training and clearly reporting in your paper which splits were used for training and validation.\n\n### 💬 **Subtask 2: Clinician-Oriented Multimodal Explanations in GI**\n\n📌 **Goal:** Move beyond simply predicting an answer (Subtask 1) and generate **rich, multimodal explanations** that are **transparent, understandable, and trustworthy** for clinicians.  
\n\nYour system should **justify its predictions** using **multiple complementary reasoning forms**—e.g., combining a detailed textual clinical explanation with a visual localization and/or a confidence measure.\n\n**Requirements:**\n- **Faithful** to the model’s reasoning.  \n- **Clinically relevant** and medically sound.  \n- **Useful** for real-world decision-making.\n  \n#### 📄 Validation set for Subtask 2:\n```python\nfrom datasets import load_dataset, Image as HfImage\n\nds = load_dataset(\"SimulaMet/Kvasir-VQA-x1\")[\"test\"]\nval_set_task2 = (\n    ds.filter(lambda x: x[\"complexity\"] == 1)\n      .shuffle(seed=42)\n      .select(range(1500))\n      .add_column(\"val_id\", list(range(1500)))\n      .remove_columns([\"complexity\", \"answer\", \"original\", \"question_class\"])\n      .cast_column(\"image\", HfImage())\n)\n```\nval_set_task2 is a 🤗 Dataset containing the columns val_id, img_id, image, and question, where image is Pillow Image for easy access.\n\n#### 📄 Submission Format\n\nA JSONL file where each entry corresponds to one test case:\n\n```json\n{\n  \"val_id\": \"index of validation subset for subtask 2, as in val_set_task2\",\n  \"img_id\": \"UNIQUE_IMAGE_IDENTIFIER\",\n  \"question\": \"Original question posed to the model.\",\n  \"answer\": \"Prediction from your model from Subtask 1.\",\n  \"textual_explanation\": \"Detailed narrative in clinical language justifying the answer.\",\n  \"visual_explanation\": [{\n    \"type\": \"heatmap | segmentation_mask | bounding_box | etc.\",\n    \"data\": \"path/to/visual.png | [[x1,y1,x2,y2]]\",\n    \"description\": \"(Optional) Highlights the region of interest that supports the answer (e.g., bounding box around the polyp, or heatmap showing focus on mucosal irregularity).\"\n  }],\n  \"confidence_score\": 0.92\n}\n```\n\n**Field-by-Field Requirements:**\n- **`img_id` / `question` / `answer`** → Must match Subtask 1 data and predictions exactly.  
\n- **`textual_explanation`** (Mandatory) → Clinician-oriented reasoning referencing visual cues (location, morphology, color, size, vascular pattern, etc.).  \n- **`visual_explanation`** (Optional but encouraged) → Heatmaps, segmentation masks, or bounding boxes linked to the textual explanation.  \n- **`confidence_score`** (Optional but encouraged) → Float in [0, 1], from model confidence or uncertainty estimation.  \n\n\n#### 💡 Suggested Approaches\n1. **VLM Self-Probing for Explanations** — Ask auxiliary questions (e.g., *\"What is the abnormality?\"*, *\"Where is it located?\"*, *\"Describe its morphology\"*) and combine answers into the `textual_explanation`.\n2. **Visual Grounding** — Generate **heatmaps** or attention maps showing influential regions and link them to textual descriptions.\n3. **Segmentation / Detection** — Produce masks or bounding boxes highlighting relevant pathology, reinforcing clinician trust.\n\n⚠️ **Participation in Subtask 2 requires completion of Subtask 1.**\n\n---\n\n## 📂 **Dataset Overview: Kvasir-VQA-x1**\n\nBuilt on **HyperKvasir** and **Kvasir-Instrument**, the **Kvasir-VQA-x1** dataset includes:\n- 🧬 **159,549 QA pairs**\n- 🖼️ **6,500 original GI images**\n- ♻️ **10 weakly augmented images per original** (augmentation script provided)\n- 🧠 **Complexity levels 1–3**\n- 🧪 **Realistic medical question reformulations using LLMs**\n\n📥 Dataset: [**Kvasir-VQA-x1 @ SimulaMet on Hugging Face**](https://huggingface.co/datasets/SimulaMet/Kvasir-VQA-x1)\n \n---\n\n\n## 🔍 **Evaluation Methodology**\n\n**Subtask 1 (VQA Performance)**  \n- Metrics: BLEU, ROUGE (1/2/L), METEOR  \n- Settings: Original \u0026 augmented images  \n- Criteria: Accuracy, relevance, medical correctness\n\nThe official challenge score will be computed on a separate hidden challenge set with more metrics. This ensures fairness and that final results truly reflect model performance.\n\n**Subtask 2 (Explainability)**  \nRated by experts on:\n1. 
Answer correctness  \n2. Clarity \u0026 clinical relevance  \n3. Visual alignment  \n4. Confidence calibration  \n5. Methodology \u0026 novelty  \n\n---\n## 🏆 **Submission System**\n\n\u003e  🚧 Please do not hesitate to contact us if you encounter any issues.\n\n📌 [View Registered Submissions](https://simulamet-medico-2025.hf.space)\n\nWe use the [`medvqa`](https://pypi.org/project/medvqa/) Python package to **validate and submit** models to the official system.\n### 📦 Install\n```bash\npip install -U medvqa\n```\nAlways use the latest version.\n\nThe model that needs to be submitted is expected to be in a HuggingFace repository. Your HuggingFace repo **must include** a standalone script named:\n- [submission_task1.py](https://raw.githubusercontent.com/SushantGautam/MedVQA/refs/heads/main/medvqa/submission_samples/medico-2025/submission_task1.py) for task 1.\n- [submission_task2.py](https://raw.githubusercontent.com/SushantGautam/MedVQA/refs/heads/main/medvqa/submission_samples/medico-2025/submission_task2.py) for task 2.\n\n### Instructions for Participants\n\nUse the provided **template script**, and make sure to:  \n- Modify all `TODO` sections  \n- Add required information (e.g., model path, inference logic, preprocessing steps) directly in the script  \n- Keep the required input/output format unchanged  \n\n###  Task 1 : Script Variants \u0026 Naming Requirements\n\nYou have two template options for the Task 1 inference script:  \n\n- **MS-Swift version**: [submission_task1_swift.py](https://github.com/SushantGautam/MedVQA/blob/main/medvqa/submission_samples/medico-2025/submission_task1_swift.py)\n- **PyTorch version**: [submission_task1.py](https://raw.githubusercontent.com/SushantGautam/MedVQA/refs/heads/main/medvqa/submission_samples/medico-2025/submission_task1.py)\n\nBoth scripts already include **template example code** for model loading and inference.  
\n\n⚠️ **Important**: Even if you use the MS-Swift template, your final script in the repository **must still be named** `submission_task1.py`.  \n\n### Task 2 : 📦 What to Submit (Repository Layout)\nHost your submission in a **Hugging Face model repository** containing:\n- `submission_task2.jsonl` — one object per `val_id`  \n- `visuals/` — optional folder with any referenced visual artifacts (heatmaps, masks, boxes as JSON, etc.)\n- `submission_task2.py` file with your team details\n- A short `README.md` explaining how you created the explanations and any post-processing you want to share\n\n**Demo submission repo:**  \nhttps://huggingface.co/SushantGautam/Medico2025_subtask2_demo_submission/tree/main\n\n**Naming tips**\n- Keep `data` paths in `visual_explanation` **relative** to repo root (e.g., `visuals/1234_heatmap.png`).  \n- Ensure every `val_id` in the file corresponds to an item in `val_set_task2`.\n\n### ✅ Validate Before Submitting\nFirst, make sure your submission script works in your working environment, loads the model correctly from your submission repo, and generates outputs in the required format.\n\n```bash\npython submission_task1.py\n```\n\nNext, validate that the script works independently. The .py script should now be in the root of the same HuggingFace repo as your model. You can try this in a new venv:\n\n```bash\nmedvqa validate --competition=medico-2025 --task=1/2 --repo_id=\u003cyour_repo_id\u003e\n```\n- `--competition`: Set to `medico-2025`\n- `--task`: Use `1` for Task 1 or `2` for Task 2  \n- `--repo_id`: Your **HuggingFace model repo ID** (e.g., [SushantGautam/Florence-2-vqa-demo](https://huggingface.co/SushantGautam/Florence-2-vqa-demo))\n  \n#### 📄 Additional Dependencies  \nIf your code requires extra packages, you must include a `requirements.txt` in the **root of the repo**. 
The system will install these automatically during validation/submission.\nOtherwise you will get missing-package errors.\n\n### 🚀 Submit\nIf validation succeeds, you can just run:\n\n```bash\nmedvqa validate_and_submit --competition=medico-2025 --task=1/2 --repo_id=\u003cyour_repo_id\u003e\n```\nThis will make a submission. Your username, along with the task and time, should be visible on the [leaderboard](https://simulamet-medico-2025.hf.space) for it to be considered officially submitted.\nThe submission library will make your Hugging Face repository public but gated, granting the organizers access to your repo.\nIt must remain unchanged at least until the results of the competition are announced. However, you are free to make your model fully public (non-gated). \nIf you encounter any issues with submission, **don’t hesitate to contact us**.\n\n---\n\n## 🛠️ **Tools \u0026 Resources**\n- Scripts for augmentation, splits, and baselines  \n- Submission templates  \n- Fine-tuned model configs  \n- Attention \u0026 saliency visualization methods  \n\n---\n\n## 📅 **Timeline (Preliminary)**\n\n- 📝 **April 2025** — Registration for task participation opens ✅  \n- 📦 **May 2025** — Development data release ✅  \n- 🧪 **June 2025** — Test data release ✅  \n- 📄 **24 September 2025 (Wed.)** — Runs due  \n- 📝 **8 October 2025 (Wed.)** — Working Notes deadline  \n- 🏫 **25–26 October 2025 (Sat.–Sun.)** — MediaEval Workshop (Dublin + Online)  \n\n---\n\n## 💼 **Organizers**\n- 👨‍🔬 Steven A. Hicks — [steven@simula.no](mailto:steven@simula.no)  \n- 🧑‍💻 Michael A. Riegler — [michael@simula.no](mailto:michael@simula.no)  \n- 🧑‍🔬 Vajira Thambawita — [vajira@simula.no](mailto:vajira@simula.no)  \n- 👨‍🏫 Pål Halvorsen — [paalh@simula.no](mailto:paalh@simula.no)  \n- 🧑‍🎓 [Sushant Gautam](https://sushant.info.np) — [sushant@simula.no](mailto:sushant@simula.no)  \n\n---\n\n## 🔗 **Join Us**\nLet’s build the future of **trustworthy, explainable medical AI**.  
\n🌟 GI diagnostics needs interpretable answers. Your model can help save lives.\n\n📍 Register: [**MediaEval 2025**](https://multimediaeval.github.io/editions/2025/)  \n📁 Repo: [**GitHub**](https://github.com/SushantGautam/MediaEval-Medico-2025)  \n\n🚀 *Develop explainable AI. Help doctors. Improve lives.*  \n---\n\n## 📚 How to Cite  \n\nIf you are inspired by the **MediaEval Medico 2025 Challenge** or the **Kvasir-VQA-x1 dataset** in your research, please cite the following papers:  \n\n```bibtex\n@article{Gautam2025Aug,\n\tauthor = {Gautam, Sushant and Thambawita, Vajira and Riegler, Michael and others},\n\ttitle = {{Medico 2025: Visual Question Answering for Gastrointestinal Imaging}},\n\tjournal = {arXiv},\n\tyear = {2025},\n\tmonth = aug,\n\teprint = {2508.10869},\n\tdoi = {10.48550/arXiv.2508.10869}\n}\n\n@article{Gautam2025Jun,\n\tauthor = {Gautam, Sushant and Riegler, Michael A. and Halvorsen, P{\\aa}l},\n\ttitle = {{Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy}},\n\tjournal = {arXiv},\n\tyear = {2025},\n\tmonth = jun,\n\teprint = {2506.09958},\n\tdoi = {10.48550/arXiv.2506.09958}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimula%2Fmediaeval-medico-2025","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimula%2Fmediaeval-medico-2025","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimula%2Fmediaeval-medico-2025/lists"}