{"id":40761019,"url":"https://github.com/MiraPurkrabek/BBoxMaskPose","last_synced_at":"2026-01-30T19:00:45.929Z","repository":{"id":264333381,"uuid":"893069322","full_name":"MiraPurkrabek/BBoxMaskPose","owner":"MiraPurkrabek","description":"[ICCV 25] The official repository of paper 'Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle'","archived":false,"fork":false,"pushed_at":"2026-01-22T08:29:37.000Z","size":6929,"stargazers_count":83,"open_issues_count":1,"forks_count":9,"subscribers_count":7,"default_branch":"main","last_synced_at":"2026-01-22T23:16:55.703Z","etag":null,"topics":["computer-vision","human-pose-estimation","iccv","iccv2025","keypoint-detection","pose-estimation","research-paper"],"latest_commit_sha":null,"homepage":"https://MiraPurkrabek.github.io/BBox-Mask-Pose/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MiraPurkrabek.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-23T13:02:53.000Z","updated_at":"2026-01-22T21:07:07.000Z","dependencies_parsed_at":"2025-06-20T08:23:19.514Z","dependency_job_id":"e7ad89f1-9a73-413d-be4c-efec3298dc84","html_url":"https://github.com/MiraPurkrabek/BBoxMaskPose","commit_stats":null,"previous_names":["mirapurkrabek/bboxmaskpose"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/MiraPurkrabek/BBoxMaskPose","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/MiraPurkrabek%2FBBoxMaskPose","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MiraPurkrabek%2FBBoxMaskPose/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MiraPurkrabek%2FBBoxMaskPose/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MiraPurkrabek%2FBBoxMaskPose/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MiraPurkrabek","download_url":"https://codeload.github.com/MiraPurkrabek/BBoxMaskPose/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MiraPurkrabek%2FBBoxMaskPose/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28777013,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-26T09:42:00.929Z","status":"ssl_error","status_checked_at":"2026-01-26T09:42:00.591Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","human-pose-estimation","iccv","iccv2025","keypoint-detection","pose-estimation","research-paper"],"created_at":"2026-01-21T17:00:37.469Z","updated_at":"2026-01-30T19:00:45.918Z","avatar_url":"https://github.com/MiraPurkrabek.png","language":"Python","funding_links":[],"categories":["Paper List"],"sub_categories":["Follow-up Papers"],"readme":"\u003c/h1\u003e\u003cdiv 
id=\"toc\"\u003e\n  \u003cul align=\"center\" style=\"list-style: none; padding: 0; margin: 0;\"\u003e\n    \u003csummary\u003e\n      \u003ch1 style=\"margin-bottom: 0.0em;\"\u003e\n        Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle\n      \u003c/h1\u003e\n    \u003c/summary\u003e\n  \u003c/ul\u003e\n\u003c/div\u003e\n\u003c/h1\u003e\u003cdiv id=\"toc\"\u003e\n  \u003cul align=\"center\" style=\"list-style: none; padding: 0; margin: 0;\"\u003e\n    \u003csummary\u003e\n      \u003ch2 style=\"margin-bottom: 0.2em;\"\u003e\n        ICCV 2025\n      \u003c/h2\u003e\n    \u003c/summary\u003e\n  \u003c/ul\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"images/004806_BMP.gif\" alt=\"BBox-Mask-Pose loop\" height=\"500px\"\u003e\n\n  [![Paper](https://img.shields.io/badge/Paper-ICCV%202025-blue)](https://arxiv.org/abs/2412.01562) \u0026nbsp;\u0026nbsp;\u0026nbsp;\n  [![Website](https://img.shields.io/badge/Website-BBoxMaskPose-green)](https://mirapurkrabek.github.io/BBox-Mask-Pose/) \u0026nbsp;\u0026nbsp;\u0026nbsp;\n  [![License](https://img.shields.io/badge/License-GPL%203.0-orange.svg)](LICENSE) \u0026nbsp;\u0026nbsp;\u0026nbsp;\n  [![Video](https://img.shields.io/badge/Video-YouTube-red?logo=youtube)](https://youtu.be/U05yUP4b2LQ)\n  \n\n  Papers with code:\n\n  [![2D Pose AP on OCHuman: 42.5](https://img.shields.io/badge/OCHuman-2D_Pose:_49.2_AP-blue)](https://paperswithcode.com/sota/2d-human-pose-estimation-on-ochuman?p=detection-pose-estimation-and-segmentation-1) \u0026nbsp;\u0026nbsp;\n  [![Human Instance Segmentation AP on OCHuman: 34.0](https://img.shields.io/badge/OCHuman-Human_Instance_Segmentation:_34.0_AP-blue)](https://paperswithcode.com/sota/human-instance-segmentation-on-ochuman?p=detection-pose-estimation-and-segmentation-1)  \n\n\u003c/div\u003e\n\n\u003e [!IMPORTANT]\n\u003e The new version of \u003cb\u003eBBox-Mask-Pose (BMPv2)\u003c/b\u003e is now available on 
[\u003cb\u003earXiv\u003c/b\u003e](https://arxiv.org/abs/2601.15200v1).\n\u003e BMPv2 significantly improves performance; see the quantitative results reported in the preprint.\n\u003e One of the key contributions is \u003cb\u003ePMPose\u003c/b\u003e, a new top-down pose estimation model that is already strong on standard benchmarks and in crowded scenes.\n\u003e The code will be added to the \u003ccode\u003eBMP-v2\u003c/code\u003e branch in the following weeks and gradually merged into \u003ccode\u003emain\u003c/code\u003e as well as to the online demo.\n\n\n## 📋 Overview\n\nThe BBox-Mask-Pose (BMP) method integrates detection, pose estimation, and segmentation into a self-improving loop by conditioning these tasks on each other. This approach enhances all three tasks simultaneously. Using segmentation masks instead of bounding boxes improves performance in crowded scenarios, making top-down methods competitive with bottom-up approaches.\n\nKey contributions:\n1. **MaskPose**: a pose estimation model conditioned on segmentation masks instead of bounding boxes, boosting performance in dense scenes without adding parameters\n    - Download pre-trained weights below\n2. **BBox-Mask-Pose (BMP)**: a method linking bounding boxes, segmentation masks, and poses to simultaneously address multi-body detection, segmentation and pose estimation\n    - Try the demo!\n3. Fine-tuned RTMDet adapted for iterative detection (ignoring 'holes')\n    - Download pre-trained weights below\n4. Support for multi-dataset training of ViTPose, previously implemented in the official ViTPose repository but absent in MMPose.\n\nFor more details, please visit our [project website](https://mirapurkrabek.github.io/BBox-Mask-Pose/).\n\n\n## 📢 News\n\n- **Aug 2025**: [HuggingFace Image Demo](https://huggingface.co/spaces/purkrmir/BBoxMaskPose-demo) is out! 🎮\n- **Jul 2025**: Version 1.1 with easy-to-run image demo released\n- **Jun 2025**: Paper accepted to ICCV 2025! 
🎉\n- **Dec 2024**: The code is available\n- **Nov 2024**: The [project website](https://MiraPurkrabek.github.io/BBox-Mask-Pose) is online\n\n\n## 🚀 Installation\n\n### Docker Installation (Recommended)\n\nThe fastest way to get started with GPU support:\n\n```bash\n# Clone and build\ngit clone https://github.com/mirapurkrabek/BBoxMaskPose.git\ncd BBoxMaskPose\ndocker-compose build\n\n# Run the demo\ndocker-compose up\n```\n\nRequires: Docker Engine 19.03+, [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html), NVIDIA GPU with CUDA 12.1 support.\n\n### Manual Installation\n  \nThis project is built on top of [MMPose](https://github.com/open-mmlab/mmpose) and [SAM 2.1](https://github.com/facebookresearch/sam2).\nPlease refer to the [MMPose installation guide](https://mmpose.readthedocs.io/en/latest/installation.html) and the [SAM 2 installation guide](https://github.com/facebookresearch/sam2/blob/main/INSTALL.md) for detailed setup instructions.\n\nBasic installation steps:\n```bash\n# Clone the repository\ngit clone https://github.com/mirapurkrabek/BBoxMaskPose.git BBoxMaskPose/\ncd BBoxMaskPose\n\n# Install your version of torch, torchvision, OpenCV and NumPy\npip install torch==2.1.2+cu121 torchvision==0.16.2+cu121 --extra-index-url https://download.pytorch.org/whl/cu121\npip install numpy==1.25.1 opencv-python==4.9.0.80\n\n# Install the OpenMMLab libraries\npip install -U openmim\nmim install mmengine \"mmcv==2.1.0\" \"mmdet==3.3.0\" \"mmpretrain==1.2.0\"\n\n# Install dependencies\npip install -r requirements.txt\npip install -e .\n```\n\n## 🎮 Demo\n\nStep 1: Download SAM2 weights using the [enclosed script](models/SAM/download_ckpts.sh).\n\nStep 2: Run the full BBox-Mask-Pose pipeline on an input image:\n\n```bash\npython demo/bmp_demo.py configs/bmp_D3.yaml data/004806.jpg\n```\n\nThis takes the image 004806.jpg from OCHuman and runs (1) the detector, (2) the pose estimator and (3) SAM2 refinement. 
\nDetails are in the configuration file [bmp_D3.yaml](configs/bmp_D3.yaml).\n\nOptions:\n- `configs/bmp_D3.yaml`: BMP configuration file\n- `data/004806.jpg`: Input image\n- `--device`: (Optional) Inference device (default: `cuda:0`)\n- `--output-root`: (Optional) Directory to save outputs (default: `demo/outputs`)\n- `--create-gif`: (Optional) Generate an animated GIF of all iterations (default: `False`)\n\nAfter running, outputs are in `outputs/004806/`. The expected output should look like this:\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"images/004806_mask.jpg\" target=\"_blank\"\u003e\n    \u003cimg src=\"images/004806_mask.jpg\" alt=\"Detection results\" width=\"200\" /\u003e\n  \u003c/a\u003e\n  \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\n  \u003ca href=\"images/004806_pose.jpg\" target=\"_blank\"\u003e\n    \u003cimg src=\"images/004806_pose.jpg\" alt=\"Pose results\" width=\"200\" style=\"margin-right:10px;\" /\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n\n## 📦 Pre-trained Models\n\nPre-trained models are available on [VRG Hugging Face 🤗](https://huggingface.co/vrg-prague/BBoxMaskPose/).\nTo run the demo, you only need to download the SAM weights with the [enclosed script](models/SAM/download_ckpts.sh).\nOur detector and pose estimator will be downloaded at runtime.\n\nIf you want to download our weights yourself, here are the links to our HuggingFace:\n- ViTPose-b trained on COCO+MPII+AIC -- [download weights](https://huggingface.co/vrg-prague/BBoxMaskPose/resolve/main/ViTPose-b-multi_mmpose20.pth)\n- MaskPose-b -- [download weights](https://huggingface.co/vrg-prague/BBoxMaskPose/resolve/main/MaskPose-b.pth)\n- Fine-tuned RTMDet-L -- [download weights](https://huggingface.co/vrg-prague/BBoxMaskPose/resolve/main/rtmdet-ins-l-mask.pth)\n\n## 🙏 Acknowledgments\n\nThe code combines [MMDetection](https://github.com/open-mmlab/mmdetection), [MMPose 2.0](https://github.com/open-mmlab/mmpose), [ViTPose](https://github.com/ViTAE-Transformer/ViTPose) and [SAM 
2.1](https://github.com/facebookresearch/sam2).\n\n## 📝 Citation\n\nThe code was implemented by [Miroslav Purkrábek](https://mirapurkrabek.github.io/).\nIf you use this work, kindly cite it using the reference provided below.\n\nFor questions, please use the Issues or Discussions.\n\n```\n@InProceedings{Purkrabek2025ICCV,\n    author    = {Purkrabek, Miroslav and Matas, Jiri},\n    title     = {Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle},\n    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},\n    month     = {October},\n    year      = {2025},\n    pages     = {9004-9013}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMiraPurkrabek%2FBBoxMaskPose","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FMiraPurkrabek%2FBBoxMaskPose","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMiraPurkrabek%2FBBoxMaskPose/lists"}