{"id":29688500,"url":"https://github.com/cocowy1/smoe-stereo","last_synced_at":"2025-07-23T05:06:41.288Z","repository":{"id":302774998,"uuid":"1013510501","full_name":"cocowy1/SMoE-Stereo","owner":"cocowy1","description":"[ICCV 2025] 🌟🌟🌟 Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts","archived":false,"fork":false,"pushed_at":"2025-07-11T07:50:07.000Z","size":100877,"stargazers_count":46,"open_issues_count":1,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-11T11:02:55.172Z","etag":null,"topics":["depth","iccv","moe"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cocowy1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-04T03:05:14.000Z","updated_at":"2025-07-11T10:58:48.000Z","dependencies_parsed_at":"2025-07-04T07:32:11.512Z","dependency_job_id":null,"html_url":"https://github.com/cocowy1/SMoE-Stereo","commit_stats":null,"previous_names":["cocowy1/smoe-stereo"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cocowy1/SMoE-Stereo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocowy1%2FSMoE-Stereo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocowy1%2FSMoE-Stereo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocowy1%2FSMoE-Stereo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocowy1%2FSMoE-Stereo/manifests","owner_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cocowy1","download_url":"https://codeload.github.com/cocowy1/SMoE-Stereo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocowy1%2FSMoE-Stereo/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266620084,"owners_count":23957309,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["depth","iccv","moe"],"created_at":"2025-07-23T05:06:40.628Z","updated_at":"2025-07-23T05:06:41.262Z","avatar_url":"https://github.com/cocowy1.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🚀 SMoE-Stereo (ICCV 2025) 🚀 \n### [**ICCV 2025**] 🌟 **Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts**  \u003ca href=\"https://arxiv.org/pdf/2507.04631\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-2507.04631-b31b1b?logo=arxiv\" alt='arxiv'\u003e\u003c/a\u003e \n##  🌼 Abstract\nOur SMoE-Stereo framework fuses Vision Foundation Models (VFMs) with a Selective-MoE design to unlock robust stereo matching at minimal computational cost. 
Its standout features are 😄:\n* Our SMoE dynamically selects the **most suitable experts** for each input, allowing it to adapt seamlessly to diverse “in-the-wild” scenes and domain shifts.\n  \n* Unlike existing stereo matching methods that rely on rigid, sequential processing pipelines for all inputs, SMoE-Stereo prioritizes computational resources by selectively engaging only **the most relevant MoEs** for simpler scenes. This adaptive architecture balances accuracy and processing speed according to available resources.\n\n* Remarkably, despite being trained exclusively on standard datasets (KITTI 2012/2015, Middlebury, and ETH3D training splits) without any additional data, SMoE-Stereo has achieved top ranking on the Robust Vision Challenge (RVC) leaderboards.\n\n##  📝 Zero-shot performance on Standard Stereo Benchmarks\n![teaser](media/teaser.png)\n\n## 👀 Zero-shot on Adverse Weather Conditions and Enjoyable Inference Efficiency \n\u003cp\u003e\n  \u003cimg src=\"media/dr_weather.jpg\" alt=\"weather\" width=\"460\" height=\"300\" /\u003e\n \u003cimg src=\"media/efficiency.jpg\" alt=\"efficiency\" width=\"450\" height=\"250\" /\u003e\n\u003c/p\u003e\n\n\n## 😇  Robust Vision Challenge (RVC) Benchmark\n![RVC](media/RVC.jpg)\n\n\n## 🎇 Parameter-efficient Finetuning Methods (PEFT) \u0026 VFM backbones\nExciting Update! 
Our framework now supports mainstream PEFT strategies for stereo matching, including:\n* Visual Prompt Tuning ([ECCV 2022](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136930696.pdf))\n* LoRA ([ICLR 2022](https://arxiv.org/abs/2106.09685))\n* AdaptFormer ([NeurIPS 2022](https://arxiv.org/abs/2205.13535))\n* Adapter Tuning ([ECCV 2024](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/05841.pdf))\n* LoRA MoE, Adapter MoE\n* Our SMoE strategy\n\nAdditionally, the framework is compatible with multiple leading vision foundation models (VFMs):\n* DepthAnything ([DAM](https://arxiv.org/abs/2401.10891))\n* DepthAnythingV2 ([DAMV2](https://arxiv.org/abs/2406.09414))\n* SegmentAnything ([SAM](https://arxiv.org/abs/2304.02643))\n* DINOV2 ([DINOV2](https://arxiv.org/abs/2304.07193))\n\nAll of these models can be combined with our PEFT implementation.\nChoose the PEFT strategy, VFM, and backbone size you want through the following arguments.\n\nExamples:\n```python\nimport argparse\n\nparser = argparse.ArgumentParser()\n# PEFT strategy, VFM backbone, and ViT size\nparser.add_argument('--peft_type', default='smoe', choices=[\"lora\", \"smoe\", \"adapter\", \"tuning\", \"vpt\", \"ff\"], type=str)\nparser.add_argument('--vfm_type', default='damv2', choices=[\"sam\", \"dam\", \"damv2\", \"dinov2\"], type=str)\nparser.add_argument('--vfm_size', default='vitl', choices=[\"vitb\", \"vits\", \"vitl\"], type=str)\n```\n\n## ✅ TODO List\n\n- [x] Upload the ViT-small weights of SMoE-Stereo.\n- [x] Add SMoE-IGEV-backbone.  \n- [x] Add the KITTI demo.mp4.  \n\n## 😎 Our Framework\nWe use RAFT-Stereo as our backbone and replace its feature extractor with VFMs, while the remaining structure is unchanged. 
\n![framework](media/framework.png)\n\n## 💪 Flexible Selective Property\nOur MoE modules and the experts within each MoE layer can be selectively activated to adapt to different input characteristics. This scene-specific adaptation enables robust stereo matching across diverse real-world scenarios.\n![selection](media/selection.png)\n\n## ⚙️ Installation\n* NVIDIA RTX A6000\n* Python 3.8.13\n\n### ⏳ Create a virtual environment and activate it.\n\n```Shell\nconda create -n smoestereo python=3.8\nconda activate smoestereo\n```\n\n### 🎬 Dependencies\n\n```Shell\npip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118\npip install tqdm\npip install scipy\npip install opencv-python\npip install scikit-image\npip install tensorboard\npip install matplotlib\npip install timm==0.5.4\npip install thop\npip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.1/index.html\npip install accelerate==1.0.1\npip install gradio_imageslider\npip install gradio==4.29.0\n```\n\n## ✏️ Required Data\n\n* [SceneFlow](https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html)\n* [KITTI](https://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo)\n* [ETH3D](https://www.eth3d.net/datasets)\n* [Middlebury](https://vision.middlebury.edu/stereo/submit3/)\n\n## ✈️ Model weights\n\n| Model | Link |\n|:----:|:----:|\n| SceneFlow | [Google Drive](https://drive.google.com/drive/folders/1UoY7Yam0MA2qUI1GIVll0owH4tMTpzw7?usp=drive_link) |\n| RVC (mix of all training datasets) | [Google Drive](https://drive.google.com/drive/folders/1UoY7Yam0MA2qUI1GIVll0owH4tMTpzw7?usp=drive_link) |\n\nThe RVC (mix_all) model is trained on all the datasets mentioned above and has the best performance on 
zero-shot generalization.\nThe model weights can be placed in the `ckpt` folder.\n\n## ✈️ Evaluation\n\nTo evaluate the zero-shot performance of SMoE-Stereo on Scene Flow, KITTI, ETH3D, vkitti, DrivingStereo, or Middlebury, run\n\n```Shell\n# --eval_dataset takes one of: eth3d, kitti, middlebury, robust_weather, robust\npython evaluate_stereo.py --resume ./pretrained/damv2_sceneflow.pth --eval_dataset eth3d\n```\nor use the model trained on all datasets, which is better for zero-shot generalization.\n```Shell\n# --eval_dataset takes one of: eth3d, kitti, sceneflow, vkitti, driving\npython evaluate_stereo.py --resume ./pretrained/SMoEStereo_RVC.pth --eval_dataset kitti\n```\n## Citation\n\nIf you find our work useful in your research, please consider citing our paper:\n\n```bibtex\n@article{wang2025moe,\n  title={Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts},\n  author={Yun Wang and Longguang Wang and Chenghao Zhang and Yongjian Zhang and Zhanjie Zhang and Ao Ma and Chenyou Fan and Tin Lun Lam and Junjie Hu},\n  journal={arXiv preprint arXiv:2507.04631},\n  year={2025}\n}\n```\n\n\n## Acknowledgements\n\nThis project is based on [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo) and [GMStereo](https://github.com/autonomousvision/unimatch). We thank the original authors for their excellent work.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcocowy1%2Fsmoe-stereo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcocowy1%2Fsmoe-stereo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcocowy1%2Fsmoe-stereo/lists"}