{"id":13443379,"url":"https://github.com/MCG-NJU/CamLiFlow","last_synced_at":"2025-03-20T16:31:10.062Z","repository":{"id":40566590,"uuid":"465590042","full_name":"MCG-NJU/CamLiFlow","owner":"MCG-NJU","description":"[CVPR 2022 Oral \u0026 TPAMI 2023] Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion","archived":false,"fork":false,"pushed_at":"2024-07-29T16:01:09.000Z","size":2588,"stargazers_count":223,"open_issues_count":1,"forks_count":21,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-10-28T06:58:07.034Z","etag":null,"topics":["cvpr2022","multimodal","optical-flow","point-cloud","scene-flow"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2303.12017","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MCG-NJU.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-03T06:06:52.000Z","updated_at":"2024-10-24T08:24:25.000Z","dependencies_parsed_at":"2023-12-22T13:04:01.375Z","dependency_job_id":"c7a57cba-ceb2-44cc-993b-74f47a6b26a3","html_url":"https://github.com/MCG-NJU/CamLiFlow","commit_stats":{"total_commits":13,"total_committers":1,"mean_commits":13.0,"dds":0.0,"last_synced_commit":"3ad23adfa702c970f14547f9f3180da6742ca150"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCG-NJU%2FCamLiFlow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCG-NJU%2FCamLiFlow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories/MCG-NJU%2FCamLiFlow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCG-NJU%2FCamLiFlow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MCG-NJU","download_url":"https://codeload.github.com/MCG-NJU/CamLiFlow/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244649813,"owners_count":20487496,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cvpr2022","multimodal","optical-flow","point-cloud","scene-flow"],"created_at":"2024-07-31T03:01:59.897Z","updated_at":"2025-03-20T16:31:09.191Z","avatar_url":"https://github.com/MCG-NJU.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# CamLiFlow \u0026 CamLiRAFT\n\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-optical-flow-and-scene-flow-with/optical-flow-estimation-on-kitti-2015)](https://paperswithcode.com/sota/optical-flow-estimation-on-kitti-2015?p=learning-optical-flow-and-scene-flow-with)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-optical-flow-and-scene-flow-with/scene-flow-estimation-on-kitti-2015-scene-1)](https://paperswithcode.com/sota/scene-flow-estimation-on-kitti-2015-scene-1?p=learning-optical-flow-and-scene-flow-with)\n\nThis is the official PyTorch implementation for our two papers: \n\n* Conference version: [CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation](https://arxiv.org/abs/2111.10502). 
(CVPR 2022 Oral)\n\n* Extended version (CamLiRAFT): [Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion](https://arxiv.org/abs/2303.12017). (TPAMI 2023)\n\nExplanation in Chinese: [https://zhuanlan.zhihu.com/p/616384758](https://zhuanlan.zhihu.com/p/616384758)\n\n![](asserts/banner.jpg)\n\n## Changes to the Conference Paper\n\nIn this extended version, we instantiate a new variant of the bidirectional fusion pipeline, **CamLiRAFT**, based on recurrent all-pairs field transforms (RAFT). CamLiRAFT obtains significant performance improvements over the original PWC-based CamLiFlow and sets a new state-of-the-art record on various datasets.\n\n* **Comparison with stereo scene flow methods**: On FlyingThings3D, CamLiRAFT achieves 1.73 EPE2D and 0.049 EPE3D, 21\\% and 20\\% lower error compared to CamLiFlow. On KITTI, even the non-rigid CamLiRAFT performs on par with the previous state-of-the-art method RigidMask (SF-all: 4.97\\% vs. 4.89\\%). By refining the background scene flow with rigid priors, CamLiRAFT further achieves an error of 4.26\\%, ranking **first** on the [leaderboard](http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php).\n\n* **Comparison with LiDAR-only scene flow methods**: The LiDAR-only variant of our method, dubbed CamLiRAFT-L, also outperforms all previous LiDAR-only scene flow methods in terms of both accuracy and speed (see Tab. 5 in the paper). Thus, CamLiRAFT-L can also serve as a strong baseline for LiDAR-only scene flow estimation.\n\n* **Comparison on MPI Sintel**: Without finetuning on Sintel, CamLiRAFT achieves 2.38 AEPE on the final pass of the Sintel training set, reducing the error by 12% and 18% relative to RAFT and RAFT-3D, respectively. 
This demonstrates that our method has good generalization performance and can handle non-rigid motion.\n\n* **Training schedule**: The original CamLiFlow requires a complicated training schedule of Things (L2 loss) -\u003e Things (Robust loss) -\u003e Driving -\u003e KITTI and takes about 10 days to train. CamLiRAFT simplifies the schedule to Things -\u003e KITTI, and the training only takes about 3 days. (Tested on 4x RTX 3090 GPUs)\n\n## News\n\n* 2023-11-05: CamLiRAFT is accepted to TPAMI. Thanks for the valuable suggestions from the reviewers!\n* 2023-09-20: We provide a demo for CamLiRAFT, see `demo.py` for more details.\n* 2023-03-22: We release CamLiRAFT, an extended version of CamLiFlow on [https://arxiv.org/abs/2303.12017](https://arxiv.org/abs/2303.12017).\n* 2022-03-29: Our paper is selected for an **oral** presentation. \n* 2022-03-07: We release the code and the pretrained weights.\n* 2022-03-03: Our paper is accepted by **CVPR 2022**.\n* 2021-11-20: Our paper is available at [https://arxiv.org/abs/2111.10502](https://arxiv.org/abs/2111.10502)\n* 2021-11-04: Our method ranked **first** on the leaderboard of [KITTI Scene Flow](http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php).\n\n## Pretrained Weights\n\n| Model | Training set | Weights | Comments |\n|-------|--------------|---------|----------|\n| CamLiRAFT | Things (80e) | [camliraft_things80e.pt](https://drive.google.com/file/d/1nTh4Mugy5XltjcJHa7Byld2KIQ1IXrbm/view?usp=sharing) | Best generalization performance |\n| CamLiRAFT | Things (150e) | [camliraft_things150e.pt](https://drive.google.com/file/d/1BEuKy5WMbaABW5Wz-Gx879kcNJ2Zla2Z/view?usp=sharing) | Best performance on Things | \n| CamLiRAFT | Things (150e) -\u003e KITTI (800e) | [camliraft_things150e_kitti800e.pt](https://drive.google.com/file/d/18rBJpy1Bero9dM6HfqKfdZqE4vpU84aD/view?usp=sharing) | Best performance on KITTI |\n| CamLiRAFT-L | Things-Occ (100e) | 
[camliraft_l_best_things_occ.pt](https://drive.google.com/file/d/1mEpFtI4-lfFqE1X8ZYBiWSPXudn8SCUx/view?usp=share_link) | Best performance on Things-Occ |\n| CamLiRAFT-L | Things-Occ (100e) | [camliraft_l_best_kitti_occ.pt](https://drive.google.com/file/d/1917Lt2iDSL7DqvnOwR0PEsYX8XEabRCJ/view?usp=share_link) | Best generalization performance on KITTI-Occ |\n| CamLiRAFT-L | Things-Noc (100e) | [camliraft_l_best_things_noc.pt](https://drive.google.com/file/d/1zOO2eOclkfsqXSRkoO4716dM6nEHZXk2/view?usp=share_link) | Best performance on Things-Noc |\n| CamLiRAFT-L | Things-Noc (100e) | [camliraft_l_best_kitti_noc.pt](https://drive.google.com/file/d/1aiqnwCWibN5InvSGAncYDFe0XQPDYbox/view?usp=share_link) | Best generalization performance on KITTI-Noc |\n\n\u003e Things-Occ means \"occluded FlyingThings3D\" and Things-Noc means \"non-occluded FlyingThings3D\". Same for KITTI-Occ and KITTI-Noc.\n\n## Precomputed Results\n\nHere, we provide precomputed results for the submission to the online benchmark of [KITTI Scene Flow](http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php). 
\\* denotes refining the background scene flow with rigid priors.\n\n|  Model  | D1-all | D2-all | Fl-all | SF-all | Link |\n|---------|--------|--------|--------|--------|------|\n| CamLiFlow    | 1.81%  | 3.19%  | 4.05%  | 5.62%  | [camliflow-wo-refine.zip](https://drive.google.com/file/d/1zfH-uS9MxgZ8JZwUjNNHq7vASz1WD7SW/view?usp=sharing) |\n| CamLiFlow \\* | 1.81%  | 2.95%  | 3.10%  | 4.43%  | [camliflow.zip](https://drive.google.com/file/d/1qi7zxSmEDcCA1ChwVHv6_eyNSXVxez7x/view?usp=sharing) |\n| CamLiRAFT    | 1.81%  | 3.02%  | 3.43%  | 4.97%  | [camliraft-wo-refine.zip](https://drive.google.com/file/d/1H3x_OBRsVteDb7i5gaaY7cDc9ZQxUnZy/view?usp=sharing) |\n| CamLiRAFT \\* | 1.81%  | 2.94%  | 2.96%  | 4.26%  | [camliraft.zip](https://drive.google.com/file/d/1mzL5vKIOg-boBgknaxssuaiGcqvUybrV/view?usp=sharing) |\n\n## Environment\n\nCreate a PyTorch environment using `conda`:\n\n```\nconda create -n camliraft python=3.7\nconda activate camliraft\nconda install pytorch==1.10.2 torchvision==0.11.3 cudatoolkit=11.3 -c pytorch\n```\n\nInstall mmcv and mmdet:\n\n```\npip install openmim\nmim install mmcv-full==1.4.0\nmim install mmdet==2.14.0\n```\n\nInstall other dependencies:\n\n```\npip install opencv-python open3d tensorboard hydra-core==1.1.0\n```\n\nCompile CUDA extensions for faster training and evaluation:\n\n```\ncd models/csrc\npython setup.py build_ext --inplace\n```\n\nDownload the ResNet-50 pretrained on ImageNet-1k:\n\n```\nwget https://download.pytorch.org/models/resnet50-11ad3fa6.pth\nmkdir pretrain\nmv resnet50-11ad3fa6.pth pretrain/\n```\n\nNG-RANSAC is also required if you want to evaluate on KITTI. 
Please follow [https://github.com/vislearn/ngransac](https://github.com/vislearn/ngransac) to install the library.\n\n## Demo\n\nThen, run the following script to launch a demo of estimating optical flow and scene flow from a pair of images and point clouds:\n\n```\npython demo.py --model camliraft --weights /path/to/camliraft/checkpoint.pt\n```\n\nNote that CamLiRAFT is not very robust to objects at a greater distance, as the network has only been trained on data with a depth of less than 35m. If you are getting bad results on your own data, try scaling the depth of the point cloud to a range of 5 ~ 35m.\n\n## Evaluate CamLiFlow and CamLiRAFT\n\n### FlyingThings3D\n\nFirst, download and preprocess the dataset (see `preprocess_flyingthings3d_subset.py` for detailed instructions):\n\n```\npython preprocess_flyingthings3d_subset.py --input_dir /mnt/data/flyingthings3d_subset\n```\n\nThen, download the pretrained weights [camliraft_things150e.pt](https://drive.google.com/file/d/1BEuKy5WMbaABW5Wz-Gx879kcNJ2Zla2Z/view?usp=sharing) and save it to `checkpoints/camliraft_things150e.pt`.\n\nNow you can reproduce the results in Table 2 (see the extended paper):\n\n```\npython eval_things.py testset=flyingthings3d_subset model=camliraft ckpt.path=checkpoints/camliraft_things150e.pt\n```\n\n### KITTI\n\nFirst, download the following parts:\n\n* Main data: [data_scene_flow.zip](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_scene_flow.zip)\n* Calibration files: [data_scene_flow_calib.zip](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_scene_flow_calib.zip)\n* Disparity estimation (from GA-Net): [disp_ganet.zip](https://drive.google.com/file/d/1ieFpOVzqCzT8TXNk1zm2d9RLkrcaI78o/view?usp=sharing)\n* Semantic segmentation (from DDR-Net): [semantic_ddr.zip](https://drive.google.com/file/d/1dVSJeE9BBmVv2rCe5TR0PVanEv2WzwIy/view?usp=sharing)\n\nUnzip them and organize the directory as follows:\n\n```\ndatasets/kitti_scene_flow\n├── testing\n│   ├── calib_cam_to_cam\n│   
├── calib_imu_to_velo\n│   ├── calib_velo_to_cam\n│   ├── disp_ganet\n│   ├── flow_occ\n│   ├── image_2\n│   ├── image_3\n│   ├── semantic_ddr\n└── training\n    ├── calib_cam_to_cam\n    ├── calib_imu_to_velo\n    ├── calib_velo_to_cam\n    ├── disp_ganet\n    ├── disp_occ_0\n    ├── disp_occ_1\n    ├── flow_occ\n    ├── image_2\n    ├── image_3\n    ├── obj_map\n    ├── semantic_ddr\n```\n\nThen, download the pretrained weights [camliraft_things150e_kitti800e.pt](https://drive.google.com/file/d/18rBJpy1Bero9dM6HfqKfdZqE4vpU84aD/view?usp=sharing) and save it to `checkpoints/camliraft_things150e_kitti800e.pt`.\n\nTo reproduce the results **without** leveraging rigid-body assumptions (SF-all: 4.97%):\n\n```\npython kitti_submission.py testset=kitti model=camliraft ckpt.path=checkpoints/camliraft_things150e_kitti800e.pt\n```\n\nTo reproduce the results **with** rigid background refinement (SF-all: 4.26%), you need to further refine the background scene flow:\n\n```\npython refine_background.py\n```\n\nResults are saved to `submission/testing`. 
The initial non-rigid estimations are indicated by the `_initial` suffix.\n\n### Sintel\n\nFirst, download the flow dataset from http://sintel.is.tue.mpg.de and the depth dataset from https://sintel-depth.csail.mit.edu/landing\n\nUnzip them and organize the directory as follows:\n\n```\ndatasets/sintel\n├── depth\n│   ├── README_depth.txt\n│   ├── sdk\n│   └── training\n└── flow\n    ├── bundler\n    ├── flow_code\n    ├── README.txt\n    ├── test\n    └── training\n```\n\nThen, download the pretrained weights [camliraft_things80e.pt](https://drive.google.com/file/d/1nTh4Mugy5XltjcJHa7Byld2KIQ1IXrbm/view?usp=sharing) and save it to `checkpoints/camliraft_things80e.pt`.\n\nNow you can reproduce the results in Table 4 (see the extended paper):\n\n```\npython eval_sintel.py testset=sintel model=camliraft ckpt.path=checkpoints/camliraft_things80e.pt\n```\n\n## Evaluate CamLiRAFT-L\n\n### FlyingThings3D\n\nThere are two different ways of data preprocessing. The first setting is the one proposed by HPLFlowNet, which keeps only non-occluded points during preprocessing. The second setting, proposed by FlowNet3D, retains the occluded points.\n\n```\n# Non-occluded\npython eval_things_noc_sf.py testset=flyingthings3d_subset_hpl model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_things_noc.pt\n# Occluded\npython eval_things_occ_sf.py testset=flyingthings3d_subset_flownet3d model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_things_occ.pt\n```\n\n### KITTI\n\nAs with FlyingThings3D, there are two different ways of data preprocessing. 
We report results on both settings.\n\n```\n# Non-occluded\npython eval_kitti_noc_sf.py testset=kitti model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_kitti_noc.pt\n# Occluded\npython eval_kitti_occ_sf.py testset=kitti model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_kitti_occ.pt\n```\n\n## Training\n\n### FlyingThings3D\n\n\u003e You need to preprocess the FlyingThings3D dataset before training (see `preprocess_flyingthings3d_subset.py` for detailed instructions).\n\nTrain CamLiRAFT on FlyingThings3D (150 epochs):\n\n```\npython train.py trainset=flyingthings3d_subset valset=flyingthings3d_subset model=camliraft\n```\n\nThe entire training process takes about 3 days on 4x RTX 3090 GPUs.\n\n### KITTI\n\nFinetune the model on KITTI using the weights trained on FlyingThings3D:\n\n```\npython train.py trainset=kitti valset=kitti model=camliraft ckpt.path=checkpoints/camliraft_things150e.pt\n```\n\nThe entire training process takes about 0.5 days on 4x RTX 3090 GPUs. We use the last checkpoint (800th) to generate the submission.\n\n## Citation\n\nIf you find them useful in your research, please cite:\n\n```\n@article{liu2023learning,\n  title   = {Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion},\n  author  = {Haisong Liu and Tao Lu and Yihui Xu and Jia Liu and Limin Wang},\n  journal = {arXiv preprint arXiv:2303.12017},\n  year    = {2023}\n}\n\n@inproceedings{liu2022camliflow,\n  title     = {Camliflow: bidirectional camera-lidar fusion for joint optical flow and scene flow estimation},\n  author    = {Liu, Haisong and Lu, Tao and Xu, Yihui and Liu, Jia and Li, Wenjie and Chen, Lijun},\n  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  pages     = {5791--5801},\n  year      = 
{2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMCG-NJU%2FCamLiFlow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FMCG-NJU%2FCamLiFlow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMCG-NJU%2FCamLiFlow/lists"}