{"id":13441485,"url":"https://github.com/valeoai/PointBeV","last_synced_at":"2025-03-20T12:31:08.108Z","repository":{"id":210258900,"uuid":"726138443","full_name":"valeoai/PointBeV","owner":"valeoai","description":"Official implementation of PointBeV: A Sparse Approach to BeV Predictions","archived":false,"fork":false,"pushed_at":"2024-03-07T14:15:43.000Z","size":55217,"stargazers_count":71,"open_issues_count":1,"forks_count":7,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-08-01T03:34:44.307Z","etag":null,"topics":["autonomous-driving","birdeyeview","driving","segmentation","sparse-coding"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/valeoai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-12-01T16:02:45.000Z","updated_at":"2024-07-29T08:06:10.000Z","dependencies_parsed_at":"2024-01-16T02:45:24.774Z","dependency_job_id":"637e0027-cf82-41da-9381-b53ccacbe19a","html_url":"https://github.com/valeoai/PointBeV","commit_stats":null,"previous_names":["valeoai/pointbev"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valeoai%2FPointBeV","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valeoai%2FPointBeV/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valeoai%2FPointBeV/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/valeoai%2FPointBeV/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/valeoai","download_url":"https://codeload.github.com/valeoai/PointBeV/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221760010,"owners_count":16876334,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["autonomous-driving","birdeyeview","driving","segmentation","sparse-coding"],"created_at":"2024-07-31T03:01:34.570Z","updated_at":"2025-03-20T12:31:08.101Z","avatar_url":"https://github.com/valeoai.png","language":"Python","readme":"# Official PyTorch Implementation of *PointBeV: A Sparse Approach to BeV Predictions*\n\n\n\u003e [**PointBeV: A Sparse Approach to BeV Predictions**](https://arxiv.org/abs/2312.00703)\u003cbr\u003e\n\u003e [Loick Chambon](https://loickch.github.io/), [Eloi Zablocki](https://scholar.google.fr/citations?user=dOkbUmEAAAAJ\u0026hl=fr), [Mickael Chen](https://sites.google.com/view/mickaelchen/), [Florent Bartoccioni](https://f-barto.github.io/), [Patrick Perez](https://ptrckprz.github.io/), [Matthieu Cord](https://cord.isir.upmc.fr/).\u003cbr\u003e Valeo AI, Sorbonne University\n\n\n\u003cdiv align=\"center\"\u003e\n\u003ctable\u003e\n  
## ✏️ Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entry and putting a star on this repository.

```
@inproceedings{chambon2024pointbev,
      title={PointBeV: A Sparse Approach to BeV Predictions},
      author={Loick Chambon and Eloi Zablocki and Mickael Chen and Florent Bartoccioni and Patrick Perez and Matthieu Cord},
      year={2024},
      booktitle={CVPR}
}
```

## Updates:
* 【28/02/2024】 Code released.
* 【27/02/2024】 [PointBeV](https://arxiv.org/abs/2312.00703) has been accepted to CVPR 2024.

# 🚀 Main results

### 🔥 Vehicle segmentation
PointBeV is originally designed for vehicle segmentation. It can be used with different sampling patterns and different memory/performance trade-offs. It can also be used with a temporal context to improve the segmentation.

*Vehicle segmentation of various static models at 448x800 image resolution with visibility filtering. More details can be found in our paper.*

| Models | [PointBeV (ours)](https://arxiv.org/abs/2312.00703) | [BAEFormer](https://openaccess.thecvf.com/content/CVPR2023/html/Pan_BAEFormer_Bi-Directional_and_Early_Interaction_Transformers_for_Birds_Eye_View_CVPR_2023_paper.html) | [SimpleBeV](https://arxiv.org/abs/2206.07959) | [BEVFormer](https://arxiv.org/abs/2203.17270) | [CVT](https://arxiv.org/abs/2205.02833) |
| ------ | ------ | ------ | ------ | ------ | ------ |
| IoU    | 47.6   | 41.0   | 46.6   | 45.5   | 37.7   |

Below we illustrate the model output. On the ground truth, we distinguish vehicles with low visibility (vis < 40%) in light blue from those with higher visibility (vis > 40%) in dark blue. PointBeV is able to segment vehicles with low visibility, which is challenging for other models; these often correspond to occluded vehicles.

<img src='./imgs/vehicle_segm.gif'>

We also illustrate the results of a temporal model on random samples taken from the nuScenes validation set. The model used for the visualisation is trained without filtering, at resolution 448x800.

<img src='./imgs/vehicle_segm_temp.gif'>

### ✨ Sparse inference

PointBeV can perform inference with far fewer points than other models. We illustrate this below with a vehicle segmentation model: PointBeV maintains a similar performance while evaluating only about 1/10 of the points used by dense models, thanks to its sparse approach. The sampling mask is shown in green; predictions are only computed at the sampled points.

<img src='./imgs/vehicle_segm_sparse_inference.gif'>
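The mechanics can be sketched in a few lines: choose a subset of BeV cells (here a regular pattern keeping roughly 1/10 of the grid), run the model only on those cells, and scatter the sparse predictions back into a dense map. The helper below is an illustration only, with made-up names and a random tensor standing in for the model output; it is not the repository's sampling code.

```python
import torch

def regular_sampling_mask(h, w, keep_ratio=0.1):
    """Build a regular sub-sampling mask over an (h, w) BeV grid.

    Keeps roughly `keep_ratio` of the cells by striding both axes.
    """
    stride = max(1, int(round((1.0 / keep_ratio) ** 0.5)))
    mask = torch.zeros(h, w, dtype=torch.bool)
    mask[::stride, ::stride] = True
    return mask

# Predict only at the sampled cells, then scatter back into a dense map.
h, w = 200, 200
mask = regular_sampling_mask(h, w, keep_ratio=0.1)
idx = mask.nonzero(as_tuple=False)           # (N, 2) coordinates of sampled cells
sparse_logits = torch.randn(idx.shape[0])    # stand-in for the sparse model output
dense = torch.zeros(h, w)
dense[idx[:, 0], idx[:, 1]] = sparse_logits  # only the sampled cells are filled
print(mask.float().mean())                   # ~0.1 of the grid is evaluated
```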
### 🔥 Pedestrian and lane segmentation

PointBeV can also be used for other segmentation tasks, such as pedestrian or HD-map segmentation.

*Pedestrian segmentation of various static models at 224x480 image resolution. More details can be found in our paper.*

| Models | [PointBeV (ours)](https://arxiv.org/abs/2312.00703) | TBP-Former | ST-P3 | FIERY | LSS |
| ------ | ------ | ------ | ------ | ------ | ------ |
| IoU    | 18.5   | 17.2   | 14.5   | 17.2   | 15.0   |

<img src='./imgs/pedes_segm.gif'>

### 🔥 Lane segmentation

*Lane segmentation of various static models at different resolutions. More details can be found in our paper.*

| Models | [PointBeV (ours)](https://arxiv.org/abs/2312.00703) | MatrixVT | M2BeV | PeTRv2 | BeVFormer |
| ------ | ------ | ------ | ------ | ------ | ------ |
| IoU    | 49.6   | 44.8   | 38.0   | 44.8   | 25.7   |
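The IoU values reported in these tables are BeV mask IoUs computed on the nuScenes validation set. For reference only, here is a minimal sketch of how such a score can be computed from a predicted map and a ground-truth map; the function name and threshold are illustrative assumptions, not the repository's metric code.

```python
import torch

def bev_iou(pred_probs, gt, threshold=0.5):
    """BeV mask IoU: intersection over union of binarised occupancy maps.

    pred_probs: (B, H, W) predicted probabilities.
    gt:         (B, H, W) binary ground-truth occupancy.
    """
    pred = pred_probs > threshold
    gt = gt.bool()
    inter = (pred & gt).sum().float()
    union = (pred | gt).sum().float()
    return (inter / union.clamp(min=1)).item()

# Toy example on random maps.
pred = torch.rand(1, 200, 200)
gt = (torch.rand(1, 200, 200) > 0.9).float()
print(f"IoU: {bev_iou(pred, gt):.3f}")
```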
## 🔨 Setup <a name="setup"></a>

➡️ Create the environment.
```bash
git clone https://github.com/valeoai/PointBeV.git
cd PointBeV
micromamba create -f environment.yaml -y
micromamba activate pointbev
```

➡️ Install the CUDA dependencies.
```bash
cd pointbev/ops/gs; python setup.py build install; cd -
```

➡️ Datasets.

We use the nuScenes dataset for our experiments. You can download it from the official website: https://www.nuscenes.org/nuscenes.
```bash
mkdir data
ln -s /path/to/nuscenes data/nuScenes
pytest tests/test_datasets.py
```
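As a complement to the pytest check, the symlink can also be probed directly with the nuscenes-devkit. This is a sketch under the assumption that the devkit is available in the created environment; the mini split is used as an example.

```python
# Quick check that data/nuScenes points to a valid nuScenes installation.
# Assumes the nuscenes-devkit package is available in the environment.
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version="v1.0-mini", dataroot="data/nuScenes", verbose=True)
print(f"{len(nusc.sample)} samples found in the mini split")
```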
➡️ Backbones.

Backbones are downloaded the first time the code is run. We store them in a dedicated folder so that the weights can be retrieved quickly for subsequent runs.
```bash
wget https://download.pytorch.org/models/resnet50-0676ba61.pth -P backbones
wget https://download.pytorch.org/models/resnet101-63fe2227.pth -P backbones
wget https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b4-6ed6700e.pth -P backbones
```

**Optional:**
Preprocess the dataset to train HD-map models.
Building HD maps 'on the fly' can slow down the dataloader, so we strongly advise saving the preprocessed dataset.
```bash
python pointbev/data/dataset/create_maps.py --split val train --version=trainval
python pointbev/data/dataset/create_maps.py --split mini_val mini_train --version=mini
```

The directory will be organised as follows.
```
PointBeV
├── data
│   ├── nuScenes
│   │   ├── samples
│   │   ├── sweeps
│   │   ├── v1.0-mini
│   │   ├── v1.0-trainval
│   ├── nuscenes_processed_map
│   │   ├── label
│   │   │   ├── mini_train
│   │   │   ├── mini_val
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── map_0.1
```

## 🔄 Training <a name="training"></a>

Sanity check.
```bash
pytest tests/test_model.py
```

Overfitting.
```bash
python pointbev/train.py flags.debug=True task_name=debug
```

Training with simple options:
```bash
# model/net/backbone=efficientnet      -> specify the backbone
# data.batch_size=8                    -> training batch size
# data.valid_batch_size=24             -> can differ from the training batch size to speed up validation
# data.img_params.min_visibility=1     -> with or without the visibility filtering
# data/augs@data.img_params=scale_0_3  -> image resolution
# task_name=folder                     -> where to save the experiment in the logs folder
python pointbev/train.py \
  model/net/backbone=efficientnet \
  data.batch_size=8 \
  data.valid_batch_size=24 \
  data.img_params.min_visibility=1 \
  data/augs@data.img_params=scale_0_3 \
  task_name=folder
```

If you want to train with the reproduced code of static BeVFormer (by specifying `model=BeVFormer`), do not forget to compile its CUDA dependency.
```bash
cd pointbev/ops/defattn; python setup.py build install; cd -
```

Then select the BeVFormer model when running the code:
```bash
python pointbev/train.py model=BeVFormer
```

## 🔄 Evaluation <a name="evaluating"></a>

To evaluate a checkpoint, do not forget to specify the actual resolution and the visibility filtering that were applied.
```bash
python pointbev/train.py train=False test=True task_name=eval \
  ckpt.path=PATH_TO_CKPT \
  model/net/backbone=efficientnet \
  data/augs@data.img_params=scale_0_5 \
  data.img_params.min_visibility=1
```

If you evaluate a pedestrian or an HD-map model, do not forget to change the annotations.
```bash
python pointbev/train.py train=False test=True task_name=eval \
  ckpt.path=PATH_TO_CKPT \
  model/net/backbone=resnet50 \
  data/augs@data.img_params=scale_0_3 \
  data.img_params.min_visibility=2 \
  data.filters_cat="[pedestrian]" # Instead of filtering vehicles, we filter pedestrians for GT.
```

If you evaluate a temporal model, do not forget to change the model and the temporal frames.
```bash
python pointbev/train.py train=False test=True task_name=eval \
  model=PointBeV_T \
  data.cam_T_P='[[-8,0],[-7,0],[-6,0],[-5,0],[-4,0],[-3,0],[-2,0],[-1,0],[0,0]]' \
  ckpt.path=PATH_TO_CKPT \
  model/net/backbone=resnet50 \
  data/augs@data.img_params=scale_0_3 \
  data.img_params.min_visibility=2 \
  data.filters_cat="[pedestrian]"
```

About the temporal frames, `T_P` stands for 'Time_Pose'; a short sketch for building such lists follows the examples below. For instance:
- `[[-1,0]]` outputs the T=-1 BeV at the T=0 location.
- `[[0,-1]]` outputs the T=0 BeV at the T=-1 location.
- `[[-8,0],[-7,0],[-6,0],[-5,0],[-4,0],[-3,0],[-2,0],[-1,0],[0,0]]` outputs the T=-8 to T=0 BeVs at the T=0 location.
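A minimal sketch for building such a list programmatically (a convenience only; the literal override strings shown above work just as well):

```python
# Build the (Time, Pose) pairs for a temporal model that aggregates the last
# n_past frames and predicts at the current pose.
n_past = 8
cam_T_P = [[-t, 0] for t in range(n_past, -1, -1)]
print(cam_T_P)  # nine pairs, from [-8, 0] down to [0, 0]

# Passed to Hydra as:
# data.cam_T_P='[[-8,0],[-7,0],[-6,0],[-5,0],[-4,0],[-3,0],[-2,0],[-1,0],[0,0]]'
```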
# Checkpoints

| Backbone | Resolution | Visibility | IoU |
| -------- | ---------- | ---------- | --- |
| Eff-b4   | 224x480    | 1          | [38.69](https://github.com/valeoai/PointBeV/releases/download/checkpoints/38_69.ckpt) |
| Eff-b4   | 448x800    | 1          | [42.09](https://github.com/valeoai/PointBeV/releases/download/checkpoints/42_09.ckpt) |
| Eff-b4   | 224x480    | 2          | [43.97](https://github.com/valeoai/PointBeV/releases/download/checkpoints/43_97.ckpt) |
| Eff-b4   | 448x800    | 2          | [47.58](https://github.com/valeoai/PointBeV/releases/download/checkpoints/47_58.ckpt) |

## 👍 Acknowledgements

Many thanks to these excellent open-source projects:
* https://github.com/nv-tlabs/lift-splat-shoot
* https://github.com/aharley/simple_bev
* https://github.com/fundamentalvision/BEVFormer

To structure our code we used this template:
https://github.com/ashleve/lightning-hydra-template

## Todo:
- [x] Release other checkpoints.