{"id":18792490,"url":"https://github.com/prbonn/tarl","last_synced_at":"2025-04-13T14:31:22.941Z","repository":{"id":157848577,"uuid":"608568656","full_name":"PRBonn/TARL","owner":"PRBonn","description":"TARL: Temporal Consistent 3D LiDAR Representation Learning for Semantic Perception in Autonomous Driving","archived":false,"fork":false,"pushed_at":"2023-12-19T09:15:24.000Z","size":287,"stargazers_count":86,"open_issues_count":0,"forks_count":7,"subscribers_count":7,"default_branch":"main","last_synced_at":"2023-12-19T11:40:18.306Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PRBonn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-03-02T09:39:21.000Z","updated_at":"2023-12-11T13:18:00.000Z","dependencies_parsed_at":"2023-12-19T10:47:51.128Z","dependency_job_id":null,"html_url":"https://github.com/PRBonn/TARL","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PRBonn%2FTARL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PRBonn%2FTARL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PRBonn%2FTARL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PRBonn%2FTARL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PRBonn","download_url":"https://codeload.github.com/PRBonn/TARL/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kin
d":"github","repositories_count":223589773,"owners_count":17170035,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T21:20:06.291Z","updated_at":"2024-11-07T21:20:07.545Z","avatar_url":"https://github.com/PRBonn.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TARL: Temporal Consistent 3D LiDAR Representation Learning for Semantic Perception in Autonomous Driving\n\n**[Paper](http://www.ipb.uni-bonn.de/pdfs/nunes2023cvpr.pdf)** **|** **[Sup. material](http://www.ipb.uni-bonn.de/pdfs/nunes2023cvpr-supmaterial.pdf)** **|** **[Video](https://www.youtube.com/watch?v=0CtDbwRYLeo)**\n\nThis repo contains the code for the self-supervised pre-training method proposed in the CVPR'23 paper: Temporal Consistent 3D LiDAR Representation Learning for Semantic Perception in Autonomous Driving.\n\nOur approach extracts temporal views as augmented versions of the same object. We aggregate sequential LiDAR scans, and by removing the ground (in an unsupervised manner) and clustering the remaining points we define coarse segments of objects in the scene to be used for self-supervised pre-training. We evaluate our pre-training by fine-tuning the pre-trained model on different downstream tasks. 
In our experiments, we show that our approach can significantly reduce the number of labels needed to achieve the same performance as a network trained from scratch on the full training set.\n\n![](pics/tarl_diagram.png)\n\n## Dependencies\n\nTo run our code, first install the dependencies with:\n\n```\nsudo apt install build-essential python3-dev libopenblas-dev\npip3 install -r requirements.txt\n```\n\nThen install MinkowskiEngine from the official repo:\n\n```\npip3 install -U git+https://github.com/NVIDIA/MinkowskiEngine --install-option=\"--blas=openblas\" -v --no-deps\n```\n\nFinally, set up the code with:\n\n`pip3 install -U -e .`\n\n## SemanticKITTI Dataset\n\nThe SemanticKITTI dataset has to be downloaded from the official [site](http://www.semantic-kitti.org/dataset.html#download) and extracted into the following structure:\n\n```\n./\n└── Datasets/\n    └── SemanticKITTI\n        └── dataset\n          └── sequences\n            ├── 00/\n            │   ├── velodyne/\n            │   │       ├── 000000.bin\n            │   │       ├── 000001.bin\n            │   │       └── ...\n            │   └── labels/\n            │       ├── 000000.label\n            │       ├── 000001.label\n            │       └── ...\n            ├── 08/ # for validation\n            ├── 11/ # 11-21 for testing\n            └── 21/\n                └── ...\n```\n\nFor the unsupervised ground segmentation, you need to run [patchwork](https://github.com/LimHyungTae/patchwork) over the SemanticKITTI dataset and place the generated files under:\n```\n./\n└── Datasets/\n    └── SemanticKITTI\n        └── assets\n            └── patchwork\n                ├── 08\n                    ├── 000000.label\n                    ├── 000001.label\n                    └── ...\n```\n\nFor SemanticKITTI, the ground segment labels used in\nour pre-training are available [here](https://www.ipb.uni-bonn.de/html/projects/tarl/ground_labels.zip).\n\n**Note** that using 
patchwork is **not** the only way to obtain a ground prediction. For example, one could use the RANSAC implementation\nfrom Open3D, which also gives a ground estimate (as done by SegContrast). We have implemented this option as well: to use RANSAC, set\nthe flag `use_ground_pred: False` in the config file. However, we recommend patchwork since its ground segmentation is more accurate.\n\n## Custom dataset\n\nTo use our pre-training with a different dataset, we provide some instructions in [NEW_DATA.md](https://github.com/PRBonn/TARL/blob/main/tarl/datasets/NEW_DATA.md).\n\n## Running the code\n\nThe command to run the pre-training is:\n\n```\npython3 tarl_train.py\n```\n\nThe parameters used in our experiments are already set in `config/config.yaml`.\n\nNote that the first epoch takes longer to train since we generate the segments over the aggregated scans and save them to disk. After\nthe first epoch, training should be faster since the segments only need to be loaded from disk.\n\n## Docker\n\nWe also provide a `Dockerfile` in the `docker/` directory to make things easier to run. If you need to change anything\nregarding CUDA, you can build the docker image locally with:\n\n```docker build . 
-t nuneslu/tarl:latest```\n\nIf no changes are needed, you can run it directly with docker-compose and the image will be downloaded from Docker Hub:\n\n```CUDA_VISIBLE_DEVICES=0 docker-compose run pretrain python3 tarl_train.py```\n\n## Pre-trained weights\n\n- TARL MinkUNet pre-trained [weights](https://www.ipb.uni-bonn.de/html/projects/tarl/lastepoch199_model_tarl.pt)\n\n---\n\n# Fine-tuning\n\nFor fine-tuning we used the repositories of the baselines, so after pre-training with TARL you should copy the pre-trained weights to the target task's repository and use them for fine-tuning.\n\n## Semantic segmentation\n\n|Method         | Scribbles | 0.1% | 1% | 10% | 50% | 100% |\n|----------------|-----------|--------|--------|--------|--------|--------|\n|No pre-training         | 54.96% | 29.35% | 42.77% | 53.96% | 58.27% | 59.03% |\n|PointContrast   | 54.52% | 32.63% | 44.62% | 58.68% | 59.98% | 61.45% |\n|DepthContrast   | 55.90% | 31.66% | 48.05% | 57.11% | 60.99% | 61.14% |\n|SegContrast     | 56.70% | 32.75% | 44.83% | 56.31% | 60.45% | 61.02% |\n|**TARL (Ours)** |**57.25%**|**38.59%**|**51.42%**|**60.34%**|**61.42%**|**61.47%**|\n\n\nFor fine-tuning to semantic segmentation we refer to the SegContrast [repo](https://github.com/PRBonn/segcontrast).\nClone the repo with `git clone https://github.com/PRBonn/segcontrast.git` and follow the installation instructions. Note that the requirements of\nTARL and segcontrast are similar since both use `MinkowskiEngine`, so you should be able to reuse the TARL environment and just install\nthe missing packages. 
**NOTE:** we **replaced** the SegContrast optimizer *`SGD`* with **`AdamW`** and **removed** the learning rate scheduler.\n\nAfter setting up the packages, copy the pre-trained model from `TARL/tarl/experiments/TARL/default/version_0/checkpoints/last.ckpt` to `segcontrast/checkpoint/contrastive/lastepoch199_model_tarl.pt` and run the following command:\n\n```\npython3 downstream_train.py --use-cuda --use-intensity --checkpoint \\\n        tarl --contrastive --load-checkpoint --batch-size 2 \\\n        --sparse-model MinkUNet --epochs 15\n```\n\n## Panoptic segmentation\n\nFor the panoptic segmentation task, we refer to this [repo](https://github.com/PRBonn/MinkowskiPanoptic), which contains our implementation of the baseline used to evaluate this task in the paper.\n\n## Object detection\n\nFor object detection we used the OpenPCDet [repo](https://github.com/zaiweizhang/OpenPCDet) with a few modifications. In this docker [image](https://hub.docker.com/r/nuneslu/segcontrast_openpcdet) we have set up everything to run it with `MinkUNet` and to load our pre-trained weights.\nCopy the weights to `/tmp/OpenPCDet/pretrained/lastepoch199_model_tarl.pt` inside the container and then run the command:\n\n```\ncd /tmp/OpenPCDet/tools\npython3 train.py --cfg_file cfgs/kitti_models/tarl_pretrained.yaml --pretrained_model ../pretrained/lastepoch199_model_tarl.pt\n```\n\n# Citation\n\nIf you use this repo, please cite as:\n\n```\n@inproceedings{nunes2023cvpr,\n    author = {Lucas Nunes and Louis Wiesmann and Rodrigo Marcuzzi and Xieyuanli Chen and Jens Behley and Cyrill Stachniss},\n    title = {{Temporal Consistent 3D LiDAR Representation Learning for Semantic Perception in Autonomous Driving}},\n    booktitle = {{Proc. of the IEEE/CVF Conf. 
on Computer Vision and Pattern Recognition (CVPR)}},\n    year = {2023}\n}\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprbonn%2Ftarl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprbonn%2Ftarl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprbonn%2Ftarl/lists"}