<div align="center">
  <h2>Particle Trajectory Representation Learning with Masked Point Modeling<br>(PoLAr-MAE)</h2>
</div>

<div align="center">
<a href="https://arxiv.org/abs/2502.02558">[Paper]</a>
<a href="./DATASET.md">[Dataset]</a>
<a href="https://youngsm.com/polarmae">[Project Site]</a>
<a href="./tutorial">[Tutorial]</a>
<a href="#citing-polar-mae">[BibTeX]</a>
</div>

\
![arch](images/arch.png)

## Installation

This codebase relies on a number of dependencies, some of which are difficult to get running.
If you're using conda on Linux, use the following to create an environment and install the dependencies:

```bash
conda env create -f environment.yml
conda activate polarmae

# Install pytorch3d
cd extensions
git clone https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d
MAX_JOBS=N pip install -e .

# Install C-NMS
cd ../cnms
MAX_JOBS=N pip install -e .

# Install polarmae
cd ../..  # should be in the root directory of the repository now
pip install -e .
```

> [!NOTE]
> environment.yml is a full environment specification, which includes an install of CUDA 12.4, PyTorch 2.1.5, and Python 3.9.
>
> `pytorch3d` and `cnms` are compiled from source, and will only be compiled for the CUDA device architecture of the visible GPU(s) available on the system.

Change `N` in `MAX_JOBS=N` to the number of cores you want to use when installing `pytorch3d` and `cnms`. At least 4 cores are recommended to compile `pytorch3d` in a reasonable amount of time.

If you'd like to do the installation on your own, you will need the following dependencies:

- [CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#)
- PyTorch 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.4.0, or 2.4.1
- gcc & g++ >= 4.9 and < 13
- [pytorch3d](https://github.com/facebookresearch/pytorch3d)
- [PyTorch Lightning](https://github.com/Lightning-AI/pytorch-lightning)
- [NumPy](https://github.com/numpy/numpy)
- [Lightning-Utilities](https://github.com/Lightning-AI/utilities)
- [Scikit-Learn](https://github.com/scikit-learn/scikit-learn)
- [Omegaconf](https://github.com/omry/omegaconf)
- [h5py](https://github.com/h5py/h5py)

`pytorch3d` is usually the most difficult dependency to install.
See the `pytorch3d` [INSTALL.md](https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md) for more details.

There are a couple of extra dependencies that are optional, but recommended:

```bash
conda install wandb jupyter matplotlib
```

## Tutorial Notebooks

Tutorial notebooks for understanding the dataset, model architecture, pretraining, and finetuning are available in the [`tutorial`](tutorial) directory.

## PILArNet-M Dataset

We use and provide the 156 GB **PILArNet-M** dataset of >1M [LArTPC](https://www.symmetrymagazine.org/article/october-2012/time-projection-chambers-a-milestone-in-particle-detector-technology?language_content_entity=und) events. See [DATASET.md](DATASET.md) for more details. The dataset is available at this [link](https://drive.google.com/drive/folders/1nec9WYPRqMn-_3m6TdM12TmpoInHDosb?usp=drive_link), or can be downloaded with the following command:

```bash
gdown --folder 1nec9WYPRqMn-_3m6TdM12TmpoInHDosb -O /path/to/save/dataset
```

> [!NOTE]
> `gdown` must be installed via e.g. `pip install gdown` or `conda install gdown`.

## Models

### Pretraining

| Model | Num. Events | Config | SVM $F_1$ | Download |
|-------|-------------|--------|-----------|----------|
| Point-MAE | 1M | [pointmae.yml](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/pointmae.yml) | 0.719 | [here](https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/mae_pretrain.ckpt) |
| PoLAr-MAE | 1M | [polarmae.yml](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/polarmae.yml) | 0.732 | [here](https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/polarmae_pretrain.ckpt) |

Our evaluation consists of training an ensemble of linear SVMs to classify individual tokens (i.e., groups) as containing one or more classes. This is done via a one-vs-rest strategy, where each SVM is trained to classify a single class against all others.
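The scoring behind this one-vs-rest protocol can be sketched as follows. This is a pure-Python stand-in with toy data: the actual evaluation trains a linear SVM per class on learned token features, which are omitted here, and the class names below are illustrative.

```python
from typing import Sequence

def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for one binary (one-vs-rest) classification task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if 2 * tp + fp + fn == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

def mean_ovr_f1(true_sets, pred_sets, classes):
    """Mean F1 over classes. A token may contain one or more classes,
    so each class is scored as an independent binary problem."""
    scores = []
    for c in classes:
        y_true = [int(c in s) for s in true_sets]
        y_pred = [int(c in s) for s in pred_sets]
        scores.append(binary_f1(y_true, y_pred))
    return sum(scores) / len(scores)

# Toy example: 4 tokens, 2 illustrative semantic classes.
true_sets = [{"track"}, {"track", "shower"}, {"shower"}, {"track"}]
pred_sets = [{"track"}, {"track"}, {"shower"}, {"track", "shower"}]
print(mean_ovr_f1(true_sets, pred_sets, ["track", "shower"]))  # → 0.75
```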
$F_1$ is the mean $F_1$ score over all semantic categories in the validation set of the PILArNet-M dataset.

After installing the dependencies, you can run the following commands in an interactive IPython session to load the pretrained model(s):

```python
>>> from polarmae.models.ssl import PointMAE, PoLArMAE
>>> !wget https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/mae_pretrain.ckpt
>>> !wget https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/polarmae_pretrain.ckpt
>>> model = PointMAE.load_from_checkpoint("mae_pretrain.ckpt")  # or
>>> model = PoLArMAE.load_from_checkpoint("polarmae_pretrain.ckpt")
```

### Semantic Segmentation

| Model | Training Method | Num. Events | Config | $F_1$ | Download |
|-------|-----------------|-------------|--------|-------|----------|
| Point-MAE | Linear probing | 10k | [part_segmentation_mae_peft.yml](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/part_segmentation_mae_peft.yml) | 0.772 | [here](https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/mae_peft_segsem_10k.ckpt) |
| PoLAr-MAE | Linear probing | 10k | [part_segmentation_polarmae_peft.yml](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/part_segmentation_polarmae_peft.yml) | 0.798 | [here](https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/polarmae_peft_segsem_10k.ckpt) |
| Point-MAE | FFT | 10k | [part_segmentation_mae_fft.yml](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/part_segmentation_mae_fft.yml) | 0.831 | [here](https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/mae_fft_segsem_10k.ckpt) |
| PoLAr-MAE | FFT | 10k | [part_segmentation_polarmae_fft.yml](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/part_segmentation_polarmae_fft.yml) | 0.837 | [here](https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/polarmae_fft_segsem_10k.ckpt) |

Our evaluation for semantic segmentation consists of 1:1 comparisons between the predicted and ground truth segmentations. $F_1$ is the mean $F_1$ score over all semantic categories in the validation set of the PILArNet-M dataset.

After installing the dependencies, you can run the following commands in an interactive IPython session to load the pretrained model(s):

```python
>>> from polarmae.models.finetune import SemanticSegmentation
>>> from polarmae.utils.checkpoint import load_finetune_checkpoint
>>> !wget https://github.com/DeepLearnPhysics/PoLAr-MAE/releases/download/weights/{mae,polarmae}_{fft,peft}_segsem_10k.ckpt
>>> model = load_finetune_checkpoint(SemanticSegmentation,
...                                  "{mae,polarmae}_{fft,peft}_segsem_10k.ckpt",
...                                  data_path="/path/to/pilarnet-m/dataset",
...                                  pretrained_ckpt_path="{mae,polarmae}_pretrain.ckpt")
```

Here, the brackets {} denote the model and the training method -- choose one from `{mae,polarmae}` and `{fft,peft}`. Note that you must use the `load_finetune_checkpoint` function to load the model, as it does some extra setup not required for the pretraining phase. The `data_path` is necessary because the number of segmentation classes is determined by the dataset.

## Training

<details>
  <summary>Important: Learning rate instructions</summary>

  The following commands use the configurations we used for our experiments. In particular, our learning rates are set assuming a total batch size of 128 (i.e., 32 across 4 GPUs). If you want to train on a single GPU with the same per-GPU batch size in the configuration file, you will need to scale the learning rate accordingly.
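  As a quick numeric check of this scaling (a sketch only; `scale_lr` is a hypothetical helper implementing the square-root rule, not part of the codebase):

  ```python
  import math

  REFERENCE_BATCH_SIZE = 128  # total batch size the config learning rates assume

  def scale_lr(base_lr: float, batch_size: int) -> float:
      """Square-root learning-rate scaling: l -> l * sqrt(b / 128)."""
      return base_lr * math.sqrt(batch_size / REFERENCE_BATCH_SIZE)

  print(scale_lr(1e-3, 128))  # → 0.001 (unchanged at the reference batch size)
  print(scale_lr(1e-3, 32))   # → 0.0005 (e.g., a single GPU with batch size 32)
  ```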
  We recommend scaling the learning rate by the square root of the ratio of the batch sizes, i.e., if your batch size is $b$, set the learning rate $l \rightarrow l \times \sqrt{b/128}$.
</details>

### Pretraining

To pretrain Point-MAE, modify the [config file](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/pointmae.yml) to include the path to the PILArNet-M dataset, and run the following command:

```bash
python -m polarmae.tasks.pointmae fit --config configs/pointmae.yml
```

To pretrain PoLAr-MAE, modify the [config file](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/polarmae.yml) to include the path to the PILArNet-M dataset, and run the following command:

```bash
python -m polarmae.tasks.polarmae fit --config configs/polarmae.yml
```

<details>
  <summary>Example training plots</summary>

  <img src="images/pretrain_loss.png" alt="training plots" width="300">
  <img src="images/svm_f1_scores.png" alt="svm plots" width="300">
</details>

### Semantic Segmentation

To train a semantic segmentation model, modify the appropriate [config file](https://github.com/DeepLearnPhysics/PoLAr-MAE/blob/main/configs/part_segmentation_mae_peft.yml) to include the path to the PILArNet-M dataset, and run the following command:

```bash
python -m polarmae.tasks.part_segmentation fit --config configs/part_segmentation_{mae,polarmae}_{peft,fft}.yml \
                        --model.pretrained_ckpt_path path/to/pretrained/checkpoint.ckpt
```

where `{mae,polarmae}` is either `mae` or `polarmae`, and `{peft,fft}` is either `peft` or `fft`.
You can either specify the pretrained checkpoint path in the config, or pass it as an argument on the command line as above.

<details>
  <summary>Example training plots</summary>

  <img src="images/ft_segsem_loss.png" alt="training plots" width="300">
  <img src="images/ft_segsem_accprec.png" alt="accuracy/precision plots" width="300">
  <img src="images/ft_segsem_mious.png" alt="mIoU plots" width="300">
</details>

## Acknowledgements

This repository is built upon the lovely [Point-MAE](https://github.com/Pang-Yatian/Point-MAE) and [point2vec](https://github.com/kabouzeid/point2vec) repositories.

## Citing PoLAr-MAE

If you find this work useful, please consider citing the following paper:

```bibtex
@misc{young2025particletrajectoryrepresentationlearning,
      title={Particle Trajectory Representation Learning with Masked Point Modeling},
      author={Sam Young and Yeon-jae Jwa and Kazuhiro Terao},
      year={2025},
      eprint={2502.02558},
      archivePrefix={arXiv},
      primaryClass={hep-ex},
      url={https://arxiv.org/abs/2502.02558},
}
```

## Contact

Any questions? Any suggestions? Want to collaborate? Feel free to raise an issue on GitHub or email Sam Young at [youngsam@stanford.edu](mailto:youngsam@stanford.edu).