{"id":22229751,"url":"https://github.com/graph-com/hept","last_synced_at":"2025-07-27T19:31:46.943Z","repository":{"id":223960020,"uuid":"760050927","full_name":"Graph-COM/HEPT","owner":"Graph-COM","description":"[ICML'24 Oral] LSH-Based Efficient Point Transformer (HEPT)","archived":false,"fork":false,"pushed_at":"2024-07-08T21:57:48.000Z","size":3064,"stargazers_count":11,"open_issues_count":0,"forks_count":5,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-07-09T02:47:57.726Z","etag":null,"topics":["ai4science","geometric-deep-learning","hep-ex","point-cloud","pytorch","transformer"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2402.12535","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Graph-COM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-19T17:33:23.000Z","updated_at":"2024-07-08T21:57:52.000Z","dependencies_parsed_at":"2024-04-24T04:42:12.229Z","dependency_job_id":"ea9e9454-41aa-47f3-b448-3fefdeb84164","html_url":"https://github.com/Graph-COM/HEPT","commit_stats":null,"previous_names":["graph-com/hept"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHEPT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHEPT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHEPT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graph-COM%2FHEPT/manifests","owner_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Graph-COM","download_url":"https://codeload.github.com/Graph-COM/HEPT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227830968,"owners_count":17826154,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai4science","geometric-deep-learning","hep-ex","point-cloud","pytorch","transformer"],"created_at":"2024-12-03T01:12:13.679Z","updated_at":"2024-12-03T01:12:16.177Z","avatar_url":"https://github.com/Graph-COM.png","language":"Python","readme":"\u003ch1 align=\"center\"\u003eLSH-Based Efficient Point Transformer (HEPT)\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://arxiv.org/abs/2402.12535\"\u003e\u003cimg src=\"https://img.shields.io/badge/-arXiv-grey?logo=gitbook\u0026logoColor=white\" alt=\"Paper\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://github.com/Graph-COM/HEPT\"\u003e\u003cimg src=\"https://img.shields.io/badge/-Github-grey?logo=github\" alt=\"Github\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://arxiv.org/abs/2402.12535\"\u003e \u003cimg alt=\"License\" src=\"https://img.shields.io/static/v1?label=Pub\u0026message=ICML%2724\u0026color=blue\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n## TODO\n- [ ] Put more details in the README.\n- [ ] Add support for FlashAttn.\n- [x] Add support for efficient processing of batched input.\n- [x] Add an example of HEPT with minimal code.\n\n## News\n- **2024.06:** HEPT has been accepted to ICML 2024 and is selected as an oral presentation (144/9473, 1.5% acceptance rate)!\n- **2024.04:** 
HEPT now supports efficient processing of batched input by this [commit](https://github.com/Graph-COM/HEPT/commit/2e408388a16400050c0eb4c4f7390c3c24078dee). This is implemented by integrating batch indices into the computation of AND hash codes, which is more efficient than naive padding, especially for batches with imbalanced point cloud sizes. **Note:**\n  - Only the code in `./example` has been updated to support batched input; the original implementation in `./src` has not.\n  - The current implementation for batched input is not yet fully tested. Please feel free to open an issue if you encounter any problems.\n\n- **2024.04:** An example of HEPT with minimal code was added in `./example` by this [commit](https://github.com/Graph-COM/HEPT/commit/350a9863d7757e556177c52a44bac2aaf0c6dde8). It is a good starting point for users who want to use HEPT in their own projects. There are minor differences between the example and the original implementation in `./src/models/attention/hept.py`, but they should not affect the performance of the model.\n\n## Introduction\nThis study introduces a novel transformer model optimized for large-scale point cloud processing in scientific domains such as high-energy physics (HEP) and astrophysics. Addressing the limitations of graph neural networks and standard transformers, our model integrates local inductive bias and achieves near-linear complexity with hardware-friendly regular operations. One contribution of this work is a quantitative analysis of the error-complexity tradeoff of various sparsification techniques for building efficient transformers. Our findings highlight the superiority of locality-sensitive hashing (LSH), especially OR \\\u0026 AND-construction LSH, for kernel approximation on large-scale point cloud data with local inductive bias. 
Based on this finding, we propose the LSH-based Efficient Point Transformer (**HEPT**), which combines E2LSH with OR \\\u0026 AND constructions and is built upon regular computations. HEPT demonstrates remarkable performance in two critical yet time-consuming HEP tasks, significantly outperforming existing GNNs and transformers in both accuracy and computational speed, marking a significant advancement in geometric deep learning and large-scale scientific data processing.\n\n\u003cp align=\"center\"\u003e\u003cimg src=\"./data/HEPT.png\" width=85% height=85%\u003e\u003c/p\u003e\n\u003cp align=\"center\"\u003e\u003cem\u003eFigure 1.\u003c/em\u003e Pipeline of HEPT.\u003c/p\u003e\n\n## Datasets\nAll the datasets can be downloaded and processed automatically by running the scripts in `./src/datasets`, i.e.,\n```\ncd ./src/datasets\npython pileup.py\npython tracking.py -d tracking-6k\npython tracking.py -d tracking-60k\n```\n\n## Installation\n\n#### Environment\nWe use `torch 2.3.1` and `pyg 2.5.3` with `python 3.10.14` and `cuda 12.1`. Use the following commands to install the required packages:\n```\nconda env create -f environment.yml\npip install torch_geometric==2.5.3\npip install torch_scatter==2.1.2 torch_cluster==1.6.3 -f https://data.pyg.org/whl/torch-2.3.0+cu121.html\n```\n\n#### Running the code\nTo run the code, use one of the following commands:\n```\npython tracking_trainer.py -m hept\n```\n\nOr\n```\npython pileup_trainer.py -m hept\n```\nConfigurations will be loaded from the `./configs/` directory.\n\n## FAQ\n\n#### How to tune the hyperparameters of HEPT?\nThere are three key hyperparameters in HEPT:\n- `block_size`: the block size for attention computation\n- `n_hashes`: the number of hash tables, i.e., OR LSH\n- `num_regions`: the number of regions HEPT randomly divides the input space into (Sec. 
4.3 in the paper)\n\nWe suggest first determining `block_size` and `n_hashes` according to the computational budget; generally, `n_hashes` should be greater than 1. `num_regions` should be tuned according to the local inductive bias of the dataset.\n\n## Reference\n```bibtex\n@article{miao2024locality,\n  title   = {Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics},\n  author  = {Miao, Siqi and Lu, Zhiyuan and Liu, Mia and Duarte, Javier and Li, Pan},\n  journal = {International Conference on Machine Learning},\n  year    = {2024}\n}\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraph-com%2Fhept","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgraph-com%2Fhept","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraph-com%2Fhept/lists"}