{"id":30683268,"url":"https://github.com/waridrox/e2e_sparse","last_synced_at":"2026-06-24T22:31:17.800Z","repository":{"id":309798109,"uuid":"1037588474","full_name":"waridrox/e2e_sparse","owner":"waridrox","description":"E2E deep learning for particle classification","archived":false,"fork":false,"pushed_at":"2025-10-26T09:03:48.000Z","size":954,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-26T10:17:55.504Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://ml4sci.org/gsoc/2025/proposal_E2E2.html","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/waridrox.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-13T20:03:28.000Z","updated_at":"2025-10-26T09:03:51.000Z","dependencies_parsed_at":"2025-08-13T22:20:56.181Z","dependency_job_id":"87a71d42-dcb2-497b-bf36-1fa0730caafc","html_url":"https://github.com/waridrox/e2e_sparse","commit_stats":null,"previous_names":["waridrox/e2e_sparse"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/waridrox/e2e_sparse","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waridrox%2Fe2e_sparse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waridrox%2Fe2e_sparse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waridrox%2Fe2e_sparse/releases","manifests_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/waridrox%2Fe2e_sparse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/waridrox","download_url":"https://codeload.github.com/waridrox/e2e_sparse/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waridrox%2Fe2e_sparse/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34752465,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-24T02:00:07.484Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-01T19:43:16.729Z","updated_at":"2026-06-24T22:31:17.794Z","avatar_url":"https://github.com/waridrox.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# End-to-End Sparse Point Cloud Jet Classification\n\n[![ML4Sci](https://img.shields.io/badge/ML4Sci%20-blue)](https://ml4sci.org/gsoc/2025/proposal_E2E2.html)\n\nA deep learning framework for Quark-Gluon jet classification using sparse point cloud representations. This project implements multiple neural network architectures optimized for high-energy physics jet tagging tasks.\n\n## Project Overview\n\nThis repository contains the implementation of end-to-end deep learning models for classifying particle jets as quarks or gluons. 
The framework processes jet data as point clouds, leveraging both convolutional and attention-based architectures to capture geometric and topological features.\n\n**Project Reference:** [ML4Sci GSoC 2025 - End-to-End Sparse Deep Learning](https://ml4sci.org/gsoc/2025/proposal_E2E2.html)\n\n## Architecture\n\n### Supported Models\n\n#### 1. **Sparse CNN (ResNet-based)**\n- 1D Convolutional ResNet architecture optimized for point clouds\n- Variants: Small (S), Medium (M), Large (L)\n- Supports multiple resolutions: 256, 512, 768, 1024 points\n- Features residual connections and batch normalization\n\n#### 2. **Aggregation Transformer**\n- Self-attention mechanism for point cloud processing\n- Global and local feature aggregation\n- Positional encoding for spatial relationships\n- Variants optimized for different point cloud sizes\n\n### Model Variants\n\n| Model | Resolution | Parameters |\n|-------|-----------|--------------|\n| ResNet_PC_256_S | 256 points | Small |\n| ResNet_PC_512_S | 512 points | Small |\n| ResNet_PC_768_S | 768 points | Small |\n| ResNet_PC_768_M | 768 points | Medium |\n| ResNet_PC_1024_S | 1024 points | Small |\n| ResNet_PC_1024_M | 1024 points | Medium |\n| ResNet_PC_1024_L | 1024 points | Large |\n\n## Project Structure\n\n```\ne2e_sparse/\n├── DataGeneration/\n│   └── QuarkGluon/\n│       ├── ToPointCloudForm.py    # Convert raw data to point cloud format\n│       └── 4Resolutions.sh        # Generate datasets at 4 resolutions\n│\n├── Supervised/\n│   ├── CNN/\n│   │   └── model.py               # ResNet-based CNN architectures\n│   ├── AggregationTransformer/\n│   │   └── model.py               # Transformer-based models\n│   ├── trainer.py                 # Single-GPU training script\n│   ├── trainer4Node.py            # Multi-node distributed training\n│   └── Experiments/\n│       └── Scripts/\n│           ├── bash/              # Local training scripts\n│           ├── slurm-no-resume/   # SLURM scripts without resume\n│           
└── slurm-preempt-chain/  # SLURM with checkpoint resume\n│\n└── README.md\n```\n\n## Data Preparation\n\n1. **Generate Point Cloud Datasets:**\n```bash\ncd DataGeneration/QuarkGluon\nbash 4Resolutions.sh\n```\n\nThis creates one dataset per resolution:\n- QG256.h5 (256 points)\n- QG512.h5 (512 points)\n- QG768.h5 (768 points)\n- QG1024.h5 (1024 points)\n\n## Training\n\n### Single-GPU Training\n\n```bash\npython Supervised/trainer.py \\\n  --datapath=/path/to/QG1024.h5 \\\n  --Nepochs=100 \\\n  --lr=1e-3 \\\n  --model_variant=ResNet_PC_1024_S \\\n  --UseWandb=True \\\n  --wandb_project=quark-gluon \\\n  --wandb_entity=your-entity \\\n  --wandb_run_name=resnet_1024_experiment \\\n  --wandb_key=your-api-key \\\n  --Checkpoint_dir=/path/to/checkpoints \\\n  --NAccumSteps=1\n```\n\n### Multi-GPU Training\n\n```bash\npython Supervised/trainer4Node.py \\\n  --datapath=/path/to/QG1024.h5 \\\n  --Nepochs=100 \\\n  --lr=1e-3 \\\n  --model_variant=Transformer_PC_1024_S \\\n  --UseWandb=True \\\n  --wandb_project=quark-gluon \\\n  --wandb_entity=your-entity \\\n  --wandb_run_name=transformer_1024_4gpu \\\n  --wandb_key=your-api-key \\\n  --Checkpoint_dir=/path/to/checkpoints\n```\n\n### SLURM Cluster Training\n\nFor HPC clusters with SLURM:\n\n```bash\n# Preempt-chain (with automatic resume on preemption)\ncd Supervised/Experiments/Scripts/slurm-preempt-chain\nsbatch AggregationTransformer1024.sh \u003crun_id\u003e\n\n# Standard SLURM (no resume)\ncd Supervised/Experiments/Scripts/slurm-no-resume\nsbatch SparseCNNResnet.sh\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaridrox%2Fe2e_sparse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwaridrox%2Fe2e_sparse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaridrox%2Fe2e_sparse/lists"}