{"id":22068410,"url":"https://threedle.github.io/iSeg/","last_synced_at":"2025-07-24T06:31:13.895Z","repository":{"id":231669876,"uuid":"781600761","full_name":"threedle/iSeg","owner":"threedle","description":"Interactive 3D Segmentation via Interactive Attention","archived":false,"fork":false,"pushed_at":"2025-07-06T09:46:51.000Z","size":32785,"stargazers_count":28,"open_issues_count":0,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-06T10:34:26.365Z","etag":null,"topics":["3d-mesh-representation","deep-learning","differentiable-rendering","interactive-segmentation"],"latest_commit_sha":null,"homepage":"https://threedle.github.io/iSeg/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/threedle.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-04-03T17:33:54.000Z","updated_at":"2025-07-06T09:46:54.000Z","dependencies_parsed_at":"2025-07-06T10:27:22.765Z","dependency_job_id":"17b33a53-3768-49f7-9727-02a35bcfdfcc","html_url":"https://github.com/threedle/iSeg","commit_stats":null,"previous_names":["threedle/iseg"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/threedle/iSeg","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threedle%2FiSeg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threedle%2FiSeg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threedle%2FiSeg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/threedle%2FiSeg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/threedle","download_url":"https://codeload.github.com/threedle/iSeg/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threedle%2FiSeg/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266802637,"owners_count":23986384,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-24T02:00:09.469Z","response_time":99,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-mesh-representation","deep-learning","differentiable-rendering","interactive-segmentation"],"created_at":"2024-11-30T20:04:00.622Z","updated_at":"2025-07-24T06:31:13.883Z","avatar_url":"https://github.com/threedle.png","language":"Jupyter Notebook","funding_links":[],"categories":["Paper List"],"sub_categories":["Follow-up Papers"],"readme":"# iSeg: Interactive 3D Segmentation via Interactive Attention\n\n[Itai Lang](https://itailang.github.io/)\u003csup style=\"font-size: 0.7em;\"\u003e1\u003c/sup\u003e, [Fei Xu](https://github.com/FeiXu-spacetime)\u003csup style=\"font-size: 0.7em;\"\u003e1\u003c/sup\u003e, [Dale Decatur](https://ddecatur.github.io/)\u003csup style=\"font-size: 0.7em;\"\u003e1\u003c/sup\u003e, [Sudarshan Babu](https://github.com/sudarshan1994)\u003csup style=\"font-size: 0.7em;\"\u003e2\u003c/sup\u003e, [Rana 
Hanocka](https://people.cs.uchicago.edu/~ranahanocka/)\u003csup style=\"font-size: 0.7em;\"\u003e1\u003c/sup\u003e\n\n\u0026nbsp;\n\u003cspan style=\"position: relative; display: inline-block;\"\u003e\n  \u003cspan style=\"position: absolute; top: -0.3em; left: -0.8em; font-size: 1em;\"\u003e1\u003c/span\u003e\n  \u003cimg src=\"./media/uchicago_logo.svg\" alt=\"UChicago Logo\" width=\"230\"\u003e\n\u003c/span\u003e\n\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\n\u003cspan style=\"position: relative; display: inline-block;\"\u003e\n  \u003cspan style=\"position: absolute; top: -0.3em; left: -0.95em; font-size: 1em;\"\u003e2\u003c/span\u003e\n  \u003cimg src=\"./media/ttic_logo.png\" alt=\"TTIC Logo\" width=\"160\"\u003e\n\u003c/span\u003e\n\n\u003cbr\u003e\n\n\u003ca href=\"https://threedle.github.io/iSeg/\"\u003e\u003cimg src=\"https://img.shields.io/website?down_color=lightgrey\u0026down_message=offline\u0026label=Project%20Page\u0026up_color=lightgreen\u0026up_message=online\u0026url=https%3A//threedle.github.io/iSeg/\" height=22\u003e\u003c/a\u003e\n\u003ca href=\"https://dl.acm.org/doi/10.1145/3680528.3687605\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Conference-SIGGRAPH%20Asia%202024-61d5fe\" height=22\u003e\n\u003c/a\u003e\n\u003ca href=\"https://arxiv.org/abs/2404.03219\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-iSeg-ff6961.svg\" height=22\u003e\u003c/a\u003e\n\n![teaser](./media/teaser.png)\n\n\u003cdiv align=\"center\" style=\"display: inline-flex; align-items: center; gap: 8px;\"\u003e\n  \u003ca href=\"http://www.replicabilitystamp.org#https-github-com-threedle-iseg\"\u003e\n    \u003cimg src=\"https://www.replicabilitystamp.org/logo/Reproducibility-tiny.png\" alt=\"Replicability Stamp\"\u003e\n  \u003c/a\u003e\n  \u003cspan\u003eiSeg has been awarded the GRSI Replicability Stamp.\u003c/span\u003e\n\u003c/div\u003e\n\n## Abstract\nWe present iSeg, a new interactive technique for segmenting 3D shapes. 
Previous works have focused mainly on leveraging pre-trained 2D foundation models for 3D segmentation based on text. However, text may be insufficient for accurately describing fine-grained spatial segmentations. Moreover, achieving a consistent 3D segmentation using a 2D model is challenging since occluded areas of the same semantic region may not be visible together from any 2D view. Thus, we design a segmentation method conditioned on fine user clicks, which operates entirely in 3D. Our system accepts user clicks directly on the shape's surface, indicating the inclusion or exclusion of regions from the desired shape partition. To accommodate various click settings, we propose a novel interactive attention module capable of processing different numbers and types of clicks, enabling the training of a single unified interactive segmentation model. We apply iSeg to a myriad of shapes from different domains, demonstrating its versatility and faithfulness to the user's specifications.\n\n## Installation\nClone this repository:\n```bash\ngit clone https://github.com/threedle/iSeg.git\ncd iSeg/\n```\n\nCreate and activate a conda environment:\n```bash\nconda create -n iseg python=3.9.18 --yes\nconda activate iseg\n```\n\nAlternatively, you may create the environment locally inside the `iSeg` folder and activate it as follows:\n```bash\nconda create --prefix ./iseg python=3.9.18 --yes\nconda activate ./iseg\n```\n\nInstall the required packages:\n```bash\nsh ./install_environment.sh\n```\n\nNote: the installation assumes a machine with a GPU and CUDA.\n\n## Demo\nThis demo shows how to run a pre-trained iSeg model for interactive segmentation of a `hammer` mesh.\n\nDownload the demo data:\n```bash\nbash download_demo_data.sh\n```\n\n* The `hammer` mesh will be stored at `./meshes/hammer.obj`.\n\n* The pre-trained vertex features will be stored at `./demo/hammer/encoder/pred_f.pth`. 
\n\n* The pre-trained decoder model will be stored at `./demo/hammer/decoder/decoder_checkpoint.pth`.\n\nIf you experience issues with the script, download the [mesh](https://drive.google.com/file/d/1u8GJ7cT7_5hQlplj-5_pYh16m86GdF99/view?usp=sharing), [features](https://drive.google.com/file/d/13bhW6FDzLs4UAQAyaR6N6w1efK41Z8M3/view?usp=sharing), and [checkpoint](https://drive.google.com/file/d/1WWu0NO1pZpS39_tNCAhFD77RSotyq5E4/view?usp=sharing) directly, and store them under the corresponding folders.\n\nRun the decoder with a single click on vertex 5141:\n```bash\npython decoder.py --mode test --save_dir ./demo/hammer/decoder/ --model_name decoder_checkpoint.pth --encoder_f_path ./demo/hammer/encoder/pred_f.pth --obj_path ./meshes/hammer.obj --select_vertices 5141\n```\n\nRun the decoder with a first positive click on vertex 5141 and a second positive click on vertex 61:\n```bash\npython decoder.py --mode test --save_dir ./demo/hammer/decoder/ --model_name decoder_checkpoint.pth --encoder_f_path ./demo/hammer/encoder/pred_f.pth --obj_path ./meshes/hammer.obj --select_vertices 5141 61\n```\n\nRun the decoder with a first positive click on vertex 5141 and a second negative click on vertex 10795:\n```bash\npython decoder.py --mode test --save_dir ./demo/hammer/decoder/ --model_name decoder_checkpoint.pth --encoder_f_path ./demo/hammer/encoder/pred_f.pth --obj_path ./meshes/hammer.obj --select_vertices 5141 -10795\n```\n\nFor each run, the per-vertex predicted probability (an `npy` file), a mesh colored according to the predicted probabilities (a `ply` file), and rendered views of the colored mesh and the clicked points (a `png` file) will be saved under the folder `./demo/hammer/decoder/`.\n\nNote that you can save the colored mesh with spheres at the clicked points' locations by adding the argument `--show_spheres 1` to the commands above.\n\n## Interactive Demo\nAfter downloading the demo data, you can run an interactive demo as follows:\n```bash\npython 
interactive.py --save_dir ./demo/hammer/decoder/ --model_name decoder_checkpoint.pth --encoder_f_path ./demo/hammer/encoder/pred_f.pth --obj_path ./meshes/hammer.obj\n```\n\nYou can segment the shape interactively by clicking on the shape. You may choose the `Single Click` mode, where each click segments the shape:\n![single](./media/single_click_demo.png)\n\nYou may also choose the `Multiple Clicks` mode, where the shape is segmented with multiple clicks:\n![multiple](./media/multiple_clicks_demo.png)\n\nTo undo the last click, press the `Undo` button. To redo the last click, press the `Redo` button. To clear all clicks, press the `Reset` button.\n\n## Training Instructions\nThis section explains how to generate data and train the encoder and decoder of iSeg. The steps are illustrated with the `hammer` mesh.\n\nDownload the [ViT-H SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth) (simply click on the link). Put the downloaded file `sam_vit_h_4b8939.pth` under the folder `./SAM_repo/model_checkpoints/`.\n\nGenerate data for training the encoder:\n```bash\npython encoder.py --obj_path ./meshes/hammer.obj --name hammer --encoder_data_dir ./data/hammer/encoder_data --generate_random_views 1 --start_training 0 --test 0\n```\n\nThe data will be saved under the folder `./data/hammer/encoder_data/`.\n\nTrain the encoder:\n```bash\npython encoder.py --obj_path ./meshes/hammer.obj --name hammer --encoder_data_dir ./data/hammer/encoder_data --encoder_model_dir ./experiments/hammer/encoder --generate_random_views 0 --num_epochs 3 --start_training 1 --test 1\n```\n\nThe predicted encoder features per mesh vertex will be saved to `./experiments/hammer/encoder/pred_f.pth`. 
These features will be used during the decoder training.\n\nGenerate data for training the decoder (single clicks and pairs of clicks):\n```bash\npython data_generation.py --name hammer --obj_path ./meshes/hammer.obj --decoder_data_dir ./data/hammer/decoder_data --single_click 1 --second_positive 1 --second_negative 1\n```\n\nIf you want to generate only single click data, set the arguments `--second_positive 0` and `--second_negative 0`.\n\nThe data will be saved under the folder `./data/hammer/decoder_data/`.\n\nTrain the decoder:\n```bash\npython decoder.py --obj_path ./meshes/hammer.obj --encoder_f_path ./experiments/hammer/encoder/pred_f.pth --decoder_data_dir ./data/hammer/decoder_data --save_dir ./experiments/hammer/decoder/ --model_name decoder_checkpoint.pth --mode train --num_epochs 5 --use_positive_click 1 --use_negative_click 1\n```\n\nIf you generated only single click data, set the arguments `--use_positive_click 0` and `--use_negative_click 0`.\n\nThe decoder model will be saved to `./experiments/hammer/decoder/decoder/decoder_checkpoint.pth`.\n\nTo evaluate the trained decoder, see the instructions in the [Demo](#demo) section.\n\n## Citation\nIf you find iSeg useful for your work, please consider citing:\n```\n@InProceedings{lang2024iseg,\n  author    = {Lang, Itai and Xu, Fei and Decatur, Dale and Babu, Sudarshan and Hanocka, Rana},\n  title     = {{iSeg: Interactive 3D Segmentation via Interactive Attention}},\n  booktitle = {SIGGRAPH Asia 2024 Conference Papers},\n  doi       = {10.1145/3680528.3687605},\n  publisher = {Association for Computing Machinery},\n  year      = {2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/threedle.github.io%2FiSeg%2F","html_url":"https://awesome.ecosyste.ms/projects/threedle.github.io%2FiSeg%2F","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/threedle.github.io%2FiSeg%2F/lists"}