{"id":31944404,"url":"https://github.com/ehsanik/segan","last_synced_at":"2025-10-14T10:24:16.752Z","repository":{"id":46099725,"uuid":"132189294","full_name":"ehsanik/SeGAN","owner":"ehsanik","description":"SeGAN: Segmenting and Generating the Invisible (https://arxiv.org/pdf/1703.10239.pdf)","archived":false,"fork":false,"pushed_at":"2021-11-17T20:47:23.000Z","size":1041,"stargazers_count":62,"open_issues_count":0,"forks_count":12,"subscribers_count":5,"default_branch":"master","last_synced_at":"2023-10-20T19:38:34.076Z","etag":null,"topics":["computer-vision","deep-learning","generative-adversarial-network","image-generation","segan","segmentation"],"latest_commit_sha":null,"homepage":"","language":"Lua","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ehsanik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-05-04T21:08:45.000Z","updated_at":"2023-08-16T14:38:02.000Z","dependencies_parsed_at":"2022-08-21T18:20:20.767Z","dependency_job_id":null,"html_url":"https://github.com/ehsanik/SeGAN","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/ehsanik/SeGAN","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehsanik%2FSeGAN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehsanik%2FSeGAN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehsanik%2FSeGAN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehsanik%2FSeGAN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ehsanik","download_url":"https://codeload.github.c
om/ehsanik/SeGAN/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehsanik%2FSeGAN/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279018782,"owners_count":26086452,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","deep-learning","generative-adversarial-network","image-generation","segan","segmentation"],"created_at":"2025-10-14T10:24:14.297Z","updated_at":"2025-10-14T10:24:16.747Z","avatar_url":"https://github.com/ehsanik.png","language":"Lua","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [SeGAN: Segmenting and Generating the Invisible](https://arxiv.org/abs/1703.10239)\nThis project was presented as a spotlight at CVPR 2018.\n\n\u003ccenter\u003e\u003cimg src=\"figs/teaser.jpg\" height=\"300px\" \u003e\u003c/center\u003e\n\n### Abstract\n\nHumans have a strong ability to make inferences about the appearance of the invisible and occluded parts of scenes. 
For example, when we look at the scene on the left, we can make predictions about what is behind the coffee table, and can even complete the sofa based on the visible parts of the sofa, the coffee table, and what we know\nin general about sofas and coffee tables and how they occlude each other.\n\nSeGAN can learn to:\n\u003col\u003e\n\u003cli\u003eGenerate the \u003cstrong class=\"important\"\u003eappearance\u003c/strong\u003e of the occluded parts of objects,\u003c/li\u003e\n\u003cli\u003e\u003cstrong class=\"important\"\u003eSegment\u003c/strong\u003e the invisible parts of objects,\u003c/li\u003e\n\u003cli\u003eAlthough trained on synthetic photo-realistic images, reliably segment \u003cstrong class=\"important\"\u003enatural images\u003c/strong\u003e,\u003c/li\u003e \n\u003cli\u003eBy reasoning about occluder-occludee relations, infer \u003cstrong class=\"important\"\u003edepth layering\u003c/strong\u003e.\n\u003c/li\u003e\n\u003c/ol\u003e\n\n### Citation\n\nIf you find this project useful in your research, please consider citing:\n\n\t@inproceedings{ehsani2018segan,\n\t  title={Segan: Segmenting and generating the invisible},\n\t  author={Ehsani, Kiana and Mottaghi, Roozbeh and Farhadi, Ali},\n\t  booktitle={CVPR},\n\t  year={2018}\n\t}\n\t\n\n### Prerequisites\n\n- Torch 7 and its dependencies from [this repository](https://github.com/torch/distro).\n- Linux OS\n- NVIDIA GPU + CUDA + CuDNN\n\n### Installation\n\n1. Clone the repository using the command:\n\n\t\tgit clone https://github.com/ehsanik/SeGAN\n\t\tcd SeGAN\n\n2. Download the dataset from [here](https://drive.google.com/file/d/1TfrP4Sptm6wPMdrn9MrWghfTNAMTCtlY/view?usp=sharing) and extract it.\n3. Create a symbolic link to the dataset.\n\n\t\tln -s /PATH/TO/DATASET dyce_data\n\n4. Download the pretrained weights from [here](https://drive.google.com/file/d/1cGXaO8rHLOVwuVZOXw3tuDDfNxw2eGbL/view?usp=sharing) and extract them.\n5. 
Create a symbolic link to the weights folder.\n\n\t\tln -s /PATH/TO/WEIGHTS weights\n\n\n### Dataset\n\nWe introduce DYCE, a dataset of synthetic\noccluded objects. This is a synthetic dataset with\nphoto-realistic images and natural configurations of objects\nin scenes. All of the images in this dataset are taken in indoor\nscenes. The annotations for each image contain the\nsegmentation masks for the visible and invisible regions of\nobjects. The images are obtained by taking snapshots from\nour 3D synthetic scenes.\n\n##### Statistics\n\nWe use 11 synthetic scenes in total:\n7 scenes for training and validation, and 4\nscenes for testing. Overall there are 5 living rooms and 6 kitchens, of which 2 living rooms and 2 kitchens are used for\ntesting. On average, each scene contains 60 objects and the\nnumber of visible objects per image is 17.5 (by visible we\nmean having at least 10 visible pixels). No object instance\nis shared between the training and test scenes.\n\n\u003ccenter\u003e\u003cimg src=\"figs/dataset.jpg\" height=\"300px\" \u003e\u003c/center\u003e\n\nThe dataset can be downloaded from [here](https://drive.google.com/file/d/1TfrP4Sptm6wPMdrn9MrWghfTNAMTCtlY/view?usp=sharing).\n\n### Train\n\nTo train your own model:\n\n```\nth main.lua -baseLR 1e-3 -end2end -istrain \"train\"\n```\n\nSee `data_settings.lua` for additional command-line options.\n\n### Test\n\nTo test using the pretrained model and reproduce the results in the paper:\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003cth rowspan=\"2\"\u003eModel\u003c/th\u003e\n\u003cth colspan=\"3\"\u003eSegmentation\u003c/th\u003e\n\u003cth colspan=\"2\"\u003eTexture\u003c/th\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eVisible \u0026cup; 
Invisible\u003c/td\u003e\n\u003ctd\u003eVisible\u003c/td\u003e\n\u003ctd\u003eInvisible\u003c/td\u003e\n\u003ctd\u003eL1\u003c/td\u003e\n\u003ctd\u003eL2\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMultipath\u003c/td\u003e\n\u003ctd\u003e47.51\u003c/td\u003e\n\u003ctd\u003e48.58\u003c/td\u003e\n\u003ctd\u003e6.01\u003c/td\u003e\n\u003ctd\u003e-\u003c/td\u003e\n\u003ctd\u003e-\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eSeGAN(ours) w/ SV\u003csub\u003epredicted\u003c/sub\u003e\u003c/td\u003e\n\u003ctd\u003e68.78\u003c/td\u003e\n\u003ctd\u003e64.76\u003c/td\u003e\n\u003ctd\u003e15.59\u003c/td\u003e\n\u003ctd\u003e0.070\u003c/td\u003e\n\u003ctd\u003e0.023\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eSeGAN(ours) w/ SV\u003csub\u003egt\u003c/sub\u003e\u003c/td\u003e\n\u003ctd\u003e75.71\u003c/td\u003e\n\u003ctd\u003e68.05\u003c/td\u003e\n\u003ctd\u003e23.26\u003c/td\u003e\n\u003ctd\u003e0.026\u003c/td\u003e\n\u003ctd\u003e0.008\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n```\nth main.lua -weights_segmentation \"weights/segment\" -end2end -weights_texture \"weights/texture\" -istrain \"test\" -predictedSV\n```\n\nTo test using the ground-truth visible mask as input instead of the predicted mask:\n\n```\nth main.lua -weights_segmentation \"weights/segment_gt_sv\" -end2end -weights_texture \"weights/texture_gt_sv\" -istrain \"test\"\n```\n\n\n\n### Acknowledgments\nThe code for the GAN network borrows heavily from [pix2pix](https://github.com/phillipi/pix2pix).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fehsanik%2Fsegan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fehsanik%2Fsegan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fehsanik%2Fsegan/lists"}