{"id":21027676,"url":"https://github.com/vinthony/s2am","last_synced_at":"2025-10-14T04:34:44.748Z","repository":{"id":37643170,"uuid":"186835417","full_name":"vinthony/s2am","owner":"vinthony","description":"[TIP 2020] Improving the Harmony of the Composite Image by Spatial-Separated Attention Module","archived":false,"fork":false,"pushed_at":"2023-07-06T21:36:21.000Z","size":1928,"stargazers_count":50,"open_issues_count":2,"forks_count":6,"subscribers_count":6,"default_branch":"master","last_synced_at":"2023-08-20T10:01:45.406Z","etag":null,"topics":["image-composition","image-harmonization","pytorch"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/1907.06406","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vinthony.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-15T13:45:48.000Z","updated_at":"2023-08-11T07:07:57.000Z","dependencies_parsed_at":"2022-09-15T08:20:36.881Z","dependency_job_id":null,"html_url":"https://github.com/vinthony/s2am","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinthony%2Fs2am","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinthony%2Fs2am/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinthony%2Fs2am/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinthony%2Fs2am/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vinthony","download_url":"https://codeload.github.com/vinthony/s2am/tar.gz/refs/heads/master","
host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225345947,"owners_count":17459893,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["image-composition","image-harmonization","pytorch"],"created_at":"2024-11-19T11:52:04.761Z","updated_at":"2025-10-14T04:34:39.716Z","avatar_url":"https://github.com/vinthony.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spatial-Separated Attention Module (S²AM)\n[Arxiv](https://arxiv.org/abs/1907.06406) | [Demo](https://colab.research.google.com/drive/1UTjyi0J1F2mjc9rf9ZbFUOL2_kkZmdlQ?usp=sharing)\n\nThis repo contains the PyTorch implementation of the following paper:\n\n\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[Improving the Harmony of the Composite Image by Spatial-Separated Attention Module](https://arxiv.org/abs/1907.06406)\u003cbr\u003e\n\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[_Xiaodong Cun_](https://vinthony.github.io/academicpages.github.io/) and [_Chi-Man Pun_](http://www.cis.umac.mo/~cmpun/)\u003cbr\u003e\n\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;University of Macau\u003cbr\u003e\n\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;Trans. on Image Processing, vol. 29, pp. 4759-4771, 2020.\n\n## News\n\n- 2020-12-18 The pretrained models on the [iHarmony4 dataset](https://github.com/bcmi/Image_Harmonization_Datasets) are released.\n- 2020-12-18 SCOCO and SAdobe5K are released.\n- 2020-06-16 Pretrained model (S²AD) and online demo are released.\n\n## Abstract\n\nImage composition is one of the most important applications in image processing. 
However, the inharmonious appearance between the spliced region and the background degrades the quality of the image. Thus, we address the problem of Image Harmonization: Given a spliced image and the mask of the spliced region, we try to harmonize the ''style'' of the pasted region with the background (non-spliced region). Previous approaches have focused on learning directly with a neural network.\nIn this work, we start from an empirical observation: the differences between the spliced image and the harmonized result can only be found in the spliced region, while the two share the same semantic information and the same appearance in the non-spliced region. Thus, in order to learn the feature maps of the masked region and of the remaining region individually, we propose a novel attention module named Spatial-Separated Attention Module (S²AM). Furthermore, we design a novel image harmonization framework by inserting the S²AM into the coarser low-level features of the Unet structure in two different ways. Besides image harmonization, we take a big step toward harmonizing the composite image without a specific mask, based on the previous observation. The experiments show that the proposed S²AM performs better than other state-of-the-art attention modules in our task.  
Moreover, we demonstrate the advantages of our model against other state-of-the-art image harmonization methods via criteria from multiple points of view.\n\n## Some Results\n\n![results](https://user-images.githubusercontent.com/4397546/61209516-931c0f00-a72c-11e9-84ef-c7b7bc794c0e.png)\n![sample](https://user-images.githubusercontent.com/4397546/61209520-93b4a580-a72c-11e9-881f-40de42c3a4f7.png)\n\n\n## Requirements\nThe code is tested with Python 3.6 and PyTorch v0.4+ on Ubuntu 18.04.\u003cbr\u003e\nInstall all the requirements with `pip`.\u003cbr\u003e\n`Anaconda` is highly recommended for installing the dependencies.\u003cbr\u003e \n```\ngit clone https://github.com/vinthony/s2am.git\ncd s2am\npip install -r requirements.txt\n```\n\n## Datasets\nWe train the network on two different synthesized datasets.\u003cbr\u003e\n* [SCOCO dataset(~5G)](https://uofmacau-my.sharepoint.com/:f:/g/personal/yb87432_umac_mo/EpemCJwfnhpIoDNAMfiegqIB0RXkdKH9Z2WibJJ4s27PbA?e=qPNzpI) contains `40k` images for training and `1.7k` images for testing.\u003cbr\u003e\n* [S-Adobe5k(~25G in tiff format)](https://uofmacau-my.sharepoint.com/:f:/g/personal/yb87432_umac_mo/EpemCJwfnhpIoDNAMfiegqIB0RXkdKH9Z2WibJJ4s27PbA?e=qPNzpI) dataset contains `32k` images for training and `2k` images for testing. 
\u003cbr\u003e\n\n\n## Train\n\nAll training options can be found in `options.py`.\n\n```\n# train the S2AD method\nchmod +x ./example/train_harmorization_s2ad.sh \u0026\u0026 ./example/train_harmorization_s2ad.sh\n\n# train the S2ASC method\nchmod +x ./example/train_harmorization_s2asc.sh \u0026\u0026 ./example/train_harmorization_s2asc.sh\n\n# train the image harmonization w/o mask task from our paper\nchmod +x ./example/train_harmorization_wo_mask.sh \u0026\u0026 ./example/train_harmorization_wo_mask.sh\n```\n\n\u003e You may also try our new code framework to train S²AM.\n\u003e Please refer to [this link](https://github.com/vinthony/deep-blind-watermark-removal/blob/e75983417fee2f5a9276ccff05db63f2ece42cea/examples/evaluate.sh#L36).\n\n## Visualization\n\nWe use `TensorboardX` to monitor the training process. Install it by following the [instructions](https://github.com/lanpa/tensorboardX) in the tensorboardX repository.\n\nRun the monitoring command:\n```\ntensorboard --logdir ./checkpoint\n```\n## Demo \n\n#### Local machine\n\n1. Clone this repo.\n\n2. Download the pretrained models from [google drive](https://drive.google.com/file/d/1bm1ZdZ4xmV9fKCQBDsulvYwrxPAidZ3T/view?usp=sharing).\n\n3. Download a sample validation dataset from [google drive](https://drive.google.com/file/d/1qTVN-uem-MOYaTL-JaBxGbrqDniyLWQH/view?usp=sharing).\n\n4. Configure the paths to the dataset and the pretrained model in `visualize.ipynb`.\n\n5. Run the notebook.\n\n#### Online demo\n\nJust visit our [google colab notebook](https://colab.research.google.com/drive/1UTjyi0J1F2mjc9rf9ZbFUOL2_kkZmdlQ?usp=sharing).\n\n\n## The pretrained model and results on the iHarmony4 dataset\n\nWe report the MAE and PSNR as in the original iHarmony4 paper. 
The pretrained model can be downloaded from [here](https://uofmacau-my.sharepoint.com/:f:/g/personal/yb87432_umac_mo/EpemCJwfnhpIoDNAMfiegqIB0RXkdKH9Z2WibJJ4s27PbA?e=qPNzpI).\nThese results are trained and evaluated using the newer version of our code framework with no changes to the algorithm (please refer to our new work [here](https://github.com/vinthony/deep-blind-watermark-removal/blob/e75983417fee2f5a9276ccff05db63f2ece42cea/examples/evaluate.sh#L36)). All results were evaluated with the Jupyter notebook `eval_s2am_iharmony4.ipynb`, which is modified from the [evaluation code](https://github.com/bcmi/Image_Harmonization_Datasets/blob/master/evaluation.py) of DoveNet (CVPR 2020). **Note that the original DoveNet uses all sub-datasets together for training, whereas the results reported here are trained on each sub-dataset individually.**\n\n\u003ctable\u003e\n   \u003ctr\u003e\n     \u003ctd\u003e\u003c/td\u003e\n     \u003ctd colspan=\"2\"\u003ew/o global skip-connection \u003c/td\u003e\n     \u003ctd colspan=\"2\"\u003ew/ global skip-connection \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n     \u003ctd\u003edataset\\method\u003c/td\u003e\n     \u003ctd\u003ePSNR↑\u003c/td\u003e\n     \u003ctd\u003eMAE↓\u003c/td\u003e\n     \u003ctd\u003ePSNR↑\u003c/td\u003e\n     \u003ctd\u003eMAE↓\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eHCOCO\u003c/td\u003e\n     \u003ctd\u003e37.33\u003c/td\u003e\n     \u003ctd\u003e25.59\u003c/td\u003e\n     \u003ctd\u003e37.25\u003c/td\u003e\n     \u003ctd\u003e26.22\u003c/td\u003e\n  \u003c/tr\u003e\n  \n  \u003ctr\u003e\n    \u003ctd\u003eHAdobe5K\u003c/td\u003e\n     \u003ctd\u003e34.33\u003c/td\u003e\n     \u003ctd\u003e47.49\u003c/td\u003e\n     \u003ctd\u003e34.32\u003c/td\u003e\n     \u003ctd\u003e51.66\u003c/td\u003e\n  \u003c/tr\u003e\n  \n  \u003ctr\u003e\n    \u003ctd\u003eHFlickr\u003c/td\u003e\n     \u003ctd\u003e30.71\u003c/td\u003e\n     \u003ctd\u003e112.92\u003c/td\u003e\n     
\u003ctd\u003e31.02\u003c/td\u003e\n     \u003ctd\u003e106.21\u003c/td\u003e\n  \u003c/tr\u003e\n  \n  \u003ctr\u003e\n    \u003ctd\u003eHDay2night\u003c/td\u003e\n     \u003ctd\u003e33.63\u003c/td\u003e\n     \u003ctd\u003e70.03\u003c/td\u003e\n     \u003ctd\u003e34.28\u003c/td\u003e\n     \u003ctd\u003e66.31\u003c/td\u003e\n  \u003c/tr\u003e\n  \n\u003c/table\u003e\n\n\n## The Application of Spatial-Separated Attention Module (S²AM) w/o mask\n\n#### Image Classification\n\nWe compare our method with a baseline attention module, [CBAM](https://arxiv.org/abs/1807.06521), and the original ResNet on CIFAR-10, using the default settings of the code in [pytorch_resnet_cifar10](https://github.com/akamaster/pytorch_resnet_cifar10).\n\n| method | Test err (Original) | Test err (w/ CBAM) | **Test err (w/ S²AM)**|\n| -- | -- | -- | -- |\n| ResNet20 | 8.45% | 7.91% | **7.60%** |\n| ResNet32 | 7.40% | 7.07% | **7.06%** |\n| ResNet44 | 6.96% | 6.92% | **6.58%** |\n| ResNet56 | 6.47% | 6.43% | **6.41%** |\n\n\n#### Interactive Watermark Removal from a Region\n\nBy regarding a region as the mask, our method can be used to remove a visible watermark from an image. We generate the dataset using VOC images as the base images and 100 famous logos as the watermark regions. 
The network is trained on 70 of the logos and tested on the rest. Here are some random results:\n![1511](https://user-images.githubusercontent.com/4397546/61209289-e80b5580-a72b-11e9-9608-6da743935cb0.png)\n![1582](https://user-images.githubusercontent.com/4397546/61209290-e80b5580-a72b-11e9-862a-24f71217b43d.png)\n![1654](https://user-images.githubusercontent.com/4397546/61209291-e8a3ec00-a72b-11e9-8372-ed45e26d18e4.png)\n![1728](https://user-images.githubusercontent.com/4397546/61209292-e8a3ec00-a72b-11e9-875b-ed7bf9027af9.png)\n\n\n## **Citation**\n\nIf you find our work useful in your research, please consider citing:\n```\n@misc{cun2019improving,\n    title={Improving the Harmony of the Composite Image by Spatial-Separated Attention Module},\n    author={Xiaodong Cun and Chi-Man Pun},\n    year={2019},\n    eprint={1907.06406},\n    archivePrefix={arXiv},\n    primaryClass={cs.CV}\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvinthony%2Fs2am","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvinthony%2Fs2am","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvinthony%2Fs2am/lists"}