{"id":13732575,"url":"https://github.com/DmitryUlyanov/texture_nets","last_synced_at":"2025-05-08T08:32:04.829Z","repository":{"id":46295904,"uuid":"53321801","full_name":"DmitryUlyanov/texture_nets","owner":"DmitryUlyanov","description":"Code for \"Texture Networks: Feed-forward Synthesis of Textures and Stylized Images\" paper.","archived":false,"fork":false,"pushed_at":"2018-01-07T06:56:34.000Z","size":9765,"stargazers_count":1227,"open_issues_count":42,"forks_count":215,"subscribers_count":57,"default_branch":"master","last_synced_at":"2025-04-12T14:55:51.439Z","etag":null,"topics":["neural-style","style-transfer","texture-networks","torch"],"latest_commit_sha":null,"homepage":"","language":"Lua","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DmitryUlyanov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-03-07T11:53:05.000Z","updated_at":"2025-04-02T08:57:37.000Z","dependencies_parsed_at":"2022-09-09T09:51:11.264Z","dependency_job_id":null,"html_url":"https://github.com/DmitryUlyanov/texture_nets","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DmitryUlyanov%2Ftexture_nets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DmitryUlyanov%2Ftexture_nets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DmitryUlyanov%2Ftexture_nets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DmitryUlyanov%2Ftexture_nets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DmitryUlyanov"
,"download_url":"https://codeload.github.com/DmitryUlyanov/texture_nets/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253029153,"owners_count":21843031,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["neural-style","style-transfer","texture-networks","torch"],"created_at":"2024-08-03T03:00:27.666Z","updated_at":"2025-05-08T08:32:04.527Z","avatar_url":"https://github.com/DmitryUlyanov.png","language":"Lua","readme":"# Texture Networks + Instance normalization: Feed-forward Synthesis of Textures and Stylized Images\n\nIn the paper [Texture Networks: Feed-forward Synthesis of Textures and Stylized Images](http://arxiv.org/abs/1603.03417) we describe a faster way to generate textures and stylize images. It requires learning a feed-forward generator with a loss function proposed by [Gatys et al.](http://arxiv.org/abs/1505.07376). Once the model is trained, a texture sample or stylized image of any size can be generated instantly.\n\n[Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis](https://arxiv.org/abs/1701.02096) presents a better architectural design for the generator network. By switching `batch_norm` to `Instance Norm` we facilitate the learning process, resulting in much better quality.\n\n\n\nThis also implements the stylization part from [Perceptual Losses for Real-Time Style Transfer and Super-Resolution](https://arxiv.org/abs/1603.08155).\n\nYou can find an online demo [here](https://riseml.com/DmitryUlyanov/texture_nets) (thanks to RiseML). 
\n# Prerequisites\n- [Torch7](http://torch.ch/docs/getting-started.html) + [loadcaffe](https://github.com/szagoruyko/loadcaffe)\n- cudnn + torch.cudnn (optional)\n- [display](https://github.com/szym/display) (optional)\n\nDownload VGG-19.\n```\ncd data/pretrained \u0026\u0026 bash download_models.sh \u0026\u0026 cd ../..\n```\n\n# Stylization\n\u003c!-- \nContent image|  Dalaunay | Modern \n:-------------------------:|:-------------------------:|:------------------------------:\n![](data/readme_pics/karya.jpg \" \") | ![](data/readme_pics/karya512.jpg  \" \")| ![](data/readme_pics/karya_s_mo.jpg  \" \")\n --\u003e\n![](data/readme_pics/all.jpg \" \")\n\n### Training\n\n#### Preparing image dataset\n\nYou can use an image dataset of any kind. For my experiments I tried the `Imagenet` and `MS COCO` datasets. The structure of the folders should be the following:\n```\ndataset/train\ndataset/train/dummy\ndataset/val/\ndataset/val/dummy\n```\n\nThe dummy folders should contain images. The dataloader is based on the one used in [fb.resnet.torch](https://github.com/facebook/fb.resnet.torch). \n\nHere is a quick example for MSCOCO: \n```\nwget http://msvocds.blob.core.windows.net/coco2014/train2014.zip\nwget http://msvocds.blob.core.windows.net/coco2014/val2014.zip\nunzip train2014.zip\nunzip val2014.zip\nmkdir -p dataset/train\nmkdir -p dataset/val\nln -s `pwd`/val2014 dataset/val/dummy\nln -s `pwd`/train2014 dataset/train/dummy\n```\n\n#### Training a network\n\nBasic usage:\n```\nth train.lua -data \u003cpath to any image dataset\u003e  -style_image path/to/img.jpg\n```\n\nThese parameters work for me: \n```\nth train.lua -data \u003cpath to any image dataset\u003e -style_image path/to/img.jpg -style_size 600 -image_size 512 -model johnson -batch_size 4 -learning_rate 1e-2 -style_weight 10 -style_layers relu1_2,relu2_2,relu3_2,relu4_2 -content_layers relu4_2\n```\nCheck out the issues tab; you will find some useful advice there. 
\n\nTo achieve the results from the paper you need to play with `-image_size`, `-style_size`, `-style_layers`, `-content_layers`, `-style_weight`, `-tv_weight`. \n\nDo not hesitate to set `-batch_size` to one, but remember: the larger the `-batch_size`, the larger the `-learning_rate` you can use.   \n\n### Testing\n\n```\nth test.lua -input_image path/to/image.jpg -model_t7 data/checkpoints/model.t7\n```\n\nPlay with `-image_size` here. Set the `-cpu` flag to use the CPU for processing. \n\nYou can find a **pretrained model** [here](https://yadi.sk/d/GwL9jNJovBwQg). It is *not* the model from the paper.\n\n## Generating textures\n\nsoon\n\u003c!-- ## Train texture generator\n\n### Train\n\nThis command should train a generator close to what is presented in the paper. It is tricky: the variance in the results is rather high, and many things lead to degradation (even optimizing for too long).\n```\nth texture_train.lua -texture data/textures/red-peppers256.o.jpg -model_name pyramid -backend cudnn -num_iterations 1500 -vgg_no_pad true -normalize_gradients true -batch_size 15\n```\nThe generator will fit the texture\n\n![Texture](data/textures/red-peppers256.o.jpg)\n\nAnd here is a sample of size `512x512` after learning for 700 iterations:\n\n![Sample](data/readme_pics/peppers_sample.png)\n\n\nYou may also explore other models. We found that `pyramid2` requires a bigger `learning rate` of about `5e-1`. To prevent degradation, the noise dimensionality should be increased: `noise_depth 16`. It also converges more slowly.\n\nThis works well for me:\n```\nth texture_train.lua -texture data/textures/red-peppers256.o.jpg -gpu 0 -model_name pyramid2 -backend cudnn -num_iterations 1500 -vgg_no_pad true -normalize_gradients true -learning_rate 5e-1 -noise_depth 16\n```\n\n- `vgg_no_pad` corresponds to the padding option used in VGG. 
If set, padding mode = `valid`.\n\nThe samples and loss plot will appear in the `display` web interface.\n\n### Sample\n\nA sample from above can be obtained with\n```\nth texture_sample.lua -model data/out/model.t7 -noise_depth 3 -sample_size 512\n```\n`noise_depth` should correspond to the `noise_depth` used when training.\n\n## Stylization\n\n\n### Prepare\n\nWe used the ILSVRC2012 validation set to train a generator. One pass through the data was more than enough for the model described in the paper.\n\nExtract content from the `relu4_2` layer.\n```\nth scripts/extract4_2.lua -images_path \u003cpath/ILSVRC2012\u003e\n```\n### Train\n\nUse this command to learn a generator to stylize as in the next example.\n```\nth stylization_train.lua -style_image data/textures/cezanne.jpg -train_hdf5 \u003cpath/to/generated/hdf5\u003e -noise_depth 3 -model_name pyramid -normalize_gradients true -train_images_path \u003cpath/to/ILSVRC2012\u003e -content_weight 0.8\n\n```\n### Process\n\nStylize an image.\n```\nth stylization_process.lua -model data/out/model.t7 -input_image data/readme_pics/kitty.jpg -noise_depth 3\n```\nAgain, `noise_depth` should be consistent with the training setting.\n\n### Example\n\n![Cezanne](data/textures/cezanne.jpg)\n\n![Original](data/readme_pics/kitty.jpg)\n\n![Processed](data/readme_pics/kitty_cezanne.jpg)\n\n#### Variations\nWe were not able to achieve results similar to the original paper of L. Gatys on artistic style, which is partially explained by the balance problem (read the paper for details). Yet, while not transferring the style exactly as expected, the models produce nice pictures. We tried several hacks to redefine the objective function to be more suitable for a convolutional parametric generator; none of them worked considerably better, but the results were nice.\n\nFor the next pair we used a generator trained using only 16 images. Funnily, it did not overfit. 
Also, in this setting the net does not degrade for a much longer time if zero padding is used. Note that the tiger image was not in the training set.\n\n![Tiger](data/readme_pics/tiger.jpg)\n\n![Tiger_processed](data/readme_pics/tiger_starry.jpg)\nUsing \"Starry Night\" by Van Gogh. It takes about a quarter of a second to process an image at `1024 x 768` resolution.\n\n\nIn one of the experiments the generator failed to learn Van Gogh, but came out very stylish.\n\n![Pseudo](data/readme_pics/pseudo.png)\n\nThis model tried to fit both texture and content losses on a fixed set of 16 images and only the content loss on a large number of images.\n --\u003e\n\n# Hardware\n- The code was tested with a 12GB NVIDIA Titan X GPU and Ubuntu 14.04.\n- You may decrease `batch_size` and `image_size` if the model does not fit your GPU memory.\n- The pretrained models do not need much memory to sample.\n\n# Credits\n\nThe code is based on [Justin Johnson's great code](https://github.com/jcjohnson/neural-style) for artistic style.\n\nThe work was supported by [Yandex](https://www.yandex.ru/) and [Skoltech](http://sites.skoltech.ru/compvision/).\n","funding_links":[],"categories":["Style Transfer","LowLevelVision","Model Zoo"],"sub_categories":["3D SemanticSeg","Convolutional Networks"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDmitryUlyanov%2Ftexture_nets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDmitryUlyanov%2Ftexture_nets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDmitryUlyanov%2Ftexture_nets/lists"}