{"id":13395831,"url":"https://github.com/kamenbliznashki/normalizing_flows","last_synced_at":"2025-03-13T22:31:02.746Z","repository":{"id":45776665,"uuid":"163574100","full_name":"kamenbliznashki/normalizing_flows","owner":"kamenbliznashki","description":"Pytorch implementations of density estimation algorithms: BNAF, Glow, MAF, RealNVP, planar flows","archived":false,"fork":false,"pushed_at":"2021-07-12T12:40:38.000Z","size":2980,"stargazers_count":597,"open_issues_count":10,"forks_count":101,"subscribers_count":16,"default_branch":"master","last_synced_at":"2024-07-31T18:15:55.771Z","etag":null,"topics":["deep-learning","density-estmation","normalizing-flows","probability"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kamenbliznashki.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-12-30T08:50:07.000Z","updated_at":"2024-07-22T23:50:08.000Z","dependencies_parsed_at":"2022-07-22T06:02:07.757Z","dependency_job_id":null,"html_url":"https://github.com/kamenbliznashki/normalizing_flows","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kamenbliznashki%2Fnormalizing_flows","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kamenbliznashki%2Fnormalizing_flows/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kamenbliznashki%2Fnormalizing_flows/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kamenbliznashki%2Fnormalizing_flows/manifests","owner_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/owners/kamenbliznashki","download_url":"https://codeload.github.com/kamenbliznashki/normalizing_flows/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243493323,"owners_count":20299633,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","density-estmation","normalizing-flows","probability"],"created_at":"2024-07-30T18:00:33.428Z","updated_at":"2025-03-13T22:31:02.150Z","avatar_url":"https://github.com/kamenbliznashki.png","language":"Python","funding_links":[],"categories":["🧑‍💻 Code","Python","🧑‍💻 Repos \u003csmall\u003e(18)\u003c/small\u003e"],"sub_categories":["\u003cimg src=\"assets/pytorch.svg\" alt=\"PyTorch\" height=\"20px\"\u003e \u0026nbsp;PyTorch Repos"],"readme":"# Normalizing flows\n\nReimplementations of density estimation algorithms from:\n* [Block Neural Autoregressive Flow](https://arxiv.org/abs/1904.04676)\n* [Glow: Generative Flow with Invertible 1×1 Convolutions](https://arxiv.org/abs/1807.03039)\n* [Masked Autoregressive Flow for Density Estimation](https://arxiv.org/abs/1705.07057)\n* [Density Estimation using RealNVP](https://arxiv.org/abs/1605.08803)\n* [Variational Inference with Normalizing Flows](https://arxiv.org/abs/1505.05770)\n\n## Block Neural Autoregressive Flow\nhttps://arxiv.org/abs/1904.04676\n\nImplementation of BNAF on toy density estimation datasets.\n\n#### Results\nDensity estimation of 2d toy data and density estimation of 2d test energy potentials (cf. 
Figure 2 \u0026 3 in paper):\n\nThe models were trained for 20,000 steps with the architectures and hyperparameters described in Section 5 of the paper, with the exception of the `rings` dataset (bottom right), which had 5 hidden layers. The models trained significantly faster than the planar flow model in Rezende \u0026 Mohamed and were much more stable; interestingly, BNAF stretches space differently and requires a lot more test points to show a smooth potential.\n\n| Density matching on 2d energy potentials | Density estimation on 2d toy data |\n| --- | --- |\n| ![bnaf_u1](images/bnaf/bnaf_u1_vis_step_20000.png) | ![bnaf_8gaussians](images/bnaf/bnaf_8gaussians_vis_step_20000.png) |\n| ![bnaf_u2](images/bnaf/bnaf_u2_vis_step_20000.png) | ![bnaf_checkerboard](images/bnaf/bnaf_checkerboard_vis_step_20000.png) |\n| ![bnaf_u3](images/bnaf/bnaf_u3_vis_step_20000.png) | ![bnaf_2spirals](images/bnaf/bnaf_2spirals_vis_step_20000.png) |\n| ![bnaf_u4](images/bnaf/bnaf_u4_vis_step_20000.png) | ![bnaf_rings](images/bnaf/bnaf_rings_vis_step_20000.png) |\n\n\n#### Usage\nTo train a model:\n```\npython bnaf.py --train\n               --dataset      # choice from u1, u2, u3, u4, 8gaussians, checkerboard, 2spirals\n               --log_interval # how often to save the model and visualize results\n               --n_steps      # number of training steps\n               --n_hidden     # number of hidden layers\n               --hidden_dim   # dimension of the hidden layer\n               --[add'l options]\n```\nAdditional options are: learning rate, learning rate decay and patience, cuda device id, batch_size.\n\nTo plot a model:\n```\npython bnaf.py --plot\n               --restore_file [path to .pt checkpoint]\n```\n\n#### Useful resources\n* Official implementation by the authors https://github.com/nicola-decao/BNAF\n\n## Glow: Generative Flow with Invertible 1x1 Convolutions\nhttps://arxiv.org/abs/1807.03039\n\nImplementation of Glow on CelebA and MNIST datasets.\n\n#### Results\nI 
trained two models:\n- Model A with 3 levels, 32 depth, 512 width (~74M parameters). Trained on 5 bit images, batch size of 16 per GPU over 100K iterations.\n- Model B with 3 levels, 24 depth, 256 width (~22M parameters). Trained on 4 bit images, batch size of 32 per GPU over 100K iterations.\n\nIn both cases, gradients were clipped at norm 50, learning rate was 1e-3 with linear warmup from 0 over 2 epochs. Both reached similar results and 4.2 bits/dim.\n\n##### Samples at varying temperatures\nTemperatures ranging 0, 0.25, 0.5, 0.6, 0.7, 0.8, 0.9, 1 (rows, top to bottom):\n\n| Model A | Model B |\n| --- | --- |\n| ![model_a_range](images/glow/model_3_32_512_generated_samples_at_z_std_range.png) | ![model_b_range](images/glow/model_3_24_256_generated_samples_at_z_std_range.png) |\n\n##### Samples at temperature 0.7:\n| Model A | Model B |\n| --- | --- |\n| ![model_a_range](images/glow/model_3_32_512_generated_samples_at_z_std_0.7_seed_2.png) | ![model_b_range](images/glow/model_3_24_256_generated_samples_at_z_std_0.7.png) |\n\n##### Model A attribute manipulation on in-distribution sample:\n\nEmbedding vectors were calculated for the first 30K training images; the embeddings for positive and negative attributes were averaged and then subtracted. The resulting `dz` was ranged and applied on a test set image (middle image represents the unchanged / actual data point).\n\n| Attribute | `dz` range [-2, -1, 0, 1, 2] |\n| --- | --- |\n| Brown hair | ![attr_8](images/glow/manipulated_sample_attr_8.png) |\n| Male | ![attr_20](images/glow/manipulated_sample_attr_20.png) |\n| Mouth slightly opened | ![attr_21](images/glow/manipulated_sample_attr_21.png) |\n| Young | ![attr_39](images/glow/manipulated_sample_attr_39.png) |\n\n##### Model A attribute manipulation on 'out-of-distribution' sample (i.e. 
me):\n\n| Attribute | `dz` range |\n| --- | --- |\n| Brown hair | ![me_8](images/glow/manipulated_img_me3_attr_8.png) |\n| Mouth slightly opened | ![me_21](images/glow/manipulated_img_me1_attr_21.png) |\n\n\n#### Usage\n\nTo train a model using pytorch distributed package:\n```\npython -m torch.distributed.launch --nproc_per_node=NUM_GPUS_YOU_HAVE \\\n       glow.py --train \\\n               --distributed \\\n               --dataset=celeba \\\n               --data_dir=[path to data source] \\\n               --n_levels=3 \\\n               --depth=32 \\\n               --width=512 \\\n               --batch_size=16 [this is per GPU]\n```\nFor larger models or image sizes add `--checkpoint_grads` to checkpoint gradients using pytorch's library. I trained a 3 layer / 32 depth / 512 width model with batch size of 16 without gradient checkpointing and a 4 layer / 48 depth / 512 width model with batch size of 16 which had ~190M params so required gradient checkpointing (and was painfully slow on 8 GPUs).\n\n\nTo evaluate model:\n```\npython glow.py --evaluate \\\n               --restore_file=[path to .pt checkpoint] \\\n               --dataset=celeba \\\n               --data_dir=[path to data source] \\\n               --[options of the saved model: n_levels, depth, width, batch_size]\n```\n\nTo generate samples from a trained model:\n```\npython glow.py --generate \\\n               --restore_file=[path to .pt checkpoint] \\\n               --dataset=celeba \\\n               --data_dir=[path to data source] \\\n               --[options of the saved model: n_levels, depth, width, batch_size] \\\n               --z_std=[temperature parameter; if blank, generates range]\n```\n\nTo visualize manipulations on specific image given a trained model:\n```\npython glow.py --visualize \\\n               --restore_file=[path to .pt checkpoint] \\\n               --dataset=celeba \\\n               --data_dir=[path to data source] \\\n               --[options of the saved 
model: n_levels, depth, width, batch_size] \\\n               --z_std=[temperature parameter; if blank, uses default] \\\n               --vis_attrs=[list of indices of attribute to be manipulated, if blank, manipulates every attribute] \\\n               --vis_alphas=[list of values by which `dz` should be multiplied, defaults [-2,2]] \\\n               --vis_img=[path to image to manipulate (note: size needs to match dataset); if blank uses example from test dataset]\n```\n\n#### Datasets\n\nTo download CelebA follow the instructions [here](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html). A nice script that simplifies downloading and extracting can be found here: https://github.com/nperraud/download-celebA-HQ/\n\n\n#### References\n* Official implementation in Tensorflow: https://github.com/openai/glow\n\n\n## Masked Autoregressive Flow\nhttps://arxiv.org/abs/1705.07057\n\nReimplementation of MADE, MAF, Mixture of Gaussians MADE, Mixture of\nGaussians MAF, and RealNVP modules on UCI datasets and MNIST.\n\n#### Results\nAverage test log likelihood for un/conditional density estimation (cf.\nTable 1 \u0026 2 in paper for results and parameters; models here were trained for 50 epochs):\n\n| Model | POWER | GAS | HEPMASS | MINIBOONE | BSDS300 | MNIST (uncond) | MNIST (cond) |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| MADE | -3.10 +/- 0.02 | 2.53 +/- 0.02 | -21.13 +/- 0.01 | -15.36 +/- 15.06 | 146.42 +/- 0.14 | -1393.67 +/- 1.90 | -1340.98 +/- 1.71 |\n| MADE MOG | 0.37 +/- 0.01 | 8.08 +/- 0.02 | -15.70 +/- 0.02 | -11.64 +/- 0.44 | 153.56 +/- 0.28 | -1023.13 +/- 1.69 | -1013.75 +/- 1.61 |\n| RealNVP (5) | -0.49 +/- 0.01 | 7.01 +/- 0.06 | -19.96 +/- 0.02 | -16.88 +/- 0.21 | 148.34 +/- 0.26 | -1279.76 +/- 9.91 | -1276.33 +/- 12.21 |\n| MAF (5) | 0.03 +/- 0.01 | 6.23 +/- 0.01 | -17.97 +/- 0.01 | -11.57 +/- 0.21 | 153.53 +/- 0.27 | -1272.70 +/- 1.87 | -1268.24 +/- 2.73 |\n| MAF MOG (5) | 0.09 +/- 0.01 | 7.96 +/- 0.02 | -17.29 +/- 0.02 | -11.27 +/- 0.41 | 
153.35 +/- 0.26 | -1080.46 +/- 1.53 | -1070.33 +/- 1.53 |\n\n\nToy density model (cf. Figure 1 in paper):\n\n| Target density | Learned density with MADE \u003cbr\u003e and random numbers driving MADE | Learned density with MAF 5 layers \u003cbr\u003e and random numbers driving MAF |\n| --- | --- | --- |\n| ![fig1a](images/maf/figure_1a.png) | ![fig1b](images/maf/figure_1b.png) | ![fig1c](images/maf/figure_1c.png) |\n\nClass-conditional generated images from MNIST using MAF (5) model; generated data arranged by decreasing log probability (cf. Figure 3 in paper):\n\n![mafmnist](images/maf/generated_samples_maf5.png)\n\n#### Usage\nTo train a model:\n```\npython maf.py --train \\\n              --model=['made' | 'mademog' | 'maf' | 'mafmog' | 'realnvp'] \\\n              --dataset=['POWER' | 'GAS' | 'HEPMASS' | 'MINIBOONE' | 'BSDS300' | 'MNIST'] \\\n              --n_blocks=[for maf/mafmog and realnvp specify # of MADE-blocks / coupling layers] \\\n              --n_components=[if mixture of Gaussians, specify # of components] \\\n              --conditional [if MNIST, can train class-conditional log likelihood] \\\n              --[add'l options see py file]\n```\n\nTo evaluate a model:\n```\npython maf.py --evaluate \\\n              --restore_file=[path to .pt checkpoint] \\\n              --[options of the saved model: n_blocks, n_hidden, hidden_size, n_components, conditional]\n```\n\nTo generate data from a trained model (for MNIST dataset):\n```\npython maf.py --generate \\\n              --restore_file=[path to .pt checkpoint] \\\n              --dataset='MNIST' \\\n              --[options of the saved model: n_blocks, n_hidden, hidden_size, n_components, conditional]\n```\n\n#### Datasets\n\nDatasets and preprocessing code are forked from the MAF authors' implementation [here](https://github.com/gpapamak/maf#how-to-get-the-datasets). 
The unzipped datasets should be symlinked into the `./data` folder or the data_dir argument should be specified to point to the actual data.\n\n#### References\n* The original Theano implementation by the authors https://github.com/gpapamak/maf/\n* https://github.com/ikostrikov/pytorch-flows\n\n\n## Variational inference with normalizing flows\nImplementation of [Variational Inference with Normalizing Flows](https://arxiv.org/abs/1505.05770)\n\n#### Results\nDensity estimation of 2-d test energy potentials (cf. Table 1 \u0026 Figure 3 in paper).\n\n| Target density | Flow K = 2 | Flow K = 32 | Training parameters |\n| --- | --- | --- | --- |\n| ![uz1](images/nf/nf_uz1_target_potential_density.png) | ![uz1k2](images/nf/nf_uz1_flow_k2_density.png) | ![uz1k32](images/nf/nf_uz1_flow_k32_density.png) | weight init Normal(0,1), base dist. scale 2 |\n| ![uz2](images/nf/nf_uz2_target_potential_density.png) | ![uz2k2](images/nf/nf_uz2_flow_k2_density.png) | ![uz2k32](images/nf/nf_uz2_flow_k32_density.png) | weight init Normal(0,1), base dist. scale 1 |\n| ![uz3](images/nf/nf_uz3_target_potential_density.png) | ![uz3k2](images/nf/nf_uz3_flow_k2_density.png) | ![uz3k32](images/nf/nf_uz3_flow_k32_density.png) | weight init Normal(0,1), base dist. scale 1, weight decay 1e-3 |\n| ![uz4](images/nf/nf_uz4_target_potential_density.png) | ![uz4k2](images/nf/nf_uz4_flow_k2_density.png) | ![uz4k32](images/nf/nf_uz4_flow_k32_density.png) | weight init Normal(0,1), base dist. 
scale 4, weight decay 1e-3 |\n\n\n#### Usage\nTo train a model:\n```\npython planar_flow.py --train \\\n                      --target_potential=[choice from u_z1 | u_z2 | u_z3 | u_z4] \\\n                      --flow_length=[# of layers in flow] \\\n                      --[add'l options]\n```\nAdditional options are: base distribution (q0) scale, weight initialization\nscale, weight decay, learnable first affine layer (I did not find adding an affine layer beneficial).\n\nTo evaluate a model:\n```\npython planar_flow.py --evaluate \\\n                      --restore_file=[path to .pt checkpoint]\n```\n\n#### Useful resources\n* https://github.com/casperkaae/parmesan/issues/22\n\n\n## Dependencies\n* python 3.6\n* pytorch 1.0\n* numpy\n* matplotlib\n* tensorboardX\n\n###### Some of the datasets further require:\n* pandas\n* sklearn\n* h5py\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkamenbliznashki%2Fnormalizing_flows","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkamenbliznashki%2Fnormalizing_flows","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkamenbliznashki%2Fnormalizing_flows/lists"}