{"id":27814381,"url":"https://github.com/rakutentech/iterative_training","last_synced_at":"2025-07-04T10:34:07.849Z","repository":{"id":144904782,"uuid":"203558667","full_name":"rakutentech/iterative_training","owner":"rakutentech","description":"Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization","archived":false,"fork":false,"pushed_at":"2021-11-19T04:13:40.000Z","size":183,"stargazers_count":0,"open_issues_count":0,"forks_count":2,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-05-01T12:43:02.978Z","etag":null,"topics":["deep-learning","machine-learning","model-compression","neural-network"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rakutentech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-08-21T10:11:19.000Z","updated_at":"2021-11-19T04:13:43.000Z","dependencies_parsed_at":"2023-07-08T05:15:31.948Z","dependency_job_id":null,"html_url":"https://github.com/rakutentech/iterative_training","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rakutentech/iterative_training","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakutentech%2Fiterative_training","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakutentech%2Fiterative_training/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakutentech%2Fiterative_training/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakutentech%2Fiterative_training/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rakutentech","download_url":"https://codeload.github.com/rakutentech/iterative_training/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakutentech%2Fiterative_training/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263493378,"owners_count":23475184,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","machine-learning","model-compression","neural-network"],"created_at":"2025-05-01T12:39:45.596Z","updated_at":"2025-07-04T10:34:07.833Z","avatar_url":"https://github.com/rakutentech.png","language":"Python","readme":"# Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization\n\nThis repository contains the source code for the paper: [https://arxiv.org/abs/2111.07046](https://arxiv.org/abs/2111.07046).\n\n## Requirements\n\n* GPU\n* Python 3\n* PyTorch 1.9\n  * Earlier version may work, but untested.\n* `pip install -r 
### 300-100-10

#### Sensitivity Pre-training

```sh
# Layer 1. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity forward 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity 231 0.1 0
# Layer 3. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity reverse 0.1 0
```

Output files and run-log are written to `./logs/mnist/val/sensitivity/`.


#### Hyperparam Search

For floating-point training:

```sh
# Learning rate 0.1.
./scripts/mnist/300/val/float.sh hyperparam 0.1 0
```

For full binary training:

```sh
# Learning rate 0.1.
./scripts/mnist/300/val/binary.sh hyperparam 0.1 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam forward 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam reverse 0.1 0
# 1, 3, 2 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 132 0.1 0
# 2, 1, 3 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 213 0.1 0
# 2, 3, 1 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 231 0.1 0
# 3, 1, 2 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 312 0.1 0
```

Output files and run-log are written to `./logs/mnist/val/hyperparam/`.
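To sweep all six layer orders in one go, a simple loop works. A sketch, assuming the runs can execute sequentially on a single GPU; the order tokens are exactly those accepted by the script above:

```sh
# Run the iterative-training hyperparam search once per layer order.
for order in forward reverse 132 213 231 312; do
  ./scripts/mnist/300/val/layer.sh hyperparam "$order" 0.1 0
done
```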
#### Full Training

For floating-point training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/float.sh full 0.1 316 0
```

For full binary training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/binary.sh full 0.1 316 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full forward 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full reverse 0.1 316 0
# 1, 3, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 132 0.1 316 0
# 2, 1, 3 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 213 0.1 316 0
# 2, 3, 1 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 231 0.1 316 0
# 3, 1, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 312 0.1 316 0
```

Output files and run-log are written to `./logs/mnist/run/full/`.


### 784-100-10

#### Sensitivity Pre-training

```sh
# Layer 1. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity forward 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity 231 0.1 0
# Layer 3. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity reverse 0.1 0
```

Output files and run-log are written to `./logs/mnist/val/sensitivity/`.


#### Hyperparam Search

For floating-point training:

```sh
# Learning rate 0.1.
./scripts/mnist/784/val/float.sh hyperparam 0.1 0
```

For full binary training:

```sh
# Learning rate 0.1.
./scripts/mnist/784/val/binary.sh hyperparam 0.1 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam forward 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam reverse 0.1 0
# 1, 3, 2 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 132 0.1 0
# 2, 1, 3 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 213 0.1 0
# 2, 3, 1 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 231 0.1 0
# 3, 1, 2 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 312 0.1 0
```

Output files and run-log are written to `./logs/mnist/val/hyperparam/`.


#### Full Training

For floating-point training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/float.sh full 0.1 316 0
```

For full binary training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/binary.sh full 0.1 316 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full forward 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full reverse 0.1 316 0
# 1, 3, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 132 0.1 316 0
# 2, 1, 3 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 213 0.1 316 0
# 2, 3, 1 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 231 0.1 316 0
# 3, 1, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 312 0.1 316 0
```

Output files and run-log are written to `./logs/mnist/run/full/`.
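To average results over several random seeds, only the seed argument needs to change between runs. A sketch; 316 is the seed used in the scripts above, and the other values here are arbitrary examples, not seeds from the paper:

```sh
# Full binary training repeated with different seeds
# (316 from the scripts above; 1, 2, and 3 are hypothetical extras).
for seed in 316 1 2 3; do
  ./scripts/mnist/784/run/binary.sh full 0.1 "$seed" 0
done
```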
### VGG-5

#### Sensitivity Pre-training

```sh
# Layer 1. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 2 0.1 0
# Layer 5. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 5 0.1 0
```

Output files and run-log are written to `./logs/cifar10/val/sensitivity/`.

#### Hyperparam Search

For floating-point training:

```sh
# Learning rate 0.1.
./scripts/cifar10/vgg5/val/float.sh hyperparam 0.1 0
```

For full binary training:

```sh
# Learning rate 0.1.
./scripts/cifar10/vgg5/val/binary.sh hyperparam 0.1 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam random 0.1 0
```

Output files and run-log are written to `./logs/cifar10/val/hyperparam/`.

#### Full Training

For floating-point training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/float.sh full 0.1 316 0
```

For full binary training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/binary.sh full 0.1 316 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full random 0.1 316 0
```

Output files and run-log are written to `./logs/cifar10/run/full/`.


### VGG-9

#### Sensitivity Pre-training

```sh
# Layer 1. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 2 0.1 0
# Layer 5. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 5 0.1 0
```

Output files and run-log are written to `./logs/cifar10/val/sensitivity/`.

#### Hyperparam Search

For floating-point training:

```sh
# Learning rate 0.1.
./scripts/cifar10/vgg9/val/float.sh hyperparam 0.1 0
```

For full binary training:

```sh
# Learning rate 0.1.
./scripts/cifar10/vgg9/val/binary.sh hyperparam 0.1 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam random 0.1 0
```

Output files and run-log are written to `./logs/cifar10/val/hyperparam/`.

#### Full Training

For floating-point training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/float.sh full 0.1 316 0
```

For full binary training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/binary.sh full 0.1 316 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full random 0.1 316 0
```

Output files and run-log are written to `./logs/cifar10/run/full/`.


### ResNet-20

#### Sensitivity Pre-training

```sh
# Layer 1. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 2 0.1 0
# ...
# Layer 20. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 20 0.1 0
```

Output files and run-log are written to `./logs/cifar10/val/sensitivity/`.
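To cover all 20 layers without writing out each command, a loop like the following can be used. A sketch, assuming the layer index is the only argument that changes between runs:

```sh
# Sensitivity pre-training for every layer of ResNet-20.
for layer in $(seq 1 20); do
  ./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity "$layer" 0.1 0
done
```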
#### Hyperparam Search

For floating-point training:

```sh
# Learning rate 0.1.
./scripts/cifar10/resnet20/val/float.sh hyperparam 0.1 0
```

For full binary training:

```sh
# Learning rate 0.1.
./scripts/cifar10/resnet20/val/binary.sh hyperparam 0.1 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1.
./scripts/cifar10/resnet20/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1.
./scripts/cifar10/resnet20/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/cifar10/resnet20/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1.
./scripts/cifar10/resnet20/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1.
./scripts/cifar10/resnet20/val/layer.sh hyperparam random 0.1 0
```

Output files and run-log are written to `./logs/cifar10/val/hyperparam/`.


#### Full Training

For floating-point training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/float.sh full 0.1 316 0
```

For full binary training:

```sh
# Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/binary.sh full 0.1 316 0
```

For iterative training:

```sh
# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full random 0.1 316 0
```

Output files and run-log are written to `./logs/cifar10/run/full/`.


### ResNet-21

To run experiments for ResNet-21, first download and prepare the ImageNet dataset.
See the requirements section at the beginning of this readme.
The commands below assume the prepared dataset is at `./imagenet`.
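Before launching a long ImageNet run, it can be worth verifying that the dataset directory is actually in place. A minimal sketch; the path is simply the `./imagenet` location assumed above:

```sh
# Abort early if the prepared ImageNet dataset is missing.
if [ ! -d ./imagenet ]; then
  echo "ImageNet dataset not found at ./imagenet; run bin/imagenet_prep.sh first." >&2
  exit 1
fi
```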
#### Sensitivity Pre-training

```sh
# Layer 1. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 1 0.01
# Layer 2. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 2 0.01
# Layer 21. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 21 0.01
```

Output files and run-log are written to `./logs/imagenet/sensitivity/`.

#### Full Training

For floating-point training:

```sh
# Learning rate 0.01.
./scripts/imagenet/float.sh full ./imagenet 67 "[42,57]" 0.01
```

For full binary training:

```sh
# Learning rate 0.01.
./scripts/imagenet/binary.sh full ./imagenet 67 "[42,57]" 0.01
```

For layer-by-layer (iterative) training:

```sh
# Forward order. Learning rate 0.01.
./scripts/imagenet/layer.sh full ./imagenet 67 "[42,57]" 2 forward 0.01
# Ascending order. Learning rate 0.01.
./scripts/imagenet/layer.sh full ./imagenet 67 "[42,57]" 2 ascend 0.01
```

For all scripts, output files and run-log are written to `./logs/imagenet/full/`.


## License

See [LICENSE](LICENSE).

## Contributing

See the [contributing guide](CONTRIBUTING.md) for details on how to participate in the development of this module.