{"id":13578015,"url":"https://github.com/Spijkervet/SimCLR","last_synced_at":"2025-04-05T15:32:08.478Z","repository":{"id":40202313,"uuid":"246276098","full_name":"Spijkervet/SimCLR","owner":"Spijkervet","description":"PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations by T. Chen et al.","archived":false,"fork":false,"pushed_at":"2024-05-21T15:14:38.000Z","size":330,"stargazers_count":782,"open_issues_count":14,"forks_count":168,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-31T03:18:22.512Z","etag":null,"topics":["contrastive-learning","pytorch","representation-learning","simclr","unsupervised-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Spijkervet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-03-10T10:52:29.000Z","updated_at":"2025-03-27T09:32:33.000Z","dependencies_parsed_at":"2022-06-26T21:24:09.220Z","dependency_job_id":"fee41bb8-4f5f-403f-90d5-01a1024a236d","html_url":"https://github.com/Spijkervet/SimCLR","commit_stats":{"total_commits":96,"total_committers":6,"mean_commits":16.0,"dds":0.0625,"last_synced_commit":"cd85c4366d2e6ac1b0a16798b76ac0a2c8a94e58"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spijkervet%2FSimCLR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spijkervet%2FSimCLR/tags","releases_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/Spijkervet%2FSimCLR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spijkervet%2FSimCLR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Spijkervet","download_url":"https://codeload.github.com/Spijkervet/SimCLR/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247359179,"owners_count":20926378,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["contrastive-learning","pytorch","representation-learning","simclr","unsupervised-learning"],"created_at":"2024-08-01T15:01:26.330Z","updated_at":"2025-04-05T15:32:03.469Z","avatar_url":"https://github.com/Spijkervet.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# SimCLR\nPyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations by T. 
Chen et al.\nIt includes support for:\n- Distributed data parallel training\n- Global batch normalization\n- LARS (Layer-wise Adaptive Rate Scaling) optimizer.\n\n[Link to paper](https://arxiv.org/pdf/2002.05709.pdf)\n\nOpen SimCLR in Google Colab Notebook (with TPU support)\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1ObAYvVKQjMG5nd2wIno7j2y_X91E9IrX)\n\nOpen SimCLR results comparison on tensorboard.dev:\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://tensorboard.dev/experiment/A3laNdafRBes0oR45Y6LiA/#scalars\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://github.com/Spijkervet/SimCLR/blob/master/media/tensorboard.png?raw=true\" height=\"40\"/\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n### Quickstart (fine-tune linear classifier)\nThis downloads a pre-trained model and trains the linear classifier, which should reach an accuracy of approximately `82.9%` on the STL-10 test set.\n```\ngit clone https://github.com/spijkervet/SimCLR.git \u0026\u0026 cd SimCLR\nwget https://github.com/Spijkervet/SimCLR/releases/download/1.2/checkpoint_100.tar\nsh setup.sh || python3 -m pip install -r requirements.txt || exit 1\nconda activate simclr\npython linear_evaluation.py --dataset=STL10 --model_path=. --epoch_num=100 --resnet resnet50\n```\n\n#### CPU\n```\nwget https://github.com/Spijkervet/SimCLR/releases/download/1.1/checkpoint_100.tar -O checkpoint_100.tar\npython linear_evaluation.py --model_path=. --epoch_num=100 --resnet=resnet18 --logistic_batch_size=32\n```\n\n### `simclr` package\nSimCLR for PyTorch is now available as a Python package! 
Install it with pip and use it in your project:\n```\npip install simclr\n```\n\nYou can then simply import SimCLR:\n```\nfrom simclr import SimCLR\n\nencoder = ResNet(...)\nprojection_dim = 64\nn_features = encoder.fc.in_features  # get dimensions of last fully-connected layer\nmodel = SimCLR(encoder, projection_dim, n_features)\n```\n\n### Training ResNet encoder:\nSimply run the following to pre-train a ResNet encoder using SimCLR on the CIFAR-10 dataset:\n```\npython main.py --dataset CIFAR10\n```\n\n### Distributed Training\nWith distributed data parallel (DDP) training:\n```\nCUDA_VISIBLE_DEVICES=0 python main.py --nodes 2 --nr 0\nCUDA_VISIBLE_DEVICES=1 python main.py --nodes 2 --nr 1\nCUDA_VISIBLE_DEVICES=2 python main.py --nodes 2 --nr 2\nCUDA_VISIBLE_DEVICES=N python main.py --nodes 2 --nr 3\n```\n\n\n### Results\nThese are the top-1 accuracies of linear classifiers trained on the (frozen) representations learned by SimCLR:\n\n| Method  | Batch Size | ResNet | Projection output dimensionality | Epochs | Optimizer | STL-10 | CIFAR-10 |\n| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |\n| SimCLR + Linear eval. | 256 | ResNet50 | 64 | 100 | Adam | **0.829** | **0.833** | \n| SimCLR + Linear eval. | 256 | ResNet50 | 64 | 100 | LARS | 0.783 | - | \n| SimCLR + Linear eval. | 256 | ResNet18 | 64 | 100 |  Adam | 0.765  | - |\n| SimCLR + Linear eval. | 256 | ResNet18 | 64 | 40 | Adam | 0.719  | - |\n| SimCLR + Linear eval. 
| 512 | ResNet18 | 64 | 40 | Adam | 0.71 | - |\n| Logistic Regression | - | - | - | 40 | Adam | 0.358 | 0.389 |\n\n\n\n### Pre-trained models\n| ResNet (batch_size, epochs) | Optimizer | STL-10 Top-1 |\n| ------------- | ------------- | ------------- |\n| [ResNet50 (256, 100)](https://github.com/Spijkervet/SimCLR/releases/download/1.2/checkpoint_100.tar) | Adam | **0.829** |\n| [ResNet18 (256, 100)](https://github.com/Spijkervet/SimCLR/releases/download/1.1/checkpoint_100.tar) | Adam | 0.765 |\n| [ResNet18 (256, 40)](https://github.com/Spijkervet/SimCLR/releases/download/1.0/checkpoint_40.tar) | Adam | 0.719 |\n\n`python linear_evaluation.py --model_path=. --epoch_num=100`\n\n#### LARS optimizer\nThe LARS optimizer is implemented in `modules/lars.py`. It can be activated by adjusting the `config/config.yaml` optimizer setting to: `optimizer: \"LARS\"`. It is still experimental and has not been thoroughly tested.\n\n## What is SimCLR?\nSimCLR is a \"simple framework for contrastive learning of visual representations\". The contrastive prediction task is defined on pairs of augmented examples, resulting in 2N examples per minibatch. Two augmented versions of an image are considered as a correlated, \"positive\" pair (x_i and x_j). The remaining 2(N - 1) augmented examples are considered negative examples. 
The contrastive prediction task aims to identify x_j among the other augmented examples in the minibatch for a given x_i.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/Spijkervet/SimCLR/blob/master/media/architecture.png?raw=true\" width=\"500\"/\u003e\n\u003c/p\u003e\n\n## Usage\nRun the following command to set up a conda environment:\n```\nsh setup.sh\nconda activate simclr\n```\n\nOr alternatively with pip:\n```\npip install -r requirements.txt\n```\n\nThen, for single-GPU or CPU training, simply run:\n```\npython main.py\n```\n\nFor distributed training (DDP), run one command per process, where N is the GPU number you would like to dedicate to that process:\n```\nCUDA_VISIBLE_DEVICES=0 python main.py --nodes 2 --nr 0\nCUDA_VISIBLE_DEVICES=1 python main.py --nodes 2 --nr 1\nCUDA_VISIBLE_DEVICES=2 python main.py --nodes 2 --nr 2\nCUDA_VISIBLE_DEVICES=N python main.py --nodes 2 --nr 3\n```\n\n`--nr` corresponds to the process number of the N nodes we make available for training.\n\n### Testing\nTo test a trained model, make sure to set the `model_path` variable in `config/config.yaml` to the log ID of the training run (e.g. `logs/0`).\nSet `epoch_num` to the epoch number you want to load the checkpoint from (e.g. `40`).\n\n```\npython linear_evaluation.py\n```\n\nor pass the values on the command line:\n```\npython linear_evaluation.py --model_path=./save --epoch_num=40\n```\n\n\n## Configuration\nThe training configuration can be found in `config/config.yaml`. I personally prefer to use files instead of long strings of arguments when configuring a run. 
An example `config.yaml` file:\n```\n# train options\nbatch_size: 256\nworkers: 16\nstart_epoch: 0\nepochs: 40\ndataset_dir: \"./datasets\"\n\n# model options\nresnet: \"resnet18\"\nnormalize: True\nprojection_dim: 64\n\n# loss options\ntemperature: 0.5\n\n# reload options\nmodel_path: \"logs/0\" # set to the directory containing `checkpoint_##.tar` \nepoch_num: 40 # set to checkpoint number\n\n# logistic regression options\nlogistic_batch_size: 256\nlogistic_epochs: 100\n```\n\n## Logging and TensorBoard\nTo view results in TensorBoard, run:\n```\ntensorboard --logdir runs\n```\n\n## Optimizers and learning rate schedule\nThis implementation features the Adam optimizer and the LARS optimizer, with the option to decay the learning rate using a cosine decay schedule. The optimizer and weight decay can be configured in the `config/config.yaml` file.\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/Spijkervet/SimCLR/blob/master/media/lr_cosine_decay_schedule.png?raw=true\" width=\"400\"/\u003e\n\u003c/p\u003e\n\n#### Dependencies\n```\ntorch\ntorchvision\ntensorboard\npyyaml\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSpijkervet%2FSimCLR","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSpijkervet%2FSimCLR","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSpijkervet%2FSimCLR/lists"}