{"id":13564279,"url":"https://github.com/lucidrains/byol-pytorch","last_synced_at":"2025-05-14T18:05:27.695Z","repository":{"id":40593704,"uuid":"272785290","full_name":"lucidrains/byol-pytorch","owner":"lucidrains","description":"Usable Implementation of \"Bootstrap Your Own Latent\" self-supervised learning, from Deepmind, in Pytorch","archived":false,"fork":false,"pushed_at":"2024-07-15T18:28:22.000Z","size":74,"stargazers_count":1820,"open_issues_count":41,"forks_count":249,"subscribers_count":26,"default_branch":"master","last_synced_at":"2025-05-10T13:55:40.179Z","etag":null,"topics":["artificial-intelligence","deep-learning","self-supervised-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-06-16T18:38:32.000Z","updated_at":"2025-05-08T21:14:25.000Z","dependencies_parsed_at":"2024-11-26T12:48:52.662Z","dependency_job_id":null,"html_url":"https://github.com/lucidrains/byol-pytorch","commit_stats":{"total_commits":65,"total_committers":6,"mean_commits":"10.833333333333334","dds":"0.10769230769230764","last_synced_commit":"0c3ab5409181852f8495ef924dce9186f94d9126"},"previous_names":[],"tags_count":30,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fbyol-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fbyol-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/lucidrains%2Fbyol-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fbyol-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/byol-pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254198514,"owners_count":22030965,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","self-supervised-learning"],"created_at":"2024-08-01T13:01:29.096Z","updated_at":"2025-05-14T18:05:22.687Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":["Python","Computer Vision"],"sub_categories":["Image Representation Learning"],"readme":"\u003cimg src=\"./diagram.png\" width=\"700px\"\u003e\u003c/img\u003e\n\n## Bootstrap Your Own Latent (BYOL), in Pytorch\n\n[![PyPI version](https://badge.fury.io/py/byol-pytorch.svg)](https://badge.fury.io/py/byol-pytorch)\n\nPractical implementation of an \u003ca href=\"https://arxiv.org/abs/2006.07733\"\u003eastoundingly simple method\u003c/a\u003e for self-supervised learning that achieves a new state of the art (surpassing SimCLR) without contrastive learning or having to designate negative pairs.\n\nThis repository offers a module with which one can easily wrap any image-based neural network (residual network, discriminator, policy network) to immediately start benefitting from unlabelled image data.\n\nUpdate 1: There is now \u003ca 
href=\"https://untitled-ai.github.io/understanding-self-supervised-contrastive-learning.html\"\u003enew evidence\u003c/a\u003e that batch normalization is key to making this technique work well\n\nUpdate 2: A \u003ca href=\"https://arxiv.org/abs/2010.10241\"\u003enew paper\u003c/a\u003e has successfully replaced batch norm with group norm + weight standardization, refuting the claim that batch statistics are needed for BYOL to work\n\nUpdate 3: Finally, we have \u003ca href=\"https://arxiv.org/abs/2102.06810\"\u003esome analysis\u003c/a\u003e for why this works\n\n\u003ca href=\"https://www.youtube.com/watch?v=YPfUiOMYOEE\"\u003eYannic Kilcher's excellent explanation\u003c/a\u003e\n\nNow go save your organization from having to pay for labels :)\n\n## Install\n\n```bash\n$ pip install byol-pytorch\n```\n\n## Usage\n\nSimply plug in your neural network, specifying (1) the image dimensions as well as (2) the name (or index) of the hidden layer whose output is used as the latent representation for self-supervised training.\n\n```python\nimport torch\nfrom byol_pytorch import BYOL\nfrom torchvision import models\n\nresnet = models.resnet50(pretrained=True)\n\nlearner = BYOL(\n    resnet,\n    image_size = 256,\n    hidden_layer = 'avgpool'\n)\n\nopt = torch.optim.Adam(learner.parameters(), lr=3e-4)\n\ndef sample_unlabelled_images():\n    return torch.randn(20, 3, 256, 256)\n\nfor _ in range(100):\n    images = sample_unlabelled_images()\n    loss = learner(images)\n    opt.zero_grad()\n    loss.backward()\n    opt.step()\n    learner.update_moving_average() # update moving average of target encoder\n\n# save your improved network\ntorch.save(resnet.state_dict(), './improved-net.pt')\n```\n\nThat's pretty much it. 
After much training, the residual network should now perform better on its downstream supervised tasks.\n\n## BYOL → SimSiam\n\nA \u003ca href=\"https://arxiv.org/abs/2011.10566\"\u003enew paper\u003c/a\u003e from Kaiming He suggests that BYOL does not even need the target encoder to be an exponential moving average of the online encoder. I've decided to build in this option so that you can easily use that variant for training, simply by setting the `use_momentum` flag to `False`. You will no longer need to invoke `update_moving_average` if you go this route as shown in the example below.\n\n```python\nimport torch\nfrom byol_pytorch import BYOL\nfrom torchvision import models\n\nresnet = models.resnet50(pretrained=True)\n\nlearner = BYOL(\n    resnet,\n    image_size = 256,\n    hidden_layer = 'avgpool',\n    use_momentum = False       # turn off momentum in the target encoder\n)\n\nopt = torch.optim.Adam(learner.parameters(), lr=3e-4)\n\ndef sample_unlabelled_images():\n    return torch.randn(20, 3, 256, 256)\n\nfor _ in range(100):\n    images = sample_unlabelled_images()\n    loss = learner(images)\n    opt.zero_grad()\n    loss.backward()\n    opt.step()\n\n# save your improved network\ntorch.save(resnet.state_dict(), './improved-net.pt')\n```\n\n## Advanced\n\nWhile the hyperparameters have already been set to what the paper has found optimal, you can change them with extra keyword arguments to the base wrapper class.\n\n```python\nlearner = BYOL(\n    resnet,\n    image_size = 256,\n    hidden_layer = 'avgpool',\n    projection_size = 256,           # the projection size\n    projection_hidden_size = 4096,   # the hidden dimension of the MLP for both the projection and prediction\n    moving_average_decay = 0.99      # the moving average decay factor for the target encoder, already set at what paper recommends\n)\n```\n\nBy default, this library will use the augmentations from the SimCLR paper (which is also used in the BYOL paper). 
However, if you would like to specify your own augmentation pipeline, you can simply pass in your own custom augmentation function with the `augment_fn` keyword.\n\n```python\nimport kornia\nfrom torch import nn\n\naugment_fn = nn.Sequential(\n    kornia.augmentation.RandomHorizontalFlip()\n)\n\nlearner = BYOL(\n    resnet,\n    image_size = 256,\n    hidden_layer = -2,\n    augment_fn = augment_fn\n)\n```\n\nIn the paper, they seem to ensure that one of the augmentations has a higher Gaussian blur probability than the other. You can also adjust this to your heart's delight.\n\n```python\nimport kornia\nfrom torch import nn\n\naugment_fn = nn.Sequential(\n    kornia.augmentation.RandomHorizontalFlip()\n)\n\naugment_fn2 = nn.Sequential(\n    kornia.augmentation.RandomHorizontalFlip(),\n    kornia.filters.GaussianBlur2d((3, 3), (1.5, 1.5))\n)\n\nlearner = BYOL(\n    resnet,\n    image_size = 256,\n    hidden_layer = -2,\n    augment_fn = augment_fn,\n    augment_fn2 = augment_fn2,\n)\n```\n\nTo fetch the embeddings or the projections, you simply have to pass in a `return_embedding = True` flag to the `BYOL` learner instance\n\n```python\nimport torch\nfrom byol_pytorch import BYOL\nfrom torchvision import models\n\nresnet = models.resnet50(pretrained=True)\n\nlearner = BYOL(\n    resnet,\n    image_size = 256,\n    hidden_layer = 'avgpool'\n)\n\nimgs = torch.randn(2, 3, 256, 256)\nprojection, embedding = learner(imgs, return_embedding = True)\n```\n\n## Distributed Training\n\nThe repository now offers distributed training with \u003ca href=\"https://huggingface.co/docs/accelerate/index\"\u003e🤗 Huggingface Accelerate\u003c/a\u003e. 
You just have to pass your own `Dataset` into the imported `BYOLTrainer`\n\nFirst set up the configuration for distributed training by invoking the accelerate CLI\n\n```bash\n$ accelerate config\n```\n\nThen craft your training script as shown below, say in `./train.py`\n\n```python\nfrom torchvision import models\n\nfrom byol_pytorch import (\n    BYOL,\n    BYOLTrainer,\n    MockDataset\n)\n\nresnet = models.resnet50(pretrained = True)\n\ndataset = MockDataset(256, 10000)\n\ntrainer = BYOLTrainer(\n    resnet,\n    dataset = dataset,\n    image_size = 256,\n    hidden_layer = 'avgpool',\n    learning_rate = 3e-4,\n    num_train_steps = 100_000,\n    batch_size = 16,\n    checkpoint_every = 1000     # improved model will be saved periodically to ./checkpoints folder \n)\n\ntrainer()\n```\n\nThen use the accelerate CLI again to launch the script\n\n```bash\n$ accelerate launch ./train.py\n```\n\n## Alternatives\n\nIf your downstream task involves segmentation, please look at the following repository, which extends BYOL to 'pixel'-level learning.\n\nhttps://github.com/lucidrains/pixel-level-contrastive-learning\n\n## Citation\n\n```bibtex\n@misc{grill2020bootstrap,\n    title = {Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning},\n    author = {Jean-Bastien Grill and Florian Strub and Florent Altché and Corentin Tallec and Pierre H. 
Richemond and Elena Buchatskaya and Carl Doersch and Bernardo Avila Pires and Zhaohan Daniel Guo and Mohammad Gheshlaghi Azar and Bilal Piot and Koray Kavukcuoglu and Rémi Munos and Michal Valko},\n    year = {2020},\n    eprint = {2006.07733},\n    archivePrefix = {arXiv},\n    primaryClass = {cs.LG}\n}\n```\n\n```bibtex\n@misc{chen2020exploring,\n    title={Exploring Simple Siamese Representation Learning}, \n    author={Xinlei Chen and Kaiming He},\n    year={2020},\n    eprint={2011.10566},\n    archivePrefix={arXiv},\n    primaryClass={cs.CV}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fbyol-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Fbyol-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fbyol-pytorch/lists"}