{"id":23399662,"url":"https://github.com/filipbasara0/simple-ijepa","last_synced_at":"2025-04-08T19:53:04.337Z","repository":{"id":252741193,"uuid":"841152110","full_name":"filipbasara0/simple-ijepa","owner":"filipbasara0","description":"A simple and efficient implementation of Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture (I-JEPA)","archived":false,"fork":false,"pushed_at":"2024-08-16T08:50:46.000Z","size":20,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-14T15:46:49.197Z","etag":null,"topics":["computer-vision","i-jepa","jepa","joint-embedding","representation-learning","self-distillation","self-supervised-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/filipbasara0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-11T19:41:55.000Z","updated_at":"2024-08-16T08:50:49.000Z","dependencies_parsed_at":"2024-08-16T10:11:00.091Z","dependency_job_id":null,"html_url":"https://github.com/filipbasara0/simple-ijepa","commit_stats":null,"previous_names":["filipbasara0/simple-ijepa"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipbasara0%2Fsimple-ijepa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipbasara0%2Fsimple-ijepa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipbasara0%2Fsimple-ijepa/
releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipbasara0%2Fsimple-ijepa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/filipbasara0","download_url":"https://codeload.github.com/filipbasara0/simple-ijepa/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247918939,"owners_count":21018043,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","i-jepa","jepa","joint-embedding","representation-learning","self-distillation","self-supervised-learning"],"created_at":"2024-12-22T10:15:33.482Z","updated_at":"2025-04-08T19:53:04.318Z","avatar_url":"https://github.com/filipbasara0.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Simple I-JEPA\n\nA simple and efficient PyTorch implementation of Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture (I-JEPA).\n\n## Results\n\nThe model was pre-trained on 100,000 unlabeled images from the `STL-10` dataset. For evaluation, I trained and tested logistic regression on frozen features obtained from 5k train images and evaluated on 8k test images, also from the `STL-10` dataset.\n\nLinear probing was used for evaluating on features extracted from encoders using the scikit LogisticRegression model. Image resolution was `96x96`.\n\nMore detailed evaluation steps and results for [STL10](https://github.com/filipbasara0/simple-ijepa/blob/main/notebooks/linear-probing-stl.ipynb) can be found in the notebooks directory. 
\n\n| Dataset       | Approach    | Encoder           | Emb. dim | Patch size | Num. targets | Batch size | Epochs   | Top-1 acc. (%) |\n|---------------|-------------|-------------------|----------|------------|--------------|------------|----------|----------------|\n| STL10         | I-JEPA      | VisionTransformer | 512      | 8          | 4            | 256        | 100      | 77.07          |\n\nAll experiments were done using a very small and shallow VisionTransformer (only 11M params) with the following parameters:\n* embedding dimension - `512`\n* depth (number of transformer layers) - `6`\n* number of heads - `6`\n* mlp dim - `2 * embedding dimension`\n* patch size - `8`\n* number of targets - `4`\n\nThe mask generator is inspired by the original paper, but slightly simplified.\n\n## Usage\n\n### Installation\n\nTo set up the code, clone the repository, optionally create a venv and install requirements:\n\n1. `git clone git@github.com:filipbasara0/simple-ijepa.git`\n2. create virtual environment: `virtualenv -p python3.10 env`\n3. activate virtual environment: `source env/bin/activate`\n4. 
install requirements: `pip install .`\n\n\n### Examples\n\nThe `STL-10` model was trained with this command:\n\n`python run_training.py --fp16_precision --log_every_n_steps 200 --num_epochs 100 --batch_size 256`\n\n### Detailed options\nOnce the code is set up, run the following command with the options listed below:\n`python run_training.py [args...]⬇️`\n\n```\nI-JEPA\n\noptions:\n  -h, --help            show this help message and exit\n  --dataset_path DATASET_PATH\n                        Path where datasets will be saved\n  --dataset_name {stl10}\n                        Dataset name\n  -save_model_dir SAVE_MODEL_DIR\n                        Path where models\n  --num_epochs NUM_EPOCHS\n                        Number of epochs for training\n  -b BATCH_SIZE, --batch_size BATCH_SIZE\n                        Batch size\n  -lr LEARNING_RATE, --learning_rate LEARNING_RATE\n  -wd WEIGHT_DECAY, --weight_decay WEIGHT_DECAY\n  --fp16_precision      Whether to use 16-bit precision for GPU training\n  --emb_dim EMB_DIM     Transofmer embedding dimm\n  --log_every_n_steps LOG_EVERY_N_STEPS\n                        Log every n steps\n  --gamma GAMMA         Initial EMA coefficient\n  --update_gamma_after_step UPDATE_GAMMA_AFTER_STEP\n                        Update EMA gamma after this step\n  --update_gamma_every_n_steps UPDATE_GAMMA_EVERY_N_STEPS\n                        Update EMA gamma after this many steps\n  --ckpt_path CKPT_PATH\n                        Specify path to ijepa_model.pth to resume training\n```\n\n## Citation\n\n```\n@misc{assran2023selfsupervisedlearningimagesjointembedding,\n      title={Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture}, \n      author={Mahmoud Assran and Quentin Duval and Ishan Misra and Piotr Bojanowski and Pascal Vincent and Michael Rabbat and Yann LeCun and Nicolas Ballas},\n      year={2023},\n      eprint={2301.08243},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV},\n      
url={https://arxiv.org/abs/2301.08243}, \n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffilipbasara0%2Fsimple-ijepa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffilipbasara0%2Fsimple-ijepa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffilipbasara0%2Fsimple-ijepa/lists"}