{"id":17155246,"url":"https://github.com/baijiong-lin/lora-torch","last_synced_at":"2025-10-13T06:39:16.879Z","repository":{"id":184257088,"uuid":"671548962","full_name":"Baijiong-Lin/LoRA-Torch","owner":"Baijiong-Lin","description":"PyTorch Reimplementation of LoRA (featuring with supporting nn.MultiheadAttention in OpenCLIP)","archived":false,"fork":false,"pushed_at":"2025-06-10T04:51:05.000Z","size":62,"stargazers_count":67,"open_issues_count":3,"forks_count":7,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-08-22T05:43:41.311Z","etag":null,"topics":["fine-tuning","finetuning","lora","peft"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Baijiong-Lin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-07-27T15:12:58.000Z","updated_at":"2025-08-05T07:04:20.000Z","dependencies_parsed_at":"2024-12-05T07:21:24.001Z","dependency_job_id":"ad1d6583-8210-4b36-88ec-471b92ac2e52","html_url":"https://github.com/Baijiong-Lin/LoRA-Torch","commit_stats":null,"previous_names":["baijiong-lin/lora-torch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Baijiong-Lin/LoRA-Torch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baijiong-Lin%2FLoRA-Torch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baijiong-Lin%2FLoRA-Torch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baijiong-Lin%2FLoRA-Torch/releases","manifests_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/Baijiong-Lin%2FLoRA-Torch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Baijiong-Lin","download_url":"https://codeload.github.com/Baijiong-Lin/LoRA-Torch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baijiong-Lin%2FLoRA-Torch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279013975,"owners_count":26085429,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fine-tuning","finetuning","lora","peft"],"created_at":"2024-10-14T21:51:02.068Z","updated_at":"2025-10-13T06:39:16.828Z","avatar_url":"https://github.com/Baijiong-Lin.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LoRA-Torch\n\n[![Made With Love](https://img.shields.io/badge/Made%20With-Love-orange.svg)](https://github.com/Baijiong-Lin/LoRA-Torch)\n\nThis codebase reimplements [LoRA: Low-Rank Adaptation of Large Language Models (ICLR 2022)](https://openreview.net/forum?id=nZeVKeeFYf9) and is built on [loralib](https://github.com/microsoft/LoRA). \n\n\n\n## Features\n\n**The implementations of ``loratorch`` and ``loralib`` are very different.** We take ``nn.Linear`` as an example.\n\n1. 
For ``loralib``,\n   $h = x W_0^\\top + \\frac{\\alpha}{r} x(BA)^\\top,$\n\nwhere $x\\in\\mathbb{R}^{k\\times n}$ is the input matrix, $W_0\\in\\mathbb{R}^{m\\times n}$ is the pre-trained weight matrix, $r$ is the predefined LoRA rank, $B\\in\\mathbb{R}^{m\\times r}$ and $A\\in \\mathbb{R}^{r\\times n}$ are the LoRA matrices, and $\\alpha$ is a hyper-parameter.\n\n2. For ``loratorch``,\n   $h = x (W_0 + \\frac{\\alpha}{r} BA)^\\top.$\n   \n   \n\n``loralib`` computes $xW_0^\\top$ and $x(BA)^\\top$ separately and then sums the results, whereas ``loratorch`` first merges the pre-trained weight $W_0$ with its LoRA weight $BA$ and then computes the result by simply calling ``nn.Linear.forward()``. For linear layers, the two approaches are mathematically equivalent. For some nonlinear or complex layers, however, we cannot be sure that the layer satisfies $L(x, W_0)+L(x, BA) = L(x, W_0+BA)$. Hence, it is difficult to extend LoRA to such layers with ``loralib``. In contrast, merging the weights first, as ``loratorch`` does, is more general and extensible: you just call ``merge_lora_param()`` in ``loratorch`` to merge the weights and then call ``forward()`` of the original layer to compute the results. 
With the help of ``loratorch``, you can easily apply LoRA to any type of layer in ``torch.nn``.\n\n\n\n## Supported Layers\n\n| Layer                     | ``loralib``    | ``loratorch``  | Example                                            |\n| ------------------------- |:--------------:|:--------------:| -------------------------------------------------- |\n| ``nn.Linear``             | ✓              | ✓              | [linear.ipynb](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/linear.ipynb)            |\n| ``nn.Embedding``          | ✓              | ✓              | [embedding.ipynb](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/embedding.ipynb)      |\n| ``nn.Conv1d``             | ✓              | ✓              |                                                    |\n| ``nn.Conv2d``             | ✓              | ✓              |                                                    |\n| ``nn.Conv3d``             | ✓              | ✓              |                                                    |\n| ``nn.MultiheadAttention`` | ✘              | ✓              |  [Finetune_open_clip_with_LoRA_Torch_on_CIFAR10.ipynb](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/Finetune_open_clip_with_LoRA_Torch_on_CIFAR10.ipynb)   |\n| ``MergedLinear``          | ✓ (Error)      | ✓              | [mergedlinear.ipynb](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/mergedlinear.ipynb) |\n| $\\cdots$                  | hard to extend | easy to extend |                                                    |\n\n*We compare the results of ``loralib`` and ``loratorch`` in [examples](./examples) to demonstrate the correctness of the implementation in ``loratorch``.*\n\n\n\n## Quick Start\n\n:bangbang: We have provided an [example](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/Finetune_open_clip_with_LoRA_Torch_on_CIFAR10.ipynb) to demonstrate how to apply LoRA-Torch to ``nn.MultiheadAttention`` 
in OpenCLIP. We greatly appreciate [@vietvo89](https://github.com/vietvo89)'s valuable contribution.\n\n**The usage of ``loratorch`` is the same as ``loralib``.**\n\n1. Install ``loratorch``.\n   \n   ```bash\n   pip install git+https://github.com/Baijiong-Lin/LoRA-Torch\n   # Alternatively for developers\n   # git clone https://github.com/Baijiong-Lin/LoRA-Torch\n   # cd LoRA-Torch\n   # pip install -e .\n   ```\n\n2. Replace the layers to which you want to apply LoRA with their ``loratorch`` counterparts.\n   \n   ```python\n   # ===== Before =====\n   # layer = nn.Linear(in_features, out_features)\n   \n   # ===== After ======\n   import loratorch as lora\n   # Add a pair of low-rank adaptation matrices with rank r=16 and alpha=32\n   layer = lora.Linear(in_features, out_features, r=16, lora_alpha=32)\n   ```\n\n3. Mark only LoRA parameters as trainable before the training loop.\n   \n   ```python\n   model = Model()\n   # (!!!) This sets requires_grad to False for all parameters without the string \"lora_\" in their names\n   lora.mark_only_lora_as_trainable(model)\n   \n   optimizer = torch.optim.SGD(model.parameters(), lr=0.1)\n   # Training loop\n   for batch in dataloader:\n       model.train()\n       # forward pass\n       loss = forward_fun(model, batch)\n       # backward pass\n       optimizer.zero_grad()\n       loss.backward()\n       optimizer.step()\n       # (!!!) Re-register the model parameters so that they appear in model.state_dict() and model.parameters()\n       # (!!!) Without this line, performance is not affected, but some weights will be missing from model.state_dict() and model.parameters()\n       lora.register_model_param_after_backward(model)\n   ```\n\n4. Save the LoRA model (only the LoRA matrices are saved).\n   \n   ```python\n   # ===== Before =====\n   # torch.save(model.state_dict(), checkpoint_path)\n   # ===== After =====\n   torch.save(lora.lora_state_dict(model), checkpoint_path)\n   ```\n\n5. 
Load the LoRA model (the pre-trained model must be loaded first).\n   \n   ```python\n   # Load the pre-trained checkpoint first\n   model.load_state_dict(torch.load('ckpt_pretrained.pt'), strict=False)\n   # Then load the LoRA checkpoint\n   model.load_state_dict(torch.load('ckpt_lora.pt'), strict=False)\n   ```\n\n## Contributor\n\n``loratorch`` is developed and maintained by [Baijiong Lin](https://baijiong-lin.github.io).\n\n## Contact Us\n\nIf you have any questions or suggestions, please feel free to contact us by [raising an issue](https://github.com/Baijiong-Lin/LoRA-Torch/issues) or sending an email to ``bj.lin.email@gmail.com``.\n\n## Acknowledgements\n\n``loratorch`` is heavily based on ``loralib``. We thank its authors for their wonderful open-source codebase.\n\n## Citation\n\nIf you find ``loratorch`` useful for your research or development, please cite the following:\n\n```BibTeX\n@inproceedings{hu2022lora,\ntitle={Lo{RA}: Low-Rank Adaptation of Large Language Models},\nauthor={Edward J Hu and Yelong Shen and Phillip Wallis and Zeyuan Allen-Zhu and Yuanzhi Li and Shean Wang and Lu Wang and Weizhu Chen},\nbooktitle={International Conference on Learning Representations},\nyear={2022},\n}\n\n@software{lin2023loratorch,\n  author = {Baijiong Lin},\n  title = {{LoRA-Torch}: {PyTorch} Reimplementation of {LoRA}},\n  url = {https://github.com/Baijiong-Lin/LoRA-Torch},\n  year = {2023}\n}\n```\n\n## License\n\n``loratorch`` is released under the [MIT](./LICENSE) license.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaijiong-lin%2Flora-torch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbaijiong-lin%2Flora-torch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaijiong-lin%2Flora-torch/lists"}