{"id":18293450,"url":"https://github.com/archinetai/smart-pytorch","last_synced_at":"2025-04-05T11:31:02.663Z","repository":{"id":38827138,"uuid":"483023123","full_name":"archinetai/smart-pytorch","owner":"archinetai","description":"PyTorch – SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models.","archived":false,"fork":false,"pushed_at":"2022-06-28T12:43:48.000Z","size":113,"stargazers_count":61,"open_issues_count":1,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-21T03:34:28.496Z","etag":null,"topics":["artificial-intelligence","deep-learning","fine-tuning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/archinetai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-04-18T23:02:30.000Z","updated_at":"2025-03-08T21:58:53.000Z","dependencies_parsed_at":"2022-07-12T17:38:53.377Z","dependency_job_id":null,"html_url":"https://github.com/archinetai/smart-pytorch","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsmart-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsmart-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsmart-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/archinetai%2Fsmart-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/archinetai","download_url":"https://codeload.github
.com/archinetai/smart-pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247330551,"owners_count":20921652,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","fine-tuning"],"created_at":"2024-11-05T14:24:37.189Z","updated_at":"2025-04-05T11:31:02.283Z","avatar_url":"https://github.com/archinetai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cimg src=\"./SMART.png\"\u003e\u003c/img\u003e\n\n\n# SMART - PyTorch\n\nA PyTorch implementation of \u003ca href=\"https://aclanthology.org/2020.acl-main.197.pdf\"\u003eSMART\u003c/a\u003e, a regularization technique to fine-tune pretrained (language) models. You might also be interested in \u003ca href=\"https://github.com/archinetai/vat-pytorch\"\u003evat-pytorch\u003c/a\u003e, a more generic collection of virtual adversarial training (VAT) methods, in PyTorch. 
\n\n## Install\n\n```bash\n$ pip install smart-pytorch\n```\n\n[![PyPI - Python Version](https://img.shields.io/pypi/v/smart-pytorch?style=flat\u0026colorA=0f0f0f\u0026colorB=0f0f0f)](https://pypi.org/project/smart-pytorch/) \n\n## Usage\n\n### Minimal Example\n\n```py\nimport torch\nimport torch.nn as nn\nfrom smart_pytorch import SMARTLoss\n\n# Define function that will be perturbed (usually our network)\neval_fn = torch.nn.Linear(in_features=10, out_features=20)\n\n# Define loss function between states \nloss_fn = nn.MSELoss()\n\n# Initialize regularization loss\nregularizer = SMARTLoss(eval_fn = eval_fn, loss_fn = loss_fn)\n\n# Compute initial input embed and output state \nembed = torch.rand(1, 10) # [batch_size, in_features]\nstate = eval_fn(embed) # [batch_size, out_features]\n\n# Compute regularization loss \nloss = regularizer(embed, state)\nloss # tensor(0.0922578126, grad_fn=\u003cMseLossBackward0\u003e)\n```\n\nHere, `eval_fn` is a function (usually a neural network) that takes an embedding `embed` as input and produces one or more states `state` as output. Internally, `SMARTLoss` perturbs `embed` with noise, passes the perturbed input through `eval_fn`, and compares the resulting perturbed state with the initially provided `state`. 
\n\n### Full API Example\n\n```python\nimport torch\nimport torch.nn as nn\nfrom smart_pytorch import SMARTLoss\n\n# Define function that will be perturbed (usually our network)\neval_fn = torch.nn.Linear(in_features=10, out_features=20)\n\n# Define loss function between states \nloss_fn = nn.MSELoss()\n\n# Norm used to normalize the gradient \ninf_norm = lambda x: torch.norm(x, p=float('inf'), dim=-1, keepdim=True)\n\n# Initialize regularization loss\nregularizer = SMARTLoss(\n    eval_fn = eval_fn,      \n    loss_fn = loss_fn,      # Loss to apply between perturbed and true state \n    loss_last_fn = loss_fn, # Loss to apply between perturbed and true state on the last iteration (default = loss_fn)\n    norm_fn = inf_norm,     # Norm used to normalize the gradient (default = inf_norm)\n    num_steps = 1,          # Number of optimization steps to find noise (default = 1)\n    step_size = 1e-3,       # Step size to improve noise (default = 1e-3)\n    epsilon = 1e-6,         # Noise norm constraint (default = 1e-6)\n    noise_var = 1e-5        # Initial noise variance (default = 1e-5)\n)\n\n# Compute initial input embed and output state \nembed = torch.rand(1, 10) # [batch_size, in_features]\nstate = eval_fn(embed) # [batch_size, out_features]\n\n# Compute regularization loss \nloss = regularizer(embed, state)\nloss # tensor(0.0432184562, grad_fn=\u003cMseLossBackward0\u003e)\n```\n\n### RoBERTa Classification Example\n\nThis example demonstrates how to wrap a RoBERTa classifier from Hugging Face for use with SMART.\n\n```py\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom smart_pytorch import SMARTLoss, kl_loss, sym_kl_loss\nfrom transformers import AutoTokenizer, AutoModelForSequenceClassification\n\nclass SMARTRobertaClassificationModel(nn.Module):\n    \n    def __init__(self, model, weight = 0.02):\n        super().__init__()\n        self.model = model \n        self.weight = weight\n\n    def forward(self, input_ids, attention_mask, labels):\n\n        # Get initial embeddings \n        embed = 
self.model.roberta.embeddings(input_ids) \n\n        # Define eval function (named eval_fn to avoid shadowing the built-in eval)\n        def eval_fn(embed):\n            outputs = self.model.roberta(inputs_embeds=embed, attention_mask=attention_mask)\n            sequence_output = outputs[0] # Per-token hidden states; the classification head pools the first token \n            logits = self.model.classifier(sequence_output) \n            return logits \n        \n        # Define SMART loss\n        smart_loss_fn = SMARTLoss(eval_fn = eval_fn, loss_fn = kl_loss, loss_last_fn = sym_kl_loss)\n        # Compute initial (unperturbed) state \n        state = eval_fn(embed)\n        # Apply classification loss \n        loss = F.cross_entropy(state.view(-1, 2), labels.view(-1))\n        # Apply smart loss \n        loss += self.weight * smart_loss_fn(embed, state)\n        \n        return state, loss\n    \n\ntokenizer = AutoTokenizer.from_pretrained('roberta-base')\nmodel = AutoModelForSequenceClassification.from_pretrained('roberta-base')  \n\nmodel_smart = SMARTRobertaClassificationModel(model)\n# Compute inputs \ntext = [\"This text belongs to class 1...\", \"This text belongs to class 0...\"]\ninputs = tokenizer(text, padding=True, return_tensors='pt') # Pad so both sequences share one tensor shape\nlabels = torch.tensor([1, 0]) \n\n# Compute output and loss \nstate, loss = model_smart(input_ids = inputs['input_ids'], attention_mask = inputs['attention_mask'], labels = labels)\nprint(state.shape, loss) # torch.Size([2, 2]) tensor(0.6980957389, grad_fn=\u003cAddBackward0\u003e)\n```\n\n## Citations\n\n```bibtex\n@inproceedings{Jiang2020SMARTRA,\n  title={SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization},\n  author={Haoming Jiang and Pengcheng He and Weizhu Chen and Xiaodong Liu and Jianfeng Gao and Tuo Zhao},\n  booktitle={ACL},\n  
year={2020}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchinetai%2Fsmart-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farchinetai%2Fsmart-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchinetai%2Fsmart-pytorch/lists"}