{"id":13994206,"url":"https://github.com/minyoungg/overparam","last_synced_at":"2025-07-22T19:31:13.114Z","repository":{"id":75141582,"uuid":"343608653","full_name":"minyoungg/overparam","owner":"minyoungg","description":null,"archived":false,"fork":false,"pushed_at":"2023-03-23T13:53:00.000Z","size":60801,"stargazers_count":40,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-29T16:38:30.914Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/minyoungg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-03-02T01:29:14.000Z","updated_at":"2024-07-23T19:29:13.000Z","dependencies_parsed_at":"2023-06-05T12:06:40.336Z","dependency_job_id":null,"html_url":"https://github.com/minyoungg/overparam","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/minyoungg/overparam","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minyoungg%2Foverparam","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minyoungg%2Foverparam/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minyoungg%2Foverparam/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minyoungg%2Foverparam/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/minyoungg","download_url":"https://codeload.github.com/minyoungg/overparam/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/minyoungg%2Foverparam/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266561167,"owners_count":23948610,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-09T14:02:45.899Z","updated_at":"2025-07-22T19:31:12.832Z","avatar_url":"https://github.com/minyoungg.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Overparam layers\nPyTorch linear over-parameterization layers with automatic graph reduction.   \n\nOfficial codebase used in:\n\n**The Low-Rank Simplicity Bias in Deep Networks**  \n[Minyoung Huh](http://minyounghuh.com/) \u0026nbsp; [Hossein Mobahi]() \u0026nbsp; [Richard Zhang](https://richzhang.github.io/) \u0026nbsp; [Brian Cheung]() \u0026nbsp; [Pulkit Agrawal]() \u0026nbsp; [Phillip Isola]()     \nMIT CSAIL \u0026nbsp; Google Research \u0026nbsp; Adobe Research \u0026nbsp; MIT BCS   \nTMLR 2023 (arXiv 2021).    \n**[[project page]](https://minyoungg.github.io/overparam/) | [[paper]](https://openreview.net/pdf?id=bCiNWDmlY2) | [[arXiv]](https://arxiv.org/abs/2103.10427)**     \n\n\n## 1. 
Installation\n\u003cb\u003e Developed on \u003c/b\u003e \n- \u003cb\u003ePython 3.7 \u003c/b\u003e :snake:\n- \u003cb\u003ePyTorch 1.7\u003c/b\u003e :fire:\n\n```bash\n\u003e git clone https://github.com/minyoungg/overparam\n\u003e cd overparam\n\u003e pip install .\n```\n\n## 2. Usage\nThe layers work exactly the same as any `torch.nn` layers.\n\n### Getting started\n\n#### (1a) OverparamLinear layer (equivalent to `nn.Linear`)\n\n```python\nimport torch\nfrom overparam import OverparamLinear\n\nlayer = OverparamLinear(16, 32, width=1, depth=2)\nx = torch.randn(1, 16)\n```\n\n#### (1b) OverparamConv2d layer (equivalent to `nn.Conv2d`)\n\n```python\nfrom overparam import OverparamConv2d\nimport numpy as np\n```\n\nWe can construct 3 Conv2d layers with kernel dimensions of `5x5`, `3x3`, and `1x1`:\n```python\nkernel_sizes = [5, 3, 1]\n\n# Same padding\npadding = max((np.sum(kernel_sizes) - len(kernel_sizes) + 1) // 2, 0)\n\nlayer = OverparamConv2d(2, 4, kernel_sizes=kernel_sizes, padding=padding, depth=len(kernel_sizes))\n\n# Get the effective kernel size\nprint(layer.kernel_size)\n```\nWhen `kernel_sizes` is an integer, all subsequent layers are assumed to have a kernel size of `1x1`.\n\n#### (2) Forward computation\n\n```python\n# Forward pass (expanded form)\nlayer.train()\ny = layer(x)\n```\n\nWhen `eval()` is called, the model automatically reduces the computation graph to its effective single-layer counterpart. 
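The idea behind this reduction can be sketched in plain NumPy. This is an illustrative sketch of the underlying linear-algebra identity, not the library's implementation: composing bias-free linear layers yields a single linear layer whose weight is the product of the factors (with biases, the effective bias would be `W2 @ b1 + b2`).

```python
import numpy as np

# Two stacked bias-free linear layers: y = W2 @ (W1 @ x)
rng = np.random.default_rng(0)
W1 = rng.standard_normal((32, 16))  # first layer: 16 -> 32
W2 = rng.standard_normal((32, 32))  # second layer: 32 -> 32
x = rng.standard_normal(16)

y_expanded = W2 @ (W1 @ x)   # depth-2 (expanded) forward pass

W_eff = W2 @ W1              # collapsed effective single-layer weight
y_collapsed = W_eff @ x

assert np.allclose(y_expanded, y_collapsed)
```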
\nA forward pass in `eval` mode uses the effective weights instead.\n\n```python\n# Forward pass (collapsed form) [automatic]\nlayer.eval()\ny = layer(x)\n```\n\nYou can access the effective weights as follows:\n\n```python\nprint(layer.weight)\nprint(layer.bias)\n```\n\n#### (3) Automatic conversion\n\n```python\nimport torchvision.models as models\nfrom overparam.utils import overparameterize\n\nmodel = models.alexnet()  # Replace this with YOUR_PYTORCH_MODEL()\nmodel = overparameterize(model, depth=2)\n```\n\n#### (4) Batch-norm and Residual connections\nWe also provide support for batch-norm and linear residual connections.\n\n- batch-normalization (pseudo-linear layer: linear during `eval` mode)\n```python\nlayer = OverparamConv2d(32, 32, kernel_sizes=3, padding=1, depth=2,\n                        batch_norm=True)\n```\n\n- residual connection\n```python\n# every 2 layers, a residual connection is added\nlayer = OverparamConv2d(32, 32, kernel_sizes=3, padding=1, depth=2,\n                        residual=True, residual_intervals=2)\n```\n\n- multiple residual connections\n```python\n# residual connections are added at intervals of 1, 2, and 3 layers\nlayer = OverparamConv2d(32, 32, kernel_sizes=3, padding=1, depth=2,\n                        residual=True, residual_intervals=[1, 2, 3])\n```\n\n- batch-norm and residual connection\n```python\n# mimics `BasicBlock` in ResNets\nlayer = OverparamConv2d(32, 32, kernel_sizes=3, padding=1, depth=2,\n                        batch_norm=True, residual=True, residual_intervals=2)\n```\n\n## 3. Cite\n```bibtex\n@article{huh2023simplicitybias,\n    title={The Low-Rank Simplicity Bias in Deep Networks},\n    author={Minyoung Huh and Hossein Mobahi and Richard Zhang and Brian Cheung and Pulkit Agrawal and Phillip Isola},\n    journal={Transactions on Machine Learning Research},\n    issn={2835-8856},\n    year={2023},\n    url={https://openreview.net/forum?id=bCiNWDmlY2},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fminyoungg%2Foverparam","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fminyoungg%2Foverparam","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fminyoungg%2Foverparam/lists"}