{"id":13564389,"url":"https://github.com/deep-spin/entmax","last_synced_at":"2025-04-03T21:30:46.523Z","repository":{"id":35076491,"uuid":"189602244","full_name":"deep-spin/entmax","owner":"deep-spin","description":"The entmax mapping and its loss, a family of sparse softmax alternatives.","archived":false,"fork":false,"pushed_at":"2024-06-22T14:34:00.000Z","size":214,"stargazers_count":426,"open_issues_count":14,"forks_count":45,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-08T13:47:39.256Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deep-spin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-05-31T13:54:08.000Z","updated_at":"2025-02-21T01:31:29.000Z","dependencies_parsed_at":"2024-05-20T15:56:27.219Z","dependency_job_id":"c625bf59-28b1-46fd-95bb-a6f0ca9a2862","html_url":"https://github.com/deep-spin/entmax","commit_stats":{"total_commits":52,"total_committers":7,"mean_commits":7.428571428571429,"dds":0.4423076923076923,"last_synced_commit":"f624390251392accad064e8e36a39491bce97d13"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deep-spin%2Fentmax","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deep-spin%2Fentmax/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deep-spin%2Fentmax/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/deep-spin%2Fentmax/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deep-spin","download_url":"https://codeload.github.com/deep-spin/entmax/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247082945,"owners_count":20880749,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T13:01:30.569Z","updated_at":"2025-04-03T21:30:42.869Z","avatar_url":"https://github.com/deep-spin.png","language":"Python","funding_links":[],"categories":["Python","Classification"],"sub_categories":[],"readme":"[![Build Status](https://dev.azure.com/zephyr14/entmax/_apis/build/status/deep-spin.entmax?branchName=master)](https://dev.azure.com/zephyr14/entmax/_build/latest?definitionId=1\u0026branchName=master)\n\n[![PyPI version](https://badge.fury.io/py/entmax.svg)](https://badge.fury.io/py/entmax)\n\n# entmax\n\n\u003cimg src=\"entmax.png\" /\u003e\n\n--------------------------------------------------------------------------------\n\nThis package provides a pytorch implementation of entmax and entmax losses:\na sparse family of probability mappings and corresponding loss functions,\ngeneralizing softmax / cross-entropy.\n\n*Features:*\n  - Exact partial-sort algorithms for 1.5-entmax and 2-entmax (sparsemax).\n  - A bisection-based algorithm for generic alpha-entmax.\n  - Gradients w.r.t. 
alpha for adaptive, learned sparsity!\n  - Other sparse transformations: alpha-normmax and k-subsets budget (handled through bisection-based algorithms).\n\n*Requirements:* python 3, pytorch \u003e= 1.9 (and pytest for unit tests)\n\n## Example\n\n```python\nIn [1]: import torch\n\nIn [2]: from torch.nn.functional import softmax\n\nIn [3]: from entmax import sparsemax, entmax15, entmax_bisect, normmax_bisect, budget_bisect\n\nIn [4]: x = torch.tensor([-2, 0, 0.5])\n\nIn [5]: softmax(x, dim=0)\nOut[5]: tensor([0.0486, 0.3592, 0.5922])\n\nIn [6]: sparsemax(x, dim=0)\nOut[6]: tensor([0.0000, 0.2500, 0.7500])\n\nIn [7]: entmax15(x, dim=0)\nOut[7]: tensor([0.0000, 0.3260, 0.6740])\n\nIn [8]: normmax_bisect(x, alpha=2, dim=0)\nOut[8]: tensor([0.0000, 0.3110, 0.6890])\n\nIn [9]: normmax_bisect(x, alpha=1000, dim=0)\nOut[9]: tensor([0.0000, 0.4997, 0.5003])\n\nIn [10]: budget_bisect(x, budget=2, dim=0)\nOut[10]: tensor([0.0000, 1.0000, 1.0000])\n```\n\nGradients w.r.t. alpha (continued):\n\n```python\nIn [1]: from torch.autograd import grad\n\nIn [2]: x = torch.tensor([[-1, 0, 0.5], [1, 2, 3.5]])\n\nIn [3]: alpha = torch.tensor(1.33, requires_grad=True)\n\nIn [4]: p = entmax_bisect(x, alpha)\n\nIn [5]: p\nOut[5]:\ntensor([[0.0460, 0.3276, 0.6264],\n        [0.0026, 0.1012, 0.8963]], grad_fn=\u003cEntmaxBisectFunctionBackward\u003e)\n\nIn [6]: grad(p[0, 0], alpha)\nOut[6]: (tensor(-0.2562),)\n```\n\n## Installation\n\n```\npip install entmax\n```\n\n## Citations\n\n[Sparse Sequence-to-Sequence Models](https://www.aclweb.org/anthology/P19-1146)\n\n```\n@inproceedings{entmax,\n  author    = {Peters, Ben and Niculae, Vlad and Martins, Andr{\\'e} FT},\n  title     = {Sparse Sequence-to-Sequence Models},\n  booktitle = {Proc. 
ACL},\n  year      = {2019},\n  url       = {https://www.aclweb.org/anthology/P19-1146}\n}\n```\n\n[Adaptively Sparse Transformers](https://arxiv.org/pdf/1909.00015.pdf)\n\n```\n@inproceedings{correia19adaptively,\n  author    = {Correia, Gon\\c{c}alo M and Niculae, Vlad and Martins, Andr{\\'e} FT},\n  title     = {Adaptively Sparse Transformers},\n  booktitle = {Proc. EMNLP-IJCNLP (to appear)},\n  year      = {2019},\n}\n```\n\nFurther reading:\n\n  - Blondel, Martins, and Niculae, 2019. [Learning with Fenchel-Young Losses](https://arxiv.org/abs/1901.02324).\n  - Martins and Astudillo, 2016. [From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification](https://arxiv.org/abs/1602.02068).\n  - Peters and Martins, 2019. [IT-IST at the SIGMORPHON 2019 Shared Task: Sparse Two-headed Models for Inflection](https://www.aclweb.org/anthology/W19-4207).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeep-spin%2Fentmax","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeep-spin%2Fentmax","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeep-spin%2Fentmax/lists"}