{"id":15662198,"url":"https://github.com/rishit-dagli/glu","last_synced_at":"2025-05-05T23:44:38.901Z","repository":{"id":62591605,"uuid":"488210050","full_name":"Rishit-dagli/GLU","owner":"Rishit-dagli","description":"An easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow.","archived":false,"fork":false,"pushed_at":"2023-02-22T08:34:48.000Z","size":225,"stargazers_count":20,"open_issues_count":2,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-19T12:13:57.437Z","etag":null,"topics":["activation-functions","artificial-intelligence","deep-learning","glu","keras","machine-learning","neural-network","python","tensorflow","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Rishit-dagli.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-03T12:55:51.000Z","updated_at":"2024-11-17T07:02:47.000Z","dependencies_parsed_at":"2024-10-03T13:30:47.633Z","dependency_job_id":"f16b87ab-f36d-4d7c-bdf5-4de6b7676444","html_url":"https://github.com/Rishit-dagli/GLU","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rishit-dagli%2FGLU","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rishit-dagli%2FGLU/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rishit-dagli%2FGLU/releases","manife
sts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rishit-dagli%2FGLU/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Rishit-dagli","download_url":"https://codeload.github.com/Rishit-dagli/GLU/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252596324,"owners_count":21773842,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["activation-functions","artificial-intelligence","deep-learning","glu","keras","machine-learning","neural-network","python","tensorflow","transformers"],"created_at":"2024-10-03T13:30:39.136Z","updated_at":"2025-05-05T23:44:38.878Z","avatar_url":"https://github.com/Rishit-dagli.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GLU\n\n![PyPI](https://img.shields.io/pypi/v/GLU-tf)\n[![Lint Code Base](https://github.com/Rishit-dagli/GLU/actions/workflows/linter.yml/badge.svg)](https://github.com/Rishit-dagli/GLU/actions/workflows/linter.yml)\n[![Upload Python Package](https://github.com/Rishit-dagli/GLU/actions/workflows/python-publish.yml/badge.svg)](https://github.com/Rishit-dagli/GLU/actions/workflows/python-publish.yml)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\n[![GitHub stars](https://img.shields.io/github/stars/Rishit-dagli/GLU?style=social)](https://github.com/Rishit-dagli/GLU/stargazers)\n[![GitHub 
followers](https://img.shields.io/github/followers/Rishit-dagli?label=Follow\u0026style=social)](https://github.com/Rishit-dagli)\n[![Twitter Follow](https://img.shields.io/twitter/follow/rishit_dagli?style=social)](https://twitter.com/intent/follow?screen_name=rishit_dagli)\n\nAn easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow. This repository allows you to easily make use of the following activation functions:\n\n- **GLU** introduced in the paper Language Modeling with Gated Convolutional Networks [1]\n- **Bilinear** introduced in the paper Language Modeling with Gated Convolutional Networks [1], attributed to Mnih et al. [2]\n- **ReGLU** introduced in the paper GLU Variants Improve Transformer [3]\n- **GEGLU** introduced in the paper GLU Variants Improve Transformer [3]\n- **SwiGLU** introduced in the paper GLU Variants Improve Transformer [3]\n- **SeGLU**\n\n![](media/glue_benchmark.PNG)\n\nGated Linear Units consist of the component-wise product of two linear projections, one of which is first passed through a sigmoid function. Variations on GLU are possible, using different nonlinear (or even linear) functions in place of sigmoid. In the GLU Variants Improve Transformer [3] paper, in a fine-tuning scenario, the new variants seem to produce better perplexities for the de-noising objective used in pre-training, as well as better results on many downstream language-understanding tasks. 
Furthermore, these do not have any apparent computational drawbacks.\n\n## Installation\n\nRun the following to install:\n\n```sh\npip install glu-tf\n```\n\n## Developing glu-tf\n\nTo install `glu-tf`, along with the tools you need to develop and test, run the following in your virtualenv:\n\n```sh\ngit clone https://github.com/Rishit-dagli/GLU.git\n# or clone your own fork\n\ncd GLU\npip install -e .[dev]\n```\n\n## Usage\n\nIn this section, I show a minimal example of using the SwiGLU activation function, but you can use the other activations in a similar manner:\n\n```python\nimport tensorflow as tf\nfrom glu_tf import SwiGLU\n\nmodel = tf.keras.Sequential()\n# A plain Dense layer whose output is fed to the activation\nmodel.add(tf.keras.layers.Dense(units=10))\n# Apply SwiGLU to the output of the Dense layer\nmodel.add(SwiGLU(bias=False, dim=-1, name='swiglu'))\n```\n\n## Want to Contribute 🙋‍♂️?\n\nAwesome! If you want to contribute to this project, you're always welcome! See [Contributing Guidelines](CONTRIBUTING.md). You can also take a look at [open issues](https://github.com/Rishit-dagli/GLU/issues) for more information about current or upcoming tasks.\n\n## Want to discuss? 💬\n\nHave any questions or doubts, or want to share your opinions or views? You're always welcome. You can [start discussions](https://github.com/Rishit-dagli/GLU/discussions).\n\n## References\n\n[1] Dauphin, Yann N., et al. ‘Language Modeling with Gated Convolutional Networks’. ArXiv:1612.08083 [Cs], Sept. 2017. arXiv.org, http://arxiv.org/abs/1612.08083.\n\n[2] Mnih, A., and Hinton, G. 2007. Three new graphical models for statistical language modelling. In Proceedings of the 24th international conference on Machine learning (pp. 641–648).\n\n[3] Shazeer, Noam. ‘GLU Variants Improve Transformer’. ArXiv:2002.05202 [Cs, Stat], Feb. 2020. 
arXiv.org, http://arxiv.org/abs/2002.05202.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frishit-dagli%2Fglu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frishit-dagli%2Fglu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frishit-dagli%2Fglu/lists"}