https://github.com/hendrycks/gelus
A smoother activation function (undergrad code)
- Host: GitHub
- URL: https://github.com/hendrycks/gelus
- Owner: hendrycks
- License: mit
- Created: 2016-07-06T15:26:02.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2020-06-23T21:15:30.000Z (almost 6 years ago)
- Last Synced: 2025-04-09T08:51:17.003Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 8.58 MB
- Stars: 109
- Watchers: 6
- Forks: 20
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Gaussian Error Linear Units (GELUs)
This software allows users to reproduce the results in the paper "Gaussian Error Linear Units (GELUs)" by Dan Hendrycks and Kevin Gimpel (2016).
# GELU Approximations
The `sigmoid(1.702 * x) * x` approximation is fast but somewhat inaccurate, while `0.5 * x * (1 + tanh(x * 0.7978845608 * (1 + 0.044715 * x * x)))` is slower but more accurate. The constant `0.7978845608` is `sqrt(2 / pi)`.
However, exact versions are now available in PyTorch, so the approximations are no longer needed to get acceptable speed.
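To see how the two approximations compare, the following is a minimal sketch in plain Python (standard library only, not code from this repository). It assumes the exact GELU is `x * Phi(x)`, with `Phi` the standard normal CDF written via `math.erf`. The function names and the helper `max_abs_error` are hypothetical, chosen only for illustration.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid approximation: sigmoid(1.702 * x) * x."""
    return x / (1.0 + math.exp(-1.702 * x))


def gelu_tanh(x: float) -> float:
    """Tanh approximation; 0.7978845608 is sqrt(2 / pi)."""
    return 0.5 * x * (1.0 + math.tanh(x * 0.7978845608 * (1.0 + 0.044715 * x * x)))


def max_abs_error(approx, lo: float = -5.0, hi: float = 5.0, steps: int = 1001) -> float:
    """Largest absolute deviation from gelu_exact on an evenly spaced grid.

    The grid bounds and step count are arbitrary illustrative choices.
    """
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(abs(approx(x) - gelu_exact(x)) for x in xs)


if __name__ == "__main__":
    print("sigmoid approximation, max abs error:", max_abs_error(gelu_sigmoid))
    print("tanh approximation, max abs error:   ", max_abs_error(gelu_tanh))
```

On this grid the tanh form should track the exact function noticeably more closely than the sigmoid form, which matches the accuracy ordering described above. The sketch says nothing about speed, which depends on the framework and hardware.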
# Execution
Please install TensorFlow, Lasagne, and Python 3+.
# Citation
If you find this useful in your research, please consider citing:
@article{hendrycks2016gelu,
title={Gaussian Error Linear Units (GELUs)},
author={Hendrycks, Dan and Gimpel, Kevin},
journal={arXiv preprint arXiv:1606.08415},
year={2016}
}