https://github.com/naetherm/gptllm
Just a small learning project for implementing GPT2 LLM
- Host: GitHub
- URL: https://github.com/naetherm/gptllm
- Owner: naetherm
- License: MIT
- Created: 2024-06-20T14:10:25.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-06-20T14:35:51.000Z (11 months ago)
- Last Synced: 2025-02-01T21:16:08.411Z (4 months ago)
- Language: Python
- Size: 2.93 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# gptLLM
Reimplementation of the paper "Language Models are Unsupervised Multitask Learners" and, by extension, a reimplementation of GPT-2.
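The repository's own source is not reproduced in this summary, but for orientation, a GPT-2-style model stacks pre-norm transformer decoder blocks (masked self-attention followed by a GELU feed-forward layer). Below is a minimal PyTorch sketch of such a block; all module and parameter names are illustrative and not taken from this repo.

```python
# Illustrative sketch of a GPT-2-style decoder block (not this repository's actual code).
import math
import torch
import torch.nn as nn


class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal (lower-triangular) mask."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)   # fused query/key/value projection
        self.proj = nn.Linear(d_model, d_model)      # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, time, head_dim).
        split = lambda t: t.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(C // self.n_heads)
        # Mask out attention to future positions.
        mask = torch.triu(torch.ones(T, T, device=x.device), diagonal=1).bool()
        att = att.masked_fill(mask, float("-inf")).softmax(dim=-1)
        y = (att @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(y)


class Block(nn.Module):
    """Pre-norm transformer block as used in GPT-2: LayerNorm -> attention/MLP -> residual."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = CausalSelfAttention(d_model, n_heads)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),                       # GELU activation (Hendrycks & Gimpel, 2016)
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.ln1(x))
        x = x + self.mlp(self.ln2(x))
        return x
```

A full GPT-2 reimplementation would additionally include token and position embeddings, a stack of such blocks, a final layer norm, and an output projection over the vocabulary tied to the token embedding.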
## References
```
@article{radford2019language,
  title={Language models are unsupervised multitask learners},
  author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},
  journal={OpenAI blog},
  volume={1},
  number={8},
  pages={9},
  year={2019}
}

@article{radford2018improving,
  title={Improving language understanding by generative pre-training},
  author={Radford, Alec and Narasimhan, Karthik and Salimans, Tim and Sutskever, Ilya and others},
  year={2018},
  publisher={OpenAI}
}

@article{hendrycks2016gaussian,
  title={Gaussian error linear units (GELUs)},
  author={Hendrycks, Dan and Gimpel, Kevin},
  journal={arXiv preprint arXiv:1606.08415},
  year={2016}
}
```
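One concrete detail from the cited GELU paper (Hendrycks & Gimpel, 2016): GPT-2 uses the tanh approximation GELU(x) ≈ 0.5x(1 + tanh(√(2/π)(x + 0.044715x³))) of the exact form x·Φ(x). A small illustrative Python sketch (not taken from this repo):

```python
# Tanh approximation of GELU (Hendrycks & Gimpel, 2016), as used in GPT-2.
import math

def gelu(x: float) -> float:
    # GELU(x) = x * Phi(x) ~= 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x**3)))
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

print(gelu(1.0))  # ~0.841, close to the exact value x * Phi(x) ~= 0.8413 at x = 1
```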