https://github.com/s4m-mo/tf-gpt
A TensorFlow implementation of GPT.
- Host: GitHub
- URL: https://github.com/s4m-mo/tf-gpt
- Owner: s4m-mo
- License: gpl-3.0
- Created: 2023-07-18T14:42:43.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-07-18T14:58:05.000Z (almost 3 years ago)
- Last Synced: 2025-05-18T11:11:24.253Z (about 1 year ago)
- Topics: deep-learning, foundation-models, gpt, large-language-models, machine-learning, neural-networks, python, tensorflow
- Language: Python
- Homepage:
- Size: 20.5 KB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# TF-GPT
A TensorFlow implementation of GPT. It implements a stack of decoder blocks for autoregressive text generation, allowing you to train your own foundation models and (smaller) LLMs.
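To make "autoregressive text generation" concrete, here is a minimal, framework-free sketch of the generation loop: the model sees the tokens so far, produces scores for the next token, one token is chosen and appended, and the process repeats. This is not the code in this repository. `toy_logits` is a hypothetical stand-in for a trained decoder stack, and the function names and parameters are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(context, vocab_size):
    """Hypothetical stand-in for a decoder stack (NOT the repo's model).

    It strongly favours the token (last + 1) mod vocab_size.
    """
    last = context[-1]
    return [5.0 if tok == (last + 1) % vocab_size else 0.0
            for tok in range(vocab_size)]


def generate(prompt, steps, vocab_size, greedy=True, seed=0):
    """Autoregressive loop: each new token is conditioned on all previous ones."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(steps):
        probs = softmax(toy_logits(tokens, vocab_size))
        if greedy:
            # Pick the most likely next token.
            next_tok = max(range(vocab_size), key=probs.__getitem__)
        else:
            # Sample the next token from the predicted distribution.
            next_tok = rng.choices(range(vocab_size), weights=probs, k=1)[0]
        tokens.append(next_tok)  # the output becomes part of the next input
    return tokens
```

In a real GPT the stand-in would be replaced by the decoder stack, whose causal attention mask keeps each position from attending to later tokens.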
## Usage
To start training, run the script from the command line:
```powershell
python main.py
```
To train on a custom text file (it must fit in RAM), run the following command, replacing `myDataset.txt` with the path to your text file. If you don't specify a file, training falls back to the [WikiText dataset](https://huggingface.co/datasets/wikitext) from Hugging Face.
```powershell
python main.py --data="myDataset.txt"
```
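For readers curious how a flag like `--data` could be handled, the sketch below shows one possible approach with `argparse`: parse the optional path, read the whole file into memory (hence the RAM requirement), and build next-token training pairs. It is an illustration under stated assumptions, not the contents of `main.py`. The helper names `parse_args`, `load_text` and `make_examples`, and the character-level tokenization, are all hypothetical.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse an optional --data path (hypothetical mirror of the CLI above)."""
    parser = argparse.ArgumentParser(description="Illustrative CLI sketch")
    parser.add_argument("--data", default=None,
                        help="Path to a plain-text training file")
    return parser.parse_args(argv)


def load_text(path):
    """Read the entire file into memory, so it has to fit in RAM."""
    return Path(path).read_text(encoding="utf-8")


def make_examples(text, context_len):
    """Build (context, next token) pairs using character-level ids.

    Character-level tokenization is an assumption made for this sketch.
    """
    vocab = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(vocab)}
    ids = [stoi[ch] for ch in text]
    pairs = [(ids[i:i + context_len], ids[i + context_len])
             for i in range(len(ids) - context_len)]
    return pairs, vocab
```

With this sketch, `parse_args(["--data=myDataset.txt"]).data` yields the path, and a value of `None` would signal that the default dataset should be used instead.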