Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/muhammad-fiaz/gpt
A simple implementation based on the "Attention is All You Need" paper, using GPT-2 for text generation.
attention-is-all-you-need gpt gpt-2 gpt-3 gpt-implementation gpt-using-pytorch gpt2 numpy open-source paper-implementations python pytorch pytorch-implementation
Last synced: 4 days ago
- Host: GitHub
- URL: https://github.com/muhammad-fiaz/gpt
- Owner: muhammad-fiaz
- License: mit
- Created: 2023-12-21T15:59:20.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-06T10:53:20.000Z (10 days ago)
- Last Synced: 2025-01-06T11:24:23.685Z (10 days ago)
- Topics: attention-is-all-you-need, gpt, gpt-2, gpt-3, gpt-implementation, gpt-using-pytorch, gpt2, numpy, open-source, paper-implementations, python, pytorch, pytorch-implementation
- Language: Python
- Homepage:
- Size: 101 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
GPT Implementation using PyTorch
A simple implementation of the ["Attention is All You Need"](https://arxiv.org/pdf/1706.03762) paper, focused on learning and building small-scale models, using GPT-2 for text generation.
> **This project is currently in active development. Make sure to ⭐ the repository! If you want to contribute, make sure to fork the repository.**
## Table of Contents
1. [Requirements](#requirements)
2. [Usage](#usage)
3. [Arguments](#arguments)
- [--text](#text-str)
- [--nsamples](#nsamples-int)
- [--batch_size](#batch_size-int)
- [--length](#length-int)
- [--temperature](#temperature-float)
- [--top_k](#top_k-int)
- [--quiet](#quiet-bool)
- [--unconditional](#unconditional-bool)
- [--seed](#seed-int-optional)
- [--stop_token](#stop_token-str-optional)
- [--skip_special_tokens](#skip_special_tokens-bool)
- [--clean_up_tokenization_spaces](#clean_up_tokenization_spaces-bool)
- [--do_sample](#do_sample-bool)
- [--num_return_sequences](#num_return_sequences-int)
- [--repetition_penalty](#repetition_penalty-float)
- [--length_penalty](#length_penalty-float)
- [--param](#param-str)
4. [Model Download](#model-download)
5. [Example](#example)
6. [License](#license)
7. [Weights License](#weights-license)
8. [Credits](#credits)

## Requirements
Before running the script, make sure you have the required dependencies installed:
```bash
pip install -r requirements.txt
```

You will also need to download the pre-trained GPT-2 model. The script will handle this for you if the model is not found locally.
## Usage
To generate text using GPT-2, use the following command:
```bash
python run.py --text "Your input text here" --nsamples 3 --batch_size 1 --length 50 --temperature 0.7 --top_k 40 --param 124M
```

## Arguments

- `--param` (str):
Specifies the pre-trained model size to use, such as `124M`, `355M`, `774M`, or `1558M`. Default is `124M`.
```bash
--param 355M
```

- `--text` (str):
The input text that will be used as the starting point for text generation. For example:
```bash
--text "Once upon a time"
```

- `--nsamples` (int):
Number of samples (generated text sequences) to produce. Default is `1`.
```bash
--nsamples 3
```

- `--batch_size` (int):
Number of samples to generate per batch. This can be useful for generating multiple texts in parallel. Default is `1`.
```bash
--batch_size 1
```

- `--length` (int):
The length of each generated text sequence in tokens. The default is half of the model’s maximum context size.
```bash
--length 50
```

- `--temperature` (float):
Controls the randomness of predictions by scaling the logits before applying softmax. A higher temperature (e.g., 1.0) results in more random completions, while lower values (e.g., 0.7) make the model more confident in its predictions (temperature and top-k are illustrated in the sketch after this list).
```bash
--temperature 0.7
```

- `--top_k` (int):
Limits the sampling to the top `k` most likely next words. The higher the value, the more diverse the output. Lower values restrict the model to fewer possible next words.
```bash
--top_k 40
```

- `--quiet` (bool):
If set to `True`, suppresses output (e.g., the generated text). Default is `False`.

- `--unconditional` (bool):
If set to `True`, generates text without a prompt. Otherwise, text is generated based on the provided `--text` input.

- `--seed` (int, optional):
A random seed for generating reproducible results. If not provided, a random seed will be used each time.
```bash
--seed 42
```

- `--stop_token` (str, optional):
A token at which text generation should stop (e.g., a special end-of-text token). Default is `None`.
```bash
--stop_token "<|endoftext|>"
```

- `--skip_special_tokens` (bool):
If set to `True`, skips special tokens (e.g., end-of-text markers) in the generated text. Default is `False`.
```bash
--skip_special_tokens True
```

- `--clean_up_tokenization_spaces` (bool):
If set to `True`, it removes unwanted spaces generated by the tokenizer. Default is `False`.
```bash
--clean_up_tokenization_spaces True
```

- `--do_sample` (bool):
If set to `True`, generates text using sampling. Otherwise, it will generate the most likely output using greedy decoding. Default is `True`.
```bash
--do_sample True
```

- `--num_return_sequences` (int):
Number of distinct sequences to generate for each input prompt. Default is `1`.
```bash
--num_return_sequences 3
```

- `--repetition_penalty` (float):
A penalty for repetition in the generated text. Higher values reduce repetition. Default is `1.0`.
```bash
--repetition_penalty 1.2
```

- `--length_penalty` (float):
A penalty applied to the length of the generated sequence (note: if this is passed through to Hugging Face's `generate`, it only affects beam-based decoding, and values above `0.0` favor longer sequences). Default is `1.0`.
```bash
--length_penalty 1.0
```
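
As a rough illustration (not code from this repository), here is a minimal PyTorch sketch of how `--temperature` and `--top_k` reshape the next-token distribution before sampling:

```python
import torch

def sample_next_token(logits: torch.Tensor, temperature: float = 0.7, top_k: int = 40) -> int:
    """Illustrative only: temperature-scale the logits, keep the top-k candidates, then sample."""
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    logits = logits / max(temperature, 1e-8)

    # Keep only the k most likely tokens; everything else gets (near-)zero probability.
    if top_k > 0:
        kth_best = torch.topk(logits, min(top_k, logits.size(-1))).values[-1]
        logits = torch.where(logits < kth_best, torch.full_like(logits, float("-inf")), logits)

    probs = torch.softmax(logits, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))

# Toy example with a hypothetical 10-token vocabulary:
print(sample_next_token(torch.randn(10), temperature=0.7, top_k=5))
```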
## Model Download

This script will attempt to download the pre-trained GPT-2 model automatically if it is not already available locally. The model is downloaded from Hugging Face's repository.
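
The repository's exact download code isn't shown here, but with the Hugging Face `transformers` library the usual pattern is roughly the following sketch (it assumes the standard `gpt2` checkpoints on the Hub; the checkpoint names below are the Hub's, not necessarily what `--param` maps to internally):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# "gpt2" is the 124M checkpoint on the Hugging Face Hub; the larger ones are
# "gpt2-medium" (355M), "gpt2-large" (774M) and "gpt2-xl" (1558M).
model_name = "gpt2"

# from_pretrained() downloads the files on first use and reuses the local cache afterwards.
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()
```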
## Example
To generate three text samples with the following parameters:
- Input text: "Once upon a time"
- Number of samples: 3
- Batch size: 1
- Length of generated text: 50 tokens
- Temperature: 0.7
- Top-k sampling: 40

Run the command:
```bash
python run.py --text "Once upon a time" --nsamples 3 --batch_size 1 --length 50 --temperature 0.7 --top_k 40 --param 124M
```
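
The internals of `run.py` aren't reproduced here, but the flags above correspond closely to Hugging Face's `generate()` API; a rough, hypothetical Python equivalent of this command (assuming the 124M `gpt2` checkpoint and the `transformers` library) might look like:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # 124M checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("Once upon a time", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        do_sample=True,              # --do_sample
        temperature=0.7,             # --temperature
        top_k=40,                    # --top_k
        max_new_tokens=50,           # roughly --length
        num_return_sequences=3,      # --nsamples / --num_return_sequences
        pad_token_id=tokenizer.eos_token_id,
    )

for sequence in outputs:
    print(tokenizer.decode(sequence, skip_special_tokens=True))
```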
## License

This project is licensed under the [MIT License](./LICENSE).
## Weights License
The pre-trained GPT-2 weights are released under the [GPT-2 license](https://huggingface.co/openai-community/gpt2/tree/main) on Hugging Face (also available on [GitHub](https://github.com/openai/gpt-2/blob/master/LICENSE)), which allows research and commercial use but imposes certain restrictions on model usage and the generation of harmful content.
## Credits
This implementation is inspired by and based on the work from the [GPT-2 repository](https://github.com/openai/gpt-2), which provides the foundational model and techniques for text generation. We have built upon these concepts to create this PyTorch-based version.
This implementation is also based on the "Attention is All You Need" paper, which introduced the Transformer architecture that serves as the foundation for GPT models.
You can access the paper here: [Attention is All You Need](https://arxiv.org/abs/1706.03762).