{"id":18398904,"url":"https://github.com/lyeoni/gpt-pytorch","last_synced_at":"2025-04-07T05:33:59.321Z","repository":{"id":62390140,"uuid":"241573646","full_name":"lyeoni/gpt-pytorch","owner":"lyeoni","description":"PyTorch Implementation of OpenAI GPT","archived":false,"fork":false,"pushed_at":"2023-06-28T19:26:13.000Z","size":721,"stargazers_count":124,"open_issues_count":3,"forks_count":30,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-22T14:03:37.944Z","etag":null,"topics":["gpt-pytorch","openai-gpt"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lyeoni.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-02-19T08:48:17.000Z","updated_at":"2025-03-14T07:48:44.000Z","dependencies_parsed_at":"2022-11-01T02:45:31.754Z","dependency_job_id":null,"html_url":"https://github.com/lyeoni/gpt-pytorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyeoni%2Fgpt-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyeoni%2Fgpt-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyeoni%2Fgpt-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyeoni%2Fgpt-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lyeoni","download_url":"https://codeload.github.com/lyeoni/gpt-pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_
count":247601378,"owners_count":20964861,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gpt-pytorch","openai-gpt"],"created_at":"2024-11-06T02:24:55.839Z","updated_at":"2025-04-07T05:33:54.294Z","avatar_url":"https://github.com/lyeoni.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OpenAI GPT\n[![LICENSE](https://img.shields.io/github/license/lyeoni/gpt-pytorch?style=flat-square)](https://github.com/lyeoni/gpt-pytorch/blob/master/LICENSE)\n[![GitHub issues](https://img.shields.io/github/issues/lyeoni/gpt-pytorch?style=flat-square)](https://github.com/lyeoni/gpt-pytorch/issues)\n[![GitHub stars](https://img.shields.io/github/stars/lyeoni/gpt-pytorch?style=flat-square\u0026color=important)](https://github.com/lyeoni/gpt-pytorch/stargazers)\n[![GitHub forks](https://img.shields.io/github/forks/lyeoni/gpt-pytorch?style=flat-square\u0026color=blueviolet)](https://github.com/lyeoni/gpt-pytorch/network/members)\n\nPyTorch Implementation of OpenAI GPT\n\n\u003cp align=\"center\"\u003e\u003cimg width= 70 src=\"https://pytorch.org/assets/images/logo-icon.svg\"\u003e\u003c/p\u003e\n\n## Quick Start\n### 0. Install dependencies\nPreNLP is a preprocessing library for natural language processing that provides a SentencePiece tokenizer.\n```\n$ pip install prenlp\n$ git clone https://github.com/LiyuanLucasLiu/RAdam\n$ python RAdam/setup.py install\n```\n\u003cbr\u003e\n\n### 1. 
Setup input pipeline\n\n#### Building vocab based on your corpus\n```\n$ python vocab.py --corpus \u003cYOUR_CORPUS\u003e --prefix \u003cVOCAB_NAME\u003e --vocab_size \u003cYOUR_VOCAB_SIZE\u003e\n```\n\nAlternatively, download the WikiText-103 corpus with the commands below and build the vocab from it.\n```\n$ python -c \"import prenlp; prenlp.data.WikiText103()\"\n$ ls .data/wikitext-103\nwiki.test  wiki.train  wiki.valid\n$ python vocab.py --corpus .data/wikitext-103/wiki.train --prefix wiki103\n```\n\u003cbr\u003e\n\n### 2. Unsupervised pre-training\n```\n$ python main.py --train_corpus \u003cTRAIN_CORPUS\u003e --vocab_file \u003cVOCAB_FILE\u003e --pretrained_sp_model \u003cPRETRAINED_SP_MODEL\u003e --pretrain\n```\n\n#### Distributed training with torch.distributed (Recommended)\nThis applies to both single-node (multi-GPU) and multi-node distributed training.\n```\n$ python -m torch.distributed.launch --nproc_per_node=\u003cNPROC_PER_NODE\u003e --nnodes=\u003cNNODES\u003e --node_rank=\u003cNODE_RANK\u003e --master_addr=\u003cMASTER_ADDR\u003e --master_port=\u003cMASTER_PORT\u003e main.py --train_corpus \u003cTRAIN_CORPUS\u003e \\\n                                    --vocab_file \u003cVOCAB_FILE\u003e \\\n                                    --pretrained_sp_model \u003cPRETRAINED_SP_MODEL\u003e \\\n                                    --pretrain --distributed\n```\n\u003cbr\u003e\n\n### 3. 
Supervised fine-tuning\n```\n$ python main.py --train_corpus \u003cTRAIN_CORPUS\u003e --test_corpus \u003cTEST_CORPUS\u003e --vocab_file \u003cVOCAB_FILE\u003e --pretrained_sp_model \u003cPRETRAINED_SP_MODEL\u003e --pretrained_model \u003cPRETRAINED_MODEL\u003e --finetune --do_eval\n```\n\n#### Distributed training with torch.distributed (Recommended)\nThis applies to both single-node (multi-GPU) and multi-node distributed training.\n```\n$ python -m torch.distributed.launch --nproc_per_node=\u003cNPROC_PER_NODE\u003e --nnodes=\u003cNNODES\u003e --node_rank=\u003cNODE_RANK\u003e --master_addr=\u003cMASTER_ADDR\u003e --master_port=\u003cMASTER_PORT\u003e main.py --train_corpus \u003cTRAIN_CORPUS\u003e --test_corpus \u003cTEST_CORPUS\u003e \\\n                                    --vocab_file \u003cVOCAB_FILE\u003e \\\n                                    --pretrained_sp_model \u003cPRETRAINED_SP_MODEL\u003e \\\n                                    --pretrained_model \u003cPRETRAINED_MODEL\u003e \\\n                                    --finetune --do_eval --distributed\n```\n\u003cbr\u003e\n\n## Questions and Discussions\n### Does the auxiliary objective function have a bigger impact?\nThe GPT authors note: \"We additionally found that including language modeling as an auxiliary objective to the fine-tuning helped learning by (a) improving generalization of the supervised model, and (b) accelerating convergence\".\n\nIn our experiments on the IMDb dataset, the auxiliary objective function improved test accuracy, as shown below.\nThe orange line is _auxiliary weight = 0_, the blue line is _auxiliary weight = 0.25_, and the red line is _auxiliary weight = 0.5_. 
The training logs for these runs are available [here](https://github.com/lyeoni/gpt-pytorch/tree/master/logs).\n\u003cp align=\"center\"\u003e\u003cimg width= 700 src=\"logs/tensorboard-visualization.png\"\u003e\u003c/p\u003e\n\u003cbr\u003e\n\n## List of options\nYou may need to adjust the arguments below.\n```\n$ python main.py -h\nusage: main.py [-h] --train_corpus TRAIN_CORPUS --vocab_file VOCAB_FILE\n               --pretrained_sp_model PRETRAINED_SP_MODEL [--pretrain]\n               [--finetune] [--do_eval] [--test_corpus TEST_CORPUS]\n               [--pretrained_model PRETRAINED_MODEL]\n               [--output_model_prefix OUTPUT_MODEL_PREFIX]\n               [--batch_size BATCH_SIZE] [--max_seq_len MAX_SEQ_LEN]\n               [--n_workers N_WORKERS] [--epochs EPOCHS] [--lr LR]\n               [--auxiliary_ratio AUXILIARY_RATIO] [--local_rank LOCAL_RANK]\n               [--no_cuda] [--distributed] [--hidden HIDDEN]\n               [--n_layers N_LAYERS] [--n_attn_heads N_ATTN_HEADS]\n               [--embd_dropout EMBD_DROPOUT] [--resid_dropout RESID_DROPOUT]\n               [--attn_dropout ATTN_DROPOUT] [--ffn_hidden FFN_HIDDEN]\n               [--cached_label_dict CACHED_LABEL_DICT]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --train_corpus TRAIN_CORPUS\n                        corpus for either pre-train or fine-tune\n  --vocab_file VOCAB_FILE\n                        pretrained vocabulary\n  --pretrained_sp_model PRETRAINED_SP_MODEL\n                        pretrained sentencepiece model\n  --pretrain\n  --finetune\n  --do_eval\n  --test_corpus TEST_CORPUS\n                        corpus for either pre-train or fine-tune evaluation\n  --pretrained_model PRETRAINED_MODEL\n                        pretrained GPT model path\n  --output_model_prefix OUTPUT_MODEL_PREFIX\n                        output model name prefix\n  --batch_size BATCH_SIZE\n                        batch size\n  --max_seq_len MAX_SEQ_LEN\n   
                     the maximum size of the input sequence\n  --n_workers N_WORKERS\n                        the number of workers\n  --epochs EPOCHS       the number of epochs\n  --lr LR               initial learning rate\n  --auxiliary_ratio AUXILIARY_RATIO\n                        weight of auxiliary objective\n  --local_rank LOCAL_RANK\n                        node rank for distributed training\n  --no_cuda\n  --distributed\n  --hidden HIDDEN       the number of expected features in the transformer\n                        decoder\n  --n_layers N_LAYERS   the number of decoder layers\n  --n_attn_heads N_ATTN_HEADS\n                        the number of multi-head attention heads\n  --embd_dropout EMBD_DROPOUT\n                        embedding dropout value\n  --resid_dropout RESID_DROPOUT\n                        residual dropout value\n  --attn_dropout ATTN_DROPOUT\n                        attention dropout value\n  --ffn_hidden FFN_HIDDEN\n                        dimension of the feedforward network\n  --cached_label_dict CACHED_LABEL_DICT\n```\n\n### References\n- [Improving Language Understanding by Generative Pre-Training](https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf)\n- [openai / finetune-transformer-lm](https://github.com/openai/finetune-transformer-lm)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flyeoni%2Fgpt-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flyeoni%2Fgpt-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flyeoni%2Fgpt-pytorch/lists"}