{"id":13641910,"url":"https://github.com/liutiedong/goat","last_synced_at":"2025-04-20T12:30:58.114Z","repository":{"id":169546125,"uuid":"642539664","full_name":"liutiedong/goat","owner":"liutiedong","description":"a Fine-tuned LLaMA that is Good at Arithmetic Tasks","archived":false,"fork":false,"pushed_at":"2023-09-15T14:44:32.000Z","size":884,"stargazers_count":174,"open_issues_count":5,"forks_count":16,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-11-09T12:39:44.271Z","etag":null,"topics":["ai","llms","nlp-datasets"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/liutiedong.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-18T19:57:36.000Z","updated_at":"2024-09-20T18:55:27.000Z","dependencies_parsed_at":"2024-01-14T09:19:03.340Z","dependency_job_id":"adf7b805-36e9-4800-aaae-ac36351bfb25","html_url":"https://github.com/liutiedong/goat","commit_stats":null,"previous_names":["liutiedong/goat"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liutiedong%2Fgoat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liutiedong%2Fgoat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liutiedong%2Fgoat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liutiedong%2Fgoat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/liutiedong","download_url":"htt
ps://codeload.github.com/liutiedong/goat/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249893359,"owners_count":21341435,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","llms","nlp-datasets"],"created_at":"2024-08-02T01:01:25.604Z","updated_at":"2025-04-20T12:30:58.106Z","avatar_url":"https://github.com/liutiedong.png","language":"Jupyter Notebook","funding_links":[],"categories":["A01_文本生成_文本对话","Jupyter Notebook","Applications"],"sub_categories":["大语言对话模型及数据","提示语（魔法）"],"readme":"#  🐐 Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks\n\n\u003cp align=\"center\"\u003e\u003ca href=\"https://arxiv.org/abs/2305.14201\"\u003e[Paper]\u003c/a\u003e | \u003ca href=\"https://huggingface.co/tiedong/goat-lora-7b\"\u003e[Adapter Weights]\u003c/a\u003e | \u003ca href=\"https://huggingface.co/datasets/tiedong/goat\"\u003e[Dataset]\u003c/a\u003e | \u003ca href=\"https://colab.research.google.com/drive/15tiSi_XvSpFC-M0c45lJXOwDPgjDSrK9?usp=sharing\"\u003e[Colab]\u003c/a\u003e \u003c/p\u003e\n\n### Demo\n1. Addition\n\u003cdiv style=\"display: flex;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/gpt-4-add.png?raw=true\" alt=\"Alt text\" style=\"width: 45%;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/add.png?raw=true\" alt=\"Alt text\" style=\"width: 54%;\"\u003e\n\u003c/div\u003e\n2. 
Subtraction\n\u003cdiv style=\"display: flex;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/gpt-4-sub.png?raw=true\" alt=\"Alt text\" style=\"width: 45%;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/sub.png?raw=true\" alt=\"Alt text\" style=\"width: 54%;\"\u003e\n\u003c/div\u003e\n3. Multiplication\n\u003cdiv style=\"display: flex;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/gpt-4-mul.png?raw=true\" alt=\"Alt text\" style=\"width: 45%;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/mul.png?raw=true\" alt=\"Alt text\" style=\"width: 54%;\"\u003e\n\u003c/div\u003e\n4. Division\n\u003cdiv style=\"display: flex;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/gpt-4-div.png?raw=true\" alt=\"Alt text\" style=\"width: 45%;\"\u003e\n    \u003cimg align=\"top\" src=\"imgs/div.png?raw=true\" alt=\"Alt text\" style=\"width: 54%;\"\u003e\n\u003c/div\u003e\n\n\n\n### Local Setup\n\n   ```bash\n   git clone https://github.com/liutiedong/goat.git \n   cd goat\n   pip install -r requirements.txt\n   ```\n\n### Dataset (`dataset.ipynb`)\nRun `dataset.ipynb` to generate the `dataset.json` file, or download it from the HuggingFace dataset `tiedong/goat` (https://huggingface.co/datasets/tiedong/goat). Each instance in the dataset contains\n\n- __instruction__: human instruction created by inserting an arithmetic expression into a randomly chosen template and adding some natural language noise. It serves as the prompt fed to the model for instruction-finetuning.\n- __input__: a randomly generated arithmetic expression. It can be used to replace 'instruction' for training when we want to focus on arithmetic and avoid the influence of natural language.\n- __output__: the target output for the model to learn. It contains chain-of-thought (CoT) steps for multi-digit multiplication and division.\n- __answer__: direct numerical answer to the arithmetic task. 
It can be used to test the learnability of various sub-tasks.\n\nExample:\n```json\n{\n    \"instruction\": \"What is 94140209+73?\",\n    \"input\": \"94140209 + 73\",\n    \"output\": \"94140209 + 73 = 94140282\",\n    \"answer\": \"94140282\"\n},\n{\n    \"instruction\": \"Compute 8432862 - 659016175?\",\n    \"input\": \"8432862 - 659016175\",\n    \"output\": \"8432862 - 659016175 = -650583313\",\n    \"answer\": \"-650583313\"\n},\n{\n    \"instruction\": \"Calculate 37 times 3066\",\n    \"input\": \"37 * 3066\",\n    \"output\": \"37 * 3066 = 3066 * (30 + 7) = 3066 * 30 + 3066 * 7 = 91980 + 21462 = 113442\",\n    \"answer\": \"113442\"\n},\n{\n    \"instruction\": \"Determine the numerical value of 5697/47.\",\n    \"input\": \"5697 / 47\",\n    \"output\": \"5697 - 47 * 100 = 5697 - 4700 = 997\\n997 - 47 * 20 = 997 - 940 = 57\\n57 - 47 * 1 = 57 - 47 = 10\\nTherefore, 5697 / 47 = 121 R 10\",\n    \"answer\": \"121 R 10\"\n},\n\n```\nFeel free to modify `dataset.ipynb` to create your own data.\n\nA good starting point is a simple sub-task, such as 8-digit by 8-digit addition:\n```\npairs = [(random.randint(10**7, 10**8 - 1), random.randint(10**7, 10**8 - 1)) for k in range(100000)]\n```\nIt takes less than 2 hours of finetuning to achieve near-perfect accuracy (100000 training samples on an A10 GPU).\n\n\n\n### Template (`goat.json`)\n`template.txt` contains several hundred natural language instructions. Instructions that are more commonly used are duplicated more times to increase their chances of being sampled. Instructions generated using ChatGPT are listed after them without duplication. Note that some instructions may not be coherent or grammatically correct after arithmetic expressions are inserted, but this should not be a problem if we do not train on the input.\n\nTo add more instructions for training, put new instructions in `template.txt` under the `templates` folder. 
Then run `python convert_txt_to_json.py` to convert it to the `goat.json` file, which `dataset.ipynb` uses to generate the dataset for fine-tuning.\n\n\n\n\n### Training (`finetune.py`)\n\nExample usage:\n\n```bash\npython finetune.py \\\n    --base_model 'decapoda-research/llama-7b-hf' \\\n    --data_path 'dataset.json' \\\n    --output_dir './weights'\n```\n\nWe train our model using the following command:\n\n```bash\npython finetune.py \\\n    --base_model 'decapoda-research/llama-7b-hf' \\\n    --data_path 'dataset.json' \\\n    --output_dir './weights' \\\n    --batch_size 128 \\\n    --micro_batch_size 16 \\\n    --num_epochs 1 \\\n    --learning_rate 1e-4 \\\n    --cutoff_len 512 \\\n    --val_set_size 0 \\\n    --lora_r 64 \\\n    --lora_alpha 64 \\\n    --lora_dropout 0.05 \\\n    --lora_target_modules '[q_proj,v_proj,k_proj,o_proj]'\n```\n\n### Inference (`app.py`)\n\nThis file downloads LoRA weights from HuggingFace `tiedong/goat-lora-7b`, and runs a Gradio interface for inference.\n\nExample usage:\n\n```bash\npython app.py \\\n    --base_model 'decapoda-research/llama-7b-hf' \\\n    --lora_weights 'tiedong/goat-lora-7b'\n```\n\nAlternatively, host your own Goat Gradio demo directly in Colab with [this notebook](https://colab.research.google.com/drive/15tiSi_XvSpFC-M0c45lJXOwDPgjDSrK9?usp=sharing).\n\n### Citation\n```\n@article{liu2023goat,\n  title={Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks},\n  author={Liu, Tiedong and Low, Bryan Kian Hsiang},\n  journal={arXiv preprint arXiv:2305.14201},\n  year={2023}\n}\n```\n\n### Acknowledgements\nOur implementation is mainly based on [Alpaca-LoRA](https://github.com/tloen/alpaca-lora).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliutiedong%2Fgoat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fliutiedong%2Fgoat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliutiedong%2Fgoat/lists"}