{"id":18028497,"url":"https://github.com/interactivetech/mpttune-test","last_synced_at":"2025-07-29T10:07:41.673Z","repository":{"id":175405392,"uuid":"653808150","full_name":"interactivetech/mpttune-test","owner":"interactivetech","description":"Testing MPT 7B finetuning using LORA","archived":false,"fork":false,"pushed_at":"2023-06-29T23:19:27.000Z","size":40813,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-31T23:51:11.386Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/interactivetech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-14T19:29:28.000Z","updated_at":"2023-10-17T07:32:08.000Z","dependencies_parsed_at":"2023-07-03T02:44:33.918Z","dependency_job_id":null,"html_url":"https://github.com/interactivetech/mpttune-test","commit_stats":null,"previous_names":["interactivetech/mpttune-test"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/interactivetech%2Fmpttune-test","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/interactivetech%2Fmpttune-test/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/interactivetech%2Fmpttune-test/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/interactivetech%2Fmpttune-test/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/interactivetech","download_url":"https://codeload.github.com/interactivetech/mpttune-test/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253767571,"owners_count":21961137,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-30T08:42:21.807Z","updated_at":"2025-05-12T15:45:15.765Z","avatar_url":"https://github.com/interactivetech.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mpttune: 4-Bit Finetuning of MPTs on a Consumer GPU\n\n**mpttune** allows finetuning MPTs (e.g., mpt-7b-storywriter-4bit) on as little as one consumer-grade A100 40GB. \n\nIt features a tiny, easy-to-use codebase.\n\nOne benefit of being able to finetune larger LLMs on one GPU is the ability to easily leverage data parallelism for large models.\n\nUnder the hood, **mpttune** implements the LoRA algorithm over an LLM compressed using the GPTQ algorithm, which requires implementing a backward pass for the quantized LLM.\n\nGiven a 9000-token prompt from a book, **mpttune** can generate a 600-token epilogue in about 30 seconds on an A100 40GB using the Triton backend:\n\n```\n$ tail ... $book\n\n“She still retained her beauty. She was more than common tall, of\nmajestic presence, she had an exquisitely-modelled neck and bust, and\nher hand was the delight of the sculptor. Her smile was distinguished\nby its sweetness and her voice was rich and low. 
Her lofty brow, and\nclear, thoughtful gaze  \n\n----------------------------------------------------------------------\n\n\netained her beauty. She was more than common tall, of\nmajestic presence, she had an exquisitely-modelled neck and bust, and\nher hand was the delight of the sculptor. Her smile was distinguished\nby its sweetness and her voice was rich and low. Her lofty brow, and\nclear, thoughtful gaze . \n\n\nEPILOGUE\n\n\n$ mpttune generate --interactive --model mpt-7b-storywriter-4bit --weights mpt-7b-storywriter-4bit-128g.safetensors --max_new_tokens=600 --use_cache --do_sample --prompt \"$book\"\n\nEPILOGUE\nThe Project Gutenberg eBook of A forgotten Prince of Wales, by Henry Curties  This eBook is for the use of anyone anywhere in the United States and most other parts of the world at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with this eBook or online at www.gutenberg.org. If you are not located in the United States, you will have to check the laws of the country where you are located before using this eBook.  Title: A forgotten Prince of Wales  Author: Henry Curties  Release Date: May 19, 2023 [eBook #70795]  Language: English  Produced by: MWS and the Online Distributed Proofreading Team at              https://www.pgdp.net (This file was produced from images              generously made available by The Internet Archive/Canadian              Libraries)  *** START OF THE PROJECT GUTENBERG EBOOK A FORGOTTEN PRINCE OF WALES ***                                    A FORGOTTEN                             PRINCE OF WALES.                              [Illustration]                              [Illustration:             _National Portrait Gallery._      _Emery Walker._           FREDERICK, PRINCE OF WALES, AND HIS SISTERS AT KEW.]                                   
A FORGOTTEN                             PRINCE OF WALES                                      BY                          CAPTAIN HENRY CURTIES                      Author of “When England Slept,”                                etc., etc.                                    LONDON                           EVERETT \u0026 CO., LTD.                      42 ESSEX STREET, STRAND, W.C.                             Dedicated by permission                                    to                    His Grace the Duke of Argyll, K.G.                                    CONTENTS                                                                       PAGE  CHAPTER I.   Which Seizes upon the Prince as he comes into the World               1   CHAPTER II.   The Falling in of a Great Legacy                                     12   CHAPTER III.   The Prince at the Age of Nine                                        18   CHAPTER IV.   In which England gets a new King and Queen                           25   CHAPTER V.   A Double Event which did not come off                                41   CHAPTER VI.   The Prince and the London of 1728                                    50   CHAPTER VII.   Peter Wentworth’s Letters on the Prince’s Life                       60   CHAPTER VIII.   The Prince’s Embarrassments                                          73   CHAPTER IX.   The Duchess of Marlborough Throws for a Big Stake                    83   CHAPTER X.   The Beautiful Vanilla                                                92   CHAPTER XI.   The Prince Asserts Himself                                          104   CHAPTER XII.   A Child Bride                                                       121   CHAPTER XIII.   The Nuptials                                                        141   CHAPTER XIV.   Lady Archibald                                                      147   CHAPTER XV.   A Rope Ladder and Some Storms                                       153   CHAPTER XVI.   
Parliament and the Prince’s Income                                  178   CHAPTER XVII.   A New Favourite and a Settlement                                    198   CHAPTER XVIII.   A Most Extraordinary Event                                          203   CHAPTER XIX.   Which Contains a Great Deal of Fussing and Fuming and a little  Poetry                                                              221   CHAPTER XX.   The Prince is Cast Forth with His Family                            247   CHAPTER XXI.   The Death of the Queen                                              261   CHAPTER XXII.   The Year of Mourning                                                282   CHAPTER XXIII.   A Husband and a Lover                                               294   CHAPTER XXIV.   The Reconciliation               \"In London,\" says the Duke of Somerset, \"I do not know why the Duke of Kent, who has a large share of our fortunes, has not had the honour of being elected King of England, but there is a precedent for it, which I think we will be better served with a new edition of the history of the British Empire, and the Prince of Wales, who is the eldest brother of the Duke of Kent, has been admitted, and my eldest brother, George, Duke of Kent, has also been elected, and his brother the Duke of Cornwall, Prince of Wales, has been appointed,\" said the Duke of York, and he was right.\n\nThe question is very simple, but we have seen how many centuries ago I came into this realm, and so it is a little bit, and I am sure that I don't like to say this, but I am going to say it anyway.\n\nAs a boy I am, and I think that I have a right to be, but I think I am also not going to be, because I'm a man and that is the best thing about me.\n\nI don't know how many years ago I was born, but it was one of those moments, and I have no idea why I was born. 
But, sir, the truth is that I am the last of the princes who have been born in England, and so I can be a prince and I'm not going to say it, because it is a very serious matter, because I don't think that it is a big thing for someone to be a prince, but I have the same rights as the other princes before me, and I am a prince that, if you think, can be very popular and a prince that has not been born a prince, I'm not sure that it is a very good thing.\n\nWhen I was born, I was very important, and I have been told that I am the last of the princes that I am, but, as I said, I am also the last of the Prince of Waleses who has been born, and I'm not going to say what else it is, but I am proud to be able, if I have a son, to go forward, and I'm going to be a king.\n\nI am not a prince of the British Empire, but I do not think it is something that is important for my family, and I am not going to say that it is, I just do not think it is a good thing for a boy to be a prince.\n\nI'm not going to say that I think that I am still the only person that I am not going to say this, and I have been in the same position as a prince, and I'm going to be a prince, and that is an issue.\n\nThere is no doubt that, if you think, my mother and my brothers, that I am going to\n\n\n\nTook 30.842 s\n```\n\nThis example is based on the model: OccamRazor/mpt-7b-storywriter-4bit-128g.\n\nHere is a [Google Colab](https://colab.research.google.com/drive/1JoSObRbuehRHWh7Q12Qy-7kFPRVj25yz?usp=sharing). \nYou will need a A100 40GB to read a context length of 9000 tokens.\n\n## Installation\n\n### Setup\n\n```\npip install -r requirements.txt \npython setup.py install         \n```\n\nThe default backend is triton which is the fastest. 
For CUDA support, also install the CUDA kernels:\n\n```\npython setup_cuda.py install         \n```\n\n\n## Running mpttune\n\nThe above process installs the `mpttune` command in your environment.\n\n### Download Models\n\nStart by downloading the weights of an MPT model:\n```\n$ wget https://huggingface.co/OccamRazor/mpt-7b-storywriter-4bit-128g/resolve/main/model.safetensors\n```\n\n### Generate Text\n\nYou can generate text directly from the command line. This generates text from the base model:\n```\n$ mpttune generate \\\n    --interactive \\\n    --model mpt-7b-storywriter-4bit \\\n    --weights model.safetensors \\\n    --max_new_tokens=600 \\\n    --use_cache \\\n    --do_sample \\\n    --prompt \"The first person on the moon is \"\n```\n\n### Finetune A Base Model\n\nYou may also finetune a base model yourself. First, you need to download a dataset:\n```\n$ wget https://github.com/gururise/AlpacaDataCleaned/raw/main/alpaca_data_cleaned.json\n```\n\nYou can finetune any model of the MPT family:\n\n\u003cdetails\u003e\n\u003csummary\u003eMPT-7B\u003c/summary\u003e\n\u003cbr\u003e\n\n    $ mpttune finetune \\\n        --model=mpt-7b \\\n        --weights=mosaicml/mpt-7b \\\n        --dataset=./alpaca_data_cleaned.json \\\n        --data_type=alpaca \\\n        --lora_out_dir=./mpt-7b-alpaca/ \\\n        --mbatch_size=1 \\\n        --batch_size=2 \\\n        --epochs=3 \\\n        --lr=3e-4 \\\n        --cutoff_len=256 \\\n        --lora_r=8 \\\n        --lora_alpha=16 \\\n        --lora_dropout=0.05 \\\n        --warmup_steps=5 \\\n        --save_steps=50 \\\n        --save_total_limit=3 \\\n        --logging_steps=5 \\\n        --target_modules='[\"Wqkv\"]'\n\n    The above commands will download the model and use LoRA to finetune the quantized model. 
The final adapters and the checkpoints will be saved in `mpt-7b-alpaca` and available for generation as follows:\n\n    $ mpttune generate \\\n        --interactive \\\n        --model mpt-7b \\\n        --weights mosaicml/mpt-7b \\\n        --lora_apply_dir mpt-7b-alpaca \\\n        --max_new_tokens 50 \\\n        --use_cache \\\n        --do_sample \\\n        --instruction \"How to prepare pasta?\"\n\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n\u003csummary\u003eMPT-7B-INSTRUCT\u003c/summary\u003e\n\u003cbr\u003e\n\n    $ mpttune finetune \\\n        --model=mpt-7b-instruct \\\n        --weights=mosaicml/mpt-7b-instruct \\\n        --dataset=./alpaca_data_cleaned.json \\\n        --data_type=alpaca \\\n        --lora_out_dir=./mpt-7b-instruct-alpaca/ \\\n        --mbatch_size=1 \\\n        --batch_size=2 \\\n        --epochs=3 \\\n        --lr=3e-4 \\\n        --cutoff_len=256 \\\n        --lora_r=8 \\\n        --lora_alpha=16 \\\n        --lora_dropout=0.05 \\\n        --warmup_steps=5 \\\n        --save_steps=50 \\\n        --save_total_limit=3 \\\n        --logging_steps=5 \\\n        --target_modules='[\"Wqkv\"]'\n\n    The above commands will download the model and use LoRA to finetune the quantized model. 
The final adapters and the checkpoints will be saved in `mpt-7b-instruct-alpaca` and available for generation as follows:\n\n    $ mpttune generate \\\n        --interactive \\\n        --model mpt-7b-instruct \\\n        --weights mosaicml/mpt-7b-instruct \\\n        --lora_apply_dir mpt-7b-instruct-alpaca \\\n        --max_new_tokens 50 \\\n        --use_cache \\\n        --do_sample \\\n        --instruction \"How to prepare pasta?\"\n\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n\u003csummary\u003eMPT-7B-CHAT\u003c/summary\u003e\n\u003cbr\u003e\n\n    $ mpttune finetune \\\n        --model=mpt-7b-chat \\\n        --weights=mosaicml/mpt-7b-chat \\\n        --dataset=./alpaca_data_cleaned.json \\\n        --data_type=alpaca \\\n        --lora_out_dir=./mpt-7b-chat-alpaca/ \\\n        --mbatch_size=1 \\\n        --batch_size=2 \\\n        --epochs=3 \\\n        --lr=3e-4 \\\n        --cutoff_len=256 \\\n        --lora_r=8 \\\n        --lora_alpha=16 \\\n        --lora_dropout=0.05 \\\n        --warmup_steps=5 \\\n        --save_steps=50 \\\n        --save_total_limit=3 \\\n        --logging_steps=5 \\\n        --target_modules='[\"Wqkv\"]'\n\n    The above commands will download the model and use LoRA to finetune the quantized model. 
The final adapters and the checkpoints will be saved in `mpt-7b-chat-alpaca` and available for generation as follows:\n\n    $ mpttune generate \\\n        --interactive \\\n        --model mpt-7b-chat \\\n        --weights mosaicml/mpt-7b-chat\\\n        --lora_apply_dir mpt-7b-chat-alpaca \\\n        --max_new_tokens 50 \\\n        --use_cache \\\n        --do_sample \\\n        --instruction \"How to prepare pasta?\"\n\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n\u003csummary\u003eMPT-7B-STORYWRITER\u003c/summary\u003e\n\u003cbr\u003e\n\n    $ mpttune finetune \\\n        --model=mpt-7b-storywriter \\\n        --weights=mosaicml/mpt-7b-storywriter \\\n        --dataset=./alpaca_data_cleaned.json \\\n        --data_type=alpaca \\\n        --lora_out_dir=./mpt-7b-storywriter-alpaca/ \\\n        --mbatch_size=1 \\\n        --batch_size=2 \\\n        --epochs=3 \\\n        --lr=3e-4 \\\n        --cutoff_len=256 \\\n        --lora_r=8 \\\n        --lora_alpha=16 \\\n        --lora_dropout=0.05 \\\n        --warmup_steps=5 \\\n        --save_steps=50 \\\n        --save_total_limit=3 \\\n        --logging_steps=5 \\\n        --target_modules='[\"Wqkv\"]'\n\n    The above commands will download the model and use LoRA to finetune the quantized model. 
The final adapters and the checkpoints will be saved in `mpt-7b-storywriter-alpaca` and available for generation as follows:\n\n    $ mpttune generate \\\n        --interactive \\\n        --model mpt-7b-storywriter \\\n        --weights mosaicml/mpt-7b-storywriter \\\n        --lora_apply_dir mpt-7b-storywriter-alpaca \\\n        --max_new_tokens 50 \\\n        --use_cache \\\n        --do_sample \\\n        --instruction \"How to prepare pasta?\"\n\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n\u003csummary\u003eMPT-7B-STORYWRITER-4BIT-128G\u003c/summary\u003e\n\u003cbr\u003e\n\n    $ wget https://huggingface.co/OccamRazor/mpt-7b-storywriter-4bit-128g/resolve/main/model.safetensors\n    \n    $ mpttune finetune \\\n        --model=mpt-7b-storywriter-4bit \\\n        --weights=./model.safetensors \\\n        --dataset=./alpaca_data_cleaned.json \\\n        --data_type=alpaca \\\n        --lora_out_dir=./mpt-7b-storywriter-4bit-alpaca/ \\\n        --mbatch_size=1 \\\n        --batch_size=2 \\\n        --epochs=3 \\\n        --lr=3e-4 \\\n        --cutoff_len=256 \\\n        --lora_r=8 \\\n        --lora_alpha=16 \\\n        --lora_dropout=0.05 \\\n        --warmup_steps=5 \\\n        --save_steps=50 \\\n        --save_total_limit=3 \\\n        --logging_steps=5 \\\n        --target_modules='[\"Wqkv\"]'\n\n    The above commands will download the model and use LoRA to finetune the quantized model. 
The final adapters and the checkpoints will be saved in `mpt-7b-storywriter-4bit-alpaca` and available for generation as follows:\n\n    $ mpttune generate \\\n        --interactive \\\n        --model mpt-7b-storywriter-4bit \\\n        --weights model.safetensors \\\n        --lora_apply_dir mpt-7b-storywriter-4bit-alpaca \\\n        --max_new_tokens=50 \\\n        --use_cache \\\n        --do_sample \\\n        --instruction \"How to prepare pasta?\"\n\n\u003c/details\u003e\n\n\n## Todos\n\nWork that still needs to be done:\n* Add Triton flash attention, the only implementation that supports attention bias (ALiBi)\n\n\n## Acknowledgements\n\n**mpttune** is based on the following projects:\n* The GPTQ algorithm and codebase by the [IST-DASLab](https://github.com/IST-DASLab/gptq) with modifications by [@qwopqwop200](https://github.com/qwopqwop200/)\n* The `alpaca_lora_4bit` repo by [johnsmith0031](https://github.com/johnsmith0031)\n* The PEFT repo and its implementation of LoRA\n* The LLaMA, OPT, and BLOOM models by Meta FAIR and the BigScience consortium\n* The `llmtune` repo by [kuleshov-group](https://github.com/kuleshov-group/llmtune)\n\n\n## Consultations\nNeed a custom solution? Let me know: `r.m.mihaylov@gmail.com`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finteractivetech%2Fmpttune-test","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finteractivetech%2Fmpttune-test","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finteractivetech%2Fmpttune-test/lists"}