{"id":22156041,"url":"https://github.com/daniel-furman/sft-demos","last_synced_at":"2025-04-09T21:21:56.071Z","repository":{"id":177752264,"uuid":"660852781","full_name":"daniel-furman/sft-demos","owner":"daniel-furman","description":"Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.","archived":false,"fork":false,"pushed_at":"2024-10-20T19:40:51.000Z","size":10112,"stargazers_count":73,"open_issues_count":0,"forks_count":8,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-02T19:08:14.147Z","etag":null,"topics":["chatbot","deep-learning","instruction-tuning","llama","nlp","text-generation","transformers"],"latest_commit_sha":null,"homepage":"https://huggingface.co/dfurman","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/daniel-furman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-07-01T03:07:29.000Z","updated_at":"2025-03-02T04:39:01.000Z","dependencies_parsed_at":"2023-12-25T07:28:58.348Z","dependency_job_id":"ff9accb9-ffcf-490a-8fc1-b47b37d92237","html_url":"https://github.com/daniel-furman/sft-demos","commit_stats":{"total_commits":247,"total_committers":1,"mean_commits":247.0,"dds":0.0,"last_synced_commit":"96fd87fcc381e5713e0ad01e195614bc05050b30"},"previous_names":["daniel-furman/sft_demos"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daniel-furman%2Fsft-demos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d
aniel-furman%2Fsft-demos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daniel-furman%2Fsft-demos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daniel-furman%2Fsft-demos/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/daniel-furman","download_url":"https://codeload.github.com/daniel-furman/sft-demos/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247805333,"owners_count":20999155,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","deep-learning","instruction-tuning","llama","nlp","text-generation","transformers"],"created_at":"2024-12-02T02:34:44.086Z","updated_at":"2025-04-09T21:21:56.036Z","avatar_url":"https://github.com/daniel-furman.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook"],"sub_categories":[],"readme":"# Finetuning demos for LLMs\n\n[![License](https://img.shields.io/badge/License-Apache_2.0-green.svg)](https://github.com/daniel-furman/Polyglot-or-Not/blob/main/LICENSE) \n[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/release/python-390/) \n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) \n\n## 📚 Intro\n\nThis repo contains demos for finetuning of Large Language Models (LLMs), like Meta's [llama-3](https://huggingface.co/meta-llama/Meta-Llama-3-8B). 
In particular, we focus on training for short-form instruction following.\n\n---\n\n## 🔎 Finetunes\n\n*Note*: See `_peft` for training runs, as organized by base model. \n\nSome examples:\n\n* [dfurman/CalmeRys-78B-Orpo-v0.1](https://huggingface.co/dfurman/CalmeRys-78B-Orpo-v0.1)\n    * [mlx-community/CalmeRys-78B-Orpo-v0.1-4bit](https://huggingface.co/mlx-community/CalmeRys-78B-Orpo-v0.1-4bit)\n* [dfurman/Qwen2-72B-Orpo-v0.1](https://huggingface.co/dfurman/Qwen2-72B-Orpo-v0.1)\n* [dfurman/Llama-3-70B-Orpo-v0.1](https://huggingface.co/dfurman/Llama-3-70B-Orpo-v0.1)\n* [dfurman/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/dfurman/Mixtral-8x7B-Instruct-v0.1)\n\n## 🏆 Evaluation\n\n*Note*: See `_eval` for evaluation runs. \n\nAn example:\n\nAs of Oct 2024, [dfurman/CalmeRys-78B-Orpo-v0.1](https://huggingface.co/dfurman/CalmeRys-78B-Orpo-v0.1) is the top ranking model on the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) 🏆\n\n|      Metric       |Value|\n|-------------------|----:|\n|Avg.               |50.78|\n|IFEval (0-Shot)    |81.63|\n|BBH (3-Shot)       |61.92|\n|MATH Lvl 5 (4-Shot)|37.92|\n|GPQA (0-shot)      |20.02|\n|MuSR (0-shot)      |36.37|\n|MMLU-PRO (5-shot)  |66.80|\n\n\n## 💻 Usage\n\n*Note*: Use the code below to get started on text generation (inference). 
Be sure to have a GPU-enabled cluster.\n\n\u003cdetails\u003e\n\n\u003csummary\u003eSetup\u003c/summary\u003e\n\n```python\n!pip install -qU transformers accelerate bitsandbytes\n!huggingface-cli download dfurman/CalmeRys-78B-Orpo-v0.1\n```\n\n```python\nfrom transformers import AutoTokenizer, BitsAndBytesConfig\nimport transformers\nimport torch\n\n\nif torch.cuda.get_device_capability()[0] \u003e= 8:\n    !pip install -qqq flash-attn\n    attn_implementation = \"flash_attention_2\"\n    torch_dtype = torch.bfloat16\nelse:\n    attn_implementation = \"eager\"\n    torch_dtype = torch.float16\n\n# # quantize if necessary\n# bnb_config = BitsAndBytesConfig(\n#    load_in_4bit=True,\n#    bnb_4bit_quant_type=\"nf4\",\n#    bnb_4bit_compute_dtype=torch_dtype,\n#    bnb_4bit_use_double_quant=True,\n# )\n\nmodel = \"dfurman/CalmeRys-78B-Orpo-v0.1\"\n\ntokenizer = AutoTokenizer.from_pretrained(model)\npipeline = transformers.pipeline(\n    \"text-generation\",\n    model=model,\n    model_kwargs={\n        \"torch_dtype\": torch_dtype,\n        # \"quantization_config\": bnb_config,\n        \"device_map\": \"auto\",\n        \"attn_implementation\": attn_implementation,\n    }\n)\n```\n\n\u003c/details\u003e\n\n### Example 1\n\n```python\nquestion = \"Is the number 9.11 larger than 9.9?\"\n\nmessages = [\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant that thinks step by step.\"},\n    {\"role\": \"user\", \"content\": question},\n]\nprompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n# print(\"***Prompt:\\n\", prompt)\n\noutputs = pipeline(\n    prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95\n)\nprint(\"***Generation:\")\nprint(outputs[0][\"generated_text\"][len(prompt) :])\n```\n\n```\n***Generation:\nTo compare these two numbers, it's important to look at their decimal places after the whole number part, which is 9 in both cases. 
Comparing the tenths place, 9.11 has a '1' and 9.9 has a '9'. Since '9' is greater than '1', 9.9 is larger than 9.11.\n```\n\n### Example 2\n\n```python\nquestion = \"\"\"The bakers at the Beverly Hills Bakery baked 200 loaves of bread on Monday morning. \nThey sold 93 loaves in the morning and 39 loaves in the afternoon. \nA grocery store then returned 6 unsold loaves back to the bakery. \nHow many loaves of bread did the bakery have left?\nRespond as succinctly as possible. Format the response as a completion of this table:\n|step|subquestion|procedure|result|\n|:---|:----------|:--------|:-----:|\"\"\"\n\n\nmessages = [\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n    {\"role\": \"user\", \"content\": question},\n]\nprompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n# print(\"***Prompt:\\n\", prompt)\n\noutputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)\nprint(\"***Generation:\")\nprint(outputs[0][\"generated_text\"][len(prompt):])\n\n```\n\n```\n***Generation:\n|1|Calculate total sold|Add morning and afternoon sales|132|\n|2|Subtract sold from total|200 - 132|68|\n|3|Adjust for returns|Add returned loaves to remaining|74|\n```\n\n### Example 3\n\n```python\nquestion = \"What's a good recipe for a spicy margarita?\"\n\nmessages = [\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n    {\"role\": \"user\", \"content\": question},\n]\nprompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n# print(\"***Prompt:\\n\", prompt)\n\noutputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)\nprint(\"***Generation:\")\nprint(outputs[0][\"generated_text\"][len(prompt):])\n```\n\n```\n***Generation:\nTo make a Spicy Margarita, you'll need to incorporate a chili or pepper element into your classic margarita recipe. 
Here’s a simple way to do it:\n\n### Ingredients:\n- 2 oz tequila (blanco or reposado)\n- 1 oz fresh lime juice\n- 1/2 oz triple sec (Cointreau or Grand Marnier)\n- 1/2 oz agave syrup or simple syrup\n- 1-2 slices of jalapeño (or more depending on how spicy you like it)\n- Salt and/or chili powder for rimming the glass\n- Ice\n- Lime wheel for garnish\n\n### Instructions:\n1. **Muddle Jalapeño**: In a shaker, muddle the jalapeño slices slightly. This will release the oils and heat from the peppers.\n2. **Add Remaining Ingredients**: Add the tequila, lime juice, triple sec, and agave syrup or simple syrup. \n3. **Shake and Strain**: Fill the shaker with ice and shake vigorously until cold. Strain into a salt and/or chili powder rimmed glass filled with ice.\n4. **Garnish and Serve**: Garnish with a lime wheel and enjoy.\n\nIf you prefer a smoother spiciness that doesn't overpower the drink, you could also consider making a jalapeño-infused tequila by leaving the jalapeño slices in the bottle of tequila for several hours to a couple of days, adjusting the time based on desired level of spiciness. Then use this infused tequila instead of regular tequila in the recipe above. \n\nAnother variation is to use a spicy syrup. To make this, combine equal parts water and sugar with a few sliced jalapeños in a saucepan. Bring to a boil, stirring occasionally to dissolve the sugar. Reduce heat and simmer for about 5 minutes. Let cool, strain out the jalapeños, then store in a sealed container in the refrigerator until ready to use. Use this spicy syrup instead of regular syrup in the recipe. \n\nAs always, adjust the quantity of jalapeño or the type of chili used to suit your taste. 
Enjoy responsibly!\n```\n\n## 🤝 References\n\nBase models:\n\n* [qwen2](https://huggingface.co/Qwen/Qwen2-72B-Instruct)\n* [llama-3](https://huggingface.co/meta-llama/Meta-Llama-3-8B)\n* [phi-2](https://huggingface.co/microsoft/phi-2)\n* [mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)\n* [mistral](https://huggingface.co/mistralai/Mistral-7B-v0.1)\n* [llama-2](https://huggingface.co/meta-llama/Llama-2-70b-hf)\n* [falcon](https://huggingface.co/tiiuae/falcon-180B)\n\nDatasets:\n\n* [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k)\n* [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin)\n* [jondurbin/airoboros-2.2.1](https://huggingface.co/datasets/jondurbin/airoboros-2.2.1)\n* [garage-bAInd/Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus)\n* [timdettmers/openassistant-guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco)\n\nCompute providers:\n\n* [RunPod](https://www.runpod.io/)\n* [Lambda Labs](https://lambdalabs.com/)\n* [Google Colab](https://colab.google/)\n\n## Recommended venv setup\n\n```\npython3 -m venv .venv\nsource .venv/bin/activate\npip3 install -r requirements.txt\n```\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaniel-furman%2Fsft-demos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdaniel-furman%2Fsft-demos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaniel-furman%2Fsft-demos/lists"}