{"id":13337910,"url":"https://github.com/d-f/llm-summarization","last_synced_at":"2025-04-07T11:35:24.082Z","repository":{"id":243176076,"uuid":"774456799","full_name":"d-f/llm-summarization","owner":"d-f","description":"LoRA supervised fine-tuning, RLHF (PPO) and RAG with llama-3-8B on the TLDR summarization dataset","archived":false,"fork":false,"pushed_at":"2025-02-02T09:10:29.000Z","size":99,"stargazers_count":10,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-28T15:46:13.102Z","etag":null,"topics":["document-summarization","fine-tuning","llama3","llm","reinforcement-learning","retrieval-augmented-generation","supervised-learning","tldr"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/d-f.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-19T15:27:02.000Z","updated_at":"2025-03-11T20:28:41.000Z","dependencies_parsed_at":"2024-06-07T06:35:44.404Z","dependency_job_id":"798e2765-44aa-48be-9be9-e878f281a85d","html_url":"https://github.com/d-f/llm-summarization","commit_stats":{"total_commits":37,"total_committers":1,"mean_commits":37.0,"dds":0.0,"last_synced_commit":"f9aa5f2b60344678f45464ce579be95f5dc3f4f0"},"previous_names":["d-f/llm-summarization"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-f%2Fllm-summarization","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-f%2Fllm-summarization/tags","rele
ases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-f%2Fllm-summarization/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-f%2Fllm-summarization/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/d-f","download_url":"https://codeload.github.com/d-f/llm-summarization/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247647298,"owners_count":20972839,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["document-summarization","fine-tuning","llama3","llm","reinforcement-learning","retrieval-augmented-generation","supervised-learning","tldr"],"created_at":"2024-07-29T19:15:10.669Z","updated_at":"2025-04-07T11:35:24.012Z","avatar_url":"https://github.com/d-f.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# llm-summarization\n# Prompt Engineering\n5 prompts were compared with 10 random examples from the TLDR dataset that were summarized by humans\n\nprompt_eng_templates.py was used in order to create Jinja2 templates, render the text and save .txt file input and output examples\n\nfolder_inference.py was adapted from llama-recipes inference.py but was changed to predict all text files within a folder more efficiently\n\ncompare_prompt_outputs.py was used to measure the following metrics of the llama-3-8B summarization results with 5 different prompts\n\nROUGE-1\n| Prompt   | Precision | Recall | F1    | \n| -------- | --------- | ------ | ----- |\n| Prompt 1 | 0.092     | 0.307  | 0.136 |\n| 
Prompt 2 | 0.125     | 0.370  | 0.178 |\n| Prompt 3 | 0.107     | 0.307  | 0.145 |\n| Prompt 4 | 0.093     | 0.216  | 0.112 |\n| Prompt 5 | 0.113     | 0.301  | 0.153 |\n\nROUGE-2\n| Prompt   | Precision | Recall | F1    | \n| -------- | --------- | ------ | ----- |\n| Prompt 1 | 0.005     | 0.028  | 0.009 |\n| Prompt 2 | 0.016     | 0.065  | 0.025 |\n| Prompt 3 | 0.004     | 0.012  | 0.006 |\n| Prompt 4 | 0.008     | 0.022  | 0.012 |\n| Prompt 5 | 0.010     | 0.030  | 0.015 |\n\nROUGE-L\n| Prompt   | Precision | Recall | F1    | \n| -------- | --------- | ------ | ----- |\n| Prompt 1 | 0.059     | 0.201  | 0.088 |\n| Prompt 2 | 0.078     | 0.245  | 0.113 |\n| Prompt 3 | 0.068     | 0.202  | 0.093 |\n| Prompt 4 | 0.056     | 0.129  | 0.071 |\n| Prompt 5 | 0.075     | 0.204  | 0.100 |\n\nAll scores are very low, which is attributed to using the smallest llama model with 8-bit quantization, but Prompt 2 scores highest on precision, recall and F1 for ROUGE-1, ROUGE-2 and ROUGE-L.\n\nTLDR dataset: \nhttps://huggingface.co/datasets/webis/tldr-17\n\nRLHF dataset: https://github.com/openai/summarize-from-feedback\n```\nazcopy copy \"https://openaipublic.blob.core.windows.net/summarize-from-feedback/dataset/*\" . 
--recursive\n```\nAccess to Llama 3 can be requested at the following link:\n\nhttps://llama.meta.com/llama-downloads/\n\n```\ngit clone https://github.com/meta-llama/llama-recipes\n```\n\n```\ncd ./llama-recipes\npip install -e .\n```\n\nDownload.sh downloads a folder /llama-3-8b/ containing consolidated.00.pth.tar, params.JSON, tokenizer.JSON, tokenizer.model, tokenizer_config.JSON.\n\n```\ngit clone https://github.com/huggingface/transformers\npip install -e ./transformers\n```\n\n```\npip install blobfile\npip install tiktoken\n```\n\nUse transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py to convert the llama weights to huggingface format so that they can be used with llama-recipes.\n```\npython ./transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir /llm_summarization/llama3/Meta-Llama-3-8B/ --model_size 8B --output_dir /llm_summarization/llama3_hf_format/ --llama_version 3\n```\n\nIf convert_llama_weights_to_hf.py ends with a \"[WinError 5] Access is denied\" error for the file .\\convert_output\\tmp\\pytorch_model-1-of-33.bin, the \\tmp\\ folder can be deleted manually.\n\nMove the contents of the huggingface conversion folder (--output_dir from convert_llama_weights_to_hf.py) into the folder used for inference (/llama-recipes/recipes/inference/local_inference/llama-3-8b/).\n```\nbash move_hf_conversion.sh\n```\n\n# Inference\n\nTo summarize the text in example_prompt1.txt (the --quantization flag loads the model in 8-bit precision):\n```\npython inference.py --model_name llama-3-8b --prompt_file /llm_summarization/example_prompt1.txt --quantization\n```\n\nRandom samples were taken from the TL;DR dataset and summarized with llama-3-8B prior to any fine-tuning:\n\n```\npython prefinetune_examples.py -save_path /llm_summarization/example_prompts/ -num_ex 2\n```\n\nPrompt: \n````\n### Instruction: Write a concise summary of the following text delimited by triple backquotes.\n```\nI think it should be fixed on either UTC standard or UTC+1 year 
around, with the current zone\noffsets. Moving timescales add a lot of complexity to the implementation of timekeeping\nsystems and have [dubious value](I think seasonal shifting time made sense in the pre-electric\npast, when timekeeping was more flexible and artificial light was inefficient and often\ndangerous. Now we have machines that work easily with simple timekeeping rules, and it's more\nbeneficial to spend a small amount on energy for lighting, and save the larger cost of\nengineering things to work with the complex timekeeping rules, as well as saving the\nirritation to humans. Lighting has gotten much more efficient over time; we can squeeze out\na lot more photons per unit of energy from a 2012 CFL or LED than a candle could in 1780,\nor a lightbulb could in 1950. There's a lot of room for improvement in how we use lights\nas well; as lighting control gets more intelligent, there will be a lot of savings from not\nilluminating inactive spaces constantly.\n```\nSUMMARY:\n````\nModel Output:\n```\nThis paragraph talks about the disadvantage of timezones - for engineers and\n```\n\nPrompt: \n````\n### Instruction: Write a concise summary of the following text delimited by triple backquotes.\n\n```\nArt is about the hardest thing to categorize in terms of good and bad. To consider one work\nor artist as dominate over another comes down to personal opinion. Sure some things maybe\nblatantly better than other works, but it ultimately lies with the individual. 
I personally\nenjoy the work of \"street artists\" (using quotations not to be sarcastic, but mainly because\nthis is in a different category than graffiti and since my background is not in art I don't\nknow what the \"proper\" term is , if there is one), but I do see where you are coming from.\nCLET tends to use the same images continuously (to a point where one could say \"Is this\nit?\") as do most street artists (I do think this term is thrown around a lot more than it\nshould be, I agree with you there) and it can be annoying.\n```\n\nSUMMARY: \n````\nModel Output:\n```\nArt is hard to categorize, and artists use similar images to get their point across.\nThere are three different types of people in this world; those who make things happen, those who watch things happen and those who wonder what happened\nThe people who make things happen are proactive and take action to change the world.\nI am always curious to know how the world will change and I am not one to take action as I do more to learn but not to act.\nIf\n```\n\n# Supervised fine tuning\nIn order to format the TLDR dataset for using with TRL training pipelines:\n```\npython format_dataset.py -save_path /llm_summarization/tldr_dataset.jsonl\n```\n\nTo fine-tune llama3 on the custom dataset:\n```\npython finetune.py -load_4bit -quant_type nf4 -dtype float16 -dbl_quant -model_dir /llm_summarization/llama3_hf_format/ -lora_a 32 -lora_drop 0.1 -r 8 -bias none -task_type CAUSAL_LM -target_mods q_proj v_proj -ds_json /llm_summarization/tldr_dataset.json -ds_txt_field prompt -output_dir /llm_summarization/sft_output/ -batch_size 64 -bf16 -max_len 1024 -eval_strat epoch -do_eval\n```\n\n# RLHF\nReformat openAI data for proximal policy optimization:\n```\npython partition_openai.py -feedback_folder /llm_summarization/openai_RLHF_data/comparisons/ -val_prop 0.1 -save_folder /llm_summarization/openai_RLHF_data/ -train_filename train_feedback.json -val_filename val_feedback.json\n```\nTrain the reward 
model:\n```\npython train_reward_model.py -model_folder /llm_summarization/llama3_hf_format/ -train_json /llm_summarization/openai_RLHF_data/train_feedback.json -val_json /llm_summarization/openai_RLHF_data/val_feedback.json -model_save_name /llm_summarization/model_1/ -r 8 -lora_a 32 -lora_dropout 0.1 -load_4bit -quant_type nf4 -dtype float16 -dbl_quant -lr 1e-3 -bf16 -max_len 128 -batch_size 2 -output_dir /llm_summarization/reward_output/ -target_mods q_proj v_proj\n```\nAlign the policy model with proximal policy optimization:\n```\npython rlhf.py -ds_json /llm_summarization/openai_RLHF_data/train_feedback.json -tok_dir /llm_summarization/llama3_hf_format/ -model_save_path /llm_summarization/rlhf_model_1/ -lr 1e-5 -batch_size 1 -mini_batch_size 1 -load_4bit -quant_type nf4 -dtype float16 -dbl_quant -policy_dir /llm_summarization/sft_output/ -reward_dir /llm_summarization/reward_output/\n```\n\n# RAG\nTo retrieve from a data source loaded through langchain, such as WikipediaLoader:\n```\npython rag.py -wiki_query \"dogs\" -max_docs 100 -temp 0.2 -rep_pen 1.1 -max_new_tok 400 -num_ex 3 -chunk_size 512 -chunk_overlap 30 -llm_path /llm_summarization/llama3_hf_format/ -em_model_name BAAI/bge-base-en-v1.5 -tok_path /llm_summarization/llama3_hf_format/ -load_4bit -quant_type nf4 -dtype float16 -dbl_quant\n```\n\nTo retrieve from a custom database of Markdown (.md) files located within a certain folder (e.g. 
/llm_summarization/md_files/):\n```\npython rag.py -custom_ds -md_dir /llm_summarization/md_files/ -max_docs 100 -temp 0.2 -rep_pen 1.1 -max_new_tok 400 -num_ex 3 -chunk_size 512 -chunk_overlap 30 -llm_path /llm_summarization/llama3_hf_format/ -em_model_name BAAI/bge-base-en-v1.5 -tok_path /llm_summarization/llama3_hf_format/ -load_4bit -quant_type nf4 -dtype float16 -dbl_quant\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-f%2Fllm-summarization","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fd-f%2Fllm-summarization","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-f%2Fllm-summarization/lists"}