<div align="center">

<h1>LLM-Merging: Building LLMs Efficiently through Merging</h1>

[![](https://img.shields.io/badge/Documentation-online-green)](https://llm-merging.readthedocs.io)
[![](https://img.shields.io/badge/Website-online-green)](https://llm-merging.github.io)
[![](https://img.shields.io/badge/License-MIT-blue)](#License)
</div>

This repository contains the starter code for the LLM-Merging competition.

## Important Tips

1. Please do not specify any device_id in the code, because the device_id might not hold in our setup. If you need to pin a device in your own setup, use an environment variable instead:
```bash
export CUDA_VISIBLE_DEVICES=0
```
2. Please do not hard-code any filepaths, because they may not be the same in our setup. If you need to set the HuggingFace cache, use an environment variable:
```bash
export HUGGINGFACE_HUB_CACHE=/tmp/
```
and then read this path in Python via
```python
path = os.environ["HUGGINGFACE_HUB_CACHE"]
```
3. When running `tar` on this repo to submit it, please ensure the directory is still called `LLM-Merging` and has not been renamed; a renamed directory can cause issues when we evaluate your submission.

## Setup Environment

The library was tested with CUDA 10.1 on an A6000.

```bash
conda env create -f environment.yml --name llm-merging
conda activate llm-merging
export PYTHONPATH=`pwd`
```

Authentication tokens are required for certain models, such as Llama2, which require users to agree to specific terms. You can find your authentication token [here](https://huggingface.co/settings/tokens).

```bash
export HF_AUTH_TOKEN=""
```

## Developing New Merging Methods

You may modify the starter kit code, subject to the terms listed under the Submissions section.

1. To add a new merging method, create a new file in `llm_merging/merging`.

    This file should extend `llm_merging/merging/Merges` and implement the `__init__()` and `merge()` functions.
    See `llm_merging/merging/FlanT5Avg.py`, `llm_merging/merging/LlamaAvg.py`, and `llm_merging/merging/TinyLlamaAvg.py` for examples.

2. Add the new merging method to the dictionary returned by `all_merge_handlers()` in `llm_merging/main.py`.

3. Add the new module to `llm_merging/merging/__init__.py`.

4. Add any additional required libraries to `setup.py`.

## Test Method

```bash
python llm_merging/setup.py install
python llm_merging/main.py -m {merging_method}
```

The validation dataset (consisting of CosmosQA and XSum) is included mainly to ensure that the merging method (with evaluation on those datasets) runs within the 1-hour time limit. Our results for `llama_avg` are `{"cosmos_qa": {"accuracy": 0.234}, "xsum": {"rouge1": 0.123, "rouge2": 0.023, "rougeL": 0.093, "rougeLsum": 0.102}}`, which takes about 25 minutes on our A6000.

## Submissions

Most modifications to the starter kit are allowed. In general, any change that honors the spirit of the competition, namely understanding how best to merge models, will be allowed. For example, modifying the generation code to dynamically select a prompt is allowed, whether the code is your own or imported. Changes that are not allowed include optimizations aimed solely at making the forward pass of a model faster, because squeezing out such speedups contributes nothing to conclusions about the merging method.

You may use any publicly available library/module, as long as we can also install it easily from a requirements or dependency list. You may finetune the model, but keep in mind that the compute limits apply to the finetuning stage as well, and that you will eventually need to run your code, exactly as submitted, on a held-out test set that you will not have access to until after you finalize and submit your code.

### How to submit

You must submit the output file on Kaggle, and the model files via the instructions below.

First, generate the output file using the input dataset found in `data/test.csv`, and name your output file `submission.csv`.
To submit to Kaggle, go to our [Kaggle competition site](https://www.kaggle.com/competitions/llm-merging-competition/overview), click `Submit Prediction`, and upload your `submission.csv`.

Next, tar this repo for submission:

```bash
tar -cvf {merging_method}.tar LLM-Merging
```

Submit the tar file using this [form](https://docs.google.com/forms/d/17TPg7N02o8qvw1czx55Zbh_5Kp7-YStUIOhQDJYc23g/).

## Leaderboard

The leaderboard in use is on our [Kaggle competition site](https://www.kaggle.com/competitions/llm-merging-competition/overview).
The leaderboard's standings are *not* final. The final results will be calculated after the conclusion of the competition. At that point, we will release the inputs for our final held-out evaluation, and you will have a week to run your model code on this input. The input will be in the same format as the `test.csv` file in this competition. You will then be responsible for submitting this final output file to us. For all top placers, we will verify that the code submitted via the form before the closing of the competition does indeed yield your final submission csv.

The old leaderboard of submitted solutions can be found [here](https://huggingface.co/spaces/margsli/merging_competition).
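
As a minimal sketch of the portability tip about filepaths: instead of hard-coding a cache location, resolve it from `HUGGINGFACE_HUB_CACHE` at runtime. The helper name `hf_cache_dir` and the fallback default are our own illustration, not part of the starter kit.

```python
import os


def hf_cache_dir(default="/tmp/hf_cache"):
    """Resolve the HuggingFace cache directory from the environment.

    Reading the path from HUGGINGFACE_HUB_CACHE (rather than hard-coding
    a filepath) keeps the code portable across machines, as the
    competition tips require. The default is only a local fallback.
    """
    return os.environ.get("HUGGINGFACE_HUB_CACHE", default)
```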
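
The new-merging-method contract (step 1 under "Developing New Merging Methods") could look roughly like the sketch below. The `Merges` base class and the `__init__()`/`merge()` method names come from the starter kit; the stand-in base class, the `UniformAvg` name, and the elementwise state-dict averaging are purely illustrative assumptions, not the actual starter-kit API.

```python
class Merges:
    """Stand-in for llm_merging.merging.Merges (illustrative only)."""

    def __init__(self, name):
        self.name = name


class UniformAvg(Merges):
    """Hypothetical merging method: uniform parameter averaging."""

    def __init__(self, name):
        super().__init__(name)

    def merge(self, state_dicts):
        # Average the parameters of several checkpoints elementwise;
        # all state dicts are assumed to share the same keys and shapes.
        keys = state_dicts[0].keys()
        return {
            k: sum(sd[k] for sd in state_dicts) / len(state_dicts)
            for k in keys
        }
```

In the real starter kit, `merge()` would operate on model tensors loaded in `__init__()` rather than plain floats, but the subclassing shape is the same.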
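
Registration (step 2) amounts to adding one entry to the dictionary returned by `all_merge_handlers()` in `llm_merging/main.py`. The function name comes from the starter kit; the placeholder classes and key names below are illustrative.

```python
# Placeholder classes standing in for the real merging modules
# (FlanT5Avg.py, LlamaAvg.py, and a hypothetical new method).
class FlanT5Avg:
    pass


class LlamaAvg:
    pass


class UniformAvg:
    pass


def all_merge_handlers():
    # Maps the name passed via `main.py -m {merging_method}` to the
    # class implementing that merging method.
    return {
        "flan_t5_avg": FlanT5Avg,
        "llama_avg": LlamaAvg,
        "uniform_avg": UniformAvg,  # the new entry for step 2
    }
```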