{"id":14964629,"url":"https://github.com/tencentarc/llama-pro","last_synced_at":"2025-04-05T02:10:14.964Z","repository":{"id":215459761,"uuid":"738033067","full_name":"TencentARC/LLaMA-Pro","owner":"TencentARC","description":"[ACL 2024] Progressive LLaMA with Block Expansion.","archived":false,"fork":false,"pushed_at":"2024-05-20T04:51:44.000Z","size":84640,"stargazers_count":476,"open_issues_count":22,"forks_count":35,"subscribers_count":20,"default_branch":"main","last_synced_at":"2024-10-29T17:12:22.717Z","etag":null,"topics":["llama","llama2","llm"],"latest_commit_sha":null,"homepage":"https://tencentarc.github.io/LLaMA-Pro/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TencentARC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-02T08:57:19.000Z","updated_at":"2024-10-29T13:14:04.000Z","dependencies_parsed_at":"2024-01-09T17:25:52.554Z","dependency_job_id":"2e09bbc8-5e6b-4e50-8bfc-376acefa91c1","html_url":"https://github.com/TencentARC/LLaMA-Pro","commit_stats":{"total_commits":21,"total_committers":4,"mean_commits":5.25,"dds":0.6190476190476191,"last_synced_commit":"bead65718f9d2c7877a170f953ead24d2cc53b91"},"previous_names":["tencentarc/llama-pro"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FLLaMA-Pro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FLLaMA-Pro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2F
LLaMA-Pro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FLLaMA-Pro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TencentARC","download_url":"https://codeload.github.com/TencentARC/LLaMA-Pro/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247276189,"owners_count":20912288,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llama","llama2","llm"],"created_at":"2024-09-24T13:33:32.488Z","updated_at":"2025-04-05T02:10:14.941Z","avatar_url":"https://github.com/TencentARC.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"#   \u003cimg src=\"assets/icon.png\" width = \"20\" height = \"40\" alt=\"图片名称\" align=center /\u003e LLaMA Pro: Progressive LLaMA with Block Expansion\n\u003cp align=\"center\"\u003e\n📃 \u003ca href=\"https://arxiv.org/abs/2401.02415\" target=\"_blank\"\u003ePaper\u003c/a\u003e • 🤗 \u003ca href=\"https://huggingface.co/TencentARC/LLaMA-Pro-8B\" target=\"_blank\"\u003eDemo \u0026 Model\u003c/a\u003e \n\u003c/p\u003e\n\n## News\n* [2024/01/06] We open source the [LLaMA-Pro repository](https://github.com/TencentARC/LLaMA-Pro) and [Demo \u0026 Model](https://huggingface.co/TencentARC/LLaMA-Pro-8B). 
\n* [2024/01/07] Add how to run gradio demo locally in [demo](./demo/app.py)\n* [2024/01/18] Add the training code in [open-instruct](https://github.com/hills-code/open-instruct/tree/llama-pro).\n* [2024/02/23] We release the [Mistral-Pro-8B-v0.1](https://huggingface.co/TencentARC/Mistral_Pro_8B_v0.1) with superior performance on a range of benchmarks. It enhances the code and math performance of Mistral and matches the performance of the recently dominant model, [Gemma](https://huggingface.co/google/gemma-7b).\n![assets/mistral_pro_performance.png](assets/mistral_pro_performance.png)\n* [2024/02/23] We release the evaluation code of [Mistral-Pro-8B-v0.1](https://huggingface.co/TencentARC/Mistral_Pro_8B_v0.1) in [lm-evaluation-harness](https://github.com/hills-code/lm-evaluation-harness).\n* [2024/02/23] We release [MetaMath-Mistral-Pro](https://huggingface.co/TencentARC/MetaMath-Mistral-Pro) that surpasses previous MetaMath series 7B models on both GSM8k and MATH. The evaluation follows [the official MetaMath repo](https://github.com/meta-math/MetaMath).\n* [2024/05/08] Add the pre-train example script for cosmopedia in [open-instruct](https://github.com/hills-code/open-instruct/tree/llama-pro).\n* [2024/05/16] [LLaMA Pro](https://arxiv.org/abs/2401.02415) has been accepted to the main conference of ACL 2024!\n\n\n🔥 Comprehensive Results\n\n| Model               | GSM8k Pass@1 | MATH Pass@1 |\n|---------------------|--------------|-------------|\n| WizardMath-7B       | 54.9         | 10.7        |\n| LLaMA-2-70B         | 56.8         | 13.5        |\n| WizardMath-13B      | 63.9         | 14.0        |\n| MetaMath-7B         | 66.5     | 19.8    |\n| MetaMath-13B        | 72.3     | 22.4    |\n| MetaMath-Mistral-7B | 77.7     | 28.2    |\n| MetaMath-Llemma-7B  | 69.2     | 30.0    |\n| 🔥 **MetaMath-Mistral-Pro** | **78.4**     | **30.3**        |\n\n## Acknowledgement\nThe code of instruction tuning is based on the official implementation of 
[open-instruct](https://github.com/allenai/open-instruct).\n\nThanks to [huggingface](https://huggingface.co/TencentARC/LLaMA-Pro-8B) \u0026 [wisemodel](https://wisemodel.cn/models/TencentARC/LLaMA-Pro-8B) for hosting our checkpoint.\n\n## Citation\nThe code and model in this repository are mostly developed for or derived from the paper below. Please cite it if you find the repository helpful.\n```\n@article{wu2024llama,\n  title={Llama pro: Progressive llama with block expansion},\n  author={Wu, Chengyue and Gan, Yukang and Ge, Yixiao and Lu, Zeyu and Wang, Jiahao and Feng, Ye and Luo, Ping and Shan, Ying},\n  journal={arXiv preprint arXiv:2401.02415},\n  year={2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftencentarc%2Fllama-pro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftencentarc%2Fllama-pro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftencentarc%2Fllama-pro/lists"}