{"id":13625592,"url":"https://github.com/xfactlab/orpo","last_synced_at":"2025-04-16T06:33:02.878Z","repository":{"id":227102484,"uuid":"770365509","full_name":"xfactlab/orpo","owner":"xfactlab","description":"Official repository for ORPO","archived":false,"fork":false,"pushed_at":"2024-05-31T06:39:39.000Z","size":1956,"stargazers_count":390,"open_issues_count":6,"forks_count":35,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-08-01T22:05:40.536Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xfactlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-11T12:31:38.000Z","updated_at":"2024-08-01T18:29:10.000Z","dependencies_parsed_at":"2024-07-29T09:44:14.686Z","dependency_job_id":null,"html_url":"https://github.com/xfactlab/orpo","commit_stats":null,"previous_names":["xfactlab/orpo"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xfactlab%2Forpo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xfactlab%2Forpo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xfactlab%2Forpo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xfactlab%2Forpo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xfactlab","download_url":"https://codeload.github.com/xfactlab/orpo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https:/
/github.com","kind":"github","repositories_count":223700617,"owners_count":17188368,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T21:01:58.102Z","updated_at":"2024-11-08T14:32:03.514Z","avatar_url":"https://github.com/xfactlab.png","language":"Python","funding_links":[],"categories":["others","A01_文本生成_文本对话","Python"],"sub_categories":["大语言对话模型及数据"],"readme":"# **ORPO**\n\n### **`Updates (24.03.25)`**\n- [X] A sample script for ORPOTrainer in 🤗\u003ca class=\"link\" href=\"https://github.com/huggingface/trl\"\u003eTRL\u003c/a\u003e is added to `trl/test_orpo_trainer_demo.py`\n- [X] A new model, 🤗\u003ca class=\"link\" href=\"https://huggingface.co/kaist-ai/mistral-orpo-capybara-7k\"\u003ekaist-ai/mistral-orpo-capybara-7k\u003c/a\u003e, is added to the 🤗\u003ca class=\"link\" href=\"https://huggingface.co/collections/kaist-ai/orpo-65efef87544ba100aef30013\"\u003eORPO Collection\u003c/a\u003e\n- [X] Now you can try ORPO in 🤗\u003ca class=\"link\" href=\"https://github.com/huggingface/trl\"\u003eTRL\u003c/a\u003e, \u003ca class=\"link\" href=\"https://github.com/OpenAccess-AI-Collective/axolotl\"\u003eAxolotl\u003c/a\u003e and \u003ca class=\"link\" href=\"https://github.com/hiyouga/LLaMA-Factory\"\u003eLLaMA-Factory\u003c/a\u003e🔥\n- [X] We are preparing a general guideline for training LLMs with ORPO; stay tuned🔥\n- [X] **Mistral-ORPO-β** achieved a 14.7% length-controlled (LC) win rate on the \u003ca class=\"link\" href=\"https://tatsu-lab.github.io/alpaca_eval/\"\u003eofficial AlpacaEval Leaderboard\u003c/a\u003e🔥\n\n\u0026nbsp;\n\nThis is the official 
repository for \u003ca class=\"link\" href=\"https://arxiv.org/abs/2403.07691\"\u003e**ORPO: Monolithic Preference Optimization without Reference Model**\u003c/a\u003e. The detailed results in the paper can be found in:\n- [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kaist-ai%2Fmistral-orpo-beta)\n- [AlpacaEval](#alpacaeval)\n- [MT-Bench](#mt-bench)\n- [IFEval](#ifeval)\n\n\n### **`Model Checkpoints`**\n\nOur models trained with ORPO can be found in:\n\n- [X] **Mistral-ORPO-Capybara-7k**: 🤗 \u003ca class=\"link\" href=\"https://huggingface.co/kaist-ai/mistral-orpo-capybara-7k\"\u003ekaist-ai/mistral-orpo-capybara-7k\u003c/a\u003e\n- [X] **Mistral-ORPO-⍺**: 🤗 \u003ca class=\"link\" href=\"https://huggingface.co/kaist-ai/mistral-orpo-alpha\"\u003ekaist-ai/mistral-orpo-alpha\u003c/a\u003e\n- [X] **Mistral-ORPO-β**: 🤗 \u003ca class=\"link\" href=\"https://huggingface.co/kaist-ai/mistral-orpo-beta\"\u003ekaist-ai/mistral-orpo-beta\u003c/a\u003e\n\nAnd the corresponding logs for the average log probabilities of chosen/rejected responses during training are reported in:\n\n- [X] **Mistral-ORPO-Capybara-7k**: TBU\n- [X] **Mistral-ORPO-⍺**: \u003ca class=\"link\" href=\"https://wandb.ai/jiwooya1000/PREF/reports/Mistral-ORPO-7B-Training-Log--Vmlldzo3MTE1NzE0?accessToken=rms6o4mg5vo3feu1bvbpk632m4cspe19l0u1p4he3othx5bgean82chn9neiile6\"\u003eWandb Report for Mistral-ORPO-⍺\u003c/a\u003e\n- [X] **Mistral-ORPO-β**: \u003ca class=\"link\" href=\"https://wandb.ai/jiwooya1000/PREF/reports/Mistral-ORPO-7B-Training-Log--Vmlldzo3MTE3MzMy?accessToken=dij4qbp6dcrofsanzbgobjsne9el8a2zkly2u5z82rxisd4wiwv1rhp0s2dub11e\"\u003eWandb Report for Mistral-ORPO-β\u003c/a\u003e\n\n\u0026nbsp;\n\n### **`AlpacaEval`**\n\n\u003cfigure\u003e\n  \u003cimg class=\"png\" src=\"/assets/img/alpaca_blog.png\" alt=\"Description of the image\"\u003e\n  \u003cfigcaption\u003e\u003cb\u003eFigure 1.\u003c/b\u003e AlpacaEval 2.0 score for the models trained with 
different alignment methods.\u003c/figcaption\u003e\n\u003c/figure\u003e\n\n\u0026nbsp;\n\n### **`MT-Bench`**\n\n\u003cfigure\u003e\n  \u003cimg class=\"png\" src=\"/assets/img/mtbench_hf.png\" alt=\"Description of the image\"\u003e\n  \u003cfigcaption\u003e\u003cb\u003eFigure 2.\u003c/b\u003e MT-Bench results by category.\u003c/figcaption\u003e\n\u003c/figure\u003e\n\n\u0026nbsp;\n\n### **`IFEval`**\n\nIFEval scores are measured with \u003ca class=\"link\" href=\"https://github.com/EleutherAI/lm-evaluation-harness\"\u003eEleutherAI/lm-evaluation-harness\u003c/a\u003e by applying the chat template. The scores for Llama-2-Chat (70B), Zephyr-β (7B), and Mixtral-8X7B-Instruct-v0.1 were originally reported in \u003ca class=\"link\" href=\"https://twitter.com/wiskojo/status/1739767758462877823\"\u003ethis tweet\u003c/a\u003e.\n\n| **Model Type**     | **Prompt-Strict** | **Prompt-Loose** | **Inst-Strict** | **Inst-Loose** |\n|--------------------|:-----------------:|:----------------:|:---------------:|:--------------:|\n| **Llama-2-Chat (70B)** |       0.4436      |      0.5342      |      0.5468     |     0.6319     |\n| **Zephyr-β (7B)** |       0.4233      |      0.4547      |      0.5492     |     0.5767     |\n| **Mixtral-8X7B-Instruct-v0.1** |       0.5213      |      **0.5712**      |      0.6343     |     **0.6823**     |\n| **Mistral-ORPO-⍺ (7B)** |       0.5009      |      0.5083      |      0.5995     |     0.6163     |\n| **Mistral-ORPO-β (7B)** |       **0.5287**      |      0.5564      |      **0.6355**     |     0.6619     |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxfactlab%2Forpo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxfactlab%2Forpo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxfactlab%2Forpo/lists"}