{"id":18183153,"url":"https://github.com/PKU-Alignment/align-anything","last_synced_at":"2025-04-01T21:31:05.982Z","repository":{"id":248375264,"uuid":"828505099","full_name":"PKU-Alignment/align-anything","owner":"PKU-Alignment","description":"Align Anything: Training All-modality Model with Feedback","archived":false,"fork":false,"pushed_at":"2025-03-30T09:06:45.000Z","size":63841,"stargazers_count":3131,"open_issues_count":21,"forks_count":395,"subscribers_count":260,"default_branch":"main","last_synced_at":"2025-03-30T10:19:41.453Z","etag":null,"topics":["chameleon","dpo","large-language-models","multimodal","rlhf","vision-language-model"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2412.15838","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PKU-Alignment.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-14T11:05:19.000Z","updated_at":"2025-03-30T09:40:14.000Z","dependencies_parsed_at":"2024-09-12T11:30:27.310Z","dependency_job_id":"1c73b7d1-5db3-466d-a530-fc34a7a997b1","html_url":"https://github.com/PKU-Alignment/align-anything","commit_stats":null,"previous_names":["pku-alignment/align-anything"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Alignment%2Falign-anything","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Alignment%2Falign-anything/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/PKU-Alignment%2Falign-anything/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Alignment%2Falign-anything/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PKU-Alignment","download_url":"https://codeload.github.com/PKU-Alignment/align-anything/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246712989,"owners_count":20821827,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chameleon","dpo","large-language-models","multimodal","rlhf","vision-language-model"],"created_at":"2024-11-02T20:00:38.980Z","updated_at":"2025-04-01T21:31:00.972Z","avatar_url":"https://github.com/PKU-Alignment.png","language":"Python","funding_links":[],"categories":["Jupyter Notebook","A01_文本生成_文本对话","7. 
Training \u0026 Fine-tuning Ecosystem","Industry Strength Natural Language Processing","Papers","Python"],"sub_categories":["大语言对话模型及数据","2024"],"readme":"\u003c!-- markdownlint-disable first-line-h1 --\u003e\r\n\u003c!-- markdownlint-disable html --\u003e\r\n\r\n\u003cdiv align=\"center\"\u003e\r\n  \u003cimg src=\"assets/logo.jpg\" width=\"390\"/\u003e\r\n  \u003cdiv\u003e\u0026nbsp;\u003c/div\u003e\r\n  \u003cdiv align=\"center\"\u003e\r\n    \u003cb\u003e\u003cfont size=\"5\"\u003eproject website\u003c/font\u003e\u003c/b\u003e\r\n    \u003csup\u003e\r\n      \u003ca href=\"https://space.bilibili.com/3493095748405551?spm_id_from=333.337.search-card.all.click\"\u003e\r\n        \u003ci\u003e\u003cfont size=\"4\"\u003eHOT\u003c/font\u003e\u003c/i\u003e\r\n      \u003c/a\u003e\r\n    \u003c/sup\u003e\r\n    \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\r\n    \u003cb\u003e\u003cfont size=\"5\"\u003ePKU-Alignment Team\u003c/font\u003e\u003c/b\u003e\r\n    \u003csup\u003e\r\n      \u003ca href=\"https://space.bilibili.com/3493095748405551?spm_id_from=333.337.search-card.all.click\"\u003e\r\n        \u003ci\u003e\u003cfont size=\"4\"\u003ewelcome\u003c/font\u003e\u003c/i\u003e\r\n      \u003c/a\u003e\r\n    \u003c/sup\u003e\r\n  \u003c/div\u003e\r\n  \u003cdiv\u003e\u0026nbsp;\u003c/div\u003e\r\n\r\n[![PyPI](https://img.shields.io/pypi/v/align-anything?logo=pypi)](https://pypi.org/project/align-anything)\r\n[![License](https://img.shields.io/github/license/PKU-Alignment/align-anything?label=license)](#license)\r\n\r\n[📘Documentation](https://pku-alignment.notion.site/Align-Anything-37a300fb5f774bb08e5b21fdeb476c64) |\r\n[🆕Update News](#news) |\r\n[🛠️Quick Start](#quick-start) |\r\n[🚀Algorithms](#algorithms) |\r\n[👀Evaluation](#evaluation) |\r\n[🤔Reporting Issues](#report-issues)\r\n\u003c/div\u003e\r\n\r\n\u003cdiv align=\"center\"\u003e\r\n\r\n[Our 100K Instruction-Following 
Datasets](https://huggingface.co/datasets/PKU-Alignment/Align-Anything-Instruction-100K)\r\n\r\n\u003c/div\u003e\r\n\r\nAlign-Anything aims to align any modality large models (any-to-any models), including LLMs, VLMs, and others, with human intentions and values. More details about the definition and milestones of alignment for Large Models can be found in [AI Alignment](https://alignmentsurvey.com). Overall, this framework has the following characteristics:\r\n\r\n- **Highly Modular Framework.** Its versatility stems from the abstraction of different algorithm types and well-designed APIs, allowing users to easily modify and customize the code for different tasks.\r\n- **Support for Various Model Fine-Tuning.** This framework includes fine-tuning capabilities for models such as LLaMA3.1, LLaVA, Gemma, Qwen, Baichuan, and others (see [Model Zoo](https://github.com/PKU-Alignment/align-anything/blob/main/Model-Zoo.md)).\r\n- **Support Fine-Tuning across Any Modality.** It supports fine-tuning alignments for different modality model, including LLMs, VLMs, and other modalities (see [Development Roadmap](#development-roadmap)).\r\n- **Support Different Alignment Methods.** The framework supports different alignment algorithms, including SFT, DPO, PPO, and others.\r\n\r\n\r\n|| \u003cdetails\u003e\u003csummary\u003eprompt\u003c/summary\u003eSmall white toilet sitting in a small corner next to a wall.\u003c/details\u003e | \u003cdetails\u003e\u003csummary\u003eprompt\u003c/summary\u003eA close up of a neatly made bed with two night stands\u003c/details\u003e  | \u003cdetails\u003e\u003csummary\u003eprompt\u003c/summary\u003eA pizza is sitting on a plate at a restaurant.\u003c/details\u003e |\u003cdetails\u003e\u003csummary\u003eprompt\u003c/summary\u003eA girl in a dress next to a piece of luggage and flowers.\u003c/details\u003e|\r\n|---| ---------------------------------- | --- | --- | --- |\r\n|Before Alignment 
([Chameleon-7B](https://huggingface.co/facebook/chameleon-7b))| \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/before/1.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e | \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/before/2.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e | \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/before/3.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e  | \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/before/4.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e|\r\n|**After Alignment ([Chameleon 7B Plus](https://huggingface.co/PKU-Alignment/AA-chameleon-7b-plus))**| \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/after/1.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e | \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/after/2.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e | \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/after/3.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e  | \u003cimg src=\"https://github.com/Gaiejj/align-anything-images/blob/main/chameleon/after/4.png?raw=true\" alt=\"Image 8\" style=\"max-width: 100%; height: auto;\"\u003e|\r\n\r\n\u003e Alignment fine-tuning can significantly enhance the instruction-following capabilities of large multimodal models. After fine-tuning, Chameleon 7B Plus generates images that are more relevant to the prompt.\r\n\r\n## Algorithms\r\nWe support basic alignment algorithms for different modalities, each of which may involve additional algorithms. 
For instance, in the text modality, we have also implemented SimPO, KTO, and others.\r\n\r\n| Modality                           | SFT | RM  | DPO | PPO |\r\n| ---------------------------------- | --- | --- | --- | --- |\r\n| `Text -\u003e Text (t2t)`               | ✔️   | ✔️   | ✔️   | ✔️   |\r\n| `Text+Image -\u003e Text (ti2t)`        | ✔️   | ✔️   | ✔️   | ✔️   |\r\n| `Text+Image -\u003e Text+Image (ti2ti)` | ✔️   | ✔️   | ✔️   | ✔️   |\r\n| `Text+Audio -\u003e Text (ta2t)`        | ✔️   | ✔️   | ✔️   | ✔️   |\r\n| `Text+Video -\u003e Text (tv2t)`        | ✔️   | ✔️   | ✔️   | ✔️   |\r\n| `Text -\u003e Image (t2i)`              | ✔️   | ⚒️   | ✔️   | ⚒️   |\r\n| `Text -\u003e Video (t2v)`              | ✔️   | ⚒️   | ✔️   | ⚒️   |\r\n| `Text -\u003e Audio (t2a)`              | ✔️   | ⚒️   | ✔️   | ⚒️   |\r\n\r\n## Evaluation\r\nWe support evaluation datasets for `Text -\u003e Text`, `Text+Image -\u003e Text` and `Text -\u003e Image`.\r\n\r\n| Modality              | Supported Benchmarks                                                  |\r\n| :-------------------- | :----------------------------------------------------------- |\r\n| `t2t`       | [ARC](https://huggingface.co/datasets/allenai/ai2_arc), [BBH](https://huggingface.co/datasets/lukaemon/bbh), [Belebele](https://huggingface.co/datasets/facebook/belebele), [CMMLU](https://huggingface.co/datasets/haonan-li/cmmlu), [GSM8K](https://huggingface.co/datasets/openai/gsm8k), [HumanEval](https://huggingface.co/datasets/openai/openai_humaneval), [MMLU](https://huggingface.co/datasets/cais/mmlu), [MMLU-Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro), [MT-Bench](https://huggingface.co/datasets/HuggingFaceH4/mt_bench_prompts), [PAWS-X](https://huggingface.co/datasets/google-research-datasets/paws-x), [RACE](https://huggingface.co/datasets/ehovy/race), [TruthfulQA ](https://huggingface.co/datasets/truthfulqa/truthful_qa) |\r\n| `ti2t` | [A-OKVQA](https://huggingface.co/datasets/HuggingFaceM4/A-OKVQA), 
[LLaVA-Bench(COCO)](https://huggingface.co/datasets/lmms-lab/llava-bench-coco), [LLaVA-Bench(wild)](https://huggingface.co/datasets/lmms-lab/llava-bench-in-the-wild), [MathVista](https://huggingface.co/datasets/AI4Math/MathVista), [MM-SafetyBench](https://huggingface.co/datasets/PKU-Alignment/MM-SafetyBench), [MMBench](https://huggingface.co/datasets/lmms-lab/MMBench), [MME](https://huggingface.co/datasets/lmms-lab/MME), [MMMU](https://huggingface.co/datasets/MMMU/MMMU), [MMStar](https://huggingface.co/datasets/Lin-Chen/MMStar), [MMVet](https://huggingface.co/datasets/lmms-lab/MMVet), [POPE](https://huggingface.co/datasets/lmms-lab/POPE), [ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA), [SPA-VL](https://huggingface.co/datasets/sqrti/SPA-VL), [TextVQA](https://huggingface.co/datasets/lmms-lab/textvqa), [VizWizVQA](https://huggingface.co/datasets/lmms-lab/VizWiz-VQA) |\r\n|`tv2t` |[MVBench](https://huggingface.co/datasets/OpenGVLab/MVBench), [Video-MME](https://huggingface.co/datasets/lmms-lab/Video-MME) |\r\n|`ta2t` |[AIR-Bench](https://huggingface.co/datasets/qyang1021/AIR-Bench-Dataset) |\r\n| `t2i`      | [ImageReward](https://huggingface.co/datasets/THUDM/ImageRewardDB), [HPSv2](https://huggingface.co/datasets/zhwang/HPDv2), [COCO-30k(FID)](https://huggingface.co/datasets/sayakpaul/coco-30-val-2014) |\r\n| `t2v`      | [ChronoMagic-Bench](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench) |\r\n| `t2a`      | [AudioCaps(FAD)](https://huggingface.co/datasets/AudioLLMs/audiocaps_test) |\r\n\r\n- ⚒️ : coming soon.\r\n\r\n# News\r\n\r\n- 2024-10-10: We support SFT for `Any -\u003e Any` modality models Emu3.\r\n- 2024-09-24: We support SFT, DPO, RM and PPO for `Text + Video -\u003e Text` modality models.\r\n- 2024-09-13: We support SFT, DPO, RM and PPO for `Text + Audio -\u003e Text` modality models.\r\n- 2024-08-17: We support DPO and PPO for `Text+Image -\u003e Text+Image` modality models.\r\n- 2024-08-15: We support a new function in 
the evaluation module: the [`models_pk` script](./scripts/models_pk.sh), which enables comparing the performance of two models across different benchmarks.\r\n- 2024-08-06: We restructure the framework to support evaluation of any modality; the list of supported benchmarks is [here](https://github.com/PKU-Alignment/align-anything/tree/main/align_anything/evaluation/benchmarks).\r\n- 2024-08-06: We support `Text+Image -\u003e Text+Image` modality for the SFT trainer and Chameleon models.\r\n\u003cdetails\u003e\u003csummary\u003eMore News\u003c/summary\u003e\r\n\r\n- 2024-07-23: We support `Text -\u003e Image`, `Text -\u003e Audio`, and `Text -\u003e Video` modalities for the SFT trainer and DPO trainer.\r\n- 2024-07-22: We support the **Chameleon** model for the SFT trainer and DPO trainer!\r\n- 2024-07-17: We open-source the Align-Anything-Instruction-100K dataset for text modality. This dataset is available in both [English](https://huggingface.co/datasets/PKU-Alignment/Align-Anything-Instruction-100K) and [Chinese](https://huggingface.co/datasets/PKU-Alignment/Align-Anything-Instruction-100K-zh) versions, each sourced from different datasets and meticulously refined for quality by GPT-4.\r\n- 2024-07-14: We open-source the align-anything framework.\r\n\r\n\u003c/details\u003e\r\n\r\n# Installation\r\n\r\n\r\n```bash\r\n# clone the repository\r\ngit clone git@github.com:PKU-Alignment/align-anything.git\r\ncd align-anything\r\n\r\n# create virtual env\r\nconda create -n align-anything python==3.11\r\nconda activate align-anything\r\n```\r\n\r\n- **`[Optional]`** We recommend installing [CUDA](https://anaconda.org/nvidia/cuda) in the conda environment and setting the `CUDA_HOME` environment variable.\r\n\r\n```bash\r\n# We tested on the H800 computing cluster, and this version of CUDA works well. 
\r\n# You can adjust this version according to the actual situation of the computing cluster.\r\n\r\nconda install nvidia/label/cuda-12.2.0::cuda\r\nexport CUDA_HOME=$CONDA_PREFIX\r\n```\r\n\r\n\u003e If your CUDA is installed in a different location, such as `/usr/local/cuda/bin/nvcc`, you can set the environment variable as follows:\r\n\r\n```bash\r\nexport CUDA_HOME=\"/usr/local/cuda\"\r\n```\r\n\r\nFinally, install `align-anything` by:\r\n\r\n```bash\r\npip install -e .\r\n```\r\n\r\n## Wandb Logger\r\n\r\nWe support `wandb` logging. By default, it is set to offline. If you need to view wandb logs online, you can set the `WANDB_API_KEY` environment variable before starting the training:\r\n\r\n```bash\r\nexport WANDB_API_KEY=\"...\"  # your W\u0026B API key here\r\n```\r\n\r\n\u003c!-- ## Install from Dockerfile\r\n\r\n1. build docker image\r\n\r\n\r\n```bash\r\nFROM nvcr.io/nvidia/pytorch:24.02-py3\r\n\r\nRUN echo \"export PS1='[\\[\\e[1;33m\\]\\u\\[\\e[0m\\]:\\[\\e[1;35m\\]\\w\\[\\e[0m\\]]\\$ '\" \u003e\u003e ~/.bashrc\r\n\r\nWORKDIR /root/align-anything\r\nCOPY . .\r\n\r\nRUN python -m pip install --upgrade pip \\\r\n    \u0026\u0026 pip install -e .\r\n```\r\n\r\nthen,\r\n\r\n```bash\r\ndocker build --tag align-anything .\r\n```\r\n\r\n2. run the container\r\n\r\n```bash\r\ndocker run -it --rm \\\r\n    --gpus all \\\r\n    --ipc=host \\\r\n    --ulimit memlock=-1 \\\r\n    --ulimit stack=67108864 \\\r\n    --mount type=bind,source=\u003chost's mode path\u003e,target=\u003cdocker's mode path\u003e \\\r\n    align-anything\r\n``` --\u003e\r\n\r\n\r\n# Quick Start\r\n\r\n## Training Scripts\r\n\r\nTo prepare for training, all scripts are located in the `./scripts` directory, and parameters that require user input have been left empty. 
For example, the DPO script for the `Text + Image -\u003e Text` modality is as follows:\r\n\r\n```bash\r\nMODEL_NAME_OR_PATH=\"\" # model path\r\nTRAIN_DATASETS=\"\" # dataset path\r\nTRAIN_TEMPLATE=\"\" # dataset template\r\nTRAIN_SPLIT=\"\" # split the dataset\r\nOUTPUT_DIR=\"\"  # output dir\r\n\r\nsource ./setup.sh # source the setup script\r\n\r\nexport CUDA_HOME=$CONDA_PREFIX # replace it with your CUDA path\r\n\r\ndeepspeed \\\r\n\t--master_port ${MASTER_PORT} \\\r\n\t--module align_anything.trainers.text_image_to_text.dpo \\\r\n\t--model_name_or_path ${MODEL_NAME_OR_PATH} \\\r\n\t--train_datasets ${TRAIN_DATASETS} \\\r\n\t--train_template ${TRAIN_TEMPLATE} \\\r\n\t--train_split ${TRAIN_SPLIT} \\\r\n\t--output_dir ${OUTPUT_DIR}\r\n```\r\n\r\nWe can run DPO with [LLaVA-v1.5-7B](https://huggingface.co/llava-hf/llava-1.5-7b-hf) (HF format) and the [SPA-VL](https://huggingface.co/datasets/sqrti/SPA-VL) dataset using the following script:\r\n\r\n```bash\r\nMODEL_NAME_OR_PATH=\"llava-hf/llava-1.5-7b-hf\" # model path\r\nTRAIN_DATASETS=\"sqrti/SPA-VL\" # dataset path\r\nTRAIN_TEMPLATE=\"SPA_VL\" # dataset template\r\nTRAIN_SPLIT=\"train\" # split the dataset\r\nOUTPUT_DIR=\"../output/dpo\" # output dir\r\nexport WANDB_API_KEY=\"YOUR_WANDB_KEY\" # wandb logging\r\n\r\nsource ./setup.sh # source the setup script\r\n\r\nexport CUDA_HOME=$CONDA_PREFIX # replace it with your CUDA path\r\n\r\ndeepspeed \\\r\n\t--master_port ${MASTER_PORT} \\\r\n\t--module align_anything.trainers.text_image_to_text.dpo \\\r\n\t--model_name_or_path ${MODEL_NAME_OR_PATH} \\\r\n\t--train_datasets ${TRAIN_DATASETS} \\\r\n\t--train_template ${TRAIN_TEMPLATE} \\\r\n\t--train_split ${TRAIN_SPLIT} \\\r\n\t--output_dir ${OUTPUT_DIR}\r\n```\r\n\r\n## Evaluation\r\n\r\nAll evaluation scripts can be found in the `./scripts` directory. The `./scripts/evaluate.sh` script runs model evaluation on the benchmarks, and parameters that require user input have been left empty. 
The corresponding script is as follows:\r\n\r\n```bash\r\nSCRIPT_DIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )\" \u0026\u0026 pwd )\"\r\ncd \"${SCRIPT_DIR}/../align_anything/evaluation\" || exit 1\r\n\r\nBENCHMARKS=(\"\") # evaluation benchmarks\r\nOUTPUT_DIR=\"\" # output dir\r\nGENERATION_BACKEND=\"\" # generation backend\r\nMODEL_ID=\"\" # model's unique id\r\nMODEL_NAME_OR_PATH=\"\" # model path\r\nCHAT_TEMPLATE=\"\" # model template\r\n\r\nfor BENCHMARK in \"${BENCHMARKS[@]}\"; do\r\n    python __main__.py \\\r\n        --benchmark ${BENCHMARK} \\\r\n        --output_dir ${OUTPUT_DIR} \\\r\n        --generation_backend ${GENERATION_BACKEND} \\\r\n        --model_id ${MODEL_ID} \\\r\n        --model_name_or_path ${MODEL_NAME_OR_PATH} \\\r\n        --chat_template ${CHAT_TEMPLATE}\r\ndone\r\n```\r\n\r\nFor example, you can evaluate [LLaVA-v1.5-7B](https://huggingface.co/llava-hf/llava-1.5-7b-hf) (HF format) on the [POPE](https://huggingface.co/datasets/lmms-lab/POPE) and [MM-SafetyBench](https://huggingface.co/datasets/PKU-Alignment/MM-SafetyBench) benchmarks using the following script:\r\n\r\n```bash\r\nSCRIPT_DIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )\" \u0026\u0026 pwd )\"\r\ncd \"${SCRIPT_DIR}/../align_anything/evaluation\" || exit 1\r\n\r\nBENCHMARKS=(\"POPE\" \"MM-SafetyBench\") # evaluation benchmarks\r\nOUTPUT_DIR=\"../output/evaluation\" # output dir\r\nGENERATION_BACKEND=\"vLLM\" # generation backend\r\nMODEL_ID=\"llava-1.5-7b-hf\" # model's unique id\r\nMODEL_NAME_OR_PATH=\"llava-hf/llava-1.5-7b-hf\" # model path\r\nCHAT_TEMPLATE=\"Llava\" # model template\r\n\r\nfor BENCHMARK in \"${BENCHMARKS[@]}\"; do\r\n    python __main__.py \\\r\n        --benchmark ${BENCHMARK} \\\r\n        --output_dir ${OUTPUT_DIR} \\\r\n        --generation_backend ${GENERATION_BACKEND} \\\r\n        --model_id ${MODEL_ID} \\\r\n        --model_name_or_path ${MODEL_NAME_OR_PATH} \\\r\n        --chat_template ${CHAT_TEMPLATE}\r\ndone\r\n```\r\n\r\nYou can modify the 
configuration files for the benchmarks in [this directory](https://github.com/PKU-Alignment/align-anything/tree/main/align_anything/configs/evaluation/benchmarks) to suit specific evaluation tasks and models, and adjust inference parameters for [vLLM](https://github.com/PKU-Alignment/align-anything/tree/main/align_anything/configs/evaluation/vllm) or [DeepSpeed](https://github.com/PKU-Alignment/align-anything/tree/main/align_anything/configs/evaluation/deepspeed) based on your generation backend. For more details about the evaluation pipeline, refer to the [evaluation README](https://github.com/PKU-Alignment/align-anything/blob/main/align_anything/evaluation/README.md).\r\n\r\n# Inference\r\n\r\n## Interactive Client\r\n\r\n```bash\r\npython3 -m align_anything.serve.cli --model_name_or_path your_model_name_or_path\r\n```\r\n\r\n\u003cimg src=\"assets/cli_demo.gif\" alt=\"cli_demo\" style=\"width:600px;\"\u003e\r\n\r\n\r\n## Interactive Arena\r\n\r\n```bash\r\npython3 -m align_anything.serve.arena \\\r\n    --red_corner_model_name_or_path your_red_model_name_or_path \\\r\n    --blue_corner_model_name_or_path your_blue_model_name_or_path\r\n```\r\n\r\n\u003cimg src=\"assets/arena_demo.gif\" alt=\"arena_demo\" style=\"width:600px;\"\u003e\r\n\r\n## Report Issues\r\n\r\nIf you have any questions while using align-anything, don't hesitate to ask on [the GitHub issue page](https://github.com/PKU-Alignment/align-anything/issues/new/choose); we will reply within 2-3 working days.\r\n\r\n\r\n# Citation\r\n\r\nPlease cite the repo if you use the data or code in this repo.\r\n\r\n```bibtex\r\n@misc{align_anything,\r\n  author = {PKU-Alignment Team},\r\n  title = {Align Anything: training all modality models to follow instructions with unified language feedback},\r\n  year = {2024},\r\n  publisher = {GitHub},\r\n  journal = {GitHub repository},\r\n  howpublished = {\\url{https://github.com/PKU-Alignment/align-anything}},\r\n}\r\n```\r\n\r\n# 
License\r\n\r\nalign-anything is released under Apache License 2.0.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPKU-Alignment%2Falign-anything","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPKU-Alignment%2Falign-anything","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPKU-Alignment%2Falign-anything/lists"}