{"id":14043904,"url":"https://github.com/OpenGVLab/LAMM","last_synced_at":"2025-07-27T15:31:57.783Z","repository":{"id":174540328,"uuid":"650980216","full_name":"OpenGVLab/LAMM","owner":"OpenGVLab","description":"[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents","archived":false,"fork":false,"pushed_at":"2024-04-16T11:30:23.000Z","size":17429,"stargazers_count":296,"open_issues_count":11,"forks_count":16,"subscribers_count":8,"default_branch":"main","last_synced_at":"2024-10-18T01:57:17.993Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://openlamm.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenGVLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-08T08:21:38.000Z","updated_at":"2024-10-17T06:27:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"d36d1015-2b7f-4987-a53d-508f049a71ff","html_url":"https://github.com/OpenGVLab/LAMM","commit_stats":null,"previous_names":["openlamm/lamm","opengvlab/lamm"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FLAMM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FLAMM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FLAMM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FLAMM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenGVLab","download_url":"https://codeload.github.com/OpenGVLab/LAMM/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227814501,"owners_count":17823912,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-12T08:06:37.280Z","updated_at":"2024-12-02T22:31:50.773Z","avatar_url":"https://github.com/OpenGVLab.png","language":"Python","funding_links":[],"categories":["Evaluation"],"sub_categories":[],"readme":"# LAMM\n\nLAMM (pronounced as /læm/, means cute lamb to show appreciation to LLaMA), is a growing open-source community aimed at helping researchers and developers quickly train and evaluate Multi-modal Large Language Models (MLLM), and further build multi-modal AI agents capable of bridging the gap between ideas and execution, enabling seamless interaction between humans and AI machines.\n\n\u003cp align=\"center\"\u003e\n    \u003cfont size='4'\u003e\n    \u003ca href=\"https://openlamm.github.io/\" target=\"_blank\"\u003e🌏 Project Page\u003c/a\u003e\n    \u003c/font\u003e\n\u003c/p\u003e\n\n## Updates \n📆 [**2024-03**] \n1. 
📆 [**2024-03**]
1. [Ch3Ef](https://openlamm.github.io/ch3ef/) is available!
2. [Ch3Ef](https://arxiv.org/abs/2403.17830) released on arXiv!
3. [Dataset](https://huggingface.co/datasets/openlamm/Ch3Ef) and [leaderboard](https://openlamm.github.io/ch3ef/leaderboard.html) are available!

📆 [**2023-12**]
1. [DepictQA](https://arxiv.org/abs/2312.08962): Depicted Image Quality Assessment based on Multi-modal Language Models, released on arXiv!
2. [MP5](https://arxiv.org/abs/2312.07472): A Multi-modal LLM-based Open-ended Embodied System in Minecraft, released on arXiv!

📆 [**2023-11**]
1. [ChEF](https://openlamm.github.io/paper_list/ChEF): A comprehensive evaluation framework for MLLMs, released on arXiv!
2. [Octavius](https://openlamm.github.io/paper_list/Octavius): Mitigating task interference in MLLMs by combining Mixture-of-Experts (MoE) with LoRAs, released on arXiv! (A conceptual sketch of the idea follows this list.)
3. The camera-ready version of LAMM is available on [arXiv](https://arxiv.org/abs/2306.06687).

📆 [**2023-10**]
1. LAMM is accepted by the NeurIPS 2023 Datasets & Benchmarks Track! See you in December!

📆 [**2023-09**]
1. A light training framework for V100 or RTX 3090 is available! LLaMA2-based finetuning is also online.
2. Our demo has moved to <a href="https://openxlab.org.cn/apps/detail/LAMM/LAMM" target="_blank">OpenXLab</a>.

📆 [**2023-07**]
1. Checkpoints & leaderboard of LAMM on Hugging Face updated for the new code base.
2. Evaluation code for both 2D and 3D tasks is ready.
3. Command-line demo tools updated.

📆 [**2023-06**]
1. LAMM: 2D & 3D dataset & benchmark for MLLM
2. Watch the demo video for LAMM on <a href="https://www.youtube.com/watch?v=M7XlIe8hhPk" target="_blank">YouTube</a> or <a href="https://www.bilibili.com/video/BV1kN411D7kt/" target="_blank">Bilibili</a>!
3. The full paper with appendix is available on <a href="https://arxiv.org/abs/2306.06687" target="_blank">arXiv</a>.
4. The LAMM dataset is released on <a href="https://huggingface.co/datasets/openlamm/LAMM_Dataset" target="_blank">Hugging Face</a> & <a href="https://opendatalab.com/LAMM/LAMM" target="_blank">OpenDataLab</a> for the research community! (A download sketch follows this list.)
5. LAMM code is available for the research community!
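The Octavius item above pairs a routing network with multiple LoRA experts, so that different tasks push their gradients into different low-rank adapters. The following is a minimal conceptual sketch of that general pattern, not Octavius's actual implementation; the module name, tensor shapes, and per-token routing scheme are illustrative assumptions.

```python
# Conceptual sketch of MoE-over-LoRA (NOT the Octavius code): a frozen base
# linear layer plus several LoRA experts, mixed by a learned router.
import torch
import torch.nn as nn

class LoRAMoELinear(nn.Module):
    """Frozen base weight + a routed mixture of LoRA experts (illustrative)."""
    def __init__(self, d_in: int, d_out: int, num_experts: int = 4, rank: int = 8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)        # stands in for a frozen pretrained weight
        self.base.weight.requires_grad_(False)
        self.gate = nn.Linear(d_in, num_experts)  # learned router over experts
        # Per-expert low-rank factors: delta_W_e = A_e @ B_e; B starts at zero
        # so training begins from the unmodified base layer, as in plain LoRA.
        self.A = nn.Parameter(torch.randn(num_experts, d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, rank, d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, d_in)
        mix = torch.softmax(self.gate(x), dim=-1)                    # (batch, E) routing weights
        delta = torch.einsum("bi,eir,ero->beo", x, self.A, self.B)   # each expert's LoRA output
        return self.base(x) + torch.einsum("be,beo->bo", mix, delta) # frozen path + routed update

layer = LoRAMoELinear(16, 16)
print(layer(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
```

Because each input is routed mostly to its own experts, updates from unrelated tasks land on different adapters, which is the intuition behind mitigating task interference.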
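The datasets referenced in the updates are hosted on Hugging Face. A minimal sketch for fetching them programmatically, assuming only the standard `huggingface_hub` client (the repo ids come from the links above; the local directories are illustrative):

```python
# Minimal sketch: download the LAMM and Ch3Ef dataset snapshots from Hugging Face.
# Assumes `pip install huggingface_hub`; local_dir values are hypothetical targets.
from huggingface_hub import snapshot_download

# Instruction-tuning data and benchmark annotations released with LAMM.
lamm_path = snapshot_download(
    repo_id="openlamm/LAMM_Dataset",
    repo_type="dataset",
    local_dir="./data/LAMM",
)

# Human-alignment evaluation data released with Ch3Ef.
ch3ef_path = snapshot_download(
    repo_id="openlamm/Ch3Ef",
    repo_type="dataset",
    local_dir="./data/Ch3Ef",
)

print(lamm_path, ch3ef_path)
```

For the dataset layout and how the files plug into training and evaluation, see the tutorial linked under Get Started below.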
## Paper List

**Publications**

- [x] [LAMM](https://openlamm.github.io/paper_list/LAMM)
- [x] [Octavius](https://openlamm.github.io/paper_list/Octavius)

**Preprints**

- [x] [Assessment of Multimodal Large Language Models in Alignment with Human Values](https://openlamm.github.io/ch3ef/)
- [x] [ChEF](https://openlamm.github.io/paper_list/ChEF)

## Citation

**LAMM**

```
@article{yin2023lamm,
    title={LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark},
    author={Yin, Zhenfei and Wang, Jiong and Cao, Jianjian and Shi, Zhelun and Liu, Dingning and Li, Mukai and Sheng, Lu and Bai, Lei and Huang, Xiaoshui and Wang, Zhiyong and others},
    journal={arXiv preprint arXiv:2306.06687},
    year={2023}
}
```

**Assessment of Multimodal Large Language Models in Alignment with Human Values**

```
@misc{shi2024assessment,
    title={Assessment of Multimodal Large Language Models in Alignment with Human Values},
    author={Zhelun Shi and Zhipin Wang and Hongxing Fan and Zaibin Zhang and Lijun Li and Yongting Zhang and Zhenfei Yin and Lu Sheng and Yu Qiao and Jing Shao},
    year={2024},
    eprint={2403.17830},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

**ChEF**

```
@misc{shi2023chef,
    title={ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models},
    author={Zhelun Shi and Zhipin Wang and Hongxing Fan and Zhenfei Yin and Lu Sheng and Yu Qiao and Jing Shao},
    year={2023},
    eprint={2311.02692},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

**Octavius**

```
@misc{chen2023octavius,
    title={Octavius: Mitigating Task Interference in MLLMs via MoE},
    author={Zeren Chen and Ziqin Wang and Zhen Wang and Huayang Liu and Zhenfei Yin and Si Liu and Lu Sheng and Wanli Ouyang and Yu Qiao and Jing Shao},
    year={2023},
    eprint={2311.02684},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

**DepictQA**

```
@article{depictqa,
    title={Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models},
    author={You, Zhiyuan and Li, Zheyuan and Gu, Jinjin and Yin, Zhenfei and Xue, Tianfan and Dong, Chao},
    journal={arXiv preprint arXiv:2312.08962},
    year={2023}
}
```

**MP5**

```
@misc{qin2023mp5,
    title={MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception},
    author={Yiran Qin and Enshen Zhou and Qichang Liu and Zhenfei Yin and Lu Sheng and Ruimao Zhang and Yu Qiao and Jing Shao},
    year={2023},
    eprint={2312.07472},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

## Get Started

Please see the [tutorial](https://openlamm.github.io/tutorial) for the basic usage of this repo.

## License

The project is released under CC BY-NC 4.0 (non-commercial use only), and models trained using the dataset should not be used outside of research purposes.