https://github.com/OpenGVLab/LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents

Host: GitHub
URL: https://github.com/OpenGVLab/LAMM
Owner: OpenGVLab
Created: 2023-06-08T08:21:38.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-04-16T11:30:23.000Z (over 1 year ago)
Last Synced: 2024-10-18T01:57:17.993Z (about 1 year ago)
Language: Python
Homepage: https://openlamm.github.io/
Size: 16.6 MB
Stars: 296
Watchers: 8
Forks: 16
Open Issues: 11
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

Awesome-MLLM-Safety - Github

README

# LAMM

LAMM (pronounced /læm/, meaning "cute lamb", a nod of appreciation to LLaMA) is a growing open-source community that aims to help researchers and developers quickly train and evaluate Multi-modal Large Language Models (MLLMs), and to build multi-modal AI agents that bridge the gap between ideas and execution, enabling seamless interaction between humans and AI machines.



    

🌏 [Project Page](https://openlamm.github.io/)

    



## Updates 

📆 [**2024-03**] 

1. [Ch3Ef](https://openlamm.github.io/ch3ef/) is available!

2. [Ch3Ef](https://arxiv.org/abs/2403.17830) released on arXiv!

3. [Dataset](https://huggingface.co/datasets/openlamm/Ch3Ef) and [leaderboard](https://openlamm.github.io/ch3ef/leaderboard.html) are available!

📆 [**2023-12**] 

1. [DepictQA](https://arxiv.org/abs/2312.08962): Depicted Image Quality Assessment based on Multi-modal Language Models, released on arXiv!

2. [MP5](https://arxiv.org/abs/2312.07472): A Multi-modal LLM-based Open-ended Embodied System in Minecraft, released on arXiv!

📆 [**2023-11**] 

1. [ChEF](https://openlamm.github.io/paper_list/ChEF): A comprehensive evaluation framework for MLLMs, released on arXiv!

2. [Octavius](https://openlamm.github.io/paper_list/Octavius): Mitigating Task Interference in MLLMs by combining Mixture-of-Experts (MoEs) with LoRAs, released on arXiv!

3. The camera-ready version of LAMM is available on [arXiv](https://arxiv.org/abs/2306.06687).

📆 [**2023-10**]

1. LAMM was accepted to the NeurIPS 2023 Datasets and Benchmarks Track! See you in December!

📆 [**2023-09**]

1. A lightweight training framework for V100 or RTX 3090 GPUs is available! LLaMA2-based finetuning is also online.

2. Our demo has moved to OpenXLab.

📆 [**2023-07**]

1. Checkpoints and leaderboard of LAMM on Hugging Face updated for the new codebase.

2. Evaluation code for both 2D and 3D tasks is ready.

3. Command-line demo tools updated.

📆 [**2023-06**]

1. LAMM: 2D & 3D dataset & benchmark for MLLMs

2. Watch the demo video for LAMM on YouTube or Bilibili!

3. The full paper with appendix is available on arXiv.

4. LAMM dataset released on Hugging Face & OpenDataLab for the research community!

5. LAMM code is available for the research community!

## Paper List

**Publications**

- [x] [LAMM](https://openlamm.github.io/paper_list/LAMM)

- [x] [Octavius](https://openlamm.github.io/paper_list/Octavius)

**Preprints**

- [x] [Assessment of Multimodal Large Language Models in Alignment with Human Values](https://openlamm.github.io/ch3ef/)

- [x] [ChEF](https://openlamm.github.io/paper_list/ChEF)

## Citation

**LAMM**

```
@article{yin2023lamm,
    title={LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark},
    author={Yin, Zhenfei and Wang, Jiong and Cao, Jianjian and Shi, Zhelun and Liu, Dingning and Li, Mukai and Sheng, Lu and Bai, Lei and Huang, Xiaoshui and Wang, Zhiyong and others},
    journal={arXiv preprint arXiv:2306.06687},
    year={2023}
}
```

**Assessment of Multimodal Large Language Models in Alignment with Human Values**

```
@misc{shi2024assessment,
      title={Assessment of Multimodal Large Language Models in Alignment with Human Values},
      author={Zhelun Shi and Zhipin Wang and Hongxing Fan and Zaibin Zhang and Lijun Li and Yongting Zhang and Zhenfei Yin and Lu Sheng and Yu Qiao and Jing Shao},
      year={2024},
      eprint={2403.17830},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

**ChEF**

```
@misc{shi2023chef,
      title={ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models},
      author={Zhelun Shi and Zhipin Wang and Hongxing Fan and Zhenfei Yin and Lu Sheng and Yu Qiao and Jing Shao},
      year={2023},
      eprint={2311.02692},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

**Octavius**

```
@misc{chen2023octavius,
      title={Octavius: Mitigating Task Interference in MLLMs via MoE},
      author={Zeren Chen and Ziqin Wang and Zhen Wang and Huayang Liu and Zhenfei Yin and Si Liu and Lu Sheng and Wanli Ouyang and Yu Qiao and Jing Shao},
      year={2023},
      eprint={2311.02684},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

**DepictQA**

```
@article{depictqa,
    title={Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models},
    author={You, Zhiyuan and Li, Zheyuan and Gu, Jinjin and Yin, Zhenfei and Xue, Tianfan and Dong, Chao},
    journal={arXiv preprint arXiv:2312.08962},
    year={2023}
}
```

**MP5**

```
@misc{qin2023mp5,
  title         = {MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception},
  author        = {Yiran Qin and Enshen Zhou and Qichang Liu and Zhenfei Yin and Lu Sheng and Ruimao Zhang and Yu Qiao and Jing Shao},
  year          = {2023},
  eprint        = {2312.07472},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}
```

## Get Started

Please see the [tutorial](https://openlamm.github.io/tutorial) for basic usage of this repo.

## License 

The project is licensed under CC BY-NC 4.0 (non-commercial use only). Models trained using the dataset should not be used outside of research purposes.
