https://github.com/showlab/visincontext

Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
efficient in-context-learning llm mllm

Host: GitHub
URL: https://github.com/showlab/visincontext
Owner: showlab
Created: 2024-06-03T15:41:44.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-10-30T03:57:48.000Z (over 1 year ago)
Last Synced: 2025-04-01T17:24:48.337Z (about 1 year ago)
Topics: efficient, in-context-learning, llm, mllm
Language: Python
Homepage: https://fingerrec.github.io/visincontext/
Size: 1010 KB
Stars: 14
Watchers: 2
Forks: 2
Open Issues: 1
Metadata Files:
- Readme: README.md

README

# VisInContext

[arXiv](https://arxiv.org/abs/2406.02547)

![](figures/gpu_memory.png)

- VisInContext is an easy way to increase the in-context text length in multi-modal learning.

- This work is complementary to existing techniques for increasing in-context text length, such as FlashAttn and Memory Transformer.

## Install

```
pip install -r requirement.txt
```

For H100 GPUs, install the following dependencies:

```
pip install -r requirements_h100.txt
```
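The choice between the two requirements files can be scripted. The following is a minimal sketch, not part of the repository: the helper names `pick_requirements` and `detect_gpu_names` are hypothetical, and it assumes `nvidia-smi` is on the `PATH` and that H100 devices report a name containing "H100". The file names are taken from the commands above.

```python
import subprocess

# File names come from the install commands above; the helpers below are
# hypothetical and are not provided by the VisInContext repository.
DEFAULT_REQUIREMENTS = "requirement.txt"
H100_REQUIREMENTS = "requirements_h100.txt"


def pick_requirements(gpu_names):
    """Return the H100 requirements file if any GPU name mentions H100."""
    if any("H100" in name.upper() for name in gpu_names):
        return H100_REQUIREMENTS
    return DEFAULT_REQUIREMENTS


def detect_gpu_names():
    """Query nvidia-smi for GPU names; return an empty list if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        # Assumption: no usable nvidia-smi means we fall back to the default file.
        return []
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    # Print the command instead of running it, so the user can review it first.
    print(f"pip install -r {pick_requirements(detect_gpu_names())}")
```

Printing the command rather than executing it keeps the sketch free of side effects; pipe the output to a shell only after checking it.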

## Dataset Preparation

See [DATASET.md](DATASET.md).

## Pre-training

See [PRETRAIN.md](PRETRAIN.md).

## Few-shot Evaluation

See [EVALUATION.md](EVALUATION.md).

## Citation

If you find our work helpful, please consider citing the following paper:

```
@article{wang2024visincontext,
  title={Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning},
  author={Wang, Alex Jinpeng and Li, Linjie and Lin, Yiqi and Li, Min and Wang, Lijuan and Shou, Mike Zheng},
  journal={NeurIPS},
  year={2024}
}
```

## Contact

Email: awinyimgprocess at gmail dot com

## Acknowledgement

Thanks to these great works: [Open-flamingo](https://github.com/mlfoundations/open_flamingo), [Open-CLIP](https://github.com/mlfoundations/open_clip), and [WebDataset](https://github.com/webdataset/webdataset).
