Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/showlab/movieseq

[ECCV2024] Learning Video Context as Interleaved Multimodal Sequences
https://github.com/showlab/movieseq

Last synced: about 2 months ago
JSON representation

[ECCV2024] Learning Video Context as Interleaved Multimodal Sequences

Host: GitHub
URL: https://github.com/showlab/movieseq
Owner: showlab
Created: 2024-07-03T14:39:25.000Z (6 months ago)
Default Branch: main
Last Pushed: 2024-08-26T00:12:34.000Z (4 months ago)
Last Synced: 2024-08-26T01:42:14.881Z (4 months ago)
Homepage:
Size: 204 KB
Stars: 12
Watchers: 2
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# MovieSeq (ECCV'24)

![overview](./assets/teaser.png)

MovieSeq is a method designed to enhance Large Multimodal Models for improved video in-context learning using interleaved multimodal sequences (e.g., character photo, human dialogues, etc).

We have developed a lightweight practical code that can be easily integrated into existing LMMs (e.g., GPT-4o) for easy usage.

## Environments
```
conda create --name movieseq python=3.10
conda activate movieseq
conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia

pip install git+https://github.com/m-bain/whisperx.git
pip install tqdm moviepy openai opencv-python
```

## Guideline
Please refer to `example.ipynb` to learn how MovieSeq works.
Have fun!