[ECCV2024] Learning Video Context as Interleaved Multimodal Sequences
https://github.com/showlab/movieseq
- Host: GitHub
- URL: https://github.com/showlab/movieseq
- Owner: showlab
- Created: 2024-07-03T14:39:25.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-08-26T00:12:34.000Z (4 months ago)
- Last Synced: 2024-08-26T01:42:14.881Z (4 months ago)
- Homepage:
- Size: 204 KB
- Stars: 12
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# MovieSeq (ECCV'24)
![overview](./assets/teaser.png)
MovieSeq is a method that enhances Large Multimodal Models (LMMs) for video in-context learning by representing video context as interleaved multimodal sequences (e.g., character photos and human dialogue).
We provide a lightweight, practical implementation that can be easily integrated with existing LMMs (e.g., GPT-4o).
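To make the idea concrete, here is a minimal, hypothetical sketch of how an interleaved sequence of character photos, dialogue lines, and video frames could be packed into a single GPT-4o request using the official `openai` Python client. The file paths and the overall prompt layout are illustrative assumptions, not the repo's actual code; see `example.ipynb` for the real interface.

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def image_part(path: str) -> dict:
    """Encode a local image as a base64 data-URL content part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return {"type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}

# A hypothetical interleaved sequence: character photos tied to names,
# dialogue lines, and sampled video frames, flattened into one prompt.
content = [
    {"type": "text", "text": "Character: Alice"},
    image_part("characters/alice.jpg"),   # hypothetical paths
    {"type": "text", "text": "Character: Bob"},
    image_part("characters/bob.jpg"),
    {"type": "text", "text": 'Alice: "Where were you last night?"'},
    image_part("frames/shot_001.jpg"),
    {"type": "text", "text": 'Bob: "I told you, I was at the lab."'},
    image_part("frames/shot_002.jpg"),
    {"type": "text", "text": "Question: Who is speaking in the last frame?"},
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": content}],
)
print(response.choices[0].message.content)
```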
## Environments
```
conda create --name movieseq python=3.10
conda activate movieseq
conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install git+https://github.com/m-bain/whisperx.git
pip install tqdm moviepy openai opencv-python
```

## Guideline
Please refer to `example.ipynb` to learn how MovieSeq works.
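As a rough orientation before opening the notebook, the dependencies above suggest a preprocessing pipeline along these lines: dialogue transcribed with WhisperX and frames sampled with OpenCV. The sketch below is only illustrative (the function names and sampling rate are assumptions); the notebook is the authoritative walkthrough.

```python
import cv2
import whisperx

def transcribe_dialogue(audio_path: str, device: str = "cuda") -> list[dict]:
    """Transcribe dialogue with WhisperX; returns timestamped segments."""
    model = whisperx.load_model("large-v2", device)
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio, batch_size=16)
    return result["segments"]  # each segment has "start", "end", "text"

def sample_frames(video_path: str, every_n_sec: float = 2.0) -> list:
    """Uniformly sample frames from a video with OpenCV."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    step = max(int(fps * every_n_sec), 1)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames
```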
Have fun!