# Video Instance Matting
[Jiachen Li](https://chrisjuniorli.github.io/), Roberto Henschel, [Vidit Goel](https://vidit98.github.io/), Marianna Ohanyan, Shant Navasardyan, [Humphrey Shi](https://www.humphreyshi.com/)
[[`arXiv`](https://arxiv.org/pdf/2311.04212.pdf)] [[`Code`](https://github.com/SHI-Labs/VIM)]
## Updates
11/02/2023: [Code](https://github.com/SHI-Labs/VIM) and the [arXiv paper](https://arxiv.org/pdf/2311.04212.pdf) are released.
## Installation
Step 1: Clone this repo
```bash
git clone https://github.com/SHI-Labs/VIM.git
```

Step 2: Create conda environment
```bash
conda create --name vim python=3.9
conda activate vim
```

Step 3: Install PyTorch and torchvision
```bash
conda install pytorch==1.13.1 torchvision==0.14.1 pytorch-cuda=11.7 -c pytorch -c nvidia
```

Step 4: Install dependencies
```bash
pip install -r requirements.txt
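# Optional sanity check (not in the original instructions): confirm the
# core packages installed in the steps above are importable before moving on.
python -c "import importlib.util as u; [print(m, 'OK' if u.find_spec(m) else 'MISSING') for m in ('torch', 'torchvision')]"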
```

## Data Preparation
* [VIM50](https://drive.google.com/drive/folders/1gYtZd66qeCA4JWdbguRaWecG90aqfvs5?usp=sharing)
* [MTRCNN masks](https://drive.google.com/drive/folders/1gYtZd66qeCA4JWdbguRaWecG90aqfvs5?usp=sharing)
* [SeqFormer masks](https://drive.google.com/drive/folders/1gYtZd66qeCA4JWdbguRaWecG90aqfvs5?usp=sharing)
* [Checkpoints](https://drive.google.com/drive/folders/1gYtZd66qeCA4JWdbguRaWecG90aqfvs5?usp=sharing)

## Inference & Evaluation
Inference on the VIM50 benchmark with MTRCNN mask guidance:
```bash
CUDA_VISIBLE_DEVICES=0 python infer_vim_clip.py --config config/VIM.toml --checkpoint /path/to/msgvim.pth --image-dir /path/to/VIM50 --tg-mask-dir /path/to/MTRCNN/tg_masks/ --re-mask-dir /path/to/MTRCNN/re_masks/ --output outputs/MTRCNN_msgvim
```

Evaluate the results:
```bash
CUDA_VISIBLE_DEVICES=0 python metrics_vim.py --gt-dir /path/to/VIM50 --output-dir /path/to/outputs/MTRCNN_msgvim
```

Inference on the VIM50 benchmark with SeqFormer mask guidance:
```bash
CUDA_VISIBLE_DEVICES=0 python infer_vim_clip.py --config config/VIM.toml --checkpoint /path/to/msgvim.pth --image-dir /path/to/VIM50 --tg-mask-dir /path/to/SeqFormer/tg_masks/ --re-mask-dir /path/to/SeqFormer/re_masks/ --output outputs/SeqFormer_msgvim
```

Evaluate the results:
```bash
CUDA_VISIBLE_DEVICES=0 python metrics_vim.py --gt-dir /path/to/VIM50 --output-dir /path/to/outputs/SeqFormer_msgvim
```

## Citation
```bibtex
@article{li2023vim,
title={Video Instance Matting},
author={Jiachen Li and Roberto Henschel and Vidit Goel and Marianna Ohanyan and Shant Navasardyan and Humphrey Shi},
journal={arXiv preprint},
year={2023},
}
```

## Acknowledgement
This repo is based on [MGMatting](https://github.com/yucornetto/MGMatting). Thanks for their open-sourced work.
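Alpha-matte evaluation in the matting literature commonly reports MAD (mean absolute difference) and MSE against the ground-truth matte. As a self-contained illustration of that idea (an assumption for clarity, not necessarily what `metrics_vim.py` computes):

```python
import numpy as np

def matting_errors(pred, gt):
    """MAD and MSE between two alpha mattes with values in [0, 1]."""
    diff = pred.astype(np.float64) - gt.astype(np.float64)
    return float(np.abs(diff).mean()), float((diff ** 2).mean())

# Toy 2x2 alpha mattes: the prediction misses half the alpha in one pixel.
pred = np.array([[0.0, 0.5], [1.0, 1.0]])
gt = np.array([[0.0, 1.0], [1.0, 1.0]])
mad, mse = matting_errors(pred, gt)
print(mad, mse)  # 0.125 0.0625
```

In practice such errors are averaged per frame and then over each video sequence.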