# Sequence Level Semantics Aggregation for Video Object Detection
## Introduction
This is an official MXNet implementation of
[*Sequence Level Semantics Aggregation for Video Object Detection*](https://arxiv.org/abs/1907.06390) (ICCV 2019, Oral).
SELSA aggregates semantic information across the full video sequence while keeping a simple and clean pipeline. It achieves **82.69** mAP with ResNet-101 on the ImageNet VID validation set.
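For intuition, here is a minimal NumPy sketch of the kind of similarity-based aggregation SELSA performs: proposal features pooled from frames sampled across the whole video are re-weighted by softmax-normalized cosine similarity and summed. The function name, shapes, and plain-NumPy setting are illustrative assumptions; the actual operators live in the MXNet code of this repo.

```python
import numpy as np

def selsa_style_aggregate(feats, eps=1e-8):
    """Illustrative sequence-level aggregation (not the repo's exact op).

    feats: (N, D) array of proposal features pooled from frames sampled
           across the whole video, not just neighboring frames.
    """
    # Cosine similarity between every pair of proposals in the sequence.
    normed = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
    sim = np.dot(normed, normed.T)               # (N, N)

    # Softmax over the sequence dimension turns similarities into weights.
    sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(sim)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each proposal becomes a similarity-weighted sum of all proposals.
    return np.dot(weights, feats)

# Toy usage: 3 frames x 4 proposals each, 256-d features.
rois = np.random.randn(12, 256).astype(np.float32)
print(selsa_style_aggregate(rois).shape)  # (12, 256)
```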
## Citation

If you use the code or models in your research, please cite:
```
@inproceedings{wu2019selsa,
  title={Sequence Level Semantics Aggregation for Video Object Detection},
  author={Wu, Haiping and Chen, Yuntao and Wang, Naiyan and Zhang, Zhaoxiang},
  booktitle={ICCV},
  year={2019}
}
```
## Main Results

| Model                                             | Training data                  | Testing data            | mAP (%) | mAP (%) (slow) | mAP (%) (medium) | mAP (%) (fast) |
|---------------------------------------------------|--------------------------------|-------------------------|---------|----------------|------------------|----------------|
| Single-frame baseline (Faster R-CNN, ResNet-101)  | ImageNet DET train + VID train | ImageNet VID validation | 73.6    | 82.1           | 71.0             | 52.5           |
| SELSA (Faster R-CNN, ResNet-101)                  | ImageNet DET train + VID train | ImageNet VID validation | 80.3    | 86.9           | 78.9             | 61.4           |
| SELSA (Faster R-CNN, ResNet-101, Data Aug)        | ImageNet DET train + VID train | ImageNet VID validation | 82.7    | 88.0           | 81.4             | 67.1           |
## Installation

Please note that this repo is based on Python 2.
1. Clone the repository.
~~~
git clone https://github.com/happywu/Sequence-Level-Semantics-Aggregation
~~~
2. Install MXNet following https://mxnet.incubator.apache.org/get_started. We tested our code with MXNet v1.3.0.
3. Install packages via
~~~
pip install -r requirements.txt
sh init.sh
~~~
## Preparation for Training & Testing

1. Please download the ILSVRC2015 DET and ILSVRC2015 VID datasets, and make sure the directory layout looks like this:
```
./data/ILSVRC2015/
./data/ILSVRC2015/Annotations/DET
./data/ILSVRC2015/Annotations/VID
./data/ILSVRC2015/Data/DET
./data/ILSVRC2015/Data/VID
./data/ILSVRC2015/ImageSets
```
2. Please download the ImageNet pre-trained [ResNet-v1-101](https://1dv.aflat.top/resnet_v1_101-0000.params) model and
our pretrained [SELSA ResNet-101](https://1dv.aflat.top/selsa_rcnn_vid-0000.params) model manually, and put them under the folder `./model/pretrained_model`. Make sure it looks like this (a quick layout check follows below):
```
./model/pretrained_model/resnet_v1_101-0000.params
./model/pretrained_model/selsa_rcnn_vid-0000.params
```
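Before launching training or testing, it can save time to verify the layout above. A small, hypothetical helper script (not part of the repo; paths copied from the two listings above) might look like:

```python
import os

# Expected paths, copied from the dataset and model listings above.
EXPECTED = [
    './data/ILSVRC2015/Annotations/DET',
    './data/ILSVRC2015/Annotations/VID',
    './data/ILSVRC2015/Data/DET',
    './data/ILSVRC2015/Data/VID',
    './data/ILSVRC2015/ImageSets',
    './model/pretrained_model/resnet_v1_101-0000.params',
    './model/pretrained_model/selsa_rcnn_vid-0000.params',
]

missing = [p for p in EXPECTED if not os.path.exists(p)]
if missing:
    print('Missing paths:')
    for p in missing:
        print('  ' + p)
else:
    print('Dataset and model layout looks complete.')
```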
## Testing
1. To test the provided pretrained model, run the following command:
```
python experiments/selsa/test.py --cfg experiments/selsa/cfgs/resnet_v1_101_rcnn_selsa_aug.yaml --test-pretrained ./model/pretrained_model/selsa_rcnn_vid
```
You should obtain the results reported above.
## Training

1. To train, run the following command:
```
python experiments/selsa/train_end2end.py --cfg experiments/selsa/cfgs/resnet_v1_101_rcnn_selsa_aug.yaml
```
A cache folder will be created automatically to save the model and the log under `output/selsa_rcnn/imagenet_vid/` (see the checkpoint-locating sketch at the end of this section).
2. To test your trained model, run:
```
python experiments/selsa/test.py --cfg experiments/selsa/cfgs/resnet_v1_101_rcnn_selsa_aug.yaml
```
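As referenced in step 1 above, MXNet saves parameters as `<prefix>-<epoch>.params`. A hypothetical snippet to locate the newest checkpoint (the file layout under the cache folder is an assumption based on the defaults above):

```python
import os

# Assumed cache location from the training command above.
root = 'output/selsa_rcnn/imagenet_vid'
ckpts = []
for dirpath, _, filenames in os.walk(root):
    for name in filenames:
        if name.endswith('.params'):
            ckpts.append(os.path.join(dirpath, name))

if ckpts:
    print('Latest checkpoint: ' + max(ckpts, key=os.path.getmtime))
else:
    print('No checkpoints found under ' + root)
```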
## Other implementations

PyTorch: [MMTracking](https://github.com/open-mmlab/mmtracking/tree/master/configs/vid/selsa)

## Acknowledgements
This repo is modified from [*Flow-Guided-Feature-Aggregation*](https://github.com/msracver/Flow-Guided-Feature-Aggregation).