Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/pengboxiangshang/siamrpn_plus_plus_pytorch

SiamRPN, SiamRPN++, unofficial implementation of "SiamRPN++" (CVPR2019), multi-GPUs, LMDB.
https://github.com/pengboxiangshang/siamrpn_plus_plus_pytorch

Last synced: 4 days ago
JSON representation

SiamRPN, SiamRPN++, unofficial implementation of "SiamRPN++" (CVPR2019), multi-GPUs, LMDB.

Awesome Lists containing this project

README

        

# SiamRPN++_PyTorch

This is an unofficial PyTorch implementation of [SiamRPN++ (CVPR2019)](https://arxiv.org/pdf/1812.11703.pdf), implemented by **[Peng Xu](http://www.pengxu.net)** and **[Jin Feng](https://github.com/JinDouer)**. Our **training** can be conducted on **multi-GPUs**, and use **LMDB** data format to speed up the data loading.

This project is designed with these goals:
- [x] Training on ILSVRC2015_VID dataset.
- [ ] Training on GOT-10k dataset.
- [ ] Training on YouTube-BoundingBoxes dataset.
- [ ] Evaluate the performance on tracking benchmarks.

## Details of SiamRPN++ Network
As stated in the original paper, SiamRPN++ network has three parts, including Backbone Networks, SiamRPN Blocks, and Weighted Fusion Layers.

**1. Backbone Network (modified ResNet-50)**

As stated in the original paper, SiamRPN++ uses ResNet-50 as backbone by modifying the strides and adding dilated convolutions for *conv4* and *conv5* blocks. Here, we present the detailed comparison between original ResNet-50 and SiamRPN++ ResNet-50 backbone in following table.



bottleneck in conv4
bottleneck in conv5


conv1x1
conv3x3
conv1x1
conv1x1
conv3x3
conv1x1


original ResNet-50
stride
1
2
1
1
2
1


padding
0
1
0
0
1
0


dilation
1
1
1
1
1
1


ResNet-50 in SiamRPN++
stride
1
1
1
1
1
1


padding
0
2
0
0
4
0


dilation
1
2
1
1
4
1

**2. SiamRPN Block**

Based on our understanding to the original paper, we plot a architecture illustration to describe the *Siamese RPN* block as shown in following.

We also present the detailed configurations of each layer of RPN block in following table. Please see more details in [./network/RPN.py](https://github.com/PengBoXiangShang/SiamRPN_plus_plus_Pytorch/blob/master/network/RPN.py).

|component|configuration|
|:---|:---|
|adj_1 / adj_2 / adj_3 / adj_4|conv2d(256, 256, ksize=3, pad=1, stride=1), BN2d(256)|
|fusion_module_1 / fusion_module_2|conv2d(256, 256, ksize=1, pad=0, stride=1), BN2d(256), ReLU|
|box head|conv2d(256, 4*5, ksize=1, pad=0, stride=1)|
|cls head|conv2d(256, 2*5, ksize=1, pad=0, stride=1)|

**3. Weighted Fusion Layer**

We implemente the *weighted fusion layer* via **group convolution operations**. Please see details in [./network/SiamRPN.py](https://github.com/PengBoXiangShang/SiamRPN_plus_plus_Pytorch/blob/master/network/SiamRPN.py).

## Requirements
Ubuntu 14.04

Python 2.7

PyTorch 0.4.0

Other main requirements can be installed by:

```
# 1. Install cv2 package.
conda install opencv

# 2. Install LMDB package.
conda install lmdb

# 3. Install fire package.
pip install fire -c conda-forge
```

## Training Instructions

```
# 1. Clone this repository to your disk.
git clone https://github.com/PengBoXiangShang/SiamRPN_plus_plus_PyTorch.git

# 2. Change working directory.
cd SiamRPN_plus_plus_PyTorch

# 3. Download training data. In this project, we provide the downloading and preprocessing scripts for ILSVRC2015_VID dataset. Please download ILSVRC2015_VID dataset (86GB). The cripts for other tracking datasets are coming soon.
cd data
wget -c http://bvisionweb1.cs.unc.edu/ilsvrc2015/ILSVRC2015_VID.tar.gz
tar -xvf ILSVRC2015_VID.tar.gz
rm ILSVRC2015_VID.tar.gz
cd ..

# 4. Preprocess data.
chmod u+x ./preprocessing/create_dataset.sh
./preprocessing/create_dataset.sh

# 5. Pack the preprocessed data into LMDB format to accelerate data loading.
chmod u+x ./preprocessing/create_lmdb.sh
./preprocessing/create_lmdb.sh

# 6. Start the training.
chmod u+x ./train.sh
./train.sh
```

## Acknowledgement
Many thanks to [Sisi](https://github.com/noCodegirl) who helps us to download the huge ILSVRC2015_VID dataset.