https://github.com/qijiezhao/pseudo-3d-pytorch

PyTorch version of Pseudo-3D Residual Networks (P-3D); pretrained models are supported

Host: GitHub
URL: https://github.com/qijiezhao/pseudo-3d-pytorch
Owner: qijiezhao
License: mit
Created: 2017-11-08T07:15:01.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-06-17T10:15:39.000Z (about 6 years ago)
Last Synced: 2025-03-30T01:06:14.635Z (3 months ago)
Language: Python
Size: 27.3 KB
Stars: 453
Watchers: 11
Forks: 114
Open Issues: 13
Metadata Files:
- Readme: README.md
- License: LICENSE


# Pseudo-3D Residual Networks

This repo implements the network structure of P3D[1] in PyTorch. The pre-trained model weights are converted from the caffemodel provided in the [author's repo](https://github.com/ZhaofanQiu/pseudo-3d-residual-networks).

### Requirements:

- pytorch

- numpy

### Structure details

In the author's official repo, only P3D-199 is released. Besides this deepest P3D-199, I also implement P3D-63 and P3D-131, which are modified from ResNet50-3D and ResNet101-3D respectively. These two nets may be more convenient for users whose GPUs have limited memory.
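
As a rough sanity check on the model names, the depths 63, 131 and 199 can be reproduced by counting weighted layers. The sketch below assumes the standard ResNet-50/101/152 stage layouts, four convolutions per P3D block (1x1, spatial, temporal, 1x1) in the first three stages, three convolutions per plain 2D bottleneck in the last stage, plus the stem convolution and the final classifier. This counting convention and the names `STAGE_BLOCKS` and `p3d_depth` are assumptions made for illustration, not code taken from this repo.

```python
# Hypothetical helper: reproduce the P3D depth names from block counts.
# Stage layouts are the standard ResNet-50/101/152 ones (assumption).
STAGE_BLOCKS = {
    "P3D63": (3, 4, 6, 3),     # ResNet-50 layout
    "P3D131": (3, 4, 23, 3),   # ResNet-101 layout
    "P3D199": (3, 8, 36, 3),   # ResNet-152 layout
}


def p3d_depth(stage_blocks):
    """Count weighted layers under the assumed convention."""
    blocks_3d = sum(stage_blocks[:3])  # P3D blocks: 4 convolutions each (assumed)
    blocks_2d = stage_blocks[3]        # 2D bottlenecks: 3 convolutions each (assumed)
    stem_and_classifier = 2            # first convolution + final fully connected layer
    return 4 * blocks_3d + 3 * blocks_2d + stem_and_classifier


for name, layout in STAGE_BLOCKS.items():
    print(name, p3d_depth(layout))
```

Under these assumptions the counts match the numbers in the model names, which is why the smaller variants need noticeably less GPU memory than P3D-199.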

### Pretrained weights

(Pretrained weights for P3D63 and P3D131 are not yet available.)

(Note: I am sorry that I had to withdraw the download URLs of the pretrained weights for private reasons. For more information, you can email me.)

(Update: the model weights are now available.)

1, P3D-199 trained on Kinetics dataset:

 [BaiduYun url](https://pan.baidu.com/s/1o8VFtMy)

 [Google Drive](https://drive.google.com/drive/folders/1u_l-yvhS0shpW6e0tCiqPE7Bd1qQZKdD)

 

2, P3D-199 trained on Kinetics Optical Flow (TVL1):

 [BaiduYun url](https://pan.baidu.com/s/1o8VFtMy)

 [Google Drive](https://drive.google.com/drive/folders/1u_l-yvhS0shpW6e0tCiqPE7Bd1qQZKdD)

3, P3D-199 trained on Kinetics600, RGB, 224&299:

 [BaiduYun url](https://pan.baidu.com/s/1xAfTcqVX1qgoArGzRbI4SQ)

 [Google Drive](https://drive.google.com/drive/folders/1u_l-yvhS0shpW6e0tCiqPE7Bd1qQZKdD)

 (Change the kernel size of the GAP (global average pooling) layer from 5 to 7 for 224x224 input, or to 9 for 299x299 input.)
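
The three kernel sizes above (5 for the 160x160 example input, 7 for 224, 9 for 299) are consistent with a simple rule of thumb: the pooling kernel equals the input size divided by an overall spatial downsampling factor of 32. The sketch below encodes that rule. The factor of 32 and the helper name `gap_kernel_size` are assumptions inferred from these three values, not read from the model code, so treat the result as a starting point for other resolutions.

```python
def gap_kernel_size(input_size, total_stride=32):
    """Guess the GAP kernel size for a square input.

    total_stride=32 is an assumed overall downsampling factor that
    happens to reproduce the three values quoted in the README.
    """
    return input_size // total_stride


for size in (160, 224, 299):
    print(size, "->", gap_kernel_size(size))
```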

### Example Code

    from __future__ import print_function

    import torch

    from p3d_model import *

    # Build P3D-199 with pretrained weights and 400 output classes.
    model = P3D199(pretrained=True, num_classes=400)
    model = model.cuda()

    # Dummy clip batch: (batch, channels, frames, height, width).
    # If modality == 'Flow', change the 2nd dimension from 3 to 2.
    data = torch.rand(10, 3, 16, 160, 160).cuda()

    out = model(data)
    print(out.size(), out)
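
The input layout used in the example is (batch, channels, frames, height, width), and only the channel count changes between modalities. The small helper below builds the shape tuple for either modality. The name `clip_shape` and its defaults are hypothetical and simply mirror the numbers in the example; the two-channel flow input is taken from the comment in the example code.

```python
def clip_shape(modality="RGB", batch=10, frames=16, size=160):
    """Return the 5D input shape (batch, channels, frames, height, width)."""
    channels = {"RGB": 3, "Flow": 2}[modality]  # Flow input has 2 channels per the example comment
    return (batch, channels, frames, size, size)


print(clip_shape("RGB"))
print(clip_shape("Flow"))
# With PyTorch installed, a dummy input could then be created with
# torch.rand(*clip_shape("Flow")).
```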

    

### Ablation settings

1. **ST-Structures**:

    All P3D models in this repo support various combinations of ST-Structures, such as ('A','B','C'), ('A','B') and ('A',). For example:

    ```
    model = P3D63(ST_struc=('A','B'))
    model = P3D131(ST_struc=('C',))
    ```

    

2. **Flow and RGB models**:

    

    Set the parameter *modality='RGB'* for the RGB model, or *modality='Flow'* for the flow model. The flow model is trained on TVL1 optical flow images.

    ```
    model = P3D199(pretrained=True, modality='Flow')
    ```

3. **Finetune the model**

    When fine-tuning the models on your custom dataset, use get_optim_policies() to set different learning rates for different layers. For example, when the dataset is small and only the few deepest layers need to be trained, set *slow_rate=0.8* in the code and adjust the *lr_mult* and *decay_mult* values that follow it.
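
The idea behind layer-wise fine-tuning can be sketched without the repo: split the ordered parameter list into a slowly trained front part and a normally trained back part, then scale the learning rate and weight decay of each part. The code below is a minimal illustration under stated assumptions. It treats `slow_rate` as the fraction of the shallowest parameters that train slowly, and the function name `build_param_groups`, its arguments and the default multipliers are hypothetical. It is not the repo's `get_optim_policies()`.

```python
def build_param_groups(params, slow_rate=0.8, base_lr=0.001, base_decay=5e-4,
                       slow_lr_mult=0.1, fast_lr_mult=1.0):
    """Split params (ordered shallow to deep) into slow and fast groups.

    Assumption: the first `slow_rate` fraction of params is trained with a
    reduced learning rate; the remaining deepest ones use the full rate.
    """
    n_slow = int(len(params) * slow_rate)
    policies = [
        {"params": params[:n_slow], "lr_mult": slow_lr_mult, "decay_mult": 1.0},
        {"params": params[n_slow:], "lr_mult": fast_lr_mult, "decay_mult": 1.0},
    ]
    groups = []
    for policy in policies:
        if not policy["params"]:
            continue  # skip empty groups
        groups.append({
            "params": policy["params"],
            "lr": base_lr * policy["lr_mult"],
            "weight_decay": base_decay * policy["decay_mult"],
        })
    return groups


# Stand-in parameter names; with PyTorch this would be list(model.parameters()).
dummy_params = ["layer%d.weight" % i for i in range(10)]
for group in build_param_groups(dummy_params):
    print(len(group["params"]), group["lr"], group["weight_decay"])
```

Dictionaries with `params`, `lr` and `weight_decay` keys follow the per-parameter option format accepted by PyTorch optimizers, so the returned list could be passed to, for example, `torch.optim.SGD(groups, lr=0.001, momentum=0.9)`.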

-----------------------------------

Please **cite this repo** if you make use of it.

### Experiment Results (not reported in the paper)

#### (All of the following results were obtained with end-to-end training.)

Some of them outperform the **state of the art**.

- Action recognition (mean accuracy on UCF101):

modality/model | RGB | Flow | Fusion
---|---|---|---
P3D199 (Sports-1M) | 88.5% | - | -
P3D199 (Kinetics) | 91.2% | 92.4% | 98.3%

- Action localization (mAP on Thumos14):

#### steps: perframe+watershed

Step | perframe | localization
---|---|---
P3D199 (Sports-1M) | 0.451 | 0.25
P3D199 (Kinetics) | 0.569 (fused) | 0.307

Reference:

 [1] [Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks, ICCV 2017](http://openaccess.thecvf.com/content_iccv_2017/html/Qiu_Learning_Spatio-Temporal_Representation_ICCV_2017_paper.html)
