# JA-POLS

**Authors:** Irit Chelly, Vlad Winter, Dor Litvak, David Rosen, and Oren Freifeld.

This code repository corresponds to our CVPR '20 paper: **JA-POLS: a Moving-camera Background Model via Joint Alignment and Partially-overlapping Local Subspaces**.
JA-POLS is a novel 2D-based method for unsupervised learning of a moving-camera background model; it is highly scalable and allows for relatively free camera motion.



*(Figure: typical JA-POLS results.)*

A detailed description of our method and more example results can be found here:

[Paper](https://3d3cd9ad-88b3-4e3e-b64f-fb392ff4b6ea.filesusr.com/ugd/88bbac_80ebb886f20d43cdb2517074e7d22a81.pdf)

[Supplemental Material](https://3d3cd9ad-88b3-4e3e-b64f-fb392ff4b6ea.filesusr.com/ugd/88bbac_6c675f8d71cb4676bf45da8e26d751fc.pdf)

[Example Results](https://drive.google.com/drive/u/0/folders/1fnME3gYM-WvwGps08tWT00ZT6VlWBxfz)

**Acknowledgements:**

This work was partially funded by the Shulamit Aloni Scholarship from Israel's Ministry of Technology and Science, and by BGU's Hi-Tech Scholarship.

## Requirements
- Python: most of the code runs in Python and uses the following packages: numpy, matlab.engine, scipy, tensorflow, torch, OpenCV, imageio, scikit-image, and other common Python packages.
- MATLAB (for the SE-Sync part)
- C++: if you choose the TGA method for learning the local subspaces (see Module 2 below), please follow the [TGA requirements](https://github.com/MPI-IS/Grassmann-Averages-PCA). All steps should be performed in the TGA folder: *2_learning\BG\TGA-PCA*.

**For a minimal working example, use the Tennis sequence (the input images are already located in the input folder in this repository)**.

## Installation

## Instructions and Description
The JA-POLS method consists of 3 phases that run in separate modules:
- Joint alignment: align all input images to a common coordinate system
- Learning of two tasks:
  - Partially-overlapping Local Subspaces (the background)
  - Alignment prediction
- BG/FG separation for a (previously-unseen) input frame

**Configuration parameters:** the file config.py includes all required parameters for the 3 modules.

Before running the code, set the following config parameters:

Your local path to the JA-POLS folder:
```
paths = dict(
    my_path = '/PATH_TO_JAPOLS_CODE/JA-POLS/',
)
```

The size of a single input frame (height, width, depth):
```
images = dict(
    img_sz = (250, 420, 3),
)
```
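
For illustration, once config.py is importable from the JA-POLS/ source folder, the modules can read these dictionaries directly. This is a minimal sketch, not an official API of the repository:
```
# A minimal sketch (not part of the repository): reading the settings above,
# assuming config.py is importable from the JA-POLS/ source folder.
import config

root = config.paths['my_path']                   # '/PATH_TO_JAPOLS_CODE/JA-POLS/'
height, width, depth = config.images['img_sz']   # (250, 420, 3)
print(root, height, width, depth)
```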

**All 3 modules should run from the source folder JA-POLS/**.
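
For example, under the assumption that each module's main script can be invoked as a standalone program, the full pipeline might be driven as follows (a sketch only):
```
# A minimal sketch (not part of the repository): running the three modules in order,
# assuming each main script is a standalone program and JA-POLS/ is the working directory.
import subprocess
import sys

JAPOLS_ROOT = '/PATH_TO_JAPOLS_CODE/JA-POLS/'
modules = [
    '1_joint_alignment/main_joint_alignment.py',
    '2_learning/main_learning.py',
    '3_bg_separation/main_bg_separation.py',
]
for script in modules:
    subprocess.run([sys.executable, script], cwd=JAPOLS_ROOT, check=True)
```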

### Module 1: Joint Alignment
Code:

Main function: *1_joint_alignment/main_joint_alignment.py*

Input:

A video or a sequence of images from which the BG model will be learned.

The video or the images should be located in *input/learning/video* or *input/learning/images* respectively.

Output:

- *data/final_AFFINE_trans.npy*: affine transformations for all input images.

(In this file, record *i* contains the affine transformation (a 6-parameter vector) associated with input image *i*.)
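
As a rough illustration (assuming, which is not stated here, that each 6-parameter vector fills a 2x3 affine matrix row-wise), the saved transformations can be inspected with numpy:
```
# A minimal sketch: inspecting the saved affine transformations.
# Assumption: each 6-parameter record fills a 2x3 affine matrix row-wise.
import numpy as np

trans = np.load('data/final_AFFINE_trans.npy')   # one record per input image
A_0 = trans[0].reshape(2, 3)                     # affine matrix of input image 0
print(trans.shape, A_0)
```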

Required params in config.py:

Data type (video or a sequence of images), and relevant info about the input data:
```
se = dict(
    data_type = 'images', # choose from: ['images', 'video']
    video_name = 'jitter.mp4', # relevant when data_type = 'video'
    img_type = '*.png', # relevant when data_type = 'images'
)
```
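
For instance, with the settings above (data_type = 'images', img_type = '*.png'), the learning frames in *input/learning/images* could be enumerated like this (a sketch, not the repository's loader):
```
# A minimal sketch: listing the learning frames according to the settings above.
import glob
import os

img_dir = os.path.join('/PATH_TO_JAPOLS_CODE/JA-POLS/', 'input/learning/images')
frames = sorted(glob.glob(os.path.join(img_dir, '*.png')))   # img_type = '*.png'
print(len(frames), 'input frames')
```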

Parameters for the spatial transformer net (when estimating the affine transformations):
```
stn = dict(
    device = '/gpu:0', # choose from: ['/gpu:0', '/gpu:1', '/cpu:0']
    load_model = False, # 'False' when learning a model from scratch, 'True' when using a trained network's model
    iter_per_epoch = 2000, # number of iterations
    batch_size = 10,
)
```

The rest of the parameters can be left at their current (default) values.

Description:

Here we solve a joint-alignment problem: we align all input images to a common coordinate system by estimating an affine transformation for each image (the full objective is given in the paper).

High-level steps:
1. Compute relative transformations for pairs of input images (according to the graph)
2. Run the SE-Sync framework to obtain an absolute SE transformation for each frame
3. Transform the images according to the absolute SE transformations
4. Estimate residual affine transformations by optimizing the alignment loss with a Spatial Transformer Network (STN)
5. End up with an absolute affine transformation for each of the input images (see the sketch below)
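
As a small numerical illustration of steps 2-5 (not the repository's code, and the composition order here is an assumption), an absolute SE(2) transformation and a residual affine transformation, both written as 2x3 matrices, can be composed via their 3x3 homogeneous forms:
```
# A minimal sketch: composing an absolute SE(2) transform with a residual affine transform.
import numpy as np

def to_homogeneous(T):
    """Append the row [0, 0, 1] to a 2x3 transformation matrix."""
    return np.vstack([T, [0.0, 0.0, 1.0]])

theta = np.deg2rad(5.0)
T_se = np.array([[np.cos(theta), -np.sin(theta), 12.0],
                 [np.sin(theta),  np.cos(theta), -3.0]])   # rotation + translation (SE(2))
T_res = np.array([[1.02, 0.01, 0.5],
                  [0.00, 0.98, 0.2]])                      # small residual affine
T_abs = (to_homogeneous(T_res) @ to_homogeneous(T_se))[:2, :]  # absolute affine (2x3)
print(T_abs)
```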



### Module 2: Learning
Code:

Main function: *2_learning/main_learning.py*

Input:

Files that were prepared in module 1:
- *data/final_AFFINE_trans.npy*
- *data/imgs.npy*
- *data/imgs_big_embd.npy*

Output:

- *data/subspaces/*: local subspaces for the background learning.

- *2_learning/Alignment/models/best_model.pt*: model of a trained net for the alignment prediction.

Required params in config.py:

**Local-subspaces learning:**

The method type for the background-learning algorithm that will run on each local domain:
```
pols = dict(
    method_type = 'PRPCA', # choose from: ['PCA', 'RPCA-CANDES', 'TGA', 'PRPCA']
)
```
The rest of the parameters can be left at their current (default) values.
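
As a rough illustration of what a single local subspace is when method_type = 'PCA' (this is not the repository's implementation; the RPCA/TGA/PRPCA variants are robust alternatives), a low-dimensional subspace for a stack of aligned local patches can be computed via an SVD:
```
# A minimal sketch: a PCA subspace for one local domain (toy data, illustration only).
import numpy as np

patches = np.random.rand(100, 50 * 50)   # 100 vectorized, aligned local patches
mean = patches.mean(axis=0)
U, S, Vt = np.linalg.svd(patches - mean, full_matrices=False)
k = 10                                   # subspace dimension
subspace = Vt[:k]                        # k basis vectors spanning the local BG subspace
print(subspace.shape)                    # (10, 2500)
```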

**Alignment-prediction learning:**

Parameters for the regressor net (which learns a mapping from images to transformations):
```
regress_trans = dict(
    load_model = False, # 'False' when learning a model from scratch, 'True' when using a trained network's model
    gpu_num = 0, # index of the GPU to use (in case there is more than one)
)
```
The rest of the parameters can be left at their current (default) values.

Description:

Here we learn two tasks, based on the affine transformations that were estimated in Module 1: (1) the partially-overlapping local subspaces that form the background model, and (2) an alignment-prediction regressor that maps an input image to its transformation.

### Module 3: Background/Foreground Separation
Code:

Main function: *3_bg_separation/main_bg_separation.py*

Input:

A video or a sequence of test images for BG/FG separation.

The video or the images should be located in *input/test/video* or *input/test/images* respectively.

Output:

- *output/bg/*: background for each test image.

- *output/fg/*: foreground for each test image.

- *output/img/*: original test images.
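
For example, the separated results can be loaded for inspection as follows (a sketch; the exact file names inside the output folders are whatever the module writes, so the glob patterns are assumptions):
```
# A minimal sketch: loading the separation results for the first processed test frame.
import glob
import imageio

bg = imageio.imread(sorted(glob.glob('output/bg/*'))[0])
fg = imageio.imread(sorted(glob.glob('output/fg/*'))[0])
img = imageio.imread(sorted(glob.glob('output/img/*'))[0])
print(img.shape, bg.shape, fg.shape)
```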

Required params in config.py:

Data type (video or a sequence of test images), and relevant info about the input data:
```
bg_tool = dict(
    data_type = 'images', # choose from: ['images', 'video']
    video_name = 'jitter.mp4', # relevant when data_type = 'video'
    img_type = '*.png', # relevant when data_type = 'images'
)
```

Indicate which test images to process: 'all' (all test data), 'subsequence' (a subsequence of the image list), or 'idx_list' (a list of specific frame indices, 0-based).

If choosing 'subsequence', insert relevant info in "start_frame" and "num_of_frames".

If choosing 'idx_list', insert a list of indices in "idx_list".
```
bg_tool = dict(
    which_test_frames = 'idx_list', # choose from: ['all', 'subsequence', 'idx_list']
    start_frame = 0,
    num_of_frames = 20,
    idx_list = (2, 15, 39),
)
```

Indicate whether to use the ground-truth transformations, in case you process images from the original video.

When processing images that were used for learning, set it to True.

When processing previously-unseen images, set it to False.
```
bg_tool = dict(
    use_gt_theta = True,
)
```
The rest of the parameters can be left at their current (default) values.

## Copyright and License

This software is released under the MIT License (included with the software). Note, however, that if you are using this code (and/or the results of running it) to support any form of publication (e.g., a book, a journal paper, a conference paper, a patent application, etc.), then we request that you cite our paper:
```
@inproceedings{chelly2020ja,
  title={JA-POLS: a Moving-camera Background Model via Joint Alignment and Partially-overlapping Local Subspaces},
  author={Chelly, Irit and Winter, Vlad and Litvak, Dor and Rosen, David and Freifeld, Oren},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={12585--12594},
  year={2020}
}
```