https://github.com/waikato-datamining/video-frame-selector

Meta-tool that presents frames from a video to image analysis frameworks and uses the predictions to determine whether to use a frame or not.
Host: GitHub
URL: https://github.com/waikato-datamining/video-frame-selector
Owner: waikato-datamining
License: mit
Created: 2021-07-07T22:34:11.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2024-04-22T04:56:04.000Z (8 months ago)
Last Synced: 2024-04-22T05:53:25.010Z (8 months ago)
Topics: object-detection, python3, video-processing
Language: Python
Homepage:
Size: 144 KB
Stars: 0
Watchers: 4
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.rst
- License: LICENSE

# video-frame-selector

Meta-tool that presents frames from a video to image analysis frameworks and uses the predictions to determine
whether to use a frame or not.

Frames can be presented as files (*file-polling-based*), with the predictions
then being read from files generated by the image analysis framework.
Alternatively, a [Redis](https://redis.io/) backend can be used (*redis-based*):
the images are broadcast as JPG bytes on one channel and the predictions are
received on another. This approach avoids wearing out disks.

Rather than just processing every n-th frame, a simple pruning mechanism can
be employed which discards frames that are too similar.
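The pruning idea can be sketched roughly as follows. This is an illustration only, not the tool's actual implementation: it assumes grayscale frames given as nested lists of 0-255 values, binarizes them with a threshold (compare `--bw_threshold`), and keeps a frame only if the fraction of changed pixels relative to the last kept frame exceeds a threshold (compare `--change_threshold`). All function names are hypothetical.

```python
def to_black_white(gray, bw_threshold=128):
    """Binarize a grayscale frame (rows of 0-255 ints) into rows of 0/1."""
    return [[1 if px >= bw_threshold else 0 for px in row] for row in gray]


def change_ratio(prev_bw, curr_bw):
    """Fraction (0.0-1.0) of pixels that differ between two binarized frames."""
    total = sum(len(row) for row in curr_bw)
    changed = sum(
        1
        for prev_row, curr_row in zip(prev_bw, curr_bw)
        for p, c in zip(prev_row, curr_row)
        if p != c
    )
    return changed / total if total else 0.0


def prune(frames, bw_threshold=128, change_threshold=0.0):
    """Return the indices of frames that differ enough from the last kept frame."""
    kept = []
    last_bw = None
    for index, gray in enumerate(frames):
        bw = to_black_white(gray, bw_threshold)
        # Always keep the first frame; afterwards require enough change.
        if last_bw is None or change_ratio(last_bw, bw) > change_threshold:
            kept.append(index)
            last_bw = bw
    return kept
```

Comparing against the last *kept* frame (rather than the immediately preceding one) is a design choice in this sketch: slow drift still accumulates into a kept frame eventually.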

## Installation

* create virtual environment

  ```bash
  virtualenv -p /usr/bin/python3.7 venv
  ```

* install library

  ```bash
  ./venv/bin/pip install video-frame-selector
  ```

## Supported formats

* Input

  * directory with images (.jpg, .png)
  * webcam
  * videos

* Image analysis output

  * [ROI CSV](https://github.com/waikato-ufdl/wai-annotations-roi)
  * [OPEX JSON](https://github.com/WaikatoLink2020/objdet-predictions-exchange-format)

* Output

  * JPG files (using frame number as file name) - optional metadata file in YAML format
  * MJPG video (use either .avi or .mkv as file extension for the container)
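For JPG output, the frame number is turned into a file name via an old-style Python format string (the default listed in the command-line options is `%06d.jpg`). A small illustration of what such a format string produces; the helper name `frame_file_name` is hypothetical and not part of the library:

```python
def frame_file_name(frame_no, output_format="%06d.jpg"):
    """Build an output file name from a frame number using %-formatting."""
    return output_format % frame_no


print(frame_file_name(42))                    # 000042.jpg
print(frame_file_name(42, "frame-%04d.jpg"))  # frame-0042.jpg
```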

## Command-line options

### File-polling based

```
usage: vfs-process [-h] --input DIR_OR_FILE_OR_ID --input_type
                   {image_dir,video,webcam} [--nth_frame INT]
                   [--max_frames INT] [--from_frame INT] [--to_frame INT]
                   [--prune] [--bw_threshold INT] [--change_threshold FLOAT]
                   [--analysis_input DIR] [--analysis_tmp DIR]
                   [--analysis_output DIR] [--analysis_timeout SECONDS]
                   [--analysis_type {rois_csv,opex_json}]
                   [--analysis_keep_files] [--min_score FLOAT]
                   [--required_labels LIST] [--excluded_labels LIST]
                   [--poll_interval POLL_INTERVAL] --output DIR_OR_FILE
                   --output_type {jpg,mjpg} [--output_format FORMAT]
                   [--output_tmp DIR] [--output_fps FORMAT]
                   [--crop_to_content] [--crop_margin INT]
                   [--crop_min_width INT] [--crop_min_height INT]
                   [--output_metadata] [--progress INT] [--keep_original]
                   [--verbose]

Tool for replaying videos or grabbing frames from webcam, presenting it to an
image analysis framework to determine whether to include the frame in the
output.

optional arguments:
  -h, --help            show this help message and exit
  --input DIR_OR_FILE_OR_ID
                        the dir with images, video file to read or the webcam
                        ID (default: None)
  --input_type {image_dir,video,webcam}
                        the input type (default: None)
  --nth_frame INT       every nth frame gets presented to the analysis process
                        (default: 10)
  --max_frames INT      the maximum number of processed frames before exiting
                        (<=0 for unlimited) (default: 0)
  --from_frame INT      the starting frame (incl.); ignored if <= 0 (default:
                        -1)
  --to_frame INT        the last frame to process (incl.); ignored if <= 0
                        (default: -1)
  --prune               whether to prune the images if not enough change
                        (default: False)
  --bw_threshold INT    The threshold (0-255) for the black/white conversion
                        (requires --prune) (default: 128)
  --change_threshold FLOAT
                        The threshold (0.0-1.0) for the change detection
                        (requires --prune) (default: 0.0)
  --analysis_input DIR  the input directory used by the image analysis
                        process; if not provided, all frames get accepted
                        (default: None)
  --analysis_tmp DIR    the temporary directory to place the images in before
                        moving them into the actual input directory (to avoid
                        race conditions) (default: None)
  --analysis_output DIR
                        the output directory used by the image analysis
                        process (default: None)
  --analysis_timeout SECONDS
                        the maximum number of seconds to wait for the image
                        analysis to finish processing (default: 10)
  --analysis_type {rois_csv,opex_json}
                        the type of output the analysis process generates
                        (default: rois_csv)
  --analysis_keep_files
                        whether to keep the analysis files rather than
                        deleting them (default: False)
  --min_score FLOAT     the minimum score that a prediction must have
                        (default: 0.0)
  --required_labels LIST
                        the comma-separated list of labels that the analysis
                        output must contain (with high enough scores)
                        (default: None)
  --excluded_labels LIST
                        the comma-separated list of labels that the analysis
                        output must not contain (with high enough scores)
                        (default: None)
  --poll_interval POLL_INTERVAL
                        interval in seconds for polling for result files
                        (default: 0.1)
  --output DIR_OR_FILE  the output directory or file for storing the selected
                        frames (use .avi or .mkv for videos) (default: None)
  --output_type {jpg,mjpg}
                        the type of output to generate (default: None)
  --output_format FORMAT
                        the format string for the images, see
                        https://docs.python.org/3/library/stdtypes.html#old-string-formatting
                        (default: %06d.jpg)
  --output_tmp DIR      the temporary directory to write the output images to
                        before moving them to the output directory (to avoid
                        race conditions with processes that pick up the
                        images) (default: None)
  --output_fps FORMAT   the frames per second to use when generating a video
                        (default: 25)
  --crop_to_content     whether to crop the frame to the detected content
                        (default: False)
  --crop_margin INT     the margin in pixels to use around the determined crop
                        region (default: 0)
  --crop_min_width INT  the minimum width for the cropped content (default: 2)
  --crop_min_height INT
                        the minimum height for the cropped content (default:
                        2)
  --output_metadata     whether to output a YAML file alongside the image with
                        some metadata when outputting frame images (default:
                        False)
  --progress INT        every nth frame a progress message is output on stdout
                        (default: 100)
  --keep_original       keeps the original file name when processing an image
                        dir (default: False)
  --verbose             for more verbose output (default: False)
```
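The `--analysis_tmp`, `--analysis_input`, `--analysis_output`, `--analysis_timeout` and `--poll_interval` options describe a file-based handoff: a frame is written to a temporary directory, moved into the directory the analysis process watches, and the result file is then polled for. A minimal sketch of that pattern follows. It is not the tool's code; the function names are hypothetical, and the result file name (same base name, `.csv` extension) is an assumption made for illustration.

```python
import os
import shutil
import time


def submit_frame(jpg_bytes, name, tmp_dir, input_dir):
    """Write the frame to tmp_dir first, then move it into input_dir.

    The analysis process never sees a half-written file this way
    (assuming both directories are on the same file system, so that
    the move is a plain rename).
    """
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "wb") as f:
        f.write(jpg_bytes)
    final_path = os.path.join(input_dir, name)
    shutil.move(tmp_path, final_path)
    return final_path


def wait_for_result(output_dir, name, timeout=10.0, poll_interval=0.1):
    """Poll output_dir for the result file; return its path, or None on timeout."""
    # Assumption: the result shares the frame's base name and ends in .csv.
    result_path = os.path.join(output_dir, os.path.splitext(name)[0] + ".csv")
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(result_path):
            return result_path
        time.sleep(poll_interval)
    return None
```

A caller would treat a `None` result as "analysis timed out" and decide for itself whether to skip the frame.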

### Redis-based

```
usage: vfs-process-redis [-h] --input DIR_OR_FILE_OR_ID --input_type
                         {image_dir,video,webcam} [--nth_frame INT]
                         [--max_frames INT] [--from_frame INT]
                         [--to_frame INT] [--prune] [--bw_threshold INT]
                         [--change_threshold FLOAT] [--redis_host HOST]
                         [--redis_port PORT] [--redis_db DB] --redis_out
                         CHANNEL --redis_in CHANNEL [--redis_timeout SECONDS]
                         [--analysis_type {rois_csv,opex_json}]
                         [--min_score FLOAT] [--required_labels LIST]
                         [--excluded_labels LIST] --output DIR_OR_FILE
                         --output_type {jpg,mjpg} [--output_format FORMAT]
                         [--output_tmp DIR] [--output_fps FORMAT]
                         [--crop_to_content] [--crop_margin INT]
                         [--crop_min_width INT] [--crop_min_height INT]
                         [--output_metadata] [--progress INT]
                         [--keep_original] [--verbose]

Tool for replaying videos or grabbing frames from webcam, presenting it to an
image analysis framework to determine whether to include the frame in the
output. Uses Redis to exchange data.

optional arguments:
  -h, --help            show this help message and exit
  --input DIR_OR_FILE_OR_ID
                        the dir with images, video file to read or the webcam
                        ID (default: None)
  --input_type {image_dir,video,webcam}
                        the input type (default: None)
  --nth_frame INT       every nth frame gets presented to the analysis process
                        (default: 10)
  --max_frames INT      the maximum number of processed frames before exiting
                        (<=0 for unlimited) (default: 0)
  --from_frame INT      the starting frame (incl.); ignored if <= 0 (default:
                        -1)
  --to_frame INT        the last frame to process (incl.); ignored if <= 0
                        (default: -1)
  --prune               whether to prune the images if not enough change
                        (default: False)
  --bw_threshold INT    The threshold (0-255) for the black/white conversion
                        (requires --prune) (default: 128)
  --change_threshold FLOAT
                        The threshold (0.0-1.0) for the change detection
                        (requires --prune) (default: 0.0)
  --redis_host HOST     The redis server to connect to (default: localhost)
  --redis_port PORT     The port the redis server is listening on (default:
                        6379)
  --redis_db DB         The redis database to use (default: 0)
  --redis_out CHANNEL   The redis channel to send the frames to (default:
                        None)
  --redis_in CHANNEL    The redis channel to receive the predictions on
                        (default: None)
  --redis_timeout SECONDS
                        the maximum number of seconds to wait for the image
                        analysis to finish processing (default: 10)
  --analysis_type {rois_csv,opex_json}
                        the type of output the analysis process generates
                        (default: rois_csv)
  --min_score FLOAT     the minimum score that a prediction must have
                        (default: 0.0)
  --required_labels LIST
                        the comma-separated list of labels that the analysis
                        output must contain (with high enough scores)
                        (default: None)
  --excluded_labels LIST
                        the comma-separated list of labels that the analysis
                        output must not contain (with high enough scores)
                        (default: None)
  --output DIR_OR_FILE  the output directory or file for storing the selected
                        frames (use .avi or .mkv for videos) (default: None)
  --output_type {jpg,mjpg}
                        the type of output to generate (default: None)
  --output_format FORMAT
                        the format string for the images, see
                        https://docs.python.org/3/library/stdtypes.html#old-string-formatting
                        (default: %06d.jpg)
  --output_tmp DIR      the temporary directory to write the output images to
                        before moving them to the output directory (to avoid
                        race conditions with processes that pick up the
                        images) (default: None)
  --output_fps FORMAT   the frames per second to use when generating a video
                        (default: 25)
  --crop_to_content     whether to crop the frame to the detected content
                        (default: False)
  --crop_margin INT     the margin in pixels to use around the determined crop
                        region (default: 0)
  --crop_min_width INT  the minimum width for the cropped content (default: 2)
  --crop_min_height INT
                        the minimum height for the cropped content (default:
                        2)
  --output_metadata     whether to output a YAML file alongside the image with
                        some metadata when outputting frame images (default:
                        False)
  --progress INT        every nth frame a progress message is output on stdout
                        (default: 100)
  --keep_original       keeps the original file name when processing an image
                        dir (default: False)
  --verbose             for more verbose output (default: False)
```
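How `--min_score`, `--required_labels` and `--excluded_labels` might combine can be sketched as below. This is an interpretation of the help text rather than the tool's code: predictions are assumed to be `(label, score)` pairs, *all* required labels are assumed to be needed, and the function names are hypothetical.

```python
def parse_labels(text):
    """Split a comma-separated label list such as "Goat,Cow"."""
    return [label.strip() for label in text.split(",") if label.strip()]


def keep_frame(predictions, min_score=0.0, required_labels=None, excluded_labels=None):
    """Decide whether a frame is kept, given (label, score) predictions."""
    # Only predictions with a high enough score count.
    labels = {label for label, score in predictions if score >= min_score}
    # Every required label must be present (assumption: all, not any).
    if required_labels and not set(required_labels) <= labels:
        return False
    # No excluded label may be present.
    if excluded_labels and labels & set(excluded_labels):
        return False
    return True
```

For example, `keep_frame([("Goat", 0.9), ("Cow", 0.5)], min_score=0.8, required_labels=["Goat"])` keeps the frame, while lowering the Goat score to 0.7 rejects it.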

## Examples

### File-polling based

The following example shows how to use *video-frame-selector* to feed images to a
file-based [detectron2](https://github.com/waikato-datamining/pytorch/tree/master/detectron2)
model running in a Docker container.

Directory structure:

```
/some/where
|
|- cache (pytorch cache)
|
|- data (contains videos and detectron training data)
|
|- output (detectron2 models etc)
|
|- d2 (detectron work area)
|  |
|  +- in
|  |
|  +- tmp
|  |
|  +- out
|
|- vfs (vfs work area)
|  |
|  +- in
|  |
|  +- tmp
|  |
|  +- out (contains frames to keep)
```

Running [detectron2](https://github.com/waikato-datamining/pytorch/tree/master/detectron2)
to detect farm animals (Goat, Cow, Chicken):

```bash
# start the container with an interactive shell
docker run --gpus=all --shm-size 8G -u $(id -u):$(id -g) -e USER=$USER \
    -v /some/where:/opt/projects \
    -v /some/where/cache:/.torch \
    -it public.aml-repo.cms.waikato.ac.nz:443/pytorch/detectron2:0.5

# the remaining commands use container paths (/opt/projects),
# so run them in the container's shell
DATASET=/opt/projects/data/animal_farm
OUTPUT=/opt/projects/output/animal_farm
CONFIG=mask_rcnn_R_50_FPN_1x.yaml

d2_predict \
  --model $OUTPUT/model_final.pth \
  --config $OUTPUT/$CONFIG \
  --labels $DATASET/train/labels.txt \
  --prediction_in /opt/projects/d2/in/ \
  --prediction_tmp /opt/projects/d2/tmp/ \
  --prediction_out /opt/projects/d2/out/ \
  --delete_input \
  --max_files 10 \
  --use_watchdog \
  --continuous
```

Feeding in images from a video, but only keeping frames with *Goat* detections with a score of at least 0.8:

```bash
vfs-process \
  --input "/some/where/data/my_farm.avi" \
  --input_type video \
  --nth_frame 10 \
  --analysis_input /some/where/d2/in \
  --analysis_tmp /some/where/vfs/tmp \
  --analysis_output /some/where/d2/out \
  --min_score 0.8 \
  --required_labels Goat \
  --output /some/where/vfs/out \
  --output_type jpg \
  --verbose \
  --progress 100 \
  --poll_interval 0.01 \
  --crop_to_content \
  --crop_margin 50 \
  --crop_min_width 600 \
  --crop_min_height 600 \
  --output_metadata
```
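The crop options used above (`--crop_to_content`, `--crop_margin`, `--crop_min_width`, `--crop_min_height`) suggest a computation along the following lines. This is a sketch of one plausible policy, not the tool's actual algorithm: it assumes detections are `(left, top, right, bottom)` pixel boxes, takes their union, adds the margin, widens the region around its center to the minimum size, and shifts it back inside the frame. The function name `crop_region` is hypothetical.

```python
def crop_region(boxes, frame_w, frame_h, margin=0, min_w=2, min_h=2):
    """Return (left, top, right, bottom) enclosing all boxes, or None if empty."""
    if not boxes:
        return None
    # Union of all detection boxes, expanded by the margin.
    left = min(b[0] for b in boxes) - margin
    top = min(b[1] for b in boxes) - margin
    right = max(b[2] for b in boxes) + margin
    bottom = max(b[3] for b in boxes) + margin

    def grow(lo, hi, minimum, limit):
        # Widen the interval around its center to the minimum size
        # (capped at the frame size), then shift it back inside [0, limit].
        size = min(max(hi - lo, minimum), limit)
        lo = (lo + hi - size) // 2
        lo = max(0, min(lo, limit - size))
        return lo, lo + size

    left, right = grow(left, right, min_w, frame_w)
    top, bottom = grow(top, bottom, min_h, frame_h)
    return left, top, right, bottom
```

With the values from the command above, a single 100x100 detection near the top-left corner of a 1920x1080 frame would yield a 600x600 crop anchored at the corner, since the region cannot extend past the frame.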

### Redis-based

The following example shows how to use *video-frame-selector* to feed images to a
redis-based [detectron2](https://github.com/waikato-datamining/pytorch/tree/master/detectron2)
model running in a Docker container.

A Redis instance must be running as well. This example assumes that the instance is
running on `localhost` on the default port `6379`.

Directory structure:

```
/some/where
|
|- cache (pytorch cache)
|
|- data (contains videos and detectron training data)
|
|- output (detectron2 models etc)
```

Running [detectron2](https://github.com/waikato-datamining/pytorch/tree/master/detectron2)
to detect farm animals (Goat, Cow, Chicken):

```bash
# start the container with an interactive shell
docker run --gpus=all --shm-size 8G -u $(id -u):$(id -g) -e USER=$USER \
    -v /some/where:/opt/projects \
    -v /some/where/cache:/.torch \
    -it public.aml-repo.cms.waikato.ac.nz:443/pytorch/detectron2:0.5

# the remaining commands use container paths (/opt/projects),
# so run them in the container's shell
DATASET=/opt/projects/data/animal_farm
OUTPUT=/opt/projects/output/animal_farm
CONFIG=mask_rcnn_R_50_FPN_1x.yaml

# channel names mirror the vfs-process-redis call: frames in, predictions out
d2_predict_redis \
  --model $OUTPUT/model_final.pth \
  --config $OUTPUT/$CONFIG \
  --labels $DATASET/train/labels.txt \
  --redis_in images \
  --redis_out predictions
```

Feeding in images from a video, but only keeping frames with *Goat* detections with a score of at least 0.8:

```bash
vfs-process-redis \
  --input "/some/where/data/my_farm.avi" \
  --input_type video \
  --nth_frame 10 \
  --redis_out images \
  --redis_in predictions \
  --analysis_type opex_json \
  --min_score 0.8 \
  --required_labels Goat \
  --output /some/where/vfs/out \
  --output_type jpg \
  --verbose \
  --progress 100 \
  --crop_to_content \
  --crop_margin 50 \
  --crop_min_width 600 \
  --crop_min_height 600 \
  --output_metadata
```