# Scenescoop
Scenescoop is a tool to get semantically similar scenes from a pair of videos. Basically, you input a video and get a scene that has a similar meaning in another video. You can run it as a Python script or as a web app.
![description](static/imgs/description2.png)
## How it works
Scenescoop uses the [im2txt](https://github.com/tensorflow/models/tree/master/research/im2txt) TensorFlow model to analyze videos on a frame-by-frame basis and get a description of the content of those frames. Frames with the same description are grouped together to create a sequence or scene.
Scene descriptions are then analyzed with [spaCy](https://spacy.io/), which parses each sentence and represents it as the average of its built-in word vectors.
Finally, [Annoy](https://github.com/spotify/annoy) is used to build an index over those vectors for fast nearest-neighbor lookup (based on [@aparrish](https://github.com/aparrish)'s [Plot to poem](https://github.com/aparrish/plot-to-poem/blob/master/plot-to-poem.ipynb)).
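As a minimal sketch of that lookup step (not Scenescoop's actual code; the spaCy model name and the example captions are assumptions):

```
# Sketch of the spaCy + Annoy pipeline: embed scene descriptions as
# averaged word vectors, index them, and query for the nearest scene.
import spacy
from annoy import AnnoyIndex

nlp = spacy.load("en_core_web_md")  # any spaCy model with word vectors

scenes = [
    "a man sitting at a table with a plate of food",
    "a group of people walking down the street",
]

# doc.vector is the average of the word vectors in the sentence
dim = nlp.vocab.vectors_length
index = AnnoyIndex(dim, "angular")
for i, text in enumerate(scenes):
    index.add_item(i, nlp(text).vector)
index.build(10)  # 10 trees; more trees trade build time for accuracy

# find the scene closest in meaning to a new description
match = index.get_nns_by_vector(nlp("a person eating dinner").vector, 1)[0]
print(scenes[match])
```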
This project is inspired by [Thingscoop](https://github.com/agermanidis/thingscoop).
## Video Demos
### [A man sitting at a table with a plate of food](https://youtu.be/ZF5W_tcnF4s)
[![A man sitting at a table with a plate of food](static/imgs/food.png)](https://youtu.be/ZF5W_tcnF4s)

### [A group of people walking down the street](https://youtu.be/aaYVMsMMEjc)
[![A group of people walking down the street](static/imgs/street.png)](https://youtu.be/aaYVMsMMEjc)

## Usage
To run this you'll need to install a few dependencies. You can follow the instructions in the [original im2txt repository](https://github.com/tensorflow/models/tree/master/research/im2txt) or the ones [Edouard Fouché](https://edouardfouche.com/Fun-with-Tensorflow-im2txt/) wrote.
(I plan to write a step-by-step guide on how to install everything.)

You can also get the pretrained model I'm using [here](https://drive.google.com/open?id=1tSTzD21qXXOiXlfgJllgXNZ9lREy6yij).
Once everything is installed, clone the repo and install the project dependencies:
```
git clone https://github.com/cvalenzuela/scenescoop.git
cd scenescoop
pip install -r requirements.txt
```

You can then run Scenescoop in two modes:
### 1) Frame Analysis Mode
Given a video file `--video` (.mp4, .avi, .mkv or .mov), this will analyze the file frame by frame and output a `.json` file containing the descriptions of those frames. The `--name` argument sets the output name of the transcript.
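Conceptually, this mode captions every frame and groups frames that share a description. A hypothetical sketch (OpenCV as the frame reader and `caption_frame` standing in for im2txt inference are assumptions, not the project's actual code):

```
# Caption each frame and group frames that share a description.
import json
import cv2  # an assumption; any frame reader would do

def build_transcript(video_path, caption_frame):
    capture = cv2.VideoCapture(video_path)
    transcript = {}  # description -> list of frame numbers
    frame_number = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        description = caption_frame(frame)  # im2txt inference goes here
        transcript.setdefault(description, []).append(frame_number)
        frame_number += 1
    capture.release()
    return transcript

# transcript = build_transcript("videos/moonrisekingdom.mp4", caption_frame)
# with open("transcripts/moonrisekingdom.json", "w") as f:
#     json.dump(transcript, f, indent=2)
```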
Example:
```
python scenescoop.py --video videos/moonrisekingdom.mp4 --name moonrisekingdom
```

The `.json` file should look something like this:
```
{
  ...
  "a person is taking a picture of themselves in a mirror ": [4834],
  "a man sitting in the back of a pickup truck ": [2265, 2266],
  "a man sitting on a bench in front of a building ": [1935, 1937,
      1938, 3950, 3951, 3952, 3953, 3960, 4072, 4073, 4074, 4075,
      4077, 4079, 4080, 4082, 4115, 4467],
  "a man standing next to a tree holding a surfboard ": [2470]
  ...
}
```

### 2) Transfer Mode
Two videos are required for this mode, and both must already have a transcript `.json` file created in Frame Analysis Mode.
The `--input_data` argument is the `.json` file containing the data for the input video, and `--transform_data` is the `.json` file for the transfer video. `--input_seconds` is the input time window to transfer, and `--transform_src` is the video source of the transfer video.
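To make `--input_seconds` concrete, here is a hypothetical sketch of how a seconds window maps onto a transcript (the frame rate is an assumption, and this is not Scenescoop's actual matching logic):

```
# Collect the descriptions whose frames fall inside an --input_seconds
# window such as "0,5".
import json

FPS = 24  # assumed frame rate; depends on the source video

def descriptions_in_window(transcript_path, seconds):
    start, end = (int(s) for s in seconds.split(","))
    first, last = start * FPS, end * FPS
    with open(transcript_path) as f:
        transcript = json.load(f)
    return [
        description
        for description, frames in transcript.items()
        if any(first <= frame <= last for frame in frames)
    ]

print(descriptions_in_window("transcripts/street.json", "0,5"))
```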
Example:
```
python scenescoop.py --input_data transcripts/street.json --input_seconds 0,5 --transform_src videos/her.avi --transform_data transcripts/her.json
```

You can print all options with `python scenescoop.py -h`:
```
usage: scenescoop.py [-h] [--video VIDEO] [--name NAME]
                     [--input_data INPUT_DATA] [--input_seconds INPUT_SECONDS]
                     [--transform_src TRANSFORM_SRC]
                     [--transform_data TRANSFORM_DATA] [--api API]

Storiescoop

optional arguments:
  -h, --help            show this help message and exit
  --video VIDEO         Video Source to transform
  --name NAME           Name of the video
  --input_data INPUT_DATA
                        Input Video. Must be a json file.
  --input_seconds INPUT_SECONDS
                        Input Video Seconds to create transformation.
                        Example: 1,30
  --transform_src TRANSFORM_SRC
                        Transform Video Source.
  --transform_data TRANSFORM_DATA
                        Transform Video Data. Must be a json file.
  --api API             API Request
```

## Web App
You can also launch an interactive web app, served by Flask, to run Frame Analysis Mode and Transfer Mode from a webpage. You'll still need all the dependencies installed.
![description](static/imgs/demo.gif)
To run the app on a local server:
```
python server.py
```

Then visit `localhost:8080`.
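For orientation, a minimal Flask sketch of such a server (the route and responses are illustrative assumptions, not Scenescoop's actual API):

```
# Minimal Flask server sketch; route names are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/analyze", methods=["POST"])
def analyze():
    # Frame Analysis Mode would run here on the uploaded video
    name = request.form.get("name", "untitled")
    return jsonify(status="queued", name=name)

if __name__ == "__main__":
    app.run(port=8080)
```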
To modify the source code:
```
cd static
yarn watch
```

## MMS
Local development of the MMS application:
Start ngrok:
```
./ngrok http 7676
```

Configure the URL in Twilio and in the server's `NGROK_URL` variable.

Start the Redis server:
```
redis-server
```

Start the Celery worker (see the sketch after these steps):
```
celery -A server.celery worker
```

Finally, start the server:
```
python server.py
```
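As a reference for the steps above, a hypothetical sketch of how a Celery app can be wired to the local Redis broker (module and task names are assumptions; see `server.py` in the repo for the real setup):

```
# Hypothetical Celery + Redis wiring for the worker started above.
from celery import Celery

celery = Celery("server", broker="redis://localhost:6379/0")

@celery.task
def transform_video(input_data, transform_data, input_seconds):
    # long-running Transfer Mode work runs off the request thread,
    # so the MMS webhook can return to Twilio immediately
    ...
```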
## License
MIT