https://github.com/x-tabdeveloping/visual-analytics-assignment1

First assignment for visual analytics course.

Host: GitHub
URL: https://github.com/x-tabdeveloping/visual-analytics-assignment1
Owner: x-tabdeveloping
License: mit
Created: 2024-02-23T13:23:25.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-05-13T16:31:23.000Z (almost 2 years ago)
Last Synced: 2025-02-08T17:14:40.015Z (about 1 year ago)
Language: Python
Size: 459 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# visual-analytics-assignment1

First assignment for visual analytics course.

This assignment focuses on image retrieval in the [17 Category Flower Dataset](https://www.robots.ox.ac.uk/~vgg/data/flowers/17/).

## Setup

After downloading the data, all jpg files should be arranged in one folder.

In the examples I will use the following folder structure:

```
 - data/
    - jpg/
        - files.txt
        - image_0001.jpg
        ...
```

> Beware!! The data directory in the repo is only added for demonstration purposes and for including the example images in the README.

The code ignores the `files.txt` file and scans the directory for all images.

This choice was made so that the code is easily reusable in non-indexed image datasets.

Each image is assumed to have an ID, which is the stem of the image path.

E.g. for the file `image_0001.jpg`, the ID would be `"image_0001"`.

These IDs are used later to specify which image to base the search on.
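The ID convention and the directory scan described above can be sketched with `pathlib`. This is only an illustration: the function name `collect_image_ids` and the set of accepted extensions are assumptions, not taken from the repository.

```python
from pathlib import Path

# Extensions treated as images; this set is an assumption for the sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_image_ids(images_dir):
    """Map image ID (file stem) to path for every image file in a directory."""
    return {
        path.stem: path
        for path in sorted(Path(images_dir).iterdir())
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    }


# The stem drops both the directory and the extension.
print(Path("data/jpg/image_0001.jpg").stem)  # image_0001
```

Because the filter looks only at file extensions, a file such as `files.txt` is skipped without any special handling.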

To run the code, first install the dependencies:

```bash
pip install -r requirements.txt
```

## Usage

The repository contains code for a command line interface that can search in the dataset based on one user-specified image.

The CLI can use multiple distance metrics and types of latent representations to achieve this.

### Color Histograms

To do image retrieval based on min-max normalized color histograms, run the following:

```bash
python3 src/hist_search.py data/jpg/image_0001.jpg -o "out/hist"
```

This will put a CSV file with the images closest to the target by chi-square histogram distance in the `out/hist` folder.

```
 - out/
    - hist/
        - image_0001.csv
```
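For readers who want to see the ranking step in isolation, here is a minimal pure-Python sketch of min-max normalization, chi-square distance, and top-k selection. It assumes the one-sided chi-square form that OpenCV documents for `HISTCMP_CHISQR`; the repository may use a different variant, and the helper names and toy counts below are made up for illustration.

```python
def minmax_normalize(hist):
    """Rescale histogram counts linearly so they lie in [0, 1]."""
    lo, hi = min(hist), max(hist)
    if hi == lo:  # flat histogram: avoid division by zero
        return [0.0 for _ in hist]
    return [(count - lo) / (hi - lo) for count in hist]


def chi_square_distance(h1, h2):
    """One-sided chi-square: sum((a - b)^2 / a) over bins where a != 0."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a != 0)


def top_k(query_id, histograms, k=5):
    """Return the query itself plus its k nearest neighbours, closest first."""
    query = histograms[query_id]
    distances = {
        image_id: chi_square_distance(query, hist)
        for image_id, hist in histograms.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])[: k + 1]


# Toy flattened histograms (made-up counts, not taken from the dataset).
histograms = {
    "image_0001": minmax_normalize([10, 0, 5, 5]),
    "image_0002": minmax_normalize([8, 2, 6, 4]),
    "image_0003": minmax_normalize([0, 10, 2, 8]),
}
for image_id, distance in top_k("image_0001", histograms, k=2):
    print(image_id, round(distance, 4))
```

The query image always comes first with distance 0, which is why a search with `k` neighbours yields `k + 1` rows in this sketch.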

These are the results I got:

| |Image|Distance|
|-|-|-|
|0||0.0|
|1||190.13992491162105|
|2||190.2249241130487|
|3||190.62783760197846|
|4||191.69055452774253|
|5||191.8753821638015|

As we can see, many of the images barely resemble the original, so I would not consider the performance of this method satisfactory.

The likely cause is that the histograms are too sparse (`256*256*256` bins): exactly the same colors rarely occur in two different images.

Using larger bins (that is, fewer bins per channel) would likely yield much better results.
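The effect of bin size can be illustrated with a small sketch. It stores the histogram sparsely in a `Counter` keyed by the quantized `(r, g, b)` bin, which is a choice made for this example and not how the repository computes its histograms; the pixel values are invented.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=8):
    """Count 8-bit (r, g, b) pixels per quantized color bin."""
    bin_width = 256 // bins_per_channel  # assumes bins_per_channel divides 256
    return Counter(
        (r // bin_width, g // bin_width, b // bin_width) for r, g, b in pixels
    )


# Two "images" with nearly, but not exactly, the same colors (made-up pixels).
image_a = [(250, 10, 10), (30, 200, 40)]
image_b = [(251, 11, 9), (28, 205, 44)]

for bins in (256, 8):
    shared = set(color_histogram(image_a, bins)) & set(color_histogram(image_b, bins))
    print(f"{bins} bins per channel: {len(shared)} shared non-empty bins")
```

With 256 bins per channel the two images share no occupied bin, so a bin-wise distance treats them as unrelated. With 8 bins per channel both colors fall into the same bins.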

### VGG16 

To search the images using VGG16 image embeddings and cosine distance, run this command:

```bash
python3 src/embedding_search.py data/jpg/image_0001.jpg -o "out/vgg16"
```

This will put a CSV file with the images closest to the target by cosine distance in the `out/vgg16` folder.

```
 - out/
    - vgg16/
        - image_0001.csv
```

| |Image|Distance|
|-|-|-|
|0||1.369352364832821e-08|
|1||0.00986799891034007|
|2||0.010539749814202692|
|3||0.010623027922563977|
|4||0.011068601062943717|
|5||0.011395322683994236|
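As a reference for how the distances in this table are defined, here is a minimal pure-Python sketch of cosine distance. The vectors are made-up stand-ins for embeddings, and the repository presumably relies on a numerical library instead of this hand-written version.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity: 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


query = [0.2, 0.0, 1.3, 0.7]  # made-up values standing in for an embedding

print(cosine_distance(query, query))                   # (approximately) zero
print(cosine_distance(query, [2 * x for x in query]))  # scaling does not matter
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))         # orthogonal vectors
```

Cosine distance depends only on the direction of the vectors, not on their length. The tiny non-zero distance of the query image to itself in row 0 is plausibly floating-point rounding error and not a real difference.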

### Parameters

| Argument | Description | Type | Default |
|----------|-------------|------|---------|
| `query_image_path` | Path to query image. | | - |
| `-h`, `--help` | Show this help message and exit. | | |
| `-i IMAGES_PATH`, `--images_path IMAGES_PATH` | Path to the source directory containing images. | str | `"data/jpg"` |
| `-o OUT_PATH`, `--out_path OUT_PATH` | Path to the output directory to save results. | str | `"out"` |
| `-k TOP_K`, `--top_k TOP_K` | Top K similar images to retrieve. | int | 5 |
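An interface like the one in the table could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the table alone: the function name `build_parser` and the description string are assumptions, and the actual scripts may define their parsers differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments listed in the table above."""
    parser = argparse.ArgumentParser(
        description="Search for images similar to a query image."
    )
    parser.add_argument("query_image_path", help="Path to query image.")
    parser.add_argument(
        "-i", "--images_path", type=str, default="data/jpg",
        help="Path to the source directory containing images.",
    )
    parser.add_argument(
        "-o", "--out_path", type=str, default="out",
        help="Path to the output directory to save results.",
    )
    parser.add_argument(
        "-k", "--top_k", type=int, default=5,
        help="Top K similar images to retrieve.",
    )
    return parser


# Parse the same arguments as the histogram example; unspecified options keep their defaults.
args = build_parser().parse_args(["data/jpg/image_0001.jpg", "-o", "out/hist"])
print(args.images_path, args.out_path, args.top_k)  # data/jpg out/hist 5
```

The `-h`/`--help` option does not need to be declared, since `argparse` adds it automatically.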
