https://github.com/woctezuma/steam-filtered-image-data

Build the Steam-OneFace dataset.

data image-dataset image-datasets steam steam-api steam-game steam-games steam-oneface


# Steam Filtered Image Data

This repository provides details about image data:
- downloaded from Steam,
- **filtered** following parts of the procedure in [`download-steam-banners-data`][download-steam-banners-data].

A dataset, called Steam-OneFace, is shared [in this section][steam-oneface-section].

## Data

Data downloaded from Steam consists of:
- PNG logos with transparency,
- JPG vertical banners.

Filtered image data is shared [on Google Drive][filtered-data-on-gdrive].

The *up-to-date* list of appIDs for which I have *tried* to download image data is available in `data/download-queries/app_ids.txt`.
Most of these ~48k appIDs do not feature any image data.

Here is an example of a vertical banner:

![Example of a vertical banner][vertical-banner-example]

Here is an example of a logo:

![Example of a logo][logo-example]

## Filtering

Filtering removes:
- blank images, i.e. images whose grayscale intensity extrema are equal,
- images of uncommon resolution:
  - anything but 640x360 for logos,
  - anything but 300x450 for vertical banners,
- images with uncommon bands:
  - anything but RGBA for logos,
  - anything but RGB for banners,
- images for which the bounding box of the non-zero regions covers:
  - 100% of the image space for transparent logos,
  - strictly less than 100% of the image space for banners, which happens with [vignetting][vignetting-wiki].
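The rules above can be sketched as a single predicate. This is a minimal sketch, not the repository's actual code: the names `keep_image`, `describe_image` and `EXPECTED` are hypothetical, and it assumes Pillow is used to read the image properties (`Image.getextrema` and `Image.getbbox`).

```python
# Expected (width, height) and band layout per image kind, taken from the list above.
EXPECTED = {
    "logo": {"size": (640, 360), "mode": "RGBA"},
    "banner": {"size": (300, 450), "mode": "RGB"},
}


def keep_image(kind, size, mode, gray_extrema, bbox):
    """Return True if an image passes every filter (hypothetical helper).

    kind: "logo" or "banner"
    size: (width, height)
    mode: band layout, e.g. "RGB" or "RGBA"
    gray_extrema: (min, max) intensity of the grayscale version
    bbox: (left, upper, right, lower) of the non-zero region, or None
    """
    expected = EXPECTED[kind]
    if gray_extrema[0] == gray_extrema[1]:
        return False  # blank image
    if size != expected["size"]:
        return False  # uncommon resolution
    if mode != expected["mode"]:
        return False  # uncommon bands
    covers_everything = bbox == (0, 0, size[0], size[1])
    if kind == "logo":
        return not covers_everything  # a transparent logo should leave empty space
    return covers_everything  # a banner should fill the frame (no vignetting)


def describe_image(path):
    """Read the properties used by keep_image with Pillow (assumed dependency)."""
    from PIL import Image  # imported lazily so keep_image works without Pillow

    with Image.open(path) as img:
        gray_extrema = img.convert("L").getextrema()
        return img.size, img.mode, gray_extrema, img.getbbox()
```

With Pillow installed, a banner file could then be checked with `keep_image("banner", *describe_image(path))`.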

## Suggestions of filtering

Suggestions of filtering include:
- removal of duplicate images with [`imagededup`][imagededup],
- filtering based on the number of detected faces, as in [`steam-face-detection`][steam-face-detection].

Applying these filters is left to the reader.
Otherwise, it would be difficult to keep the **filtered** data up to date.
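As an illustration of the second suggestion, a face-count filter reduces to a simple lookup once a detector has run. This is a minimal sketch, assuming detections are stored as a mapping from appID to a list of face bounding boxes; the name `keep_single_face` and the sample data are hypothetical, and no specific detector API is implied.

```python
def keep_single_face(detections, target_count=1):
    """Return the sorted appIDs whose number of detected faces equals target_count."""
    return sorted(
        app_id for app_id, faces in detections.items() if len(faces) == target_count
    )


# Made-up detections: appID -> list of (left, top, right, bottom) face boxes.
detections = {
    10: [(40, 60, 120, 160)],
    20: [],
    30: [(10, 10, 50, 50), (200, 30, 260, 100)],
}
print(keep_single_face(detections))  # [10]
```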

## Steam-OneFace dataset

The notebook [`build_steam_oneface_dataset.ipynb`][steam-oneface-notebook] shows an application of the filters suggested above.
[![Open In Colab][colab-badge]][steam-oneface-notebook]

There are three options for the face detector:
- [`face-alignment`][python-face-alignment]
- [`retinaface_pytorch`][retinaface]
- [`dlib`][dlib-github]

### With the `face_alignment` module

This yields a dataset, called `Steam-OneFace`, of 1688 images, each of which should feature exactly one face.

This dataset is shared [on Google Drive][steam-oneface-gdrive] in:
- the original resolution (300x450): [`steam-oneface-hr.tar.gz`][steam-oneface-hr] (94 MB)
- a lower resolution (256x256): [`steam-oneface-lr.tar.gz`][steam-oneface-lr] (52 MB)

[![Thumbnails of Steam OneFace][steam-oneface-cover-small]][steam-oneface-cover-big]

To use this dataset on Google Colab, run the following:
```bash
!gdown --id 1QptHrW9vloTtP--YJsxMY8PZWI2D8NJt
!tar xf steam-oneface-lr.tar.gz
```
```python
import glob
from pathlib import Path

file_names = glob.glob('steam-oneface-lr/*.jpg')
app_ids = [int(Path(fname).stem) for fname in file_names]
```

### With the `retinaface` module

The dataset consists of 2472 images, shared in:
- the original resolution (300x450): [`steam-oneface-hr_with_retinaface.tar.gz`][steam-oneface-hr-retinaface] (133 MB)
- a lower resolution (256x256): [`steam-oneface-lr_with_retinaface.tar.gz`][steam-oneface-lr-retinaface] (74 MB)

To use this dataset on Google Colab, run the following:
```bash
!gdown --id 1-0Nk7H6Cn3Nt60EdHG_NWSA8ohi2oBqr
!tar xf steam-oneface-lr_with_retinaface.tar.gz
```

### With the `dlib` module

The dataset consists of 305 images, shared in:
- the original resolution (300x450): [`steam-oneface-hr_with_dlib.tar.gz`][steam-oneface-hr-dlib] (16 MB)
- a lower resolution (256x256): [`steam-oneface-lr_with_dlib.tar.gz`][steam-oneface-lr-dlib] (9 MB)

To use this dataset on Google Colab, run the following:
```bash
!gdown --id 1-4RIn9G9Bee2JZ1bK1gkkgkLocHuWJJ4
!tar xf steam-oneface-lr_with_dlib.tar.gz
```

### With several detection modules

The notebook [`trim_steam_oneface_dataset.ipynb`][steam-oneface-notebook-trim] trims the dataset by intersecting the results of different detectors.
[![Open In Colab][colab-badge]][steam-oneface-notebook-trim]
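Intersecting detector results amounts to keeping only the appIDs that every detector retained. The following is a minimal sketch under that assumption, not the notebook's code: the function name `intersect_detectors` and the appIDs are made up for illustration.

```python
def intersect_detectors(results_by_detector):
    """Return the sorted appIDs kept by every detector."""
    id_sets = [set(app_ids) for app_ids in results_by_detector.values()]
    if not id_sets:
        return []
    return sorted(set.intersection(*id_sets))


# Made-up appIDs for illustration only.
results = {
    "face_alignment": [10, 20, 30, 40],
    "retinaface": [20, 30, 40, 50],
    "dlib": [30, 60],
}
small = intersect_detectors({k: results[k] for k in ("face_alignment", "retinaface")})
tiny = intersect_detectors(results)
print(small)  # [20, 30, 40]
print(tiny)   # [30]
```

Adding a detector can only shrink the result, which is consistent with the trimmed datasets being smaller than any single-detector dataset.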

The trimmed datasets are:

#### Steam-OneFace-small

- [`Steam-OneFace-small`][steam-oneface-small-gdrive]:
  - 993 images,
  - obtained with modules `face_alignment` and `retinaface`.

To use this dataset on Google Colab, run the following:
```bash
!gdown --id 1-1V5fDhPo75iDtAbrD18rppV-lf51bPW
!tar xf steam-oneface-small-lr.tar.gz
```

#### Steam-OneFace-tiny

- [`Steam-OneFace-tiny`][steam-oneface-tiny-gdrive]:
  - 168 images,
  - obtained with modules `dlib`, `face_alignment` and `retinaface`.

To use this dataset on Google Colab, run the following:
```bash
!gdown --id 1-2sCVgBUmu6LFug1pzBfmL8zNFFBq27F
!tar xf steam-oneface-tiny-lr.tar.gz
```

![Thumbnails of the WHOLE dataset Steam OneFace Tiny][steam-oneface-tiny-as-grid]

## References

- To download images: [`download_steam_banners.ipynb`][download_steam_banners] in [`woctezuma/google-colab`][code]
- To filter out duplicates, etc.:
  - for PNG logos: [`remove_duplicates.ipynb`][filter_steam_logos] in [`woctezuma/google-colab`][code]
  - for JPG banners: [`remove_duplicates.ipynb`][filter_steam_banners] in [`woctezuma/steam-stylegan2-ada`][code-ada]
- To detect faces: [`detect_faces_on_steam_banners.ipynb`][colab-notebook-face-detection] in [`woctezuma/steam-face-detection`][steam-face-detection]

[download-steam-banners-data]:
[steam-oneface-section]:

[logo-example]:
[vertical-banner-example]:

[filtered-data-on-gdrive]:

[vignetting-wiki]:

[imagededup]:
[steam-face-detection]:

[steam-oneface-notebook]:
[dlib-github]:
[python-face-alignment]:
[retinaface]:
[steam-oneface-gdrive]:
[steam-oneface-hr]:
[steam-oneface-lr]:
[steam-oneface-cover-small]:
[steam-oneface-cover-big]:
[steam-oneface-hr-retinaface]:
[steam-oneface-lr-retinaface]:
[steam-oneface-hr-dlib]:
[steam-oneface-lr-dlib]:

[steam-oneface-notebook-trim]:
[steam-oneface-small-gdrive]:
[steam-oneface-tiny-gdrive]:
[steam-oneface-tiny-as-grid]:

[colab-badge]:

[code]:
[code-ada]:
[download_steam_banners]:
[filter_steam_logos]:
[filter_steam_banners]:
[colab-notebook-face-detection]: