https://github.com/kirill-sidorchuk/lego_detector

Lego parts classifier using deep neural networks

classification computer convolutional deep learning lego network vision

Host: GitHub
URL: https://github.com/kirill-sidorchuk/lego_detector
Owner: kirill-sidorchuk
Created: 2017-10-15T13:35:18.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-03-05T09:06:41.000Z (almost 6 years ago)
Last Synced: 2024-08-03T23:12:21.203Z (6 months ago)
Topics: classification, computer, convolutional, deep, learning, lego, network, vision
Language: Python
Homepage:
Size: 1.5 MB
Stars: 24
Watchers: 3
Forks: 3
Open Issues: 1
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome-lego-machine-learning - Lego Detector [2019.03] - Code for training a classifier. (Parts Classification / Code)

README

# Lego parts classifier

## Workflow
* Collect data: take photos with multiple parts in each image
* Prepare dataset: segment out individual parts and sort the results
* Finalize dataset: create train/validation split
* Train a model: fine-tune an existing model pretrained on ImageNet
* Test model

## Data Repository
[Google Drive Folder](https://drive.google.com/drive/folders/1qJFMb3k_JA_EefiI5HkEDkGxsIrhqmCC?usp=sharing)

## Python
We use Python 3 so that TensorFlow can run on Windows.

## Dataset directory structure
dataset root/
* raw - raw photos from camera (user input)
* downsampled - raw photos downsampled to 1024 px on longer side (generated from raw photos)
* masks - masks generated from downsampled raw images (these are 4-value masks for the GrabCut algorithm). You can edit the masks in a graphical editor to improve segmentation.
* segmentation - segmentation results of GrabCut algorithm - all background pixels are zeroed.
* parts - contains extracted individual parts for each image
* sorted - user-sorted parts. Each subdirectory contains parts belonging to the same class. The subdirectory name is used as the class label.
* test - raw photos to use as a test set. File names are regarded as true labels.
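
The "downsampled to 1024 px on longer side" rule can be illustrated with a small size calculation. This is only a sketch: the function name `downsampled_size`, the rounding, and the choice to leave already-small images unchanged are assumptions, not taken from the repository's code.

```python
def downsampled_size(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Return (width, height) scaled so that the longer side equals max_side.

    Images whose longer side is already <= max_side are returned unchanged
    (an assumption; the original script may behave differently).
    """
    longer = max(width, height)
    if longer <= max_side:
        return width, height
    scale = max_side / longer
    # Round to the nearest pixel, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(downsampled_size(4000, 3000))
```

The aspect ratio is preserved, so a 4:3 camera photo stays 4:3 after downsampling.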

## Workflow command lines

python3 PrepareDataset.py data_root_dir\
Runs segmentation step. The procedure is as follows:\
For each raw image in the 'raw' directory:
* create a downsampled image if it does not already exist
* create a 4-value foreground segmentation mask for GrabCut if it does not already exist in the 'masks' dir
* run GrabCut to segment out the background (if the segmentation result does not already exist)
* extract individual parts from segmentation results (always overwrites results)
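
To make the "4-value mask" and "background pixels are zeroed" ideas concrete, here is a minimal sketch. The four label values match OpenCV's GrabCut constants; everything else (the helper name `zero_background`, and plain nested lists standing in for image arrays) is an assumption for illustration, and the actual script presumably works on OpenCV/NumPy arrays.

```python
# OpenCV's GrabCut mask labels:
# cv2.GC_BGD = 0, cv2.GC_FGD = 1, cv2.GC_PR_BGD = 2, cv2.GC_PR_FGD = 3
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def zero_background(image, mask):
    """Return a copy of image with definite and probable background set to 0."""
    background = {GC_BGD, GC_PR_BGD}
    return [
        [0 if m in background else pixel for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


image = [[10, 20], [30, 40]]
mask = [[GC_BGD, GC_FGD], [GC_PR_FGD, GC_PR_BGD]]
print(zero_background(image, mask))  # [[0, 20], [30, 0]]
```

Editing a mask by hand amounts to repainting pixels with one of these four values before GrabCut is re-run.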

\
python3 FinalizeDataset.py data_root_dir\
Creates a train/validation split from the 'sorted' directory.
The results, train.txt and val.txt, are written to the data root dir.
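
A per-class train/validation split might look roughly like the sketch below. The README does not document the split ratio or the format of train.txt and val.txt, so the 20% validation fraction, the fixed seed, and the `(file, label)` tuples are all hypothetical choices.

```python
import random


def split_dataset(samples_by_class, val_fraction=0.2, seed=0):
    """Split {label: [file, ...]} into train and validation lists of (file, label)."""
    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(samples_by_class):
        files = sorted(samples_by_class[label])
        rng.shuffle(files)
        # Hold out a fixed fraction of every class, but keep at least one
        # training sample whenever the class has any files at all.
        n_val = min(int(len(files) * val_fraction), max(len(files) - 1, 0))
        val += [(f, label) for f in files[:n_val]]
        train += [(f, label) for f in files[n_val:]]
    return train, val


samples = {
    "brick_2x4": [f"brick_{i}.png" for i in range(10)],
    "plate_1x2": [f"plate_{i}.png" for i in range(5)],
}
train, val = split_dataset(samples)
print(len(train), len(val))  # 12 3
```

Splitting class by class keeps rare parts represented in both lists, which a single global shuffle does not guarantee.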

\
python3 Finetune.py --model A --snapshot weights-784-0.969.hdf5\
Runs training.\
--model specifies model name to use.\
--snapshot file [optional] specifies a snapshot file to restart training from.\
--debug_epochs N [optional] specifies the number of training epochs for which the images fed to the network are saved. These images are written to the data_root_dir/debug directory.

\
python3 Predict.py new_test\sorted measure --tta 2 --rtta 1 --model A --snapshot weights-784-0.969.hdf5\
Runs prediction on test set (from 'new_test\sorted' subdirectory).\
There are two modes: 'measure' and 'sort'
* measure calculates top1 and top5 accuracies for images with known labels, stored in directories (one directory per class). Classification results and final accuracies are written to the console.

* sort will sort images with unknown labels into directories - dir name will correspond to predicted class label.
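
For reference, top1 and top5 accuracy as reported by 'measure' mode can be computed as in this sketch. The function name `top_k_accuracy` and the dict-based score format are assumptions chosen for readability; they are not the script's actual data structures.

```python
def top_k_accuracy(scores, true_labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: one {label: score} dict per sample.
    true_labels: the known label of each sample, in the same order.
    """
    hits = 0
    for sample_scores, truth in zip(scores, true_labels):
        ranked = sorted(sample_scores, key=sample_scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


scores = [
    {"brick": 0.7, "plate": 0.2, "tile": 0.1},
    {"brick": 0.5, "plate": 0.1, "tile": 0.4},
]
labels = ["brick", "tile"]
print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=2))  # 1.0
```

Top5 is always at least as high as top1, because the best-scoring class is one of the five best.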

--model and --snapshot arguments are the same as for 'Finetune.py' script.\
--tta 0 means no test time data augmentation, 1 - do vertical flip, 2 - do vertical and horizontal flips.\
--rtta <1 means no robot test time data augmentation; 2 or more means that many images are taken from the sorted directory and their results are averaged.\
--tta_mode 'mean' or 'majority' aggregation method for TTA. Default is 'mean'.
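
The difference between the 'mean' and 'majority' aggregation modes can be shown with a short sketch. It assumes each augmented view yields a list of class probabilities; the function name `aggregate` and its return value (a class index) are hypothetical and may not match how Predict.py is organized.

```python
from collections import Counter


def aggregate(predictions, mode="mean"):
    """Combine per-view class probabilities into one predicted class index."""
    n_classes = len(predictions[0])
    if mode == "mean":
        # Average the probabilities over all views, then pick the best class.
        mean = [sum(p[c] for p in predictions) / len(predictions) for c in range(n_classes)]
        return max(range(n_classes), key=mean.__getitem__)
    if mode == "majority":
        # Each view votes for its own best class; the most frequent vote wins.
        votes = Counter(max(range(n_classes), key=p.__getitem__) for p in predictions)
        return votes.most_common(1)[0][0]
    raise ValueError(f"unknown tta_mode: {mode!r}")


views = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
print(aggregate(views, "mean"))      # 0
print(aggregate(views, "majority"))  # 1
```

In this example one very confident view outweighs two mildly confident ones under 'mean', while 'majority' ignores confidence and counts votes only.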

\
python3 data_root_dir camera_index --tta 3 --rtta 1 --model E --snapshot weights-745-0.875.hdf5\
Runs predictions from web camera.\
data_root_dir - data root dir. Needed to find model snapshots (which are in the 'snapshots' subdirectory)\
camera_index - index of web camera in the system. Default is 0\
--tta - test time augmentation level (see Predict.py arguments description)\
--rtta - robot test time augmentation level (see Predict.py arguments description)\
--model - model name to use\
--snapshot - snapshot file to load. Snapshots should be stored in data_root\snapshots\model_name\\
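
The snapshot location described above (data root, then 'snapshots', then the model name) can be expressed as a path lookup. This is a sketch only; the helper name `snapshot_path` is hypothetical, and the scripts may build the path differently.

```python
from pathlib import PurePath


def snapshot_path(data_root, model_name, snapshot_file):
    """Build <data_root>/snapshots/<model_name>/<snapshot_file>."""
    return PurePath(data_root) / "snapshots" / model_name / snapshot_file


print(snapshot_path("data_root", "E", "weights-745-0.875.hdf5"))
```

Using `pathlib` avoids hard-coding the Windows backslash separator, so the same code works on other platforms.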

## Results

These results were obtained using command line params as follows:\
--tta 3 --rtta 3 --tta_mode mean\
Top 1 and top 5 accuracies were measured on a test set of 386 files

#### ResNet50 (24M params)
Best snapshot: model C, weights-860-0.917.hdf5\
top1: 91.4%\
top5: 95.7%\
predict time per image: 900 ms

#### MobileNet (5M params)
Best snapshot: model E, weights-745-0.875.hdf5\
top1: 80.7%\
top5: 94.6%\
predict time per image: 347 ms

#### InceptionV3 (24M params)
Best snapshot: model F, weights-557-0.910.hdf5\
top1: 76.3%\
top5: 94.6%\
predict time per image: 514 ms