Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/alberto-abarzua/computer_vision_experiments

Messing around with object detection from scratch

computervision deep-learning densenet121 image-classification object-detection opencv python pytorch


Host: GitHub
URL: https://github.com/alberto-abarzua/computer_vision_experiments
Owner: alberto-abarzua
Created: 2022-06-19T21:15:38.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2022-08-09T13:58:03.000Z (over 2 years ago)
Last Synced: 2024-03-26T13:52:29.581Z (8 months ago)
Topics: computervision, deep-learning, densenet121, image-classification, object-detection, opencv, python, pytorch
Language: Jupyter Notebook
Homepage:
Size: 100 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md

README

# Computer vision Experiments

### First experiments using OpenCV and PyTorch.

### DISCLAIMER: This is at a very early stage, so there are many things to improve. (More in the section "Things to improve".)

The idea of this repo is to test out some computer vision techniques to later use them as a tool/controller for the robot arm I am developing.

[Robot Arm Repository](https://github.com/alberto-abarzua/3d_printed_robot_arm)

The main objectives to solve are:

1. Given a static camera with a static background, detect new objects that enter its FOV (field of view).
2. Provide a quick way to define new objects and object types to detect (the model should be trainable using only data gathered within a few minutes of a training procedure).
3. Classify objects mentioned in (1) using the gathered data mentioned in (2).

## For the first approach the following workflow was used. It is divided into two main sections.

### Training sequence:

1. Define the background of the initial scene. (Start up the camera without any objects in its FOV.)
2. Define the objects the model should be able to detect and classify. Place each object individually in the camera's FOV and take pictures of it in various positions and orientations (using the same cropping and ROI extraction described in "Running the object detection").
3. Use data augmentation techniques to increase the size of the training data and improve generalization.
4. Train the model on the gathered data.
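
Step 3 can be illustrated with a minimal sketch. It is not the augmentation used in this repo; it assumes the gathered ROIs are square `uint8` crops (for example 32x32) and uses only flips, 90-degree rotations and brightness changes. The names `augment` and `expand_dataset` and the `copies` parameter are hypothetical.

```python
import numpy as np


def augment(image, rng):
    """Return one randomly flipped, rotated and brightness-scaled copy of a square image."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1]  # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # rotate by 0, 90, 180 or 270 degrees
    gain = rng.uniform(0.7, 1.3)  # assumed brightness range, not taken from the repo
    out = np.clip(out.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return np.ascontiguousarray(out)


def expand_dataset(images, labels, copies=5, seed=0):
    """Keep every original image and add `copies` augmented versions of it."""
    rng = np.random.default_rng(seed)
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        out_images.append(image)
        out_labels.append(label)
        for _ in range(copies):
            out_images.append(augment(image, rng))
            out_labels.append(label)
    return np.stack(out_images), np.array(out_labels)
```

Each augmented copy keeps the label of the image it was derived from, so a few dozen pictures per object can be turned into a few hundred training samples.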

### Running the object detection:

1. Define the background of the initial scene. (Start up the camera without any objects in its FOV.)
2. For each new frame, detect the difference between it and the background. The contours obtained from this difference are the detected objects.
3. Determine a bounding box for each detected object and extract a region of interest (ROI) from it.
4. Use a CNN (in this case a modified version of DenseNet-121 implemented using PyTorch) to classify each ROI.
5. Draw the bounding boxes and the predictions on the output frame (the live camera view on screen).

## About the CNN model used.

A modified version of DenseNet-121 was implemented using PyTorch. The model takes RGB images of size 32x32 (tensor shape (3, 32, 32)) as input and classifies them into the classes defined in "Training sequence".
## Sample dataset examples

# Some examples.

## Object segmentation process.

## Various objects in the same frame.

## Things to improve

- The object detection algorithm (the way contours are found).
- The data augmentation for the creation of new datasets.
- How the ROIs are processed and sent to the classifier model (CNN).
- Many others ... (The ones just mentioned are the most important at the moment.)