https://github.com/damiencorpataux/foodopendata-digitization

# foodopendata-digitization
Automated extraction of data from food labels pictures
https://hack.opendata.ch/project/67

## Scope and process
- INPUT: food label picture
- PROCESS: identify the textual data areas in the picture
  - This is an AI semantic segmentation problem; ingredients and nutrition facts datasets exist for machine learning:
    - https://www.crowdai.org/challenges/openfood-nutrition-table-challenge#resources
    - https://www.crowdai.org/challenges/openfood-ingredients-list-challenge#resources
  - A blog post on using machine learning to identify letters in an image (slightly off topic, but the process is inspiring):
    - http://francescopochetti.com/text-recognition-natural-scenes/
- PROCESS: clean the cropped data areas of the picture to make them OCRable (rotation, brightness/contrast, ...)
- PROCESS: OCR the cleaned areas and parse the text stream
- OUTPUT: structured data, which should ideally be storable
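
The last two steps (parsing the OCR text stream and producing storable structured data) could look roughly like the sketch below. It is not the project's implementation: the function names `parse_nutrition_text` and `to_json`, the line pattern, and the short list of units are all assumptions made for illustration, and the input is assumed to be plain text already produced by some OCR engine. It uses only the Python standard library.

```python
import json
import re

# Assumed line shape: "<name> <number> <unit>", e.g. "Fat 3,5 g" or "Energy: 250 kcal".
# Real labels vary a lot; this pattern is only an illustration.
LINE_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z \-]*?)\s*:?\s*"
    r"(?P<value>\d+(?:[.,]\d+)?)\s*(?P<unit>kcal|kJ|mg|g)\b",
    re.IGNORECASE,
)

# Canonical spelling for each unit, whatever casing the OCR produced.
UNITS = {"kcal": "kcal", "kj": "kJ", "mg": "mg", "g": "g"}


def parse_nutrition_text(ocr_text):
    """Turn raw OCR output into a list of {name, value, unit} records."""
    records = []
    for line in ocr_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines the pattern cannot interpret
        records.append({
            "name": match.group("name").strip().lower(),
            # Accept both "3.5" and "3,5" as decimal notation.
            "value": float(match.group("value").replace(",", ".")),
            "unit": UNITS[match.group("unit").lower()],
        })
    return records


def to_json(records):
    """Serialize the records so they can be stored in a file or database."""
    return json.dumps(records, sort_keys=True)
```

Calling `to_json(parse_nutrition_text(text))` on the OCR output yields a JSON string that can be written to a file or a database column. Lines that do not fit the pattern are silently skipped, and a header such as "per 100 g" would be misread as a nutrient, so a real parser would need label-specific rules on top of this.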

## Legacy resources
- A prototype for OCRing text in a noisy image:
http://francescopochetti.com/text-recognition-natural-scenes/#second
- A ridiculously basic OCR prototype (hosted on a dev server, so it may not always be accessible):
http://mien.ch:5555/fod/