Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/damiencorpataux/foodopendata-digitization

Automated extraction of data from food labels pictures



Host: GitHub
URL: https://github.com/damiencorpataux/foodopendata-digitization
Owner: damiencorpataux
Created: 2017-02-10T14:31:46.000Z (almost 8 years ago)
Default Branch: master
Last Pushed: 2017-02-10T15:48:53.000Z (almost 8 years ago)
Last Synced: 2024-11-05T16:36:46.512Z (3 months ago)
Size: 1000 Bytes
Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md


README

# foodopendata-digitization
Automated extraction of data from food labels pictures
https://hack.opendata.ch/project/67

## Scope and process
- INPUT: food label picture
- PROCESS: identify the textual data areas in the picture
  - This is an AI semantic segmentation problem; ingredients and nutrition facts datasets exist for machine learning:
    - https://www.crowdai.org/challenges/openfood-nutrition-table-challenge#resources
    - https://www.crowdai.org/challenges/openfood-ingredients-list-challenge#resources
  - A blog post on machine learning to identify letters in an image (slightly off topic, but the process is inspiring):
    - http://francescopochetti.com/text-recognition-natural-scenes/
- PROCESS: clean the cropped data areas of the picture to make them OCRable (rotation, brightness/contrast, ...)
- PROCESS: OCR and parse the text stream
- OUTPUT: structured data, that should ideally be storable
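
The last two steps (parsing the OCR text stream and producing storable structured data) could be sketched roughly as below. This is only an illustration, not code from this project: the function names `parse_nutrition_text` and `to_json`, the line pattern, and the list of units are all assumptions, and real OCR output will be noisier than the "name value unit" lines assumed here.

```python
import json
import re

# Hypothetical line pattern (an assumption, not from the project):
# a label, an optional colon, a number, and a unit, e.g. "Fat: 12,5 g".
LINE_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z \-]*?)\s*:?\s*"
    r"(?P<value>\d+(?:[.,]\d+)?)\s*(?P<unit>kcal|kJ|mg|g)\b",
    re.IGNORECASE,
)


def parse_nutrition_text(ocr_text):
    """Turn raw OCR text into a list of {name, value, unit} records."""
    records = []
    for line in ocr_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like "name value unit"
        records.append({
            "name": match.group("name").strip().lower(),
            # Accept both decimal comma and decimal point.
            "value": float(match.group("value").replace(",", ".")),
            "unit": match.group("unit").lower(),
        })
    return records


def to_json(records):
    """Serialize the records so they can be stored or exchanged."""
    return json.dumps(records, indent=2, sort_keys=True)


if __name__ == "__main__":
    sample = "NUTRITION FACTS\nEnergy 250 kcal\nFat: 12,5 g\nof which sugars 3.2g"
    print(to_json(parse_nutrition_text(sample)))
```

Lines that do not match the pattern are silently dropped, which keeps the sketch short; a real pipeline would probably want to log them, since they often indicate OCR errors worth fixing in the cleaning step.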

## Legacy resources
- This guy made a prototype for OCRing text on an image with noise
http://francescopochetti.com/text-recognition-natural-scenes/#second
- A ridiculously basic OCR prototype (might not always be accessible, since this is a dev server)
http://mien.ch:5555/fod/