https://github.com/damiencorpataux/foodopendata-digitization
Automated extraction of data from food labels pictures
- Host: GitHub
- URL: https://github.com/damiencorpataux/foodopendata-digitization
- Owner: damiencorpataux
- Created: 2017-02-10T14:31:46.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2017-02-10T15:48:53.000Z (almost 8 years ago)
- Last Synced: 2024-11-05T16:36:46.512Z (3 months ago)
- Size: 1000 Bytes
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# foodopendata-digitization
Automated extraction of data from food labels pictures
https://hack.opendata.ch/project/67

## Scope and process
- INPUT: food label picture
- PROCESS: identify the textual data areas in the picture
  - This is an AI semantic segmentation problem; datasets of ingredients lists and nutrition tables exist for machine learning:
    - https://www.crowdai.org/challenges/openfood-nutrition-table-challenge#resources
    - https://www.crowdai.org/challenges/openfood-ingredients-list-challenge#resources
  - A blog post on using machine learning to identify letters in an image (slightly off topic, but the process is inspiring):
    - http://francescopochetti.com/text-recognition-natural-scenes/
- PROCESS: clean the cropped data areas of the picture to make them OCRable (rotation, brightness/contrast, ...)
- PROCESS: OCR the cleaned areas and parse the resulting text stream
- OUTPUT: structured data that should ideally be storable

## Legacy resources
- A prototype for OCRing text in a noisy image:
  http://francescopochetti.com/text-recognition-natural-scenes/#second
- A ridiculously basic OCR prototype (might not always be accessible, since it runs on a dev server):
  http://mien.ch:5555/fod/
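
## Example: parsing the OCR text stream

The last two steps under "Scope and process" (parse the OCR text stream, then output storable structured data) could look roughly like the sketch below. It is not code from this repository. It assumes the OCR step has already produced plain text with one nutrition entry per line, in a "name value unit" shape; the function name `parse_nutrition_text`, the regular expression, the unit list, and the sample text are all hypothetical choices made for illustration.

```python
import json
import re

# Assumed line shape: a nutrient name, an optional ":" or ".", a number, a unit.
# The unit list is illustrative and far from complete.
LINE_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z \-]*?)\s*[:.]?\s*"
    r"(?P<value>\d+(?:[.,]\d+)?)\s*(?P<unit>kcal|kJ|mg|g)\b",
    re.IGNORECASE,
)

# Made-up sample of what an OCR pass over a nutrition table might return.
SAMPLE_OCR_TEXT = """Energy 250 kcal
Fat: 12,5 g
Sodium 0.4 g
--- noise ---"""


def parse_nutrition_text(ocr_text):
    """Turn raw OCR lines into a list of {name, value, unit} records."""
    records = []
    for line in ocr_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like "name value unit"
        records.append({
            "name": match.group("name").strip().lower(),
            # Accept both decimal comma and decimal point.
            "value": float(match.group("value").replace(",", ".")),
            "unit": match.group("unit"),
        })
    return records


if __name__ == "__main__":
    # JSON is one simple way to make the result storable.
    print(json.dumps(parse_nutrition_text(SAMPLE_OCR_TEXT), indent=2))
```

Lines that do not match the assumed shape are dropped silently, which keeps the sketch short; a real pipeline would probably want to log them, since OCR errors on labels are common.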