https://github.com/chrismattmann/imagecat
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika, and Apache OODT to ingest tens of millions of files in place (images, but it could be extended to other file types) and to extract metadata and OCR information from those files using Tika and Tesseract OCR.
- Host: GitHub
- URL: https://github.com/chrismattmann/imagecat
- Owner: chrismattmann
- Created: 2015-02-18T08:25:18.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2018-08-26T19:14:06.000Z (about 7 years ago)
- Last Synced: 2025-03-21T16:07:11.964Z (7 months ago)
- Topics: apache, memex, oodt, oodt-radix, solr, tika
- Language: Java
- Homepage:
- Size: 175 MB
- Stars: 95
- Watchers: 17
- Forks: 40
- Open Issues: 0
Metadata Files:
- Readme: README.md