https://github.com/diging/giles-eco-giles-web
Distributed system based on Apache Kafka to run OCR on images and extract images and texts from PDF files.
giles-ecosystem java spring
- Host: GitHub
- URL: https://github.com/diging/giles-eco-giles-web
- Owner: diging
- License: mpl-2.0
- Created: 2016-11-03T23:13:41.000Z (about 8 years ago)
- Default Branch: develop
- Last Pushed: 2024-05-03T19:59:23.000Z (7 months ago)
- Last Synced: 2024-05-04T07:16:18.588Z (7 months ago)
- Topics: giles-ecosystem, java, spring
- Language: Java
- Homepage: http://gilesecosystem.io
- Size: 4.65 MB
- Stars: 9
- Watchers: 14
- Forks: 2
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Giles Ecosystem
The Giles Ecosystem is a distributed system to run OCR on images and extract images and texts from PDF files. This repository contains the user-facing component of this system called "Giles". The system requires the following software:
* Apache Tomcat 8
* Apache Kafka
* Apache Zookeeper
* MySQL (or PostgreSQL)
* Tesseract OCR (https://github.com/tesseract-ocr/)

The core components of the Giles Ecosystem are located in the following repositories:
* Giles: https://github.com/diging/giles-eco-giles-web (this repository)
* Nepomuk: https://github.com/diging/giles-eco-nepomuk (file storage)
* Cepheus: https://github.com/diging/giles-eco-cepheus (image extraction from PDF files)
* Andromeda: https://github.com/diging/giles-eco-andromeda (text extraction from PDF files)
* Cassiopeia: https://github.com/diging/giles-eco-cassiopeia (OCR using Tesseract)

The above applications depend on libraries located in the following repositories:
* https://github.com/diging/giles-eco-requests
* https://github.com/diging/giles-eco-util
* https://github.com/diging/giles-eco-september-util

Additionally, Giles depends on:
* https://github.com/jdamerow/spring-social-github
* https://github.com/jdamerow/spring-social-mitreid-connect

There are some additional components of the Giles Ecosystem that can be added if required:
* September (monitoring app for the Giles Ecosystem): https://github.com/diging/giles-eco-september
* Freddie (Solr connector): https://github.com/diging/giles-eco-freddie

There is a Docker Compose file for testing and evaluation purposes that sets up the Giles Ecosystem in Docker. You can find that file here: https://github.com/diging/giles-eco-docker

You can find detailed installation information and the documentation of Giles' API [here](https://diging.atlassian.net/wiki/display/GECO/Giles+Ecosystem+Home).
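
The division of labour among the processing components listed above can be pictured with a small Java sketch. It is purely illustrative and is not taken from the Giles code base: the class `GilesRoutingSketch`, the helper `componentsFor`, and the idea of choosing components by file extension are all assumptions made for this example. The README does not say whether images extracted from PDFs are later passed to OCR, so the sketch leaves that step out.

```java
import java.util.List;
import java.util.Locale;

// Illustrative only: a hypothetical mapping from file type to the
// processing components named in this README. Not part of Giles.
final class GilesRoutingSketch {

    /** Processing components from the README, with the role listed for each. */
    enum Component {
        CEPHEUS("image extraction from PDF files"),
        ANDROMEDA("text extraction from PDF files"),
        CASSIOPEIA("OCR using Tesseract");

        final String role;

        Component(String role) {
            this.role = role;
        }
    }

    // Assumed set of image extensions; the real system may accept others.
    private static final List<String> IMAGE_EXTENSIONS =
            List.of(".png", ".jpg", ".jpeg", ".tif", ".tiff");

    /** Hypothetical helper: picks components based on the file extension. */
    static List<Component> componentsFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            // PDFs: extract embedded images and text.
            return List.of(Component.CEPHEUS, Component.ANDROMEDA);
        }
        for (String extension : IMAGE_EXTENSIONS) {
            if (lower.endsWith(extension)) {
                // Images: run OCR.
                return List.of(Component.CASSIOPEIA);
            }
        }
        // Anything else is not handled in this sketch.
        return List.of();
    }

    public static void main(String[] args) {
        for (String name : List.of("article.pdf", "scan.TIFF", "notes.txt")) {
            System.out.println(name + " -> " + componentsFor(name));
        }
    }
}
```

In the actual system the components communicate through Apache Kafka and files are kept in Nepomuk; the sketch omits both and only shows which component is responsible for which kind of input.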