{"id":19811367,"url":"https://github.com/diging/giles-eco-cepheus","last_synced_at":"2026-05-06T16:35:20.131Z","repository":{"id":13274392,"uuid":"73753690","full_name":"diging/giles-eco-cepheus","owner":"diging","description":null,"archived":false,"fork":false,"pushed_at":"2024-11-17T21:23:25.000Z","size":1037,"stargazers_count":0,"open_issues_count":11,"forks_count":0,"subscribers_count":10,"default_branch":"develop","last_synced_at":"2025-11-24T03:07:52.812Z","etag":null,"topics":["extract-images","giles-ecosystem","java","spring"],"latest_commit_sha":null,"homepage":null,"language":"SCSS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/diging.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-11-14T22:45:18.000Z","updated_at":"2024-11-17T21:20:07.000Z","dependencies_parsed_at":"2025-02-28T13:30:36.661Z","dependency_job_id":"8b794dfe-6644-47be-bc9c-2a80a58793ec","html_url":"https://github.com/diging/giles-eco-cepheus","commit_stats":null,"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/diging/giles-eco-cepheus","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diging%2Fgiles-eco-cepheus","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diging%2Fgiles-eco-cepheus/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diging%2Fgiles-eco-cepheus/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diging%2Fgiles-eco-cepheus/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/diging","download_url":"https://codeload.github.com/diging/giles-eco-cepheus/tar.gz/refs/heads/develop","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diging%2Fgiles-eco-cepheus/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32702198,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-06T08:33:17.875Z","status":"ssl_error","status_checked_at":"2026-05-06T08:33:17.221Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extract-images","giles-ecosystem","java","spring"],"created_at":"2024-11-12T09:25:59.817Z","updated_at":"2026-05-06T16:35:20.116Z","avatar_url":"https://github.com/diging.png","language":"SCSS","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cepheus \n## giles-eco-cepheus\n\n\u003ca href='http://diging-dev.asu.edu/jenkins/job/GECO_test_cepheus_on_push'\u003e\u003cimg src='http://diging-dev.asu.edu/jenkins/buildStatus/icon?job=GECO_test_cepheus_on_push'\u003e\u003c/a\u003e\n\nThis repository contains Cepheus which is part of the Giles Ecosystem. Cepheus is an app to extract images and embedded text from PDFs.\n\nThe Giles Ecosystem is a distributed system to run OCR on images and extract images and texts from PDF files. 
This repository contains the system's text and image extraction component, \"Cepheus\". The system requires the following software:\n\n* Apache Tomcat 8\n* Apache Kafka\n* Apache Zookeeper (required by Apache Kafka)\n* Tesseract (https://github.com/tesseract-ocr/)\n\nThe components of the Giles Ecosystem are located in the following repositories:\n\n* Giles: https://github.com/diging/giles-eco-giles-web (user-facing component for uploading files)\n* Nepomuk: https://github.com/diging/giles-eco-nepomuk (file storage)\n* Cepheus: https://github.com/diging/giles-eco-cepheus (this repository)\n* Cassiopeia: https://github.com/diging/giles-eco-cassiopeia (OCR using Tesseract)\n\nThe above applications depend on libraries located in the following repositories:\n\n* https://github.com/diging/giles-eco-requests\n* https://github.com/diging/giles-eco-util\n\nA Docker Compose file that sets up the Giles Ecosystem in Docker for testing and evaluation is available here: https://github.com/diging/giles-eco-docker\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiging%2Fgiles-eco-cepheus","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdiging%2Fgiles-eco-cepheus","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiging%2Fgiles-eco-cepheus/lists"}