{"id":15563255,"url":"https://github.com/jveverka/data-lab","last_synced_at":"2026-04-13T16:36:23.327Z","repository":{"id":101222469,"uuid":"210439876","full_name":"jveverka/data-lab","owner":"jveverka","description":"Data Lab Project","archived":false,"fork":false,"pushed_at":"2021-10-15T20:40:20.000Z","size":22789,"stargazers_count":0,"open_issues_count":2,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-29T05:13:06.960Z","etag":null,"topics":["elasticsearch","image-processing","microservices","tensorflow","tensorflow2","yolov3"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jveverka.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-23T19:51:29.000Z","updated_at":"2021-10-15T18:29:55.000Z","dependencies_parsed_at":"2023-03-22T15:03:17.507Z","dependency_job_id":null,"html_url":"https://github.com/jveverka/data-lab","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jveverka/data-lab","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jveverka%2Fdata-lab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jveverka%2Fdata-lab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jveverka%2Fdata-lab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jveverka%2Fdata-lab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jveverk
a","download_url":"https://codeload.github.com/jveverka/data-lab/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jveverka%2Fdata-lab/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31761987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T15:25:13.801Z","status":"ssl_error","status_checked_at":"2026-04-13T15:25:09.162Z","response_time":93,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elasticsearch","image-processing","microservices","tensorflow","tensorflow2","yolov3"],"created_at":"2024-10-02T16:20:52.911Z","updated_at":"2026-04-13T16:36:23.309Z","avatar_url":"https://github.com/jveverka.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Java11](https://img.shields.io/badge/java-11-blue)](https://img.shields.io/badge/java-11-blue)\n[![Gradle](https://img.shields.io/badge/gradle-v6.5-blue)](https://img.shields.io/badge/gradle-v6.5-blue)\n[![Build Status](https://travis-ci.org/jveverka/data-lab.svg?branch=master)](https://travis-ci.org/jveverka/data-lab?branch=master)\n\n# Data Lab Project\n__Data Lab Project__ provides advanced analytics and query services on various document sources like \nimages, video streams, text documents, 
and the file system. This project is a work in progress.\n![datalab](docs/data-lab-image.svg)\n\n## Features\n* __File system indexing__ - queries on file system meta-data.\n* __Image meta-data indexing__ - queries on EXIF and geo-location meta-data.\n* __Video meta-data indexing__ - queries on EXIF and geo-location meta-data.\n* __Image content object recognition__ - queries on objects contained in images.\n\n### Microservices\n* [__data-scanner-service__](data-scanner-service) - [__microservice__] simple service for scanning the file system.\n* [__ml-services__](ml-services) - [__microservices__] simple services using machine learning.\n* [__message-broker__](message-broker) - [__microservice__]\n\n### Components\n* [__file-system-service__](file-system-service) - [__library__] simple library for scanning the file system.\n* [__elasticsearch-service__](elasticsearch) - [__library__] service for easy ElasticSearch read/write access.\n* [__data-scanner-service__](data-scanner-service) - [__library__] service for scanning the data directory and annotating data files.\n\n### Architecture\n![architecture](docs/architecture-01.svg)\n\n### Technology stack\n* __Microservices__ - REST, Message Broker integrations, K8s, WIP\n* __ElasticSearch 7.15.x__ - main meta-data database\n* __Kibana 7.15.x__ - basic data visualizations\n* __RabbitMQ 3.8__ - message broker\n* __Java 11__ - microservice implementations\n* __Python 3.8.x__ - microservice implementations\n* __TensorFlow 2.0 / Keras__ - ML-related tasks\n* __Gradle 7.2.x or later__ - build system\n* __Ubuntu 20.04 LTS__ - default target environment\n\n### Build, Test and Run\n```\ngradle clean installDist distZip test\n```\nFollow [this user guide](docs/user-guide.md) to run microservices 
locally.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjveverka%2Fdata-lab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjveverka%2Fdata-lab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjveverka%2Fdata-lab/lists"}