{"id":13569319,"url":"https://github.com/Netflix/genie","last_synced_at":"2025-04-04T05:32:08.085Z","repository":{"id":40420673,"uuid":"10828921","full_name":"Netflix/genie","owner":"Netflix","description":"Distributed Big Data Orchestration Service","archived":false,"fork":false,"pushed_at":"2024-09-23T18:13:34.000Z","size":215422,"stargazers_count":1715,"open_issues_count":12,"forks_count":369,"subscribers_count":527,"default_branch":"master","last_synced_at":"2024-10-29T10:52:33.453Z","etag":null,"topics":["big-data","bigdata","cloud","configuration","configuration-management","distributed-systems","java","microservice","microservices","netflix-oss","netflixoss","orchestration","spring-boot"],"latest_commit_sha":null,"homepage":"https://netflix.github.io/genie","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Netflix.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2013-06-20T20:35:56.000Z","updated_at":"2024-10-26T15:49:06.000Z","dependencies_parsed_at":"2022-08-09T20:00:26.256Z","dependency_job_id":"5f103c44-d5b3-45ba-b3a0-b09187116bb0","html_url":"https://github.com/Netflix/genie","commit_stats":{"total_commits":2736,"total_committers":44,"mean_commits":62.18181818181818,"dds":0.5409356725146199,"last_synced_commit":"a117a939874792e4ccbba527f3504a30fea59230"},"previous_names":[],"tags_count":246,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fgenie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/Netflix%2Fgenie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fgenie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fgenie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Netflix","download_url":"https://codeload.github.com/Netflix/genie/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222981780,"owners_count":17068007,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["big-data","bigdata","cloud","configuration","configuration-management","distributed-systems","java","microservice","microservices","netflix-oss","netflixoss","orchestration","spring-boot"],"created_at":"2024-08-01T14:00:38.499Z","updated_at":"2024-11-05T01:32:08.845Z","avatar_url":"https://github.com/Netflix.png","language":"Java","readme":"# Genie\n\n[![License](https://img.shields.io/github/license/Netflix/genie.svg)](http://www.apache.org/licenses/LICENSE-2.0)\n[![Issues](https://img.shields.io/github/issues/Netflix/genie.svg)](https://github.com/Netflix/genie/issues)\n[![NetflixOSS Lifecycle](https://img.shields.io/osslifecycle/Netflix/genie.svg)]()\n\n## Introduction\n\nGenie is a federated Big Data orchestration and execution engine developed by Netflix.\n\nGenie’s value is best described in terms of the problem it solves.\n\nBig Data infrastructure is complex and ever-evolving.\n\nData consumers (Data Scientists or other applications) need to jump over a lot of hurdles in order to run a simple query:\n - Find, 
download, install and configure a number of binaries, libraries and tools\n - Point to the correct cluster, using valid configuration and reasonable parameters, some of which are very obscure\n - Manually monitor the query and retrieve its output\n\nWhat works today may not work tomorrow.\nThe cluster may have moved, the binaries may no longer be compatible, etc.\n\nMultiply this overhead by the number of data consumers, and it adds up to a lot of wasted time (and grief!).\n\nData infrastructure providers face a different set of problems:\n - Users require a lot of help configuring their working setup, which is not easy to debug remotely\n - Infrastructure upgrades and expansion require careful coordination with all users\n\n\nGenie is designed to sit at the boundary of these two worlds and simplify the lives of people on either side.\n\nA data scientist can “rub the magic lamp” and just say “Genie, run query ‘Q’ using engine SparkSQL against production data”.\nGenie takes care of all the nitty-gritty details. It dynamically assembles the necessary binaries and configurations, executes the job, monitors it, notifies the user of its completion, and makes the output data available for immediate and future use.\n\nProviders of Big Data infrastructure work with Genie by making resources available for use (clusters, binaries, etc.) and plugging in the magic logic that the user doesn’t need to worry about: which cluster should a given query be routed to? Which version of Spark should a given query be executed with? Is this user allowed to access this data? 
etc.\nMoreover, every job’s details are recorded for later audit or debugging.\n\nGenie is designed from the ground up to be very flexible and customizable.\nFor more details, visit the [official documentation](https://netflix.github.io/genie).\n\n## Builds\n\nGenie builds are run on Travis CI [here](https://travis-ci.com/Netflix/genie).\n\n|     Branch     |                                                     Build                                                     |                                                                Coverage (coveralls.io)                                                                 |\n|:--------------:|:-------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------:|\n| master (4.2.x) | [![Build Status](https://travis-ci.com/Netflix/genie.svg?branch=master)](https://travis-ci.com/Netflix/genie) | [![Coverage Status](https://coveralls.io/repos/github/Netflix/genie/badge.svg?branch=master)](https://coveralls.io/github/Netflix/genie?branch=master) |\n|     4.1.x      | [![Build Status](https://travis-ci.com/Netflix/genie.svg?branch=4.1.x)](https://travis-ci.com/Netflix/genie)  |  [![Coverage Status](https://coveralls.io/repos/github/Netflix/genie/badge.svg?branch=4.1.x)](https://coveralls.io/github/Netflix/genie?branch=4.1.x)  |\n|     4.0.x      | [![Build Status](https://travis-ci.com/Netflix/genie.svg?branch=4.0.x)](https://travis-ci.com/Netflix/genie)  |  [![Coverage Status](https://coveralls.io/repos/github/Netflix/genie/badge.svg?branch=4.0.x)](https://coveralls.io/github/Netflix/genie?branch=4.0.x)  |\n\n## Project structure\n\n### `genie-app`\nSelf-contained Genie service server.\n\n### `genie-agent-app`\nSelf-contained Genie CLI job executor.\n\n### `genie-client`\nGenie client that interacts with the service via its REST API.\n\n### 
`genie-web`\nThe main server library; it can be re-wrapped to inject and override server components.\n\n### `genie-agent`\nThe main agent library; it can be re-wrapped to inject and override components.\n\n### `genie-common`, `genie-common-internal`, `genie-common-external`\n\nInternal component libraries shared by the server, agent, and client modules.\n\n### `genie-proto`\n\nProtobuf message and gRPC service definitions shared by the server and agent.\nThis is not a public API meant for use by other clients.\n\n### `genie-docs`, `genie-demo`\n\nDocumentation and demo application.\n\n### `genie-test`, `genie-test-web`\n\nTesting classes and utilities shared by other modules.\n\n### `genie-ui`\n\nJavaScript UI to search and visualize jobs, clusters, and commands.\n\n### `genie-swagger`\n\nAuto-configuration of [Swagger](https://swagger.io/) via [Spring Fox](https://springfox.github.io/springfox/). Add it\nto the final deployment artifact of the server to enable it.\n\n## Artifacts\n\nGenie publishes to [Maven Central](https://search.maven.org/) and [Docker Hub](https://hub.docker.com/r/netflixoss/genie-app/).\n\nRefer to the [demo]() section of the documentation for examples,\nand to the [setup]() section for more detailed instructions on setting up Genie.\n\n## Python Client\n\nThe [Genie Python client](https://github.com/Netflix/pygenie) is hosted in a different repository.\n\n## Further info\nFor a detailed explanation of Genie architecture, use cases, API documentation, demos, deployment and customization guides, and more, visit the\n[Genie documentation](https://netflix.github.io/genie).\n\n## Contact\n\nTo contact Genie developers with questions and suggestions, please use [GitHub Issues](https://github.com/Netflix/genie/issues).\n","funding_links":[],"categories":["Java","spring-boot","Data Pipeline ETL Frameworks","Hadoop","Data 
Pipeline","大数据"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNetflix%2Fgenie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FNetflix%2Fgenie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNetflix%2Fgenie/lists"}