{"id":30294436,"url":"https://github.com/linkedin/hoptimator","last_synced_at":"2026-05-28T19:00:59.598Z","repository":{"id":158485306,"uuid":"629656845","full_name":"linkedin/Hoptimator","owner":"linkedin","description":"Multi-hop declarative data pipelines","archived":false,"fork":false,"pushed_at":"2026-05-15T20:59:15.000Z","size":3018,"stargazers_count":128,"open_issues_count":12,"forks_count":15,"subscribers_count":9,"default_branch":"main","last_synced_at":"2026-05-15T21:57:48.027Z","etag":null,"topics":["brooklin","cdc","data-pipelines","flink","kafka","kafka-connect"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linkedin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-04-18T18:59:18.000Z","updated_at":"2026-05-15T05:30:45.000Z","dependencies_parsed_at":"2023-11-18T00:53:59.427Z","dependency_job_id":"0abcc248-99d2-4d00-a401-d03143e4a98e","html_url":"https://github.com/linkedin/Hoptimator","commit_stats":null,"previous_names":[],"tags_count":97,"template":false,"template_full_name":null,"purl":"pkg:github/linkedin/Hoptimator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2FHoptimator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2FHoptimator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2FHoptimator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2FHoptimator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/linkedin","download_url":"https://codeload.github.com/linkedin/Hoptimator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2FHoptimator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33622070,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-28T02:00:06.440Z","response_time":99,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["brooklin","cdc","data-pipelines","flink","kafka","kafka-connect"],"created_at":"2025-08-17T01:34:54.761Z","updated_at":"2026-05-28T19:00:59.588Z","avatar_url":"https://github.com/linkedin.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003ch1\u003eHoptimator\u003c/h1\u003e\n  \u003ch3\u003eA SQL control plane for multi-system data pipelines\u003c/h3\u003e\n  \u003cp\u003e\n    \u003ca href=\"https://github.com/linkedin/Hoptimator/actions\"\u003e\u003cimg src=\"https://img.shields.io/github/actions/workflow/status/linkedin/Hoptimator/integration-tests.yml?branch=main\u0026label=CI\" alt=\"CI\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://github.com/linkedin/Hoptimator/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-BSD--2--Clause-blue\" alt=\"License\"\u003e\u003c/a\u003e\n    \u003cimg src=\"https://img.shields.io/badge/status-alpha-orange\" alt=\"Status: alpha\"\u003e\n  \u003c/p\u003e\n\u003c/div\u003e\n\nHoptimator turns SQL into running, multi-hop data pipelines that span\nKafka, Flink, Venice, and anything else you plug in. You declare what you\nwant — a materialized view from one system into another — and Hoptimator\nplans the topology, generates the specs, deploys them, and reconciles them.\n\n```sql\nCREATE MATERIALIZED VIEW ADS.AUDIENCE AS\n  SELECT FIRST_NAME, LAST_NAME\n  FROM ADS.PAGE_VIEWS NATURAL JOIN PROFILE.MEMBERS;\n```\n\nWhat that statement *becomes* depends on the templates and databases\nregistered in your environment. With a typical Kafka + Flink setup, it\nexpands into:\n\n- a `View` and a `Pipeline` resource,\n- a connector configuration on each side,\n- a Flink SQL job that maintains the result,\n- and any intermediate hops (e.g. CDC topics) the planner determined were\n  needed to get from sources to sink.\n\nSwap in different templates and the same SQL can target a different stack.\nThe deployment target is pluggable — the bundled deployers target Kubernetes,\nbut `hoptimator-api` is the actual extension point.\n\n## Why Hoptimator?\n\n- **One SQL surface across many systems.** Kafka, Flink, Venice, MySQL — and\n  pluggable for the rest. The catalog is unified; joins span systems.\n- **Multi-hop, declarative.** You don't write Flink jobs and you don't request\n  topics. The planner figures out the topology from a query.\n- **Kubernetes out of the box, not as a hard requirement.** The bundled\n  deployers target Kubernetes, so pipelines show up as first-class CRDs and\n  `kubectl get pipelines` Just Works. The `Deployer` interface is the actual\n  extension point — anything that knows how to materialize a spec can take\n  the place of the defaults.\n- **Inspectable before it deploys.** `!specify` (CLI) and `plan` (MCP) emit the\n  exact specs Hoptimator would apply. No \"magic\" deploys.\n- **Pluggable.** New sources, sinks, engines, deployers, and validators are all\n  extension points on `hoptimator-api`.\n\n## Quickstart\n\nYou need Docker Desktop with Kubernetes enabled (or `kind`), `kubectl`, and\nJDK 17+. Then:\n\n```bash\nmake build install     # build the project and install the SQL CLI\nmake deploy-demo       # install CRDs and a couple of demo databases\n./hoptimator           # start the SQL CLI\n\u003e !intro\n```\n\nInside the CLI, declare a materialized view:\n\n```sql\nCREATE MATERIALIZED VIEW ADS.AUDIENCE AS\n  SELECT FIRST_NAME, LAST_NAME\n  FROM ADS.PAGE_VIEWS NATURAL JOIN PROFILE.MEMBERS;\n```\n\nThen in another terminal, watch what showed up:\n\n```bash\nkubectl get views\nkubectl get pipelines\n```\n\nFor a full walkthrough — including how to inspect the plan before deploying\nand how to clean up — see the [Quickstart](docs/getting-started/quickstart.md).\n\n## How it works\n\n```\n   SQL  ──▶  Planner  ──▶  Pipeline (sources, sink, job)\n                              │\n                              ▼\n                          Deployers\n                              │\n                              ▼\n                  Kubernetes resources\n                  (Pipeline, KafkaTopic,\n                   FlinkSessionJob, …)\n                              │\n                              ▼\n                          Operator\n                       (reconcile loop)\n```\n\nHoptimator plays three roles: **planner** (parse + optimize the SQL across the\nunified catalog), **adapter** (translate plan elements into target-system\nspecs), and **operator** (apply specs to Kubernetes and reconcile drift). The\nsame machinery powers the SQL CLI, the JDBC driver, the MCP server, and the\nstandalone operator.\n\nFor the long version, see the [Architecture overview](docs/getting-started/architecture.md).\n\n## Documentation\n\nThe full docs live in [`docs/`](docs/index.md):\n\n- **[Getting started](docs/getting-started/index.md)** — quickstart, concepts,\n  architecture.\n- **[User guide](docs/user-guide/index.md)** — SQL CLI, JDBC driver, MCP\n  server, DDL reference, hints.\n- **[Kubernetes guide](docs/kubernetes/index.md)** — operator, CRD\n  reference, templates, triggers, configuration.\n- **[Extending Hoptimator](docs/extending/index.md)** — adding data\n  sources, writing deployers, validators, config providers.\n- **[Learn more](docs/resources/learn-more.md)** — engineering blog posts and\n  case studies.\n\n## Project status\n\nHoptimator is **alpha**. APIs — including the SQL grammar, the\n`hoptimator-api` interfaces, and the `v1alpha1` CRDs — are subject to change\nwithout notice. The project is still early-stage and experimental from an open\nsource perspective; if you adopt it today, expect to follow `main` and pin to\nspecific versions deliberately.\n\nThat said, Hoptimator is not a research toy: LinkedIn runs production\npipelines on it internally. Pre-release artifacts for the modules in this\nrepo are published to LinkedIn's\n[JFrog Artifactory](https://linkedin.jfrog.io/artifactory/hoptimator).\n\n## Contributing\n\nBug reports, feature requests, and PRs are welcome. See\n[CONTRIBUTING.md](CONTRIBUTING.md) for how to file an issue, send a pull\nrequest, or report a security vulnerability.\n\n## License\n\n[BSD 2-Clause](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedin%2Fhoptimator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinkedin%2Fhoptimator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedin%2Fhoptimator/lists"}