{"id":17423279,"url":"https://github.com/OpenLineage/openlineage","last_synced_at":"2025-03-01T00:32:15.214Z","repository":{"id":36950042,"uuid":"306977038","full_name":"OpenLineage/OpenLineage","owner":"OpenLineage","description":"An Open Standard for lineage metadata collection","archived":false,"fork":false,"pushed_at":"2024-10-29T09:59:40.000Z","size":46399,"stargazers_count":1750,"open_issues_count":257,"forks_count":304,"subscribers_count":46,"default_branch":"main","last_synced_at":"2024-10-29T11:58:03.401Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://openlineage.io","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenLineage.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":"GOVERNANCE.md","roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-10-24T21:45:05.000Z","updated_at":"2024-10-29T09:59:44.000Z","dependencies_parsed_at":"2024-03-23T11:39:39.905Z","dependency_job_id":"91fa4b24-2d5b-4253-93ca-f09f41473aa4","html_url":"https://github.com/OpenLineage/OpenLineage","commit_stats":{"total_commits":2264,"total_committers":120,"mean_commits":"18.866666666666667","dds":0.8418727915194346,"last_synced_commit":"96c8ff66a2ec5663b6b3380e67e042890f0a79fa"},"previous_names":[],"tags_count":116,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenLineage%2FOpenLineage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenLineage%2FOpenLineage/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/OpenLineage%2FOpenLineage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenLineage%2FOpenLineage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenLineage","download_url":"https://codeload.github.com/OpenLineage/OpenLineage/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241289990,"owners_count":19939196,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-17T05:00:49.444Z","updated_at":"2025-03-01T00:32:15.207Z","avatar_url":"https://github.com/OpenLineage.png","language":"Java","readme":"[![CircleCI](https://circleci.com/gh/OpenLineage/OpenLineage/tree/main.svg?style=shield)](https://circleci.com/gh/OpenLineage/OpenLineage/tree/main)\n[![status](https://img.shields.io/badge/status-active-brightgreen.svg)](#status)\n[![Slack](https://img.shields.io/badge/slack-chat-blue.svg)](https://join.slack.com/t/openlineage/shared_invite/zt-2u4oiyz5h-TEmqpP4fVM5eCdOGeIbZvAk)\n[![license](https://img.shields.io/badge/license-Apache_2.0-blue.svg)](https://github.com/OpenLineage/OpenLineage/blob/main/LICENSE)\n[![maven](https://img.shields.io/maven-central/v/io.openlineage/openlineage-java.svg)](https://search.maven.org/search?q=g:io.openlineage)\n[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/4888/badge)](https://bestpractices.coreinfrastructure.org/projects/4888)\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"./doc/openlineage-lfai-logo.png\" 
width=\"754px\"/\u003e\n\u003c/div\u003e\n\n## Overview\nOpenLineage is an open standard for metadata and lineage collection designed to instrument jobs as they are running.\nIt defines a generic model of run, job, and dataset entities identified using consistent naming strategies.\nThe core lineage model can be extended by defining facets that enrich those entities.\n\nOpenLineage is an [LF AI \u0026 Data Foundation](https://lfaidata.foundation/projects/openlineage) Graduate project under active development, and we welcome contributions.\n\n## Problem\n\n### Before\n\n- Duplication of effort: each project has to instrument all jobs\n- Integrations are external and can break with new versions\n\n![Before OpenLineage](doc/before-ol.svg)\n\n### With OpenLineage\n\n- The effort of integration is shared\n- An integration can be pushed into each project: no need to play catch-up\n\n![With OpenLineage](doc/with-ol.svg)\n\n## Scope\nOpenLineage defines the metadata for running jobs and the corresponding events.\nA configurable backend lets the user choose which protocol is used to send the events.\n ![Scope](doc/scope.svg)\n\n## Core model\n\n ![Model](doc/datamodel.svg)\n\n A facet is an atomic piece of metadata attached to one of the core entities.\n See the spec for more details.\n\n## Spec\nThe [specification](spec/OpenLineage.md) is defined using OpenAPI and allows extension through custom facets.\n\n## Integration matrix\n\nThe OpenLineage repository contains integrations with several systems.\n\n| Name| Table-level lineage| Column-level lineage |\n| ----| ------------------ | -------------------- |\n|[Apache Spark](https://github.com/OpenLineage/OpenLineage/tree/main/integration/spark)| :white_check_mark: | :white_check_mark:\u003csup\u003e1\u003c/sup\u003e |\n|[Apache Airflow](https://github.com/OpenLineage/OpenLineage/tree/main/integration/airflow)| :white_check_mark: | :white_check_mark:\u003csup\u003e2\u003c/sup\u003e 
|\n|[Dagster](https://github.com/OpenLineage/OpenLineage/tree/main/integration/dagster)| :white_check_mark: | :x: |\n|[dbt](https://github.com/OpenLineage/OpenLineage/tree/main/integration/dbt) |:white_check_mark: | :white_check_mark: |\n|[Flink](https://github.com/OpenLineage/OpenLineage/tree/main/integration/flink)|:white_check_mark: | :x: |\n\n1. Does not support `SELECT *` queries with JDBC.\n2. Supports SQL-based operators other than BigQuery.\n\n## Related projects\n- [Marquez](https://marquezproject.ai/): Marquez is an [LF AI \u0026 Data](https://lfaidata.foundation/) project to collect, aggregate, and visualize a data ecosystem's metadata. It is the reference implementation of the OpenLineage API.\n  - [OpenLineage collection implementation](https://github.com/MarquezProject/marquez/blob/main/api/src/main/java/marquez/api/OpenLineageResource.java)\n- [Egeria](https://egeria.odpi.org/): Egeria offers open metadata and governance for enterprises - automatically capturing, managing and exchanging metadata between tools and platforms, no matter the vendor.\n\n## Community\n- Website: [openlineage.io](http://openlineage.io)\n- Slack: [OpenLineage.slack.com](https://join.slack.com/t/openlineage/shared_invite/zt-2u4oiyz5h-TEmqpP4fVM5eCdOGeIbZvA)\n- Twitter: [@OpenLineage](https://twitter.com/OpenLineage)\n- Mailing list: [openlineage-tsc](https://lists.lfaidata.foundation/g/openlineage-tsc)\n- Wiki: [OpenLineage+Home](https://wiki.lfaidata.foundation/display/OpenLineage/OpenLineage+Home)\n- LinkedIn: [13927795](https://www.linkedin.com/groups/13927795/)\n- YouTube: [channel](https://www.youtube.com/channel/UCRMLy4AaSw_ka-gNV9nl7VQ)\n- Mastodon: [@openlineage@fosstodon.org](openlineage@fosstodon.org)\n\n## Talks\n- [Flink Forward, October 2024. Data Lineage for Apache Flink with OpenLineage](https://www.flink-forward.org/berlin-2024/agenda#data-lineage-for-apache-flink-with-openlineage)\n- [Airflow Summit, September 2024. 
Activating operational metadata with Airflow, Atlan and OpenLineage](https://airflowsummit.org/sessions/2024/activating-operational-metadata-with-airflow-atlan-and-openlineage/)\n- [Kafka Summit, March 2024. OpenLineage for Stream Processing](https://www.confluent.io/events/kafka-summit-london-2024/openlineage-for-stream-processing/)\n- [Data Council Austin, March 2024. Data Lineage: We've Come a Long Way](https://www.youtube.com/watch?v=OE1o4D_iWfw)\n- [Data+AI Summit June 2023. Cross-Platform Data Lineage with OpenLineage](https://www.databricks.com/dataaisummit/session/cross-platform-data-lineage-openlineage/)\n- [Berlin Buzzwords, June 2023. Column-Level Lineage is Coming to the Rescue](https://youtu.be/xFVSZCCbZlY)\n- [Berlin Buzzwords, June 2022. Cross-Platform Data Lineage with OpenLineage](https://www.youtube.com/watch?v=pLBVGIPuwEo)\n- [Berlin Buzzwords, June 2021. Observability for Data Pipelines with OpenLineage](https://2021.berlinbuzzwords.de/member/julien-le-dem)\n- [Data Driven NYC, February 2021. Data Observability and Pipelines: OpenLineage and Marquez](https://mattturck.com/datakin/)\n- [Big Data Technology Warsaw Summit, February 2021. Data lineage and Observability with Marquez and OpenLineage](https://bigdatatechwarsaw.eu/edition-2021/)\n- [Metadata Day 2020. OpenLineage Lightning Talk](https://www.youtube.com/watch?v=anlV5Er_BpM)\n- [Open Core Summit 2020. 
Observability for Data Pipelines: OpenLineage Project Launch](https://www.coss.community/coss/ocs-2020-breakout-julien-le-dem-3eh4)\n\n## Contributing\n\nSee [CONTRIBUTING.md](https://github.com/OpenLineage/OpenLineage/blob/main/CONTRIBUTING.md) for more details about how to contribute.\n\n## Report a Vulnerability\n\nIf you discover a vulnerability in the project, please [open an issue](https://github.com/OpenLineage/OpenLineage/issues/new/choose) and attach the \"security\" label.\n\n----\nSPDX-License-Identifier: Apache-2.0\\\nCopyright 2018-2025 contributors to the OpenLineage project\n","funding_links":[],"categories":["Data Catalog","大数据"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOpenLineage%2Fopenlineage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FOpenLineage%2Fopenlineage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOpenLineage%2Fopenlineage/lists"}