{"id":14065660,"url":"https://github.com/tokern/data-lineage","last_synced_at":"2025-04-04T19:11:32.722Z","repository":{"id":37196044,"uuid":"247882621","full_name":"tokern/data-lineage","owner":"tokern","description":"Generate and Visualize Data Lineage from query history","archived":false,"fork":false,"pushed_at":"2023-08-04T07:24:15.000Z","size":2582,"stargazers_count":322,"open_issues_count":32,"forks_count":46,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-28T18:12:09.269Z","etag":null,"topics":["data-governance","data-lineage","jupyter","postgresql","python"],"latest_commit_sha":null,"homepage":"https://tokern.io/data-lineage/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tokern.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-03-17T04:55:39.000Z","updated_at":"2025-03-10T05:45:32.000Z","dependencies_parsed_at":"2024-05-28T01:37:07.245Z","dependency_job_id":"54f4dada-2ec9-4fc9-a9a9-ce5b31d07177","html_url":"https://github.com/tokern/data-lineage","commit_stats":{"total_commits":99,"total_committers":4,"mean_commits":24.75,"dds":"0.31313131313131315","last_synced_commit":"51aad23edea28b12bc28429dae9406c6f73cbff1"},"previous_names":[],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tokern%2Fdata-lineage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tokern%2Fdata-lineage/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tokern%2Fdata
-lineage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tokern%2Fdata-lineage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tokern","download_url":"https://codeload.github.com/tokern/data-lineage/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247234921,"owners_count":20905854,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-governance","data-lineage","jupyter","postgresql","python"],"created_at":"2024-08-13T07:04:37.282Z","updated_at":"2025-04-04T19:11:32.700Z","avatar_url":"https://github.com/tokern.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Tokern Lineage Engine\n\n[![CircleCI](https://circleci.com/gh/tokern/data-lineage.svg?style=svg)](https://circleci.com/gh/tokern/data-lineage)\n[![codecov](https://codecov.io/gh/tokern/data-lineage/branch/master/graph/badge.svg)](https://codecov.io/gh/tokern/data-lineage)\n[![PyPI](https://img.shields.io/pypi/v/data-lineage.svg)](https://pypi.python.org/pypi/data-lineage)\n[![image](https://img.shields.io/pypi/l/data-lineage.svg)](https://pypi.org/project/data-lineage/)\n[![image](https://img.shields.io/pypi/pyversions/data-lineage.svg)](https://pypi.org/project/data-lineage/)\n\n\nTokern Lineage Engine is a _fast_ and _easy to use_ application to collect, visualize and analyze \ncolumn-level data lineage in databases, data warehouses and data lakes in AWS and RDS.\n\nTokern Lineage helps you browse column-level data lineage \n* visually 
using [kedro-viz](https://github.com/quantumblacklabs/kedro-viz)\n* analyze lineage graphs programmatically using the powerful [networkx graph library](https://networkx.org/)\n\n## Resources\n\n* Demo of Tokern Lineage App\n\n![data-lineage](https://user-images.githubusercontent.com/1638298/118261607-688a7100-b4d1-11eb-923a-5d2407d6bd8d.gif)\n\n* Check out an [example data lineage notebook](http://tokern.io/docs/data-lineage/example/).\n\n* Check out [the post on using data lineage for cost control](https://tokern.io/blog/data-lineage-on-redshift/) for an \nexample of how data lineage can be used in production.\n\n## Quick Start\n\n### Install a demo using Docker and Docker Compose\n\nDownload the docker-compose file from the GitHub repository.\n\n\n    # in a new directory run\n    wget https://raw.githubusercontent.com/tokern/data-lineage/master/install-manifests/docker-compose/catalog-demo.yml\n    # or run\n    curl https://raw.githubusercontent.com/tokern/data-lineage/master/install-manifests/docker-compose/catalog-demo.yml -o docker-compose.yml\n\n\nRun docker-compose\n   \n\n    docker-compose up -d\n\n\nCheck that the containers are running.\n\n\n    docker ps\n    CONTAINER ID   IMAGE                                    CREATED        STATUS       PORTS                    NAMES\n    3f4e77845b81   tokern/data-lineage-viz:latest   ...   4 hours ago    Up 4 hours   0.0.0.0:8000-\u003e80/tcp     tokern-data-lineage-visualizer\n    1e1ce4efd792   tokern/data-lineage:latest       ...   5 days ago     Up 5 days                             tokern-data-lineage\n    38be15bedd39   tokern/demodb:latest             ...   
2 weeks ago    Up 2 weeks                            tokern-demodb\n\nTry out Tokern Lineage App\n\nHead to `http://localhost:8000/` to open the Tokern Lineage app.\n\n### Install Tokern Lineage Engine\n\n    # in a new directory run\n    wget https://raw.githubusercontent.com/tokern/data-lineage/master/install-manifests/docker-compose/tokern-lineage-engine.yml\n    # or run\n    curl https://raw.githubusercontent.com/tokern/data-lineage/master/install-manifests/docker-compose/tokern-lineage-engine.yml -o tokern-lineage-engine.yml\n\nRun docker-compose\n   \n\n    docker-compose up -d\n\n\nIf you want to use an external Postgres database, change the following parameters in `tokern-lineage-engine.yml`:\n\n* CATALOG_HOST\n* CATALOG_USER\n* CATALOG_PASSWORD\n* CATALOG_DB\n\nYou can also override default values using environment variables. \n\n    CATALOG_HOST=... CATALOG_USER=... CATALOG_PASSWORD=... CATALOG_DB=... docker-compose -f ... up -d\n\nFor more advanced usage of environment variables with docker-compose, [refer to docker-compose docs](https://docs.docker.com/compose/environment-variables/).\n\n**Pro-tip**\n\nIf you want to connect to a database in the host machine, set \n\n    CATALOG_HOST: host.docker.internal # For mac or windows\n    #OR\n    CATALOG_HOST: 172.17.0.1 # Linux\n\n## Supported Technologies\n\n* Postgres\n* AWS Redshift\n* Snowflake\n\n### Coming Soon\n\n* SparkSQL\n* Presto\n\n## Documentation\n\nFor advanced usage, please refer to [data-lineage documentation](https://tokern.io/docs/data-lineage/index.html).\n\n## Survey\n\nPlease take this [survey](https://forms.gle/p2oEQBJnpEguhrp3A) if you are a user or considering using data-lineage. Responses will help us prioritize features better. 
\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftokern%2Fdata-lineage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftokern%2Fdata-lineage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftokern%2Fdata-lineage/lists"}