Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/grai-io/grai-core
- Host: GitHub
- URL: https://github.com/grai-io/grai-core
- Owner: grai-io
- License: other
- Created: 2022-06-09T16:16:22.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2024-07-15T19:08:07.000Z (4 months ago)
- Last Synced: 2024-07-17T14:59:11.318Z (4 months ago)
- Topics: data, data-lineage, data-science, dataengineering, datalineage, dbt, django, fivetran, hacktoberfest, mssql, mysql, open-source, parquet, postgresql, python, redshift, snowflake
- Language: Python
- Homepage: https://www.grai.io
- Size: 120 MB
- Stars: 281
- Watchers: 2
- Forks: 21
- Open Issues: 15
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Security: SECURITY.md
Awesome Lists containing this project
- awesome-data-engineering - Grai - A data catalog tool that integrates into your CI system exposing downstream impact testing of data changes. These tests prevent data changes which might break data pipelines or BI dashboards from making it to production. (Testing / Data Profiler)
README
Documentation • Website • Slack • Discussion • Want to Chat?

## Introduction
**Data lineage made simple.**
Grai makes it easy to understand and test how your data relates across databases, warehouses, APIs, and dashboards.

- **Pre-built connectors.** Automatically synchronize lineage from across the stack so your metadata is never out of date.
- **Centralized data tests.** Write data validation tests that run whenever upstream data sources change.
- **Integrated with GitHub.** Run data validation tasks as part of your CI/CD process to test changes everywhere your data is used.
- **Your data, your cloud.** Grai is fully open source and self-hosted. You maintain full control over your data and hosting environment.

## How it works
- Automatically build column-level lineage spanning your warehouse and production services with connectors for `dbt`, `Snowflake`, `Fivetran`, and more (see below).
- Get alerts in your CI/CD workflows whenever changes to a production system will impact your warehouse or dbt projects with [GitHub Actions](https://github.com/grai-io/grai-actions-server).
- Self-host the project or run it in the [Grai Cloud](https://app.grai.io) for free.

### Connectors
| Integration           | Install                             |
|-----------------------|-------------------------------------|
| Snowflake             | `pip install grai-source-snowflake` |
| BigQuery              | `pip install grai-source-bigquery`  |
| Redshift              | `pip install grai-source-redshift`  |
| Postgres              | `pip install grai-source-postgres`  |
| MySQL                 | `pip install grai-source-mysql`     |
| SQL Server            | `pip install grai-source-mssql`     |
| dbt                   | `pip install grai-source-dbt`       |
| Fivetran              | `pip install grai-source-fivetran`  |
| csv, parquet, feather | `pip install grai-source-flat-file` |
| Metabase              | `pip install grai-source-metabase`  |
| Looker (alpha)        | `pip install grai-source-looker`    |

## Quickstart
You can find a full quickstart guide in the [documentation](https://docs.grai.io/quick-start) which covers deploying your own instance of Grai and getting set up with your first connector in Python.
The fastest way to get started is through the Grai CLI, but you can also run the project locally with docker compose.

Default login credentials:
```
username: [email protected]
password: super_secret
```

### CLI
```bash
pip install grai-cli
grai demo start
```

### Running Locally
Pre-built images are published for the backend server at `ghcr.io/grai-io/grai-core/grai-server:latest` and for the frontend at `ghcr.io/grai-io/grai-core/grai-frontend:latest`. If you prefer to build from source, you can do so with docker compose.
```bash
git clone https://github.com/grai-io/grai-core
cd grai-core
# Copy the example compose file into the current directory
cp examples/deployment/docker-compose/docker-compose.yml .
docker compose up
```

The backend server will be available at [http://localhost:8000/](http://localhost:8000/) and the frontend at [http://localhost:3000/](http://localhost:3000/).
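
Once the containers are running, a quick reachability check can confirm that both services respond. The sketch below is illustrative and not part of Grai: `SERVICES`, `is_up`, and `report` are hypothetical names, and the only details taken from this README are the two default URLs. It counts any HTTP status below 500 as "up", which is a deliberately loose criterion, and it bypasses any configured proxy because the targets are local.

```python
import urllib.error
import urllib.request
from typing import Mapping

# Default local URLs from this README; adjust them if you changed the port mappings.
SERVICES = {
    "backend": "http://localhost:8000/",
    "frontend": "http://localhost:3000/",
}

# An empty ProxyHandler disables proxy auto-detection for these local requests.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with any HTTP status below 500."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a 2xx status (for example 401 or 404).
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar errors.
        return False


def report(services: Mapping[str, str] = SERVICES) -> dict:
    """Check every service and return a name -> reachable mapping."""
    return {name: is_up(url) for name, url in services.items()}


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'up' if ok else 'not reachable'}")
```

Running the script right after `docker compose up` may report "not reachable" for a short while, since the containers can take some time to start accepting connections.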
After logging in and connecting a data source, you'll be greeted with a lineage graph that looks something like this:
![Frontend](resources/frontend.png)
For more information about using the web application check out the [getting started guide](https://docs.grai.io/web-app/getting-started).
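
Conceptually, a lineage graph like the one above is a directed graph from upstream columns to the columns derived from them. To give a rough idea of why that structure supports impact testing, here is a minimal sketch that walks such a graph to list everything downstream of a changed column. It is not Grai's data model or API: the `LINEAGE` dictionary, the column names, and the `downstream` helper are all invented for illustration.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed as its value.
LINEAGE = {
    "prod_db.users.email": ["warehouse.stg_users.email"],
    "warehouse.stg_users.email": ["warehouse.dim_customers.email"],
    "warehouse.dim_customers.email": ["dashboard.churn_report.email"],
    "prod_db.orders.total": ["warehouse.fct_orders.total"],
}


def downstream(graph, start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}  # exclude the starting node, even if the graph has a cycle
    order = []
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        queue.extend(graph.get(node, []))
    return order


if __name__ == "__main__":
    # Which columns could be affected by a change to the production email column?
    for column in downstream(LINEAGE, "prod_db.users.email"):
        print(column)
```

A breadth-first walk lists the nearest dependents first, which tends to be the order in which a reviewer wants to inspect potential breakage.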
### Other Deployment Mechanisms
You can find example configurations for docker compose and Kubernetes in the [examples](/examples/deployment) folder.
#### Helm
We also publish a set of [Helm charts](https://charts.grai.io/) if you prefer to deploy with Helm.
```bash
helm repo add grai https://charts.grai.io
helm install grai grai/grai
```

## Component Services
* [grai-server](https://github.com/grai-io/grai-core/tree/master/grai-server): The backend metadata service, built on Django with Postgres as the metadata persistence layer.
* [grai-frontend](https://github.com/grai-io/grai-core/tree/master/grai-frontend): The frontend web application built on React.
* [grai-client](https://github.com/grai-io/grai-core/tree/master/grai-client): Python client library for interacting with the Grai server.
* [grai-schemas](https://github.com/grai-io/grai-core/tree/master/grai-schemas): The Python implementation of the Grai metadata schema. It provides a standardized view of all Grai objects to ensure compatibility between the server, integrations, and the client.
* [grai-graph](https://github.com/grai-io/grai-core/tree/master/grai-graph): A Python utility library for working with the Grai metadata graph.
* [grai-actions](https://github.com/grai-io/grai-actions): A library of GitHub Actions implementations to integrate Grai tests into your CI/CD pipelines.
* [integrations](https://github.com/grai-io/grai-core/tree/master/grai-integrations): A collection of integration libraries to extract metadata and persist their results to Grai.

## Community Roadmap
Community feedback drives our roadmap. Please let us know what you'd like to see next by asking questions and upvoting feature requests!
* [Feature Requests](https://github.com/orgs/grai-io/discussions/categories/feature-requests)
* [Documentation Requests](https://github.com/orgs/grai-io/discussions/categories/documentation-requests)
* [Bug Reports](https://github.com/grai-io/grai-core/issues)
* [FAQ](https://github.com/orgs/grai-io/discussions/categories/q-a)

## Repo Activity
![Repo activity](https://repobeats.axiom.co/api/embed/31e89b7eda9ea0ebad3005fff55589496f79dc2d.svg "Repobeats analytics image")
## Community
Email us: [email protected]
Check us out at www.grai.io
Sign up for our newsletter, `Grai Matters`, through the [email list](https://blog.grai.io/#/portal/signup).