https://github.com/contiamo/datahub-sap-hana
SAP Hana data source for LinkedIn's Datahub
- Host: GitHub
- URL: https://github.com/contiamo/datahub-sap-hana
- Owner: contiamo
- License: mit
- Created: 2022-01-17T13:40:53.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-10-09T22:32:13.000Z (over 1 year ago)
- Last Synced: 2024-12-27T20:37:47.918Z (4 months ago)
- Topics: datahub, hana, sap
- Language: Python
- Homepage:
- Size: 920 KB
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 12
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# Datahub SAP Hana Metadata source
Add your SAP Hana databases to your LinkedIn Datahub!
# Description
This Python package extracts metadata from SAP Hana databases and pushes it to Datahub. This includes:
- Table schema with datatypes
- View definitions
- View lineages with support for cross-schema references and column-level lineages

The metadata from SAP Hana is extracted and parsed using sqlalchemy (for table lineage) and sqlglot (for column lineage).
### Recipe File
The recipe file for this source supports ingestion of both table and column lineage via the Datahub CLI.
The recipe can also list schemas to include and exclude, which allows lineage to be built across different schemas in a SAP Hana database.
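To make the include/exclude behaviour concrete, here is a minimal sketch of regex-based schema filtering. It assumes the recipe's include and exclude lists are treated as regular expressions, in the style of the allow/deny patterns used by other Datahub SQL sources; the function name `schema_allowed` and the schema names are hypothetical and are not taken from this package.

```python
import re


def schema_allowed(schema, allow=(".*",), deny=()):
    """Return True if `schema` matches an allow pattern and no deny pattern.

    Assumption: patterns are matched from the start of the name,
    case-insensitively, and deny patterns win over allow patterns.
    """
    if any(re.match(pattern, schema, re.IGNORECASE) for pattern in deny):
        return False
    return any(re.match(pattern, schema, re.IGNORECASE) for pattern in allow)


# Hypothetical schema names, for illustration only.
schemas = ["SALES", "FINANCE", "SYS", "_SYS_BIC", "SALES_STAGING"]

selected = [
    name
    for name in schemas
    if schema_allowed(name, allow=["SALES.*", "FINANCE"], deny=[".*_STAGING$"])
]
print(selected)
```

With filtering like this, a view in one selected schema that reads from a table in another selected schema keeps both ends of its lineage, while anything in an excluded schema is dropped.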
Results can be seen in the Datahub UI, printed to the console, or written to a file.

## Installing
Pre-built wheels can be downloaded from the [Releases page](https://github.com/contiamo/datahub-sap-hana/releases/latest).
Otherwise, you must install from source.
### Requirements
You need the following tools pre-installed:
* [Task](https://taskfile.dev/#/installation)
* [Poetry](https://python-poetry.org/docs/#installation)
* [Pyenv](https://github.com/pyenv/pyenv#installation)
  * Make sure to have the [required build packages installed for your OS](https://github.com/pyenv/pyenv/wiki#suggested-build-environment)
  * And the [shims for your shell](https://github.com/pyenv/pyenv#set-up-your-shell-environment-for-pyenv)

## Try it out
1. Clone the project
```sh
git clone git@github.com:contiamo/datahub-sap-hana.git
cd datahub-sap-hana
```

2. You will need Python 3.10 or higher
Once you have `pyenv` and Poetry installed, you should run
```sh
pyenv install 3.10.10
pyenv local 3.10.10
poetry config virtualenvs.in-project true
```

3. Install the project and dependencies
```sh
task setup
```

4. Edit `examples/hana_recipe.yaml` to set the connection details for your SAP Hana database.
If you just want to do a local test, SAP offers [SAP Hana Express](https://www.sap.com/products/hana/express-trial.html) as a
free trial version of Hana. There is also [a Docker image](https://developers.sap.com/tutorials/hxe-ua-install-using-docker.html)
that makes this very easy; this is our recommendation.

5. Run the test sync
```sh
poetry run datahub ingest run -c examples/hana_recipe.yaml
```

6. Inspect the contents of the `hana_mces.json` file that was created.
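For a quick overview of what the test sync produced, a short script along these lines can summarize the file. This is a sketch that assumes `hana_mces.json` is a JSON array of objects, where each object either carries an `aspectName` key or a `proposedSnapshot` wrapper, as Datahub file output commonly does; the helper names `record_kind` and `summarize_mces` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def record_kind(record):
    """Label a record by its aspect name or snapshot type (assumed layout)."""
    if "aspectName" in record:
        return record["aspectName"]
    snapshot = record.get("proposedSnapshot")
    if isinstance(snapshot, dict) and snapshot:
        # Keys look like a dotted class path; keep only the last segment.
        return next(iter(snapshot)).rsplit(".", 1)[-1]
    return "unknown"


def summarize_mces(path):
    """Count the records in the output file by kind."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return Counter(record_kind(record) for record in records)


# Only runs when the file from the test sync is present in the working directory.
if Path("hana_mces.json").exists():
    for kind, count in summarize_mces("hana_mces.json").most_common():
        print(f"{count:6d}  {kind}")
```

Records that fit neither assumed layout are counted as `unknown`, which is a hint to open the file and look at them directly.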
### Docker image
A Docker image with Datahub and this package preinstalled is available from the [GitHub Container Registry](https://github.com/contiamo/datahub-sap-hana/pkgs/container/datahub-sap-hana):
```sh
docker run -it --rm -v `pwd`:/opt \
ghcr.io/contiamo/datahub-sap-hana:latest \
ingest run -c /opt/examples/hana_recipe.yaml
```
Note that you may need to set the `--network` flag if you are using the Hana Express Docker image.

## Development
### Running the tests
To run the unit tests, use
```sh
task test
```

To run all of the tests, just use
```sh
task test -- -v
```