https://github.com/flavienbwk/datahub-docker-compose
Getting started with LinkedIn's DataHub project on compose
- Host: GitHub
- URL: https://github.com/flavienbwk/datahub-docker-compose
- Owner: flavienbwk
- Created: 2022-05-01T13:48:03.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-05-01T15:36:06.000Z (about 4 years ago)
- Last Synced: 2025-03-23T09:35:26.948Z (over 1 year ago)
- Topics: datahub, docker, docker-compose, linkedin, sample
- Language: Shell
- Homepage:
- Size: 66.4 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# DataHub docker-compose
Getting started with LinkedIn's [DataHub project](https://github.com/datahub-project/datahub) using an all-in-one Compose configuration.

## Running DataHub
Raise your host's `vm.max_map_count` kernel limit so that Elasticsearch can memory-map its index files under high I/O:
```bash
# To persist this setting, add `vm.max_map_count=512000` to `/etc/sysctl.conf` and run `sudo sysctl -p`
sudo sysctl -w vm.max_map_count=512000
```
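To see whether the host already meets this limit before changing it, a minimal sketch such as the following can help. It assumes a Linux host that exposes `/proc/sys/vm/max_map_count`; the `map_count_ok` helper and the `REQUIRED` variable are hypothetical names, and 512000 is simply the value used above.

```bash
#!/usr/bin/env bash
# Compare the current vm.max_map_count with the value used in this README.
# REQUIRED and map_count_ok are illustrative names, not part of this repository.
REQUIRED="${REQUIRED:-512000}"

map_count_ok() {
  # $1: current value, $2: required minimum
  [ "$1" -ge "$2" ]
}

if [ -r /proc/sys/vm/max_map_count ]; then
  current="$(cat /proc/sys/vm/max_map_count)"
  if map_count_ok "$current" "$REQUIRED"; then
    echo "vm.max_map_count=$current is sufficient"
  else
    echo "vm.max_map_count=$current is below $REQUIRED"
    echo "Run: sudo sysctl -w vm.max_map_count=$REQUIRED"
  fi
else
  echo "Cannot read /proc/sys/vm/max_map_count (is this a Linux host?)"
fi
```

The script only reads the current value and prints a hint, so it does not need root privileges.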
Run the whole DataHub stack:
```bash
# Copy the example env file, then edit `.env` with the desired credentials
cp .env.example .env
docker-compose up -d
```
Now wait a few minutes for all services to start.
Then access DataHub on port 9002 of your host with the default username and password, both `datahub`.
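
Rather than guessing how long to wait, you can poll the port until it accepts connections. The following is a minimal sketch, not part of this repository: `wait_for_port` is a hypothetical helper, it relies on bash's `/dev/tcp` redirection (so it needs bash, not plain `sh`), and the attempt count and delay are arbitrary defaults.

```bash
#!/usr/bin/env bash
# Poll a TCP port until it accepts connections or the attempts run out.
# wait_for_port is an illustrative helper; it uses bash's /dev/tcp feature.
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-60}" delay="${4:-5}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # Opening /dev/tcp/HOST/PORT succeeds only if something is listening.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: wait up to about 5 minutes (60 attempts, 5 s apart) for the UI port.
# wait_for_port localhost 9002 60 5 && echo "Port 9002 is accepting connections"
```

An open port only shows that the frontend container is listening; the login page may still need a little longer before it responds.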
## Ingesting a dataset with metadata
Ingest the sample metadata from [`bootstrap_mce.json`](./metadata-ingestion/bootstrap_mce.json) by running:
```bash
docker-compose -f ingest.docker-compose.yml run ingestion
```