https://github.com/ineerav/sparkini
Base Docker Compose setup for a local data engineering environment.
docker hadoop-hdfs hue spark
- Host: GitHub
- URL: https://github.com/ineerav/sparkini
- Owner: INeerav
- License: apache-2.0
- Created: 2024-07-21T13:55:28.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-07-21T23:08:41.000Z (4 months ago)
- Last Synced: 2024-09-29T07:01:38.845Z (about 2 months ago)
- Topics: docker, hadoop-hdfs, hue, spark
- Language: Jupyter Notebook
- Homepage:
- Size: 34.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# sparkini
## Namenode and datanodes (HDFS)
The NameNode is the master node that persists HDFS metadata, and the DataNodes are the worker nodes that store the data. When you insert data or create objects in Hive tables, the data is stored in HDFS on the DataNodes, and the NameNode keeps track of which DataNode holds it.

- namenode (fjardim/namenode_sqoop)
- datanode1 (fjardim/datanode)
- datanode2 (fjardim/datanode)

## Hue
Hue is an open source SQL assistant for databases and data warehouses. It is not required for a big data ecosystem, but it helps you visualize data in HDFS more quickly, among other notable features.
- hue (fjardim/hue)
- database (fjardim/mysql)

Start the containers from the `docker` directory:

```
cd sparkini/docker
docker-compose up -d
```

## Contributing and Feedback
Inspired by https://github.com/fabiogjardim