https://github.com/vigneshss-07/bigdata_technologies
This repo contains all technical knowledge and implementation of big data technologies.
big-data hadoop hadoop-hdfs hbase hive hive-metastore kafka mapreduce-python pyspark spark sparksql
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/vigneshss-07/bigdata_technologies
- Owner: vigneshSs-07
- Created: 2021-08-07T20:01:22.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-07-19T04:41:25.000Z (almost 3 years ago)
- Last Synced: 2025-01-16T07:57:12.241Z (5 months ago)
- Topics: big-data, hadoop, hadoop-hdfs, hbase, hive, hive-metastore, kafka, mapreduce-python, pyspark, spark, sparksql
- Language: Jupyter Notebook
- Homepage:
- Size: 1.49 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
### Bigdata_Technologies
This repo starts with Hadoop, explaining HDFS and its evolution.
# Difference between Apache Hadoop and Apache Spark
* https://www.ibm.com/cloud/blog/hadoop-vs-spark
* https://towardsdatascience.com/big-data-analytics-apache-spark-vs-apache-hadoop-7cb77a7a9424

***Big Data ecosystem***
1. https://github.com/dgadiraju/itversity-books/tree/master/Data%20Engineering%20Bootcamp/40%20Big%20Data%20ecosystem%20-%20Overview