https://github.com/mariam-iftikhar/bigdataprojects
This repository collects exercises and projects on big data processing with Hadoop, HBase, Hive, and Spark (using Python). The projects run on AWS EMR and demonstrate data handling and processing techniques on cloud infrastructure.
- Host: GitHub
- URL: https://github.com/mariam-iftikhar/bigdataprojects
- Owner: Mariam-iftikhar
- Created: 2024-05-14T12:11:56.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-05-14T16:01:58.000Z (11 months ago)
- Last Synced: 2025-01-17T04:45:56.085Z (3 months ago)
- Topics: apache-spark, awsec2, awsemr, hadoop-cluster, hadoop-mapreduce, hbase, hiveql
- Homepage:
- Size: 10.4 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# BigDataTechnologies
Key Highlights:
- Hadoop: Implemented MapReduce jobs for large-scale data processing.
- HBase: Developed and managed scalable, high-performance NoSQL databases.
- Hive: Executed SQL-like queries for data warehousing and analytical tasks.
- Spark: Built real-time and batch processing applications to extract valuable insights.
Explore the repository to see practical applications of these technologies and gain insights into big data solutions on AWS EMR.
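
As a rough illustration of the MapReduce model mentioned under Hadoop above, here is a minimal word-count sketch in plain Python. It is not taken from the repository: the function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the sample input are hypothetical, and the shuffle step that Hadoop performs across a cluster is simulated in memory here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def mapper(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group values by key (done by the framework on a real cluster)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reducer(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce phase: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory input."""
    return reducer(shuffle(mapper(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data on EMR", "big clusters process big data"]
    print(word_count(sample))
```

On an actual Hadoop cluster the map and reduce steps would run as separate tasks over distributed input splits; this sketch only shows how the three phases relate to each other.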