https://github.com/mariam-iftikhar/bigdataprojects

This repository collects exercises and projects on big data processing using Hadoop, HBase, Hive, and Spark with Python. The projects run on AWS EMR and demonstrate data handling and processing techniques on cloud-hosted clusters.

apache-spark awsec2 awsemr hadoop-cluster hadoop-mapreduce hbase hiveql

# BigDataTechnologies

Key Highlights:

Hadoop: Implemented MapReduce jobs for large-scale data processing.

HBase: Developed and managed scalable, high-performance NoSQL databases.

Hive: Executed SQL-like queries for data warehousing and analytical tasks.

Spark: Built real-time and batch processing applications to extract valuable insights.
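
To make the MapReduce highlight concrete, here is a minimal word-count sketch in Python. It is not taken from the repository: the names `mapper`, `reducer`, and `run_job` are hypothetical, and the map, shuffle/sort, and reduce phases are simulated in memory rather than run on a Hadoop cluster. On EMR, the same mapper and reducer logic would typically be split into separate scripts that read from standard input.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce phase: sum the counts for each word.

    Expects pairs sorted by key, which is what the shuffle step provides.
    """
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_job(lines):
    """Simulate map -> shuffle/sort -> reduce locally (assumption: no cluster)."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    # Illustrative input only; not data from the repository.
    sample = ["big data on EMR", "big clusters process big data"]
    for word, count in run_job(sample).items():
        print(f"{word}\t{count}")
```

Sorting before the reduce step matters here: `groupby` only merges adjacent equal keys, which mirrors how a reducer sees all values for one key together after the shuffle.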

Explore the repository to see practical applications of these technologies and gain insights into big data solutions on AWS EMR.