An open API service indexing awesome lists of open source software.

https://github.com/pavithra19/apache_spark_people_data_processor

This project is a data processing application built with Apache Spark and Scala. It is designed to process, analyze, and transform large datasets of people data. It uses Spark’s distributed computing capabilities for scalable data ingestion, cleaning, and reporting. Shell scripts are included for Hadoop deployment.

apachespark dataengineering hadoop hdfs scala

Last synced: 4 months ago
