https://github.com/ajaymahadeven/apache-spark-programs

# Apache Spark Programs

## DESCRIPTION
This repository contains Apache Spark programs implemented in Python. These programs are part of my learning process for Apache Spark and are intended to serve as examples for anyone who is also learning or working with Apache Spark.

---

## Installation

To run these programs, you need Apache Spark and PySpark installed on your system. To install Apache Spark, download the latest release and follow the instructions on the official website: https://spark.apache.org/downloads.html

Once you have installed Apache Spark, you can install PySpark using pip:

    pip install pyspark

---

## Usage

To run any of the programs in this repository, navigate to the program's directory and run the following command:

    spark-submit program-name.py

Replace `program-name` with the name of the program you want to run.
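
For orientation, a script meant to be launched with `spark-submit` could look roughly like the sketch below. It is a minimal word count built on `flatMap`, not code from this repository: the file name `word_count.py`, the function names `tokenize` and `run_word_count`, and the input path argument are all assumptions made for illustration.

```python
import re
import sys


def tokenize(line):
    """Split a line into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", line.lower())


def run_word_count(path):
    """Print the 20 most frequent words in the text file at `path`."""
    # Imported lazily so tokenize() can be used without a Spark installation.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("WordCountSketch")  # hypothetical app name
    sc = SparkContext.getOrCreate(conf)
    try:
        counts = (
            sc.textFile(path)
            .flatMap(tokenize)                  # one record per word
            .map(lambda word: (word, 1))        # pair each word with a count of 1
            .reduceByKey(lambda a, b: a + b)    # sum the counts per word
            .sortBy(lambda pair: pair[1], ascending=False)
        )
        for word, count in counts.take(20):
            print(word, count)
    finally:
        sc.stop()


# Runs only when an input path is supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_word_count(sys.argv[1])
```

Under these assumptions, the script would be started with `spark-submit word_count.py book.txt`, where `book.txt` stands for any plain-text input file.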

---

## PROGRAMS

Here is a list of all the programs in this repository:
1. Total Spent By Customer (sorted and SparkSQL version)
2. Calculate Average Friends By Age
3. Filtering RDDs and finding Minimum Temperature
4. Movie Ratings Counter
5. Word Count using FlatMap
6. Calculating Min and Max Temperature using DataFrames
7. Social Graph Analysis using Marvel Superheroes
8. Calculating Average Friends By Age using SparkSQL
9. Calculating Total Spent By Customer using DataFrames
10. Word Count using SparkSQL
11. Calculating Average Friends By Age using DataFrames
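
Several of the programs above are aggregations by key. As a rough illustration of the RDD approach to "Average Friends By Age", the sketch below carries a `(sum, count)` pair per age and divides at the end. It is not the code from this repository: the CSV layout `id,name,age,friends`, the function names, and the app name are assumptions.

```python
import sys


def parse_line(line):
    """Return (age, friend_count) from an assumed 'id,name,age,friends' CSV line."""
    fields = line.split(",")
    return int(fields[2]), int(fields[3])


def add_pairs(a, b):
    """Combine two (sum, count) pairs element-wise."""
    return a[0] + b[0], a[1] + b[1]


def run_average_friends(path):
    """Print the average number of friends for each age, ordered by age."""
    # Imported lazily so the helpers above can be used without Spark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("AverageFriendsSketch")  # hypothetical app name
    sc = SparkContext.getOrCreate(conf)
    try:
        averages = (
            sc.textFile(path)
            .map(parse_line)                              # (age, friends)
            .mapValues(lambda friends: (friends, 1))      # (age, (friends, 1))
            .reduceByKey(add_pairs)                       # (age, (sum, count))
            .mapValues(lambda total: total[0] / total[1]) # (age, average)
            .sortByKey()
        )
        for age, average in averages.collect():
            print(f"{age}\t{average:.2f}")
    finally:
        sc.stop()


# Runs only when an input path is supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_average_friends(sys.argv[1])
```

Carrying the sum and the count together lets `reduceByKey` combine partial results on each partition before the shuffle, which is why the average is computed only in the final `mapValues` step.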

---

## CONTRIBUTIONS

If you have any suggestions or ideas for new Apache Spark programs, feel free to open an issue or submit a pull request.

---

## LICENSE

This repository is licensed under the MIT License. See the LICENSE file for more information.