Projects in Awesome Lists tagged with spark-dataframes
A curated list of projects in awesome lists tagged with spark-dataframes.
https://github.com/mahmoudparsian/pyspark-tutorial
PySpark-Tutorial provides basic algorithms using PySpark
big-data big-data-analytics data-algorithms pyspark spark spark-dataframes spark-rdd
Last synced: 14 May 2025
https://github.com/26hzhang/stockprediction
Plain Stock Close-Price Prediction via Graves LSTM RNNs
deeplearning4j java lstm recurrent-neural-networks spark-dataframes stock-price-prediction
Last synced: 13 Apr 2025
https://github.com/mahmoudparsian/big-data-mapreduce-course
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
algorithms apache-hadoop apache-spark big-data data-algorithms data-analysis data-engineering data-partition data-transformation glossary mapreduce mapreduce-algorithm mapreduce-python monoid partitioning-algorithms pyspark pyspark-algorithms-book santa-clara-university spark-dataframes spark-rdd
Last synced: 12 Apr 2025
https://github.com/nashtech-labs/sparkathon
A library of Java and Scala examples for Spark 2.x
apache-spark java-8 knoldus rdd scala spark spark-dataframes spark-dataset spark-ml spark-mllib spark-sql spark-streaming spark-structured-streaming
Last synced: 31 Aug 2025
https://github.com/maxinexiong/item-based-collaborative-filtering
This project uses PySpark DataFrames and PySpark RDDs to implement item-based collaborative filtering. By calculating cosine similarity scores or identifying the movies with the highest number of shared viewers, the system recommends 10 movies similar to a given target movie, in line with users’ preferences.
apache-spark collaborative-filtering movie-recommendation pyspark python spark spark-dataframes spark-rdd
Last synced: 24 Aug 2025
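The cosine-similarity variant described for item-based-collaborative-filtering can be illustrated with a minimal sketch. This is not the project's code: it uses plain Python rather than PySpark so that it runs without a Spark installation, and the function name `item_similarities`, the `(user, movie, rating)` tuple layout, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings, target, top_n=10):
    """Rank other movies by cosine similarity to `target`.

    `ratings` is a list of (user, movie, rating) tuples (hypothetical layout).
    """
    # Build one sparse rating vector per movie: movie -> {user: rating}.
    vectors = defaultdict(dict)
    for user, movie, rating in ratings:
        vectors[movie][user] = rating

    target_vec = vectors.get(target, {})
    target_norm = sqrt(sum(r * r for r in target_vec.values()))

    scores = []
    for movie, vec in vectors.items():
        if movie == target:
            continue
        # Only users who rated both movies contribute to the dot product.
        shared = target_vec.keys() & vec.keys()
        norm = sqrt(sum(r * r for r in vec.values()))
        if not shared or norm == 0 or target_norm == 0:
            continue
        dot = sum(target_vec[u] * vec[u] for u in shared)
        scores.append((movie, dot / (target_norm * norm)))

    # Highest similarity first; ties broken by title for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Made-up sample data: users u1..u3 rating movies A, B, C.
sample = [
    ("u1", "A", 5.0), ("u1", "B", 5.0),
    ("u2", "A", 4.0), ("u2", "B", 4.0), ("u2", "C", 1.0),
    ("u3", "C", 5.0),
]
top = item_similarities(sample, "A")
```

In a Spark version, the per-movie vectors would typically come from a self-join of the ratings DataFrame on the user column, but the ranking step has the same shape.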
https://github.com/smusab9152/pyspark_programs_and_projects
Collection of PySpark programs and projects demonstrating the use of Apache Spark's Python API for big data processing and analysis. It includes practical implementations such as logistic regression classification, data analysis on the Iris dataset, and basic PySpark operations like temperature conversion.
apache-spark big-data big-data-analytics data-engineering distributed-computing etl pyspark spark-dataframes spark-rdd spark-sql
Last synced: 07 Oct 2025
https://github.com/milesgranger/pontem
Treat Spark like pandas.
dataframe-api dataframes distributed-dataframe pandas pyspark spark-dataframes
Last synced: 28 Jul 2025