https://github.com/a3darekar/big-data-management-project-3

# Data analysis with Spark ML

This project analyzes a US domestic flight dataset with PySpark DataFrames and predicts which flights and carriers are most likely to be canceled or delayed.

The dataset is available on Kaggle: [Airline Delay and Cancellation Data, 2009-2018](https://www.kaggle.com/yuanyuwendymu/airline-delay-and-cancellation-data-2009-2018/data)