https://github.com/srosalino/predicting_nyc_taxi_limousine_profit
Predicting the profit of NYC Taxi Limousine services to provide actionable insights for maximizing revenue
- Host: GitHub
- URL: https://github.com/srosalino/predicting_nyc_taxi_limousine_profit
- Owner: srosalino
- Created: 2023-09-15T16:05:45.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-11T18:57:18.000Z (4 months ago)
- Last Synced: 2024-07-11T21:56:25.592Z (4 months ago)
- Topics: cloud-computing, feature-engineering, feature-selection, pyspark, pyspark-mllib, statistical-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 712 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
**Overview**
This project implements a computational solution for analyzing a large-scale data problem, including the construction of a machine learning model. The dataset is large and was obtained from AWS Open Data (https://registry.opendata.aws/). In addition to the authors' own computing resources, the work used AWS Academy, the Amazon cloud environment for academic use. The project was implemented in Python with Apache Spark. The solution follows the ML Pipeline methodology, summarized in the figure below.
![ML Pipeline methodology](pipeline.png)
This report describes the phases of the work, how the algorithm is implemented, and the main conclusions drawn from its use.
**Full Report**
The full report, with detailed explanations of the work and the results obtained, is available in the '*Relatorio_Grupo_AS.pdf*' file.