Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ravi72munde/scala-spark-cab-rides-predictions
A big data project for predicting prices of Uber/Lyft rides depending on the weather
predict-prices scala spark spark-streaming streaming uber weather
- Host: GitHub
- URL: https://github.com/ravi72munde/scala-spark-cab-rides-predictions
- Owner: ravi72munde
- License: mit
- Created: 2018-11-19T06:17:22.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-03-19T17:57:15.000Z (10 months ago)
- Last Synced: 2024-03-19T19:01:11.185Z (10 months ago)
- Topics: predict-prices, scala, spark, spark-streaming, streaming, uber, weather
- Language: Scala
- Size: 1.58 MB
- Stars: 12
- Watchers: 3
- Forks: 8
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# scala-spark-cab-rides-predictions
A big data project for predicting prices of Uber/Lyft rides depending on the weather. The dataset was compiled and uploaded to Kaggle and can be found here: https://www.kaggle.com/ravi72munde/uber-lyft-cab-prices
## Contributors:
* Ravi Munde
* Karan Barai

### Project Structure:
* cab-price-connector - Data collection Scala project
* Databricks_Prediction_code.html - Analysis and Spark model (from Databricks.com)
* Cab_Price_Prediction.ipynb - Random Forest model in Python

### Data Model:
#### CabPrice
root
|- cab_type : String
|- destination : String
|- distance: Float
|- id: String
|- name: String
|- price: Float
|- product_id: String
|- source: String
|- surge_multiplier: String
|- time_stamp: Long

#### Weather
root
|- clouds : Float
|- humidity : Float
|- location : Float
|- location : String
|- temp : String
|- pressure : Float
|- wind : Float

![Actor System](Actors.png)
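
For readers who want to work with these records in Scala, the two schemas in the data model could be expressed as case classes. This is only a sketch: the class and field names are copied from the listing, but the project's own definitions may differ. Because the Weather listing shows `location` twice (once as Float, once as String), the sketch keeps a single String field, which is an assumption. The sample values in `DataModelExample` are invented for illustration.

```scala
// Hypothetical case classes mirroring the CabPrice and Weather schemas.
final case class CabPrice(
  cab_type: String,
  destination: String,
  distance: Float,
  id: String,
  name: String,
  price: Float,
  product_id: String,
  source: String,
  surge_multiplier: String,
  time_stamp: Long
)

final case class Weather(
  clouds: Float,
  humidity: Float,
  location: String, // assumption: the listing shows `location` twice; only the String one is kept
  temp: String,
  pressure: Float,
  wind: Float
)

object DataModelExample {
  // Made-up sample record, not taken from the dataset.
  val sampleRide: CabPrice = CabPrice(
    cab_type = "Uber",
    destination = "B",
    distance = 1.5f,
    id = "example-id",
    name = "example-product",
    price = 10.0f,
    product_id = "example-product-id",
    source = "A",
    surge_multiplier = "1.0",
    time_stamp = 0L
  )
}
```

Keeping the field names identical to the schema makes it straightforward to map stored records onto these types.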
Sample log of the actor system running on EC2:
`INFO [CabRideSystem-akka.actor.default-dispatcher-2] a.DynamoActor - received 12 number of weather records`
`INFO [CabRideSystem-akka.actor.default-dispatcher-4] a.DynamoActor - Weather Batch processed on DynamoDB`
`INFO [CabRideSystem-akka.actor.default-dispatcher-9] a.DynamoActor - received 156 number of cab price records`
`INFO [CabRideSystem-akka.actor.default-dispatcher-8] a.DynamoActor - Cab Prices Batch processed on DynamoDB`
`INFO [CabRideSystem-akka.actor.default-dispatcher-7] a.Master - Cab ride data piped to Dynamo Actor`
`INFO [CabRideSystem-akka.actor.default-dispatcher-13] a.DynamoActor - received 156 number of cab price records`
`INFO [CabRideSystem-akka.actor.default-dispatcher-15] a.DynamoActor - Cab Prices Batch processed on DynamoDB`

*NOTE: AWS credentials need to be set in environment variables.*
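
A quick way to confirm the credentials are visible before starting the actor system is a small check like the one below. It is a sketch, not part of the project: the object `AwsEnvCheck` and its helper are hypothetical, and it assumes the standard variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` that the AWS SDKs read by default. The project itself may load credentials differently.

```scala
object AwsEnvCheck {
  // Standard environment variable names read by the AWS SDKs and CLI.
  val AccessKeyVar = "AWS_ACCESS_KEY_ID"
  val SecretKeyVar = "AWS_SECRET_ACCESS_KEY"

  /** Returns the names of required variables that are missing or blank. */
  def missingVars(env: Map[String, String]): List[String] =
    List(AccessKeyVar, SecretKeyVar)
      .filter(name => env.get(name).forall(_.trim.isEmpty))

  def main(args: Array[String]): Unit = {
    val missing = missingVars(sys.env)
    if (missing.isEmpty) println("AWS credentials found in the environment")
    else println(s"Missing environment variables: ${missing.mkString(", ")}")
  }
}
```

Passing the environment in as a `Map` keeps the check testable without touching the real process environment.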
### Model Evaluation Metrics
* Regression R-squared = 0.62
* Random Forest Regression's Price Prediction Accuracy: 92.79%
* Random Forest Classification Surge Prediction Accuracy: 77.69%
Confusion Matrix for the Classifier
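
The README does not say how these figures were computed. As a rough illustration of the standard definitions, the sketch below computes R-squared, classification accuracy, and confusion-matrix counts in plain Scala. The object `EvalMetrics` and its function names are hypothetical and are not taken from the project's notebooks. The "price prediction accuracy" reported for the regressor is not defined in the README, so the sketch does not attempt it.

```scala
object EvalMetrics {
  /** Coefficient of determination: 1 - SS_res / SS_tot.
    * Not meaningful when all actual values are identical (SS_tot = 0). */
  def rSquared(actual: Seq[Double], predicted: Seq[Double]): Double = {
    require(actual.nonEmpty && actual.size == predicted.size,
      "inputs must be non-empty and of equal length")
    val mean = actual.sum / actual.size
    val ssRes = actual.zip(predicted).map { case (y, yHat) => math.pow(y - yHat, 2) }.sum
    val ssTot = actual.map(y => math.pow(y - mean, 2)).sum
    1.0 - ssRes / ssTot
  }

  /** Fraction of predictions that match the actual label exactly. */
  def accuracy[A](actual: Seq[A], predicted: Seq[A]): Double = {
    require(actual.nonEmpty && actual.size == predicted.size,
      "inputs must be non-empty and of equal length")
    actual.zip(predicted).count { case (y, yHat) => y == yHat }.toDouble / actual.size
  }

  /** Counts of each (actual, predicted) label pair that occurs. */
  def confusionMatrix[A](actual: Seq[A], predicted: Seq[A]): Map[(A, A), Int] =
    actual.zip(predicted).groupBy(identity).map { case (pair, hits) => pair -> hits.size }
}
```

For example, `EvalMetrics.accuracy(Seq("a", "b", "a", "b"), Seq("a", "b", "b", "b"))` evaluates to `0.75`, since three of the four labels match.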