https://github.com/rbhatia46/nyc-taxi-demand-prediction
The objective of this project is to predict the demand for yellow cabs in a given region of New York City over the next 10 minutes. Based on the trip data, a machine learning model predicts the pickup demand in each 10-minute time frame. The data was provided by the Taxi & Limousine Commission for yellow cabs. Accurate predictions can improve how efficiently a taxi driver uses their time.
- Host: GitHub
- URL: https://github.com/rbhatia46/nyc-taxi-demand-prediction
- Owner: rbhatia46
- Created: 2019-08-15T06:12:49.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-08-15T06:13:33.000Z (about 6 years ago)
- Last Synced: 2025-01-24T18:37:02.954Z (9 months ago)
- Language: Jupyter Notebook
- Size: 1.66 MB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# NYC-Taxi-Demand-Prediction
### Predict the taxi demand for yellow cabs by location in the next 10 minutes for New York City.
This Python notebook develops machine learning models to predict the taxi demand for yellow cabs in New York City, using data provided by the Taxi & Limousine Commission. Based on the data, the models predict the pickup demand of cabs in a 10-minute time frame. Several machine learning models are trained in the notebook and their accuracy is tested.
Data Overview
- pick-up and drop-off dates/times,
- pick-up and drop-off locations,
- trip distances,
- itemized fares,
- rate types,
- payment types,
- driver-reported passenger counts

First, we clean the given data and convert it into the required format.
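
One plausible part of that conversion is bucketing pickup timestamps into 10-minute bins, since the prediction target is demand per 10-minute window. The sketch below is only an illustration under that assumption: the function names, the start date, and the sample timestamps are hypothetical and are not taken from the notebook or the TLC data.

```python
from collections import Counter
from datetime import datetime

BIN_SECONDS = 600  # 10-minute bins, matching the prediction window


def bin_index(pickup_time, start):
    """Return the index of the 10-minute bin that a pickup falls into."""
    return int((pickup_time - start).total_seconds() // BIN_SECONDS)


def pickups_per_bin(pickup_times, start):
    """Count how many pickups fall into each 10-minute bin."""
    return Counter(bin_index(t, start) for t in pickup_times)


# Hypothetical sample timestamps, chosen only to exercise the bin edges.
start = datetime(2016, 1, 1)
sample = [
    datetime(2016, 1, 1, 0, 3),
    datetime(2016, 1, 1, 0, 9, 59),
    datetime(2016, 1, 1, 0, 10),
    datetime(2016, 1, 1, 0, 25),
]
demand = pickups_per_bin(sample, start)
```

Here `demand[0]` counts the pickups in the first 10 minutes after `start`, `demand[1]` the next 10 minutes, and so on; bins with no pickups are simply absent from the counter and read as zero.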
To divide New York City into regions so that predictions can be made region-wise, we use the K-means algorithm.
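
To make the region step concrete, here is a minimal, dependency-free sketch of K-means (Lloyd's algorithm) applied to pickup coordinates. It is not the notebook's code: in practice a library implementation such as scikit-learn's `KMeans` would be the usual choice, and the coordinates, the value of `k`, and the function names below are assumptions made for illustration. Treating latitude and longitude as plain Euclidean coordinates is also a simplification.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two (lat, lon) pairs."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on (latitude, longitude) pairs."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members;
        # leave it in place if its cluster is empty.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


# Hypothetical pickup coordinates forming two well-separated groups.
pickups = [
    (40.75, -73.99), (40.76, -73.98), (40.74, -73.99),
    (40.64, -73.78), (40.65, -73.79), (40.64, -73.77),
]
centroids = kmeans(pickups, k=2)
regions = [nearest(p, centroids) for p in pickups]
```

Each entry of `regions` is the cluster index for the corresponding pickup, which can then serve as the region label when counting demand per region.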
Feature importance is an important part of any machine learning problem. Here we use the baseline models below, generating features from ratios and from the previous value at time (t-1), and calculate the Mean Absolute Percentage Error (MAPE) for each.
- Moving Averages
- Weighted Moving Averages
- Exponential Moving Averages

In addition, we use the regression models below to predict the taxi demand, choosing the best hyperparameters for each with a search technique suited to those hyperparameters.
- Linear Regression with grid search
- Random Forest Regressor with random search
- XGBoost Regressor with random search

__Objective: By comparing the different models, we will select the best model for predicting yellow taxi demand, which helps taxi drivers.__
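
The three moving-average baselines named earlier, and the MAPE used to score them, can be sketched as follows. This is an illustration rather than the notebook's implementation: the window size, the linear weights, the smoothing factor `alpha`, and the demand values are all assumptions, and the MAPE shown is the common per-point definition, which may differ from the variant used in the notebook.

```python
def simple_moving_average(history, window):
    """Mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def weighted_moving_average(history, window):
    """Linear weights 1..n, with the most recent value weighted highest."""
    recent = history[-window:]
    weights = range(1, len(recent) + 1)
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def exponential_moving_average(history, alpha):
    """Recursive smoothing: estimate = alpha * value + (1 - alpha) * estimate."""
    estimate = history[0]
    for value in history[1:]:
        estimate = alpha * value + (1 - alpha) * estimate
    return estimate


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; assumes no zero in `actual`."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Made-up pickup counts for consecutive 10-minute bins in one region.
demand = [100, 120, 110, 130, 125]
window = 3

# Predict each bin from the bins before it, then score against the true counts.
actual = demand[window:]
sma_preds = [simple_moving_average(demand[:t], window)
             for t in range(window, len(demand))]
sma_error = mape(actual, sma_preds)
```

The same loop works for the other two baselines by swapping in `weighted_moving_average` or `exponential_moving_average`, so all three can be compared on the same MAPE scale.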