Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jcardonamde/datasets_ml
This project analyzes taxi and limousine trip data from New York City, with the goal of predicting the total duration of trips within the city using machine learning models.
data-science machine-learning machine-learning-algorithms matplotlib numpy pandas pipelines python seaborn sklearn
- Host: GitHub
- URL: https://github.com/jcardonamde/datasets_ml
- Owner: jcardonamde
- Created: 2022-11-10T03:54:29.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2022-12-11T17:10:46.000Z (almost 2 years ago)
- Last Synced: 2023-03-04T18:33:03.161Z (over 1 year ago)
- Topics: data-science, machine-learning, machine-learning-algorithms, matplotlib, numpy, pandas, pipelines, python, seaborn, sklearn
- Language: Jupyter Notebook
- Homepage: https://www.loom.com/share/8e2b86e8eb1f40a2b67c20f5ab0cf1e9
- Size: 2.45 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# New York City Taxi Trip Duration
![](https://docs.google.com/drawings/d/e/2PACX-1vRLrhh818nyaxd16zQGBnHCV325Gl2JGgCJFUQqJ9GIi-EQ3BtpeE0qz-4DaasifP3tAgW4Kztxt2tQ/pub?w=687&h=386)
At one time or another, almost all of us have used Uber or a similar transportation service to take a ride. Ridesharing services use online platforms to connect passengers with local drivers who use their personal vehicles.
In most cases they are a convenient method for door-to-door transportation, and they are generally cheaper than licensed cabs. Examples of ridesharing services include Uber, Cabify, Beat, and Didi.
To improve the efficiency of cab dispatch systems for such services, it is important to be able to predict how long a driver will have their cab occupied. If a dispatcher knew approximately when a cab driver would finish their current trip, they could better identify which driver to assign to each pickup request.
This project works with a dataset published by the New York City Taxi and Limousine Commission, which includes pickup time, geographic coordinates, and number of passengers, among other variables. The goal is to predict the total duration of cab trips in New York City.
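As a rough illustration of how pickup time and coordinates can become model inputs, the sketch below derives a straight-line (haversine) distance and two time features from a single trip record. The column names (`pickup_datetime`, `pickup_latitude`, and so on), the timestamp format, the helper names `haversine_km` and `trip_features`, and the example values are all assumptions made for this sketch; the notebook's actual feature engineering may differ.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_features(row):
    """Turn one raw trip record (hypothetical column names) into numeric features."""
    pickup = datetime.strptime(row["pickup_datetime"], "%Y-%m-%d %H:%M:%S")
    return {
        "distance_km": haversine_km(
            row["pickup_latitude"], row["pickup_longitude"],
            row["dropoff_latitude"], row["dropoff_longitude"],
        ),
        "pickup_hour": pickup.hour,
        "pickup_weekday": pickup.weekday(),  # Monday is 0
        "passenger_count": row["passenger_count"],
    }


# Made-up record for illustration only; not taken from the dataset.
example_row = {
    "pickup_datetime": "2016-03-14 17:24:55",
    "pickup_latitude": 40.7679, "pickup_longitude": -73.9822,
    "dropoff_latitude": 40.7656, "dropoff_longitude": -73.9646,
    "passenger_count": 1,
}
features = trip_features(example_row)
```

Straight-line distance ignores the street grid and traffic, so it is only a first approximation of how far a cab actually travels.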
👉 The dataset used for this analysis was downloaded from [Kaggle](https://www.kaggle.com/c/nyc-taxi-trip-duration).
💻📚 Libraries used: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn.
:microscope::dart: Applied models: Linear Regression, Decision Tree Regression, XGBoost Regression, and KNN Regression.
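A minimal sketch of how several regressors might be compared on the same train/test split with Scikit-learn is shown below. It runs on synthetic data generated in the script (not the real dataset), and the feature choice, hyperparameters, and RMSE metric are assumptions for illustration rather than the notebook's actual setup. XGBoost is left out so the sketch depends only on NumPy and Scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: duration grows with distance, plus noise.
rng = np.random.default_rng(0)
n_trips = 500
distance_km = rng.uniform(0.5, 20.0, n_trips)
pickup_hour = rng.integers(0, 24, n_trips)
X = np.column_stack([distance_km, pickup_hour])
y = np.clip(180.0 * distance_km + rng.normal(0.0, 60.0, n_trips), 1.0, None)  # seconds

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scaling matters for KNN (distance-based); it is harmless for linear regression.
models = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "tree": DecisionTreeRegressor(max_depth=6, random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
}

# Fit each model and record its root mean squared error on the held-out split.
rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

for name, score in sorted(rmse.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.1f} s")
```

If the separate `xgboost` package is installed, its `XGBRegressor` follows the same `fit`/`predict` interface and could be added to the `models` dictionary in the same way.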
👀:bar_chart: Previews:
![](https://docs.google.com/drawings/d/e/2PACX-1vT71-ztcKxRuR5k8vL7Xwj_4Rwyech9vlwYkH5cG8h9Ihf6RhPj1fCw1-uIE_O4O-OtNfX8AQ3s-47l/pub?w=745&h=562)
![](https://docs.google.com/drawings/d/e/2PACX-1vRDyW_PQpwmmpEDO0putBjbiIP3QepLFXcazg6Z4lrgDOZrcka6oc77IMY2jvYdFotfQORX8ZJ3eUxW/pub?w=959&h=537)
![](https://docs.google.com/drawings/d/e/2PACX-1vRMJxGVooqZOS-61DMQ1thq8Nhxb62SArATlxy23qcx6G-tOwmvN5WGvEqtdX_RZTzBVIZH2689dmgJ/pub?w=914&h=518)
![](https://docs.google.com/drawings/d/e/2PACX-1vQynD4knXrhNVvKRB8tc-3GuFSEkF-S8ajHCNzdJe6385Z8brsgTS0cXOYRPmsM9G6pWB73r1ic_Z-W/pub?w=915&h=354)
![](https://docs.google.com/drawings/d/e/2PACX-1vRmvGaZqj53ac1losjZ4f0PJvh2-TsLBG2FDaYog5gRRYywZAHdz0Qn1iZxwm7EsYTWDWCQg6z5QLUz/pub?w=925&h=348)
![](https://docs.google.com/drawings/d/e/2PACX-1vTGVwU_nrYQVfe1qTKFRBB87PQwWBCBV0F70veX4N41YmesYy4a5QDqxESX9M5zydxWMzfMXwNmJFXN/pub?w=922&h=347)
![](https://docs.google.com/drawings/d/e/2PACX-1vR6G_M6QKq7bezu7bgCjA69reLA2C5irNGUFYWhKz6UI5bLfKAKp59ZbJWA87ockeVxNKsHjPI8B9DZ/pub?w=916&h=342)