
https://github.com/csakig/bike-sharing-demand-analytics

Advanced analytics on Bike Sharing data using Random Forest, Gradient Boosting, and SARIMAX forecasting.

data-science machine-learning portfolio python scikit-learn time-series




# 🚲 Bike Sharing Demand Analytics
### 🚀 Optimizing Fleet Operations with Machine Learning & Time Series Forecasting

![Python](https://img.shields.io/badge/Python-3.8%2B-blue?style=for-the-badge&logo=python&logoColor=white)
![Scikit-Learn](https://img.shields.io/badge/scikit--learn-F7931E?style=for-the-badge&logo=scikit-learn&logoColor=white)
![Statsmodels](https://img.shields.io/badge/Statsmodels-Time_Series-blueviolet?style=for-the-badge)
![Status](https://img.shields.io/badge/Status-Completed-success?style=for-the-badge)

---

## 📸 Executive Summary
This project aims to solve a critical logistics problem: **predicting bike availability demand to optimize rebalancing operations.**

By analyzing historical usage patterns, I developed a machine learning pipeline that predicts rental counts with a Random Forest regressor ($R^2 = 0.94$) and forecasts future demand trends with SARIMAX, enabling proactive fleet management.

### 🏆 Key Visual Insight
*The analysis revealed distinct "Commuter" vs. "Leisure" patterns, which became the primary feature for the model.*

![Demand by Hour](demand_by_hour.png)
*(Fig 1. Hourly demand clearly separates working days (orange) from weekends (blue), dictating the feature engineering strategy.)*
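
A profile like the one in Fig 1 can be produced by averaging rentals per hour, split by day type. The sketch below is only an illustration on synthetic data: the column names `datetime`, `workingday`, and `count` are assumptions and may not match `data.csv`, and the peak hours are invented for the demo.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the rental log (four full weeks of hourly rows).
# Column names are assumptions, not taken from the project's dataset.
rng = np.random.default_rng(0)
timestamps = pd.Timestamp("2024-01-01") + pd.to_timedelta(np.arange(24 * 28), unit="h")
df = pd.DataFrame({"datetime": timestamps})
df["hour"] = df["datetime"].dt.hour
df["workingday"] = (df["datetime"].dt.dayofweek < 5).astype(int)  # Mon-Fri = 1

# Invented shapes: two commuter peaks on working days, one midday bump otherwise.
commuter = (200 * np.exp(-((df["hour"] - 8) ** 2) / 4)
            + 250 * np.exp(-((df["hour"] - 17) ** 2) / 4))
leisure = 180 * np.exp(-((df["hour"] - 13) ** 2) / 8)
df["count"] = np.where(df["workingday"] == 1, commuter, leisure) + rng.normal(0, 5, len(df))

# Mean rentals per hour, one column per day type (0 = weekend, 1 = working day).
profile = df.pivot_table(index="hour", columns="workingday", values="count", aggfunc="mean")
profile.columns = ["weekend", "working_day"]
print(profile.round(1))
```

Calling `profile.plot()` on the result (with matplotlib installed) draws the two curves on one axis.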

---

## 📊 Business Value & Recommendations
Based on the data, the following operational changes are recommended:

| Insight | Actionable Recommendation | Expected Impact |
| :--- | :--- | :--- |
| **Rush Hour Spikes** | Deploy rebalancing trucks at **9:00 AM** and **3:00 PM** (post-peak). | Prevent station "starvation" during commuting hours. |
| **Weather Sensitivity** | Push notifications/discounts when **Temp < 10°C** or **Humidity > 80%**. | Stabilize revenue during low-demand weather conditions. |
| **Seasonal Trends** | Schedule heavy maintenance in **Spring** (lowest demand). | Maximize fleet availability during Fall/Summer peaks. |

---

## 🧠 Technical Approach

### 1. Data Processing & EDA
* **Feature Engineering:** extracted `hour`, `day_of_week`, and `season` from timestamps.
* **Outlier Detection:** Removed anomalies in windspeed and humidity data.
* **Correlation Analysis:** Identified Temperature as the strongest driver of demand.

![Correlation / Feature Importance](feature_importance.png)
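
The feature engineering step can be sketched with pandas datetime accessors. This is a minimal example under stated assumptions: the `datetime` column name and the helper `add_time_features` are hypothetical, and the month-to-season mapping below is a meteorological convention that may differ from the encoding used in the notebook.

```python
import pandas as pd

# Hypothetical raw rows; the `datetime` column name is an assumption.
raw = pd.DataFrame({"datetime": ["2012-01-15 08:00:00", "2012-04-02 17:00:00",
                                 "2012-07-21 13:00:00", "2012-10-31 23:00:00"]})
raw["datetime"] = pd.to_datetime(raw["datetime"])

SEASONS = {0: "winter", 1: "spring", 2: "summer", 3: "fall"}

def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with hour, day_of_week, and season columns."""
    out = df.copy()
    out["hour"] = out["datetime"].dt.hour
    out["day_of_week"] = out["datetime"].dt.dayofweek  # Monday = 0, Sunday = 6
    # Assumed mapping: Dec-Feb winter, Mar-May spring, Jun-Aug summer, Sep-Nov fall.
    out["season"] = (out["datetime"].dt.month % 12 // 3).map(SEASONS)
    return out

features = add_time_features(raw)
print(features)
```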

### 2. Modeling Strategy
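I tested four algorithms in total to find the optimal balance between accuracy and interpretability: the three supervised models below and the SARIMAX forecaster covered in the next section. Random Forest and Linear Regression predict rental counts, while Gradient Boosting classifies peak periods, so its F1-score is not directly comparable to the $R^2$ values.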

| Model | Task | Performance Metric | Verdict |
| :--- | :--- | :--- | :--- |
| **Random Forest** | Regression | **$R^2 = 0.94$** | 🏆 **Best Performer** (Selected) |
| **Gradient Boosting** | Classification | **F1-Score = 0.92** | Excellent for Peak Detection |
| **Linear Regression** | Regression | $R^2 = 0.39$ | Underfitted (Non-linear data) |
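
The gap between the two regressors in the table is what one would expect when demand depends on hour-of-day peaks that a straight line cannot follow. The sketch below reproduces that effect on synthetic data only; the features, the demand formula, and the hyperparameters are all assumptions, so the scores it prints are not the project's reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic features (assumed): hour of day, working-day flag, temperature in °C.
rng = np.random.default_rng(42)
n = 2000
hour = rng.integers(0, 24, n)
workingday = rng.integers(0, 2, n)
temp = rng.uniform(0, 35, n)

# Invented demand: commuter peaks on working days, a midday bump otherwise,
# plus a linear temperature effect and Gaussian noise.
peaks = np.exp(-((hour - 8) ** 2) / 4) + np.exp(-((hour - 17) ** 2) / 4)
midday = np.exp(-((hour - 13) ** 2) / 8)
y = (300 * workingday * peaks + 200 * (1 - workingday) * midday
     + 4 * temp + rng.normal(0, 15, n))

X = np.column_stack([hour, workingday, temp])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))  # held-out R^2

print(scores)
```

The linear model can only assign one slope to `hour`, so it misses the twin peaks, while the forest can split on hour and day type jointly.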

### 3. Forecasting (Time Series)
Using **SARIMAX (Seasonal ARIMA with eXogenous variables)**, I modeled the weekly seasonality to project demand for the next 30 days.

![30 Day Forecast](forecast_30_days.png)
*(Fig 3. The model (red) successfully captures the weekly cycle and growth trend with a narrow 95% confidence interval.)*

---

## 🛠️ Installation & Usage

1. Clone the repo: `git clone https://github.com/csakig/bike-sharing-demand-analytics.git`
2. Install dependencies: `pip install -r requirements.txt`
3. Run the analysis: `jupyter notebook Bike_Sharing_Demand_Analysis.ipynb`

---

## 📂 Repository Structure
* `Bike_Sharing_Demand_Analysis.ipynb`: Main analysis notebook.
* `data.csv`: Historical dataset.
* `demand_by_hour.png`: Visualization asset.
* `feature_importance.png`: Visualization asset.
* `forecast_30_days.png`: Visualization asset.

---

Author: csakig | Aspiring Data Scientist