https://github.com/sanjurajveer/water-usage
Deep learning model to predict water usage in multi-family properties in the USA
- Host: GitHub
- URL: https://github.com/sanjurajveer/water-usage
- Owner: sanjurajveer
- Created: 2025-06-26T21:15:55.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-08-10T09:51:02.000Z (2 months ago)
- Last Synced: 2025-08-10T11:40:39.029Z (2 months ago)
- Language: Jupyter Notebook
- Size: 3.72 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Water Usage Prediction using Deep Learning
This project is part of a capstone initiative in collaboration with Connect IOT, focused on predicting water usage in multi-family residential properties from time-series data. It combines feature engineering on hourly consumption data with a two-stage deep learning pipeline: a binary classifier that separates zero from non-zero usage, followed by a regression model that estimates the non-zero amounts.
## 🚀 Objective
To build a robust predictive model that can estimate hourly water usage across various units in a residential complex using features like temperature, unit size, time-of-day indicators, and historical consumption patterns.
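As an illustration of how time-of-day indicators and historical consumption patterns can be turned into model inputs, here is a minimal sketch in plain Python. The function name `build_features`, the lag choices (1 and 24 hours), the 3-hour rolling window, and the toy data are assumptions made for this example and are not taken from the notebook. In pandas the same idea is usually expressed with `Series.shift` and `Series.rolling`.

```python
from datetime import datetime, timedelta
from statistics import mean


def build_features(readings, lags=(1, 24), window=3):
    """Build lag, rolling-mean, and time-flag features for ONE unit.

    readings: list of (timestamp, usage) tuples, sorted by hour, no gaps.
    Only past values feed each row, so the target hour never leaks
    into its own features.
    """
    usage = [u for _, u in readings]
    first = max(max(lags), window)  # first index with full history
    rows = []
    for i in range(first, len(readings)):
        ts = readings[i][0]
        row = {
            "timestamp": ts,
            "hour": ts.hour,                       # time-of-day indicator
            "is_weekend": int(ts.weekday() >= 5),  # Saturday=5, Sunday=6
            "rolling_mean": mean(usage[i - window:i]),
            "target": usage[i],
        }
        for lag in lags:
            row[f"lag_{lag}"] = usage[i - lag]
        rows.append(row)
    return rows


# Toy data (hypothetical): 48 hourly readings starting on a Saturday.
start_ts = datetime(2025, 1, 4, 0)
readings = [(start_ts + timedelta(hours=h), float(h % 5)) for h in range(48)]
features = build_features(readings)
print(len(features), features[0])
```

The first 24 hours are dropped because they lack a full day of history for the 24-hour lag. With several units, the same function would be applied per unit so that lags never cross unit boundaries.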
## 📂 Project Structure
- `capstone_updated_2_part_ml.ipynb`: Jupyter notebook containing the entire pipeline including:
  - Data ingestion and preprocessing
  - Feature engineering (lag features, rolling statistics, time flags)
  - Categorical encoding (label + embeddings)
  - Binary classification to detect zero vs non-zero usage
  - Deep learning regression model for non-zero usage values
  - Evaluation and visualizations

## 🛠️ Technologies Used
- Python (Pandas, NumPy, scikit-learn)
- TensorFlow / Keras
- SMOTE (for class imbalance)
- Label Encoding and Embedding for categorical variables
- Matplotlib, Seaborn for data visualization

## 🧠 Model Workflow
1. **Data Cleaning & Feature Engineering**:
   - Created lag, rolling average, and time decomposition features
   - Encoded categorical variables using embeddings and label encoders
2. **Classification Task**:
   - Trained a binary classifier to predict if water usage = 0 or not
   - SMOTE used to handle class imbalance
3. **Regression Task**:
   - For non-zero predictions, a deep learning model (ANN) was trained to predict the actual water usage amount
   - Performance evaluated using MAE, RMSE, R², MAPE

## 📊 Key Results
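For reference, the two-stage prediction and the regression metrics named in the workflow (MAE, RMSE, R², MAPE) can be sketched in plain Python. The README does not say how the two stages are combined at inference time, so the gating rule below (predict zero unless the classifier's non-zero probability reaches a 0.5 threshold) is an assumption. The function names and toy numbers are also hypothetical and are not results from the project.

```python
import math


def combine_two_stage(p_nonzero, predicted_amount, threshold=0.5):
    """Gate the regressor with the classifier (assumed combination rule)."""
    return [amt if p >= threshold else 0.0
            for p, amt in zip(p_nonzero, predicted_amount)]


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2, and MAPE (in percent) for paired sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    y_mean = sum(y_true) / n
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)  # assumes y_true is not constant
    r2 = 1 - ss_res / ss_tot
    # MAPE is undefined where the true value is 0, so those rows are skipped.
    nonzero = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    mape = (100 * sum(abs((t - p) / t) for t, p in nonzero) / len(nonzero)
            if nonzero else float("nan"))
    return {"mae": mae, "rmse": rmse, "r2": r2, "mape": mape}


# Toy example (hypothetical values).
y_true = [0.0, 2.0, 4.0, 0.0, 6.0]
p_nonzero = [0.1, 0.9, 0.8, 0.2, 0.95]       # classifier output
predicted_amount = [1.0, 2.5, 3.0, 0.5, 6.0]  # regressor output
y_pred = combine_two_stage(p_nonzero, predicted_amount)
metrics = regression_metrics(y_true, y_pred)
print(y_pred, metrics)
```

Skipping zero targets in MAPE matters for this kind of data: hourly usage is often exactly zero, and dividing by those values would make the metric undefined.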
- Achieved over **99% accuracy** in the classification task
- Regression model for non-zero usage delivered **low error rates** with robust generalization

## 📈 Future Improvements
- Incorporate weather API integration for real-time forecasting
- Use LSTM or TCN models for improved sequence modeling
- Deploy the model via Flask or FastAPI for real-time inference

## 🤝 Acknowledgements
Special thanks to:
- Connect IOT for real-world data and domain support
- UCD Capstone Program & Faculty Advisors
- Open-source contributors whose tools and libraries made this project possible

---