https://github.com/hariprasath-v/av-job-a-thon-november-2022
Build a machine learning/deep learning approach to forecast the total energy demand on an hourly basis for the next 3 years based on past trends.
exploratory-data-analysis kaggle lightgbm-regressor matplotlib numpy pandas python rmse-score seaborn sklearn statsmodels timeseries-forecasting
- Host: GitHub
- URL: https://github.com/hariprasath-v/av-job-a-thon-november-2022
- Owner: hariprasath-v
- License: apache-2.0
- Created: 2022-11-22T06:48:12.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-04T07:29:09.000Z (almost 2 years ago)
- Last Synced: 2025-01-13T01:45:01.605Z (5 months ago)
- Topics: exploratory-data-analysis, kaggle, lightgbm-regressor, matplotlib, numpy, pandas, python, rmse-score, seaborn, sklearn, statsmodels, timeseries-forecasting
- Language: HTML
- Homepage:
- Size: 5.42 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# AV-job-a-thon-november-2022
### Competition hosted on Analytics Vidhya
# About
### Build a machine learning/deep learning approach to forecast the total energy demand on an hourly basis for the next 3 years based on past trends.
### Initially, I thought a machine learning approach alone was enough to handle any time-series problem, but this competition proved me wrong.
### In a previous time-series competition, I used a boosting regressor model and it gave me a good leaderboard rank. I tried the same approach in this competition; it turned out to be a major blunder and resulted in my worst rank on the private leaderboard.
### The machine learning-based approach didn't learn the signal or patterns in the training data; the model learned only noise.
### In short, the boosting algorithm performed well on the training data but failed to generalize to the test data (overfitting!).
### I also tried a seasonal decomposition linear model, and it performed better than the boosting model.
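The README does not spell out the seasonal decomposition linear model, so the following is only a minimal sketch of the general idea, not the notebook's code. It assumes a daily cycle of 24 hours, fits a straight-line trend by ordinary least squares, and averages the detrended values per hour of the day. The function names (`fit_trend_seasonal`, `predict`, `rmse`) and the synthetic series are hypothetical. Comparing RMSE on a time-ordered hold-out against the training RMSE is one simple way to spot the kind of overfitting described above.

```python
import math
import random


def fit_trend_seasonal(y, period=24):
    """Fit y[t] ~ intercept + slope * t + seasonal[t % period] in two steps."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    # Closed-form ordinary least squares for the linear trend.
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Seasonal profile: mean detrended value at each position in the cycle.
    sums = [0.0] * period
    counts = [0] * period
    for t, v in enumerate(y):
        sums[t % period] += v - (intercept + slope * t)
        counts[t % period] += 1
    seasonal = [s / c for s, c in zip(sums, counts)]
    return intercept, slope, seasonal


def predict(model, start, steps):
    """Forecast `steps` values beginning at time index `start`."""
    intercept, slope, seasonal = model
    period = len(seasonal)
    return [intercept + slope * t + seasonal[t % period]
            for t in range(start, start + steps)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Synthetic hourly series, made up for illustration: trend + daily cycle + noise.
random.seed(0)
series = [
    100 + 0.01 * t + 10 * math.sin(2 * math.pi * t / 24) + random.gauss(0, 1)
    for t in range(24 * 60)
]

# Time-ordered split: the last 10 days are held out and never shuffled.
split = 24 * 50
train, valid = series[:split], series[split:]

model = fit_trend_seasonal(train)
train_rmse = rmse(train, predict(model, 0, len(train)))
valid_rmse = rmse(valid, predict(model, split, len(valid)))
print(f"train RMSE={train_rmse:.3f}, validation RMSE={valid_rmse:.3f}")
```

A model with this few parameters cannot memorize noise, which is one plausible reason a simple seasonal model can extrapolate over a long horizon better than a flexible boosted model.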
### Final Competition score is 583.858113004428
### Leaderboard Rank is 207
### Evaluation Metric is RMSE.
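For reference, RMSE is conventionally defined as below. The symbols are notation chosen here, not taken from the competition page: $y_t$ is the actual demand at hour $t$, $\hat{y}_t$ is the forecast, and $n$ is the number of evaluated hours.

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}
$$

Lower is better, and the value is expressed in the same units as the energy demand target.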
### File information
* av-job-a-thon-november-2022-eda.ipynb [Kaggle notebook](https://www.kaggle.com/hari141v/av-job-a-thon-november-2022-eda)
#### Basic Exploratory Data Analysis
#### Packages used:
* Seaborn
* Pandas
* NumPy
* Matplotlib
* av-job-a-thon-november-2022-model.ipynb [Kaggle notebook](https://www.kaggle.com/hari141v/av-job-a-thon-november-2022-model)
#### Data pre-processing and model
#### Packages used:
* Sklearn
* Pandas
* NumPy
* Matplotlib
* LightGBM
* SHAP
#### Created a LightGBM regressor model and evaluated it with RMSE.
#### [More detailed information about the model (PDF)](https://github.com/hariprasath-v/AV-job-a-thon-november-2022/blob/main/Approach_AV_job_a_thon_November_2022.pdf)
### LightGBM Model Feature Importances
### SHAP LightGBM Model Feature Importances
### SHAP: Top Features Impacting the Model
### SHAP: Top Features Influencing a Single Observation
### Energy Demand for the Next 3 Years
