{"id":26417706,"url":"https://github.com/compcode1/energy-output-prediction","last_synced_at":"2025-10-25T13:42:33.722Z","repository":{"id":262182273,"uuid":"886459863","full_name":"Compcode1/energy-output-prediction","owner":"Compcode1","description":"Develop a predictive model to forecast the hourly electrical energy output of a Combined Cycle Power Plant (CCPP) based on ambient environmental conditions.","archived":false,"fork":false,"pushed_at":"2024-11-11T02:36:44.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-11T03:26:17.679Z","etag":null,"topics":["machine-learning","random-forest","xgboost"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Compcode1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-11T02:29:39.000Z","updated_at":"2024-11-11T02:38:02.000Z","dependencies_parsed_at":"2024-11-11T03:26:22.119Z","dependency_job_id":"236a0571-7f42-4ac8-be8d-859363073a53","html_url":"https://github.com/Compcode1/energy-output-prediction","commit_stats":null,"previous_names":["compcode1/energy-output-prediction"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Compcode1%2Fenergy-output-prediction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Compcode1%2Fenergy-output-prediction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/Compcode1%2Fenergy-output-prediction/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Compcode1%2Fenergy-output-prediction/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Compcode1","download_url":"https://codeload.github.com/Compcode1/energy-output-prediction/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244135914,"owners_count":20403798,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","random-forest","xgboost"],"created_at":"2025-03-18T01:15:53.215Z","updated_at":"2025-10-25T13:42:33.668Z","avatar_url":"https://github.com/Compcode1.png","language":"Jupyter Notebook","readme":"In this project, we chose Random Forest and XGBoost as our primary algorithms for predicting the energy output of a Combined Cycle Power Plant (CCPP) based on environmental conditions. These models were selected due to their effectiveness with regression tasks and their ability to handle complex, nonlinear relationships within the data.\n\nRandom Forest: This ensemble method is known for its robustness and simplicity. It reduces overfitting by averaging the predictions of multiple decision trees, each trained on different data samples.\nXGBoost: This gradient boosting algorithm improves prediction accuracy through an iterative process, correcting errors from previous trees. 
It is highly efficient and offers powerful hyperparameter tuning capabilities, making it well-suited for this regression task.\nThe task type is regression, as we aim to predict a continuous output (electrical energy in MW) based on the features Temperature (AT), Exhaust Vacuum (V), Ambient Pressure (AP), and Relative Humidity (RH).\n\n6.2.2 Model Building\nTo determine the best model for this task, we followed a systematic model-building approach:\n\nBaseline Models: Both Random Forest and XGBoost models were initially trained with default settings to establish baseline performance.\nValidation Strategy: We split the data into training, validation, and test sets with an 80-10-10 split. This setup allowed us to tune the models on the validation set and test them on unseen data for final performance evaluation.\nHyperparameter Tuning:\nFor Random Forest, we used a grid search to tune parameters like n_estimators, max_depth, and min_samples_split.\nFor XGBoost, we used a grid search to optimize n_estimators, max_depth, learning_rate, and colsample_bytree.\nModel Comparison: After tuning, we compared both models based on performance metrics on the validation set. XGBoost showed slightly better performance, so it was selected as the final model.\n6.2.3 Model Evaluation\nThe final XGBoost model was evaluated on the test set using the following metrics:\n\nMean Absolute Error (MAE): 2.02 MW\nRoot Mean Squared Error (RMSE): 2.97 MW\nR-squared (R²): 0.970\nThese metrics suggest that the model generalizes well to unseen data. The MAE and RMSE are low, and the R² of 0.970 means the model explains about 97% of the variance in energy output.\n\n6.2.4 Model Interpretation\nTo interpret the model’s predictions, we conducted a feature importance analysis using SHAP (SHapley Additive exPlanations). 
SHAP values helped us understand the impact of each feature on the model's predictions:\n\nTemperature (AT): The most influential feature, where higher temperatures decrease predicted power output, while lower temperatures increase it.\nExhaust Vacuum (V): The second most impactful feature, with higher values reducing power output predictions.\nRelative Humidity (RH): Shows a non-linear impact, suggesting a more complex relationship with power output.\nAmbient Pressure (AP): Has a smaller impact on predictions, with a narrow spread of SHAP values.\nImplications for CCPP Energy Optimization\nThe feature importance analysis provides actionable insights for optimizing power output at a Combined Cycle Power Plant. For example:\n\nMonitoring Temperature and Exhaust Vacuum closely can help predict power output levels, allowing the plant to adjust operating parameters in response to environmental conditions.\nUnderstanding the effect of Relative Humidity and Ambient Pressure enables more nuanced adjustments, potentially helping the plant to maintain stable power output across different weather conditions.\nThis model interpretation improves the explainability of our predictions and highlights critical environmental factors affecting power output, supporting the power plant's goal of optimizing energy production efficiency.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompcode1%2Fenergy-output-prediction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcompcode1%2Fenergy-output-prediction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompcode1%2Fenergy-output-prediction/lists"}