https://github.com/dominodatalab/reference-project-wind-turbine
- Host: GitHub
- URL: https://github.com/dominodatalab/reference-project-wind-turbine
- Owner: dominodatalab
- License: apache-2.0
- Created: 2023-09-21T21:18:24.000Z (over 1 year ago)
- Default Branch: release-1.0.0
- Last Pushed: 2024-02-05T18:54:53.000Z (11 months ago)
- Last Synced: 2024-02-05T20:19:18.102Z (11 months ago)
- Language: Jupyter Notebook
- Size: 2.56 MB
- Stars: 0
- Watchers: 5
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.md
# Wind Turbine Output Prediction using SCADA data
## License
This template is licensed under Apache 2.0 and contains the following open source components:
* scikit-learn [BSD 3](https://github.com/scikit-learn/scikit-learn/blob/main/COPYING)
* pandas [BSD 3](https://github.com/pandas-dev/pandas/blob/main/LICENSE)
* matplotlib [MDT](https://matplotlib.org/stable/users/project/license.html)
## Context
In this project we train a predictive model on Supervisory Control and Data Acquisition (SCADA) data captured from a physical wind turbine. SCADA systems are used for controlling, monitoring, and analyzing industrial devices and processes. The SCADA concept was developed as a universal means of remote access to a variety of local control modules, which may come from different manufacturers, through standard automation protocols. Here we demonstrate how to train a machine learning model using a freely available SCADA dataset from [Kaggle](https://www.kaggle.com/datasets/berkerisen/wind-turbine-scada-dataset).
## Dataset
The samples in this dataset are distributed as a .CSV file with the following attributes:
* Date/Time --- timestamp of the observation (10-minute intervals)
* LV ActivePower (kW) --- the amount of power generated by the turbine at that timestamp (in kW)
* Wind Speed (m/s) --- the wind speed as measured at the hub height of the turbine
* Theoretical_Power_Curve (KWh) --- the theoretical power that the turbine generates at that wind speed, as provided by the turbine manufacturer
* Wind Direction (degrees) --- the wind direction at the hub height of the turbine (the turbine turns in this direction automatically)
## Assets
This project contains the following assets:
* ```WindTurbineScada.ipynb``` --- a notebook demonstrating data ingestion, exploratory data analysis, model building, and evaluation
* ```train.py``` --- a model training script, which can be run as a [Domino job](https://docs.dominodatalab.com/en/latest/user_guide/942549/jobs/) to retrain the model (e.g. if new data is available)
* ```score.py``` --- a scoring function, which can be deployed as a [Domino Model API](https://docs.dominodatalab.com/en/latest/user_guide/8dbc91/deploy-models-at-rest/)
* ```model.bin``` --- a pickled version of a pre-trained ```ExtraTreesRegressor``` model
* ```data/T1.csv``` --- the original dataset
### Hardware Requirements
This project works with a standard small-sized hardware tier, such as the small-k8s tier on all Domino deployments.
### Environment Requirements
This project can be run with a Domino Standard Compute Environment that has Python 3.9 or above.
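To try the workflow in such an environment without the notebook, the following is a minimal sketch of the train, pickle, and score steps described above. It is not the code in ```train.py``` or ```score.py```: the helper names, the feature selection, the hyperparameters, and the synthetic stand-in data are all assumptions made for illustration. With the real dataset you would load ```data/T1.csv``` via ```pd.read_csv``` instead, and the column names in that file may differ slightly from the ones listed in this README.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

# Column names follow the README; the feature choice is an assumption.
TARGET = "LV ActivePower (kW)"
FEATURES = ["Wind Speed (m/s)", "Wind Direction (degrees)"]


def make_synthetic_scada(n_rows=200, seed=0):
    """Build a made-up frame shaped like the dataset (not real measurements)."""
    rng = np.random.default_rng(seed)
    wind_speed = rng.uniform(0.0, 25.0, n_rows)
    # Toy power curve: cubic growth capped at 3600 kW, purely illustrative.
    power = np.minimum(3600.0, 3.0 * wind_speed ** 3)
    return pd.DataFrame({
        "Date/Time": pd.date_range("2018-01-01", periods=n_rows, freq="10min"),
        TARGET: power,
        "Wind Speed (m/s)": wind_speed,
        "Theoretical_Power_Curve (KWh)": power,
        "Wind Direction (degrees)": rng.uniform(0.0, 360.0, n_rows),
    })


def train(df):
    """Fit an ExtraTreesRegressor on the chosen features (hypothetical settings)."""
    model = ExtraTreesRegressor(n_estimators=50, random_state=0)
    model.fit(df[FEATURES], df[TARGET])
    return model


def predict(model, wind_speed, wind_direction):
    """Score a single observation and return the predicted power in kW."""
    row = pd.DataFrame([[wind_speed, wind_direction]], columns=FEATURES)
    return float(model.predict(row)[0])


# With the real data: df = pd.read_csv("data/T1.csv")
df = make_synthetic_scada()
model = train(df)

# Round-trip through pickle, mirroring how a pre-trained model file is stored.
restored = pickle.loads(pickle.dumps(model))
prediction = predict(restored, wind_speed=10.0, wind_direction=180.0)
print(f"Predicted power: {prediction:.1f} kW")
```

A function shaped like ```predict``` takes plain numeric arguments and returns a plain float, which keeps the scoring step easy to call from outside the training code.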