https://github.com/tzoral/transfer-learning-forecasting
This is the github repository for our research paper on "Transfer Learning for Day-Ahead Load Forecasting: A Case Study on European National Electricity Demand Time Series." This repository serves as a comprehensive resource for the code and experiments conducted as part of our study (https://www.mdpi.com/2227-7390/12/1/19).
- Host: GitHub
- URL: https://github.com/tzoral/transfer-learning-forecasting
- Owner: TzorAL
- Created: 2023-11-20T14:30:56.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-25T08:16:53.000Z (7 months ago)
- Last Synced: 2024-11-13T10:41:45.461Z (3 months ago)
- Topics: ensemble-learning, hyperparameter-tuning, multi-country, python, time-series-forecasting, transfer-learning
- Language: Python
- Homepage: https://tzoral.github.io/transfer-learning-forecasting/
- Size: 74.2 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# transfer-learning-forecasting
This is the GitHub repository for our research paper, "Transfer Learning for Day-Ahead Load Forecasting: A Case Study on European National Electricity Demand Time Series" (see more details [here](https://www.mdpi.com/2227-7390/12/1/19)). It contains the code and experiments conducted as part of our study.
## Installation
This project uses [MLflow](https://mlflow.org/docs/latest/index.html) to manage the stages of the pipeline. Each stage can be run independently as an entrypoint, and its inputs and outputs are stored in its respective MLflow run; see the **MLProject** file for the inputs of each entrypoint.

| Entrypoint | Filename |
|:----------:|:-----------------------------:|
| main | main.py |
| load | load_raw_data.py |
| etl | etl.py |
| optuna | forecasting_model_optuna.py |
| model | forecasting_model.py |
| ensemble | forecasting_model_ensemble.py |
| eval | forecasting_model_eval.py |
| snaive* | forecasting_model_naive.py |

\* **snaive** is an independent entrypoint used to compare our scenarios against naive forecasts.

**model_utils.py** is a Python file containing general-purpose functions used in more than one pipeline stage.
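
For context on the **snaive** baseline mentioned above, here is a minimal sketch of a seasonal naive forecast. It is not the code in `forecasting_model_naive.py`. The function name `seasonal_naive_forecast` and the 24-hour season length are assumptions chosen to match hourly, day-ahead data.

```python
from typing import List, Sequence


def seasonal_naive_forecast(
    history: Sequence[float], horizon: int = 24, season_length: int = 24
) -> List[float]:
    """Forecast `horizon` steps by repeating the last observed season."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = list(history[-season_length:])
    # Step h of the forecast copies the value observed one season earlier,
    # wrapping around when the horizon exceeds the season length.
    return [last_season[h % season_length] for h in range(horizon)]


# Two days of toy hourly values (illustrative only, not real load data).
hourly_load = [float(h) for h in range(48)]
day_ahead = seasonal_naive_forecast(hourly_load)
```

With these toy values, the day-ahead forecast simply repeats the second day of the history, hour by hour.
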

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install MLflow.
```bash
pip install mlflow
```
When an entrypoint is executed, **MLProject** checks the package dependencies (see **python_env.yaml**) and installs them.

## Data format
Our pipeline can process multiple files from the same or different countries.
Data files must:
* all be in a single directory, passed by the user as `dir_in`
* be in CSV format
* contain one datetime column (named "Start") and one column of energy load data (named "Load")
* be sampled at 1-hour intervals
* be named after the country (e.g. Greece) or country code (e.g. GR) they represent

## Usage
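
Before running the pipeline, it can help to check that an input file meets the data format requirements. The sketch below uses only the standard library. `check_load_csv` is a hypothetical helper, not part of this repository, and it assumes the "Start" column holds ISO 8601 timestamps, which the requirements do not specify.

```python
import csv
import io
from datetime import datetime, timedelta
from typing import List, TextIO


def check_load_csv(handle: TextIO) -> List[str]:
    """Return a list of problems found in one country's load file."""
    reader = csv.DictReader(handle)
    missing = {"Start", "Load"} - set(reader.fieldnames or [])
    if missing:
        return [f"missing column(s): {', '.join(sorted(missing))}"]

    problems: List[str] = []
    previous = None
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            # Assumption: timestamps are ISO 8601, e.g. "2015-01-01 00:00:00".
            current = datetime.fromisoformat(row["Start"])
            float(row["Load"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: unparsable Start or Load value")
            continue
        if previous is not None and current - previous != timedelta(hours=1):
            problems.append(f"line {line_no}: not 1 hour after the previous row")
        previous = current
    return problems


# Toy file with a missing 02:00 row (illustrative values only).
sample = io.StringIO(
    "Start,Load\n"
    "2015-01-01 00:00:00,5000.0\n"
    "2015-01-01 01:00:00,4800.5\n"
    "2015-01-01 03:00:00,4700.0\n"
)
problems = check_load_csv(sample)
```

The check covers the column names and the 1-hour interval only; the file naming rule (country name or code) would need a separate check on the filename.
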
As mentioned, **MLProject** exposes a wide range of parameters to tune for each stage of the pipeline.
While each entrypoint has its own parameters, the **main** entrypoint accepts all parameters required by any entrypoint and distributes them accordingly:

| Parameters | Type | Default Value | Description |
|:--------------:|:----:|:--------------------------:|:---------------------------------------------------------------:|
| stages | str | 'all' | comma-separated entrypoint names to run from the pipeline |
| dir_in | str | '../original_data/' | Folder path containing csv files used by the model |
| local_tz | bool | False | flag if you want local (True) or UTC (False) timezone |
| src_countries | str | 'Portugal' | csv names from dir_in used by the source model |
| tgt_countries | str | 'Portugal' | csv names from dir_in used by the target model |
| seed | str | '42' | seed used to set random state to the model |
| train_years | str | '2015,2016,2017,2018,2019' | list of years to use for training set |
| val_years | str | '2020' | list of years to use for validation set |
| test_years | str | '2021' | list of years to use for testing set |
| n_trials | str | '2' | number of trials, each tuning a different set of hyperparameters |
| max_epochs | str | '3' | range of number of epochs used by the model |
| n_layers | str | '1' | range of number of layers used by the model |
| layer_sizes | str | "100" | range of size of each layer used by the model |
| l_window | str | '240' | range of lookback window (input layer size) used by the model |
| f_horizon | str | '24' | range of forecast horizon (output layer size) used by the model |
| l_rate | str | '0.0001' | range of learning rate used by the model |
| activation | str | 'ReLU' | activation functions experimented on by the model |
| optimizer_name | str | 'Adam' | optimizers experimented on by the model |
| batch_size | str | '1024' | batch sizes experimented on by the model |
| transfer_mode | str | "0" | indicator to use transfer learning techniques |
| num_workers | str | '2' | accelerator (CPU/GPU) processors and threads used |
| tl_model_uri | str | None | uri path for accessing model used for transfer learning |
| n_estimators | str | '3' | number of estimators (models) used in ensembling |
| test_case | str | '1' | indicator of scenario that is being used |

**Example:** locally train a model on the Greek-Spanish dataset, apply AbO warm-start transfer learning to Italy, and store the run in an experiment named "full_pipeline":
```bash
mlflow run . --env-manager=local -P stages='all' \
    -P src_countries='Greece,Spain' -P tgt_countries='Italy' \
    -P test_case=2 -P transfer_mode=1 \
    --experiment-name=full_pipeline
```

**transfer_mode** and **test_case** are integers determined by the following enums:
```python
from enum import IntEnum


class TestCase(IntEnum):
    NAIVE = 0
    BASELINE = 1
    AbO = 2
    CbO = 3


class Transfer(IntEnum):
    NO_TRANSFER = 0
    WARM_START = 1
```
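
As a sketch of how these enum values map onto the `-P` flags of `mlflow run`, the snippet below assembles the command from a parameter mapping. `build_mlflow_command` is a hypothetical helper, not part of this repository. The enums are repeated so the snippet is self-contained, and the parameter names come from the table above.

```python
import shlex
from enum import IntEnum
from typing import Dict, List


class TestCase(IntEnum):
    NAIVE = 0
    BASELINE = 1
    AbO = 2
    CbO = 3


class Transfer(IntEnum):
    NO_TRANSFER = 0
    WARM_START = 1


def build_mlflow_command(
    params: Dict[str, object], experiment_name: str
) -> List[str]:
    """Assemble the argument list for `mlflow run` from a parameter mapping."""
    command = ["mlflow", "run", ".", "--env-manager=local"]
    for name, value in params.items():
        # Enum members are passed as their integer value, e.g. 2 for AbO.
        if isinstance(value, IntEnum):
            value = int(value)
        command += ["-P", f"{name}={value}"]
    command.append(f"--experiment-name={experiment_name}")
    return command


command = build_mlflow_command(
    {
        "stages": "all",
        "src_countries": "Greece,Spain",
        "tgt_countries": "Italy",
        "test_case": TestCase.AbO,
        "transfer_mode": Transfer.WARM_START,
    },
    experiment_name="full_pipeline",
)
print(shlex.join(command))  # shell-ready form of the argument list
```

The resulting list could be passed to `subprocess.run(command, check=True)` from the repository root. Using named enum members instead of bare integers keeps the chosen scenario readable at the call site.
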

A single entrypoint can be executed with the "-e" flag.
**Example:** execute "optuna" entrypoint and store run in an experiment with name "optuna_entrypoint":
```bash
mlflow run . --env-manager=local -e optuna --experiment-name=optuna_entrypoint
```

## Contributing
Pull requests are welcome. For major changes, please open an issue first
to discuss what you would like to change.

## License