https://github.com/g0bel1n/energybot

Ensae Cloud Computing Project

EnergyBot


EnergyBot is a Python project for Ensae's Cloud Computing course. It is an interactive platform that estimates your energy consumption and supplements the estimate with additional data such as wind estimates, temperatures, sunlight hours, etc.

---


Licence: MIT

---

## Onboarding

```
docker pull g0bel1n/energybot:latest
```
```
docker run -p 8501:8501 -d g0bel1n/energybot:latest
```

## The repo in detail

### Continuous integration

We have two (and a half) continuous integration (CI) procedures, launched at every push to the main branch:
- Testing. Pytest collects the tests from the tests folder and executes them.
- If testing passes, a Docker image is built and pushed to Docker Hub.
- Mirroring. The commits are shared with the GitLab repository for the course.

### Meteo Data

The meteo data folder contains the scripts that fetch meteorological data from various sites.

The data_setup.sh bash script checks which files are missing, downloads them, and processes them for the platform.
There are two main data providers. The first is Meteonet, which is open-source data from Météo-France. However, only data for the northwest and southeast of France are available. We request the files using a wget command.
The second source is dates-pratiques, a website from which we scrape the sunrise and sunset hours for this year, using BeautifulSoup4.
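The "check which files are missing" step can be illustrated with a minimal sketch. The actual logic lives in the data_setup.sh bash script; the Python below only renders the idea, and the function name and file names are hypothetical.

```python
from pathlib import Path
import tempfile


def missing_files(data_dir: Path, expected: list[str]) -> list[str]:
    """Return the expected file names that are not yet present in data_dir."""
    return [name for name in expected if not (data_dir / name).is_file()]


# Hypothetical file names, for illustration only.
EXPECTED = ["ground_stations.csv", "sun_hours.csv"]

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    # Pretend one of the two files was already downloaded.
    (data_dir / "sun_hours.csv").write_text("date,sunrise,sunset\n")
    to_download = missing_files(data_dir, EXPECTED)
    print(to_download)  # ['ground_stations.csv']
```

Only the files returned by such a check would then need to be downloaded and processed.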

The advantage of our method is that we don't need to fetch data at runtime, so we only have to launch these scripts once.
Even though the base files are about 3 to 4 GB, the processed data we need is only around 30-40 MB. We therefore decided to add it to the Docker image directly instead of running the bash script at build time. This is faster and avoids downloading the same files at each build. It is a reasonable trade-off, as the Docker image is about 300 MB.

### Talky

We used to work with Rasa. Rasa is an open-source framework for Natural Language Understanding (NLU) built around the Dual Intent and Entity Transformer (DIET). It can interact with databases, APIs, and conversational streams for interactive learning with neural network reinforcement.

Deploying Rasa required three terminals: one for a Postgres database, one for the actions, and a third for launching the API. This meant creating three Dockerfiles and linking them together, which might not be easy. Also, the bot only recognised a well-defined lexical field and only returned predefined expressions.

Since this was no different from a question-answering system, we ended up hardcoding the bot ourselves, using OOP. It handles the logic and computation part of our project.
The class has private and public methods (in the Python sense), as some methods should not be called from other scripts. It also has a static method.
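As a minimal sketch of that layout (not the project's actual class), the example below shows a public method, a method that is private by convention (leading underscore), and a static method. The class name, method names, and the consumption formula are all hypothetical.

```python
class Bot:
    """Hypothetical stand-in for the bot class; all names are illustrative."""

    def __init__(self, surface_m2: float) -> None:
        self.surface_m2 = surface_m2

    def answer(self, question: str) -> str:
        """Public entry point: route a question to the right computation."""
        if "consumption" in self._normalize(question):
            return f"Estimated consumption: {self._estimate_kwh():.1f} kWh"
        return "Sorry, I did not understand the question."

    def _estimate_kwh(self) -> float:
        """Private by convention; the formula is arbitrary."""
        return 0.5 * self.surface_m2

    @staticmethod
    def _normalize(text: str) -> str:
        """Static helper: needs no instance state."""
        return text.strip().lower()


bot = Bot(surface_m2=40.0)
print(bot.answer("What is my CONSUMPTION?"))  # Estimated consumption: 20.0 kWh
```

Python does not enforce privacy; the leading underscore only signals that other scripts should go through `answer`.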

### Frontend

We chose Streamlit to build our frontend for its simplicity and because we had experience with it. We also used an unofficial package, `streamlit-chat`, to display the conversation between the user and the bot.

In the frontend, we implemented a simple class for messages and a factory that generates the unique IDs required by `streamlit-chat`.
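A message class with an ID factory could look like the sketch below. This is an assumption about the general shape, not the project's code: the names `Message` and `next_key` and the `msg-N` key format are hypothetical, and the Streamlit rendering call is left out so the example runs on its own.

```python
from dataclasses import dataclass, field
from itertools import count

# Module-level counter backing the factory (hypothetical design).
_ids = count()


def next_key() -> str:
    """Factory producing a fresh, unique key on every call."""
    return f"msg-{next(_ids)}"


@dataclass
class Message:
    text: str
    is_user: bool
    # Each message gets its own key at creation time.
    key: str = field(default_factory=next_key)


conversation = [
    Message("How much energy will I use tomorrow?", is_user=True),
    Message("Let me estimate that for you.", is_user=False),
]
print([m.key for m in conversation])  # ['msg-0', 'msg-1']
```

Each message's `key` would then be passed along when the message is displayed, so that no two chat widgets share an identifier.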

### Consumption Prediction

This part is a stand-in for a genuine data analysis and forecasting study of energy consumption. We did not dwell too much on it, as it was not the primary goal of this project. We synthesized data using an arbitrary formula to compute a target, and added noise to make it more realistic.
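To make the idea concrete, here is a minimal sketch of synthesizing a noisy target from an arbitrary formula. The formula, its coefficients, and the feature names are invented for illustration and are not the ones used in the project.

```python
import random


def synthetic_consumption(temperature_c, sun_hours, rng, noise_std=1.0):
    """Arbitrary, illustrative formula plus Gaussian noise (in kWh)."""
    # More heating below 18 degrees C, less lighting when the sun is up longer.
    base = 10.0 + 0.8 * max(0.0, 18.0 - temperature_c) - 0.3 * sun_hours
    return base + rng.gauss(0.0, noise_std)


rng = random.Random(0)  # fixed seed so the dataset is reproducible
dataset = []
for _ in range(5):
    temperature = rng.uniform(-5.0, 30.0)
    sun = rng.uniform(8.0, 16.0)
    dataset.append((temperature, sun, synthetic_consumption(temperature, sun, rng)))

for row in dataset:
    print(row)
```

Rows like these, with the last column as the target, are the kind of input a regression model can be trained on.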

We simply used a RandomForest algorithm for prediction, without fine-tuning or the usual methods such as cross-validation.

The data generation is done concurrently, as the main bottleneck was I/O latency when opening files. We also used caching to avoid too much overhead. It might be worth mentioning that we used wrappers to check the input format of some functions.
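The three techniques mentioned above (concurrency for I/O-bound work, caching, and input-checking wrappers) can be combined as in the following sketch. It assumes that "wrappers" means Python decorators and that threads are used for concurrency; the function names are hypothetical, and the file read is replaced by a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache, wraps


def expects_int_year(func):
    """Wrapper that checks the input format before calling func."""
    @wraps(func)
    def wrapper(year):
        if isinstance(year, bool) or not isinstance(year, int):
            raise TypeError(f"year must be an int, got {type(year).__name__}")
        return func(year)
    return wrapper


@expects_int_year
@lru_cache(maxsize=None)  # repeated requests for the same year hit the cache
def load_year(year):
    # Placeholder for an I/O-bound file read (hypothetical).
    return f"data-{year}"


def load_years(years):
    """Load several years concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(load_year, years))


print(load_years([2016, 2017, 2018]))  # ['data-2016', 'data-2017', 'data-2018']
```

Threads suit this case because the work is dominated by waiting on files rather than by computation.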

## Image building

If you want to build the EnergyBot image, we recommend not using a Windows-based OS. There are two cases:

- The RUN instruction is commented out in the Dockerfile: in this case, run the following commands in bash from the ./EnergyBotApp directory to download the data, then build the image
```
chmod +x meteo/get_data.sh data_setup.sh
./data_setup.sh -y 2018
```

- The RUN instruction is uncommented in the Dockerfile: in this case, simply build the image

To build the image:
```
docker build -t energybot1 .
```