https://github.com/devprojectekla/heliocity

HelioCity challenge. This project, developed as a challenge, focuses on importing data from .csv files into a PostgreSQL database and performing various data manipulation tasks.



Host: GitHub
URL: https://github.com/devprojectekla/heliocity
Owner: DevprojectEkla
Created: 2024-05-07T13:32:35.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-06-20T16:25:18.000Z (about 2 years ago)
Last Synced: 2025-03-22T16:11:45.454Z (over 1 year ago)
Language: Python
Homepage:
Size: 64.5 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: Readme.md

# Heliocity: Backend Challenge

## Description of Various Functionalities

> ### 0a. The `main.py` file provides an overview of a possible scenario (see section C below) combining all described functionalities.

> ### 0b. The `tests.py` file allows running tests for different functionalities.

### 1. The `database_handler.py` file with the `DatabaseHandler` class and its `process_csv_file()` method

- Imports `.csv` files into the database:

    1. From the weather API

    2. From the calculator

> ### Import Optimization:

> Uses various parallelism methods (`map_async`, `apply_async`, `map`) from Python's native `multiprocessing.Pool` class.

>    

> Strategies under development:

> - Splitting into smaller files

> - Implementation in a low-level language like Rust
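
As a rough illustration of the three `multiprocessing.Pool` methods named above, the following sketch parses chunks of rows in worker processes. It is not the project's code: `chunked`, `parse_chunk`, `parse_parallel`, and the two-column row layout are hypothetical names and assumptions chosen for the example.

```python
from multiprocessing import Pool


def chunked(rows, size):
    """Split a list of rows into consecutive chunks of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parse_chunk(chunk):
    """Hypothetical per-chunk work: convert the value column to float."""
    return [(timestamp, float(value)) for timestamp, value in chunk]


def parse_parallel(rows, chunk_size=2, workers=2):
    """Parse the same chunks with map, map_async and apply_async."""
    chunks = chunked(rows, chunk_size)
    with Pool(processes=workers) as pool:
        # map blocks until every chunk has been processed.
        blocking = pool.map(parse_chunk, chunks)
        # map_async returns an AsyncResult at once; get() waits for the results.
        deferred = pool.map_async(parse_chunk, chunks).get()
        # apply_async submits one call at a time, each with its own AsyncResult.
        pending = [pool.apply_async(parse_chunk, (chunk,)) for chunk in chunks]
        individual = [result.get() for result in pending]
    return blocking, deferred, individual
```

All three calls produce the same parsed chunks; they differ in whether the caller blocks (`map`) or receives a handle to collect later (`map_async`, `apply_async`). On platforms that start workers with `spawn`, call `parse_parallel(...)` from under an `if __name__ == "__main__":` guard.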

### 2. The `database_selector.py` file and its associated `DatabaseSelector` class with various methods

- Adjusts weather data from a 5-minute to a 15-minute time step to match the calculator's step.

  

> Upcoming features:

> - Dynamic specification of the initial and target time steps

> - Creation of SQL sub-tables containing a selected data range (time range, temperature, etc.) derived from the original table
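
The time-step adjustment can be pictured with a small standard-library sketch. It assumes that each 15-minute value is the mean of the 5-minute readings falling in that window; the README does not say which aggregation `DatabaseSelector` actually uses, and `floor_to_step` and `resample_mean` are hypothetical helpers. The `step_minutes` parameter hints at how a configurable target step might look.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def floor_to_step(ts, step_minutes):
    """Round a datetime down to the start of its `step_minutes` window."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    minutes = int((ts - midnight).total_seconds() // 60)
    return midnight + timedelta(minutes=minutes - minutes % step_minutes)


def resample_mean(records, step_minutes=15):
    """Group (timestamp, value) pairs by window and average each group."""
    buckets = defaultdict(list)
    for ts, value in records:
        buckets[floor_to_step(ts, step_minutes)].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Six readings at a 5-minute step: 10.0, 11.0, ..., 15.0
start = datetime(2024, 1, 1, 0, 0)
readings = [(start + timedelta(minutes=5 * i), float(10 + i)) for i in range(6)]
resampled = resample_mean(readings)
# Two 15-minute windows remain, with means 11.0 and 14.0.
```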

### 3. The `json_generator.py` file and its associated `JSONGenerator` class

- Manipulates database data to generate a `.json` file for visualization.

- Provides data preview with the option to filter out aberrant values.
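
A minimal sketch of the idea, not the actual `JSONGenerator` implementation: rows are filtered with a plain range check (one possible definition of "aberrant", assumed here) and then serialized with the standard `json` module. The function names, column names, and bounds are hypothetical.

```python
import json


def filter_aberrant(records, column, low, high):
    """Keep only the records whose `column` value lies within [low, high]."""
    return [record for record in records if low <= record[column] <= high]


def to_json(records, indent=2):
    """Serialize a list of row dictionaries to a JSON string."""
    return json.dumps(records, indent=indent)


rows = [
    {"time": "2024-01-01T00:00", "temperature": 4.2},
    {"time": "2024-01-01T00:15", "temperature": -999.0},  # e.g. a sensor error code
    {"time": "2024-01-01T00:30", "temperature": 4.6},
]
clean = filter_aberrant(rows, "temperature", low=-60.0, high=60.0)
payload = to_json(clean)
```

Writing `payload` to a file would give the `.json` output; the range bounds would need to be chosen per variable.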

# Getting Started

## A. Prerequisites

> Guidelines for a Linux environment

- A configured, running PostgreSQL server.

- A new database, created and configured on that server.

- A `config.json` file edited with the parameters needed to connect to the database.
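
The layout of `config.json` is not described here, so the following is only a sketch of how such a file might be read and checked. The key names (`host`, `port`, `dbname`, `user`, `password`) are assumptions borrowed from standard PostgreSQL connection keywords, and `load_db_config` and `to_dsn` are hypothetical helpers.

```python
import json

# Assumed key names; the real `config.json` may use different ones.
REQUIRED_KEYS = ("host", "port", "dbname", "user", "password")


def load_db_config(text):
    """Parse JSON text and check that every assumed key is present."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")
    return config


def to_dsn(config):
    """Build a 'key=value' connection string (values containing spaces would need quoting)."""
    return " ".join(f"{key}={config[key]}" for key in REQUIRED_KEYS)


example = (
    '{"host": "localhost", "port": 5432, "dbname": "heliocity", '
    '"user": "helio", "password": "secret"}'
)
dsn = to_dsn(load_db_config(example))
```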

## B. Installation

#### Clone files from the Git repo:

```bash
git clone https://github.com/DevprojectEkla/HelioCity
cd HelioCity
```

#### Create a Virtual Environment:

```bash
python -m venv env
```

#### Activate the Virtual Environment:

```bash
source env/bin/activate  # On Linux
```

#### Install Dependencies:

```bash
pip install -r requirements.txt
```

#### (Optional) Create a `data/` Folder for Your `.csv` Files:

```bash
mkdir data
```

## C. Usage Scenario Example Using Our Classes

`main.py` can be launched with arguments; without them, a series of prompts asks for:

- The table name: either an existing table or the name of a new table to create in the database from the imported file.

- If applicable, the path to the `.csv` file to import into the database.

The optional `-f` flag selects a simple import method; without it, the import defaults to the parallelism-based method.

```bash
python main.py [table_name] [path_to_csv_file] [-f]
```
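
One way such a command line could be parsed is sketched below with `argparse`. This is an assumption about the approach, not a copy of `main.py`: `build_parser`, `resolve_inputs`, the `simple_import` attribute, and the prompt texts are all hypothetical.

```python
import argparse


def build_parser():
    """Two optional positional arguments and a -f flag, as in the usage line."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("table_name", nargs="?")
    parser.add_argument("path_to_csv_file", nargs="?")
    parser.add_argument("-f", action="store_true", dest="simple_import")
    return parser


def resolve_inputs(args, ask=input):
    """Fall back to interactive prompts for any value not given as an argument."""
    table = args.table_name or ask("Table name: ")
    csv_path = args.path_to_csv_file or ask("Path to .csv file (optional): ")
    return table, csv_path, args.simple_import


# Parsing an explicit list keeps the example independent of sys.argv.
args = build_parser().parse_args(["meteo_data", "./data/meteo_data.csv", "-f"])
```

Passing `ask` as a parameter keeps the prompting logic testable without a terminal.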

### Example Scenario:

- Import a table from `./data/meteo_data.csv` for preprocessing or from `./data/test_helio.csv` for post-processing.

- Filter out aberrant data and specify a time interval.

- Insert a new variable called `python_calc`\* into a table so it can be represented over time.

> \* In preprocessing, `python_calc` is the wind chill temperature, computed from temperature, wind speed, and relative humidity. In post-processing, it is a test calculation (to be replaced with a relevant formula).

- Generate a `.json` file from this preview data for future use in another context.
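
To make the calculated-variable step concrete, here is a hedged sketch that fills `python_calc` with a wind chill value. The project's own formula, which also takes relative humidity, is not shown in this README; the sketch substitutes the standard wind chill index adopted in 2001 by Environment Canada and the US National Weather Service, which uses only air temperature (°C) and wind speed (km/h). The column names `temperature` and `wind_speed` are assumptions.

```python
def wind_chill_celsius(temperature_c, wind_kmh):
    """Standard wind chill index in °C (intended for T <= 10 °C and wind > 4.8 km/h)."""
    v = wind_kmh ** 0.16
    return 13.12 + 0.6215 * temperature_c - 11.37 * v + 0.3965 * temperature_c * v


def add_python_calc(rows):
    """Return copies of the rows with a `python_calc` value added."""
    return [
        {**row, "python_calc": wind_chill_celsius(row["temperature"], row["wind_speed"])}
        for row in rows
    ]


rows = [{"time": "2024-01-01T00:00", "temperature": -10.0, "wind_speed": 20.0}]
with_calc = add_python_calc(rows)
```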

## D. Independent Usage of Different Scripts

### Importing CSV Data into PostgreSQL

#### Run `database_handler.py`:

```bash
python database_handler.py
```

You will be prompted for:

- The name of the new table to create (default: `meteo_data`).

- The path to the `.csv` data file (default: `./data/meteo_data.csv`).

- The data origin (weather or calculator). Calculator columns are processed in portions of an adjustable number of lines: answer `'y'` if the file is large, or `'n'` or `''` (empty) otherwise.

> Warning: Importing large `.csv` files from the calculator can take some time, depending on the computer's memory. Adjust the number of lines per portion to the available memory.
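
Processing a file in portions of a fixed number of lines keeps memory use bounded, because only one portion is held at a time. The sketch below shows one way to do this with the standard library; `iter_chunks` and the sample columns are hypothetical, and the real `process_csv_file()` may work differently.

```python
import csv
import io
from itertools import islice


def iter_chunks(lines, lines_per_chunk):
    """Yield (header, rows) pairs with at most `lines_per_chunk` rows each."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    while True:
        chunk = list(islice(reader, lines_per_chunk))
        if not chunk:
            return
        yield header, chunk


# An in-memory file stands in for a large .csv on disk.
sample = io.StringIO("time,power\n0,1.0\n1,1.5\n2,2.0\n3,2.5\n4,3.0\n")
sizes = [len(chunk) for _, chunk in iter_chunks(sample, lines_per_chunk=2)]
```

Here the five data rows are read as portions of 2, 2, and 1 rows; a smaller `lines_per_chunk` trades speed for a lower memory peak.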

### Pre- and Post-Processing Data Manipulations

#### Using `DatabaseSelector` Class from `database_selector.py`:

Data manipulations can be performed using the `DatabaseSelector` class to create new tables in the database. It allows:

- Creating sub-tables by interval of interest.

- Aggregating weather data at the calculator's timestep.

- Inserting calculated variables from existing table variables.
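
The sub-table idea can be sketched in a few lines of SQL driven from Python. To stay self-contained, the example uses the standard-library `sqlite3` module as a stand-in for the PostgreSQL connection; the table names, column names, and `create_subtable` function are hypothetical and do not reflect the `DatabaseSelector` API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.execute("CREATE TABLE meteo_data (ts TEXT, temperature REAL)")
conn.executemany(
    "INSERT INTO meteo_data VALUES (?, ?)",
    [
        ("2024-01-01T00:00", 3.0),
        ("2024-01-01T12:00", 6.5),
        ("2024-01-02T00:00", 2.0),
        ("2024-01-03T00:00", 1.0),
    ],
)


def create_subtable(conn, start, end):
    """Copy the rows whose timestamp lies in [start, end] into a new table."""
    conn.execute("CREATE TABLE meteo_subset (ts TEXT, temperature REAL)")
    conn.execute(
        "INSERT INTO meteo_subset "
        "SELECT ts, temperature FROM meteo_data WHERE ts BETWEEN ? AND ?",
        (start, end),
    )
    return conn.execute("SELECT COUNT(*) FROM meteo_subset").fetchone()[0]


# ISO-formatted timestamps compare correctly as text.
copied = create_subtable(conn, "2024-01-01T06:00", "2024-01-02T12:00")
```

The interval bounds are passed as query parameters rather than formatted into the SQL string, which avoids injection issues when the values come from user prompts.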

For a test, simply run the command:

```bash
python database_selector.py
```

Then follow the on-screen instructions.

#### Using `JSONGenerator` Class from `json_generator.py`:

This class only reads from the database and never writes to it. It loads data into dataframes for easy manipulation and visualization, and generates a `.json` file.
