https://github.com/bessouat40/python-data-loader
Python service : insert data from a directory inside postgres db or inside elastic.
- Host: GitHub
- URL: https://github.com/bessouat40/python-data-loader
- Owner: Bessouat40
- Created: 2024-03-29T09:22:15.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-07-15T12:51:03.000Z (6 months ago)
- Last Synced: 2024-11-22T00:28:44.219Z (about 1 month ago)
- Topics: api-rest, data-insertion, data-loader, database, databases, elasticsearch, fastapi, kibana, postgresql, python
- Language: Python
- Homepage:
- Size: 742 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Data Ingestor Service
This project is a Python service that ingests data into a `Postgres` database and into `Elastic`.
- **Postgres:** It reads every CSV file in `data/postgres` and inserts the rows into the database.
- **Elastic:** It reads every document in `data/elastic` and inserts it into Elastic.

![schema](./media/schema.jpeg)
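The Postgres step can be pictured as "find each CSV, then pick a table whose columns fit its header". The sketch below is illustrative only and never touches a database. The function names (`read_header`, `match_table`, `plan_ingestion`) and the exact-set matching rule are assumptions, chosen to mirror the "columns doesn't match tables" warning in the sample output further down; the service's real matching logic may differ.

```python
import csv
import warnings
from pathlib import Path


def read_header(csv_path):
    """Return the first row of a CSV file as a list of column names."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


def match_table(header, tables):
    """Return the first table whose column set equals the header, else None.

    `tables` maps table name -> list of column names (hypothetical shape).
    """
    for name, columns in tables.items():
        if set(columns) == set(header):
            return name
    return None


def plan_ingestion(directory, tables):
    """Map each CSV file name in `directory` to its target table.

    Files whose header matches no table are skipped with a warning.
    """
    plan = {}
    for csv_path in sorted(Path(directory).glob("*.csv")):
        table = match_table(read_header(csv_path), tables)
        if table is None:
            warnings.warn(f"Skipping {csv_path.name}: columns match no known table")
            continue
        plan[csv_path.name] = table
    return plan
```

For example, `plan_ingestion("data/postgres", {"test": ["id", "name"]})` would return a dictionary from file name to table name for every CSV whose header is exactly `id` and `name`, in any order.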
## Usage
- Set up the environment:
```bash
mv .env.example .env
```

- Fill `.env` with your values.
- Launch the databases and the Python service:
```bash
make start
```

## Postgres Service
- Put your CSV file(s) inside the `data/postgres` folder, then run:
```bash
python utils/ingestPg.py
```

### Results Postgres
```bash
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
---------- Start Data Ingestion ----------
We found 3 csv files to process
-------------------------
2 lines from .../data-ingestor-service/data/test.csv inserted into test
.../data-ingestor-service/fastapi_service/postgresIngestor.py:57: UserWarning: We can't add data from .../data-ingestor-service/data/test2.csv, columns doesn't match tables from public schema...
warnings.warn(f"We can't add data from {path}, columns doesn't match tables from {SCHEMA} schema...")
2 lines from .../data-ingestor-service/data/test_copy.csv inserted into test
-------------------------
Total : 4 inserted lines
---------- End Data Ingestion ----------
INFO: 127.0.0.1:64419 - "POST /ingest HTTP/1.1" 200 OK
```

## Elastic Service
- Put your documents inside the `data/elastic` folder, then run:
```bash
python utils/ingestElastic.py
```

### Results Elastic
```bash
data-ingestor-service-backend-1 | ---------- Start Data Ingestion ----------
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 | We found 2 documents to process
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 | -------------------------
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 | -------------------------
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 | Total : 2 documents inserted
data-ingestor-service-backend-1 |
data-ingestor-service-backend-1 | ---------- End Data Ingestion ----------
data-ingestor-service-backend-1 | INFO: 192.168.0.1:61372 - "GET /ingestElastic HTTP/1.1" 200 OK
```
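
The two sample logs end with a `POST /ingest` and a `GET /ingestElastic` request, which suggests ingestion is triggered over HTTP on the service's port 8000. The sketch below shows one way to send those requests by hand with the standard library. The routes and methods are copied from the logs; the `localhost` base URL, the `ROUTES` table, and the helper names are assumptions, and this is not necessarily what `utils/ingestPg.py` or `utils/ingestElastic.py` do.

```python
from urllib.request import Request, urlopen

# Assumption: the service is reachable locally on the port shown in the Uvicorn log line.
BASE_URL = "http://localhost:8000"

# HTTP method and path for each target, as they appear in the sample logs.
ROUTES = {
    "postgres": ("POST", "/ingest"),
    "elastic": ("GET", "/ingestElastic"),
}


def build_request(target, base_url=BASE_URL):
    """Build (but do not send) the HTTP request that triggers ingestion for `target`."""
    method, path = ROUTES[target]
    return Request(base_url.rstrip("/") + path, method=method)


def trigger(target, base_url=BASE_URL, timeout=30.0):
    """Send the request and return the HTTP status code."""
    with urlopen(build_request(target, base_url), timeout=timeout) as response:
        return response.status
```

With the stack started through `make start`, calling `trigger("postgres")` or `trigger("elastic")` should return the status code of the corresponding request, provided the assumed base URL matches your setup.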