DataLake project 💾
https://github.com/malondaclement/datalake

# DataLake

![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54) ![MySQL](https://img.shields.io/badge/mysql-%2300f.svg?style=for-the-badge&logo=mysql&logoColor=white)

-----

## 1. Database schema

![Database Schema](img/database_schema.png)

**Fig. 1** - _Database Schema_

## 2. How to use this project
### 2.1 Initialize the database

```
python3 main.py init
```

### 2.2 Insert data in the database
```
python3 main.py insert
```
#### 2.2.1 Insert classification dataset
Dataset tree:
* root
  * images
    * label_1
      * image1.jpg
      * image2.jpg
      * ...
    * label_2
      * image1.jpg
      * image2.jpg
      * ...
    * label_3
      * ...
  * labels.csv

`labels.csv` column names: `image`, `label`
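
A `labels.csv` for this layout could be generated from the folder names. The sketch below rests on assumptions the README does not confirm: it treats each sub-directory of `images` as a label and stores the image path relative to `images` in the `image` column. The function name `write_labels_csv` is hypothetical, and the project may expect a different value in that column.

```python
import csv
from pathlib import Path


def write_labels_csv(root: str) -> int:
    """Write root/labels.csv with one (image, label) row per file in root/images/<label>/."""
    root_dir = Path(root)
    rows = []
    for label_dir in sorted(p for p in (root_dir / "images").iterdir() if p.is_dir()):
        for image in sorted(p for p in label_dir.iterdir() if p.is_file()):
            # Assumption: the image column stores the path relative to images/.
            rows.append((f"{label_dir.name}/{image.name}", label_dir.name))

    with open(root_dir / "labels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "label"])
        writer.writerows(rows)
    return len(rows)
```

Calling `write_labels_csv("/path/to/root")` returns the number of rows written, which gives a quick sanity check against the number of images on disk.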

#### 2.2.2 Insert detection dataset
##### 2.2.2.a XML format
Dataset tree:
* root
  * images
    * image1.jpg
    * image2.jpg
    * ...
  * labels
    * label1.xml
    * label2.xml
    * ...

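```xml
<annotation>
    <filename>000005.jpg</filename>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <object>
        <name>chair</name>
        <bndbox>
            <xmin>263</xmin>
            <ymin>211</ymin>
            <xmax>324</xmax>
            <ymax>339</ymax>
        </bndbox>
    </object>
    <object>
        <name>chair</name>
        <bndbox>
            <xmin>165</xmin>
            <ymin>264</ymin>
            <xmax>253</xmax>
            <ymax>372</ymax>
        </bndbox>
    </object>
    <object>
        <name>chair</name>
        <bndbox>
            <xmin>5</xmin>
            <ymin>244</ymin>
            <xmax>67</xmax>
            <ymax>374</ymax>
        </bndbox>
    </object>
    <object>
        <name>chair</name>
        <bndbox>
            <xmin>241</xmin>
            <ymin>194</ymin>
            <xmax>295</xmax>
            <ymax>299</ymax>
        </bndbox>
    </object>
    <object>
        <name>chair</name>
        <bndbox>
            <xmin>277</xmin>
            <ymin>186</ymin>
            <xmax>312</xmax>
            <ymax>220</ymax>
        </bndbox>
    </object>
</annotation>
```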
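
To check an annotation file before inserting it, the objects can be read with the standard library. This is a minimal sketch that assumes Pascal VOC-style tag names (`filename`, `object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`). The `SAMPLE` string and the function name `parse_annotation` are hypothetical, and this is not necessarily how `main.py` parses the files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; tag names follow the Pascal VOC convention (an assumption).
SAMPLE = """
<annotation>
  <filename>000005.jpg</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>chair</name>
    <bndbox><xmin>263</xmin><ymin>211</ymin><xmax>324</xmax><ymax>339</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text: str) -> list[tuple]:
    """Return one (image, label, xmin, ymin, xmax, ymax) tuple per <object>."""
    root = ET.fromstring(xml_text.strip())
    image = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append((image, obj.findtext("name"), *coords))
    return rows


print(parse_annotation(SAMPLE))
```

Each returned tuple has the same field order as the CSV detection format, so the two formats can be compared row by row.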

##### 2.2.2.b CSV format
Dataset tree:
* root
  * images
    * image1.jpg
    * image2.jpg
    * ...
  * labels.csv

`labels.csv` column names: `image`, `label`, `xmin`, `ymin`, `xmax`, `ymax`
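
A detection `labels.csv` can be checked for the expected columns before running the insert command. The sketch below is an illustration under assumptions: the function name `read_detection_labels` is hypothetical, coordinates are taken to be integer pixel values, and the rule that `xmin < xmax` and `ymin < ymax` is a common convention rather than something the README states.

```python
import csv
import io

EXPECTED_COLUMNS = ["image", "label", "xmin", "ymin", "xmax", "ymax"]


def read_detection_labels(csv_file) -> list[dict]:
    """Read detection rows, converting the four box coordinates to int."""
    reader = csv.DictReader(csv_file)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"labels.csv is missing columns: {sorted(missing)}")
    rows = []
    for row in reader:
        box = {key: int(row[key]) for key in EXPECTED_COLUMNS[2:]}
        # Assumption: a valid box has xmin < xmax and ymin < ymax.
        if box["xmin"] >= box["xmax"] or box["ymin"] >= box["ymax"]:
            raise ValueError(f"degenerate box in {row['image']}: {box}")
        rows.append({"image": row["image"], "label": row["label"], **box})
    return rows


# Hypothetical one-row file held in memory for the example.
sample = io.StringIO("image,label,xmin,ymin,xmax,ymax\nimage1.jpg,chair,263,211,324,339\n")
print(read_detection_labels(sample))
```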

### 2.3 Create a new dataset from the data in the database
```
python3 main.py create
```

The new dataset is described by a JSON file:

```json
{
    "type": "[classif|detection|segmentation]",
    "path": "/path/to/the/root/directory",
    "classes": {
        "label_1": ["other_label_1", "other_label_1"],
        "label_2": [],
        "label_3": ["other_label_3"]
    }
}
```
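
One plausible reading of `classes` is that each key names a class of the new dataset and each list names other labels merged into it. That interpretation is an assumption, not something the README confirms. Under it, a description can be built in Python and turned into a lookup from source label to output class. The helper `build_label_lookup` is hypothetical.

```python
import json

# Example description; "detection" is one of the three documented type values.
description = {
    "type": "detection",
    "path": "/path/to/the/root/directory",
    "classes": {
        "label_1": ["other_label_1"],
        "label_2": [],
        "label_3": ["other_label_3"],
    },
}


def build_label_lookup(classes: dict) -> dict:
    """Map every source label (a class name or one of its listed labels) to its class name."""
    lookup = {}
    for name, aliases in classes.items():
        for label in [name, *aliases]:
            # setdefault keeps the first assignment; a different one is a conflict.
            if lookup.setdefault(label, name) != name:
                raise ValueError(f"label {label!r} is assigned to two classes")
    return lookup


print(json.dumps(description, indent=4))
print(build_label_lookup(description["classes"]))
```

Raising on a label assigned to two classes surfaces an ambiguous description early instead of silently picking one class.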

### 2.4 List label names or datasets
```
python3 main.py list
```

### 2.5 Clear the entire database
```
python3 main.py clear
```