https://github.com/alessiosavi/gowarehousevalidator
- Host: GitHub
- URL: https://github.com/alessiosavi/gowarehousevalidator
- Owner: alessiosavi
- License: mit
- Created: 2020-12-16T20:40:03.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2021-06-09T11:57:07.000Z (over 3 years ago)
- Last Synced: 2024-06-21T13:54:24.162Z (7 months ago)
- Language: Go
- Size: 63.5 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# GoWarehouseValidator
A simple CLI tool for validating datasets.
## WHY ?
During the initial phase of building a `data warehouse`, it is necessary to make sure that the input datasets have a
standard format. During the initial loading of a dataset, it is crucial that the input data always have the same format and data types.

This way, you can implement ETL processing in less time, without having to check and validate the input dataset every time
a problem occurs in the ETL process.

This tool aims to validate the input dataset (for now only `csv` files are supported), checking that the type of each
column matches the one specified in the configuration file.

## HOW ?
The tool uses a configuration file like the following one:
```json
{
  // path lists all the files to validate
  "path": [
    "s3://my-bucket-test-s3/covid.csv",
    "dataset/covid.csv"
  ],
  // a path can also be specified as an S3 URI
  "_path": "s3://my-bucket/my_csv_file.csv",
  // region of the S3 bucket
  "region": "us-east-2",
  // separator of the CSV file
  "separator": ",",
  // date format, in strftime notation
  "date_format": "%Y-%m-%dT%H:%M:%S",
  // mapping from each CSV header to its expected type
  "validation": {
    "data": "DATE",
    "stato": "STRING",
    "ricoverati_con_sintomi": "INTEGER",
    "terapia_intensiva": "INTEGER",
    "totale_ospedalizzati": "INTEGER",
    "isolamento_domiciliare": "INTEGER",
    "totale_positivi": "INTEGER",
    "variazione_totale_positivi": "INTEGER",
    "nuovi_positivi": "INTEGER",
    "dimessi_guariti": "INTEGER",
    "deceduti": "INTEGER",
    "casi_da_sospetto_diagnostico": "INTEGER|NULLABLE",
    "casi_da_screening": "INTEGER|NULLABLE",
    "totale_casi": "INTEGER",
    "tamponi": "INTEGER",
    "casi_testati": "INTEGER|NULLABLE",
    "note": "STRING|NULLABLE",
    "ingressi_terapia_intensiva": "INTEGER|NULLABLE",
    "note_test": "STRING|NULLABLE",
    "note_casi": "STRING|NULLABLE"
  }
}
```

## Usage
```bash
./GoWarehouseValidator -conf "conf/conf.json"
```

### Using go
```bash
go get -v -u github.com/alessiosavi/GoWarehouseValidator
./GoWarehouseValidator -conf "conf/conf.json"
```