https://github.com/felipensp/python-etl
ETL tool written in Python that provides a specific DSL, which is translated into a Python script to handle input data
- Host: GitHub
- URL: https://github.com/felipensp/python-etl
- Owner: felipensp
- License: mit
- Created: 2021-01-21T20:55:17.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2021-02-22T15:59:21.000Z (about 5 years ago)
- Last Synced: 2025-01-16T07:58:13.800Z (about 1 year ago)
- Topics: csv, etl, json, parsing, python, regex, scripting, sql
- Language: Python
- Homepage:
- Size: 8.79 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# python-etl
ETL tool written in Python that provides a specific DSL, which is translated into a Python script to handle input data.
## DSL
```
/(pattern)/ {
// Python code to run whenever the pattern matches
// Use $1, $2, ..., $N to refer to the capture groups of the match
}
or
{
// Python code to run on every CSV row
// Use $0, $1, ..., $N to refer to a CSV column by index
}
```
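The `$N` placeholders are the only non-Python part of a block body. As a rough illustration of the idea, and not the tool's actual translation code, the sketch below rewrites them into ordinary Python expressions. The function name `expand_placeholders` and the variable names `m` and `row` are hypothetical.

```python
import re

def expand_placeholders(body, kind):
    """Replace $N placeholders with plain Python expressions.

    kind="regex": $N -> m.group(N)  (m is a hypothetical match variable)
    kind="csv":   $N -> row[N]      (row is a hypothetical list of columns)
    """
    template = r"m.group(\1)" if kind == "regex" else r"row[\1]"
    # Naive textual substitution: it would also rewrite a "$1" inside a string literal.
    return re.sub(r"\$(\d+)", template, body)

print(expand_placeholders('result.append({"number": int($1)})', "regex"))
print(expand_placeholders("print($1)", "csv"))
```

A real translator would also have to wrap each rewritten body in a loop over matches or rows, which this sketch leaves out.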
## Examples
### ETL file
```
result = []
/(?m:^)(\d{3})-([a-z]+)/ {
result.append({"number": int($1), "description": 'good' if $2 == 'foo' else 'bad'})
}
/(\d+)/ {
result.append({"number": int($1)})
}
print(result)
```
### Using stdin on the command line
```
>python python-etl.py test.etl
123
^Z
[{'number': 123}]
```
```
>python python-etl.py test.etl
123-foo
^Z
[{'number': 123, 'description': 'good'}, {'number': 123}]
```
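The two runs above are consistent with each pattern rule being applied, in order, to every match in the whole input. A hand-written Python equivalent of `test.etl` under that assumption might look like the sketch below. The function name `run_rules` is hypothetical, and the script the tool actually generates may be structured differently.

```python
import re

def run_rules(text):
    """Hand-written equivalent of the two pattern rules in test.etl."""
    result = []
    # Rule 1: /(?m:^)(\d{3})-([a-z]+)/ with $1 -> m.group(1), $2 -> m.group(2)
    for m in re.finditer(r"(?m:^)(\d{3})-([a-z]+)", text):
        result.append({
            "number": int(m.group(1)),
            "description": "good" if m.group(2) == "foo" else "bad",
        })
    # Rule 2: /(\d+)/ with $1 -> m.group(1)
    for m in re.finditer(r"(\d+)", text):
        result.append({"number": int(m.group(1))})
    return result

print(run_rules("123\n"))
print(run_rules("123-foo\n"))
```

With the input `123-foo`, both rules match the same digits, which is why `123` appears twice in the result.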
### Using a payload input file
```
>type test\payload | python python-etl.py test.etl
[{'number': 123, 'description': 'good'}, {'number': 1337}, {'number': 123}, {'number': 456}, {'number': 777}]
```
## Processing a CSV file
### ETL file
```
{
print($1)
}
```
### Testing on the command line
```
> python python-etl.py csv.etl
id,name
1,felipe
^Z
name
felipe
```
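In the run above, `$1` selects the second column (index 1), and the header row is processed like any other row. A minimal plain-Python sketch of the same behavior, assuming the rows are parsed with the standard `csv` module, is shown below. The function name `run_csv_rule` is hypothetical and is not part of the tool.

```python
import csv
import io

def run_csv_rule(text):
    """Hand-written equivalent of the `{ print($1) }` rule: take column 1 of every row."""
    values = []
    for row in csv.reader(io.StringIO(text)):
        values.append(row[1])  # $1 -> column at index 1; the header row is not skipped
    return values

for value in run_csv_rule("id,name\n1,felipe\n"):
    print(value)
```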