https://github.com/buchananja/dpyp

data data-analysis data-cleaning data-engineering data-pipeline data-preprocessing data-processing data-science pandas pipeline


# **dpyp**
*A convenience tool for small-scale data pipelines in Python*

## About
dpyp is a data-pipeline convenience tool containing functionality for reading and writing batches, cleaning data, diagnosing pipelines, manipulating text, and calculating fields in Python.

[PyPI](https://pypi.org/project/dpyp/)

## Usage
- dpyp consists of seven modules: 'calculate', 'clean', 'diagnose', 'read', 'text', 'write', and 'transform'.
- It is designed for small-scale Python pipelines, with an emphasis on batch processing via 'data-dictionaries'.
- Holding a batch of data in a dictionary lets one function iterate over every item, which improves readability and ease of use.
- It is built on a combination of base Python and pandas, for writing robust small-scale pipelines with text-manipulation capabilities.
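
The batch-processing idea above can be illustrated with a minimal sketch in plain pandas. It assumes that a 'data-dictionary' is an ordinary `dict` mapping a dataset name to a pandas DataFrame; the `clean_batch` helper and the sample data are hypothetical and are not part of dpyp's API.

```python
import pandas as pd


# Assumption: a "data-dictionary" is a dict of {dataset name: DataFrame}.
# clean_batch is a hypothetical helper for illustration, not a dpyp function.
def clean_batch(batch: dict) -> dict:
    """Apply the same cleaning steps to every DataFrame in the batch."""
    cleaned = {}
    for name, df in batch.items():
        df = df.copy()  # leave the caller's DataFrame untouched
        # Normalise column names: trim whitespace, lower-case, use underscores.
        df.columns = (
            df.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
        )
        # Drop rows in which every value is missing.
        df = df.dropna(how="all")
        cleaned[name] = df
    return cleaned


batch = {
    "customers": pd.DataFrame(
        {" Customer ID": [1, 2, None], "Full Name": ["Ann", "Bob", None]}
    ),
    "orders": pd.DataFrame({"Order ID": [10, 11], "Customer ID ": [1, 2]}),
}

cleaned = clean_batch(batch)
for name, df in cleaned.items():
    print(name, list(df.columns), len(df))
```

Because every dataset sits under one dictionary, a single call covers the whole batch, and adding another dataset means adding another key rather than another block of cleaning code.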

## Dependencies
- pandas
- pyarrow
- numpy

## Installation
```bash
pip install dpyp
```
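
To confirm that the install succeeded, one option is to query the installed distributions with the standard library. The sketch below checks dpyp and the three packages listed under Dependencies; the `installed_versions` helper is a hypothetical name used only for this illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the Installation and Dependencies sections.
PACKAGES = ["dpyp", "pandas", "pyarrow", "numpy"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(PACKAGES)
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```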

## License
See [LICENSE.md](LICENSE.md)

## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md)