https://github.com/buchananja/dpyp
A convenience tool for small-scale data pipelines in Python
- Host: GitHub
- URL: https://github.com/buchananja/dpyp
- Owner: buchananja
- License: mit
- Created: 2024-01-24T22:17:58.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-07-03T11:19:56.000Z (over 1 year ago)
- Last Synced: 2025-07-05T00:48:10.058Z (7 months ago)
- Topics: data, data-analysis, data-cleaning, data-engineering, data-pipeline, data-preprocessing, data-processing, data-science, pandas, pipeline
- Language: Python
- Homepage: https://pypi.org/project/dpyp/
- Size: 4.28 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
# **dpyp**
*A convenience tool for small-scale data pipelines in Python*
## About
dpyp is a data-pipeline convenience tool containing functionality for reading and writing batches, cleaning data, diagnosing pipelines, manipulating text, and calculating fields in Python.
[PyPI](https://pypi.org/project/dpyp/)
## Usage
- dpyp consists of seven modules: 'calculate', 'clean', 'diagnose', 'read', 'text', 'write', and 'transform'.
- Designed for use in small-scale Python pipelines with an emphasis on batch-processing via 'data-dictionaries'.
- Processing data in batches via dictionaries lets functions iterate over every dataset in a batch, which improves readability and ease of use.
- Built using a combination of base Python and pandas for writing robust small-scale pipelines with text manipulation capabilities.
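
The 'data-dictionary' idea described above can be sketched with plain pandas. This is only an illustration of the general pattern, not dpyp's actual API: it assumes a data-dictionary is a `dict` mapping dataset names to DataFrames, and the helper names `clean_headers` and `apply_to_batch` are hypothetical.

```python
import pandas as pd


def clean_headers(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with lower-case, underscore-separated column names."""
    out = df.copy()
    out.columns = [str(c).strip().lower().replace(" ", "_") for c in out.columns]
    return out


def apply_to_batch(batch: dict, func) -> dict:
    """Apply func to every DataFrame in the batch and return a new dictionary."""
    # Hypothetical helper: the same keys are kept, so datasets stay addressable by name.
    return {name: func(df) for name, df in batch.items()}


# Assumed shape of a data-dictionary: dataset name -> DataFrame.
batch = {
    "customers": pd.DataFrame({"Customer ID": [1, 2], "Full Name": ["Ada", "Grace"]}),
    "orders": pd.DataFrame({"Order ID": [10, 11], "Customer ID": [1, 2]}),
}

cleaned = apply_to_batch(batch, clean_headers)

for name, df in cleaned.items():
    print(name, list(df.columns))
```

Because each step takes and returns a dictionary, several steps can be chained over the whole batch without writing a loop for each dataset.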
## Dependencies
- pandas
- pyarrow
- numpy
## Installation
```bash
pip install dpyp
```
## License
See [LICENSE.md](LICENSE.md)
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md)