https://github.com/austinv11/pypeline

A simple data pipeline builder for Python 3+

# Pypeline

This is a package for creating iterative data processing pipelines. Note that
this is NOT a general-purpose stream processing library; it is designed only to
be a low-overhead, simple-to-set-up stream processor. For large-scale
production applications, use something like Kafka instead.

## Warning
This library is still in an ALPHA stage, so things may not work as intended
and the API is not final!

## Trivial Example
```python
from pypeline import build_action, Pypeline, ForkingPypelineExecutor, wrap
import asyncio

async def step1():
    results = []
    for i in range(1000):
        results.append(wrap(i))
    return results

async def step2(i):
    return i * 10

async def step3(i):
    return i + 1

async def run_pipeline():
    pypeline = Pypeline()
    # Adding actions to the pipeline
    pypeline.add_action(build_action("Step1", step1)) \
        .add_action(build_action("Step2", step2)) \
        .add_action(build_action("Step3", step3, serialize_dir="./example"))  # Serialize results so future runs will skip this step entirely
    results = await pypeline.run(executor=ForkingPypelineExecutor())  # Custom executor that avoids the GIL
    # Results are wrapped in a utility namedtuple, so let's flatten it.
    results = [r.args[0] for r in results]
    return results

results = asyncio.get_event_loop().run_until_complete(run_pipeline())
for result in results:
    print(result)
```
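On Python 3.7+, `asyncio.run()` is the usual way to drive a coroutine instead of fetching an event loop manually. A minimal sketch of running the same `run_pipeline` coroutine from the example above that way (assuming no event loop is already running):

```python
import asyncio

# Minimal sketch: asyncio.run() creates the event loop, runs the
# run_pipeline coroutine defined above, and closes the loop afterwards.
results = asyncio.run(run_pipeline())
for result in results:
    print(result)
```

Since `Step3` was built with `serialize_dir="./example"`, rerunning the script should skip that step and reuse its serialized results, per the comment in the example above.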