# asyncio-pool

Pool of asyncio coroutines with a familiar interface. Supports Python 3.5+ (including PyPy 6+, which is also 3.5 at the moment).

AioPool makes sure _no more_ and, if possible, _no less_ than `size` spawned coroutines are active at the same time. _Spawned_ means created and scheduled with one of the pool interface methods; _active_ means the coroutine function has started executing its code, as opposed to _waiting_, which waits for pool space without entering the coroutine function.

## Interface

Read the [code docstrings](../master/asyncio_pool/base_pool.py) for details.

#### AioPool(size=4, *, loop=None)

Creates a pool of `size` concurrent tasks. Supports the async context manager interface.
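
For example, a minimal sketch of both ways to drive the pool (explicit `join` vs. the context manager); the `main` helper here is made up for illustration:

```python
import asyncio as aio
from asyncio_pool import AioPool

async def main():
    # Style 1: async context manager -- the pool joins automatically on exit.
    async with AioPool(size=2) as pool:
        await pool.spawn(aio.sleep(0.1))

    # Style 2: explicit join.
    pool = AioPool(size=2)
    await pool.spawn(aio.sleep(0.1))
    await pool.join()

aio.get_event_loop().run_until_complete(main())
```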

#### spawn(coro, cb=None, ctx=None)

Waits for pool space, then creates a task for the `coro` coroutine and returns a future for its result. Can spawn a callback coroutine, created by `cb`, with the result of `coro` as the first argument. The `ctx` context is passed to the callback as the third positional argument.
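
A small sketch of `spawn` with a callback; `double` and `report` are made-up names, and the `(res, err, ctx)` callback signature follows the callbacks example under Usage:

```python
import asyncio as aio
from asyncio_pool import AioPool

async def double(n):  # made-up worker for illustration
    await aio.sleep(0.01)
    return n * 2

async def report(res, err, ctx):  # callback gets (result, error, context); error is falsy on success
    if err:
        return err[0]  # hand the exception back as the future's result
    print('job', ctx, 'returned', res)
    return res

async def main():
    async with AioPool(size=4) as pool:
        # The future resolves to the callback's return value once both finish.
        fut = await pool.spawn(double(21), cb=report, ctx='job-1')
    assert fut.result() == 42

aio.get_event_loop().run_until_complete(main())
```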

#### exec(coro, cb=None, ctx=None)

Waits for pool space, creates a task for `coro` and waits for it to finish. Returns the result of `coro` if no callback is provided; otherwise creates a task for the callback, waits for it and returns the result of the callback.
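
Roughly, the difference from `spawn` looks like this (a sketch using the same dummy `worker` as in the usage examples below):

```python
import asyncio as aio
from asyncio_pool import AioPool

async def worker(n):  # dummy worker, as in the usage examples below
    await aio.sleep(1 / n)
    return n

async def main():
    async with AioPool(size=2) as pool:
        fut = await pool.spawn(worker(1))   # future to collect later
        res = await pool.exec(worker(2))    # result is available right here
        assert res == 2
    assert fut.result() == 1

aio.get_event_loop().run_until_complete(main())
```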

#### spawn_n(coro, cb=None, ctx=None)

Creates a waiting task for `coro` and returns its future without waiting for pool space. The task is executed "in the pool" when pool space becomes available.
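
A rough sketch of the contrast with `spawn`: `spawn_n` never blocks the caller, even when the pool is already full (the `worker` here is the same dummy as below):

```python
import asyncio as aio
from asyncio_pool import AioPool

async def worker(n):
    await aio.sleep(1 / n)
    return n

async def main():
    pool = AioPool(size=2)
    # Returns futures immediately; tasks beyond the pool size wait for space.
    futures = [pool.spawn_n(worker(i)) for i in range(1, 100)]
    await pool.join()
    assert sum(f.result() for f in futures) == sum(range(1, 100))

aio.get_event_loop().run_until_complete(main())
```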

#### join()

Waits for all spawned (active and waiting) tasks to finish. Joining the pool from a coroutine spawned by the same pool leads to a *deadlock*.
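
For example (a sketch with made-up names):

```python
import asyncio as aio
from asyncio_pool import AioPool

async def main():
    pool = AioPool(size=2)
    pool.spawn_n(aio.sleep(0.1))
    pool.spawn_n(aio.sleep(0.2))
    await pool.join()  # returns once both sleeps have finished

    # Never call `pool.join()` from a coroutine spawned by the same pool:
    # join would wait for that coroutine, which in turn waits for join.

aio.get_event_loop().run_until_complete(main())
```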

#### cancel(*futures)

Cancels spawned tasks (active and waiting), finding them by the provided `futures`. If no futures are provided, cancels all spawned tasks.
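
A small sketch (the names are illustrative):

```python
import asyncio as aio
from asyncio_pool import AioPool

async def main():
    pool = AioPool(size=2)
    slow = pool.spawn_n(aio.sleep(10))
    slower = pool.spawn_n(aio.sleep(20))

    await aio.sleep(0.1)                          # let them start
    cancelled, results = await pool.cancel(slow)  # cancel one by its future
    assert cancelled == 1

    cancelled, results = await pool.cancel()      # cancel everything left
    assert cancelled == 1
    await pool.join()

aio.get_event_loop().run_until_complete(main())
```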

#### map(fn, iterable, cb=None, ctx=None, *, get_result=getres.flat)

Spawns coroutines created by the `fn` function for each item in `iterable` using `spawn`, waits for all of them to finish (including callbacks), and returns results maintaining the order of `iterable`.
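
A minimal sketch, reusing the dummy `worker` from the usage examples below:

```python
import asyncio as aio
from asyncio_pool import AioPool

async def worker(n):
    await aio.sleep(1 / n)
    return n

async def main():
    pool = AioPool(size=5)
    results = await pool.map(worker, range(3))
    # Order follows the input; the failed worker(0) shows up as its exception.
    assert isinstance(results[0], ZeroDivisionError)
    assert list(results[1:]) == [1, 2]

aio.get_event_loop().run_until_complete(main())
```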

#### map_n(fn, iterable, cb=None, ctx=None, *, get_result=getres.flat)

Spawns coroutines created by the `fn` function for each item in `iterable` using `spawn_n`, and returns futures for the task results, maintaining the order of `iterable`.
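
A small sketch of the difference from `map`: you get futures back immediately instead of awaiting results (same dummy `worker` as below):

```python
import asyncio as aio
from asyncio_pool import AioPool

async def worker(n):
    await aio.sleep(1 / n)
    return n

async def main():
    async with AioPool(size=5) as pool:
        futures = pool.map_n(worker, range(1, 6))  # futures returned immediately
    # The context manager joined the pool, so every future is done here.
    assert [f.result() for f in futures] == [1, 2, 3, 4, 5]

aio.get_event_loop().run_until_complete(main())
```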

#### itermap(fn, iterable, cb=None, ctx=None, *, flat=True, get_result=getres.flat, timeout=None, yield_when=asyncio.ALL_COMPLETED)

Spawns tasks with `map_n(fn, iterable, cb, ctx)`, then waits for results with the `asyncio.wait` function, yielding ready results one by one if `flat == True`, otherwise yielding lists of ready results.
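
A rough sketch with the default `flat=True` (same dummy `worker` as below); ready results are yielded one by one, not necessarily in input order:

```python
import asyncio as aio
from asyncio_pool import AioPool

async def worker(n):
    await aio.sleep(1 / n)
    return n

async def main():
    seen = []
    async with AioPool(size=3) as pool:
        async for res in pool.itermap(worker, range(1, 6)):
            seen.append(res)
    assert sorted(seen) == [1, 2, 3, 4, 5]

aio.get_event_loop().run_until_complete(main())
```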

## Usage

The `spawn` and `map` methods are probably what you should use in 99% of cases. Their overhead is minimal (~3% of execution time), and even in the worst cases memory usage is insignificant.

The `spawn_n`, `map_n` and `itermap` methods give you more control and flexibility, but they come at the price of higher overhead. They spawn all the tasks you ask for, and most of those tasks wait their turn "in the background". If you spawn too many (10**6+ tasks), you'll use most of the memory in your system, and you'll also lose a lot of time on the "concurrency management" of all the spawned tasks.

Play with `python tests/loadtest.py -h` to figure out which methods suit your workload.

Usage examples (more in [tests/](../master/tests/) and [examples/](../master/examples/)):

```python

import asyncio as aio
import itertools
from asyncio_pool import AioPool


async def worker(n):  # dummy worker
    await aio.sleep(1 / n)
    return n


async def spawn_n_usage(todo=[range(1,51), range(51,101), range(101,200)]):
    futures = []
    async with AioPool(size=20) as pool:
        for tasks in todo:
            for i in tasks:  # too many tasks
                # Returns quickly for all tasks, does not wait for pool space.
                # Workers are not spawned, they wait for pool space in their
                # own background tasks.
                fut = pool.spawn_n(worker(i))
                futures.append(fut)
        # At this point not a single worker should have started.

    # Context manager calls `join` at exit, so this will finish when all
    # workers return, crash or are cancelled.

    assert sum(itertools.chain.from_iterable(todo)) == \
        sum(f.result() for f in futures)


async def spawn_usage(todo=range(1,4)):
    futures = []
    async with AioPool(size=2) as pool:
        for i in todo:  # 1, 2, 3
            # Returns quickly for 1 and 2, then waits for empty space for 3,
            # spawns 3 and returns. Can save some resources I guess.
            fut = await pool.spawn(worker(i))
            futures.append(fut)
        # At this point some of the workers have already started.

    # Context manager calls `join` at exit, so this will finish when all
    # workers return, crash or are cancelled.

    assert sum(todo) == sum(fut.result() for fut in futures)  # all done


async def map_usage(todo=range(100)):
    pool = AioPool(size=10)
    # Waits for and collects results from all spawned workers, returning them
    # in the same order as `todo`. If a worker crashes or is cancelled, the
    # exception object is returned as its result.
    # Basically, it wraps `spawn_usage` code into one call.
    results = await pool.map(worker, todo)

    # await pool.join()  # not needed here, because no other tasks were spawned

    assert isinstance(results[0], ZeroDivisionError) \
        and sum(results[1:]) == sum(todo)


async def itermap_usage(todo=range(1,11)):
    result = 0
    async with AioPool(size=10) as pool:
        # Combines spawn_n and iterwait, which is a wrapper for asyncio.wait,
        # which yields results of finished workers according to the `timeout`
        # and `yield_when` params passed to asyncio.wait (see its docs for details)
        async for res in pool.itermap(worker, todo, timeout=0.5):
            result += res
        # technically, you can skip the join call

    assert result == sum(todo)


async def callbacks_usage():

    async def wrk(n):  # custom dummy worker
        await aio.sleep(1 / n)
        return n

    async def cb(res, err, ctx):  # callback
        if err:  # error handling
            exc, tb = err
            assert tb  # the only purpose of this is logging
            return exc

        pool, n = ctx  # context can be anything you like
        await aio.sleep(1 / (n - 1))
        return res + n

    todo = range(5)
    futures = []

    async with AioPool(size=2) as pool:
        for i in todo:
            fut = pool.spawn_n(wrk(i), cb, (pool, i))
            futures.append(fut)

    results = []
    for fut in futures:
        # There are helpers for result extraction. The `flat` one does
        # exactly what's written below:
        #   from asyncio_pool import getres
        #   results.append(getres.flat(fut))
        try:
            results.append(fut.result())
        except Exception as e:
            results.append(e)

    # The first error happens for n == 0 in wrk; its exception is passed to
    # the callback, and the callback returns it to us. The second one happens
    # in the callback itself and is passed to us by the pool.
    assert all(isinstance(e, ZeroDivisionError) for e in results[:2])

    # All n's in `todo` are passed through `wrk` and `cb` (cb adds the wrk
    # result and the number passed inside the context), except for n == 0
    # and n == 1.
    assert sum(results[2:]) == 2 * (sum(todo) - 0 - 1)


async def exec_usage(todo=range(1,11)):
    async with AioPool(size=4) as pool:
        futures = pool.map_n(worker, todo)

        # While other workers are waiting or active, you can "synchronously"
        # execute one task. It does not interrupt others, just waits for pool
        # space, then waits for the task to finish and returns its result.
        important_res = await pool.exec(worker(2))
        assert 2 == important_res

        # You can continue working as usual:
        moar = await pool.spawn(worker(10))

    assert sum(todo) == sum(f.result() for f in futures)


async def cancel_usage():

    async def wrk(*arg, **kw):
        await aio.sleep(0.5)
        return 1

    pool = AioPool(size=2)

    f_quick = pool.spawn_n(aio.sleep(0.1))
    f12 = await pool.spawn(wrk()), pool.spawn_n(wrk())
    f35 = pool.map_n(wrk, range(3))

    # At this point, if you cancel futures returned by pool methods, you just
    # won't be able to retrieve spawned task results; the tasks themselves
    # will continue working. Don't do this:
    #   f_quick.cancel()
    # use `pool.cancel` instead:

    # cancel some
    await aio.sleep(0.1)
    cancelled, results = await pool.cancel(f12[0], f35[2])  # running and waiting
    assert 2 == cancelled  # none of them had time to finish
    assert 2 == len(results) and \
        all(isinstance(res, aio.CancelledError) for res in results)

    # cancel all others
    await aio.sleep(0.1)

    # not interrupted and finished successfully
    assert f_quick.done() and f_quick.result() is None

    cancelled, results = await pool.cancel()  # all
    assert 3 == cancelled
    assert len(results) == 3 and \
        all(isinstance(res, aio.CancelledError) for res in results)

    assert await pool.join()  # joins successfully


async def details(todo=range(1,11)):
    pool = AioPool(size=5)

    # This code:
    f1 = []
    for i in todo:
        f1.append(pool.spawn_n(worker(i)))
    # is equivalent to one call of `map_n`:
    f2 = pool.map_n(worker, todo)

    # Afterwards you can await any given future:
    try:
        assert 3 == await f1[2]  # result of spawn_n(worker(3))
    except Exception as e:
        # an exception raised in the worker (or a CancelledError) is re-raised here
        pass

    # Or use `asyncio.wait` to handle results in batches (see `iterwait` also):
    important_res = 0
    more_important = [f1[1], f2[1], f2[2]]
    while more_important:
        done, more_important = await aio.wait(more_important, timeout=0.5)
        # handle results; note that `f.result()` re-raises exceptions
        important_res += sum(f.result() for f in done)

    assert important_res == 2 + 2 + 3

    # But you need to join, to allow all spawned workers to finish
    # (of course you can `asyncio.wait` all of the futures if you want to)
    await pool.join()

    assert all(f.done() for f in itertools.chain(f1, f2))  # this is guaranteed
    assert 2 * sum(todo) == sum(f.result() for f in itertools.chain(f1, f2))
```