https://github.com/ross/requests-futures

Asynchronous Python HTTP Requests for Humans using Futures
Host: GitHub
URL: https://github.com/ross/requests-futures
Owner: ross
License: other
Created: 2013-03-21T14:20:03.000Z (over 12 years ago)
Default Branch: main
Last Pushed: 2025-04-29T19:01:03.000Z (3 months ago)
Last Synced: 2025-05-07T07:59:39.624Z (2 months ago)
Language: Python
Size: 183 KB
Stars: 2,118
Watchers: 42
Forks: 150
Open Issues: 0
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project

my-awesome-starred - requests-futures - Asynchronous Python HTTP Requests for Humans using Futures (Python)
best-of-web-python - GitHub
starred-awesome - requests-futures - Asynchronous Python HTTP Requests for Humans using Futures (Python)
README

Asynchronous Python HTTP Requests for Humans
============================================

.. image:: https://travis-ci.org/ross/requests-futures.svg?branch=master
        :target: https://travis-ci.org/ross/requests-futures

Small add-on for the Python requests_ HTTP library. Makes use of Python 3.2's
`concurrent.futures`_ or the backport_ for prior versions of Python.

The additional API and changes are minimal and strive to avoid surprises.

The following synchronous code:

.. code-block:: python

    from requests import Session

    session = Session()
    # first request starts and blocks until finished
    response_one = session.get('http://httpbin.org/get')
    # second request starts once first is finished
    response_two = session.get('http://httpbin.org/get?foo=bar')
    # both requests are complete
    print('response one status: {0}'.format(response_one.status_code))
    print(response_one.content)
    print('response two status: {0}'.format(response_two.status_code))
    print(response_two.content)

This can be translated to make use of futures, and thus be asynchronous, by
creating a FuturesSession and catching the returned Future in place of the
Response. The Response can be retrieved by calling the result method on the
Future:

.. code-block:: python

    from requests_futures.sessions import FuturesSession

    session = FuturesSession()
    # first request is started in background
    future_one = session.get('http://httpbin.org/get')
    # second request is started immediately
    future_two = session.get('http://httpbin.org/get?foo=bar')
    # wait for the first request to complete, if it hasn't already
    response_one = future_one.result()
    print('response one status: {0}'.format(response_one.status_code))
    print(response_one.content)
    # wait for the second request to complete, if it hasn't already
    response_two = future_two.result()
    print('response two status: {0}'.format(response_two.status_code))
    print(response_two.content)
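
The effect of starting both requests before waiting on either can be sketched
with the standard library alone. This is only an illustration: `fake_request`
is a hypothetical stand-in that sleeps instead of contacting a server, and a
plain `ThreadPoolExecutor` takes the place of the session.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay):
    # hypothetical stand-in for session.get(): sleeps instead of doing I/O
    time.sleep(delay)
    return delay


start = time.monotonic()
with ThreadPoolExecutor(max_workers=2) as executor:
    # both calls are submitted immediately and run side by side
    future_one = executor.submit(fake_request, 0.3)
    future_two = executor.submit(fake_request, 0.3)
    # result() blocks until the corresponding call has finished
    results = [future_one.result(), future_two.result()]
elapsed = time.monotonic() - start

# the total is roughly one delay rather than the sum of both
print('took {0:.2f}s for {1}'.format(elapsed, results))
```

With two workers the two simulated requests overlap, so the total time stays
close to a single delay.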

By default a ThreadPoolExecutor is created with 8 workers. If you would like to
adjust that value or share an executor across multiple sessions, you can provide
one to the FuturesSession constructor.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ThreadPoolExecutor(max_workers=10))
    # ...

As a shortcut, if you only need to change the number of workers you can pass
`max_workers` straight to the `FuturesSession` constructor:

.. code-block:: python

    from requests_futures.sessions import FuturesSession

    session = FuturesSession(max_workers=10)

FuturesSession will use an existing session object if supplied:

.. code-block:: python

    from requests import session
    from requests_futures.sessions import FuturesSession

    my_session = session()
    future_session = FuturesSession(session=my_session)

That's it. The API of requests.Session is preserved without any modifications
beyond returning a Future rather than a Response. As with all futures, exceptions
are shifted (thrown) to the future.result() call, so try/except blocks should be
moved there.
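
A minimal sketch of where the exception surfaces, using only the standard
library. `failing_request` and the URL are hypothetical stand-ins for a request
that cannot be completed; the same pattern applies to the Future returned by a
FuturesSession.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_request(url):
    # hypothetical stand-in for a request that cannot be completed
    raise ConnectionError('could not reach {0}'.format(url))


with ThreadPoolExecutor(max_workers=1) as executor:
    # submitting does not raise, even though the call will fail
    future = executor.submit(failing_request, 'http://example.invalid/')
    try:
        # the exception from the worker thread is re-raised here
        future.result()
        outcome = 'ok'
    except ConnectionError as exc:
        outcome = 'failed: {0}'.format(exc)

print(outcome)
```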

Tying extra information to the request/response
===============================================

The most common piece of information needed is the URL of the request. This can
be accessed without any extra steps using the `request` property of the
response object.

.. code-block:: python

    from concurrent.futures import as_completed
    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()

    futures = [session.get(f'http://httpbin.org/get?{i}') for i in range(3)]

    for future in as_completed(futures):
        resp = future.result()
        pprint({
            'url': resp.request.url,
            'content': resp.json(),
        })

There are situations in which you may want to tie additional information to a
request/response. There are a number of ways to go about this; the simplest is
to attach additional information to the future object itself.

.. code-block:: python

    from concurrent.futures import as_completed
    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()

    futures = []
    for i in range(3):
        future = session.get('http://httpbin.org/get')
        future.i = i
        futures.append(future)

    for future in as_completed(futures):
        resp = future.result()
        pprint({
            'i': future.i,
            'content': resp.json(),
        })
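
Another of the possible approaches, sketched here with the standard library
only, is to keep the extra information in a dictionary keyed by the future
rather than setting attributes on it. `fake_request` and the URLs are
hypothetical placeholders for `session.get` calls.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_request(url):
    # hypothetical stand-in for session.get(url)
    return 'response for {0}'.format(url)


urls = ['http://example.invalid/a', 'http://example.invalid/b']

with ThreadPoolExecutor(max_workers=2) as executor:
    # futures are hashable, so they can map to whatever context belongs to them
    context = {executor.submit(fake_request, url): i
               for i, url in enumerate(urls)}
    collected = {}
    for future in as_completed(context):
        # look the context up again once the future has completed
        collected[context[future]] = future.result()

print(collected)
```

This keeps the future objects untouched, at the cost of carrying a second
data structure around.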

Canceling queued requests (a.k.a cleaning up after yourself)
============================================================

If you know that you won't be needing any additional responses from futures that
haven't yet resolved, it's a good idea to cancel those requests. You can do this
by using the session as a context manager:

.. code-block:: python

    from requests_futures.sessions import FuturesSession

    with FuturesSession(max_workers=1) as session:
        future = session.get('https://httpbin.org/get')
        future2 = session.get('https://httpbin.org/delay/10')
        future3 = session.get('https://httpbin.org/delay/10')
        response = future.result()

In this example, the second or third request will be skipped, saving time and
resources that would otherwise be wasted.
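
Queued work can also be cancelled one future at a time. The sketch below uses
only the standard library and shows the general `Future.cancel()` behavior; it
is not a description of how FuturesSession performs its own cleanup. `slow` and
`queued` are hypothetical stand-ins for requests, and the `Event` only exists
to keep the single worker busy.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

gate = threading.Event()


def slow():
    # occupies the only worker until the gate is opened
    gate.wait(timeout=5)
    return 'first'


def queued():
    return 'second'


executor = ThreadPoolExecutor(max_workers=1)
running = executor.submit(slow)
pending = executor.submit(queued)  # waits in the queue behind slow()

# cancel() succeeds only for work that has not started yet
was_cancelled = pending.cancel()

gate.set()
first = running.result()
executor.shutdown()

print(was_cancelled, pending.cancelled(), first)
```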

Iterating over a list of requests responses
===========================================

Without preserving the requests order:

.. code-block:: python

    from concurrent.futures import as_completed
    from requests_futures.sessions import FuturesSession

    with FuturesSession() as session:
        futures = [session.get('https://httpbin.org/delay/{}'.format(i % 3)) for i in range(10)]
        for future in as_completed(futures):
            resp = future.result()
            print(resp.json()['url'])
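
If the submission order matters, one option is to skip `as_completed` and call
`result()` on the futures in the order they were created. A minimal sketch with
the standard library, where `fake_request` is a hypothetical stand-in that
sleeps for the given delay:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay):
    # hypothetical stand-in for a request that takes `delay` seconds
    time.sleep(delay)
    return delay


delays = [0.2, 0.0, 0.1]

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(fake_request, d) for d in delays]
    # iterating the list itself keeps the submission order, even though
    # the calls finish in a different order
    results = [future.result() for future in futures]

print(results)
```

The calls still run concurrently; only the order in which the results are
collected changes.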

Working in the Background
=========================

Additional processing can be done in the background using requests's hooks_
functionality. This can be useful for shifting work out of the foreground; a
simple example is JSON parsing.

.. code-block:: python

    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()

    def response_hook(resp, *args, **kwargs):
        # parse the json storing the result on the response object
        resp.data = resp.json()

    future = session.get('http://httpbin.org/get', hooks={
        'response': response_hook,
    })
    # do some other stuff, send some more requests while this one works
    response = future.result()
    print('response status {0}'.format(response.status_code))
    # data will have been attached to the response object in the background
    pprint(response.data)

Hooks can also be applied to the session.

.. code-block:: python

    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    def response_hook(resp, *args, **kwargs):
        # parse the json storing the result on the response object
        resp.data = resp.json()

    session = FuturesSession()
    session.hooks['response'] = response_hook

    future = session.get('http://httpbin.org/get')
    # do some other stuff, send some more requests while this one works
    response = future.result()
    print('response status {0}'.format(response.status_code))
    # data will have been attached to the response object in the background
    pprint(response.data)

A more advanced example that adds an `elapsed` property to all requests.

.. code-block:: python

    from requests_futures.sessions import FuturesSession
    from time import time


    class ElapsedFuturesSession(FuturesSession):

        def request(self, method, url, hooks=None, *args, **kwargs):
            start = time()
            if hooks is None:
                hooks = {}

            def timing(r, *args, **kwargs):
                r.elapsed = time() - start

            try:
                if isinstance(hooks['response'], (list, tuple)):
                    # needs to be first so we don't time other hooks execution;
                    # build a new list so tuples are handled as well
                    hooks['response'] = [timing, *hooks['response']]
                else:
                    hooks['response'] = [timing, hooks['response']]
            except KeyError:
                hooks['response'] = timing

            return super(ElapsedFuturesSession, self).request(
                method, url, hooks=hooks, *args, **kwargs)


    session = ElapsedFuturesSession()
    future = session.get('http://httpbin.org/get')
    # do some other stuff, send some more requests while this one works
    response = future.result()
    print('response status {0}'.format(response.status_code))
    print('response elapsed {0}'.format(response.elapsed))

Using ProcessPoolExecutor
=========================

Similarly to `ThreadPoolExecutor`, it is possible to use an instance of
`ProcessPoolExecutor`. As the name suggests, the requests will be executed
concurrently in separate processes rather than threads.

.. code-block:: python

    from concurrent.futures import ProcessPoolExecutor
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10))
    # ... use as before

.. HINT::

    Using the `ProcessPoolExecutor` is useful in cases where memory
    usage per request is very high (large response) and cycling the interpreter
    is required to release memory back to the OS.

A base requirement of using `ProcessPoolExecutor` is that `Session.request` and
`FuturesSession` both be picklable.

This means that Python 3.5 and later are fully supported, while Python 3.4
REQUIRES an existing `requests.Session` instance to be passed when initializing
`FuturesSession`. Python 2.X and < 3.4 are currently not supported.

.. code-block:: python

    # Using python 3.4
    from concurrent.futures import ProcessPoolExecutor
    from requests import Session
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10),
                             session=Session())
    # ... use as before

In case pickling fails, an exception is raised pointing to this documentation.

.. code-block:: python

    # Using python 2.7
    from concurrent.futures import ProcessPoolExecutor
    from requests import Session
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10),
                             session=Session())
    Traceback (most recent call last):
    ...
    RuntimeError: Cannot pickle function. Refer to documentation: https://github.com/ross/requests-futures/#using-processpoolexecutor

.. IMPORTANT::

  * Python >= 3.4 required

  * A session instance is required when using Python < 3.5

  * If sub-classing `FuturesSession` it must be importable (module global)
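
The "importable (module global)" requirement follows from how `pickle` handles
functions and classes: it stores a reference to the module and name instead of
the code itself. A minimal sketch with the standard library, where
`is_picklable` is a hypothetical helper written for this illustration:

```python
import math
import pickle


def is_picklable(obj):
    # True if obj survives pickle.dumps, False otherwise
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# a module-level function can be found again by module and name
importable = is_picklable(math.sqrt)
# a lambda has no importable name, so pickling it fails
anonymous = is_picklable(lambda x: x * 2)

print(importable, anonymous)
```

The same check can be run against a custom session subclass before handing it
to a `ProcessPoolExecutor`.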

Installation
============

    pip install requests-futures

.. _`requests`: https://github.com/kennethreitz/requests

.. _`concurrent.futures`: http://docs.python.org/dev/library/concurrent.futures.html

.. _backport: https://pypi.python.org/pypi/futures

.. _hooks: http://docs.python-requests.org/en/master/user/advanced/#event-hooks