{"id":16907993,"url":"https://github.com/cjrh/deadpool","last_synced_at":"2025-03-17T07:30:38.198Z","repository":{"id":59658800,"uuid":"537708639","full_name":"cjrh/deadpool","owner":"cjrh","description":"A Python Process Pool Executor implementation that is harder to break","archived":false,"fork":false,"pushed_at":"2025-03-15T17:56:57.000Z","size":1156,"stargazers_count":14,"open_issues_count":8,"forks_count":4,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-15T18:35:31.598Z","etag":null,"topics":["concurrent-futures","executor","multiprocessing","process-pool","processpool","processpoolexecutor","subprocess"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cjrh.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE-AGPL","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["cjrh"]}},"created_at":"2022-09-17T05:30:06.000Z","updated_at":"2025-03-15T17:56:54.000Z","dependencies_parsed_at":"2023-10-03T03:15:41.655Z","dependency_job_id":"7861df06-8793-43a7-9567-55925377170c","html_url":"https://github.com/cjrh/deadpool","commit_stats":{"total_commits":64,"total_committers":3,"mean_commits":"21.333333333333332","dds":0.375,"last_synced_commit":"2ecefaaeb149660aa9f868ec70189d592de4e9b6"},"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cjrh%2Fdeadpool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cjrh%2Fdeadpool/tags","releases_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/cjrh%2Fdeadpool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cjrh%2Fdeadpool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cjrh","download_url":"https://codeload.github.com/cjrh/deadpool/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243847971,"owners_count":20357472,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["concurrent-futures","executor","multiprocessing","process-pool","processpool","processpoolexecutor","subprocess"],"created_at":"2024-10-13T18:49:42.715Z","updated_at":"2025-03-17T07:30:38.191Z","avatar_url":"https://github.com/cjrh.png","language":"Python","readme":".. |ci| image:: https://github.com/cjrh/deadpool/workflows/Python%20application/badge.svg\n    :target: https://github.com/cjrh/deadpool/actions\n\n.. |coverage| image:: https://coveralls.io/repos/github/cjrh/deadpool/badge.svg?branch=main\n    :target: https://coveralls.io/github/cjrh/deadpool?branch=main\n\n.. |pyversions| image:: https://img.shields.io/pypi/pyversions/deadpool-executor.svg\n    :target: https://pypi.python.org/pypi/deadpool-executor\n\n.. |tag| image:: https://img.shields.io/github/tag/cjrh/deadpool.svg\n    :target: https://img.shields.io/github/tag/cjrh/deadpool.svg\n\n.. |install| image:: https://img.shields.io/badge/install-pip%20install%20deadpool--executor-ff69b4.svg\n    :target: https://img.shields.io/badge/install-pip%20install%20deadpool--executor-ff69b4.svg\n\n.. 
|pypi| image:: https://img.shields.io/pypi/v/deadpool-executor.svg\n    :target: https://pypi.org/project/deadpool-executor/\n\n.. |calver| image:: https://img.shields.io/badge/calver-YYYY.MM.MINOR-22bfda.svg\n    :alt: This project uses calendar-based versioning scheme\n    :target: http://calver.org/\n\n.. |pepy| image:: https://pepy.tech/badge/deadpool-executor\n    :alt: Downloads\n    :target: https://pepy.tech/project/deadpool-executor\n\n.. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg\n    :alt: This project uses the \"black\" style formatter for Python code\n    :target: https://github.com/python/black\n\n.. |openssf| image:: https://api.securityscorecards.dev/projects/github.com/cjrh/deadpool/badge\n    :alt: OpenSSF Scorecard\n    :target: https://api.securityscorecards.dev/projects/github.com/cjrh/deadpool\n\n|ci| |coverage| |pyversions| |tag| |install| |pypi| |calver| |pepy| |black| |openssf|\n\n.. sectnum::\n\n.. contents::\n   :local:\n   :depth: 2\n   :backlinks: entry\n\nDeadpool\n========\n\n``Deadpool`` is a process pool that is really hard to kill.\n\n``Deadpool`` is an implementation of the ``Executor`` interface\nin the ``concurrent.futures`` standard library. ``Deadpool`` is\na process pool executor, quite similar to the stdlib's\n`ProcessPoolExecutor`_.\n\nThis document assumes that you are familiar with the stdlib\n`ProcessPoolExecutor`_. If you are not, it is important\nto understand that ``Deadpool`` makes very specific tradeoffs that\ncan result in quite different behaviour to the stdlib\nimplementation.\n\nLicence\n=======\n\nThis project can be licenced either under the terms of the `Apache 2.0`_\nlicence, or the `Affero GPL 3.0`_ licence. The choice is yours.\n\nInstallation\n============\n\nThe python package name is *deadpool-executor*, so to install\nyou must type ``$ pip install deadpool-executor``. 
The import\nname is *deadpool*, so in your Python code you must type\n``import deadpool`` to use it.\n\nI try quite hard to keep dependencies to a minimum. Currently\n``Deadpool`` has no dependencies other than ``psutil``, which\nis simply too useful to avoid for this library.\n\nWhy would I want to use this?\n=============================\n\nI created ``Deadpool`` because I became frustrated with the\nstdlib `ProcessPoolExecutor`_, and various other community\nimplementations of process pools. In particular, I had a use-case\nthat required a high server uptime, but also had variable and\nunpredictable memory requirements such that certain tasks could\ntrigger the `OOM killer`_, often resulting in a \"broken\" process\npool. I also needed task-specific timeouts that could kill a \"hung\"\ntask, which the stdlib executor doesn't provide.\n\nYou might wonder, isn't it bad to just kill a task like that?\nIn my use-case, we had extensive logging and monitoring to alert\nus if any tasks failed; but it was paramount that our services\ncontinue to operate even when tasks got killed in OOM scenarios,\nor specific tasks took too long. This is the primary trade-off\nthat ``Deadpool`` offers: the pool will not break, but tasks\ncan receive SIGKILL under certain conditions. This trade-off\nis likely fine if you've seen many OOMs break your pools.\n\nI also tried using the `Pebble \u003chttps://github.com/noxdafox/pebble\u003e`_\ncommunity process pool. This is a cool project, featuring several\nof the properties I've been looking for, such as timeouts and\nmore resilient operation. However, during testing I found several\noccurrences of a mysterious `RuntimeError`_ that caused the Pebble\npool to become broken and no longer accept new tasks.\n\nMy goal with ``Deadpool`` is that **the pool must never enter\na broken state**. 
Any means by which that can happen will be\nconsidered a bug.\n\nWhat differs from `ProcessPoolExecutor`_?\n=========================================\n\n``Deadpool`` is generally similar to `ProcessPoolExecutor`_ since it executes\ntasks in subprocesses, and implements the standard ``Executor`` abstract\ninterface. We can draw a few comparisons to the stdlib pool to guide\nyour decision process about whether this makes sense for your use-case:\n\nSimilarities\n------------\n\n- ``Deadpool`` also supports the\n  ``max_tasks_per_child`` parameter (a new feature in\n  Python 3.11, although it was available in `multiprocessing.Pool`_\n  since Python 3.2).\n- The \"initializer\" callback in ``Deadpool`` works the same.\n- ``Deadpool`` defaults to the `forkserver \u003chttps://docs.python.org/3.11/library/multiprocessing.html#contexts-and-start-methods\u003e`_ multiprocessing\n  context, unlike the stdlib pool which defaults to ``fork`` on\n  Linux. It's just a setting though; you can change it in the same way as\n  with the stdlib pool. Like the stdlib docs, I strongly advise you to avoid\n  using ``fork`` because propagating threads and locks via fork is\n  going to ruin your day eventually. 
While this is a difference to the\n  default behaviour of the stdlib pool, it's not a difference in\n  behaviour when you use the ``forkserver`` context,\n  which is the recommended context for multiprocessing.\n\nDifferences in existing behaviour\n---------------------------------\n\n``Deadpool`` differs from the stdlib pool in the following ways:\n\n- If a ``Deadpool`` subprocess in the pool is killed by some\n  external actor, for example, the OS runs out of memory and the\n  `OOM killer`_ kills a pool subprocess that is using too much memory,\n  ``Deadpool`` does not care and further operation is unaffected.\n  ``Deadpool`` will not, and indeed cannot, raise\n  `BrokenProcessPool \u003chttps://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#concurrent.futures.process.BrokenProcessPool\u003e`_ or\n  `BrokenExecutor \u003chttps://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#concurrent.futures.BrokenExecutor\u003e`_.\n- ``Deadpool`` precreates all subprocesses up to the pool size on\n  creation.\n- ``Deadpool`` tasks can have priorities. When the executor chooses\n  the next pending task to schedule to a subprocess, it chooses the\n  pending task with the highest priority. This gives you a way of\n  prioritizing certain kinds of tasks. For example, you might give\n  UI-sensitive tasks a higher priority to deliver a snappier\n  user experience to your users. The priority can be specified in\n  the ``submit`` call.\n- The shutdown parameters ``wait`` and ``cancel_futures`` can behave\n  differently to how they work in the `ProcessPoolExecutor`_. This is\n  discussed in more detail later in this document.\n- ``Deadpool`` currently only works on Linux. There isn't any specific\n  reason it can't work on other platforms. 
The malloc trim feature also\n  requires a glibc system, so probably won't work on Alpine.\n\nNew features in Deadpool\n------------------------\n\n``Deadpool`` has the following features that are not present in the\nstdlib pool:\n\n- With ``Deadpool`` you can provide a \"finalizer\" callback that will\n  fire before a subprocess is shut down or killed. The finalizer callback\n  might be executed in a different thread than the main thread of the\n  subprocess, so don't rely on the callback running in the main\n  subprocess thread. There are certain circumstances where the finalizer\n  will not run at all, such as when the subprocess is killed by the OS\n  due to an out-of-memory (OOM) condition. So don't design your application\n  such that the finalizer is required to run for correct operation.\n- Even though ``Deadpool`` typically uses a hard kill to remove\n  subprocesses, it does still run any handlers registered with\n  ``atexit``.\n- ``Deadpool`` tasks can have timeouts. When a task hits the timeout,\n  the underlying subprocess in the pool is killed with ``SIGKILL``.\n  The entire process tree of that subprocess is killed. Your application\n  logic needs to handle this. The ``finalizer`` will not run.\n- ``Deadpool`` also allows a ``finalizer``, with corresponding\n  ``finalargs``, that will be called after a task is executed on\n  a subprocess, but before the subprocess terminates. It is\n  analogous to the ``initializer`` and ``initargs`` parameters.\n  Just like the ``initializer`` callable, the ``finalizer``\n  callable is executed inside the subprocess. It is not guaranteed that\n  the finalizer will always run. If a process is killed, e.g. due to a\n  timeout or any other reason, the finalizer will not run. 
The finalizer\n  could be used for things like flushing pending monitoring messages,\n  such as traces and so on.\n- ``Deadpool`` can ask the system allocator (Linux only) to return\n  unused memory back to the OS when a worker's RSS exceeds a maximum\n  threshold. For long-running pools and modern\n  kernels, the system memory allocator can hold onto unused memory\n  for a surprisingly long time, and coupled with bloat due to\n  memory fragmentation, this can result in carrying very large\n  RSS values in your pool. The ``max_tasks_per_child`` helps with\n  this because a subprocess is entirely erased when the max is\n  reached, but it does mean that periodically there will be a small\n  latency penalty from constructing the replacement subprocess. In\n  my opinion, ``max_tasks_per_child`` is appropriate for when you\n  know or suspect there's a real memory leak somewhere in your code\n  (or a 3rd-party package!), and the easiest way to deal with that\n  right now is just to periodically remove a process.\n- ``Deadpool`` can propagate ``os.environ`` to the subprocesses.\n  Normally, env vars present at the start of the \"main\" process will\n  propagate to subprocesses, but dynamically modified env vars\n  via ``os.environ`` will not. Actually, it depends on the start\n  method, with ``fork`` doing the propagation, and ``forkserver``\n  and ``spawn`` not doing it. The parameter ``propagate_environ``,\n  e.g., ``propagate_environ=os.environ``, re-enables this for\n  ``forkserver`` and ``spawn``. The supplied mapping will be\n  applied to the subprocesses as they are created. This also means\n  that if you want to modify some settings, you can modify the\n  mapping object at any time, and new subprocesses created after\n  that modification will get the new vars. 
One example use-case\nis dynamically changing the logging level within subprocesses.\n\nMinimum and Maximum Workers\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n``Deadpool`` has a ``min_workers`` and ``max_workers`` parameter.\nWhile ``max_workers`` is the same as the stdlib pool, ``min_workers``\nis a new feature.\n\nThe ``min_workers`` parameter allows deadpool to \"scale down\" the\npool when it is idle. This is another strategy alongside other\nfeatures like ``max_tasks_per_child`` and ``max_worker_memory_bytes``\nto help deal with memory bloat in long-running pools.\n\nStatistics\n~~~~~~~~~~\n\nHere is a very simple example of how to get statistics from the\nexecutor:\n\n.. code-block:: python\n\n    with deadpool.Deadpool() as exe:\n        fut = exe.submit(...)\n        stats = exe.get_statistics()\n\nThe call is best made while the executor is still alive. It\nwill still succeed after the executor is shut down or closed, but\nsome of the statistics will be zeroed out.\n\nThe call to ``get_statistics`` will return a dictionary with the\nfollowing keys:\n\n- ``tasks_received``: The total number of tasks submitted to the\n  executor. Does not mean that they started running, only that they\n  were successfully submitted.\n- ``tasks_launched``: The total number of tasks that were launched\n  on a subprocess. This records the count of all tasks that were\n  successfully scheduled to run. These tasks were picked up from\n  the submit backlog and given to a worker process to execute.\n- ``tasks_failed``: The total number of tasks that failed. This\n  includes tasks that raised an exception, tasks that were\n  killed due to a timeout, and tasks that failed for any other\n  reason.\n- ``worker_processes_created``: The total number of subprocesses that\n  were ever created by the executor. This can be, and often will be,\n  greater than the ``max_workers`` setting because there are many options\n  that can cause workers to be discarded and replaced. 
Examples of these\n  might be the ``max_tasks_per_child`` setting, or the ``min_workers``\n  setting, or the memory thresholds and so on.\n- ``max_workers_busy_concurrently``: The maximum number of workers that\n  were ever busy at the same time. This is a useful statistic to\n  decide whether you might consider increasing or decreasing the size\n  of the pool. For example, if your ``max_workers`` is set to 100, but\n  after running for, say, a week, you see that ``max_workers_busy_concurrently``\n  is only 50, then you might consider reducing the pool size to 50.\n  The system memory manager on Linux likes to hold onto heap memory.\n  If you have more workers than you need, you'll see that the system\n  memory usage over time is going to be higher than it needs to be\n  because even when the pool is fully idle, you will still observe\n  the persistent worker processes having a large memory allocation\n  even though no jobs are running. This is a symptom of malloc\n  retention behaviour.\n- ``worker_processes_still_alive``: The number of worker processes that\n  are still alive. This includes both idle and busy worker processes.\n  This is mainly a debugging statistic that I can use to check whether\n  worker processes are \"leaking\" somehow and not being cleaned up\n  correctly. This number should not be greater than ``max_workers``.\n  (It could be, temporarily, depending on the exact timing and strategy\n  in the inner workings of the executor, but on average it should not.)\n- ``worker_processes_idle``: The number of worker processes that are idle.\n- ``worker_processes_busy``: The number of worker processes that are busy.\n\n\nHere is an example from the tests to explain what each of the\nstatistics means:\n\n.. 
code-block:: python\n\n    with deadpool.Deadpool(min_workers=5, max_workers=10) as exe:\n        futs = []\n        for _ in range(50):\n            futs.append(exe.submit(t, 0.05))\n            futs.append(exe.submit(f_err, Exception))\n\n        results = []\n        for fut in deadpool.as_completed(futs):\n            try:\n                results.append(fut.result())\n            except Exception:\n                pass\n\n        time.sleep(0.5)\n        stats = exe.get_statistics()\n\n    assert results == [0.05] * 50\n    print(f\"{stats=}\")\n    assert stats == {\n        \"tasks_received\": 100,\n        \"tasks_launched\": 100,\n        \"tasks_failed\": 50,\n        \"worker_processes_created\": 10,\n        \"max_workers_busy_concurrently\": 10,\n        \"worker_processes_still_alive\": 5,\n        \"worker_processes_idle\": 5,\n        \"worker_processes_busy\": 0,\n    }\n\nIn this example, we submit 100 tasks, 50 of which will raise an\nexception. The executor will create 10 worker processes, and\nwill have a maximum of 10 workers busy at the same time. After\nall the tasks are completed, we wait for a short time to allow\nthe executor to clean up any worker processes that are no longer\nneeded. The statistics should show that 5 worker processes are\nstill alive, and all of them are idle.\n\nShow me some code\n=================\n\nSimple case\n-----------\n\nThe simple case works exactly the same as with `ProcessPoolExecutor`_:\n\n.. 
code-block:: python\n\n    import deadpool\n\n    def f():\n        return 123\n\n    with deadpool.Deadpool() as exe:\n        fut = exe.submit(f)\n        result = fut.result()\n\n    assert result == 123\n\nIt is intended that all the basic behaviour should \"just work\" in the\nsame way, and ``Deadpool`` should be a drop-in replacement for\n`ProcessPoolExecutor`_; but there are some subtle differences so you\nshould read all of this document to see if any of those will affect you.\n\nTimeouts\n--------\n\nIf a timeout is reached on a task, the subprocess running that task will be\nkilled, as in ``SIGKILL``. ``Deadpool`` doesn't mind, but your own\napplication should: if you use timeouts it is likely important that your tasks\nbe `idempotent \u003chttps://en.wikipedia.org/wiki/Idempotence\u003e`_, especially if\nyour application will restart tasks, or restart them after application deployment,\nand other similar scenarios.\n\n.. code-block:: python\n\n    import time\n    import pytest\n    import deadpool\n\n    def f():\n        time.sleep(10.0)\n\n    with deadpool.Deadpool() as exe:\n        fut = exe.submit(f, deadpool_timeout=1.0)\n\n        with pytest.raises(deadpool.TimeoutError):\n            fut.result()\n\nThe parameter ``deadpool_timeout`` is special and consumed by ``Deadpool``\nin the call. You can't use a parameter with this name in your function\nkwargs.\n\nHandling OOM killed situations\n------------------------------\n\n.. 
code-block:: python\n\n    import deadpool\n\n    def f():\n        x = list(range(10**100))\n\n    with deadpool.Deadpool() as exe:\n        fut = exe.submit(f, deadpool_timeout=1.0)\n\n        try:\n            result = fut.result()\n        except deadpool.ProcessError:\n            print(\"Oh no someone killed my task!\")\n\n\nAs long as the OOM killer terminates only a subprocess (and not the main\nprocess), which is likely because it'll be your subprocess that is using too\nmuch memory, this will not hurt the pool, and it will be able to receive and\nprocess more tasks. Note that this event will show up as a ``ProcessError``\nexception when accessing the future, so you have a way of at least tracking\nthese events.\n\nDesign Details\n==============\n\nTypical Example - with timeouts\n-------------------------------\n\nHere's a typical example of how code using Deadpool might look. The\noutput of the code further below should be similar to the following:\n\n.. code-block:: bash\n\n    $ python examples/entrypoint.py\n    ...................xxxxxxxxxxx.xxxxxxx.x.xxxxxxx.x\n    $\n\nEach ``.`` is a successfully completed task, and each ``x`` is a task\nthat timed out. Below is the code for this example.\n\n.. 
code-block:: python\n\n    import random, time\n    import deadpool\n\n\n    def work():\n        time.sleep(random.random() * 4.0)\n        print(\".\", end=\"\", flush=True)\n        return 1\n\n\n    def main():\n        with deadpool.Deadpool() as exe:\n            futs = (exe.submit(work, deadpool_timeout=2.0) for _ in range(50))\n            for fut in deadpool.as_completed(futs):\n                try:\n                    assert fut.result() == 1\n                except deadpool.TimeoutError:\n                    print(\"x\", end=\"\", flush=True)\n\n\n    if __name__ == \"__main__\":\n        main()\n        print()\n\n- The work function will be busy for a random time period between 0 and\n  4 seconds.\n- There is a ``deadpool_timeout`` kwarg given to the ``submit`` method.\n  This kwarg is special and will be consumed by Deadpool. You cannot\n  use this kwarg name for your own task functions.\n- When a task completes, it prints out ``.`` internally. But when a task\n  raises a ``deadpool.TimeoutError``, an ``x`` will be printed out instead.\n- When a task times out, keep in mind that the underlying process that\n  is executing that task is killed, literally with the ``SIGKILL`` signal.\n\nDeadpool tasks have priority\n----------------------------\n\nThe example below is similar to the previous one for timeouts. In fact\nthis example retains the timeouts to show how the different features\ncompose together. In this example we create tasks with different\npriorities, and we change the printed character of each task to show\nthat higher priority items are executed first.\n\nThe code example will print something similar to the following:\n\n.. code-block:: bash\n\n    $ python examples/priorities.py\n    !!!!!xxxxxxxxxxx!x..!...x.xxxxxxxx.xxxx.x...xxxxxx\n\nYou can see how the ``!`` characters, used for indicating higher priority\ntasks, appear towards the front indicating that they were executed sooner.\nBelow is the code.\n\n.. 
code-block:: python\n\n    import random, time\n    import deadpool\n\n\n    def work(symbol):\n        time.sleep(random.random() * 4.0)\n        print(symbol, end=\"\", flush=True)\n        return 1\n\n\n    def main():\n        with deadpool.Deadpool(max_backlog=100) as exe:\n            futs = []\n            for _ in range(25):\n                fut = exe.submit(work, \".\", deadpool_timeout=2.0, deadpool_priority=10)\n                futs.append(fut)\n                fut = exe.submit(work, \"!\", deadpool_timeout=2.0, deadpool_priority=0)\n                futs.append(fut)\n\n            for fut in deadpool.as_completed(futs):\n                try:\n                    assert fut.result() == 1\n                except deadpool.TimeoutError:\n                    print(\"x\", end=\"\", flush=True)\n\n\n    if __name__ == \"__main__\":\n        main()\n        print()\n\n- When the tasks are submitted, they are given a priority. The default\n  value for the ``deadpool_priority`` parameter is 0, but here we'll\n  write them out explicitly.  Half of the tasks will have priority 10 and\n  half will have priority 0.\n- A lower value for the ``deadpool_priority`` parameter means a **higher**\n  priority. The highest possible priority is 0. Negative\n  priority values are not allowed.\n- I also specified the ``max_backlog`` parameter when creating the\n  Deadpool instance. This is discussed in more detail next, but quickly:\n  task priority can only be enforced on what is in the submitted backlog\n  of tasks, and the ``max_backlog`` parameter controls the depth of that\n  queue. If ``max_backlog`` is too low, then the window of prioritization\n  will not include tasks submitted later which might have higher priorities\n  than earlier-submitted tasks. 
The ``submit`` call will in fact block\n  once the ``max_backlog`` depth has been reached.\n\nControlling the backlog of submitted tasks\n------------------------------------------\n\nBy default, the ``max_backlog`` parameter is set to 5. This parameter\nsets the size of the \"submit queue\". The submit queue is the place\nwhere submitted tasks are held before they are executed in background\nprocesses.\n\nIf the submit queue is large (a high ``max_backlog``), it will mean\nthat a large number of tasks can be added to the system with the\n``submit`` method, even before any tasks have finished executing. Conversely,\na low ``max_backlog`` parameter means that the submit queue will fill up\nfaster. If the submit queue is full, it means that the next call to\n``submit`` will block.\n\nThis kind of blocking is fine, and typically desired. It means that\nbackpressure from blocking is controlling the amount of work in flight.\nA smaller ``max_backlog`` also limits the amount of memory in use\nduring the execution of all the tasks.\n\nHowever, if you still accumulate received futures as my\nexample code above is doing, that accumulation, i.e., the list of futures,\nwill contribute to memory growth. If you have a large amount of work, it\nwill be better to set a *callback* function on each of the futures rather\nthan processing them by iterating over ``as_completed``.\n\nThe example below illustrates this technique for keeping memory\nconsumption down:\n\n.. 
code-block:: python\n\n    import random, time\n    import deadpool\n\n\n    def work():\n        time.sleep(random.random() * 4.0)\n        print(\".\", end=\"\", flush=True)\n        return 1\n\n\n    def cb(fut):\n        try:\n            assert fut.result() == 1\n        except deadpool.TimeoutError:\n            print(\"x\", end=\"\", flush=True)\n\n\n    def main():\n        with deadpool.Deadpool() as exe:\n            for _ in range(50):\n                exe.submit(work, deadpool_timeout=2.0).add_done_callback(cb)\n\n\n    if __name__ == \"__main__\":\n        main()\n        print()\n\n\nWith this callback-based design, we no longer have an accumulation of futures\nin a list. We get the same kind of output as in the \"typical example\" from\nearlier:\n\n.. code-block:: bash\n\n    $ python examples/callbacks.py\n    .....xxx.xxxxxxxxx.........x..xxxxx.x....x.xxxxxxx\n\n\nSpeaking of callbacks, the customized ``Future`` class used by Deadpool\nlets you set a callback for when the task begins executing on a real\nsystem process. That can be configured like so:\n\n.. code-block:: python\n\n    with deadpool.Deadpool() as exe:\n        f = exe.submit(work)\n\n        def cb(fut: deadpool.Future):\n            print(f\"My task is running on process {fut.pid}\")\n\n        f.add_pid_callback(cb)\n\nObviously, both kinds of callbacks can be added:\n\n.. code-block:: python\n\n    with deadpool.Deadpool() as exe:\n        f = exe.submit(work)\n        f.add_pid_callback(lambda fut: print(f\"Started on {fut.pid=}\"))\n        f.add_done_callback(lambda fut: print(f\"Completed {fut.pid=}\"))\n\nMore about shutdown\n-------------------\n\nIn the documentation for ProcessPoolExecutor_, the following function\nsignature is given for the shutdown_ method of the executor interface:\n\n.. 
code-block:: python\n\n    shutdown(wait=True, *, cancel_futures=False)\n\nI want to honor this, but it presents some difficulties because the\nsemantics of the ``wait`` and ``cancel_futures`` parameters need to be\nsomewhat different for Deadpool.\n\nIn Deadpool, this is what the combinations of those flags mean:\n\n.. csv-table:: Shutdown flags\n   :header: ``wait``, ``cancel_futures``, ``effect``\n   :widths: 10, 10, 80\n   :align: left\n\n   ``True``, ``True``, \"Wait for already-running tasks to complete; the\n   ``shutdown()`` call will unblock (return) when they're done. Cancel\n   all pending tasks that are in the submit queue, but have not yet started\n   running. The ``fut.cancelled()`` method will return ``True`` for such\n   cancelled tasks.\"\n   ``True``, ``False``, \"Wait for already-running tasks to complete.\n   Pending tasks in the\n   submit queue that have not yet started running will *not* be cancelled, and\n   will all continue to execute. The ``shutdown()`` call will return only\n   after all submitted tasks have completed. \"\n   ``False``, ``True``, \"Already-running tasks **will be cancelled** and this\n   means the underlying subprocesses executing these tasks will receive\n   SIGKILL. Pending tasks on the submit queue that have not yet started\n   running will also be cancelled.\"\n   ``False``, ``False``, \"This is a strange one. What to do if the caller\n   doesn't want to wait, but also doesn't want to cancel things? In this\n   case, already-running tasks will be allowed to complete, but pending\n   tasks on the submit queue will be cancelled. This is the same outcome\n   as ``wait==True`` and ``cancel_futures==True``. An alternative design\n   might have been to allow all tasks, both running and pending, to just\n   keep going in the background even after the ``shutdown()`` call\n   returns. 
Does anyone have a use-case for this?\"\n\nIf you're using ``Deadpool`` as a context manager, you might be wondering\nhow exactly to set these parameters in the ``shutdown`` call, since that\ncall is made for you automatically when the context manager exits.\n\nFor this, Deadpool accepts additional parameters when creating the\ninstance:\n\n.. code-block:: python\n\n   # This is pseudocode\n   import deadpool\n\n   with deadpool.Deadpool(\n           shutdown_wait=True,\n           shutdown_cancel_futures=True,\n   ) as exe:\n       fut = exe.submit(...)\n\nDeveloper Workflow\n==================\n\nnox\n---\n\nThis project uses ``nox``. Follow the instructions for installing\nnox at their page, and then come back here.\n\nWhile nox can be configured so that all the tools for each of\nthe tasks can be installed automatically when run, this takes\ntoo much time and so I've decided that you should just have\nthe following tools in your environment, ready to go. They\ndo not need to be installed in the same venv or anything like\nthat. I've found that a convenient way to do this is with ``pipx``.\nFor example, to install ``black`` using ``pipx`` you can do\nthe following:\n\n.. code-block:: shell\n\n   $ pipx install black\n\nYou must do the same for ``isort`` and ``ruff``. See the following\nsections for actually using ``nox`` to perform dev actions.\n\ntests\n-----\n\nTo run the tests:\n\n.. code-block:: shell\n\n   $ nox -s test\n\nTo run tests for a particular version, and say with coverage:\n\n.. code-block:: shell\n\n   $ nox -s testcov-3.11\n\nTo pass additional arguments to pytest, use the ``--`` separator:\n\n.. code-block:: shell\n\n   $ nox -s testcov-3.11 -- -k test_deadpool -s \u003cetc\u003e\n\nThis is nonstandard, but I customized the ``noxfile.py`` to\nallow this.\n\nstyle\n-----\n\nTo apply style fixes, and check for any remaining lints,\n\n.. 
code-block:: shell\n\n   $ nox -t style\n\ndocs\n----\n\nThe only docs currently are this README, which uses RST. Github\nuses `docutils \u003chttps://docutils.sourceforge.io/docs/ref/rst/directives.html\u003e`_\nto render RST.\n\nrelease\n-------\n\nThis project uses flit to release the package to pypi. The whole\nprocess isn't as automated as I would like, but this is what\nI currently do:\n\n1. Ensure that the ``main`` branch is fully up to date with everything\n   to be released, and all the tests succeed.\n2. Change the ``__version__`` field in ``deadpool.py``. Flit\n   uses this to stamp the version.\n3. Verify that ``flit build`` succeeds. This will produce a\n   wheel in the ``dist/`` directory. You can inspect this\n   wheel to ensure it contains only what is necessary. This\n   wheel will be what is uploaded to PyPI.\n4. **Commit the changed** ``__version__``. Easy to forget this\n   step, resulting in multiple awkward releases to try to\n   get the state all correct again.\n5. Now create the git tag and push to github:\n\n   .. code-block:: shell\n\n        $ git tag YYYY.MM.patch\n        $ git push --tags origin main\n\n6. Now deploy to PyPI:\n\n   .. code-block:: shell\n\n        $ flit publish\n\n\n.. _shutdown: https://docs.python.org/3/library/concurrent.futures.html?highlight=brokenprocesspool#concurrent.futures.Executor.shutdown\n.. _ProcessPoolExecutor: https://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#processpoolexecutor\n.. _RuntimeError: https://github.com/noxdafox/pebble/issues/42#issuecomment-551245730\n.. _OOM killer: https://en.wikipedia.org/wiki/Out_of_memory#Out_of_memory_management\n.. _multiprocessing.Pool: https://docs.python.org/3.11/library/multiprocessing.html#multiprocessing.pool.Pool\n.. _Apache 2.0: https://www.apache.org/licenses/LICENSE-2.0\n.. 
_Affero GPL 3.0: https://www.gnu.org/licenses/agpl-3.0.html\n","funding_links":["https://github.com/sponsors/cjrh"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcjrh%2Fdeadpool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcjrh%2Fdeadpool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcjrh%2Fdeadpool/lists"}