https://github.com/pelletier/django-parallelized_querysets
Handle large Django QuerySets by spreading their execution on multiple cores and keeping the memory usage low.
- Host: GitHub
- URL: https://github.com/pelletier/django-parallelized_querysets
- Owner: pelletier
- License: mit
- Created: 2012-07-11T09:39:16.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2013-08-21T14:58:17.000Z (over 11 years ago)
- Last Synced: 2024-09-13T00:13:55.692Z (4 months ago)
- Language: Python
- Size: 104 KB
- Stars: 21
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# django-parallelized_querysets
Handle large Django QuerySets by spreading their execution on multiple cores
and keeping the memory usage low.

[![Build Status](https://travis-ci.org/pelletier/django-parallelized_querysets.png?branch=master)](https://travis-ci.org/pelletier/django-parallelized_querysets)
## Installation
    pip install django-parallelized_querysets
## Usage
### `parallelized_queryset(queryset, processes=None, function=None)`
Process the given `queryset` and return the results as a list.

**`processes`**
Number of processes to create. Defaults to the number returned by
`multiprocessing.cpu_count()`.

**`function`**
A function to apply to each result. By default, no function is applied.
The first argument is the `Process` that is calling it, and the second is the
row.

You can also pass two hooks (functions that the process executes at
defined times):

**`init_hook`**
A function taking the `Process` as its argument. It is executed as
soon as the process is created.

**`end_hook`**
A function taking the `Process` as its argument. It is executed right
before the `Process` exits. If it returns a non-`None` value, that value is
appended to the results queue.

> **Note**
>
> Whenever your function returns `None`, that value is left out of the
> resulting list.

> **Note**
>
> The order of the QuerySet is not preserved!

#### Example
Return all the `Article` objects:

    >>> from parallelized_querysets import parallelized_queryset
    >>> qs = Article.objects.all()
    >>> parallelized_queryset(qs)

Add all `Article` objects to a Redis index (assuming `Article` has
an `append_to_redis` method):

    >>> from parallelized_querysets import parallelized_queryset
    >>> qs = Article.objects.all()
    >>> parallelized_queryset(qs, function=lambda p, x: x.append_to_redis())

Do the same, but on 6 processes:

    >>> from parallelized_querysets import parallelized_queryset
    >>> qs = Article.objects.all()
    >>> parallelized_queryset(qs, processes=6,
    ...                       function=lambda p, x: x.append_to_redis())

### `parallelized_multiple_querysets(querysets, processes=None, function=None)`
Same as `parallelized_queryset` but `querysets` is a list of QuerySets.
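
The callback parameters described for `parallelized_queryset` (`function`, `init_hook`, `end_hook`) can be combined to aggregate per worker instead of returning every row. The sketch below counts rows per worker. It makes some assumptions: `simulate_worker` is a hypothetical single-process stand-in written for this illustration (it is not the library's code), `SimpleNamespace` stands in for the `Process` object, and storing an attribute such as `seen` on the process is assumed to be allowed.

```python
from types import SimpleNamespace


def init_hook(process):
    # Runs once when the worker starts: give it a private counter.
    process.seen = 0


def count_row(process, row):
    # Runs for every row. Returning None keeps the row out of the results.
    process.seen += 1
    return None


def end_hook(process):
    # Runs right before the worker exits. A non-None return value is
    # appended to the results.
    return process.seen


def simulate_worker(rows, function, init_hook, end_hook):
    """Hypothetical stand-in that mimics the documented callback order."""
    process = SimpleNamespace()  # stand-in for the real Process object
    results = []
    init_hook(process)
    for row in rows:
        value = function(process, row)
        if value is not None:  # None values are dropped, as noted above
            results.append(value)
    final = end_hook(process)
    if final is not None:
        results.append(final)
    return results


counts = simulate_worker(["a", "b", "c"], count_row, init_hook, end_hook)
print(counts)  # [3]
```

With the real library, the same three callbacks would presumably be passed as `parallelized_queryset(qs, function=count_row, init_hook=init_hook, end_hook=end_hook)`, using the keyword names given in this README. The result would then hold one count per worker process, in no particular order.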
## Testing
    ./tests/sample/manage.py test sample
## About `Exception AssertionError: AssertionError()`
You may see the following line (multiple times) on the standard error:

    Exception AssertionError: AssertionError() in ignored

This is a bug in Python's garbage collector (running right after a fork), which
has been fixed in
[Python 3.3.0 alpha4](http://hg.python.org/cpython/file/59567c117b0e/Misc/NEWS#l47).

See http://bugs.python.org/issue14548 for more information on that bug.
## License
MIT (see LICENSE).