django-db-queue
==========
[![pypi release](https://img.shields.io/pypi/v/django-db-queue.svg)](https://pypi.python.org/pypi/django-db-queue)

Simple database-backed job queue. Jobs are defined in your settings, and are processed by management commands.

Asynchronous tasks are run via a *job queue*. This system is designed to support multi-step job workflows.

Supported and tested against:
- Django 4.2, 5.0, 5.1
- Python 3.9, 3.10, 3.11, 3.12, 3.13

## Getting Started

### Installation

Install from PyPI

```
pip install django-db-queue
```

Add `django_dbq` to your installed apps

```python
INSTALLED_APPS = [
    ...,
    "django_dbq",
]
```

Run migrations

```
manage.py migrate
```

### Upgrading from 1.x to 2.x

Note that version 2.x only supports Django 3.1 or newer. If you need support for Django 2.2, please stick with the latest 1.x release.

### Describe your job

In e.g.
`project.common.jobs`:

```python
import logging
import time

logger = logging.getLogger(__name__)


def my_task(job):
    logger.info("Working hard...")
    time.sleep(10)
    logger.info("Job's done!")
```

### Set up your job

In `project.settings`:

```python
JOBS = {
    "my_job": {
        "tasks": ["project.common.jobs.my_task"],
    },
}
```

### Hooks

#### Failure Hooks

When an unhandled exception is raised by a job, a failure hook will be called if one exists, enabling you to clean up any state left behind by your failed job. Failure hooks are run in your worker process (if your job fails).

A failure hook receives the failed `Job` instance along with the unhandled exception raised by your failed job as its arguments. Here's an example:

```python
def my_task_failure_hook(job, e):
    ...  # clean up after failed job
```

To ensure this hook gets run, simply add a `failure_hook` key to your job config like so:

```python
JOBS = {
    "my_job": {
        "tasks": ["project.common.jobs.my_task"],
        "failure_hook": "project.common.jobs.my_task_failure_hook",
    },
}
```

#### Creation Hooks

You can also run creation hooks, which happen just after the creation of your `Job` instances and are executed in the process in which the job was created, _not the worker process_.

A creation hook receives your `Job` instance as its only argument. Here's an example:

```python
def my_task_creation_hook(job):
    ...
    # configure something before running your job
```

To ensure this hook gets run, simply add a `creation_hook` key to your job config like so:

```python
JOBS = {
    "my_job": {
        "tasks": ["project.common.jobs.my_task"],
        "creation_hook": "project.common.jobs.my_task_creation_hook",
    },
}
```

#### Pre & Post Task Hooks

You can also run pre task or post task hooks, which happen in the normal processing of your `Job` instances and are executed inside the worker process.

Both pre and post task hooks receive your `Job` instance as their only argument. Here's an example:

```python
def my_pre_task_hook(job):
    ...  # configure something before running your task
```

To ensure these hooks are run, simply add a `pre_task_hook` or `post_task_hook` key (or both, if needed) to your job config like so:

```python
JOBS = {
    "my_job": {
        "tasks": ["project.common.jobs.my_task"],
        "pre_task_hook": "project.common.jobs.my_pre_task_hook",
        "post_task_hook": "project.common.jobs.my_post_task_hook",
    },
}
```

Notes:

* If the `pre_task_hook` fails (raises an exception), the task function is not run, and django-db-queue behaves as if the task function itself had failed: the failure hook is called, and the job goes into the `FAILED` state.
* The `post_task_hook` is always run, even if the job fails. In this case, it runs after the `failure_hook`.
* If the `post_task_hook` raises an exception, this is logged but the job is **not marked as failed** and the failure hook does not run.
This is because the `post_task_hook` might need to perform cleanup that always happens after the task, no matter whether it succeeds or fails.

### Start the worker

In another terminal:

```
python manage.py worker
```

### Create a job

Using the name you configured for your job in your settings, create an instance of `Job`.

```python
from django_dbq.models import Job

Job.objects.create(name="my_job")
```

### Prioritising jobs

Sometimes it is necessary for certain jobs to take precedence over others. For example, you may have a worker whose primary purpose is dispatching somewhat important emails to users. However, once an hour, you may need to run a _really_ important job which needs to be done on time and cannot wait in the queue for dozens of emails to be dispatched before it can begin.

In order to make sure that an important job is run before others, you can set the `priority` field to an integer higher than `0` (the default). For example:

```python
Job.objects.create(name="normal_job")
Job.objects.create(name="important_job", priority=1)
Job.objects.create(name="critical_job", priority=2)
```

Jobs will be ordered by their `priority` (highest to lowest), then by the time they were created (oldest to newest), and processed in that order.

### Scheduling jobs

If you'd like to create a job but have it run at some time in the future, you can use the `run_after` field on the Job model:

```python
from datetime import timedelta

from django.utils import timezone

Job.objects.create(
    name="scheduled_job",
    run_after=(timezone.now() + timedelta(minutes=10)),
)
```

Of course, the scheduled job will only be run if your `python manage.py worker` process is running at the time when the job is scheduled to run.
Otherwise, it will run the next time you start your worker process after that time has passed.

It's also worth noting that, by default, scheduled jobs run as part of the same queue as all other jobs, so if a job is already being processed at the time when your scheduled job is due to run, it won't run until that job has finished. If increased precision is important, you might consider using the `queue_name` feature to run a separate worker dedicated to only running scheduled jobs.

## Terminology

### Job

The top-level abstraction of a standalone piece of work. Jobs are stored in the database (i.e. they are represented as Django model instances).

### Task

Jobs are processed to completion by *tasks*. These are simply Python functions, which must take a single argument: the `Job` instance being processed. A single job will often require processing by more than one task to be completed fully. Creating the task functions is the responsibility of the developer. For example:

```python
def my_task(job):
    logger.info("Doing some hard work")
    do_some_hard_work()
```

### Workspace

The *workspace* is an area that can be used 1) to provide additional arguments to task functions, and 2) to categorize jobs with additional metadata. It is implemented as a Python dictionary, available on the `job` instance passed to tasks as `job.workspace`. The initial workspace of a job can be empty, or can contain some parameters that the tasks require (for example, API access tokens, account IDs etc.).

When creating a Job, the workspace is passed as a keyword argument:

```python
Job.objects.create(name="my_job", workspace={"key": value})
```

Then, the task function can access the workspace to get the data it needs to perform its task:

```python
def my_task(job):
    cats_import = CatsImport.objects.get(pk=job.workspace["cats_import_id"])
```

Tasks within a single job can use the workspace to communicate with each other.
A single task can edit the workspace, and the modified workspace will be passed on to the next task in the sequence. For example:

```python
def my_first_task(job):
    job.workspace['message'] = 'Hello, task 2!'

def my_second_task(job):
    logger.info("Task 1 says: %s" % job.workspace['message'])
```

The workspace can be queried like any [JSONField](https://docs.djangoproject.com/en/3.2/topics/db/queries/#querying-jsonfield). For instance, if you wanted to display a list of jobs that a certain user had initiated, add `user_id` to the workspace when creating the job:

```python
Job.objects.create(name="foo", workspace={"user_id": request.user.id})
```

Then filter the query with it in the view that renders the list:

```python
user_jobs = Job.objects.filter(workspace__user_id=request.user.id)
```

### Worker process

A *worker process* is a long-running process, implemented as a Django management command, which is responsible for executing the tasks associated with a job. There may be many worker processes running concurrently in the final system. Worker processes wait for a new job to be created in the database, and call each associated task in the correct sequence. A worker can be started using `python manage.py worker`, and a single worker instance is included in the development `procfile`.

### Configuration

Jobs are configured in the Django `settings.py` file. The `JOBS` setting is a dictionary mapping a *job name* (e.g. `import_cats`) to its configuration, which includes a `tasks` list of one or more task function paths.
For example:

```python
JOBS = {
    'import_cats': {
        'tasks': [
            'apps.cat_importer.import_cats.step_one',
            'apps.cat_importer.import_cats.step_two',
        ],
    },
}
```

### Job states

Jobs have a `state` field which can have one of the following values:

* `NEW` (has been created, waiting for a worker process to run the next task)
* `READY` (has run a task before, awaiting a worker process to run the next task)
* `PROCESSING` (a task is currently being processed by a worker)
* `STOPPING` (the worker process has received a signal from the OS requesting it to exit)
* `COMPLETED` (all job tasks have completed successfully)
* `FAILED` (a job task failed)

#### State diagram

![state diagram](states.png)

### API

#### Model methods

##### Job.get_queue_depths

If you need to programmatically get the depth of any queue you can run the following:

```python
from django_dbq.models import Job

...

Job.objects.create(name="do_work", workspace={})
Job.objects.create(name="do_other_work", queue_name="other_queue", workspace={})

queue_depths = Job.get_queue_depths()
print(queue_depths)  # {"default": 1, "other_queue": 1}
```

You can also exclude jobs which exist but are scheduled to run in the future (i.e. those whose `run_after` is set to a time later than now) from the queue depths. To do this, set the `exclude_future_jobs` kwarg like so:

```python
queue_depths = Job.get_queue_depths(exclude_future_jobs=True)
```

**Important:** When checking queue depths, do not assume that the key for your queue will always be available. Queue depths of zero won't be included in the dict returned by this method.

#### Management commands

##### manage.py delete_old_jobs

There is a management command, `manage.py delete_old_jobs`, which deletes any jobs from the database which are in state `COMPLETE` or `FAILED` and were created more than (by default) 24 hours ago.
This could be run, for example, as a cron task, to ensure the jobs table remains at a reasonable size. Use the `--hours` argument to control the age of jobs that will be deleted.

##### manage.py worker

To start a worker:

```
manage.py worker [queue_name] [--rate_limit]
```

- `queue_name` is optional, and will default to `default`
- The `--rate_limit` flag is optional, and will default to `1`. It is the minimum number of seconds that must have elapsed before a subsequent job can be run.

##### manage.py queue_depth

If you'd like to check your queue depth from the command line, you can run `manage.py queue_depth [queue_name [queue_name ...]]` and the number of jobs in the "NEW" or "READY" states will be returned.

If you wish to exclude jobs which are scheduled to be run in the future, you can add `--exclude_future_jobs` to the command.

**Important:** If you misspell a queue name, or provide one which does not have any jobs, a depth of 0 will always be returned.

### Gotcha: `bulk_create`

Because the `Job` model has logic in its `save` method, and because `save` doesn't get called when using `bulk_create`, you can't easily use `bulk_create` to create multiple `Job` instances at the same time.

If you really need to do this, you should be able to get it to work by using `django_dbq.tasks.get_next_task_name` to compute the next task name from the `name` of the job, and then using that value to populate the `next_task` field on each of the unsaved `Job` instances before calling `bulk_create`.
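
As a rough sketch of that workaround (this assumes `get_next_task_name` accepts the job name, as described above — check the signature against your installed version; `my_job` and the `item_id` workspace key are just the illustrative names used earlier):

```python
from django_dbq.models import Job
from django_dbq.tasks import get_next_task_name

# Compute the first task name for this job type up front,
# as Job.save() would normally do for each instance.
first_task = get_next_task_name("my_job")

# Build unsaved Job instances with next_task pre-populated, then
# insert them all in one query.
jobs = [
    Job(name="my_job", workspace={"item_id": item_id}, next_task=first_task)
    for item_id in range(100)
]
Job.objects.bulk_create(jobs)
```
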
Note that if you use this approach, the job's `creation_hook` will not be called.

## Testing

It may be necessary to supply a `DATABASE_PORT` environment variable.

## Windows support

Windows is supported on a best-effort basis only, and is not covered by automated or manual testing.

## Code of conduct

For guidelines regarding the code of conduct when contributing to this repository please review [https://www.dabapps.com/open-source/code-of-conduct/](https://www.dabapps.com/open-source/code-of-conduct/)