{"id":23448292,"url":"https://github.com/triton-inference-server/dali_backend","last_synced_at":"2025-04-04T18:07:59.799Z","repository":{"id":37412089,"uuid":"294219252","full_name":"triton-inference-server/dali_backend","owner":"triton-inference-server","description":"The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API.","archived":false,"fork":false,"pushed_at":"2025-03-25T10:39:35.000Z","size":25168,"stargazers_count":132,"open_issues_count":27,"forks_count":32,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-03-28T17:11:13.721Z","etag":null,"topics":["dali","data-preprocessing","deep-learning","fast-data-pipeline","gpu","image-processing","nvidia-dali","python"],"latest_commit_sha":null,"homepage":"https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/triton-inference-server.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-09T20:14:13.000Z","updated_at":"2025-03-25T10:39:39.000Z","dependencies_parsed_at":"2023-02-19T08:46:11.315Z","dependency_job_id":"aace19b6-36ba-4241-9bb5-941032e95f57","html_url":"https://github.com/triton-inference-server/dali_backend","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/triton-inference-server%2Fdali_backend","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/triton-inference-
server%2Fdali_backend/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/triton-inference-server%2Fdali_backend/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/triton-inference-server%2Fdali_backend/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/triton-inference-server","download_url":"https://codeload.github.com/triton-inference-server/dali_backend/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247226215,"owners_count":20904465,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dali","data-preprocessing","deep-learning","fast-data-pipeline","gpu","image-processing","nvidia-dali","python"],"created_at":"2024-12-23T22:14:52.061Z","updated_at":"2025-04-04T18:07:59.771Z","avatar_url":"https://github.com/triton-inference-server.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DALI TRITON Backend\n\n\u003c!----\u003e\n\nThis repository contains code for DALI Backend for Triton Inference Server.\n\n![alt text](dali.png)\n\n**NVIDIA DALI (R)**, the Data Loading Library, is a collection of highly optimized building blocks,\nand an execution engine, to accelerate the pre-processing of the input data for deep learning applications.\nDALI provides both the performance and the flexibility to accelerate different data pipelines as one library.\nThis library can then be easily integrated into different deep learning training and inference applications,\nregardless of used deep learning 
framework.\n\nTo find out more about DALI please refer to our [main page](https://developer.nvidia.com/DALI).\n[Getting started](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/examples/getting%20started.html#Getting-started)\nand [Tutorials](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/examples/index.html)\nwill guide you through your first steps and [Supported operations](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/supported_ops.html)\nwill help you put together GPU-powered data processing pipelines.\n\n## See any bugs?\nFeel free to post an issue here or in DALI's [github repository](https://github.com/NVIDIA/DALI).\n\n## How to use?\n\n1. DALI data pipeline is expressed within Triton as a\n[Model](https://github.com/triton-inference-server/server/blob/master/docs/architecture.md#models-and-schedulers).\nTo create such Model, you have to put together a [DALI\nPipeline](https://docs.nvidia.com/deeplearning/dali/master-user-guide/docs/examples/getting%20started.html#Pipeline)\nin Python. Then, you have to serialize it (by calling the\n[Pipeline.serialize](https://docs.nvidia.com/deeplearning/dali/master-user-guide/docs/pipeline.html#nvidia.dali.pipeline.Pipeline.serialize)\nmethod) or use the [Autoserialization](#Autoserialization) to generate a Model file. As an example, we'll use simple\nresizing pipeline:\n\n        import nvidia.dali as dali\n        from nvidia.dali.plugin.triton import autoserialize\n\n        @autoserialize \n        @dali.pipeline_def(batch_size=256, num_threads=4, device_id=0)\n        def pipe():\n            images = dali.fn.external_source(device=\"cpu\", name=\"DALI_INPUT_0\")\n            images = dali.fn.image_decoder(images, device=\"mixed\")\n            images = dali.fn.resize(images, resize_x=224, resize_y=224)\n            return images\n\n1. 
The Model file shall be placed in Triton's [Model\nRepository](https://github.com/triton-inference-server/server/blob/master/docs/model_repository.md).\nHere's an example:\n\n        model_repository\n        └── dali\n            ├── 1\n            │   └── model.dali\n            └── config.pbtxt\n\n1. As is typical in Triton, your DALI Model file shall be named `model.dali`.\nYou can override this name in the model configuration by setting the `default_model_filename` option.\nHere's the whole `config.pbtxt` we use for the `ResizePipeline` example:\n\n        name: \"dali\"\n        backend: \"dali\"\n        max_batch_size: 256\n        input [\n        {\n            name: \"DALI_INPUT_0\"\n            data_type: TYPE_UINT8\n            dims: [ -1 ]\n        }\n        ]\n\n        output [\n        {\n            name: \"DALI_OUTPUT_0\"\n            data_type: TYPE_UINT8\n            dims: [ 224, 224, 3 ]\n        }\n        ]\n\nYou can omit most of the configuration file if you specify information about the\ninputs, outputs and max batch size in the pipeline definition.\nRefer to [Configuration auto-complete](#Configuration-auto-complete) for details about this feature.\n\n## Configuration auto-complete\n\nTo simplify model deployment, Triton Server can infer parts of the\nconfiguration file from the model file itself. In the case of the DALI backend, the information\nabout the inputs, outputs and the max batch size can be specified in the pipeline definition\nand does not need to be repeated in the configuration file. 
Below you can see how to include the\nconfiguration info in the Python pipeline definition:\n\n    import nvidia.dali as dali\n    from nvidia.dali.plugin.triton import autoserialize\n    import nvidia.dali.types as types\n\n    @autoserialize\n    @dali.pipeline_def(batch_size=256, num_threads=4, device_id=0, output_dtype=[types.UINT8], output_ndim=[3])\n    def pipe():\n        images = dali.fn.external_source(device=\"cpu\", name=\"DALI_INPUT_0\", dtype=types.UINT8, ndim=1)\n        images = dali.fn.image_decoder(images, device=\"mixed\")\n        images = dali.fn.resize(images, resize_x=224, resize_y=224)\n        return images\n\nAs you can see, we added `dtype` and `ndim` (number of dimensions) arguments to the external source operator. They provide the information needed to\nfill the `inputs` field in the configuration file. To fill the `outputs` field, we added the `output_dtype` and `output_ndim` arguments to the pipeline definition. Those are expected to be lists with a value for each output.\n\nThis way we can limit the configuration file to just naming the model and specifying the DALI backend:\n\n    name: \"dali\"\n    backend: \"dali\"\n\n### Partial configuration\nYou can still provide some of the information if it is not present in the pipeline definition\nor to override some of the values. For example, you can use the configuration file\nto give new names to the model outputs which might be useful when using them later in an ensemble model.\nYou can also overwrite the max batch size. The configuration file for the pipeline defined above could\nlook like this:\n\n    name: \"dali\"\n    backend: \"dali\"\n    max_batch_size: 128\n\n    output [\n    {\n        name: \"DALI_OUTPUT_0\"\n        dims: [ 224, 224, 3 ]\n    }\n    ]\n\nSuch configuration file overwrites the max batch size value to 128. 
It also renames the pipeline output\nto `\"DALI_OUTPUT_0\"` and specifies its shape as `(224, 224, 3)`.\n\nRefer to the [DALI model configuration file](docs/config.md) documentation for details on the model parameters that can be specified in the configuration file.\n\n## Autoserialization\n\nWhen using DALI Backend in Triton, the user has to provide a DALI model in the Model Repository.\nThe canonical way of expressing a model is to include a serialized DALI model file there and\nname the file properly (``model.dali`` by default). The issue with storing a model\nin a serialized file is that, after serialization, the model is opaque and almost impossible\nto read. The autoserialization feature allows the user to express the model as Python code in\nthe model repository. With a Python-defined model, DALI Backend uses an internal serialization\nmechanism and relieves the user of manual serialization.\n\nTo use the autoserialization feature, the user needs to put a Python definition of the DALI pipeline\ninside the model file (``model.dali`` by default, but the default file name can be configured\nin ``config.pbtxt``). Such a pipeline definition has to be decorated with ``@autoserialize``,\ne.g.:\n\n    import nvidia.dali as dali\n\n    @dali.plugin.triton.autoserialize\n    @dali.pipeline_def(batch_size=3, num_threads=1, device_id=0)\n    def pipe():\n        '''\n        An identity pipeline with autoserialization enabled\n        '''\n        data = dali.fn.external_source(device=\"cpu\", name=\"DALI_INPUT_0\")\n        return data\n\n\nA proper DALI pipeline definition in Python, together with autoserialization, shall meet the\nfollowing conditions:\n1. Only a ``pipeline_def`` can be decorated with ``autoserialize``.\n2. Only one pipeline definition may be decorated with ``autoserialize`` in a given model version.\n\nWhile loading a model file, DALI Backend follows this order of precedence:\n1. 
First, DALI Backend tries to load a serialized model from the user-specified model location in the ``default_model_filename`` property (``model.dali`` if not specified explicitly);\n2. If the previous step fails, DALI Backend tries to load and autoserialize a Python pipeline\ndefinition from the user-specified model location. **Important**: In this case, the name of the file with the model definition must end with ``.py``, e.g. ``mymodel.py``;\n3. If the previous step fails, DALI Backend tries to load and autoserialize a Python pipeline\ndefinition from the ``dali.py`` file in a given model version.\n\nIf you did not change the model path in the `config.pbtxt` file, follow this rule of thumb:\n1. If you have a serialized pipeline, name the file `model.dali` and put it into the model repository.\n2. If you have a Python definition of a pipeline that shall be autoserialized, name the file `dali.py`.\n\n## Tips \u0026 Tricks:\n1. Currently, the only way to pass an input to the DALI pipeline from Triton is to use the `fn.external_source` operator.\nTherefore, you will most likely want to use it to feed the encoded images (or any other data) into DALI.\n2. Give your `fn.external_source` operator the same name you give to the Input in `config.pbtxt`.\n\n## Known limitations:\n1. DALI's `ImageDecoder` accepts data only from the CPU. Keep this in mind when putting together your DALI pipeline.\n1. Triton accepts only batches of homogeneous shape. Feel free to pad your batch of encoded images with zeros.\n1. Due to DALI limitations, you might observe unusually high memory consumption when\ndefining an instance group with a `count` higher than 1 for a DALI model. 
We suggest using the default instance\ngroup for DALI models.\n\n\n## How to build?\n\n### Docker build \u003ca name=\"docker_build\"\u003e\u003c/a\u003e\nBuilding DALI Backend with Docker is as simple as:\n\n    git clone --recursive https://github.com/triton-inference-server/dali_backend.git\n    cd dali_backend\n    docker build -f docker/Dockerfile.release -t tritonserver:dali-latest .\n\nAfter that, `tritonserver:dali-latest` becomes your new `tritonserver` Docker image.\n\n### Bare metal\n#### Prerequisites\nTo build `dali_backend` you'll need `CMake 3.17+`.\n#### Using fresh DALI release\nIn the event that you need a newer DALI version than the one provided in the `tritonserver` image,\nyou can use DALI's [nightly builds](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html#nightly-and-weekly-release-channels).\nJust install whatever DALI version you like using pip (refer to the link for more information on how to do it).\nIn this case, while building `dali_backend`, you need to pass the `-D TRITON_SKIP_DALI_DOWNLOAD=ON`\noption to your CMake build. `dali_backend` will find the latest DALI installed in your system and\nuse this particular version.\n#### Building\nBuilding DALI Backend is straightforward. One thing to remember is to clone the\n`dali_backend` repository with all the submodules:\n\n    git clone --recursive https://github.com/triton-inference-server/dali_backend.git\n    cd dali_backend\n    mkdir build\n    cd build\n    cmake ..\n    make\n\nThe build process will generate a `unittest` executable.\nYou can use it to run unit tests for DALI Backend.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftriton-inference-server%2Fdali_backend","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftriton-inference-server%2Fdali_backend","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftriton-inference-server%2Fdali_backend/lists"}