{"id":19817162,"url":"https://github.com/nvpro-samples/vk_compute_mipmaps","last_synced_at":"2026-03-04T19:31:21.032Z","repository":{"id":45229075,"uuid":"388998144","full_name":"nvpro-samples/vk_compute_mipmaps","owner":"nvpro-samples","description":"Customizable compute shader for fast cache-aware mipmap generation","archived":false,"fork":false,"pushed_at":"2024-09-07T07:20:53.000Z","size":26234,"stargazers_count":48,"open_issues_count":0,"forks_count":3,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-02-28T19:44:37.385Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"GLSL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nvpro-samples.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-24T04:21:30.000Z","updated_at":"2025-02-06T17:07:40.000Z","dependencies_parsed_at":"2024-11-12T10:11:55.314Z","dependency_job_id":null,"html_url":"https://github.com/nvpro-samples/vk_compute_mipmaps","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nvpro-samples/vk_compute_mipmaps","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fvk_compute_mipmaps","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fvk_compute_mipmaps/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fvk_compute_mipmaps/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fvk_co
mpute_mipmaps/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nvpro-samples","download_url":"https://codeload.github.com/nvpro-samples/vk_compute_mipmaps/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nvpro-samples%2Fvk_compute_mipmaps/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30090511,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T18:31:08.343Z","status":"ssl_error","status_checked_at":"2026-03-04T18:31:07.708Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T10:11:51.678Z","updated_at":"2026-03-04T19:31:21.014Z","avatar_url":"https://github.com/nvpro-samples.png","language":"GLSL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# vk_compute_mipmaps\n\nThis repository demonstrates a customizable cache-aware mipmap\ngeneration algorithm using compute shaders. 
For power-of-2 textures,\nthis outperforms the conventional blit algorithm by about\n50% (for a 4096x4096 sRGBA8 texture and an RTX 3090, the blit\nalgorithm took 161 microseconds vs 114 microseconds for the compute\nshader).\n\n**The most important component of this repository is the\n`nvpro_pyramid` library, whose directory can be copied out and used\nindependently of this repository.** This provides a GLSL compute shader\ntemplate that only provides a \"schedule\" for mipmap generation, and\nstub C++ code for dispatching the shader. The user completes the\nshader by defining macros for loading, reducing, and storing samples,\nadapting the shader to their own application's needs. There are no\ndependencies besides standard C++ and Vulkan (in particular, it does\nnot depend on the `nvpro_core` framework, which the sample application\ndepends on).\n\nAdditionally, the repository contains a minimalist `nvpro_pyramid`\nusage sample (`minimal_app`) along with a full sample application\n(`demo_app`) that demonstrates some practical applications for fast\nruntime mipmap generation and provides benchmarking functionality.\n\n\n# Motivation\n\nA common method to generate mipmaps is to perform level-by-level blits\nwith bilinear filters (i.e. blitting from level 0 to level 1,\ninserting a barrier, blitting from level 1 to level 2, barrier, and so\non until the last mip level).\n\nWhile simple to understand and implement, this method has some\nobjective weaknesses:\n\n* The barriers introduce stalls into the queue for each blit: there is\n  some overhead waiting for one blit to drain fully before starting\n  the next.\n\n* For realistic image sizes, the large low-numbered levels (where the\n  bulk of work occurs) won't fit in cache. 
The samples written to\n  level `N` are very likely to be evicted from cache before being read\n  by the next blit (to level `N+1`).\n\nAdditional potential drawbacks (depending on use case) include:\n\n* Blits cannot run on compute-only queues.\n\n* Blits can only use a small number of built-in reduction functions\n  (e.g. bilinear filtering). This is especially problematic for\n  non-power-of-2 textures, which ideally (from an aesthetic standpoint)\n  should use a 3-by-3 kernel.\n\n\n# Library\n\nThe `nvpro_pyramid` library is concerned only with determining the\nschedule for mipmap generation and leaves everything else up to the\nuser whenever possible. You can copy its directory out of this\nrepository and use it in your own project. The library can also\nsupport general \"image pyramid\" algorithms, as long as\nthe reduction kernel is mipmap-like in that:\n\n* It can be separated into horizontal and vertical passes.\n\n* The kernel size is 2x2 for even input images and 3x3 for odd.\n\n**Requirements:**\n\n* Vulkan 1.1+ and GLSL 4.50+ with `#include` support; no other\n  mandatory extensions. However, [subgroup\n  shuffle](https://www.khronos.org/blog/vulkan-subgroup-tutorial)\n  support is needed for optimal performance.\n\n* User-defined macros for loading, reducing, and storing samples. 
In\n  particular the user has total freedom in picking the image\n  representation, sample type, and descriptor set layout.\n\n* User must reserve 32 bits out of their push constant for the shader to use.\n\n**Features:**\n\n* No external dependencies; self-contained in `nvpro_pyramid` directory.\n\n* No image size or divisibility requirements (barring `int` overflow).\n\n* Example shader included for sRGBA8 mipmap generation.\n\n* Non-power-of-2 support, using an\n  [energy conserving 3x3 kernel](https://nvpro-samples.github.io/vk_compute_mipmaps/docs/strategy.md.html#generalpipeline);\n  however, note that this has a\n  [performance cost](https://nvpro-samples.github.io/vk_compute_mipmaps/docs/strategy.md.html#generalpipelineperformance/quality-performancetradeoff).\n\n* Optional macros for performance tuning (e.g. using hardware bilinear\n  filtering to replace shader code).\n\n* Readable, commented GLSL code. You are invited to read the source code\n  and adapt the demonstrated techniques for other purposes.\n\n* Open source (see bottom for license).\n\n**Files:**\n\n* `nvpro_pyramid.glsl`: template shader; this file contains documentation\n  on how to integrate the shader into your application.\n\n* `nvpro_pyramid_dispatch.hpp`: Contains the `nvproCmdPyramidDispatch`\n  C++ function that records suitable dispatch commands for the compute shader.\n\n* `srgba8_mipmap_preamble.glsl`: Example macro definitions for configuring\n  the shader for sRGBA8 mipmap generation.\n\n* `srgba8_mipmap_fast_pipeline.comp` and `srgba8_mipmap_general_pipeline.comp`:\n  example complete compute shaders for sRGBA8 mipmap generation.\n\n\n# Sample Build and Run\n\n(This is only needed for the full sample; the library shader does not\ndepend on any helper or build system).\n\nClone https://github.com/nvpro-samples/nvpro_core.git next to this\nrepository (or pull latest `master` if you already have it).  If there\nare missing dependencies (e.g. 
glfw), run `git submodule update --init\n--recursive --checkout --force` in the `nvpro_core` repository.\n\n\u003c!-- TODO may become `main` branch. --\u003e\n\n`mkdir build \u0026\u0026 cd build \u0026\u0026 cmake .. # Or use CMake GUI`\n\nThen open the generated `.sln` in Visual Studio or run `make -j`.\n\n\n# Parallelization Strategy\n\nIn the ideal power-of-2 case, observe that it's very easy to use\noutputs immediately after they are generated, rather than storing them\nto main memory and reloading them much later. For example, a 16x16\ntile of the base mip level generates an 8x8 tile in level 1, which\nitself can be used to generate a 4x4 tile in level 2, and so on. This\nfact is used by the \"fast pipeline\" distributed with `nvpro_pyramid`,\nwhich parallelizes each tile across many GPU threads. To communicate\nresults between threads, the fast pipeline mostly uses\n[shuffles](https://www.khronos.org/blog/vulkan-subgroup-tutorial),\nwhich allow threads in the same subgroup (warp) to peek at each\nother's registers with minimal or no synchronization overhead.\n\nNon-power-of-2 textures do not tile as cleanly, but the same\nprinciples are still applied, with some complications, in the \"general\npipeline\", used whenever the fast pipeline is not applicable. 
In this\ncase, threads communicate using shared memory.\n\n[More Details](https://nvpro-samples.github.io/vk_compute_mipmaps/docs/strategy.md.html)\n\nThis table provides RTX 3090 benchmark results, in nanoseconds, for\nsRGBA8 mipmap generation on images of various sizes, to give an idea of\nthe speed of the library.\n\n* `nvpro_pyramid`: using the `nvpro_pyramid` shader library\n\n* `blit`: using one blit and one barrier for each level generated\n\n* `onelevel`: using one compute dispatch and one barrier for each level generated\n\n```\n***********************************************************************\n*                   key: nanosecond runtime (runtime relative to blit)*\n*                                                                     *\n*+----------------+----------------+----------------+----------------+*\n*|     image size |  nvpro_pyramid |           blit |       onelevel |*\n*+----------------+----------------+----------------+----------------+*\n*|      1920,1080 |  36736 ( 80.6%)|  45568 (100.0%)|  46080 (101.1%)|*\n*|                |  37632 ( 80.8%)|  46592 (100.0%)|  46848 (100.5%)|*\n*+----------------+----------------+----------------+----------------+*\n*|      2560,1440 |  43008 ( 72.3%)|  59520 (100.0%)|  60672 (101.9%)|*\n*|                |  43648 ( 72.1%)|  60544 (100.0%)|  61440 (101.5%)|*\n*+----------------+----------------+----------------+----------------+*\n*|      3840,2160 |  75520 ( 80.3%)|  94080 (100.0%)|  96000 (102.0%)|*\n*|                |  76288 ( 80.0%)|  95360 (100.0%)|  96768 (101.5%)|*\n*+----------------+----------------+----------------+----------------+*\n*|      2048,2048 |  35712 ( 57.2%)|  62464 (100.0%)|  62748 (100.5%)|*\n*|                |  36480 ( 57.3%)|  63616 (100.0%)|  63488 ( 99.8%)|*\n*+----------------+----------------+----------------+----------------+*\n*|      4096,4096 | 112000 ( 70.1%)| 159744 (100.0%)| 159232 ( 99.7%)|*\n*|                | 113024 ( 70.2%)| 161024 (100.0%)| 160128 ( 
99.4%)|*\n*+----------------+----------------+----------------+----------------+*\n***********************************************************************\n```\n\n\n# Minimal Application\n\nThe `minimal_app` directory holds a sample that loads an image from\ndisk and generates and outputs mipmaps for it. Note that the vast\nmajority of time is spent on I/O (the speed of the `nvpro_pyramid`\nlibrary is intended more for real-time applications).\n\n**The purpose of this sample is just to provide simple example code\n  for `nvpro_pyramid` library usage, which may help you if integrating\n  the library into your application is your goal.**\n\nTo run, select and run `vk_compute_mipmaps_minimal` in the solution\nexplorer or manually execute\n`../../bin_x64/Release/vk_compute_mipmaps_minimal.exe`.\n\n# Demo Application\n\nSee `demo_app` directory.\n\nThe demo application\n\n* Demonstrates the speed of the `nvpro_pyramid` shader by using it to\n  generate mipmaps for a huge dynamically-generated texture every frame.\n\n* Provides a tool for benchmarking the `nvpro_pyramid` shader, blits,\n  and alternative shaders rejected during development (or developed by\n  the user!), and testing their output's correctness.\n\n* Allows for choosing between these alternative mipmap generators at\n  runtime and immediately seeing their effects.\n\nTo run, select and run `vk_compute_mipmaps_demo` in the solution\nexplorer or manually execute\n`../../bin_x64/Release/vk_compute_mipmaps_demo.exe`.\n\n## Huge Texture\n\nThe sample dynamically generates a huge texture based on a [quadratic\npolynomial Julia\nSet](https://en.wikipedia.org/wiki/Julia_set#Quadratic_polynomials)\nwith varying constant coefficient, and generates mipmaps for it\ndynamically, before using it to texture the screen (in 2D camera mode)\nor the ground (in 3D camera mode). This is meant as a stand-in for any\nsort of dynamically-generated texture a production application might\ncreate, e.g. 
a reflection map, that may have to be sampled at multiple\nlevels of detail.\n\n## Benchmarking Utility \u0026 Alternative Shaders\n\nThe benchmarking utility compares the performance and correctness of\nthe default `nvpro_pyramid` shader, blits, and alternative shaders, using\na variety of test images. This, along with the \"mipmap algorithm\"\nmenu, can be used to\n\n* [Quantify the effects of omitting various optimizations and\n  techniques used by the provided mipmap\n  shader](https://nvpro-samples.github.io/vk_compute_mipmaps/docs/strategy.md.html#fastpipelineperformance)\n\n* [Benchmark your own custom fast or general pipelines\n  and compare with the provided shader](https://nvpro-samples.github.io/vk_compute_mipmaps/docs/alternatives.md.html)\n\nThe benchmark results are written in JSON format. When the sample is run in\nVisual Studio, the benchmark results seem to be written to `build/demo_app`\n(assuming the CMake build directory is named `build` as suggested).\n\n![Highlighted Controls](./docs/vk_compute_mipmaps_benchmark_ui.png)\n\n# Test Images\n\nThere are some test images in the `test_images` directory. Many test\npathological cases; for example, `4095.jpg` tests the \"worst-case\"\none-less-than-a-power-of-2 case with high-frequency information.\n\nThese images are either photographed or computer generated by me,\nand have the same license as the code sample. `4096.jpg` and similar\nimages are based on a chaotic quadratic polynomial found by my college\nroommate, Marlon Trifunovic.\n\n\n# Acknowledgement\n\nThank you to Pascal Gautron, Christoph Kubisch, and Martin-Karl\nLefrançois for review; Christoph also provided the outline for the\n`fastPipeline` shader and optimization suggestions.\n\n\n# License\n\n\u003c!-- Note: LICENSE duplicated in nvpro_pyramid/ directory, in case it's copied out alone--\u003e\nCopyright 2021 NVIDIA CORPORATION. Released under Apache License,\nVersion 2.0. 
See \"LICENSE\" file for details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvpro-samples%2Fvk_compute_mipmaps","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnvpro-samples%2Fvk_compute_mipmaps","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvpro-samples%2Fvk_compute_mipmaps/lists"}