{"id":19093697,"url":"https://github.com/juliaarrays/tilediteration.jl","last_synced_at":"2025-08-20T16:12:58.809Z","repository":{"id":46807506,"uuid":"66657076","full_name":"JuliaArrays/TiledIteration.jl","owner":"JuliaArrays","description":"Julia package to facilitate writing mulithreaded, multidimensional, cache-efficient code","archived":false,"fork":false,"pushed_at":"2024-05-10T10:36:39.000Z","size":95,"stargazers_count":81,"open_issues_count":8,"forks_count":9,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-02-21T07:17:39.787Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JuliaArrays.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-08-26T15:08:29.000Z","updated_at":"2024-08-10T10:48:30.000Z","dependencies_parsed_at":"2024-01-22T21:56:40.549Z","dependency_job_id":"b4596d09-44d1-4b2e-902e-f3925b34576d","html_url":"https://github.com/JuliaArrays/TiledIteration.jl","commit_stats":{"total_commits":55,"total_committers":11,"mean_commits":5.0,"dds":"0.36363636363636365","last_synced_commit":"f0b091d0e6674c9053cef684fcb2d8a390b6a9d4"},"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaArrays%2FTiledIteration.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaArrays%2FTiledIteration.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaArrays%2FT
iledIteration.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaArrays%2FTiledIteration.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JuliaArrays","download_url":"https://codeload.github.com/JuliaArrays/TiledIteration.jl/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240142863,"owners_count":19754640,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T03:25:45.136Z","updated_at":"2025-02-22T07:54:28.532Z","avatar_url":"https://github.com/JuliaArrays.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TiledIteration\n\n[![CI](https://github.com/JuliaArrays/TiledIteration.jl/actions/workflows/CI.yml/badge.svg)](https://github.com/JuliaArrays/TiledIteration.jl/actions/workflows/CI.yml)\n[![codecov.io](http://codecov.io/github/JuliaArrays/TiledIteration.jl/coverage.svg?branch=master)](http://codecov.io/github/JuliaArrays/TiledIteration.jl?branch=master)\n\nThis Julia package handles some of the low-level details for writing\ncache-efficient, possibly-multithreaded code for multidimensional\narrays. 
A \"tile\" corresponds to a chunk of a larger array, typically a\nregion that is large enough to encompass any \"local\" computations you\nneed to perform; some of these computations may require temporary storage.\n\nA related package with different aims is [TiledViews.jl](https://github.com/bionanoimaging/TiledViews.jl).\n\n## Usage\n\nThis package offers two basic kinds of functionality: the management\nof temporary buffers for processing on tiles, and the iteration over\ndisjoint tiles of a larger array.\n\n### Iteration\n\n#### SplitAxis and SplitAxes\n\nThe main use for these simple types is in distributing work across\nthreads, usually in circumstances that do not require\nmultidimensional locality as provided by `TileIterator`.  `SplitAxis`\nsplits a single array axis, and `SplitAxes` splits multidimensional\naxes along the final axis.  For example:\n\n```julia\njulia\u003e using TiledIteration\n\njulia\u003e A = rand(3, 20);\n\njulia\u003e collect(SplitAxes(axes(A), 4))\n4-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:\n (1:3, 1:5)\n (1:3, 6:10)\n (1:3, 11:15)\n (1:3, 16:20)\n```\n\nYou can also reduce the amount of work assigned to thread 1 (often the\nmain thread is responsible for scheduling the other threads):\n\n```julia\njulia\u003e collect(SplitAxes(axes(A), 3.5))\n4-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:\n (1:3, 1:2)\n (1:3, 3:8)\n (1:3, 9:14)\n (1:3, 15:20)\n```\n\nUsing \"3.5 chunks\" forces the later workers to perform 6 columns of\nwork (rounding 20/3.5 up to the next integer), leaving only two\ncolumns remaining for the first thread.\n\n\n#### TileIterator\n\nMore general iteration over disjoint tiles of a larger array can be done\nwith `TileIterator`:\n\n```julia\nusing TiledIteration\n\nA = rand(1000,1000);   # our big array\nfor tileaxs in TileIterator(axes(A), (128,8))\n    @show tileaxs\nend\n```\n\nThis produces\n```julia\ntileaxs = (1:128,1:8)\ntileaxs = (129:256,1:8)\ntileaxs = (257:384,1:8)\ntileaxs = 
(385:512,1:8)\ntileaxs = (513:640,1:8)\ntileaxs = (641:768,1:8)\ntileaxs = (769:896,1:8)\ntileaxs = (897:1000,1:8)\ntileaxs = (1:128,9:16)\ntileaxs = (129:256,9:16)\ntileaxs = (257:384,9:16)\ntileaxs = (385:512,9:16)\n...\n```\n\nYou can see that the total axes range is split up into chunks,\nwhich are of size `(128,8)` except at the edges of `A`. Naturally,\nthese axes serve as the basis for processing individual chunks of\nthe array.\n\nAs a further example, suppose you've started Julia with `JULIA_NUM_THREADS=4`; then\n\n```julia\nfunction fillid!(A, tilesz)\n    tileinds_all = collect(TileIterator(axes(A), tilesz))\n    Threads.@threads for i = 1:length(tileinds_all)\n        tileaxs = tileinds_all[i]\n        A[tileaxs...] .= Threads.threadid()\n    end\n    A\nend\n\nA = zeros(Int, 8, 8)\nfillid!(A, (2,2))\n```\n\nwould typically yield (the exact assignment of tiles to threads depends on the scheduler)\n\n```julia\n8×8 Array{Int64,2}:\n 1  1  2  2  3  3  4  4\n 1  1  2  2  3  3  4  4\n 1  1  2  2  3  3  4  4\n 1  1  2  2  3  3  4  4\n 1  1  2  2  3  3  4  4\n 1  1  2  2  3  3  4  4\n 1  1  2  2  3  3  4  4\n 1  1  2  2  3  3  4  4\n```\n\nSee also \"EdgeIterator\" below.\n\n### Determining the chunk size\n\n[Stencil computations](https://en.wikipedia.org/wiki/Stencil_code)\ntypically require \"padding\" values, so the inputs to a computation may\nbe of a different size than the resulting outputs. 
Naturally, you can\nset the tile size manually; a simple convenience function,\n`padded_tilesize`, attempts to pick reasonable choices for you\ndepending on the size of your kernel (stencil) and element type you'll\nbe using:\n\n```julia\njulia\u003e padded_tilesize(UInt8, (3,3))\n(768,18)\n\njulia\u003e padded_tilesize(UInt8, (3,3), 4)  # we want 4 of these to fit in L1 cache at once\n(512,12)\n\njulia\u003e padded_tilesize(Float64, (3,3))\n(96,18)\n\njulia\u003e padded_tilesize(Float32, (3,3,3))\n(64,6,6)\n```\n\n### Allocating and managing temporary storage\n\nTo allocate temporary storage while working with tiles, use `TileBuffer`:\n\n```julia\njulia\u003e tileaxs = (-1:15, 0:7)  # really this might have come from TileIterator\n\njulia\u003e buf = TileBuffer(Float32, tileaxs)\nTiledIteration.TileBuffer{Float32,2,2} with indices -1:15×0:7:\n 0.0  0.0          2.38221f-44  0.0          0.0          0.0          9.3887f-44   0.0\n 0.0  1.26117f-44  0.0          0.0          0.0          8.26766f-44  0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          6.02558f-44  0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          7.28675f-44  0.0          0.0          0.0\n 0.0  1.54143f-44  0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          9.94922f-44  0.0\n 0.0  0.0          0.0          0.0          0.0          8.82818f-44  0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0  
        0.0\n 0.0  0.0          0.0          0.0          0.0          9.10844f-44  0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          1.03696f-43  0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n```\n\nThis returns an uninitialized buffer for use over the indicated domain. You can reuse this same storage for the next tile, even if the tile is smaller because it corresponds to the edge of the original array:\n\n```julia\njulia\u003e pointer(buf)\nPtr{Float32} @0x00007f79131fd550\n\njulia\u003e buf = TileBuffer(buf, (16:20, 0:7))\nTiledIteration.TileBuffer{Float32,2,2} with indices 16:20×0:7:\n 0.0  0.0  0.0  0.0          0.0          0.0  0.0          0.0\n 0.0  0.0  0.0  0.0          0.0          0.0  0.0          0.0\n 0.0  0.0  0.0  0.0          1.54143f-44  0.0  0.0          0.0\n 0.0  0.0  0.0  1.26117f-44  0.0          0.0  0.0          0.0\n 0.0  0.0  0.0  0.0          0.0          0.0  2.38221f-44  0.0\n\njulia\u003e pointer(buf)\nPtr{Float32} @0x00007f79131fd550\n```\n\nWhen you use it again at the top of the next block of columns, it returns to its original size while still reusing the same memory:\n```julia\njulia\u003e buf = TileBuffer(buf, (-1:15, 8:15))\nTiledIteration.TileBuffer{Float32,2,2} with indices -1:15×8:15:\n 0.0  0.0          2.38221f-44  0.0          0.0          0.0          9.3887f-44   0.0\n 0.0  1.26117f-44  0.0          0.0          0.0          8.26766f-44  0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          6.02558f-44  0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          7.28675f-44  0.0          0.0          0.0\n 0.0  1.54143f-44  0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0      
    0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          9.94922f-44  0.0\n 0.0  0.0          0.0          0.0          0.0          8.82818f-44  0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          9.10844f-44  0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          1.03696f-43  0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n 0.0  0.0          0.0          0.0          0.0          0.0          0.0          0.0\n\njulia\u003e pointer(buf)\nPtr{Float32} @0x00007f79131fd550\n```\n\n### EdgeIterator\n\nWhen performing stencil operations, oftentimes the edge of the array\nrequires special treatment. Several approaches to handling the edges\n(adding explicit padding, or executing special code just when on the\nboundaries) can slow your algorithm down because of extra steps or\nbranches.\n\nThis package helps support implementations which first handle the\n\"interior\" of an array (for example using `TileIterator` over just\nthe interior) using a \"fast path,\" and then handle just the edges by a\n(possibly) less carefully optimized algorithm. 
The key component of\nthis is `EdgeIterator`:\n\n```julia\njulia\u003e outerrange = CartesianIndices((-1:4, 0:3));\n\njulia\u003e innerrange = CartesianIndices(( 1:3, 1:2));\n\njulia\u003e for I in EdgeIterator(outerrange, innerrange)\n           @show I\n       end\nI = CartesianIndex(-1, 0)\nI = CartesianIndex(0, 0)\nI = CartesianIndex(1, 0)\nI = CartesianIndex(2, 0)\nI = CartesianIndex(3, 0)\nI = CartesianIndex(4, 0)\nI = CartesianIndex(-1, 1)\nI = CartesianIndex(0, 1)\nI = CartesianIndex(4, 1)\nI = CartesianIndex(-1, 2)\nI = CartesianIndex(0, 2)\nI = CartesianIndex(4, 2)\nI = CartesianIndex(-1, 3)\nI = CartesianIndex(0, 3)\nI = CartesianIndex(1, 3)\nI = CartesianIndex(2, 3)\nI = CartesianIndex(3, 3)\nI = CartesianIndex(4, 3)\n```\n\nThe time required to visit these edge sites is on the order of the\nnumber of edge sites, not the order of the number of sites encompassed\nby `outerrange`, and consequently is efficient.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliaarrays%2Ftilediteration.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjuliaarrays%2Ftilediteration.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliaarrays%2Ftilediteration.jl/lists"}