{"id":30804326,"url":"https://github.com/cometscome/latticematrices.jl","last_synced_at":"2026-02-06T07:15:33.192Z","repository":{"id":306357799,"uuid":"1025882544","full_name":"cometscome/LatticeMatrices.jl","owner":"cometscome","description":"High-performance matrix fields on arbitrary D-dimensional lattices in Julia.","archived":false,"fork":false,"pushed_at":"2026-01-30T23:12:02.000Z","size":317,"stargazers_count":1,"open_issues_count":6,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-31T01:56:00.387Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cometscome.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-07-25T01:05:17.000Z","updated_at":"2026-01-30T23:12:05.000Z","dependencies_parsed_at":"2025-07-25T07:42:11.887Z","dependency_job_id":"455b050e-c019-4679-8d8b-5303b585ad0a","html_url":"https://github.com/cometscome/LatticeMatrices.jl","commit_stats":null,"previous_names":["cometscome/mpilattice.jl","cometscome/latticematrices.jl"],"tags_count":31,"template":false,"template_full_name":null,"purl":"pkg:github/cometscome/LatticeMatrices.jl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cometscome%2FLatticeMatrices.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cometscome%2FLatticeMatrices.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/cometscome%2FLatticeMatrices.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cometscome%2FLatticeMatrices.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cometscome","download_url":"https://codeload.github.com/cometscome/LatticeMatrices.jl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cometscome%2FLatticeMatrices.jl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29154003,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-06T02:39:25.012Z","status":"ssl_error","status_checked_at":"2026-02-06T02:37:22.784Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-05T23:59:44.536Z","updated_at":"2026-02-06T07:15:33.169Z","avatar_url":"https://github.com/cometscome.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LatticeMatrices.jl\n\n[![Build Status](https://github.com/cometscome/MPILattice.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/cometscome/MPILattice.jl/actions/workflows/CI.yml?query=branch%3Amain)\n\nHigh-performance **matrix fields on arbitrary D-dimensional lattices** in Julia.\n\n- Per-site matrices (size `NC1×NC2`) stored in 
**column-major layout**:  \n  `(NC1, NC2, X, Y, Z, …)`\n- **MPI** domain decomposition via a Cartesian communicator (halo width `nw`, periodic BCs).\n- **GPU-ready** through **[JACC.jl](https://github.com/JuliaORNL/JACC.jl)** (portable CPU/GPU kernels; CUDA/ROCm/Threads).\n- Fast, allocation-free **indexing helpers** for kernels: `DIndexer`, `linearize`, `delinearize`, `shiftindices`.\n\n\u003e This package focuses on scalable, halo-exchange–based lattice algorithms with minimal allocations and clean multi-backend execution.\n\n**Applications**: This package is designed to support large-scale simulations on structured lattices. A key application area is lattice QCD, where gauge fields and fermion fields are represented as matrix-valued objects on a multi-dimensional lattice. In future developments, LatticeMatrices.jl is planned to be integrated into [Gaugefields.jl](https://github.com/akio-tomiya/Gaugefields.jl) and [LatticeDiracOperators.jl](https://github.com/akio-tomiya/LatticeDiracOperators.jl), providing the underlying data structures and linear algebra kernels for gauge and fermion dynamics.\n\n\n\n**Current limitation.** Multi‑GPU execution and hybrid MPI+threads parallelism are **experimental** and **not yet thoroughly tested**; treat them as provisional.\n\n\n---\n\n## Installation\n\n```julia\npkg\u003e add LatticeMatrices\n```\n\nRequirements:\n- Julia ≥ 1.11\n\n---\n\n## Quick tour\n\n### 1) D-dimensional indexing helpers (GPU-kernel friendly)\n\n```julia\nusing LatticeMatrices\n\n# Build an indexer for a D-dimensional lattice (1-based indices)\ngsize = (16, 16, 16, 16)     # global lattice size\nd = DIndexer(gsize)          # computes row-major \"strides\" internally\n\n# Convert between linear and multi-index (1-based)\nL  = linearize(d, (1, 1, 1, 1))   # -\u003e 1\nix = delinearize(d, 4)            # -\u003e (4, 1, 1, 1) on this shape\n\n# Apply shifts componentwise\np = shiftindices((4, 1, 1, 1), (1, 0, 0, 0))   # -\u003e (5, 1, 1, 
1)\n```\n\n**Signatures**\n```julia\nstruct DIndexer{D,dims,strides} end\nDIndexer(dims_in::NTuple{D,\u003c:Integer}) where {D}\nDIndexer(dims_in::AbstractVector{\u003c:Integer})\n\n# 1-based linearization/delinearization (no heap allocs; GPU-friendly)\nlinearize(::DIndexer{D,dims,strides}, idx::NTuple{D,Int32})::Int32\ndelinearize(::DIndexer{D,dims,strides}, L::Integer, offset::Int32=0)::NTuple{D,Int32}\n\n# elementwise shifting for index tuples\nshiftindices(indices, shift)\n```\n\n- The optional third positional argument of `delinearize` (`offset`) is handy to **map into halo regions**, e.g. pass `Int32(nw)`.\n\n---\n\n### 2) Lattice containers (MPI + halos + JACC arrays)\n\nThe core container stores a **halo-padded** array on each rank and manages halo exchange without MPI derived datatypes (faces are packed into contiguous buffers).\n\n```julia\nusing LatticeMatrices, MPI, JACC, LinearAlgebra\nJACC.@init_backend\nMPI.Init()\n\ndim   = 4\ngsize = ntuple(_ -\u003e 16, dim)   # global spatial size per dimension\nnw    = 1                      # ghost width\nNC    = 3                      # per-site matrix size (NC×NC)\n\n# Choose a Cartesian process grid (PEs) of length `dim`\nnprocs = MPI.Comm_size(MPI.COMM_WORLD)\nn1 = max(nprocs ÷ 2, 1)\nPEs = ntuple(i -\u003e i == 1 ? n1 : (i == 2 ? 
nprocs ÷ n1 : 1), dim)\n\n# Construct an empty lattice matrix (device array via JACC.zeros)\nM = LatticeMatrix(NC, NC, dim, gsize, PEs; nw, elementtype=ComplexF64)\n\n# Or initialize from an existing array (broadcast to ranks)\nA = rand(ComplexF64, NC, NC, gsize...)\nM2 = LatticeMatrix(A, dim, PEs; nw)\n\n# Halo exchange across all spatial dimensions\nset_halo!(M)\n\n# Global gather helpers (host reconstruction on rank 0)\nG = gather_matrix(M; root=0)                # rank 0: Array(NC, NC, gsize...)\nGall = gather_and_bcast_matrix(M; root=0)   # all ranks receive the same Array\n```\n\n**Key type**\n```julia\nstruct LatticeMatrix{D,T,AT,NC1,NC2,nw,DI} \u003c: Lattice{D,T,AT}\n    nw::Int\n    phases::SVector{D,T}         # per-direction phase (applied at wrap boundaries)\n    NC1::Int\n    NC2::Int\n    gsize::NTuple{D,Int}\n    cart::MPI.Comm               # Cartesian communicator\n    coords::NTuple{D,Int}        # 0-based Cartesian coords\n    dims::NTuple{D,Int}          # process grid (PEs)\n    nbr::NTuple{D,NTuple{2,Int}} # neighbors (minus, plus)\n    A::AT                        # local array (NC1, NC2, X, Y, Z, …) with halos\n    buf::Vector{AT}              # four face buffers per spatial dim\n    myrank::Int\n    PN::NTuple{D,Int}            # local interior size per dim (no halos)\n    comm::MPI.Comm               # original communicator\n    indexer::DI                  # DIndexer for global sizes\nend\n```\n\n**Constructors**\n```julia\nLatticeMatrix(NC1, NC2, dim, gsize, PEs;\n              nw=1, elementtype=ComplexF64, phases=ones(dim), comm0=MPI.COMM_WORLD)\n\nLatticeMatrix(A, dim, PEs; nw=1, phases=ones(dim), comm0=MPI.COMM_WORLD)\n```\n\n- **Layout**: `(NC1, NC2, X, Y, Z, …)`; halos are the outer `nw` cells on each spatial dim.  \n- **Phases**: wrap-around phases per dimension (applied on the boundary faces during exchange).  
\n- **Exchange**: `set_halo!(ls)` calls `exchange_dim!(ls, d)` for each spatial dimension `d`.\n\n---\n\n### 3) Linear algebra on lattices\n\nPer-site matrix operations follow BLAS-like semantics. The test suite shows full coverage (plain/adjoint inputs, shifted views):\n\n```julia\n# Random per-site matrices\nA1 = rand(ComplexF64, NC, NC, gsize...)\nA2 = rand(ComplexF64, NC, NC, gsize...)\nA3 = rand(ComplexF64, NC, NC, gsize...)\n\nM1 = LatticeMatrix(NC, NC, dim, gsize, PEs; nw)\nM2 = LatticeMatrix(A2, dim, PEs; nw)\nM3 = LatticeMatrix(A3, dim, PEs; nw)\n\n# Choose a site (using DIndexer + halos)\nindexer = DIndexer(gsize)\nL = 4\nidx_halo = Tuple(delinearize(indexer, L, Int32(nw)))  # with halo offset\nidx_core = Tuple(delinearize(indexer, L, Int32(0)))   # core (no halo)\n\n# Reference (host) product at a single site:\na1 = A1[:, :, idx_core...]\na2 = A2[:, :, idx_core...]\na3 = A3[:, :, idx_core...]\nmul!(a1, a2, a3)\n\n# Lattice product (device-backed); updates M1.A at that site:\nmul!(M1, M2, M3)\nm1 = M1.A[:, :, idx_halo...]\n@assert a1 ≈ m1\n\n# Matrix exponential at each site (in-place):\nexpt!(M1, M2, 1)\nm1 = M1.A[:, :, idx_halo...]\na1 = exp(a2)\n@assert a1 ≈ m1\n\n# Trace and sum over all sites (returns a scalar)\nprintln(tr(M1))\n\n```\n\nAdjoints and **shifted** operands are supported via lightweight wrappers:\n\n```julia\nM2p = Shifted_Lattice(M2, (1, 0, 0, 0))    # shift by +1 along X (periodic)\nmul!(M1, M2p', M3)                          # all combinations in tests:\n                                            # (A, B, C), (A, B', C), (A, B, C'), etc.\n```\n\n**Convenience**\n```julia\n# Reduced sums (interior region only)\ns = allsum(M)   # MPI.Reduce to root (returns the global sum on rank 0)\n```\n\n\n## Examples: matrix multiplication on lattices\n\n### 1) Plain matrix multiplication at each lattice site\n\n```julia\nusing LatticeMatrices, MPI, JACC, LinearAlgebra\nJACC.@init_backend\nMPI.Init()\n\ndim   = 2\ngsize = (8, 8)\nNC    = 3\nPEs   
= (2, 2)          # process grid (2×2)\n\nM1 = LatticeMatrix(NC, NC, dim, gsize, PEs)\nM2 = LatticeMatrix(rand(ComplexF64, NC, NC, gsize...), dim, PEs)\nM3 = LatticeMatrix(rand(ComplexF64, NC, NC, gsize...), dim, PEs)\n\nmul!(M1, M2, M3)        # per-site product: M1 = M2 * M3\n```\n\n### 2) Multiplication with a shifted lattice\n\n```julia\nshift = (1, 0)                  # shift by +1 along X\nM2s = Shifted_Lattice(M2, shift)\n\nmul!(M1, M2s, M3)                # M1 = (M2 shifted) * M3\n```\n\nThe shift is applied with periodic wrapping across the global lattice size.\n\n---\n\n\n\n### 3) Multiplication with conjugate-transposed matrices\n\n```julia\nmul!(M1, M2', M3)                # M1 = adjoint(M2) * M3\nmul!(M1, M2, M3')                # M1 = M2 * adjoint(M3)\nmul!(M1, M2', M3')               # M1 = adjoint(M2) * adjoint(M3)\n```\n\nAll combinations of shifted and adjoint operands are supported and tested in `test/runtests.jl`.\n\n---\n\n## Automatic differentiation (Enzyme)\n\n(above v0.3: experimental) We provide Enzyme-based AD extensions and test cases. See `test/adtest/ad.jl` for a concrete comparison between\nautomatic differentiation and numerical differentiation using `calc_action_loopfn`. The loop body is factored\ninto a small helper function (`_calc_action_step!`), which makes Enzyme AD more reliable for loop-heavy code.\n\nExample (runs the AD vs numerical comparison with `calc_action_loopfn`):\n\n```julia\nusing Enzyme\nusing LatticeMatrices, MPI, JACC\nJACC.@init_backend\nMPI.Init()\n\ninclude(\"test/adtest/ad.jl\") # runs main() in the script\n```\n\nNote: the AD result here follows Enzyme's complex differentiation convention. 
For a complex variable\n`U = X + iY`, the gradient reported by Enzyme is\n`dS/dUij = dS/dXij + i dS/dYij`.\n\n\n---\n\n## Running the test example\n\nExactly what `test/runtests.jl` does:\n\n```bash\n# CPU single process\njulia --project -e 'using Pkg; Pkg.test(\"LatticeMatrices\")'\n\n# MPI (choose ranks and an MPI launcher)\nmpiexec -n 4 julia --project test/runtests.jl\n\n# With GPUs (example; make sure CUDA/ROCm works and select a JACC backend)\njulia --project -e 'using JACC; JACC.@init_backend; using Pkg; Pkg.test()'\n```\n\nInternally, the tests:\n- sweep `dim = 1:4` and `NC = 2:4`,\n- construct `LatticeMatrix` objects on a Cartesian grid `PEs`,\n- verify `mul!` for all nine combinations with/without adjoint and with/without shifts,\n- use `DIndexer` to map between linear and multi-indices, including halo offsets.\n\n---\n\n## API reference (selected)\n\n```julia\n# Indexing\nDIndexer(::NTuple{D,\u003c:Integer})\nDIndexer(::AbstractVector{\u003c:Integer})\nlinearize(::DIndexer{D,dims,strides}, ::NTuple{D,Int32})::Int32\ndelinearize(::DIndexer{D,dims,strides}, ::Integer, ::Int32=0)::NTuple{D,Int32}\nshiftindices(indices, shift)\n\n# Lattice\nLatticeMatrix(NC1, NC2, dim, gsize, PEs; nw=1, elementtype=ComplexF64,\n              phases=ones(dim), comm0=MPI.COMM_WORLD)\nLatticeMatrix(A, dim, PEs; nw=1, phases=ones(dim), comm0=MPI.COMM_WORLD)\n\nset_halo!(ls)\nexchange_dim!(ls, d::Int)\n\ngather_matrix(ls; root=0)::Union{Array{T},Nothing}\ngather_and_bcast_matrix(ls; root=0)::Array{T}\n\nallsum(ls)  # Reduce(SUM) to root over interior\n\n# Lightweight wrappers\nstruct Shifted_Lattice{D,shift}; data::D; end\nstruct Adjoint_Lattice{D};       data::D; end\n# Base.adjoint(::Lattice) and Base.adjoint(::Shifted_Lattice) return Adjoint_Lattice\n```\n\n---\n\n\n## License\n\nMIT (see `LICENSE`).\n\n---\n\n## Acknowledgements\n\nBuilt on the excellent Julia HPC stack: **MPI.jl**, **JACC.jl**, and the Julia standard libraries.\n\n---\n\n### References\n\n- MPI.jl: 
https://github.com/JuliaParallel/MPI.jl  \n- JACC.jl: https://github.com/JuliaORNL/JACC.jl\n\n\n\n---\n\n## Selecting \u0026 switching GPU/CPU backends (via JACC.jl)\n\nLatticeMatrices.jl uses [JACC.jl] for performance‑portable execution. Follow JACC’s\nrecommended flow to select **one** backend per project/session:\n\n1) **Set a backend** (writes/updates `LocalPreferences.toml` and adds the backend package):\n```julia\njulia\u003e import JACC\njulia\u003e JACC.set_backend(\"cuda\")     # or \"amdgpu\" or \"threads\" (default)\n```\n2) **Initialize at top level** so your code doesn’t need backend‑specific imports:\n```julia\nimport JACC\nJACC.@init_backend                  # must be at top-level scope\n```\n\n3) **Switching backends.** Re-run `JACC.set_backend(\"amdgpu\")` (or `\"threads\"`, `\"cuda\"`) in the same project to switch; this updates `LocalPreferences.toml`. Restart your Julia session so extensions load for the new backend, then call `JACC.@init_backend` again.\n\n\u003e Notes:\n\u003e - Without calling `@init_backend`, using a non-`\"threads\"` backend will raise\n\u003e   errors like `get_backend(::Val(:cuda))` when invoking JACC functions.\n\u003e - `JACC.array` / `JACC.array_type()` help you stay backend‑agnostic in your APIs.\n\n\nReferences: JACC quick start and usage in the upstream README.  \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcometscome%2Flatticematrices.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcometscome%2Flatticematrices.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcometscome%2Flatticematrices.jl/lists"}