https://github.com/under-peter/omeinsum.jl
One More Einsum for Julia! With runtime order-specification and high-level adjoints for AD
- Host: GitHub
- URL: https://github.com/under-peter/omeinsum.jl
- Owner: under-Peter
- License: mit
- Created: 2019-05-11T15:56:47.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2025-10-06T02:21:59.000Z (8 months ago)
- Last Synced: 2025-11-21T16:16:49.977Z (7 months ago)
- Topics: automatic-differentiation, contraction, einsum
- Language: Julia
- Homepage: https://under-peter.github.io/OMEinsum.jl/dev/
- Size: 2.41 MB
- Stars: 203
- Watchers: 5
- Forks: 29
- Open Issues: 11
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# OMEinsum - One More Einsum
[Documentation](https://under-Peter.github.io/OMEinsum.jl/dev)
[CI](https://github.com/under-Peter/OMEinsum.jl/actions/workflows/ci.yml)
[Coverage](https://codecov.io/gh/under-Peter/OMEinsum.jl)
This is a repository for the _Google Summer of Code_ project on *Differentiable Tensor Networks*.
It implements one function that both computer scientists and physicists love: the *Einstein summation*.

To find out the details about einsum, please check out my [nextjournal-article](https://nextjournal.com/under-Peter/julia-summer-of-einsum) or the [numpy-manual](https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html).
Einstein summation can be implemented in no more than 20 lines of Julia code, and its automatic differentiation is also [straightforward](https://giggleliu.github.io/2019/04/02/einsumbp.html). The main effort of this package goes into improving [performance](https://github.com/under-Peter/OMEinsum-Benchmarks) by using Julia's [multiple dispatch on traits](https://white.ucc.asn.au/2018/10/03/Dispatch,-Traits-and-Metaprogramming-Over-Reflection.html), so that people can enjoy the speed of faster specialized implementations such as BLAS functions, `sum` and `permutedims` on both CPU and GPU without suffering from runtime overhead.
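To make the "20 lines" claim concrete, here is a minimal sketch of a loop-based einsum in plain Julia. It is an illustration only, not the package's implementation: the name `naive_einsum` and its signature are hypothetical, and it visits every combination of label values, so it is only suitable for small tensors. The last lines check that the adjoint of a matrix-multiplication einsum is itself an einsum with the labels rearranged.

```julia
# Naive reference einsum: visit every assignment of the index labels.
# `ixs` holds one label string per input tensor, `iy` the output labels.
function naive_einsum(ixs, iy, xs)
    # Collect the size of each label from the input tensors.
    sizes = Dict{Char,Int}()
    for (ix, x) in zip(ixs, xs), (label, n) in zip(ix, size(x))
        get!(sizes, label, n) == n || error("inconsistent size for label $label")
    end
    labels = collect(keys(sizes))
    T = promote_type(map(eltype, xs)...)
    out = zeros(T, (sizes[l] for l in iy)...)
    # Accumulate the product of the input entries into the output entry.
    for vals in Iterators.product((1:sizes[l] for l in labels)...)
        at = Dict(zip(labels, vals))
        term = prod(x[(at[l] for l in ix)...] for (ix, x) in zip(ixs, xs))
        out[(at[l] for l in iy)...] += term
    end
    return out
end

a, b = rand(2, 3), rand(3, 4)
c = naive_einsum(("ij", "jk"), "ik", (a, b))    # matrix multiplication
@assert c ≈ a * b

# Adjoint with respect to `a`: given an output cotangent `dc`,
# the gradient is the einsum "ik,jk->ij" applied to (dc, b).
dc = rand(2, 4)
da = naive_einsum(("ik", "jk"), "ij", (dc, b))
@assert da ≈ dc * b'
```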
*Note: why the test coverage is not 100%* - GPU-code coverage is not evaluated, although we test the GPU code properly on GitLab. Ignoring the GPU code, the actual coverage is about _97%_.
*Warning: since v0.4, OMEinsum does not optimize the contraction order anymore. One has to use nested einsum to specify the contraction order manually, e.g. `ein"(ijk,jkl),klm->im"(x, y, z)`.* Please check out the [documentation](https://under-Peter.github.io/OMEinsum.jl/dev/contractionorder/) for more details.
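As a small sketch of what the warning means in practice, the snippet below contracts the same three matrices with a flat code and with a nested code that fixes the order by hand. The matrix names and sizes are arbitrary choices for illustration; both forms should give the same result and differ only in how the contraction is carried out.

```julia
using OMEinsum

x, y, z = randn(4, 5), randn(5, 6), randn(6, 3)

# Flat code: all three tensors are handled in a single einsum call.
flat = ein"ij,jk,kl->il"(x, y, z)

# Nested code: contract `x` with `y` first, then the result with `z`.
nested = ein"(ij,jk),kl->il"(x, y, z)

@assert flat ≈ nested
@assert nested ≈ x * y * z
```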
## Install
To install, type `]` in a Julia (>=1.5) REPL and then enter
```julia pkg
pkg> add OMEinsum
```
## Learn by Examples
To avoid runtime overhead, we recommend using the [non-standard string literal](https://docs.julialang.org/en/v1/manual/metaprogramming/#Non-Standard-String-Literals-1) `@ein_str`. The following examples illustrate how `einsum` works:
```julia
julia> using OMEinsum, SymEngine
julia> catty = fill(Basic(:🐱), 2, 2)
2×2 Array{Basic,2}:
🐱 🐱
🐱 🐱
julia> fish = fill(Basic(:🐟), 2, 3, 2)
2×3×2 Array{Basic,3}:
[:, :, 1] =
🐟 🐟 🐟
🐟 🐟 🐟
[:, :, 2] =
🐟 🐟 🐟
🐟 🐟 🐟
julia> snake = fill(Basic(:🐍), 3, 3)
3×3 Array{Basic,2}:
🐍 🐍 🐍
🐍 🐍 🐍
🐍 🐍 🐍
julia> medicine = ein"ij,jki,kk->k"(catty, fish, snake)
3-element Array{Basic,1}:
4*🐱*🐍*🐟
4*🐱*🐍*🐟
4*🐱*🐍*🐟
julia> ein"ik,kj -> ij"(catty, catty) # matrix multiplication (here `catty` times itself)
2×2 Array{Basic,2}:
2*🐱^2 2*🐱^2
2*🐱^2 2*🐱^2
julia> ein"ij -> "(catty)[] # sum a matrix, output 0-dimensional array
4*🐱
julia> ein"->ii"(asarray(snake[1,1]), size_info=Dict('i'=>5)) # get 5 x 5 diagonal matrix (a scaled identity)
5×5 Array{Basic,2}:
🐍 0 0 0 0
0 🐍 0 0 0
0 0 🐍 0 0
0 0 0 🐍 0
0 0 0 0 🐍
```
Alternatively, one can specify the contraction with a construction approach, which is useful when the contraction code can only be obtained at run time:
```julia
julia> a, b = randn(2, 2), randn(2, 2);
julia> einsum(EinCode((('i','k'),('k','j')),('i','j')),(a,b))
```
or use the macro-based interface `@ein`,
which is closer to the standard way of writing einsum operations in physics:
```julia
julia> @ein c[i,j] := a[i,k] * b[k,j];
```
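The construction approach can be sketched for a contraction that arrives as a string at run time. The helper `parse_eincode` below is hypothetical (it is not part of OMEinsum) and assumes that `EinCode` accepts a tuple of label tuples plus an output label tuple, as in the call shown above. The final check compares it against the string-literal interface.

```julia
using OMEinsum

# Hypothetical helper: turn a runtime string such as "ik,kj->ij"
# into an EinCode built from tuples of Char labels.
function parse_eincode(spec::AbstractString)
    inputs, output = split(spec, "->")
    ixs = Tuple(Tuple(ix) for ix in split(inputs, ","))
    return EinCode(ixs, Tuple(output))
end

a, b = randn(3, 4), randn(4, 5)

spec = "ik,kj->ij"              # imagine this was read from user input
code = parse_eincode(spec)
c_runtime = einsum(code, (a, b))

# Same contraction through the string literal, which is parsed at compile time.
c_literal = ein"ik,kj->ij"(a, b)
@assert c_runtime ≈ c_literal
```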
It is sometimes helpful to specify the order of operations, by inserting brackets, either because you know this will be more efficient, or to help the computer see what kernels can be used. For example:
```julia
julia> x, W, y = randn(10, 50), randn(20, 10, 10), randn(10, 50);
julia> @ein Z[o,s] := x[i,s] * (W[o,i,j] * y[j,s]); # macro style
julia> Z = ein"is, (oij, js) -> os"(x, W, y); # string style
```
This performs a matrix multiplication (summing over `j`)
followed by a batched matrix multiplication (summing over `i`, with batch label `s`).
Without the brackets, it instead uses the fallback `loop_einsum`, which is slower.
Calling `allow_loops(false)` will print an error to help you spot such cases:
```julia
julia> @ein Zl[o,s] := x[i,s] * W[o,i,j] * y[j,s];
julia> Z ≈ Zl
true
julia> allow_loops(false);
julia> Zl = ein"is, oij, js -> os"(x, W, y);
┌ Error: using `loop_einsum` to evaluate
│ code = is, oij, js -> os
│ size.(xs) = ((10, 50), (20, 10, 10), (10, 50))
│ size(y) = (20, 50)
└ @ OMEinsum ~/.julia/dev/OMEinsum/src/loop_einsum.jl:26
```
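To see what the bracketed form does, the two pairwise steps can be written out by hand. This is a sketch using the tensor sizes from the error message above; the intermediate name `tmp` is an arbitrary choice for illustration.

```julia
using OMEinsum

x, W, y = randn(10, 50), randn(20, 10, 10), randn(10, 50)

# Step 1: contract `W` with `y`, summing over `j`.
tmp = ein"oij,js->ois"(W, y)        # size (20, 10, 50)

# Step 2: contract `x` with the intermediate, summing over `i`
# while `s` acts as a batch label.
Z_steps = ein"is,ois->os"(x, tmp)   # size (20, 50)

# The nested string code expresses the same two steps in one call.
Z_nested = ein"is,(oij,js)->os"(x, W, y)
@assert Z_steps ≈ Z_nested
```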
## GPU Acceleration
OMEinsum supports CUDA GPU acceleration with two backends:
| Backend | Library | Best For |
|---------|---------|----------|
| `DefaultBackend()` | CUBLAS | Matrix-like contractions |
| `CuTensorBackend()` | cuTENSOR | General tensor contractions |
```julia
using CUDA
using OMEinsum
A = CUDA.rand(Float32, 100, 200)
B = CUDA.rand(Float32, 200, 300)
# Default backend (CUBLAS) - good for matrix-like operations
C = ein"ij,jk->ik"(A, B)
```
For better performance on non-GEMM patterns (tensor networks, etc.), use the cuTENSOR backend:
```julia
using CUDA
using cuTENSOR # Pkg.add("cuTENSOR") - loads the CuTENSORExt extension
using OMEinsum
set_einsum_backend!(CuTensorBackend())
C = ein"ijk,jkl->il"(CUDA.rand(Float32, 64, 64, 64), CUDA.rand(Float32, 64, 64, 64))
```
The cuTENSOR backend provides native tensor contraction without reshape/permute overhead. See the [CUDA documentation](https://under-Peter.github.io/OMEinsum.jl/dev/cuda/) for details.
## Comparison with other packages
Similar packages include:
- [TensorOperations.jl](https://github.com/Jutho/TensorOperations.jl) and [TensorKit.jl](https://github.com/Jutho/TensorKit.jl)
- [ITensors.jl](https://github.com/ITensor/ITensors.jl)
Compared with the above packages, `OMEinsum` is optimized for large-scale tensor network (or einsum, sum-product network) contraction.
## Contribute
Suggestions and comments in the [_Issues_](https://github.com/under-Peter/OMEinsum.jl/issues) are welcome.