{"id":13712646,"url":"https://github.com/cyclops-community/ctf","last_synced_at":"2025-05-06T22:31:23.209Z","repository":{"id":12474753,"uuid":"15142240","full_name":"cyclops-community/ctf","owner":"cyclops-community","description":"Cyclops Tensor Framework: parallel arithmetic on multidimensional arrays","archived":false,"fork":false,"pushed_at":"2024-08-08T08:03:02.000Z","size":7333,"stargazers_count":203,"open_issues_count":51,"forks_count":54,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-05-06T18:14:34.585Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cyclops-community.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2013-12-12T17:13:07.000Z","updated_at":"2025-02-17T21:02:04.000Z","dependencies_parsed_at":"2024-05-28T01:37:19.850Z","dependency_job_id":"1846467a-1ac6-480c-88a1-a82e4be0d731","html_url":"https://github.com/cyclops-community/ctf","commit_stats":null,"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyclops-community%2Fctf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyclops-community%2Fctf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyclops-community%2Fctf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyclops-community%2Fctf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/o
wners/cyclops-community","download_url":"https://codeload.github.com/cyclops-community/ctf/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252779056,"owners_count":21802870,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T23:01:20.909Z","updated_at":"2025-05-06T22:31:21.901Z","avatar_url":"https://github.com/cyclops-community.png","language":"C++","funding_links":[],"categories":["C++","Linear Algebra / Statistics Toolkit","numerical tools"],"sub_categories":["General Purpose Tensor Library"],"readme":"## Cyclops Tensor Framework (CTF)\n\nCyclops is a parallel (distributed-memory) numerical library for multidimensional arrays (tensors) in C++ and Python.\n\nQuick documentation links: [C++](http://solomon2.web.engr.illinois.edu/ctf/index.html) and [Python](http://solomon2.web.engr.illinois.edu/ctf_python/ctf.html#module-ctf.core).\n\nBroadly, Cyclops provides tensor objects that are stored and operated on by all processes executing the program, coordinating via MPI communication.\n\nCyclops supports a multitude of tensor attributes, including sparsity, various symmetries, and user-defined element types.\n\nThe library is interoperable with ScaLAPACK at the C++ level and with numpy at the Python level. In Python, the library provides a parallel/sparse implementation of `numpy.ndarray` functionality.\n\n## Building and Testing\n\nSee the [Github Wiki](https://github.com/cyclops-community/ctf/wiki/Building-and-testing) for more details on this. 
It is possible to build static and dynamic C++ libraries, the Python CTF library, as well as examples and tests for both via this repository. Cyclops follows the basic installation convention,\n```sh\n./configure\nmake\nmake install\n```\n(where the last command should usually be executed as superuser, i.e. requires `sudo`). Below we give more details on how the build can be customized.\n\nFirst, it is necessary to run the configure script, which can be set to the appropriate type of build and is responsible for obtaining and checking for any necessary dependencies. For options and documentation on how to execute configure, run\n```sh\n./configure --help\n```\nthen execute `./configure` with the appropriate options. Successful execution of this script will generate a `config.mk` file and a `setup.py` file, needed for C++ and Python builds, respectively, as well as a how-did-i-configure file with info on how the build was configured. You may modify the `config.mk` and `setup.py` files thereafter; subsequent executions of configure will prompt to overwrite these files.\n\nNote: there is a (now-fixed) [bug](https://github.com/pmodels/mpich/pull/6543) in recent versions of MPICH that causes a segmentation fault in CTF when executing with 2 or more processors.\nThe bug can be remedied without rebuilding CTF by setting an environment variable as follows,\n```sh\nexport MPIR_CVAR_DEVICE_COLLECTIVES=none\n```\n\n### Dependencies and Supplemental Packages\n\nThe strict library dependencies of Cyclops are MPI and BLAS libraries.\n\nSome functionality in Cyclops requires LAPACK and ScaLAPACK. A standard build of the latter can be constructed automatically by running configure with `--build-scalapack` (requires cmake to build ScaLAPACK; a manual build can also be supplied by providing the library path).\n\nFaster transposition in Cyclops is made possible by the HPTT library. 
To obtain a build of HPTT automatically, run configure with `--build-hptt`.\n\nEfficient sparse matrix multiplication primitives and efficient batched BLAS primitives are available via the Intel MKL library, which is automatically detected for standard Intel compiler configurations or when appropriately supplied as a library.\n\n### Building and Installing the Libraries\n\nOnce configured, you may build both the static and shared libraries by running `make`. Parallel make is supported.\n\nTo build exclusively the static library, run `make libctf`; to build exclusively the shared library, run `make shared`.\n\nTo install the C++ libraries to the prespecified build destination directory (`--build-dir` for `./configure`, `/usr/local/` by default), run `make install` (as superuser if necessary). If the CTF configure script built the ScaLAPACK and/or HPTT libraries automatically, the libraries for these will need to be installed system-wide manually.\n\nTo build the Python CTF library, execute `make python`.\n\nTo install the Python CTF library via pip, execute `make python_install` (as superuser if not in a virtual environment).\n\nTo uninstall, use `make uninstall` and `make python_uninstall`.\n\n### Testing the Libraries\n\nTo test the C++ library with a sequential suite of tests, run `make test`. To test the library using 2 processors, execute `make test2`. 
To test the library using some number N of processors, run `make testN`.\n\nTo test the Python library, run `make python_test` to do so sequentially and `make python_testN` to do so with N processors.\n\nTo debug issues with custom code execution correctness, build CTF libraries with `-DDEBUG=1 -DVERBOSE=1` (more info in `config.mk`).\n\nTo debug issues with custom code performance, build CTF libraries with `-DPROFILE -DPMPI` (more info in `config.mk`), which should lead to a performance log dump at the end of an execution of a code using CTF.\n\n\n## Sample C++ Code and Minimal Tutorial\n\nA simple Jacobi iteration code using CTF is given below, also found in [this example](examples/jacobi.cxx).\n\n```cpp\nVector\u003c\u003e Jacobi(Matrix\u003c\u003e A, Vector\u003c\u003e b, int n){\n  Matrix\u003c\u003e R(A);\n  R[\"ii\"] = 0.0;\n  Vector\u003c\u003e x(n), d(n), r(n);\n  Function\u003c\u003e inv([]( double \u0026 d ){ return 1./d; });\n  d[\"i\"] = inv(A[\"ii\"]); // set d to inverse of diagonal of A\n  do {\n    x[\"i\"] = d[\"i\"]*(b[\"i\"]-R[\"ij\"]*x[\"j\"]);\n    r[\"i\"] = b[\"i\"]-A[\"ij\"]*x[\"j\"]; // compute residual\n  } while ( r.norm2() \u003e 1.E-6); // check for convergence\n  return x;\n}\n```\n\nThe above Jacobi function accepts an n-by-n matrix A and an n-dimensional vector b containing double precision floating point elements and solves Ax=b using Jacobi iteration. The matrix R is defined to be a copy of the data in A. Its diagonal is subsequently set to 0, while the diagonal of A is extracted into d and inverted. A do-while loop then computes the Jacobi iteration by use of matrix-vector multiplication, vector addition, and vector Hadamard products.\n\nThis Jacobi code uses Vector and Matrix objects, which are specializations of the Tensor object. 
Each of these is a distributed data structure, which is partitioned across an MPI communicator.\n\nThe key illustrative part of the above example is\n```cpp\nx[\"i\"] = d[\"i\"]*(b[\"i\"]-R[\"ij\"]*x[\"j\"]);\n```\nTo evaluate this expression, CTF would execute the following set of loops, each in parallel,\n```cpp\ndouble y[n];\nfor (int i=0; i\u003cn; i++){\n  y[i] = 0.0;\n  for (int j=0; j\u003cn; j++){\n    y[i] += R[i][j]*x[j];\n  }\n}\nfor (int i=0; i\u003cn; i++) \n  y[i] = b[i]-y[i];\nfor (int i=0; i\u003cn; i++) \n  x[i] = d[i]*y[i];\n```\nThis parallelization is done without any programming language constructs outside of basic C++. Operator overloading is used to interpret the indexing strings and build an expression tree. Each operation is then evaluated by an appropriate parallelization. Locally, BLAS kernels are used when possible. \n\nIt is also worth noting the different ways indices are employed in the above example. In the matrix-vector multiplication,\n```cpp\ny[\"i\"]=R[\"ij\"]*b[\"j\"]\n```\nthe index j is contracted (summed) over, as it does not appear in the output, while the index i appears in one of the operands and the output. Note that a different semantic occurs if an index appears in two operands and the result,\n```cpp\nx[\"i\"]=d[\"i\"]*y[\"i\"];\n```\nThis is most often referred to as a Hadamard product. In general, CTF can execute any operation of the form,\n```cpp\nC[\"...\"] += A[\"...\"]*B[\"...\"];\n```\nso long as the length of each of the three strings matches the order of the tensors. 
The operation can be interpreted by looping over all unique characters that appear in the union of the three strings, and putting the specified multiply-add operation in the innermost loop.\n\nAnother piece of functionality employed in the Jacobi example is the Function object and its application,\n```cpp\nFunction\u003c\u003e inv([]( double \u0026 d ){ return 1./d; });\nd[\"i\"] = inv(A[\"ii\"]); // set d to inverse of diagonal of A\n```\nThe same code could have been written even more concisely, by applying an unnamed Function object directly,\n```cpp\nd[\"i\"] = Function\u003c\u003e([]( double \u0026 d ){ return 1./d; })(A[\"ii\"]);\n```\nThis syntax defines and employs an elementwise function that inverts each element of A to which it is applied. The operation executes the following loop,\n```cpp\nfor (int i=0; i\u003cn; i++)\n  d[i] = 1./A[i][i];\n```\nIn this way, arbitrary functions can be applied to elements of the tensors. There are additional 'algebraic structure' constructs that allow a redefinition of addition and multiplication for any given tensor. Finally, there is a Transform object, which acts in a similar way as Function, but takes the output element by reference. The Transform construct is more powerful than Function, but limits the transformations that can be applied internally for efficiency. Both Function and Transform can operate on one or two tensors with different element types, and output another element type. \n\n## Sample Python Jupyter Notebook\n\nAn example of basic CTF functionality as a `numpy.ndarray` back-end is shown in this [Jupyter notebook](http://solomonik.cs.illinois.edu/demos/CTF_introductory_demo.html). The full notebook is included inside the `doc` folder.\n\n## Documentation\n\nDetailed documentation of all functionality and the organization of the source code can be found in the [Doxygen page](http://solomon2.web.engr.illinois.edu/ctf/index.html). 
Much of the C++ functionality is expressed through the [Tensor object](http://solomon2.web.engr.illinois.edu/ctf/classCTF_1_1Tensor.html). Documentation for the Python functionality is also [available](http://solomon2.web.engr.illinois.edu/ctf_python/ctf.html#module-ctf.core).\n\nThe examples and aforementioned papers can be used to gain further insight. If you have any questions regarding usage, do not hesitate to contact us! Please do so by creating an issue on this GitHub page. You can also email questions to solomon2@illinois.edu.\n\n## Performance\n\nPlease see the aforementioned papers for various applications and benchmarks, which are also summarized in [this recent presentation](http://solomon2.web.engr.illinois.edu/talks/istcp_jul22_2016.pdf). Generally, the distributed-memory dense and sparse matrix multiplication performance should be very good. Similar performance is achieved for many types of contractions. CTF can leverage threading, but is fastest with pure MPI or hybrid MPI+OpenMP. The code aims at scalability to a large number of processors by minimizing communication cost, rather than necessarily achieving perfect absolute performance. User-defined functions naturally inhibit the sequential kernel performance. Algorithms that have a low flop-to-byte ratio may not achieve memory-bandwidth peak as some copying/transposition may take place. Absolute performance of operations that have Hadamard indices is relatively low for the time being, but will be improved.\n\n\n## Sample Existing Applications\n\nThe CTF library enables general-purpose programming, but is particularly useful in some specific application domains.\n\n### Numerical Methods Based on Tensor Contractions\n\nThe framework provides a powerful domain specific language for computational chemistry and physics codes that work on higher order tensors (e.g. coupled cluster, lattice QCD, quantum informatics). 
See the [CCSD](examples/ccsd.cxx) and [sparse MP3](examples/sparse_mp3.cxx) examples, or the [Aquarius](https://github.com/devinamatthews/aquarius) application. This domain was the initial motivation for the development of CTF. An exemplary paper for this type of application is\n\n[Edgar Solomonik, Devin Matthews, Jeff R. Hammond, John F. Stanton, and James Demmel; A massively parallel tensor contraction framework for coupled-cluster computations; Journal of Parallel and Distributed Computing, June 2014.](http://www.sciencedirect.com/science/article/pii/S074373151400104X)\n\n### Algebraic Graph Algorithms\n\nMuch like the [CombBLAS](http://gauss.cs.ucsb.edu/~aydin/CombBLAS/html/) library, CTF provides sparse matrix primitives that enable development of parallel graph algorithms. Matrices and vectors can be defined on a user-defined semiring or monoid. Multiplication of sparse matrices with sparse vectors or other sparse matrices is parallelized automatically. The library includes example code for [Bellman-Ford](examples/sssp.cxx) and [betweenness centrality](examples/btwn_central.cxx). A paper describing and analyzing the betweenness centrality code is\n\n[Edgar Solomonik, Maciej Besta, Flavio Vella, and Torsten Hoefler; Scaling betweenness centrality using communication-efficient sparse matrix multiplication; ACM/IEEE Supercomputing Conference, Denver, Colorado, November 2017.](https://arxiv.org/abs/1609.07008)\n\n### Prototyping of Parallel Numerical Algorithms\n\nThe high-level abstractions for vectors, matrices, and order 3+ tensors allow CTF to be a useful tool for developing many numerical algorithms. Interface hook-ups for ScaLAPACK make coordination with distributed-memory solvers easy. Algorithms like [algebraic multigrid](examples/algebraic_multigrid.cxx), which require sparse matrix multiplication, can be rapidly implemented with CTF. 
Further, higher order tensors can be used to express recursive algorithms like [parallel scans](examples/scan.cxx) and [FFT](examples/fft.cxx). Some basic examples of numerical codes using CTF are presented in\n\n[Edgar Solomonik and Torsten Hoefler; Sparse tensor algebra as a parallel programming model; arXiv preprint, arXiv:1512.00066 [cs.MS], November 2015.](http://arxiv.org/abs/1512.00066)\n\n### Quantum Circuit Simulation\n\nCTF has been recently used to do the largest-ever quantum circuit simulation,\n\n[Edwin Pednault, John A. Gunnels, Giacomo Nannicini, Lior Horesh, Thomas Magerlein, Edgar Solomonik, and Robert Wisnieff; Breaking the 49-qubit barrier in the simulation of quantum circuits; arXiv:1710.05867 [quant-ph], October 2017.](https://arxiv.org/abs/1710.05867)\n\n\n## Alternative Frameworks\n\n[Elemental](http://libelemental.org/) and [ScaLAPACK](http://www.netlib.org/scalapack/) provide distributed-memory support for dense matrix operations in addition to a powerful suite of solver routines. It is also possible to interface them with CTF; in particular, we provide routines for retrieving a ScaLAPACK descriptor.\n\nA faster library for dense tensor contractions in shared memory is [Libtensor](https://github.com/epifanovsky/libtensor). \n\nAn excellent distributed-memory library with native support for block-sparse tensors is [TiledArray](https://github.com/ValeevGroup/tiledarray).\n\n\n## Acknowledging Usage\n\nThe library and source code are available to everyone. If you would like to acknowledge the usage of the library, please cite one of our papers. The following reference details dense tensor functionality in CTF,\n\nEdgar Solomonik, Devin Matthews, Jeff R. Hammond, John F. 
Stanton, and James Demmel; A massively parallel tensor contraction framework for coupled-cluster computations; Journal of Parallel and Distributed Computing, June 2014.\n\nhere is the [bibtex](http://solomon2.web.engr.illinois.edu/bibtex/SMHSD_JPDC_2014.txt).\n\nWe hope you enjoy writing your parallel program with algebra!\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyclops-community%2Fctf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyclops-community%2Fctf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyclops-community%2Fctf/lists"}