{"id":22096667,"url":"https://github.com/libmir/mir-glas","last_synced_at":"2025-04-09T23:25:26.596Z","repository":{"id":146907733,"uuid":"70914319","full_name":"libmir/mir-glas","owner":"libmir","description":"[Experimental] LLVM-accelerated Generic Linear Algebra Subprograms","archived":false,"fork":false,"pushed_at":"2022-08-19T10:54:07.000Z","size":2586,"stargazers_count":102,"open_issues_count":7,"forks_count":10,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-03-24T01:17:40.375Z","etag":null,"topics":["algebra","blas","glas","lapack","linear-algebra-subprograms","matrix","matrix-multiplication","simd"],"latest_commit_sha":null,"homepage":"","language":"D","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/libmir.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE_1_0.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-10-14T13:48:05.000Z","updated_at":"2024-12-03T18:21:41.000Z","dependencies_parsed_at":null,"dependency_job_id":"8a6e219c-44b3-40e1-a73a-5700603285b3","html_url":"https://github.com/libmir/mir-glas","commit_stats":null,"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libmir%2Fmir-glas","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libmir%2Fmir-glas/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libmir%2Fmir-glas/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libmir%2Fmir-glas/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/owners/libmir","download_url":"https://codeload.github.com/libmir/mir-glas/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248127617,"owners_count":21052258,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algebra","blas","glas","lapack","linear-algebra-subprograms","matrix","matrix-multiplication","simd"],"created_at":"2024-12-01T04:12:04.755Z","updated_at":"2025-04-09T23:25:26.573Z","avatar_url":"https://github.com/libmir.png","language":"D","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Dub downloads](https://img.shields.io/dub/dt/mir-glas.svg)](http://code.dlang.org/packages/mir-glas)\n[![License](https://img.shields.io/dub/l/mir-glas.svg)](http://code.dlang.org/packages/mir-glas)\n[![Gitter](https://img.shields.io/gitter/room/libmir/public.svg)](https://gitter.im/libmir/public)\n\n[![Latest version](https://img.shields.io/dub/v/mir-glas.svg)](http://code.dlang.org/packages/mir-glas)\n\n[![Circle CI](https://circleci.com/gh/libmir/mir-glas.svg?style=svg)](https://circleci.com/gh/libmir/mir-glas)\n[![Build Status](https://travis-ci.org/libmir/mir-glas.svg?branch=master)](https://travis-ci.org/libmir/mir-glas)\n\n[![Benchmarks](http://blog.mir.dlang.io/images/bench_csingle.svg)](http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/glas-gemm-benchmark.html)\n\n# glas\nLLVM-accelerated Generic Linear Algebra Subprograms (GLAS)\n\n## Description\nGLAS is a C library written in Dlang. 
No C++/D runtime is required, only libc, which is available everywhere.\n\nThe library provides\n\n 1. [BLAS](http://netlib.org/blas/) (Basic Linear Algebra Subprograms) API.\n 2. GLAS (Generic Linear Algebra Subprograms) API.\n\nCBLAS API can be provided by linking with [Netlib's CBLAS](http://netlib.org/blas/#_cblas) library.\n\n## dub\n\nGLAS can be used with DMD and LDC, but \n[LDC (LLVM D Compiler)](https://github.com/ldc-developers/ldc) \u003e= `1.1.0 beta 6` must be installed in a common path either way.\n\nNote the performance issue https://github.com/libmir/mir-glas/issues/18.\n\nGLAS can be included automatically in a project using [dub](http://code.dlang.org/) (the D package manager).\nDUB will build GLAS and CPUID manually with LDC.\n\n```json\n{\n   ...\n   \"dependencies\": {\n      \"mir-glas\": \"~\u003e\u003ccurrent_mir-glas_version\u003e\",\n      \"mir-cpuid\": \"~\u003e\u003ccurrent_mir-cpuid_version\u003e\"\n   },\n   \"lflags\": [\"-L$MIR_GLAS_PACKAGE_DIR\", \"-L$MIR_CPUID_PACKAGE_DIR\"]\n}\n```\n\n`$MIR_GLAS_PACKAGE_DIR` and `$MIR_CPUID_PACKAGE_DIR` will be replaced automatically by DUB with the appropriate directories.\n\n## Usage\n\n`mir-glas` can be used like a common C library. It should be linked with `mir-cpuid`.\nA compiler, for example GCC, may require `mir-cpuid` to be passed after `mir-glas`: `-lmir-glas -lmir-cpuid`.\n\n### GLAS API\n\nGLAS API is based on the new `ndslice` from [mir-algorithm](https://github.com/libmir/mir-algorithm).\nOther languages can use a simple structure definition.\n[Examples](examples/) are available for C and for Dlang.\n\n### Headers\n\nC/C++ headers are located in [`include/`](include/).\nD headers are located in [`source/`](source/).\n\nThere are two files:\n\n 1. `glas/fortran.h` / `glas/fortran.d` - for Netlib's BLAS API\n 2. 
`glas/ndslice.h` / `glas/ndslice.d` - for GLAS API\n\n\n## Manual Compilation\n\n#### Compiler installation\n\n[LDC (LLVM D Compiler)](https://github.com/ldc-developers/ldc) \u003e= `1.1.0 beta 6` is required to build the project.\nYou may want to build LDC from source or use [LDC 1.1.0 beta 6](https://github.com/ldc-developers/ldc/releases/tag/v1.1.0-beta2).\nBeta 2 generates a lot of warnings that can be ignored. Beta 3 is not supported.\n\nLDC binaries contain two compilers: ldc2 and ldmd2. It is recommended to use ldmd2 with mir-glas.\n\nRecent LDC packages come with the [dub package manager](http://code.dlang.org/docs/commandline).\ndub is used to build the project.\n\n#### Mir CPUID\n[Mir CPUID](https://github.com/libmir/mir-cpuid) provides CPU identification routines.\n\nDownload `mir-cpuid`\n```shell\ndub fetch mir-cpuid --cache=local\n```\n\nChange the directory\n```shell\ncd mir-cpuid-\u003ccurrent-mir-cpuid-version\u003e/mir-cpuid\n```\n\nBuild `mir-cpuid`\n```shell\ndub build --build=release-nobounds --compiler=ldmd2 --build-mode=singleFile --parallel --force\n```\nYou may need to add `--arch=x86_64` if you use Windows.\n\nCopy `libmir-cpuid.a` to your project or add its directory to the library path.\n\n#### Mir GLAS\n\nDownload `mir-glas`\n```shell\ndub fetch mir-glas --cache=local\n```\n\nChange the directory\n```shell\ncd mir-glas-\u003ccurrent-mir-glas-version\u003e/mir-glas\n```\n\nBuild `mir-glas`\n```shell\ndub build --config=static --build=target-native --compiler=ldmd2 --build-mode=singleFile --parallel --force\n```\nYou may need to add `--arch=x86_64` if you use Windows.\n\nCopy `libmir-glas.a` to your project or add its directory to the library path.\n\n## Status\n\nWe are open to contributions!\nThe hardest part (GEMM) is already implemented.\n\n - [x] CI testing with Netlib's BLAS test suite.\n - [x] CI testing with Netlib's CBLAS test suite.\n - [ ] CI testing with Netlib's LAPACK test suite.\n - [ ] CI testing with Netlib's LAPACKE test 
suite.\n - [ ] Multi-threading\n - [ ] GPU back-end\n - [ ] Shared library support - requires only DUB configuration fixes.\n - [ ] Level 3 - matrix-matrix operations\n   - [x] GEMM - matrix matrix multiply\n   - [x] SYMM - symmetric matrix matrix multiply\n   - [x] HEMM - hermitian matrix matrix multiply\n   - [ ] SYRK - symmetric rank-k update to a matrix\n   - [ ] HERK - hermitian rank-k update to a matrix\n   - [ ] SYR2K - symmetric rank-2k update to a matrix\n   - [ ] HER2K - hermitian rank-2k update to a matrix\n   - [ ] TRMM - triangular matrix matrix multiply\n   - [ ] TRSM - solving triangular matrix with multiple right hand sides\n - [ ] Level 2 - matrix-vector operations\n   - [ ] GEMV - matrix vector multiply\n   - [ ] GBMV - banded matrix vector multiply\n   - [ ] HEMV - hermitian matrix vector multiply\n   - [ ] HBMV - hermitian banded matrix vector multiply\n   - [ ] HPMV - hermitian packed matrix vector multiply\n   - [ ] TRMV - triangular matrix vector multiply\n   - [ ] TBMV - triangular banded matrix vector multiply\n   - [ ] TPMV - triangular packed matrix vector multiply\n   - [ ] TRSV - solving triangular matrix problems\n   - [ ] TBSV - solving triangular banded matrix problems\n   - [ ] TPSV - solving triangular packed matrix problems\n   - [ ] GERU - performs the rank 1 operation `A := alpha*x*y' + A`\n   - [ ] GERC - performs the rank 1 operation `A := alpha*x*conjg( y' ) + A`\n   - [ ] HER - hermitian rank 1 operation `A := alpha*x*conjg(x') + A`\n   - [ ] HPR - hermitian packed rank 1 operation `A := alpha*x*conjg( x' ) + A`\n   - [ ] HER2 - hermitian rank 2 operation\n   - [ ] HPR2 - hermitian packed rank 2 operation\n - [x] Level 1 - vector-vector and scalar operations. 
Note: [Mir](https://github.com/libmir/mir) already provides a generic implementation.\n   - [x] ROTG - setup Givens rotation\n   - [x] ROTMG - setup modified Givens rotation\n   - [x] ROT - apply Givens rotation\n   - [x] ROTM - apply modified Givens rotation\n   - [x] SWAP - swap x and y\n   - [x] SCAL - `x = a*x`. Note: requires additional optimization for complex numbers.\n   - [x] COPY - copy x into y\n   - [x] AXPY - `y = a*x + y`. Note: requires additional optimization for complex numbers.\n   - [x] DOT - dot product\n   - [x] DOTU - dot product. Note: requires additional optimization for complex numbers.\n   - [x] DOTC - dot product, conjugating the first vector. Note: requires additional optimization for complex numbers.\n   - [x] DSDOT - dot product with extended precision accumulation and result\n   - [x] SDSDOT - dot product with extended precision accumulation\n   - [x] NRM2 - Euclidean norm\n   - [x] ASUM - sum of absolute values\n   - [x] IAMAX - index of max abs value\n\n## Porting to a new target\n\nFive steps\n\n1. Implement the `cpuid_init` function for `mir-cpuid`. This function should be implemented per platform or OS. Already implemented targets are\n   - x86, any OS\n   - x86_64, any OS\n2. Verify that [source/glas/internal/memory.d](source/glas/internal/memory.d) contains an implementation for the OS. Already implemented targets are\n   - Posix (Linux, macOS, and others)\n   - Windows\n3. Add a new configuration for register blocking to [source/glas/internal/config.d](source/glas/internal/config.d). Configurations are already available for\n   - x87\n   - SSE2\n   - AVX / AVX2\n   - AVX512 (requires LLVM bug fixes).\n4. Create a Pull Request.\n5. Coordinate with the LDC team in case of compiler bugs.\n\n## Questions \u0026 Answers\n\n#### Why is GLAS called \"Generic ...\"?\n\n 1. GLAS has a generic internal implementation, which can be easily ported to any other architecture with minimal effort (5 minutes).\n 2. 
GLAS API provides more functionality than BLAS.\n 3. It is written in Dlang using generic programming.\n\n#### Why is it better than other open source BLAS libraries like OpenBLAS and Eigen?\n\n 1. GLAS is [faster](http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/glas-gemm-benchmark.html).\n 2. GLAS API is more user-friendly and does not require additional data copying.\n 3. GLAS does not require a C++ runtime, unlike Eigen.\n 4. GLAS does not require platform-specific optimizations like Eigen intrinsics micro kernels and OpenBLAS assembler macro kernels.\n 5. GLAS has a simple implementation, which can be easily ported and extended.\n\n#### Why does GLAS not have Lazy Evaluation and Aliasing like Eigen?\n\nGLAS is a lower-level library than Eigen. For example, GLAS can be an Eigen BLAS back-end in the future.\nLazy Evaluation and Aliasing can be easily implemented in D.\nExplicit composition of operations can be done using `mir.ndslice.algorithm` and multidimensional `map` from `mir.ndslice.topology`, which is a generic way to perform any lazy operations you want.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flibmir%2Fmir-glas","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flibmir%2Fmir-glas","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flibmir%2Fmir-glas/lists"}