{"id":15014073,"url":"https://github.com/explosion/blis","last_synced_at":"2025-10-06T08:31:19.368Z","repository":{"id":40467418,"uuid":"150839492","full_name":"explosion/blis","owner":"explosion","description":"BLAS-like Library Instantiation Software Framework","archived":false,"fork":true,"pushed_at":"2025-01-10T10:18:56.000Z","size":47484,"stargazers_count":1,"open_issues_count":1,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-24T22:31:40.484Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"flame/blis","license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/explosion.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-09-29T07:51:40.000Z","updated_at":"2022-09-16T11:59:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/explosion/blis","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/explosion/blis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fblis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fblis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fblis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fblis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/explosion","download_url":"https://codeload.github.com/explosion/blis/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosio
n%2Fblis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278579139,"owners_count":26009954,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-06T02:00:05.630Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-09-24T19:45:09.345Z","updated_at":"2025-10-06T08:31:18.462Z","avatar_url":"https://github.com/explosion.png","language":"C","readme":"![The BLIS cat is sleeping.](http://www.cs.utexas.edu/users/field/blis_cat.png)\n\n[![Build Status](https://travis-ci.org/flame/blis.svg?branch=master)](https://travis-ci.org/flame/blis)\n\nContents\n--------\n\n* **[Introduction](#introduction)**\n* **[What's New](#whats-new)**\n* **[What People Are Saying About BLIS](#what-people-are-saying-about-blis)**\n* **[Key Features](#key-features)**\n* **[Getting Started](#getting-started)**\n* **[Documentation](#documentation)**\n* **[External GNU/Linux Packages](#external-gnulinux-packages)**\n* **[Discussion](#discussion)**\n* **[Contributing](#contributing)**\n* **[Citations](#citations)**\n* **[Funding](#funding)**\n\nIntroduction\n------------\n\nBLIS is a portable software framework for instantiating high-performance\nBLAS-like dense linear algebra libraries. 
The framework was designed to isolate\nessential kernels of computation that, when optimized, immediately enable\noptimized implementations of most of its commonly used and computationally\nintensive operations. BLIS is written in [ISO\nC99](http://en.wikipedia.org/wiki/C99) and available under a\n[new/modified/3-clause BSD\nlicense](http://opensource.org/licenses/BSD-3-Clause). While BLIS exports a\n[new BLAS-like API](docs/BLISTypedAPI.md),\nit also includes a BLAS compatibility layer which gives application developers\naccess to BLIS implementations via traditional [BLAS routine\ncalls](http://www.netlib.org/lapack/lug/node145.html).\nAn [object-based API](docs/BLISObjectAPI.md) unique to BLIS is also available.\n\nFor a thorough presentation of our framework, please read our\n[ACM Transactions on Mathematical Software (TOMS)](https://toms.acm.org/)\njournal article, [\"BLIS: A Framework for Rapidly Instantiating BLAS\nFunctionality\"](http://dl.acm.org/authorize?N91172).\nFor those who just want an executive summary, please see the\n[Key Features](#key-features) section below.\n\nIn a follow-up article (also in [ACM TOMS](https://toms.acm.org/)),\n[\"The BLIS Framework: Experiments in\nPortability\"](http://dl.acm.org/authorize?N16240),\nwe investigate using BLIS to instantiate level-3 BLAS implementations on a\nvariety of general-purpose, low-power, and multicore architectures.\n\nAn IPDPS'14 conference paper titled [\"Anatomy of High-Performance Many-Threaded\nMatrix\nMultiplication\"](http://www.cs.utexas.edu/users/flame/pubs/blis3_ipdps14.pdf)\nsystematically explores the opportunities for parallelism within the five loops\nthat BLIS exposes in its matrix multiplication algorithm.\n\nFor other papers related to BLIS, please see the\n[Citations section](#citations) below.\n\nIt is our belief that BLIS offers substantial benefits in productivity when\ncompared to conventional approaches to developing BLAS libraries, as well as a\nmuch-needed refinement of the 
BLAS interface, and thus constitutes a major\nadvance in dense linear algebra computation. While BLIS remains a\nwork-in-progress, we are excited to continue its development and further\ncultivate its use within the community. \n\nThe BLIS framework is primarily developed and maintained by individuals in the\n[Science of High-Performance Computing](http://shpc.ices.utexas.edu/)\n(SHPC) group in the\n[Institute for Computational Engineering and Sciences](https://www.ices.utexas.edu/)\nat [The University of Texas at Austin](https://www.utexas.edu/).\nPlease visit the [SHPC](http://shpc.ices.utexas.edu/) website for more\ninformation about our research group, such as a list of\n[people](http://shpc.ices.utexas.edu/people.html)\nand [collaborators](http://shpc.ices.utexas.edu/collaborators.html),\n[funding sources](http://shpc.ices.utexas.edu/funding.html),\n[publications](http://shpc.ices.utexas.edu/publications.html),\nand [other educational projects](http://www.ulaff.net/) (such as MOOCs).\n\nWhat's New\n----------\n\n * **BLIS is now in Debian Unstable!** Thanks to Debian developer-maintainers\n[M. Zhou](https://github.com/cdluminate) and\n[Nico Schlömer](https://github.com/nschloe) for sponsoring our package in Debian.\nTheir participation, contributions, and advocacy were key to getting BLIS into\nthe second-most popular Linux distribution (behind Ubuntu, which Debian packages\nfeed into). The Debian tracker page may be found\n[here](https://tracker.debian.org/pkg/blis).\n\n * **BLIS now supports mixed-datatype gemm.** The `gemm` operation may now be\nexecuted on operands of mixed domains and/or mixed precisions. 
Any combination\nof storage datatype for A, B, and C is now supported, along with a separate\ncomputation precision that can differ from the storage precision of A and B.\nAnd even the 1m method now supports mixed-precision computation.\nFor more details, please see our [ACM TOMS](https://toms.acm.org/) journal\narticle submission ([current\ndraft](http://www.cs.utexas.edu/users/flame/pubs/blis7_toms_rev0.pdf)).\n\n * **BLIS now implements the 1m method.** Let's face it: writing complex\nassembly `gemm` microkernels for a new architecture is never a priority--and\nnow, it almost never needs to be. The 1m method leverages existing real domain\n`gemm` microkernels to implement all complex domain level-3 operations. For\nmore details, please see our [ACM TOMS](https://toms.acm.org/) journal article\nsubmission ([current\ndraft](http://www.cs.utexas.edu/users/flame/pubs/blis6_toms_rev2.pdf)).\n\nWhat People Are Saying About BLIS\n---------------------------------\n\n*[\"This is an awesome library.\"](https://github.com/flame/blis/issues/288#issuecomment-447488637)* ... *[\"I want to thank you and the blis team for your efforts.\"](https://github.com/flame/blis/issues/288#issuecomment-448074704)* ([@Lephar](https://github.com/Lephar))\n\n*[\"Any time somebody outside Intel beats MKL by a nontrivial amount, I report it to the MKL team. It is fantastic for any open-source project to get within 10% of MKL... [T]his is why Intel funds BLIS development.\"](https://github.com/flame/blis/issues/264#issuecomment-428673275)* ([@jeffhammond](https://github.com/jeffhammond))\n\n*[\"So BLIS is now a part of Elk.\"](https://github.com/flame/blis/issues/267#issuecomment-429303902)* ... *[\"We have found that zgemm applied to a 15000x15000 matrix with multi-threaded BLIS on a 32-core Ryzen 2990WX processor is about twice as fast as MKL\"](https://github.com/flame/blis/issues/264#issuecomment-428373946)* ... 
*[\"I'm starting to like this a lot.\"](https://github.com/flame/blis/issues/264#issuecomment-428926191)* ([@jdk2016](https://github.com/jdk2016))\n\n*[\"I [found] BLIS because I was looking for BLAS operations on C-ordered arrays for NumPy. BLIS has that, but even better is the fact that it's developed in the open using a more modern language than Fortran.\"](https://github.com/flame/blis/issues/254#issuecomment-423838345)* ([@nschloe](https://github.com/nschloe))\n\n*[\"The specific reason to have BLIS included [in Linux distributions] is the KNL and SKX [AVX-512] BLAS support, which OpenBLAS doesn't have.\"](https://github.com/flame/blis/issues/210#issuecomment-393126303)* ([@loveshack](https://github.com/loveshack))\n\n*[\"All tests pass without errors on OpenBSD. Thanks!\"](https://github.com/flame/blis/issues/202#issuecomment-389691543)* ([@ararslan](https://github.com/ararslan))\n\n*[\"Thank you very much for your great help!... Looking forward to benchmarking.\"](https://github.com/flame/blis/issues/180#issuecomment-375895449)* ([@mrader1248](https://github.com/mrader1248))\n\n*[\"Thanks for the beautiful work.\"](https://github.com/flame/blis/issues/163#issue-286575452)* ([@mmrmo](https://github.com/mmrmo))\n\n*[\"[M]y software currently uses BLIS for its BLAS interface...\"](https://github.com/flame/blis/issues/129#issuecomment-302904805)* ([@ShadenSmith](https://github.com/ShadenSmith))\n\n*[\"[T]hanks so much for your work on this! Excited to test.\"](https://github.com/flame/blis/issues/129#issuecomment-341565071)* ... *[\"[On AMD Excavator], BLIS is competitive to / slightly faster than OpenBLAS for dgemms in my tests.\"](https://github.com/flame/blis/issues/129#issuecomment-341608673)* ([@iotamudelta](https://github.com/iotamudelta))\n\n*[\"BLIS provided the only viable option on KNL, whose ecosystem is at present dominated by blackbox toolchains. Thanks again. 
Keep on this great work.\"](https://github.com/flame/blis/issues/116#issuecomment-281225101)* ([@heroxbd](https://github.com/heroxbd))\n\n*[\"I want to definitely try this out...\"](https://github.com/flame/blis/issues/12#issuecomment-48086295)* ([@ViralBShah](https://github.com/ViralBShah))\n\nKey Features\n------------\n\nBLIS offers several advantages over traditional BLAS libraries:\n\n * **Portability that doesn't impede high performance.** Portability was a top\npriority of ours when creating BLIS. With virtually no additional effort on the\npart of the developer, BLIS is configurable as a fully-functional reference\nimplementation. But more importantly, the framework identifies and isolates a\nkey set of computational kernels which, when optimized, immediately and\nautomatically optimize performance across virtually all level-2 and level-3\nBLIS operations. In this way, the framework acts as a productivity multiplier.\nAnd since the optimized (non-portable) code is compartmentalized within these\nfew kernels, instantiating a high-performance BLIS library on a new\narchitecture is a relatively straightforward endeavor.\n\n * **Generalized matrix storage.** The BLIS framework exports interfaces that\nallow one to specify both the row stride and column stride of a matrix. This\nallows one to compute with matrices stored in column-major order, row-major\norder, or by general stride. (This latter storage format is important for those\nseeking to implement tensor contractions on multidimensional arrays.)\nFurthermore, since BLIS tracks stride information for each matrix, operands of\ndifferent storage formats can be used within the same operation invocation. By\ncontrast, BLAS requires column-major storage. And while the CBLAS interface\nsupports row-major storage, it does not allow mixing storage formats. 
\n\n * **Rich support for the complex domain.** BLIS operations are developed and\nexpressed in their most general form, which is typically in the complex domain.\nThese formulations then simplify elegantly down to the real domain, with\nconjugations becoming no-ops. Unlike the BLAS, all input operands in BLIS that\nallow transposition and conjugate-transposition also support conjugation\n(without transposition), which obviates the need for thread-unsafe workarounds.\nAlso, where applicable, both complex symmetric and complex Hermitian forms are\nsupported. (BLAS omits some complex symmetric operations, such as `symv`,\n`syr`, and `syr2`.) Another great example of BLIS serving as a portability\nlever is its implementation of the 1m method for complex matrix multiplication,\na novel mechanism of providing high-performance complex level-3 operations using\nonly real domain microkernels. This new innovation guarantees automatic level-3\nsupport in the complex domain even when the kernel developers entirely forgo\nwriting complex kernels.\n\n * **Advanced multithreading support.** BLIS allows multiple levels of\nsymmetric multithreading for nearly all level-3 operations. (Currently, users\nmay choose to obtain parallelism via either OpenMP or POSIX threads). This\nmeans that matrices may be partitioned in multiple dimensions simultaneously to\nattain scalable, high-performance parallelism on multicore and many-core\narchitectures. The key to this innovation is a thread-specific control tree\ninfrastructure which encodes information about the logical thread topology and\nallows threads to query and communicate data amongst one another. 
BLIS also\nemploys so-called \"quadratic partitioning\" when computing dimension sub-ranges\nfor each thread, so that arbitrary diagonal offsets of structured matrices with\nunreferenced regions are taken into account to achieve proper load balance.\nMore recently, BLIS introduced a runtime abstraction to specify parallelism on\na per-call basis, which is useful for applications that want to handle most of\nthe parallelism.\n\n * **Ease of use.** The BLIS framework, and the library of routines it\ngenerates, are easy to use for end users, experts, and vendors alike. An\noptional BLAS compatibility layer provides application developers with\nbackwards compatibility to existing BLAS-dependent codes. Or, one may adjust or\nwrite their application to take advantage of new BLIS functionality (such as\ngeneralized storage formats or additional complex operations) by calling one\nof BLIS's native APIs directly. BLIS's typed API will feel familiar to many\nveterans of BLAS since these interfaces use BLAS-like calling sequences. And\nmany will find BLIS's object-based APIs a delight to use when customizing\nor writing their own BLIS operations. (Objects are relatively lightweight\n`structs` and passed by address, which helps tame function calling overhead.) \n\n * **Multilayered API, exposed kernels, and sandboxes.** The BLIS framework\nexposes its\nimplementations in various layers, allowing expert developers to access exactly\nthe functionality desired. This layered interface includes that of the\nlowest-level kernels, for those who wish to bypass the bulk of the framework.\nOptimizations can occur at various levels, in part thanks to exposed packing\nand unpacking facilities, which by default are highly parameterized and\nflexible. And more recently, BLIS introduced sandboxes--a way to provide\nalternative implementations of `gemm` that do not use any more of the BLIS\ninfrastructure than is desired. 
Sandboxes provide a convenient and\nstraightforward way of modifying the `gemm` implementation without disrupting\nany other level-3 operation or any other part of the framework. This works\nespecially well when the developer wants to experiment with new optimizations\nor try a different algorithm.\n\n * **Functionality that grows with the community's needs.** As its name\nsuggests, the BLIS framework is not a single library or static API, but rather\na nearly-complete template for instantiating high-performance BLAS-like\nlibraries. Furthermore, the framework is extensible, allowing developers to\nleverage existing components to support new operations as they are identified.\nIf such operations require new kernels for optimal efficiency, the framework\nand its APIs will be adjusted and extended accordingly. \n\n * **Code re-use.** Auto-generation approaches to achieving the aforementioned\ngoals tend to quickly lead to code bloat due to the multiple dimensions of\nvariation supported: operation (i.e. `gemm`, `herk`, `trmm`, etc.); parameter\ncase (i.e. side, [conjugate-]transposition, upper/lower storage, unit/non-unit\ndiagonal); datatype (i.e. single-/double-precision real/complex); matrix\nstorage (i.e. row-major, column-major, generalized); and algorithm (i.e.\npartitioning path and kernel shape). These \"brute force\" approaches often\nconsider and optimize each operation or case combination in isolation, which is\nless than ideal when the goal is to provide entire libraries. 
BLIS was designed\nto be a complete framework for implementing basic linear algebra operations,\nbut supporting this vast amount of functionality in a manageable way required a\nholistic design that employed careful abstractions, layering, and recycling of\ngeneric (highly parameterized) codes, subject to the constraint that high\nperformance remain attainable.\n\n * **A foundation for mixed domain and/or mixed precision operations.** BLIS\nwas designed with the hope of one day allowing computation on real and complex\noperands within the same operation. Similarly, we wanted to allow mixing\noperands' numerical domains, floating-point precisions, or both domain and\nprecision, and to optionally compute in a precision different than one or both\noperands' storage precisions. This feature has been implemented for the general\nmatrix multiplication (`gemm`) operation, providing 128 different possible type\ncombinations, which, when combined with existing transposition, conjugation,\nand storage parameters, enables 55,296 different `gemm` use cases. 
For more\ndetails, please see the documentation on [mixed datatype](docs/MixedDatatypes.md)\nsupport and/or our [ACM TOMS](https://toms.acm.org/) journal paper on\nmixed-domain/mixed-precision `gemm` ([linked below](#citations)).\n\nGetting Started\n---------------\n\nIf you just want to build a sequential (not parallelized) version of BLIS\nin a hurry and come back and explore other topics later, you can configure\nand build BLIS as follows:\n```\n$ ./configure auto\n$ make [-j]\n```\nYou can then verify your build by running BLAS- and BLIS-specific test\ndrivers via `make check`:\n```\n$ make check [-j]\n```\nAnd if you would like to install BLIS to the directory specified to `configure`\nvia the `--prefix` option, run the `install` target:\n```\n$ make install\n```\nPlease read the output of `./configure --help` for a full list of configure-time\noptions.\nIf/when you have time, we *strongly* encourage you to read the detailed\nwalkthrough of the build system found in our [Build System](docs/BuildSystem.md)\nguide.\n\nDocumentation\n-------------\n\nWe provide extensive documentation on the BLIS build system, APIs, test\ninfrastructure, and other important topics. All documentation is formatted in\nmarkdown and included in the BLIS source distribution (usually in the `docs`\ndirectory). Slightly longer descriptions of each document may be found in\nthe project's [wiki](https://github.com/flame/blis/wiki) section.\n\n**Documents for everyone:**\n\n * **[Build System](docs/BuildSystem.md).** This document covers the basics of\nconfiguring and building BLIS libraries, as well as related topics.\n\n * **[Testsuite](docs/Testsuite.md).** This document describes how to run\nBLIS's highly parameterized and configurable test suite, as well as the\nincluded BLAS test drivers.\n\n * **[BLIS Typed API Reference](docs/BLISTypedAPI.md).** Here we document the\nso-called \"typed\" (or BLAS-like) API. 
This is the API that many users who are\nalready familiar with the BLAS will likely want to use. You can find lots of\nexample code for the typed API in the [examples/tapi](examples/tapi) directory\nincluded in the BLIS source distribution.\n\n * **[BLIS Object API Reference](docs/BLISObjectAPI.md).** Here we document\nthe object API. This API abstracts away properties of vectors and matrices\nwithin `obj_t` structs that can be queried with accessor functions. Many\ndevelopers and experts prefer this API over the typed API. You can find lots of\nexample code for the object API in the [examples/oapi](examples/oapi) directory\nincluded in the BLIS source distribution.\n\n * **[Hardware Support](docs/HardwareSupport.md).** This document maintains a\ntable of supported microarchitectures.\n\n * **[Multithreading](docs/Multithreading.md).** This document describes how to\nuse the multithreading features of BLIS.\n\n * **[Mixed-Datatype](docs/MixedDatatype.md).** This document provides an\noverview of BLIS's mixed-datatype functionality and provides a brief example\nof how to take advantage of this new code.\n\n * **[Release Notes](docs/ReleaseNotes.md).** This document tracks a summary of\nchanges included with each new version of BLIS, along with contributor credits\nfor key features.\n\n * **[Frequently Asked Questions](docs/FAQ.md).** If you have general questions\nabout BLIS, please read this FAQ. If you can't find the answer to your question,\nplease feel free to join the [blis-devel](https://groups.google.com/group/blis-devel)\nmailing list and post a question. We also have a\n[blis-discuss](https://groups.google.com/group/blis-discuss) mailing list that\nanyone can post to (even without joining). \n\n**Documents for github contributors:**\n\n * **[Contributing bug reports, feature requests, PRs, etc](CONTRIBUTING.md).**\nInterested in contributing to BLIS? Please read this document before getting\nstarted. 
It provides a general overview of how best to report bugs, propose new\nfeatures, and offer code patches. \n\n * **[Coding Conventions](docs/CodingConventions.md).** If you are interested or\nplanning on contributing code to BLIS, please read this document so that you can\nformat your code in accordance with BLIS's standards.\n\n**Documents for BLIS developers:**\n\n * **[Kernels Guide](docs/KernelsHowTo.md).** If you would like to learn more\nabout the types of kernels that BLIS exposes, their semantics, the operations\nthat each kernel accelerates, and various implementation issues, please read\nthis guide.\n\n * **[Configuration Guide](docs/ConfigurationHowTo.md).** If you would like to\nlearn how to add new sub-configurations or configuration families, or are simply\ninterested in learning how BLIS organizes its configurations and kernel sets,\nplease read this thorough walkthrough of the configuration system.\n\n * **[Sandbox Guide](docs/Sandboxes.md).** If you are interested in learning\nabout using sandboxes in BLIS--that is, providing alternative implementations\nof the `gemm` operation--please read this document.\n\nExternal GNU/Linux packages\n---------------------------\n\nGenerally speaking, we **highly recommend** building from source whenever\npossible using the latest `git` clone. (Tarballs of each\n[tagged release](https://github.com/flame/blis/releases) are also available, but\nwe consider them to be less ideal since they are not as easy to upgrade as\n`git` clones.)\n\nThat said, some users may prefer binary and/or source packages through their\nLinux distribution. Thanks to generous involvement/contributions from our\ncommunity members, the following BLIS packages are now available:\n\n * **Debian**. [M. Zhou](https://github.com/cdluminate) has volunteered to\nsponsor and maintain BLIS packages within the Debian Linux distribution. 
The\nDebian package tracker can be found [here](https://tracker.debian.org/pkg/blis).\n(Also, thanks to [Nico Schlömer](https://github.com/nschloe) for previously\nvolunteering his time to set up a standalone PPA.)\n\n * **EPEL/Fedora**. There are official BLIS packages in Fedora and EPEL (for\nRHEL7+ and compatible distributions) with versions for 64-bit integers, OpenMP,\nand pthreads, and shims which can be dynamically linked instead of reference\nBLAS. (NOTE: For architectures other than intel64, amd64, and maybe arm64, the\nperformance of packaged BLIS will be low because it uses unoptimized generic\nkernels; for those architectures, [OpenBLAS](https://github.com/xianyi/OpenBLAS)\nmay be a better solution.) [Dave\nLove](https://github.com/loveshack) provides additional packages for EPEL6 in a\n[Fedora Copr](https://copr.fedorainfracloud.org/coprs/loveshack/blis/), and\npossibly versions more recent than the official repo for other EPEL/Fedora\nreleases. The source packages may build on other rpm-based distributions.\n\n * **OpenSuSE**. The copr referred to above has rpms for some OpenSuSE releases;\nthe source rpms may build for others.\n\n * **GNU Guix**. Guix has BLIS packages, but provides builds only for the generic\ntarget and some specific x86_64 micro-architectures.\n\nDiscussion\n----------\n\nYou can keep in touch with developers and other users of the project by joining\none of the following mailing lists:\n\n * [blis-devel](https://groups.google.com/group/blis-devel): Please join and\npost to this mailing list if you are a BLIS developer, or if you are trying\nto use BLIS beyond simply linking to it as a BLAS library.\n**Note:** Most of the interesting discussions happen here; don't be afraid to\njoin! 
If you would like to submit a bug report, or discuss a possible bug,\nplease consider opening a [new issue](https://github.com/flame/blis/issues) on\ngithub.\n\n * [blis-discuss](https://groups.google.com/group/blis-discuss): Please join and\npost to this mailing list if you have general questions or feedback regarding\nBLIS. Application developers (end users) may wish to post here, unless they\nhave bug reports, in which case they should open a\n[new issue](https://github.com/flame/blis/issues) on github.\n\nContributing\n------------\n\nFor information on how to contribute to our project, including preferred\n[coding conventions](docs/CodingConventions), please refer to the\n[CONTRIBUTING](CONTRIBUTING.md) file at the top-level of the BLIS source\ndistribution.\n\nCitations\n---------\n\nFor those of you looking for the appropriate article to cite regarding BLIS, we\nrecommend citing our\n[first ACM TOMS journal paper](http://dl.acm.org/authorize?N91172) \n([unofficial backup link](http://www.cs.utexas.edu/users/flame/pubs/blis1_toms_rev3.pdf)):\n\n```\n@article{BLIS1,\n   author      = {Field G. {V}an~{Z}ee and Robert A. {v}an~{d}e~{G}eijn},\n   title       = {{BLIS}: A Framework for Rapidly Instantiating {BLAS} Functionality},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {41},\n   number      = {3},\n   pages       = {14:1--14:33},\n   month       = jun,\n   year        = {2015},\n   issue_date  = {June 2015},\n   url         = {http://doi.acm.org/10.1145/2764454},\n}\n``` \n\nYou may also cite the\n[second ACM TOMS journal paper](http://dl.acm.org/authorize?N16240) \n([unofficial backup link](http://www.cs.utexas.edu/users/flame/pubs/blis2_toms_rev3.pdf)):\n\n```\n@article{BLIS2,\n   author      = {Field G. {V}an~{Z}ee and Tyler Smith and Francisco D. 
Igual and\n                  Mikhail Smelyanskiy and Xianyi Zhang and Michael Kistler and Vernon Austel and\n                  John Gunnels and Tze Meng Low and Bryan Marker and Lee Killough and\n                  Robert A. {v}an~{d}e~{G}eijn},\n   title       = {The {BLIS} Framework: Experiments in Portability},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {42},\n   number      = {2},\n   pages       = {12:1--12:19},\n   month       = jun,\n   year        = {2016},\n   issue_date  = {June 2016},\n   url         = {http://doi.acm.org/10.1145/2755561},\n}\n``` \n\nWe also have a third paper, submitted to IPDPS 2014, on achieving\n[multithreaded parallelism in BLIS](http://www.cs.utexas.edu/users/flame/pubs/blis3_ipdps14.pdf):\n\n```\n@inproceedings{BLIS3,\n   author      = {Tyler M. Smith and Robert A. {v}an~{d}e~{G}eijn and Mikhail Smelyanskiy and\n                  Jeff R. Hammond and Field G. {V}an~{Z}ee},\n   title       = {Anatomy of High-Performance Many-Threaded Matrix Multiplication},\n   booktitle   = {28th IEEE International Parallel \\\u0026 Distributed Processing Symposium\n                  (IPDPS 2014)},\n   year        = 2014,\n}\n```\n\nA fourth paper, submitted to ACM TOMS, also exists, which proposes an\n[analytical model](http://dl.acm.org/citation.cfm?id=2925987) \n([unofficial backup link](http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf))\nfor determining blocksize parameters in BLIS: \n\n```\n@article{BLIS4,\n   author      = {Tze Meng Low and Francisco D. Igual and Tyler M. Smith and\n                  Enrique S. 
Quintana-Ort\\'{\\i}},\n   title       = {Analytical Modeling Is Enough for High-Performance {BLIS}},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {43},\n   number      = {2},\n   pages       = {12:1--12:18},\n   month       = aug,\n   year        = {2016},\n   issue_date  = {August 2016},\n   url         = {http://doi.acm.org/10.1145/2925987},\n}\n```\n\nA fifth paper, submitted to ACM TOMS, begins the study of so-called\n[induced methods for complex matrix multiplication](http://www.cs.utexas.edu/users/flame/pubs/blis5_toms_rev2.pdf):\n\n```\n@article{BLIS5,\n   author      = {Field G. {V}an~{Z}ee and Tyler Smith},\n   title       = {Implementing High-performance Complex Matrix Multiplication via the 3m and 4m Methods},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {44},\n   number      = {1},\n   pages       = {7:1--7:36},\n   month       = jul,\n   year        = {2017},\n   issue_date  = {July 2017},\n   url         = {http://doi.acm.org/10.1145/3086466},\n}\n``` \n\nA sixth paper, submitted to ACM TOMS, revisits the topic of the previous\narticle and derives a [superior induced method](http://www.cs.utexas.edu/users/flame/pubs/blis6_toms_rev2.pdf):\n\n```\n@article{BLIS6,\n   author      = {Field G. {V}an~{Z}ee},\n   title       = {Implementing High-Performance Complex Matrix Multiplication via the 1m Method},\n   journal     = {ACM Transactions on Mathematical Software},\n   note        = {submitted}\n}\n``` \n\nA seventh paper, submitted to ACM TOMS, explores the implementation of `gemm` for\n[mixed-domain and/or mixed-precision](http://www.cs.utexas.edu/users/flame/pubs/blis7_toms_rev0.pdf) operands:\n\n```\n@article{BLIS7,\n   author      = {Field G. {V}an~{Z}ee and Devangi N. Parikh and Robert A. 
van~de~{G}eijn},\n   title       = {Supporting Mixed-domain Mixed-precision Matrix Multiplication\nwithin the BLIS Framework},\n   journal     = {ACM Transactions on Mathematical Software},\n   note        = {submitted}\n}\n```\n\nFunding\n-------\n\nThis project and its associated research were partially sponsored by grants from\n[Microsoft](http://www.microsoft.com/),\n[Intel](http://www.intel.com/),\n[Texas Instruments](http://www.ti.com/),\n[AMD](http://www.amd.com/),\n[Oracle](http://www.oracle.com/),\nand\n[Huawei](http://www.huawei.com/),\nas well as grants from the\n[National Science Foundation](http://www.nsf.gov/) (Awards\nCCF-0917167, ACI-1148125/1340293, CCF-1320112, and ACI-1550493).\n\n_Any opinions, findings and conclusions or recommendations expressed in this\nmaterial are those of the author(s) and do not necessarily reflect the views of\nthe National Science Foundation (NSF)._\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fblis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fexplosion%2Fblis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fblis/lists"}