{"id":13568430,"url":"https://github.com/flame/blis","last_synced_at":"2025-02-25T10:31:43.060Z","repository":{"id":13454342,"uuid":"16143904","full_name":"flame/blis","owner":"flame","description":"BLAS-like Library Instantiation Software Framework","archived":false,"fork":false,"pushed_at":"2024-10-16T21:45:30.000Z","size":49584,"stargazers_count":2288,"open_issues_count":108,"forks_count":366,"subscribers_count":78,"default_branch":"master","last_synced_at":"2024-10-29T15:10:32.420Z","etag":null,"topics":["blas","blas-libraries","blis","high-performance","high-performance-computing","hpc","linear-algebra","linear-algebra-library","matrix","matrix-calculations","matrix-functions","matrix-library","matrix-multiplication","optimization"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flame.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-01-22T15:58:24.000Z","updated_at":"2024-10-25T21:34:29.000Z","dependencies_parsed_at":"2023-10-15T07:50:08.919Z","dependency_job_id":"4fa9c062-f9d4-4fe0-a716-c160211148f3","html_url":"https://github.com/flame/blis","commit_stats":{"total_commits":2019,"total_committers":80,"mean_commits":25.2375,"dds":"0.34175334323922735","last_synced_commit":"8215b02f99aa77ecc7d813508c247565115319d7"},"previous_names":[],"tags_count":37,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flame%2Fblis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/flame%2Fblis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flame%2Fblis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flame%2Fblis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/flame","download_url":"https://codeload.github.com/flame/blis/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240648721,"owners_count":19835015,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["blas","blas-libraries","blis","high-performance","high-performance-computing","hpc","linear-algebra","linear-algebra-library","matrix","matrix-calculations","matrix-functions","matrix-library","matrix-multiplication","optimization"],"created_at":"2024-08-01T14:00:25.715Z","updated_at":"2025-02-25T10:31:42.988Z","avatar_url":"https://github.com/flame.png","language":"C","funding_links":[],"categories":["Frameworks and Development Tools 🛠️","C","Basic linear algebra","Frameworks","Matrix Multiplication and Linear Algebra"],"sub_categories":[],"readme":"_Recipient of the **[2023 James H. 
Wilkinson Prize for Numerical Software](https://www.siam.org/prizes-recognition/major-prizes-lectures/detail/james-h-wilkinson-prize-for-numerical-software)**_\n\n_Recipient of the **[2020 SIAM Activity Group on Supercomputing Best Paper Prize](https://www.siam.org/prizes-recognition/activity-group-prizes/detail/siag-sc-best-paper-prize)**_\n\n\n![The BLIS cat is sleeping.](http://www.cs.utexas.edu/users/field/blis_cat.png)\n\n[![Build Status (CircleCI)](https://dl.circleci.com/status-badge/img/gh/flame/blis/tree/master.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/flame/blis/tree/master)\n[![Build Status (TravisCI)](https://api.travis-ci.com/flame/blis.svg?branch=master)](https://app.travis-ci.com/github/flame/blis)\n[![Build Status (Appveyor)](https://ci.appveyor.com/api/projects/status/github/flame/blis?branch=master\u0026svg=true)](https://ci.appveyor.com/project/shpc/blis/branch/master)\n\n[\u003cimg alt=\"Discord logo\" title=\"Join us on Discord!\" height=\"32px\" src=\"docs/images/discord.svg\" /\u003e](docs/Discord.md)\n\nContents\n--------\n\n* **[Introduction](#introduction)**\n* **[Education and Learning](#education-and-learning)**\n* **[What's New](#whats-new)**\n* **[What People Are Saying About BLIS](#what-people-are-saying-about-blis)**\n* **[Key Features](#key-features)**\n* **[How to Download BLIS](#how-to-download-blis)**\n* **[Getting Started](#getting-started)**\n* **[Example Code](#example-code)**\n* **[Documentation](#documentation)**\n* **[Performance](#performance)**\n* **[External Packages](#external-packages)**\n* **[Discussion](#discussion)**\n* **[Contributing](#contributing)**\n* **[Citations](#citations)**\n* **[Awards](#awards)**\n* **[Funding](#funding)**\n\nIntroduction\n------------\n\nBLIS is an [award-winning](#awards)\nportable software framework for instantiating high-performance\nBLAS-like dense linear algebra libraries. 
The framework was designed to isolate\nessential kernels of computation that, when optimized, immediately enable\noptimized implementations of most of its commonly used and computationally\nintensive operations. BLIS is written in [ISO\nC99](http://en.wikipedia.org/wiki/C99) and available under a\n[new/modified/3-clause BSD\nlicense](http://opensource.org/licenses/BSD-3-Clause). While BLIS exports a\n[new BLAS-like API](docs/BLISTypedAPI.md),\nit also includes a BLAS compatibility layer which gives application developers\naccess to BLIS implementations via traditional [BLAS routine\ncalls](http://www.netlib.org/lapack/lug/node145.html).\nAn [object-based API](docs/BLISObjectAPI.md) unique to BLIS is also available.\n\nFor a thorough presentation of our framework, please read our\n[ACM Transactions on Mathematical Software (TOMS)](https://toms.acm.org/)\njournal article, [\"BLIS: A Framework for Rapidly Instantiating BLAS\nFunctionality\"](http://dl.acm.org/authorize?N91172).\nFor those who just want an executive summary, please see the\n[Key Features](#key-features) section below.\n\nIn a follow-up article (also in [ACM TOMS](https://toms.acm.org/)),\n[\"The BLIS Framework: Experiments in\nPortability\"](http://dl.acm.org/authorize?N16240),\nwe investigate using BLIS to instantiate level-3 BLAS implementations on a\nvariety of general-purpose, low-power, and multicore architectures.\n\nAn IPDPS'14 conference paper titled [\"Anatomy of High-Performance Many-Threaded\nMatrix\nMultiplication\"](http://www.cs.utexas.edu/users/flame/pubs/blis3_ipdps14.pdf)\nsystematically explores the opportunities for parallelism within the five loops\nthat BLIS exposes in its matrix multiplication algorithm.\n\nFor other papers related to BLIS, please see the\n[Citations section](#citations) below.\n\nIt is our belief that BLIS offers substantial benefits in productivity when\ncompared to conventional approaches to developing BLAS libraries, as well as a\nmuch-needed refinement of the 
BLAS interface, and thus constitutes a major\nadvance in dense linear algebra computation. While BLIS remains a\nwork-in-progress, we are excited to continue its development and further\ncultivate its use within the community.\n\nThe BLIS framework is primarily developed and maintained by individuals in the\n[Science of High-Performance Computing](http://shpc.ices.utexas.edu/)\n(SHPC) group in the\n[Oden Institute for Computational Engineering and Sciences](https://www.oden.utexas.edu/)\nat [The University of Texas at Austin](https://www.utexas.edu/)\nand in the [Matthews Research Group](https://matthewsresearchgroup.webstarts.com/)\nat [Southern Methodist University](https://www.smu.edu/).\nPlease visit the [SHPC](http://shpc.ices.utexas.edu/) website for more\ninformation about our research group, such as a list of\n[people](http://shpc.ices.utexas.edu/people.html)\nand [collaborators](http://shpc.ices.utexas.edu/collaborators.html),\n[funding sources](http://shpc.ices.utexas.edu/funding.html),\n[publications](http://shpc.ices.utexas.edu/publications.html),\nand [other educational projects](http://www.ulaff.net/) (such as MOOCs).\n\nEducation and Learning\n----------------------\n\nWant to understand what's under the hood?\nMany of the same concepts and principles employed when developing BLIS are\nintroduced and taught in a basic pedagogical setting as part of\n[LAFF-On Programming for High Performance (LAFF-On-PfHP)](http://www.ulaff.net/),\none of several massive open online courses (MOOCs) in the\n[Linear Algebra: Foundations to Frontiers](http://www.ulaff.net/) series,\nall of which are available for free via the [edX platform](http://www.edx.org/).\n\nWhat's New\n----------\n\n * **Plugin feature now available!** BLIS addons (see below) provided a way to\nquickly extend BLIS's operation support or define new custom BLIS APIs for your application.\nBLIS plugins extend this support to completely external code, needing only an installed BLIS\npackage (no 
source required). BLIS plugins also allow users to define their own kernels\nand blocksizes, combined with the cross-architecture support provided by the BLIS framework.\nFinally, user plugins can utilize the new API for modifying the BLIS \"control tree\" which\ndefines the mathematical operation to be computed, as well as information controlling packing,\npartitioning, etc. Users can now modify the control tree to implement new linear algebra\noperations not already included in BLIS. See the [documentation](docs/PluginHowTo.md) for\nan overview of these features and step-by-step guides for creating plugins and modifying\nthe control tree to implement an example operation \"SYRKD\".\n\n * **BLIS selected for the 2023 James H. Wilkinson Prize for Numerical Software!** We\nare thrilled to announce that Field Van Zee and Devin Matthews were chosen to receive\nthe [2023 James H. Wilkinson Prize for Numerical Software](https://www.siam.org/prizes-recognition/major-prizes-lectures/detail/james-h-wilkinson-prize-for-numerical-software).\nThe selection committee sought to recognize the recipients \"for the development of\nBLIS, a portable open-source software framework that facilitates rapid instantiation\nof high-performance BLAS and BLAS-like operations targeting modern CPUs.\" This prize\nis awarded once every four years to the authors of an outstanding piece of numerical\nsoftware, or to individuals who have made an outstanding contribution to an existing\npiece of numerical software. It is awarded to an entry that best addresses all phases\nof the preparation of high-quality numerical software, and is intended to recognize\ninnovative software in scientific computing and to encourage researchers in the\nearlier stages of their career. 
The prize will be awarded at the\n[2023 SIAM Conference on Computational Science and Engineering](https://www.siam.org/conferences/cm/conference/cse23) in Amsterdam.\n\n * **Join us on Discord!** In 2021, we soft-launched our [Discord](https://discord.com/)\nserver by privately inviting current and former collaborators, attendees of our BLIS\nRetreat, as well as other participants within the BLIS ecosystem. We've been thrilled\nby the results thus far, and are happy to announce that our new community is now open\nto the broader public! If you'd like to hang out with other BLIS users and developers,\nask a question, discuss future features, or just say hello, please feel free to join\nus! We've put together a [step-by-step guide](docs/Discord.md) for creating an account\nand joining our cozy enclave. We even have a monthly \"BLIS happy hour\" event where\npeople can casually come together for a video chat, Q\u0026A, brainstorm session, or\nwhatever it happens to unfold into!\n\n * **Addons feature now available!** Have you ever wanted to quickly extend BLIS's\noperation support or define new custom BLIS APIs for your application, but were\nunsure of how to add your source code to BLIS? Do you want to isolate your custom\ncode so that it only gets enabled when the user requests it? Do you like\n[sandboxes](docs/Sandboxes.md), but wish you didn't have to provide an\nimplementation of `gemm`? If so, you should check out our new\n[addons](docs/Addons.md) feature. Addons act like optional extensions that can be\ncreated, enabled, and combined to suit your application's needs, all without\nformally integrating your code into the core BLIS framework.\n\n * **Multithreaded small/skinny matrix support for sgemm now available!** Thanks to\nfunding and hardware support from Oracle, we have now accelerated `gemm` for\nsingle-precision real matrix problems where one or two dimensions is exceedingly\nsmall. 
This work is similar to the `gemm` optimization announced last year.\nFor now, we have only gathered performance results on an AMD Epyc Zen2 system, but\nwe hope to publish additional graphs for other architectures in the future. You may\nfind these Zen2 graphs via the [PerformanceSmall](docs/PerformanceSmall.md) document.\n\n * **BLIS awarded SIAM Activity Group on Supercomputing Best Paper Prize for 2020!**\nWe are thrilled to announce that the paper that we internally refer to as the\nsecond BLIS paper,\n\n   \"The BLIS Framework: Experiments in Portability.\" Field G. Van Zee, Tyler Smith, Bryan Marker, Tze Meng Low, Robert A. van de Geijn, Francisco Igual, Mikhail Smelyanskiy, Xianyi Zhang, Michael Kistler, Vernon Austel, John A. Gunnels, Lee Killough. ACM Transactions on Mathematical Software (TOMS), 42(2):12:1--12:19, 2016.\n\n   was selected for the [SIAM Activity Group on Supercomputing Best Paper Prize](https://www.siam.org/prizes-recognition/activity-group-prizes/detail/siag-sc-best-paper-prize)\nfor 2020. The prize is awarded once every two years to a paper judged to be\nthe most outstanding paper in the field of parallel scientific and engineering\ncomputing, and has only been awarded once before (in 2016) since its inception\nin 2015 (the committee did not award the prize in 2018). The prize\n[was awarded](https://www.oden.utexas.edu/about/news/ScienceHighPerfomanceComputingSIAMBestPaperPrize/)\nat the [2020 SIAM Conference on Parallel Processing for Scientific Computing](https://www.siam.org/conferences/cm/conference/pp20) in Seattle. 
Robert was present at\nthe conference to give\n[a talk on BLIS](https://meetings.siam.org/sess/dsp_programsess.cfm?SESSIONCODE=68266) and accept the prize alongside other coauthors.\nThe selection committee sought to recognize the paper, \"which validates BLIS,\na framework relying on the notion of microkernels that enables both productivity\nand high performance.\" Their statement continues, \"The framework will continue\nhaving an important influence on the design and the instantiation of dense linear\nalgebra libraries.\"\n\n * **Multithreaded small/skinny matrix support for dgemm now available!** Thanks to\ncontributions made possible by our partnership with AMD, we have dramatically\naccelerated `gemm` for double-precision real matrix problems where one or two\ndimensions is exceedingly small. A natural byproduct of this optimization is\nthat the traditional case of small _m = n = k_ (i.e. square matrices) is also\naccelerated, even though it was not targeted specifically. And though only\n`dgemm` was optimized for now, support for other datatypes and/or other operations\nmay be implemented in the future. We've also added new graphs to the\n[PerformanceSmall](docs/PerformanceSmall.md) document to showcase multithreaded\nperformance when one or more matrix dimensions are small.\n\n * **Performance comparisons now available!** We recently measured the\nperformance of various level-3 operations on a variety of hardware architectures,\nas implemented within BLIS and other BLAS libraries for all four of the standard\nfloating-point datatypes. The results speak for themselves! Check out our\nextensive performance graphs and background info in our new\n[Performance](docs/Performance.md) document.\n\n * **BLIS is now in Debian Unstable!** Thanks to Debian developer-maintainers\n[M. 
Zhou](https://github.com/cdluminate) and\n[Nico Schlömer](https://github.com/nschloe) for sponsoring our package in Debian.\nTheir participation, contributions, and advocacy were key to getting BLIS into\nthe second-most popular Linux distribution (behind Ubuntu, which Debian packages\nfeed into). The Debian tracker page may be found\n[here](https://tracker.debian.org/pkg/blis).\n\n * **BLIS now supports mixed-datatype gemm!** The `gemm` operation may now be\nexecuted on operands of mixed domains and/or mixed precisions. Any combination\nof storage datatype for A, B, and C is now supported, along with a separate\ncomputation precision that can differ from the storage precision of A and B.\nAnd even the 1m method now supports mixed-precision computation.\nFor more details, please see our [ACM TOMS](https://toms.acm.org/) journal\narticle submission ([current\ndraft](http://www.cs.utexas.edu/users/flame/pubs/blis7_toms_rev0.pdf)).\n\n * **BLIS now implements the 1m method.** Let's face it: writing complex\nassembly `gemm` microkernels for a new architecture is never a priority--and\nnow, it almost never needs to be. The 1m method leverages existing real domain\n`gemm` microkernels to implement all complex domain level-3 operations. For\nmore details, please see our [ACM TOMS](https://toms.acm.org/) journal article\nsubmission ([current\ndraft](http://www.cs.utexas.edu/users/flame/pubs/blis6_toms_rev2.pdf)).\n\nWhat People Are Saying About BLIS\n---------------------------------\n\n*[\"I noticed a substantial increase in multithreaded performance on my own\nmachine, which was extremely satisfying.\"](https://groups.google.com/d/msg/blis-discuss/8iu9B5KCxpA/uftpjgIsBwAJ)* ... *[\"[I was] happy it worked so well!\"](https://groups.google.com/d/msg/blis-discuss/8iu9B5KCxpA/uftpjgIsBwAJ)* (Justin Shea)\n\n*[\"This is an awesome library.\"](https://github.com/flame/blis/issues/288#issuecomment-447488637)* ... 
*[\"I want to thank you and the blis team for your efforts.\"](https://github.com/flame/blis/issues/288#issuecomment-448074704)* ([@Lephar](https://github.com/Lephar))\n\n*[\"Any time somebody outside Intel beats MKL by a nontrivial amount, I report it to the MKL team. It is fantastic for any open-source project to get within 10% of MKL... [T]his is why Intel funds BLIS development.\"](https://github.com/flame/blis/issues/264#issuecomment-428673275)* ([@jeffhammond](https://github.com/jeffhammond))\n\n*[\"So BLIS is now a part of Elk.\"](https://github.com/flame/blis/issues/267#issuecomment-429303902)* ... *[\"We have found that zgemm applied to a 15000x15000 matrix with multi-threaded BLIS on a 32-core Ryzen 2990WX processor is about twice as fast as MKL\"](https://github.com/flame/blis/issues/264#issuecomment-428373946)* ... *[\"I'm starting to like this a lot.\"](https://github.com/flame/blis/issues/264#issuecomment-428926191)* ([@jdk2016](https://github.com/jdk2016))\n\n*[\"I [found] BLIS because I was looking for BLAS operations on C-ordered arrays for NumPy. BLIS has that, but even better is the fact that it's developed in the open using a more modern language than Fortran.\"](https://github.com/flame/blis/issues/254#issuecomment-423838345)* ([@nschloe](https://github.com/nschloe))\n\n*[\"The specific reason to have BLIS included [in Linux distributions] is the KNL and SKX [AVX-512] BLAS support, which OpenBLAS doesn't have.\"](https://github.com/flame/blis/issues/210#issuecomment-393126303)* ([@loveshack](https://github.com/loveshack))\n\n*[\"All tests pass without errors on OpenBSD. Thanks!\"](https://github.com/flame/blis/issues/202#issuecomment-389691543)* ([@ararslan](https://github.com/ararslan))\n\n*[\"Thank you very much for your great help!... 
Looking forward to benchmarking.\"](https://github.com/flame/blis/issues/180#issuecomment-375895449)* ([@mrader1248](https://github.com/mrader1248))\n\n*[\"Thanks for the beautiful work.\"](https://github.com/flame/blis/issues/163#issue-286575452)* ([@mmrmo](https://github.com/mmrmo))\n\n*[\"[M]y software currently uses BLIS for its BLAS interface...\"](https://github.com/flame/blis/issues/129#issuecomment-302904805)* ([@ShadenSmith](https://github.com/ShadenSmith))\n\n*[\"[T]hanks so much for your work on this! Excited to test.\"](https://github.com/flame/blis/issues/129#issuecomment-341565071)* ... *[\"[On AMD Excavator], BLIS is competitive to / slightly faster than OpenBLAS for dgemms in my tests.\"](https://github.com/flame/blis/issues/129#issuecomment-341608673)* ([@iotamudelta](https://github.com/iotamudelta))\n\n*[\"BLIS provided the only viable option on KNL, whose ecosystem is at present dominated by blackbox toolchains. Thanks again. Keep on this great work.\"](https://github.com/flame/blis/issues/116#issuecomment-281225101)* ([@heroxbd](https://github.com/heroxbd))\n\n*[\"I want to definitely try this out...\"](https://github.com/flame/blis/issues/12#issuecomment-48086295)* ([@ViralBShah](https://github.com/ViralBShah))\n\nKey Features\n------------\n\nBLIS offers several advantages over traditional BLAS libraries:\n\n * **Portability that doesn't impede high performance.** Portability was a top\npriority of ours when creating BLIS. With virtually no additional effort on the\npart of the developer, BLIS is configurable as a fully-functional reference\nimplementation. But more importantly, the framework identifies and isolates a\nkey set of computational kernels which, when optimized, immediately and\nautomatically optimize performance across virtually all level-2 and level-3\nBLIS operations. 
In this way, the framework acts as a productivity multiplier.\nAnd since the optimized (non-portable) code is compartmentalized within these\nfew kernels, instantiating a high-performance BLIS library on a new\narchitecture is a relatively straightforward endeavor.\n\n * **Generalized matrix storage.** The BLIS framework exports interfaces that\nallow one to specify both the row stride and column stride of a matrix. This\nallows one to compute with matrices stored in column-major order, row-major\norder, or by general stride. (This latter storage format is important for those\nseeking to implement tensor contractions on multidimensional arrays.)\nFurthermore, since BLIS tracks stride information for each matrix, operands of\ndifferent storage formats can be used within the same operation invocation. By\ncontrast, BLAS requires column-major storage. And while the CBLAS interface\nsupports row-major storage, it does not allow mixing storage formats.\n\n * **Rich support for the complex domain.** BLIS operations are developed and\nexpressed in their most general form, which is typically in the complex domain.\nThese formulations then simplify elegantly down to the real domain, with\nconjugations becoming no-ops. Unlike the BLAS, all input operands in BLIS that\nallow transposition and conjugate-transposition also support conjugation\n(without transposition), which obviates the need for thread-unsafe workarounds.\nAlso, where applicable, both complex symmetric and complex Hermitian forms are\nsupported. (BLAS omits some complex symmetric operations, such as `symv`,\n`syr`, and `syr2`.) Another great example of BLIS serving as a portability\nlever is its implementation of the 1m method for complex matrix multiplication,\na novel mechanism of providing high-performance complex level-3 operations using\nonly real domain microkernels. 
This new innovation guarantees automatic level-3\nsupport in the complex domain even when the kernel developers entirely forgo\nwriting complex kernels.\n\n * **Advanced multithreading support.** BLIS allows multiple levels of\nsymmetric multithreading for nearly all level-3 operations. (Currently, users\nmay choose to obtain parallelism via OpenMP, POSIX threads, or HPX). This\nmeans that matrices may be partitioned in multiple dimensions simultaneously to\nattain scalable, high-performance parallelism on multicore and many-core\narchitectures. The key to this innovation is a thread-specific control tree\ninfrastructure which encodes information about the logical thread topology and\nallows threads to query and communicate data amongst one another. BLIS also\nemploys so-called \"quadratic partitioning\" when computing dimension sub-ranges\nfor each thread, so that arbitrary diagonal offsets of structured matrices with\nunreferenced regions are taken into account to achieve proper load balance.\nMore recently, BLIS introduced a runtime abstraction to specify parallelism on\na per-call basis, which is useful for applications that want to handle most of\nthe parallelism.\n\n * **Ease of use.** The BLIS framework, and the library of routines it\ngenerates, are easy to use for end users, experts, and vendors alike. An\noptional BLAS compatibility layer provides application developers with\nbackwards compatibility to existing BLAS-dependent codes. Or, one may adjust or\nwrite their application to take advantage of new BLIS functionality (such as\ngeneralized storage formats or additional complex operations) by calling one\nof BLIS's native APIs directly. BLIS's typed API will feel familiar to many\nveterans of BLAS since these interfaces use BLAS-like calling sequences. And\nmany will find BLIS's object-based APIs a delight to use when customizing\nor writing their own BLIS operations. 
(Objects are relatively lightweight\n`structs` and passed by address, which helps tame function calling overhead.)\n\n * **Multilayered API and exposed kernels.** The BLIS framework exposes its\nimplementations in various layers, allowing expert developers to access exactly\nthe functionality desired. This layered interface includes that of the\nlowest-level kernels, for those who wish to bypass the bulk of the framework.\nOptimizations can occur at various levels, in part thanks to exposed packing\nand unpacking facilities, which by default are highly parameterized and\nflexible.\n\n * **Functionality that grows with the community's needs.** As its name\nsuggests, the BLIS framework is not a single library or static API, but rather\na nearly-complete template for instantiating high-performance BLAS-like\nlibraries. Furthermore, the framework is extensible, allowing developers to\nleverage existing components to support new operations as they are identified.\nIf such operations require new kernels for optimal efficiency, the framework\nand its APIs will be adjusted and extended accordingly. Community developers\nwho wish to experiment with creating new operations or APIs in BLIS can quickly\nand easily do so via the [Addons](docs/Addons.md) feature.\n\n * **Code re-use.** Auto-generation approaches to achieving the aforementioned\ngoals tend to quickly lead to code bloat due to the multiple dimensions of\nvariation supported: operation (i.e. `gemm`, `herk`, `trmm`, etc.); parameter\ncase (i.e. side, [conjugate-]transposition, upper/lower storage, unit/non-unit\ndiagonal); datatype (i.e. single-/double-precision real/complex); matrix\nstorage (i.e. row-major, column-major, generalized); and algorithm (i.e.\npartitioning path and kernel shape). These \"brute force\" approaches often\nconsider and optimize each operation or case combination in isolation, which is\nless than ideal when the goal is to provide entire libraries. 
BLIS was designed\nto be a complete framework for implementing basic linear algebra operations,\nbut supporting this vast amount of functionality in a manageable way required a\nholistic design that employed careful abstractions, layering, and recycling of\ngeneric (highly parameterized) codes, subject to the constraint that high\nperformance remain attainable.\n\n * **A foundation for mixed domain and/or mixed precision operations.** BLIS\nwas designed with the hope of one day allowing computation on real and complex\noperands within the same operation. Similarly, we wanted to allow mixing\noperands' numerical domains, floating-point precisions, or both domain and\nprecision, and to optionally compute in a precision different than one or both\noperands' storage precisions. This feature has been implemented for the general\nmatrix multiplication (`gemm`) operation, providing 128 different possible type\ncombinations, which, when combined with existing transposition, conjugation,\nand storage parameters, enables 55,296 different `gemm` use cases. For more\ndetails, please see the documentation on [mixed datatype](docs/MixedDatatypes.md)\nsupport and/or our [ACM TOMS](https://toms.acm.org/) journal paper on\nmixed-domain/mixed-precision `gemm` ([linked below](#citations)).\n\nHow to Download BLIS\n--------------------\n\nThere are a few ways to download BLIS. We list the most common four ways below.\nWe **highly recommend** using either Option 1 or 2. Otherwise, we recommend\nOption 3 (over Option 4) so your compiler can perform optimizations specific\nto your hardware.\n\n1. 
**Download a source repository with `git clone`.**\nGenerally speaking, we prefer using `git clone` to clone a `git` repository.\nHaving a repository allows the user to periodically pull in the latest changes,\ntry out release candidates when they become available, switch to older versions\neasily, and quickly rebuild BLIS whenever they wish.\n(Note that implicit in cloning a repository is that the repository defaults to\nusing the `master` branch, which, as of 1.0, is considered akin to a development\nbranch and likely contains improvements since the most recent release.)\n\n   In order to clone a `git` repository of BLIS, please obtain a repository\nURL by clicking on the green button above the file/directory listing near the\ntop of this page (as rendered by GitHub). Generally speaking, it will amount\nto executing the following command in your terminal shell:\n   ```\n   git clone https://github.com/flame/blis.git\n   ```\n   At this point, you will have the latest commit of the `master` branch\nchecked out. If you wish to check out an official release version, say,\n1.0, execute the following:\n   ```\n   git checkout 1.0\n   ```\n   `git` will then transform your working copy to match the state of the\ncommit associated with version 1.0. You can view a list of official\nversion tags at any time by executing:\n   ```\n   git tag --list\n   ```\n   Note that pre-release versions, such as release candidates, are actually\nbranches rather than tags, and thus will not show up in the list of tagged\nversions.\n\n2. **Download a source release via a tarball/zip file.**\nIf you would like to stick to the code that is included in official releases\nand don't need the convenience of pulling in the latest changes via `git`, you\nmay download either a tarball or zip file of BLIS's latest\n[release](https://github.com/flame/blis/releases). 
(NOTE: Some older releases\nare only available as [tagged](https://github.com/flame/blis/tags) commits.\nAlso note that downloading release x.y.z is equivalent to downloading, or\nchecking out, the `git` tag `x.y.z`.)\nWe consider this option to be less than ideal for some people since you will\nnot be able to update your code with a simple `git pull` command.\n\n3. **Download a source repository via a zip file.**\nIf you are uncomfortable with using `git` but would still like the latest\nstable commits, we recommend that you download BLIS as a zip file.\n\n   In order to download a zip file of the BLIS source distribution, please\nclick on the green button above the file listing near the top of this page.\nThis should reveal a link for downloading the zip file.\n\n4. **Download a binary package specific to your OS.**\nWhile we don't recommend this as the first choice for most users, we provide\nlinks to community members who generously maintain BLIS packages for various\nLinux distributions such as Debian Unstable and EPEL/Fedora. 
Please see the\n[External Packages](#external-packages) section below for more information.\n\nGetting Started\n---------------\n\n*NOTE: This section assumes you've either cloned a BLIS source code repository\nvia `git`, downloaded the latest source code via a zip file, or downloaded the\nsource code for a tagged version release---Options 1, 3, or 2, respectively,\nas discussed in [the previous section](#how-to-download-blis).*\n\nIf you just want to build a sequential (not parallelized) version of BLIS\nin a hurry and come back and explore other topics later, you can configure\nand build BLIS as follows:\n```\n$ ./configure auto\n$ make [-j]\n```\nYou can then verify your build by running BLAS- and BLIS-specific test\ndrivers via `make check`:\n```\n$ make check [-j]\n```\nAnd if you would like to install BLIS to the directory specified to `configure`\nvia the `--prefix` option, run the `install` target:\n```\n$ make install\n```\nPlease read the output of `./configure --help` for a full list of configure-time\noptions.\nIf/when you have time, we *strongly* encourage you to read the detailed\nwalkthrough of the build system found in our [Build System](docs/BuildSystem.md)\nguide.\n\nIf you are still having trouble, you are welcome to [join us on Discord](docs/Discord.md)\nfor further information and/or assistance.\n\nExample Code\n------------\n\nThe BLIS source distribution provides example code in the `examples` directory.\nExample code focuses on using BLIS APIs (not BLAS or CBLAS), and resides in\ntwo subdirectories: [examples/oapi](examples/oapi) (which demonstrates the\n[object API](docs/BLISObjectAPI.md)) and [examples/tapi](examples/tapi) (which\ndemonstrates the [typed API](docs/BLISTypedAPI.md)).\n\nEach directory contains several files, each containing various pieces of\ncode that exercise core functionality of the BLIS API in question (object or\ntyped). 
These example files should be thought of collectively like a tutorial,\nand therefore it is recommended to start from the beginning (the file that\nstarts in `00`).\n\nYou can build all of the examples by simply running `make` from either example\nsubdirectory (`examples/oapi` or `examples/tapi`). (You can also run\n`make clean`.) The local `Makefile` assumes that you've already configured and\nbuilt (but not necessarily installed) BLIS two directories up, in `../..`. If\nyou have already installed BLIS to some permanent directory, you may refer to\nthat installation by setting the environment variable `BLIS_INSTALL_PATH` prior\nto running make:\n```\nexport BLIS_INSTALL_PATH=/usr/local; make\n```\nor by setting the same variable as part of the make command:\n```\nmake BLIS_INSTALL_PATH=/usr/local\n```\n**Once the executable files have been built, we recommend reading the code and\nthe corresponding executable output side by side. This will help you see the\neffects of each section of code.**\n\nThis tutorial is not exhaustive or complete; several object API functions were\nomitted (mostly for brevity's sake) and thus more examples could be written.\n\nDocumentation\n-------------\n\nWe provide extensive documentation on the BLIS build system, APIs, test\ninfrastructure, and other important topics. All documentation is formatted in\nmarkdown and included in the BLIS source distribution (usually in the `docs`\ndirectory). 
Slightly longer descriptions of each document may be found in\nthe project's [wiki](https://github.com/flame/blis/wiki) section.\n\n**Documents for everyone:**\n\n * **[Build System](docs/BuildSystem.md).** This document covers the basics of\nconfiguring and building BLIS libraries, as well as related topics.\n\n * **[Testsuite](docs/Testsuite.md).** This document describes how to run\nBLIS's highly parameterized and configurable test suite, as well as the\nincluded BLAS test drivers.\n\n * **[BLIS Typed API Reference](docs/BLISTypedAPI.md).** Here we document the\nso-called \"typed\" (or BLAS-like) API. This is the API that many users who are\nalready familiar with the BLAS will likely want to use.\n\n * **[BLIS Object API Reference](docs/BLISObjectAPI.md).** Here we document\nthe object API. This API abstracts away properties of vectors and matrices\nwithin `obj_t` structs that can be queried with accessor functions. Many\ndevelopers and experts prefer this API over the typed API.\n\n * **[Hardware Support](docs/HardwareSupport.md).** This document maintains a\ntable of supported microarchitectures.\n\n * **[Multithreading](docs/Multithreading.md).** This document describes how to\nuse the multithreading features of BLIS.\n\n * **[Mixed-Datatypes](docs/MixedDatatypes.md).** This document provides an\noverview of BLIS's mixed-datatype functionality and provides a brief example\nof how to take advantage of this new code.\n\n * **[Extending BLIS functionality](docs/PluginHowTo.md).** This document provides an\noverview of BLIS's mechanisms for extending functionality through user-defined code.\nBLIS has a plugin infrastructure which allows users to define their own kernels,\nblocksizes, and kernel preferences which are compiled and managed by the BLIS framework.\nBLIS also provides an API for modifying the \"control tree\" which can be used to\nimplement novel linear algebra operations.\n\n * **[Performance](docs/Performance.md).** This document reports 
empirically\nmeasured performance of a representative set of level-3 operations on a variety\nof hardware architectures, as implemented within BLIS and other BLAS libraries\nfor all four of the standard floating-point datatypes.\n\n * **[PerformanceSmall](docs/PerformanceSmall.md).** This document reports\nempirically measured performance of `gemm` on select hardware architectures\nwithin BLIS and other BLAS libraries when performing matrix problems where one\nor two dimensions are exceedingly small.\n\n * **[Discord](docs/Discord.md).** This document describes how to: create an\naccount on Discord (if you don't already have one); obtain a private invite\nlink; and use that invite link to join our BLIS server on Discord.\n\n * **[Release Notes](docs/ReleaseNotes.md).** This document tracks a summary of\nchanges included with each new version of BLIS, along with contributor credits\nfor key features.\n\n * **[Frequently Asked Questions](docs/FAQ.md).** If you have general questions\nabout BLIS, please read this FAQ. If you can't find the answer to your question,\nplease feel free to join the [blis-devel](https://groups.google.com/group/blis-devel)\nmailing list and post a question. We also have a\n[blis-discuss](https://groups.google.com/group/blis-discuss) mailing list that\nanyone can post to (even without joining).\n\n**Documents for GitHub contributors:**\n\n * **[Contributing bug reports, feature requests, PRs, etc](CONTRIBUTING.md).**\nInterested in contributing to BLIS? Please read this document before getting\nstarted. 
It provides a general overview of how best to report bugs, propose new\nfeatures, and offer code patches.\n\n * **[Coding Conventions](docs/CodingConventions.md).** If you are interested or\nplanning on contributing code to BLIS, please read this document so that you can\nformat your code in accordance with BLIS's standards.\n\n**Documents for BLIS developers:**\n\n * **[Kernels Guide](docs/KernelsHowTo.md).** If you would like to learn more\nabout the types of kernels that BLIS exposes, their semantics, the operations\nthat each kernel accelerates, and various implementation issues, please read\nthis guide.\n\n * **[Configuration Guide](docs/ConfigurationHowTo.md).** If you would like to\nlearn how to add new sub-configurations or configuration families, or are simply\ninterested in learning how BLIS organizes its configurations and kernel sets,\nplease read this thorough walkthrough of the configuration system.\n\n * **[Addon Guide](docs/Addons.md).** If you are interested in learning\nabout using BLIS addons--that is, enabling existing (or creating new) bundles\nof operation or API code that are built into a BLIS library--please read this\ndocument.\n\n * **[Sandbox Guide](docs/Sandboxes.md).** If you are interested in learning\nabout using sandboxes in BLIS--that is, providing alternative implementations\nof the `gemm` operation--please read this document.\n\nPerformance\n-----------\n\nWe provide graphs that report performance of several implementations across a\nrange of hardware types, multithreading configurations, problem sizes,\noperations, and datatypes. 
These pages also document most of the details needed\nto reproduce these experiments.\n\n * **[Performance](docs/Performance.md).** This document reports empirically\nmeasured performance of a representative set of level-3 operations on a variety\nof hardware architectures, as implemented within BLIS and other BLAS libraries\nfor all four of the standard floating-point datatypes.\n\n * **[PerformanceSmall](docs/PerformanceSmall.md).** This document reports\nempirically measured performance of `gemm` on select hardware architectures\nwithin BLIS and other BLAS libraries when performing matrix problems where one\nor two dimensions are exceedingly small.\n\nExternal Packages\n-----------------\n\nGenerally speaking, we **highly recommend** building from source whenever\npossible using the latest `git` clone. (Tarballs of each\n[tagged release](https://github.com/flame/blis/releases) are also available, but\nwe consider them to be less ideal since they are not as easy to upgrade as\n`git` clones.)\n\nThat said, some users may prefer binary and/or source packages through their\nLinux distribution. Thanks to generous involvement/contributions from our\ncommunity members, the following BLIS packages are now available:\n\n * **Debian**. [M. Zhou](https://github.com/cdluminate) has volunteered to\nsponsor and maintain BLIS packages within the Debian Linux distribution. The\nDebian package tracker can be found [here](https://tracker.debian.org/pkg/blis).\n(Also, thanks to [Nico Schlömer](https://github.com/nschloe) for previously\nvolunteering his time to set up a standalone PPA.)\n\n * **Gentoo**. [M. Zhou](https://github.com/cdluminate) also maintains the\n[BLIS package](https://packages.gentoo.org/packages/sci-libs/blis) entry for\n[Gentoo](https://www.gentoo.org/), a Linux distribution known for its\nsource-based [portage](https://wiki.gentoo.org/wiki/Portage) package manager\nand distribution system.\n\n * **EPEL/Fedora**. 
There are official BLIS packages in Fedora and EPEL (for\nRHEL7+ and compatible distributions) with versions for 64-bit integers, OpenMP,\nand pthreads, and shims which can be dynamically linked instead of reference\nBLAS. (NOTE: For architectures other than intel64, amd64, and maybe arm64, the\nperformance of packaged BLIS will be low because it uses unoptimized generic\nkernels; for those architectures, [OpenBLAS](https://github.com/xianyi/OpenBLAS)\nmay be a better solution.) [Dave\nLove](https://github.com/loveshack) provides additional packages for EPEL6 in a\n[Fedora Copr](https://copr.fedorainfracloud.org/coprs/loveshack/blis/), and\npossibly versions more recent than the official repo for other EPEL/Fedora\nreleases. The source packages may build on other rpm-based distributions.\n\n * **OpenSuSE**. The copr referred to above has rpms for some OpenSuSE releases;\nthe source rpms may build for others.\n\n * **GNU Guix**. Guix has BLIS packages, but provides builds only for the generic\ntarget and some specific `x86_64` micro-architectures.\n\n * **Conda**. The conda channel [conda-forge](https://github.com/conda-forge/blis-feedstock)\nhas Linux, OSX and Windows binary packages for `x86_64`.\n\nDiscussion\n----------\n\nMost of the active discussions are now happening on our [Discord](https://discord.com/)\nserver. Users and developers alike are welcome! Please see the\n[BLIS Discord guide](docs/Discord.md) for a walkthrough of how to join us.\n\nYou can also still stay in touch by using either of the following mailing lists:\n\n * [blis-devel](https://groups.google.com/group/blis-devel): Please join and\npost to this mailing list if you are a BLIS developer, or if you are trying\nto use BLIS beyond simply linking to it as a BLAS library.\n\n * [blis-discuss](https://groups.google.com/group/blis-discuss): Please join and\npost to this mailing list if you have general questions or feedback regarding\nBLIS. 
Application developers (end users) may wish to post here, unless they\nhave bug reports, in which case they should open a\n[new issue](https://github.com/flame/blis/issues) on github.\n\nContributing\n------------\n\nFor information on how to contribute to our project, including preferred\n[coding conventions](docs/CodingConventions.md), please refer to the\n[CONTRIBUTING](CONTRIBUTING.md) file at the top-level of the BLIS source\ndistribution.\n\nCitations\n---------\n\nFor those of you looking for the appropriate article to cite regarding BLIS, we\nrecommend citing our\n[first ACM TOMS journal paper](https://dl.acm.org/doi/10.1145/2764454?cid=81314495332)\n([unofficial backup link](https://www.cs.utexas.edu/users/flame/pubs/blis1_toms_rev3.pdf)):\n\n```\n@article{BLIS1,\n   author      = {Field G. {V}an~{Z}ee and Robert A. {v}an~{d}e~{G}eijn},\n   title       = {{BLIS}: A Framework for Rapidly Instantiating {BLAS} Functionality},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {41},\n   number      = {3},\n   pages       = {14:1--14:33},\n   month       = {June},\n   year        = {2015},\n   issue_date  = {June 2015},\n   url         = {https://doi.acm.org/10.1145/2764454},\n}\n```\n\nYou may also cite the\n[second ACM TOMS journal paper](https://dl.acm.org/doi/10.1145/2755561?cid=81314495332)\n([unofficial backup link](https://www.cs.utexas.edu/users/flame/pubs/blis2_toms_rev3.pdf)):\n\n```\n@article{BLIS2,\n   author      = {Field G. {V}an~{Z}ee and Tyler Smith and Francisco D. Igual and\n                  Mikhail Smelyanskiy and Xianyi Zhang and Michael Kistler and Vernon Austel and\n                  John Gunnels and Tze Meng Low and Bryan Marker and Lee Killough and\n                  Robert A. 
{v}an~{d}e~{G}eijn},\n   title       = {The {BLIS} Framework: Experiments in Portability},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {42},\n   number      = {2},\n   pages       = {12:1--12:19},\n   month       = {June},\n   year        = {2016},\n   issue_date  = {June 2016},\n   url         = {https://doi.acm.org/10.1145/2755561},\n}\n```\n\nWe also have a third paper, submitted to IPDPS 2014, on achieving\n[multithreaded parallelism in BLIS](https://dl.acm.org/doi/10.1109/IPDPS.2014.110)\n([unofficial backup link](https://www.cs.utexas.edu/users/flame/pubs/blis3_ipdps14.pdf)):\n\n```\n@inproceedings{BLIS3,\n   author      = {Tyler M. Smith and Robert A. {v}an~{d}e~{G}eijn and Mikhail Smelyanskiy and\n                  Jeff R. Hammond and Field G. {V}an~{Z}ee},\n   title       = {Anatomy of High-Performance Many-Threaded Matrix Multiplication},\n   booktitle   = {28th IEEE International Parallel \\\u0026 Distributed Processing Symposium\n                  (IPDPS 2014)},\n   year        = {2014},\n   url         = {https://doi.org/10.1109/IPDPS.2014.110},\n}\n```\n\nA fourth paper, submitted to ACM TOMS, also exists, which proposes an\n[analytical model](https://dl.acm.org/doi/10.1145/2925987)\nfor determining blocksize parameters in BLIS\n([unofficial backup link](https://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf)):\n\n```\n@article{BLIS4,\n   author      = {Tze Meng Low and Francisco D. Igual and Tyler M. Smith and\n                  Enrique S. 
Quintana-Ort\\'{\\i}},\n   title       = {Analytical Modeling Is Enough for High-Performance {BLIS}},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {43},\n   number      = {2},\n   pages       = {12:1--12:18},\n   month       = {August},\n   year        = {2016},\n   issue_date  = {August 2016},\n   url         = {https://doi.acm.org/10.1145/2925987},\n}\n```\n\nA fifth paper, submitted to ACM TOMS, begins the study of so-called\n[induced methods for complex matrix multiplication](https://dl.acm.org/doi/10.1145/3086466?cid=81314495332)\n([unofficial backup link](https://www.cs.utexas.edu/users/flame/pubs/blis5_toms_rev2.pdf)):\n\n```\n@article{BLIS5,\n   author      = {Field G. {V}an~{Z}ee and Tyler Smith},\n   title       = {Implementing High-performance Complex Matrix Multiplication via the 3m and 4m Methods},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {44},\n   number      = {1},\n   pages       = {7:1--7:36},\n   month       = {July},\n   year        = {2017},\n   issue_date  = {July 2017},\n   url         = {https://doi.acm.org/10.1145/3086466},\n}\n```\n\nA sixth paper, submitted to the SIAM Journal on Scientific Computing, revisits\nthe topic of the previous article and derives a\n[superior induced method](https://epubs.siam.org/doi/10.1137/19M1282040)\n([unofficial backup link](https://www.cs.utexas.edu/users/flame/pubs/blis6_sisc_rev3.pdf)):\n\n```\n@article{BLIS6,\n   author      = {Field G. 
{V}an~{Z}ee},\n   title       = {Implementing High-Performance Complex Matrix Multiplication via the 1m Method},\n   journal     = {SIAM Journal on Scientific Computing},\n   volume      = {42},\n   number      = {5},\n   pages       = {C221--C244},\n   month       = {September},\n   year        = {2020},\n   issue_date  = {September 2020},\n   url         = {https://doi.org/10.1137/19M1282040}\n}\n```\n\nA seventh paper, submitted to ACM TOMS, explores the implementation of `gemm` for\n[mixed-domain and/or mixed-precision](https://dl.acm.org/doi/10.1145/3402225?cid=81314495332) operands\n([unofficial backup link](https://www.cs.utexas.edu/users/flame/pubs/blis7_toms_rev0.pdf)):\n\n```\n@article{BLIS7,\n   author      = {Field G. {V}an~{Z}ee and Devangi N. Parikh and Robert A. {v}an~{d}e~{G}eijn},\n   title       = {Supporting Mixed-domain Mixed-precision Matrix Multiplication\nwithin the BLIS Framework},\n   journal     = {ACM Transactions on Mathematical Software},\n   volume      = {47},\n   number      = {2},\n   pages       = {12:1--12:26},\n   month       = {April},\n   year        = {2021},\n   issue_date  = {April 2021},\n   url         = {https://doi.org/10.1145/3402225},\n}\n```\n\nAwards\n------\n\n * **[2023 James H. Wilkinson Prize for Numerical Software.](https://www.siam.org/prizes-recognition/major-prizes-lectures/detail/james-h-wilkinson-prize-for-numerical-software)**\nThis prize is awarded once every four years to the authors of an outstanding piece of\nnumerical software, or to individuals who have made an outstanding contribution to an\nexisting piece of numerical software. 
The selection committee sought to recognize the\nrecipients \"for the development of [BLIS](https://github.com/flame/blis), a portable\nopen-source software framework that facilitates rapid instantiation of\nhigh-performance BLAS and BLAS-like operations targeting modern CPUs.\" The prize will\nbe awarded at the\n[2023 SIAM Conference on Computational Science and Engineering](https://www.siam.org/conferences/cm/conference/cse23) in Amsterdam.\n\n * **[2020 SIAM Activity Group on Supercomputing Best Paper Prize.](https://www.siam.org/prizes-recognition/activity-group-prizes/detail/siag-sc-best-paper-prize)**\nThis prize is awarded once every two years to the authors of the most outstanding\npaper, as determined by the selection committee, in the field of parallel scientific\nand engineering computing published within the four calendar years preceding the\naward year. The prize was given for the paper [\"The BLIS Framework: Experiments in\nPortability.\"](#citations) and awarded at the [2020 SIAM Conference on Parallel Processing for Scientific Computing](https://www.siam.org/conferences/cm/conference/pp20) in Seattle, where Robert van de Geijn delivered [a talk on BLIS](https://meetings.siam.org/sess/dsp_programsess.cfm?SESSIONCODE=68266) and accepted the prize alongside other coauthors.\nSee also:\n   * [SIAM News | January 2020 Prize Spotlight](https://sinews.siam.org/Details-Page/january-2020-prize-spotlight#Field\u0026Robert)\n   * [Oden Institute's SHPC Group Win SIAM Best Paper Prize](https://www.oden.utexas.edu/about/news/ScienceHighPerfomanceComputingSIAMBestPaperPrize/)\n\nFunding\n-------\n\nThis project and its associated research were partially sponsored by grants from\n[Microsoft](https://www.microsoft.com/),\n[Intel](https://www.intel.com/),\n[Texas 
Instruments](https://www.ti.com/),\n[AMD](https://www.amd.com/),\n[HPE](https://www.hpe.com/),\n[Oracle](https://www.oracle.com/),\n[Huawei](https://www.huawei.com/),\n[Facebook](https://www.facebook.com/),\nand\n[ARM](https://www.arm.com/),\nas well as grants from the\n[National Science Foundation](https://www.nsf.gov/) (Awards\nCCF-0917167, ACI-1148125/1340293, CCF-1320112, and ACI-1550493).\n\n_Any opinions, findings and conclusions or recommendations expressed in this\nmaterial are those of the author(s) and do not necessarily reflect the views of\nthe National Science Foundation (NSF)._\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflame%2Fblis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflame%2Fblis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflame%2Fblis/lists"}