{"id":13435664,"url":"https://github.com/flame/how-to-optimize-gemm","last_synced_at":"2025-05-15T14:08:44.093Z","repository":{"id":10990534,"uuid":"65327876","full_name":"flame/how-to-optimize-gemm","owner":"flame","description":null,"archived":false,"fork":false,"pushed_at":"2023-07-29T07:16:04.000Z","size":2291,"stargazers_count":1855,"open_issues_count":9,"forks_count":356,"subscribers_count":43,"default_branch":"master","last_synced_at":"2025-04-07T20:08:29.517Z","etag":null,"topics":["blis","code-optimization","gemm","gotoblas","matrix-multiplication"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flame.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-08-09T20:59:23.000Z","updated_at":"2025-04-07T06:01:57.000Z","dependencies_parsed_at":"2022-07-10T00:46:30.726Z","dependency_job_id":"791ce1f1-31de-4dd7-93be-96d38a378574","html_url":"https://github.com/flame/how-to-optimize-gemm","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flame%2Fhow-to-optimize-gemm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flame%2Fhow-to-optimize-gemm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flame%2Fhow-to-optimize-gemm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flame%2Fhow-to-optimize-gemm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/flame","download_url":"https://codeload.github.com/flame/how-to-optimize
-gemm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254355335,"owners_count":22057354,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["blis","code-optimization","gemm","gotoblas","matrix-multiplication"],"created_at":"2024-07-31T03:00:37.874Z","updated_at":"2025-05-15T14:08:39.079Z","avatar_url":"https://github.com/flame.png","language":"C","funding_links":[],"categories":["C","Learning Resources","General Optimization Techniques 🚀"],"sub_categories":[],"readme":"# How To Optimize Gemm wiki pages\nhttps://github.com/flame/how-to-optimize-gemm/wiki\n\nCopyright by Prof. 
Robert van de Geijn (rvdg@cs.utexas.edu).\n\nAdapted to GitHub Markdown Wiki by Jianyu Huang (jianyu@cs.utexas.edu).\n\n# Table of contents\n\n  * [The GotoBLAS/BLIS Approach to Optimizing Matrix-Matrix Multiplication - Step-by-Step](../../wiki#the-gotoblasblis-approach-to-optimizing-matrix-matrix-multiplication---step-by-step)\n  * [NOTICE ON ACADEMIC HONESTY](../../wiki#notice-on-academic-honesty)\n  * [References](../../wiki#references)\n  * [Set Up](../../wiki#set-up)\n  * [Step-by-step optimizations](../../wiki#step-by-step-optimizations)\n  * [Computing four elements of C at a time](../../wiki#computing-four-elements-of-c-at-a-time)\n    * [Hiding computation in a subroutine](../../wiki#hiding-computation-in-a-subroutine)\n    * [Computing four elements at a time](../../wiki#computing-four-elements-at-a-time)\n    * [Further optimizing](../../wiki#further-optimizing)\n  * [Computing a 4 x 4 block of C at a time](../../wiki#computing-a-4-x-4-block-of-c-at-a-time)\n    * [Repeating the same optimizations](../../wiki#repeating-the-same-optimizations)\n    * [Further optimizing](../../wiki#further-optimizing-1)\n    * [Blocking to maintain performance](../../wiki#blocking-to-maintain-performance)\n    * [Packing into contiguous memory](../../wiki#packing-into-contiguous-memory)\n  * [Acknowledgement](../../wiki#acknowledgement)\n\n# Related Links\n* [BLISlab: A Sandbox for Optimizing GEMM](https://github.com/flame/blislab)\n* [GEMM: From Pure C to SSE Optimized Micro Kernels](http://apfel.mathematik.uni-ulm.de/~lehn/sghpc/gemm/)\n\n# Acknowledgement\nThis material was partially sponsored by grants from the National Science Foundation (Awards ACI-1148125/1340293 and ACI-1550493).\n\n_Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation 
(NSF)._\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflame%2Fhow-to-optimize-gemm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflame%2Fhow-to-optimize-gemm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflame%2Fhow-to-optimize-gemm/lists"}