https://github.com/pytorch/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
- Host: GitHub
- URL: https://github.com/pytorch/FBGEMM
- Owner: pytorch
- License: other
- Created: 2018-09-24T19:07:42.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2025-03-17T11:35:20.000Z (about 1 year ago)
- Last Synced: 2025-03-17T11:54:02.599Z (about 1 year ago)
- Language: C++
- Homepage:
- Size: 24.7 MB
- Stars: 1,274
- Watchers: 63
- Forks: 550
- Open Issues: 452
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-list - FBGEMM - A low-precision, high-performance matrix-matrix multiplication and convolution library for server-side inference. (Linear Algebra / Statistics Toolkit / General Purpose Tensor Library)
- StarryDivineSky - pytorch/FBGEMM - FBGEMM (Facebook GEneral Matrix Multiplication) is a matrix-computation library optimized for deep learning, focused on improving the efficiency of sparse-dense matrix multiplication, particularly in recommendation systems, natural language processing, and similar scenarios. Implemented with highly optimized low-level code, it supports both CPU and GPU platforms and accelerates computation with techniques such as vectorization and tiling, while dedicated kernels for sparse data significantly reduce memory footprint and compute cost. FBGEMM is compatible with the PyTorch framework and provides features such as quantization-aware training (QAT) and 8-bit integer / half-precision floating point (FP16), helping developers speed up inference while preserving model accuracy. Its core strengths include deep integration with modern CPU instruction sets (e.g. AVX2) and CUDA acceleration, support for dynamic sparse-matrix compression formats (e.g. COO, CSR), and an adaptive scheduling mechanism that automatically selects the optimal compute path. The project also includes efficient matrix transpose and embedding operations (e.g. embedding lookup), suitable for training and deploying models with large parameter counts. Developers can install prebuilt packages or build from source; the documentation covers installation guides, performance-tuning advice, and PyTorch integration examples. Its open-source nature makes FBGEMM an important tool in both research and industry for optimizing deep-learning model performance, especially in scenarios involving high-dimensional sparse data. (Other: Machine Learning & Deep Learning)
- awesome-gemm - FBGEMM: Meta's CPU GEMM for optimized server inference (BSD 3-Clause) (Libraries 🗂️ / CPU Libraries 💻)
README
# FBGEMM
[FBGEMM CI](https://github.com/pytorch/FBGEMM/actions/workflows/fbgemm_ci.yml)
FBGEMM (Facebook GEneral Matrix Multiplication) is a low-precision,
high-performance matrix-matrix multiplication and convolution library for
server-side inference.
The library provides efficient low-precision general matrix multiplication for
small batch sizes and supports accuracy-loss-minimizing techniques such as
row-wise quantization and outlier-aware quantization. FBGEMM also exploits
fusion opportunities to overcome the unique challenges of low-precision matrix
multiplication, whose operations are often bandwidth-bound.
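To make the row-wise quantization idea concrete, here is a minimal pure-Python sketch (not FBGEMM's actual API or kernels): each row gets its own scale factor, so rows with very different magnitudes do not have to share one tensor-wide scale.

```python
# Illustrative sketch of symmetric row-wise int8 quantization.
# This is NOT FBGEMM's API; it only demonstrates the technique.

def quantize_row(row):
    """Quantize one row to int8 range [-127, 127] with a per-row scale."""
    max_abs = max(abs(v) for v in row) or 1.0
    scale = max_abs / 127.0                 # one scale per row
    q = [round(v / scale) for v in row]
    return q, scale

def quantize_rowwise(matrix):
    """Quantize each row independently, keeping one scale per row."""
    qrows, scales = [], []
    for row in matrix:
        q, s = quantize_row(row)
        qrows.append(q)
        scales.append(s)
    return qrows, scales

def dequantize_rowwise(qrows, scales):
    """Recover approximate float values from int8 rows and their scales."""
    return [[q * s for q in row] for row, s in zip(qrows, scales)]
```

Because each row is scaled independently, a row of small values (e.g. around 0.1) keeps low quantization error even when another row contains values around 100, which is exactly the case a single global scale handles poorly.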
FBGEMM is used as a backend for PyTorch quantized operators on x86 machines:
* PyTorch: https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native/quantized/cpu
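The core computational pattern behind such quantized operators can be sketched as follows: int8 inputs are multiplied with wide (32-bit) integer accumulation, and only the final output is rescaled back to float. This is a hypothetical pure-Python illustration of that pattern, not FBGEMM's actual kernel code.

```python
# Sketch of a low-precision GEMM: int8 inputs, integer accumulation,
# float rescale on output. Illustrative only, not FBGEMM's kernels.

def int8_gemm(a_q, b_q, a_scale, b_scale):
    """Compute C = (a_scale * b_scale) * (A_q @ B_q).

    a_q: rows x inner list of int8 values, b_q: inner x cols.
    Accumulation stays in integers (int32 in real kernels) so no
    precision is lost before the final dequantization step.
    """
    rows, inner, cols = len(a_q), len(b_q), len(b_q[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0  # wide integer accumulator
            for k in range(inner):
                acc += a_q[i][k] * b_q[k][j]
            out[i][j] = acc * a_scale * b_scale  # dequantize output
    return out
```

Keeping the inner loop in integer arithmetic is what makes low-precision GEMM fast on CPUs: int8 multiplies with int32 accumulation map directly onto vector instructions such as AVX2's, while the float rescale happens only once per output element.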
See the full [Documentation](https://pytorch.org/FBGEMM) for more information
on building, installing, and developing with FBGEMM, as well as the most
up-to-date support matrix and API documentation for this library.
### What's New?
* [New Features and Recent Improvements](https://github.com/pytorch/FBGEMM/wiki/Recent-feature-additions-and-improvements-in-FBGEMM) (January 2020)
### Citation
For a high-level overview, the design philosophy, and brief descriptions of various
parts of FBGEMM, please see [our blog post](https://code.fb.com/ml-applications/fbgemm).
For those looking for the appropriate article to cite regarding FBGEMM, we
recommend citing our [paper](https://arxiv.org/pdf/2101.05615.pdf):
```bibtex
@article{fbgemm,
  title={FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference},
  author={Khudia, Daya and Huang, Jianyu and Basu, Protonu and Deng, Summer and Liu, Haixin and Park, Jongsoo and Smelyanskiy, Mikhail},
  journal={arXiv preprint arXiv:2101.05615},
  year={2021}
}
```
## Join the FBGEMM community
For questions, support, news updates, or feature requests, please feel free to:
* File a ticket in [GitHub Issues](https://github.com/pytorch/FBGEMM/issues)
* Post a discussion in [GitHub Discussions](https://github.com/pytorch/FBGEMM/discussions)
* Reach out to us on the `#fbgemm` channel in [PyTorch Slack](https://bit.ly/ptslack)
For contributions, please see the [`CONTRIBUTING`](./CONTRIBUTING.md) file for
ways to help out.
## License
FBGEMM is BSD licensed, as found in the [`LICENSE`](LICENSE) file.