https://github.com/nikhilrout/thetensorcoreproject
Microarchitecture implementation of Nvidia's Tensor Cores
cuda floating-point gpgpu hybrid-precision-training tensorcore
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/nikhilrout/thetensorcoreproject
- Owner: NikhilRout
- License: bsd-3-clause
- Created: 2025-02-25T19:29:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-29T13:54:24.000Z (about 1 year ago)
- Last Synced: 2025-03-29T14:32:48.476Z (about 1 year ago)
- Topics: cuda, floating-point, gpgpu, hybrid-precision-training, tensorcore
- Language: Verilog
- Homepage:
- Size: 3.27 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# TheTensorCoreProject
Microarchitecture implementation of Nvidia's CUDA and Tensor Cores
## Tensor Core Versions
TensorCore v0 --> Volta Architecture (Hybrid Precision: FP16MUL, FP32ADD) \
TensorCore v1 --> Ampere Architecture (TF32MUL, FP32ADD / BF16MUL, FP32ADD) \
TensorCore v2 --> Hopper Architecture (FP8 E5M2 / E4M3 MUL, FP16ADD)
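
To make the "FP16MUL FP32ADD" hybrid-precision idea concrete, here is a minimal software sketch of a matrix multiply-accumulate `D = A x B + C` in which the inputs are rounded to FP16 and the running sum is kept in FP32. It is a behavioural model only, not the RTL in this repository: the helper names (`to_fp16`, `to_fp32`, `mma_fp16_fp32`) are hypothetical, and the rounding mode and accumulation order of the real hardware are assumptions that this sketch does not try to reproduce.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (raises OverflowError if out of range)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mma_fp16_fp32(a, b, c):
    """Hypothetical model of D = A x B + C with FP16 multiplies and FP32 accumulation.

    a is n x k, b is k x m, c is n x m (lists of lists of floats).
    """
    n, k, m = len(a), len(b), len(b[0])
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = to_fp32(c[i][j])  # accumulator starts from C, held in FP32
            for t in range(k):
                # The product of two FP16 values needs at most 22 significand
                # bits, so it is represented exactly before the FP32 add.
                prod = to_fp16(a[i][t]) * to_fp16(b[t][j])
                acc = to_fp32(acc + prod)  # round the running sum to FP32
            d[i][j] = acc
    return d


if __name__ == "__main__":
    # 2048 + 1 is not representable in FP16, but survives in an FP32 accumulator.
    print(to_fp16(2049.0))                               # rounded to FP16
    print(mma_fp16_fp32([[1.0]], [[1.0]], [[2048.0]]))   # accumulated in FP32
```

The example at the bottom shows why the accumulator is wider than the multiplier inputs: adding 1 to 2048 is lost when the result is rounded to FP16, but is preserved when the sum is kept in FP32.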