https://github.com/openmachine-ai/huggingfive

HuggingFive 🖐️ is a collection of ML functions and libraries written in RISC-V assembly and C.

ai assembly machine-learning ml risc-v riscv-assembly riscv32

Last synced: 2 months ago

Host: GitHub
URL: https://github.com/openmachine-ai/huggingfive
Owner: OpenMachine-ai
License: mit
Created: 2023-07-22T09:22:24.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-07-29T20:00:05.000Z (over 2 years ago)
Last Synced: 2025-04-13T23:37:00.135Z (7 months ago)
Topics: ai, assembly, machine-learning, ml, risc-v, riscv-assembly, riscv32
Homepage:
Size: 76.2 KB
Stars: 8
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# HuggingFive :raised_hand_with_fingers_splayed:
HuggingFive :raised_hand_with_fingers_splayed: is a collection of ML functions and libraries written in RISC-V assembly and C. This includes neural network layers, activation functions, as well as entire neural networks. Think of it as a low-level HuggingFace for RISC-V assembly code. The table below includes performance numbers for benchmarking. The hope is to eventually roll all these handwritten tricks into the existing compiler toolchains (or have AI generate even better assembly code).

The Config, MACs, Ops, Register utilization, and Memory size columns are performance numbers for an exemplary config.

| Description | Author | RV config | Data types | Config | MACs | Ops | Register utilization | Memory size (B) | Notes |
|---|---|---|---|---|---|---|---|---|---|
| Conv2D 1x1 | OpenMachine | RV32IF | FP32 | C=32, F=32, R=6 | C F R² = 36,864 | 57,953 | 8/31 x-regs, 21/32 f-regs | | |
| Conv2D 3x3 | OpenMachine | RV32IF | FP32 | C=3, F=8, R=12, stride=1 | 9 C F R² = 31,104 | TBD | TBD | | |
| Depthwise Conv2D 3x3 | OpenMachine | RV32IF | FP32 | C=4, R=6, stride=1 | 9 C R² = 1,296 | TBD | TBD | | |

- C : input channels
- F : output channels (or filters), only used if F is not the same as C
- R : square root of input resolution (e.g. R=6 for image resolution of 6x6 pixels)
- Q : square root of output resolution, only used if Q is not the same as R
- MACs : number of fused multiply-accumulate operations required by the neural-network layer (usable as a lower bound on the total number of ops; this number ignores possible savings from zero-padding in conv layers)
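
The MAC formulas in the table can be checked with a few lines of C. The sketch below is illustrative only and is not taken from the repository: the function names, the CHW memory layout, and the assumption of stride 1 with equal input and output resolution (Q = R) are all hypothetical choices made for this example. The naive 1x1 convolution is included to show where the C F R² count comes from, since it performs one multiply-accumulate per (output channel, pixel, input channel) triple.

```c
#include <stddef.h>
#include <stdint.h>

/* MAC count of a KxK Conv2D, following the table's formula K*K*C*F*R*R.
 * Assumes stride 1 and Q == R; zero-padding savings are ignored.
 * Hypothetical helper, not part of HuggingFive. */
static uint64_t conv2d_macs(uint32_t k, uint32_t c, uint32_t f, uint32_t r)
{
    return (uint64_t)k * k * c * f * r * r;
}

/* MAC count of a KxK depthwise Conv2D: K*K*C*R*R (no sum over channels).
 * Hypothetical helper, same assumptions as above. */
static uint64_t depthwise_conv2d_macs(uint32_t k, uint32_t c, uint32_t r)
{
    return (uint64_t)k * k * c * r * r;
}

/* Naive FP32 1x1 convolution in plain C, for reference only.
 * Assumed layout (CHW): in[c][pixel], w[f][c], out[f][pixel],
 * with R*R pixels per channel. Returns the number of MACs performed. */
static uint64_t conv2d_1x1_ref(const float *in, const float *w, float *out,
                               uint32_t c, uint32_t f, uint32_t r)
{
    uint64_t macs = 0;
    size_t pixels = (size_t)r * r;

    for (uint32_t fo = 0; fo < f; fo++) {
        for (size_t p = 0; p < pixels; p++) {
            float acc = 0.0f;
            for (uint32_t ci = 0; ci < c; ci++) {
                /* one multiply-accumulate per input channel */
                acc += w[(size_t)fo * c + ci] * in[(size_t)ci * pixels + p];
                macs++;
            }
            out[(size_t)fo * pixels + p] = acc;
        }
    }
    return macs;
}
```

With the configs from the table, `conv2d_macs(1, 32, 32, 6)` gives 36,864, `conv2d_macs(3, 3, 8, 12)` gives 31,104, and `depthwise_conv2d_macs(3, 4, 6)` gives 1,296. The 57,953 ops reported for the handwritten Conv2D 1x1 sit above the 36,864 MACs, consistent with MACs being a lower bound.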

## Contribute
Please add your functions and routines to HuggingFive :raised_hand_with_fingers_splayed:. Add a link to your code in the
table and submit a PR, which will get approved promptly because there are no rules here.

More details coming soon ...
