https://github.com/lukechampine/blake3
An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function
- Host: GitHub
- URL: https://github.com/lukechampine/blake3
- Owner: lukechampine
- License: MIT
- Created: 2020-01-09T20:10:45.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2024-05-02T14:21:49.000Z (8 months ago)
- Last Synced: 2024-12-28T19:36:34.739Z (14 days ago)
- Topics: blake3, hash
- Language: Assembly
- Homepage:
- Size: 124 KB
- Stars: 365
- Watchers: 11
- Forks: 24
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ccamel - lukechampine/blake3 - An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function (Assembly)
- awesome-repositories - lukechampine/blake3 - An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function (Assembly)
README
blake3
------

[![GoDoc](https://godoc.org/lukechampine.com/blake3?status.svg)](https://godoc.org/lukechampine.com/blake3)
[![Go Report Card](http://goreportcard.com/badge/lukechampine.com/blake3)](https://goreportcard.com/report/lukechampine.com/blake3)

```
go get lukechampine.com/blake3
```

`blake3` implements the [BLAKE3 cryptographic hash function](https://github.com/BLAKE3-team/BLAKE3).
This implementation aims to be performant without sacrificing (too much)
readability, in the hopes of eventually landing in `x/crypto`.

In addition to the pure-Go implementation, this package also contains AVX-512
and AVX2 routines (generated by [`avo`](https://github.com/mmcloughlin/avo))
that greatly increase performance for large inputs and outputs.

## Benchmarks

Tested on a 2020 MacBook Air (i5-7600K @ 3.80GHz). Benchmarks will improve as
soon as I get access to a beefier AVX-512 machine. :wink:

### AVX-512

```
BenchmarkSum256/64          120 ns/op     533.00 MB/s
BenchmarkSum256/1024       2229 ns/op     459.36 MB/s
BenchmarkSum256/65536     16245 ns/op    4034.11 MB/s
BenchmarkWrite              245 ns/op    4177.38 MB/s
BenchmarkXOF                246 ns/op    4159.30 MB/s
```

### AVX2

```
BenchmarkSum256/64          120 ns/op     533.00 MB/s
BenchmarkSum256/1024       2229 ns/op     459.36 MB/s
BenchmarkSum256/65536     31137 ns/op    2104.76 MB/s
BenchmarkWrite              487 ns/op    2103.12 MB/s
BenchmarkXOF                329 ns/op    3111.27 MB/s
```

### Pure Go

```
BenchmarkSum256/64          120 ns/op     533.00 MB/s
BenchmarkSum256/1024       2229 ns/op     459.36 MB/s
BenchmarkSum256/65536    133505 ns/op     490.89 MB/s
BenchmarkWrite             2022 ns/op     506.36 MB/s
BenchmarkXOF               1914 ns/op     534.98 MB/s
```

## Shortcomings

There is no assembly routine for single-block compressions. This is most
noticeable for ~1KB inputs.

Each assembly routine inlines all 7 rounds, causing thousands of lines of
duplicated code. Ideally the routines could be merged such that only a single
routine is generated for AVX-512 and AVX2, without sacrificing too much
performance.
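
To make the first shortcoming more concrete, here is a minimal sketch of how an input maps onto BLAKE3 chunks and blocks. It assumes the standard BLAKE3 parameters (64-byte blocks, 1024-byte chunks), which come from the BLAKE3 specification rather than from this README. `chunkLayout` is a hypothetical helper written for illustration and is not part of this package's API.

```go
package main

import "fmt"

// Standard BLAKE3 parameters (an assumption taken from the BLAKE3
// specification, not from this package's source).
const (
	blockSize = 64   // bytes consumed by one compression call
	chunkSize = 1024 // bytes per chunk, i.e. 16 blocks
)

// chunkLayout is a hypothetical helper. It reports how many chunks an
// n-byte input spans and how many block compressions are needed to
// process those chunks. Parent-node compressions that merge chunk
// outputs are not counted.
func chunkLayout(n int) (chunks, blocks int) {
	if n == 0 {
		// The empty input is still hashed as one chunk with one block.
		return 1, 1
	}
	chunks = (n + chunkSize - 1) / chunkSize
	// Every full chunk contributes 16 blocks.
	blocks = (n / chunkSize) * (chunkSize / blockSize)
	// A trailing partial chunk contributes its blocks, rounded up.
	if rem := n % chunkSize; rem > 0 {
		blocks += (rem + blockSize - 1) / blockSize
	}
	return chunks, blocks
}

func main() {
	// The three input sizes used in the Sum256 benchmarks.
	for _, n := range []int{64, 1024, 65536} {
		chunks, blocks := chunkLayout(n)
		fmt.Printf("%6d bytes: %3d chunk(s), %4d block compression(s)\n", n, chunks, blocks)
	}
}
```

Under these assumptions a 1024-byte input is a single chunk of 16 blocks. Blocks within a chunk are chained, so they must be compressed one after another, and routines that gain speed by hashing several chunks in parallel have nothing to parallelize. That is presumably why the 1 KB benchmark shows the same figure in all three tables, while the 65536-byte input (64 chunks) benefits from the wider routines.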