https://github.com/ziutek/blas
Go implementation of BLAS (Basic Linear Algebra Subprograms)
Last synced: 25 days ago
- Host: GitHub
- URL: https://github.com/ziutek/blas
- Owner: ziutek
- License: other
- Created: 2011-10-21T20:47:08.000Z (about 14 years ago)
- Default Branch: master
- Last Pushed: 2019-02-27T12:29:24.000Z (over 6 years ago)
- Last Synced: 2024-02-15T09:38:21.855Z (almost 2 years ago)
- Language: Assembly
- Homepage:
- Size: 64.5 KB
- Stars: 151
- Watchers: 9
- Forks: 19
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-go-cn - blas
- fucking-awesome-go - :octocat: blas - Implementation of BLAS (Basic Linear Algebra Subprograms) :star: 91 :fork_and_knife: 14 (Science and Data Analysis / Advanced Console UIs)
- awesome-go - blas - Go implementation of BLAS (Basic Linear Algebra Subprograms) - ★ 124 (Science and Data Analysis)
- awesome-cobol - blas - Implementation of BLAS (Basic Linear Algebra Subprograms) (Science and Data Analysis / Middlewares)
- awesome-go-zh - blas
- awesome-go - blas - Implementation of BLAS (Basic Linear Algebra Subprograms). (科学和数据分析 Science and Data Analysis / 高级控制台用户界面 Advanced Console UIs)
- awesome-go - blas - Implementation of BLAS (Basic Linear Algebra Subprograms). (Science and Data Analysis / Advanced Console UIs)
README
### Go implementation of BLAS (Basic Linear Algebra Subprograms)
Every function is implemented in generic Go and, where justified, is also
optimized for AMD64 (using SSE2 instructions).
The AMD64 implementation uses MOVUPS/MOVUPD instructions if all strides are
equal to 1, so it runs fast on Nehalem, Sandy Bridge and newer processors but
relatively slowly on older processors.
Every implemented function has its own unit test and benchmark.
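
To make the role of strides concrete, here is a minimal sketch of a strided dot product in plain Go. It is not the package's source: `dotStrided` is a hypothetical name, the parameter order simply follows the usual BLAS convention (length, vector, stride, vector, stride), and only positive strides are handled. With both strides equal to 1 the loop walks contiguous memory, which is the case the README says the AMD64 code targets with MOVUPS/MOVUPD.

```go
package main

import "fmt"

// dotStrided is a hypothetical reference routine, not code from the package.
// It returns the sum of x[i*incX]*y[i*incY] for i in [0, n), assuming
// positive strides and slices long enough to hold n strided elements.
func dotStrided(n int, x []float64, incX int, y []float64, incY int) float64 {
	var sum float64
	xi, yi := 0, 0
	for i := 0; i < n; i++ {
		sum += x[xi] * y[yi]
		xi += incX
		yi += incY
	}
	return sum
}

func main() {
	x := []float64{1, 2, 3, 4, 5, 6}
	y := []float64{1, 1, 1}

	// Unit strides: elements 1, 2, 3 of x against y.
	fmt.Println(dotStrided(3, x, 1, y, 1))

	// Stride 2 over x: elements 1, 3, 5 of x against y.
	fmt.Println(dotStrided(3, x, 2, y, 1))
}
```

The first call sums 1+2+3 and the second sums 1+3+5. For the package's actual signatures, check the documentation linked at the end of this README.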
#### Implemented functions
*Level 1*
Sdsdot, Sdot, Ddot, Snrm2, Dnrm2, Sasum, Dasum, Isamax, Idamax, Sswap, Dswap,
Scopy, Dcopy, Saxpy, Daxpy, Sscal, Dscal, Srotg, Drotg, Srot, Drot
*Level 2*
not implemented
*Level 3*
not implemented
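
For readers unfamiliar with the Level 1 names, the sketch below shows what three of the operations compute, written as naive Go loops. The helper names `axpy`, `asum` and `nrm2` are hypothetical, unit strides are assumed, and `nrm2` uses the textbook formula without the overflow-avoiding scaling that production BLAS routines typically apply. None of this is taken from the package's source.

```go
package main

import (
	"fmt"
	"math"
)

// axpy computes y[i] += alpha*x[i] in place (the operation behind Saxpy/Daxpy).
// Hypothetical helper; assumes len(y) >= len(x).
func axpy(alpha float64, x, y []float64) {
	for i, v := range x {
		y[i] += alpha * v
	}
}

// asum returns the sum of absolute values (the operation behind Sasum/Dasum).
func asum(x []float64) float64 {
	var s float64
	for _, v := range x {
		s += math.Abs(v)
	}
	return s
}

// nrm2 returns the Euclidean norm (the operation behind Snrm2/Dnrm2).
// Naive version: it can overflow for very large inputs.
func nrm2(x []float64) float64 {
	var s float64
	for _, v := range x {
		s += v * v
	}
	return math.Sqrt(s)
}

func main() {
	x := []float64{1, 2, 3}
	y := []float64{1, 1, 1}
	axpy(2, x, y)
	fmt.Println(y)                         // y now holds 2*x + old y
	fmt.Println(asum([]float64{-1, 2, -3})) // |−1| + |2| + |−3|
	fmt.Println(nrm2([]float64{3, 4}))      // sqrt(9 + 16)
}
```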
#### Example benchmarks
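| Function | Generic Go | Optimized for AMD64 |
|----------|------------|---------------------|
| Ddot     | 2825 ns/op | 895 ns/op           |
| Dnrm2    | 2787 ns/op | 597 ns/op           |
| Dasum    | 3145 ns/op | 560 ns/op           |
| Sdsdot   | 3133 ns/op | 1733 ns/op          |
| Sdot     | 2832 ns/op | 508 ns/op           |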
#### Documentation
http://godoc.org/github.com/ziutek/blas