https://github.com/mengzhuo/intrinsic


# intrinsic

[![Build Status](https://travis-ci.org/mengzhuo/intrinsic.svg?branch=master)](https://travis-ci.org/mengzhuo/intrinsic)
[![Go Report Card](https://goreportcard.com/badge/github.com/mengzhuo/intrinsic)](https://goreportcard.com/report/github.com/mengzhuo/intrinsic)

Provides native SIMD intrinsics for Go on x86/amd64 platforms.

* SSE2 [![godoc reference](https://godoc.org/github.com/mengzhuo/intrinsic/sse2?status.png)](https://godoc.org/github.com/mengzhuo/intrinsic/sse2)
* SSE3 [![godoc reference](https://godoc.org/github.com/mengzhuo/intrinsic/sse3?status.png)](https://godoc.org/github.com/mengzhuo/intrinsic/sse3)
* SSSE3 [![godoc reference](https://godoc.org/github.com/mengzhuo/intrinsic/ssse3?status.png)](https://godoc.org/github.com/mengzhuo/intrinsic/ssse3)
* SSE41 [![godoc reference](https://godoc.org/github.com/mengzhuo/intrinsic/sse41?status.png)](https://godoc.org/github.com/mengzhuo/intrinsic/sse41)
* SSE42 [![godoc reference](https://godoc.org/github.com/mengzhuo/intrinsic/sse42?status.png)](https://godoc.org/github.com/mengzhuo/intrinsic/sse42)

## Usage

```golang
package main

import (
	"fmt"

	"github.com/mengzhuo/intrinsic/sse2"
)

func main() {
	src := []float32{3.14, 2.17}
	dst := []float32{2.17, 3.15}
	sse2.MAXSDm64float32(src, dst)
	fmt.Print(src, dst) // [2.17 3.15] [2.17 3.15]
}
```

## Benchmarks

In the SSE2 benchmarks below, the intrinsics are about 6x faster than the general implementations (roughly 2.6 ns/op versus 15-16 ns/op):

```
BenchmarkPMINUBByte-4           1000000000     2.65 ns/op    0 B/op    0 allocs/op
BenchmarkGeneralPMINUBByte-4     100000000     15.8 ns/op    0 B/op    0 allocs/op
BenchmarkPAND-4                 1000000000     2.61 ns/op    0 B/op    0 allocs/op
BenchmarkGeneralAND-4            100000000     15.4 ns/op    0 B/op    0 allocs/op
```
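For context, the "General" rows are presumably plain Go loops that do the same work without SIMD. The sketch below shows what such a baseline could look like for PMINUB (element-wise minimum of unsigned bytes) and PAND (bitwise AND). The function names `minBytes` and `andBytes` are hypothetical and are not taken from this repository; the actual benchmark code may differ.

```go
package main

import "fmt"

// minBytes is a hypothetical pure-Go baseline. It stores the element-wise
// unsigned minimum of a and b into dst, the operation PMINUB performs on
// 16 bytes at once. a and b must be at least as long as dst.
func minBytes(dst, a, b []byte) {
	for i := range dst {
		if a[i] < b[i] {
			dst[i] = a[i]
		} else {
			dst[i] = b[i]
		}
	}
}

// andBytes is a hypothetical pure-Go baseline. It stores the bitwise AND
// of a and b into dst, the operation PAND performs on 16 bytes at once.
// a and b must be at least as long as dst.
func andBytes(dst, a, b []byte) {
	for i := range dst {
		dst[i] = a[i] & b[i]
	}
}

func main() {
	a := []byte{1, 200, 3, 40}
	b := []byte{9, 100, 3, 7}
	dst := make([]byte, len(a))

	minBytes(dst, a, b)
	fmt.Println(dst) // [1 100 3 7]

	andBytes(dst, a, b)
	fmt.Println(dst) // [1 64 3 0]
}
```

Numbers in the format shown above, including the `B/op` and `allocs/op` columns, are what `go test -bench . -benchmem` prints for functions of the form `func BenchmarkXxx(b *testing.B)`.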

## Development

All code in the subdirectories is generated by scanner.go; see the Makefile for details.

x86.csv and x86desc.csv come from a separate repository: https://github.com/mengzhuo/x86data

## TODO

- [ ] Resolve code generation for opcodes with immediate operands
- [ ] SSE2 gen=80, total=141, ratio=56.74%
- [ ] SSE3 gen=6, total=10, ratio=60.00%
- [ ] SSSE3 gen=15, total=32, ratio=46.88%
- [ ] SSE4\_1 gen=26, total=49, ratio=53.06%
- [ ] SSE4\_2 gen=1, total=5, ratio=20.00%
- [ ] AVX gen=66, total=378, ratio=17.46%
- [ ] AVX2 gen=8, total=159, ratio=5.03%
- [ ] FMA