Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eriksjolund/compile-time-simd-blend-mask
Compile-time blend masks that unify _mm256_blendv_epi8, _mm256_blend_epi16, _mm256_blend_epi32
avx2 blend boost-hana compile-time
Last synced: 30 days ago
- Host: GitHub
- URL: https://github.com/eriksjolund/compile-time-simd-blend-mask
- Owner: eriksjolund
- License: bsl-1.0
- Created: 2018-12-17T19:04:33.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2018-12-19T21:12:03.000Z (about 6 years ago)
- Last Synced: 2024-10-31T17:45:19.908Z (3 months ago)
- Topics: avx2, blend, boost-hana, compile-time
- Language: C++
- Homepage:
- Size: 25.4 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# compile-time-simd-blend-mask
Compile-time blend masks that unify `_mm256_blendv_epi8`, `_mm256_blend_epi16`, and `_mm256_blend_epi32`
by using the C++ library [boost::hana](http://boostorg.github.io/hana)

## Introduction
The intrinsic functions

* [__m256i _mm256_blendv_epi8(__m256i v1, __m256i v2, __m256i mask)](https://software.intel.com/en-us/node/523908)
* [__m256i _mm256_blend_epi16(__m256i a, __m256i b, const int imm8)](https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-mm-blend-epi32-mm256-blend-epi16-32)
* [__m256i _mm256_blend_epi32(__m256i a, __m256i b, const int imm8)](https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-mm-blend-epi32-mm256-blend-epi16-32)

serve a similar purpose, but they take different arguments. The first function encodes the blend mask in a SIMD vector of type `__m256i`,
whereas the latter two encode it in an `int`. The value of that `int` must be known at compile time (unlike the `__m256i` mask of the first function).

## Example
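To make the difference in arguments concrete, here is a small sketch that performs the same blend once with each intrinsic. It is not part of this project: `Bytes`, `Variant`, `blend_demo`, and `check_variants_agree` are hypothetical names, and the `target("avx2")` attribute is a GCC/Clang extension assumed here so the code builds without `-mavx2` (an AVX2-capable CPU is still needed to run it). The mask repeats in both 128-bit lanes because `_mm256_blend_epi16()` applies its 8-bit immediate to each lane.

```cpp
#include <immintrin.h>
#include <array>
#include <cassert>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 32>;

// Hypothetical selector for this sketch only.
enum class Variant { Blendv8, Blend16, Blend32 };

// GCC/Clang extension (assumption): compile this one function for AVX2.
#define AVX2_TARGET __attribute__((target("avx2")))

// Blends a (all zeros) with b (1..32), taking bytes 4-7 and 20-23 from b.
AVX2_TARGET Bytes blend_demo(Variant v) {
    const __m256i a = _mm256_set1_epi8(0);
    const __m256i b = _mm256_setr_epi8(
         1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32);
    __m256i r;
    if (v == Variant::Blendv8) {
        // Vector mask: a byte comes from b when the top bit of its mask byte is set.
        const __m256i mask = _mm256_setr_epi8(
            0, 0, 0, 0, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0);
        r = _mm256_blendv_epi8(a, b, mask);
    } else if (v == Variant::Blend16) {
        // Immediate bits 2 and 3: 16-bit elements 2 and 3 of each 128-bit lane.
        r = _mm256_blend_epi16(a, b, 0x0C);
    } else {
        // Immediate bits 1 and 5: 32-bit elements 1 and 5 of the whole vector.
        r = _mm256_blend_epi32(a, b, 0x22);
    }
    Bytes out{};
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(out.data()), r);
    return out;
}

// All three encodings describe the same selection, so the results must match.
void check_variants_agree() {
    const Bytes r32 = blend_demo(Variant::Blend32);
    assert(blend_demo(Variant::Blendv8) == r32);
    assert(blend_demo(Variant::Blend16) == r32);
}
```

Only the first variant could take a mask computed at run time; the two immediates have to be literals or other compile-time constants.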
This project implements a common API that can be used like this:
```C++
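// Assumed context (not shown in the original snippet): <immintrin.h>,
// <boost/hana.hpp>, this project's blend256.h, and
// `namespace hana = boost::hana;`.

// One compile-time boolean per byte of the 256-bit vector (32 in total).
auto mask = hana::make_tuple(
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,  // bytes 0-3
    hana::true_c,  hana::true_c,  hana::true_c,  hana::true_c,   // bytes 4-7
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,  // bytes 8-11
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,  // bytes 12-15
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,  // bytes 16-19
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,  // bytes 20-23
    hana::true_c,  hana::true_c,  hana::true_c,  hana::true_c,   // bytes 24-27
    hana::true_c,  hana::true_c,  hana::true_c,  hana::true_c    // bytes 28-31
);

__m256i a = _mm256_set1_epi8(0);
__m256i b = _mm256_setr_epi8( 1,  2,  3,  4,  5,  6,  7,  8,
                              9, 10, 11, 12, 13, 14, 15, 16,
                             17, 18, 19, 20, 21, 22, 23, 24,
                             25, 26, 27, 28, 29, 30, 31, 32);

// Every aligned group of four bytes is uniform, so a 32-bit blend suffices.
auto blend_result = blend256(a, b, mask);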
```
At compile time the mask is analyzed and the fastest suitable intrinsic function is chosen (in this case
`_mm256_blend_epi32()`):

* If the mask allows it, `_mm256_blend_epi32()` is used.
* Otherwise, if the mask allows it, `_mm256_blend_epi16()` is used.
* Otherwise, `_mm256_blendv_epi8()` is used.
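
The selection order above can be sketched without Boost.Hana, using a plain `constexpr` array in C++14 or later. This is an illustration under stated assumptions, not the code in `src/blend256.h`: the names `ByteMask`, `BlendKind`, `uniform_groups`, `lanes_match`, `choose_blend`, and `imm8_epi32` are hypothetical. The sketch assumes a mask qualifies for the 32-bit blend when every aligned group of four bytes agrees, and for the 16-bit blend when every pair of bytes agrees and both 128-bit lanes carry the same pattern, since `_mm256_blend_epi16()` applies its 8-bit immediate to each lane.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the Hana tuple: one flag per byte.
using ByteMask = std::array<bool, 32>;

enum class BlendKind { Epi32, Epi16, Epi8 };

// True when every aligned group of `width` bytes is all-true or all-false.
constexpr bool uniform_groups(const ByteMask& m, std::size_t width) {
    for (std::size_t i = 0; i < m.size(); ++i)
        if (m[i] != m[i - i % width]) return false;  // compare with group start
    return true;
}

// True when the upper 128-bit lane repeats the pattern of the lower one.
constexpr bool lanes_match(const ByteMask& m) {
    for (std::size_t i = 0; i < 16; ++i)
        if (m[i] != m[i + 16]) return false;
    return true;
}

// Widest element size first, byte-wise blend as the fallback.
constexpr BlendKind choose_blend(const ByteMask& m) {
    if (uniform_groups(m, 4)) return BlendKind::Epi32;
    if (uniform_groups(m, 2) && lanes_match(m)) return BlendKind::Epi16;
    return BlendKind::Epi8;
}

// Immediate for the 32-bit blend: bit i mirrors the flag of 32-bit element i.
constexpr int imm8_epi32(const ByteMask& m) {
    int imm = 0;
    for (int i = 0; i < 8; ++i)
        if (m[4 * i]) imm |= 1 << i;
    return imm;
}

// The mask from the example: 32-bit elements 1, 6 and 7 are set.
constexpr ByteMask example_mask = {
    false, false, false, false,  true,  true,  true,  true,
    false, false, false, false,  false, false, false, false,
    false, false, false, false,  false, false, false, false,
    true,  true,  true,  true,   true,  true,  true,  true};

static_assert(choose_blend(example_mask) == BlendKind::Epi32,
              "the example mask qualifies for the 32-bit blend");
static_assert(imm8_epi32(example_mask) == 0xC2,
              "bits 1, 6 and 7 of the immediate are set");
```

Because everything is `constexpr`, the result of `choose_blend` can drive an `if constexpr` or a template specialization, which is what lets the immediate reach the intrinsic as a compile-time constant.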
# Implementation details

The file [src/blend256.h](src/blend256.h) contains the implementation of the `blend256()` function.
The implementation makes use of the C++ library [boost::hana](http://boostorg.github.io/hana).