This library provides bit manipulation operations that the standard intrinsics don't.
After testing, I believe the _mm256 left and right shifts take 20 clock cycles.
__m256i _mm256_lls_mm256(__m256i n, int32_t s)
- This left shifts an AVX register n by s bits
- The inputs are a __m256i and an int32_t
Error codes
- 11: the input was a negative number
- 9: please contact me; I have no idea what you did, and this shouldn't be possible
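To make the intended result concrete, here is a minimal portable C sketch of a whole-register left shift. It is a reference model, not the library's implementation: the type u256_model and the function model_lls are hypothetical names, the register is modelled as four 64-bit limbs with limb[0] least significant (assumed to correspond to 64-bit lane 0 of a __m256i), and vacated bits are assumed to be zero-filled. The shift amount is unsigned here, so the negative-input error path (code 11) is not modelled.

```c
#include <stdint.h>

/* Hypothetical stand-in for a __m256i: four 64-bit limbs,
   limb[0] is the least significant (assumption). */
typedef struct { uint64_t limb[4]; } u256_model;

/* Reference model: shift the whole 256-bit value left by s bits,
   filling the vacated low bits with zeros (assumption). */
static u256_model model_lls(u256_model n, uint32_t s)
{
    u256_model r = { {0, 0, 0, 0} };
    if (s >= 256)
        return r;                     /* assumption: every bit is shifted out */

    uint32_t limbs = s / 64;          /* whole 64-bit limbs to move up */
    uint32_t bits  = s % 64;          /* remaining shift inside a limb */

    for (uint32_t i = limbs; i < 4; i++) {
        r.limb[i] = n.limb[i - limbs] << bits;
        /* Carry in the top bits of the next-lower source limb. The bits != 0
           guard matters: shifting a uint64_t by 64 is undefined in C. */
        if (bits != 0 && i > limbs)
            r.limb[i] |= n.limb[i - limbs - 1] >> (64 - bits);
    }
    return r;
}
```

Comparing the library's output against a model like this, limb by limb, is one way to check the shift for every s from 0 to 255.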
__m256i _mm256_lrs_mm256(__m256i n, int32_t s)
- This right shifts an AVX register n by s bits
- The inputs are a __m256i and an int32_t
Error codes
- 11: the input was a negative number
- 9: please contact me; I have no idea what you did, and this shouldn't be possible
__m256i _mm256_rotl_mm256(__m256i n, int32_t s)
- This rotates an AVX register n left by s bits
- The inputs are a __m256i and an int32_t
- If s is 256 or larger, this returns zero
Error codes
- 11: the input was a negative number
- 9: please contact me; I have no idea what you did, and this shouldn't be possible
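The rotate can be described the same way. The following is a hedged reference model in portable C, not the library's code: u256_model and model_rotl are hypothetical names, limb[0] is assumed to be the least significant 64 bits, and the s >= 256 branch simply mirrors the behaviour stated above. Negative inputs (error code 11) are not modelled, so the shift amount is unsigned.

```c
#include <stdint.h>

/* Hypothetical stand-in for a __m256i: four 64-bit limbs,
   limb[0] is the least significant (assumption). */
typedef struct { uint64_t limb[4]; } u256_model;

/* Reference model: rotate the whole 256-bit value left by s bits.
   Bits leaving the top of limb[3] re-enter at the bottom of limb[0]. */
static u256_model model_rotl(u256_model n, uint32_t s)
{
    u256_model r = { {0, 0, 0, 0} };
    if (s >= 256)
        return r;                     /* mirrors the documented behaviour */

    uint32_t limbs = s / 64;          /* whole limbs to rotate */
    uint32_t bits  = s % 64;          /* remaining rotate inside a limb */

    for (uint32_t i = 0; i < 4; i++) {
        /* Source limb indices wrap around modulo 4. */
        uint64_t cur  = n.limb[(i + 4 - limbs) % 4];
        uint64_t prev = n.limb[(i + 3 - limbs) % 4];
        r.limb[i] = cur << bits;
        /* Guarded because a 64-bit shift of a uint64_t is undefined in C. */
        if (bits != 0)
            r.limb[i] |= prev >> (64 - bits);
    }
    return r;
}
```

In this model a rotate by s equals a left shift by s OR-ed with a right shift by 256 - s, which is one plausible way to build the rotate from the two shift functions.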
Backend Intrinsics
_mm256_lls_mm256_helper(__m256i n, int32_t s)
- This left shifts an AVX register n by s bits, where s is at most 64
- The inputs are a __m256i and an int32_t
_mm256_lls_64(__m256i n)
- This left shifts an AVX register n by 64 bits
- The input is a __m256i
_mm256_lls_128(__m256i n)
- This left shifts an AVX register n by 128 bits
- The input is a __m256i
_mm256_lls_192(__m256i n)
- This left shifts an AVX register n by 192 bits
- The input is a __m256i
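As an illustration of how helpers like these could be written with AVX2, here is a minimal sketch. It is not the library's actual code: the names sketch_lls_64, sketch_lls_helper and sketch_lls_small and the SKETCH_AVX2 macro are hypothetical, 64-bit lane 0 is assumed to be the least significant, and the target attribute assumes GCC or Clang (with a global -mavx2 flag it can be dropped). The pointer-based wrapper exists only so the sketch can be called from code compiled without AVX2.

```c
#include <immintrin.h>
#include <stdint.h>

/* GCC/Clang: enable AVX2 for these functions only (assumption about toolchain). */
#define SKETCH_AVX2 __attribute__((target("avx2")))

/* Shift the whole register left by 64 bits: move each 64-bit lane up one
   position, then overwrite lane 0 with zeros. */
SKETCH_AVX2 static inline __m256i sketch_lls_64(__m256i n)
{
    /* Result lanes 3,2,1,0 take source lanes 2,1,0,0. */
    __m256i moved = _mm256_permute4x64_epi64(n, _MM_SHUFFLE(2, 1, 0, 0));
    /* Mask 0x03 replaces the two lowest 32-bit elements (lane 0) with zero. */
    return _mm256_blend_epi32(moved, _mm256_setzero_si256(), 0x03);
}

/* Shift the whole register left by s bits, valid for 0 <= s <= 64. */
SKETCH_AVX2 static inline __m256i sketch_lls_helper(__m256i n, int32_t s)
{
    __m128i up   = _mm_cvtsi32_si128(s);
    __m128i down = _mm_cvtsi32_si128(64 - s);
    /* Bits that stay inside their own lane. */
    __m256i kept = _mm256_sll_epi64(n, up);
    /* Bits that cross into the next-higher lane. A shift count of 64
       yields zero for these intrinsics, so s == 0 and s == 64 both work. */
    __m256i carried = _mm256_srl_epi64(sketch_lls_64(n), down);
    return _mm256_or_si256(kept, carried);
}

/* Array-based wrapper: in[0] / out[0] hold the least significant 64 bits. */
SKETCH_AVX2 static void sketch_lls_small(const uint64_t in[4], int32_t s,
                                         uint64_t out[4])
{
    __m256i n = _mm256_loadu_si256((const __m256i *)in);
    _mm256_storeu_si256((__m256i *)out, sketch_lls_helper(n, s));
}
```

The permute needs a compile-time constant, which may be why fixed 64, 128 and 192 bit variants exist alongside the variable sub-64 helper.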
_mm256_lrs_mm256_helper(__m256i n, int32_t s)
- This right shifts an AVX register n by s bits, where s is at most 64
- The inputs are a __m256i and an int32_t
_mm256_lrs_64(__m256i n)
- This right shifts an AVX register n by 64 bits
- The input is a __m256i
_mm256_lrs_128(__m256i n)
- This right shifts an AVX register n by 128 bits
- The input is a __m256i
_mm256_lrs_192(__m256i n)
- This right shifts an AVX register n by 192 bits
- The input is a __m256i
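One plausible way to combine helpers like these into a full right shift is to split s into a whole-lane part (s / 64) and a remainder (s % 64). The sketch below shows that idea under stated assumptions and is not the library's implementation: sketch_lrs_lanes, sketch_lrs, sketch_lrs_array and SKETCH_AVX2 are hypothetical names, lane 0 is assumed least significant, the target attribute assumes GCC or Clang, and the shift amount is unsigned, so the negative-input error code is not modelled.

```c
#include <immintrin.h>
#include <stdint.h>

/* GCC/Clang: enable AVX2 for these functions only (assumption about toolchain). */
#define SKETCH_AVX2 __attribute__((target("avx2")))

/* Move every 64-bit lane down by `lanes` positions, zero-filling the top.
   The switch is needed because the permute takes an immediate operand. */
SKETCH_AVX2 static inline __m256i sketch_lrs_lanes(__m256i n, uint32_t lanes)
{
    const __m256i zero = _mm256_setzero_si256();
    switch (lanes) {
    case 0:   /* nothing to move */
        return n;
    case 1:   /* 64 bits: lanes 3,2,1 -> 2,1,0; lane 3 zeroed */
        return _mm256_blend_epi32(
            _mm256_permute4x64_epi64(n, _MM_SHUFFLE(3, 3, 2, 1)), zero, 0xC0);
    case 2:   /* 128 bits: lanes 3,2 -> 1,0; lanes 3,2 zeroed */
        return _mm256_blend_epi32(
            _mm256_permute4x64_epi64(n, _MM_SHUFFLE(3, 3, 3, 2)), zero, 0xF0);
    case 3:   /* 192 bits: lane 3 -> 0; lanes 3,2,1 zeroed */
        return _mm256_blend_epi32(
            _mm256_permute4x64_epi64(n, _MM_SHUFFLE(3, 3, 3, 3)), zero, 0xFC);
    default:  /* 256 bits or more: everything shifted out */
        return zero;
    }
}

/* Shift the whole register right by s bits, zero-filling from the top. */
SKETCH_AVX2 static inline __m256i sketch_lrs(__m256i n, uint32_t s)
{
    uint32_t bits = s % 64;
    __m256i moved = sketch_lrs_lanes(n, s / 64);
    /* Bits that stay inside their own lane. */
    __m256i kept = _mm256_srl_epi64(moved, _mm_cvtsi32_si128((int)bits));
    /* Bits that cross into the next-lower lane; a count of 64 gives zero. */
    __m256i carried = _mm256_sll_epi64(sketch_lrs_lanes(moved, 1),
                                       _mm_cvtsi32_si128((int)(64 - bits)));
    return _mm256_or_si256(kept, carried);
}

/* Array-based wrapper: in[0] / out[0] hold the least significant 64 bits. */
SKETCH_AVX2 static void sketch_lrs_array(const uint64_t in[4], uint32_t s,
                                         uint64_t out[4])
{
    __m256i n = _mm256_loadu_si256((const __m256i *)in);
    _mm256_storeu_si256((__m256i *)out, sketch_lrs(n, s));
}
```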