The User and Reference Guide for the Intel C++ Compiler 15.0 has incomplete pseudocode for the AVX2 intrinsic _mm256_shuffle_epi8:
https://software.intel.com/en-us/node/524017
for (i = 0; i < 16; i++){
    if (b[i] & 0x80){
        r[i] = 0;
    }
    else
    {
        r[i] = a[b[i] & 0x0F];
    }
}
However, this sets only the lower half of the 256-bit vector. From the description of the corresponding 256-bit VPSHUFB instruction in the Intel 64 and IA-32 Architectures Software Developer's Manual, it appears that one way of expressing pseudocode that sets the upper half of the vector is:
for (i = 0; i < 16; i++){
    if (b[i] & 0x80){
        r[i] = 0;
    }
    else
    {
        r[i] = a[b[i] & 0x0F];
    }
    if (b[16+i] & 0x80){
        r[16+i] = 0;
    }
    else
    {
        r[16+i] = a[16+(b[16+i] & 0x0F)];
    }
}
or more succinctly:
for (i = 0; i < 16; i++){
    r[i] = (b[i] & 0x80) ? 0 : a[b[i] & 0x0F];
    r[16+i] = (b[16+i] & 0x80) ? 0 : a[16+(b[16+i] & 0x0F)];
}
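To make the per-lane behaviour easier to check, here is a minimal scalar sketch of the corrected pseudocode in plain C. The function name shuffle_epi8_256_ref and the representation of each 256-bit vector as a 32-byte array are assumptions made for illustration; this is not Intel's implementation and it uses no intrinsics, so it needs no AVX2 support to build. It also assumes that r does not alias a or b.

```c
#include <stdint.h>

/* Hypothetical reference model of the corrected pseudocode.
 * a, b and r each stand for one 256-bit vector, stored as 32 bytes.
 * Each 128-bit lane (bytes 0..15 and bytes 16..31) is shuffled
 * independently: an index in b can only select a byte of a from
 * the same lane. Assumes r does not overlap a or b. */
static void shuffle_epi8_256_ref(uint8_t r[32],
                                 const uint8_t a[32],
                                 const uint8_t b[32])
{
    for (int lane = 0; lane < 32; lane += 16) {
        for (int i = 0; i < 16; i++) {
            uint8_t ctrl = b[lane + i];
            /* Bit 7 set: zero the result byte.
             * Otherwise the low 4 bits index within the current lane. */
            r[lane + i] = (ctrl & 0x80) ? 0 : a[lane + (ctrl & 0x0F)];
        }
    }
}
```

With this model, a control byte of 0x03 at position 16 selects a[19] rather than a[3], which is the difference between the corrected pseudocode and the original 16-byte loop.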